https://github.com/yeisonmontoya1815/machine-learning_prediction_can_inflation

We aim to predict trends in the Canadian market basket using sentiment analysis techniques. Sentiment analysis involves analyzing text data to determine whether the sentiment expressed is positive, negative, or neutral.

algorithms-and-data-structures data data-analysis data-science data-visualization feature-engineering machine-learning matplotlib-pyplot numerical-analysis numpy pandas pipelines python sklearn structured-data super unsupervised-learning

# **Prediction of Canadian Inflation through Machine Learning and Sentiment Analysis**

**Authors**

Sergio Torres
[GitHub user](https://github.com/xstorresm)

Yeison Montoya
[GitHub user](https://github.com/yeisonmontoya1815)

Follow me on [LinkedIn](https://www.linkedin.com/in/yeisonmontoya/)

## Description

This project combines Canadian inflation data with the sentiment of Google News articles to study the relationship between news sentiment and inflation rates in Canada. It covers data preprocessing, exploratory data analysis, feature selection, model building, and visualization, using Python libraries such as pandas, scikit-learn, NLTK, spaCy, and matplotlib.
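To make the idea of labelling text as positive, negative, or neutral concrete, here is a deliberately tiny lexicon-based sketch in plain Python. It is a stand-in for illustration only, not the project's NLTK or spaCy workflow. The word lists and the `label_sentiment` function are hypothetical.

```python
# Toy lexicon-based sentiment labeller (illustrative only).
# Assumption: these word lists are made up for the example and are
# not taken from the project or from any published lexicon.
POSITIVE_WORDS = {"eases", "slows", "good", "relief", "falls"}
NEGATIVE_WORDS = {"surges", "soars", "crisis", "worse", "rises"}


def label_sentiment(headline: str) -> str:
    """Return 'positive', 'negative', or 'neutral' for a headline."""
    # Lowercase each word and strip surrounding punctuation.
    tokens = [word.strip(".,!?:;'\"").lower() for word in headline.split()]
    positive_hits = sum(token in POSITIVE_WORDS for token in tokens)
    negative_hits = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in [
        "Inflation slows as price growth eases",
        "Rent surges again",
        "Bank of Canada holds rate",
    ]:
        print(f"{label_sentiment(text):>8}  {text}")
```

A real analysis would replace the hand-written word lists with a trained model or an established lexicon.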

## Table of Contents

- [Usage](#usage)
- [Data](#data)
- [References](#references)
- [Websites](#websites)
- [Features](#features)
- [Models](#models)
- [Visualization](#visualization)
- [Contributing](#contributing)
- [License](#license)

## Usage

1. Navigate to the project directory: `cd project`
2. Start Jupyter Notebook: `jupyter notebook`
3. Open the provided notebook and run the analysis.

## Data

The data used in this project includes:

- `Canada_Inflation_Market_Basket.csv`
- `news.csv`
- `news2.csv`
- `Sentiment_results.csv`
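As a starting point, the files listed above could be loaded with pandas roughly as follows. This is a minimal sketch: the `load_datasets` helper is hypothetical, it assumes the CSVs sit in a single directory, and it makes no assumptions about their columns.

```python
from pathlib import Path

import pandas as pd

# File names as listed in this README.
DATA_FILES = [
    "Canada_Inflation_Market_Basket.csv",
    "news.csv",
    "news2.csv",
    "Sentiment_results.csv",
]


def load_datasets(data_dir="."):
    """Load each listed CSV found in data_dir into a DataFrame.

    Files that are missing are skipped rather than raising an error.
    """
    frames = {}
    for name in DATA_FILES:
        path = Path(data_dir) / name
        if path.exists():
            frames[name] = pd.read_csv(path)
    return frames


if __name__ == "__main__":
    # Print the shape of every dataset that was found.
    for name, frame in load_datasets().items():
        print(name, frame.shape)
```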

## References

- Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3), 90-95.
- Python Software Foundation. (n.d.). Matplotlib: Visualization with Python. Retrieved from [Matplotlib](https://matplotlib.org/)
- Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(Oct), 2825-2830.
- Raschka, S., & Mirjalili, V. (2019). Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2. Packt Publishing Ltd.

## Websites

- [Statistics Canada](https://news.google.com/articles/CBMiVGh0dHBzOi8vd3d3LnN0YXRjYW4uZ2MuY2EvbzEvZW4vcGx1cy8zMDk2LXNuYXBzaG90LWhvdy1pbmZsYXRpb24tYWZmZWN0aW5nLWNhbmFkaWFuc9IBAA?hl=en-CA&gl=CA&ceid=CA%3Aen)
- [Bank of Canada](https://news.google.com/articles/CBMiVGh0dHBzOi8vd3d3LmJhbmtvZmNhbmFkYS5jYS8yMDIyLzEwL3doYXRzLWhhcHBlbmluZy10by1pbmZsYXRpb24tYW5kLXdoeS1pdC1tYXR0ZXJzL9IBAA?hl=en-CA&gl=CA&ceid=CA%3Aen)
- [TD Economics](https://news.google.com/articles/CBMiMWh0dHBzOi8vZWNvbm9taWNzLnRkLmNvbS9jYS1pbmZsYXRpb24tbmV3LXZpbnRhZ2XSAQA?hl=en-CA&gl=CA&ceid=CA%3Aen)
- [CBC News](https://news.google.com/articles/CBMiQWh0dHBzOi8vd3d3LmNiYy5jYS9uZXdzL2J1c2luZXNzL2luZmxhdGlvbi1qYW51YXJ5LTIwMjQtMS43MTE5Nzk20gEgaHR0cHM6Ly93d3cuY2JjLmNhL2FtcC8xLjcxMTk3OTY?hl=en-CA&gl=CA&ceid=CA%3Aen)
- [CP24](https://news.google.com/articles/CBMidWh0dHBzOi8vd3d3LmNwMjQuY29tL25ld3MvdW5hbWJpZ3VvdXNseS1nb29kLWluZmxhdGlvbi1zbG93cy1pbi1mZWJydWFyeS1hcy1wcmljZS1ncm93dGgtdW5leHBlY3RlZGx5LWVhc2VzLTEuNjgxMzE0M9IBAA?hl=en-CA&gl=CA&ceid=CA%3Aen)

## Features

- Data Preprocessing: Handling missing values, converting data types, sorting, and filtering.
- Exploratory Data Analysis: Analyzing trends, correlations, and distributions.
- Feature Selection: Selecting relevant features for modelling using pipelines and feature selection techniques.
- Model Building: Implementing machine learning models such as Logistic Regression, Random Forest, and Naive Bayes.
- Visualization: Creating visualizations such as ROC curves, confusion matrices, and word clouds.
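The preprocessing, feature selection, and model building steps above can be chained in a single scikit-learn pipeline. The sketch below is illustrative rather than the notebook's actual code. It runs on synthetic data from `make_classification`, and the median imputation, `SelectKBest` with `f_classif`, and `k_features=5` are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline(k_features=5):
    """Impute missing values, scale, keep the k best features, then classify."""
    return Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("select", SelectKBest(score_func=f_classif, k=k_features)),
        ("model", LogisticRegression(max_iter=1000)),
    ])


def run_demo(seed=0):
    """Fit the pipeline on synthetic data and return it with its test accuracy."""
    # Synthetic stand-in for the project's data (assumption).
    X, y = make_classification(
        n_samples=300, n_features=12, n_informative=4, random_state=seed
    )
    # Knock out about 5% of the values to exercise the imputation step.
    rng = np.random.default_rng(seed)
    X[rng.random(X.shape) < 0.05] = np.nan
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    pipeline = build_pipeline()
    pipeline.fit(X_train, y_train)
    return pipeline, pipeline.score(X_test, y_test)


if __name__ == "__main__":
    fitted, accuracy = run_demo()
    kept = fitted.named_steps["select"].get_support().sum()
    print(f"features kept: {kept}, test accuracy: {accuracy:.3f}")
```

Keeping the selector inside the pipeline means it is fitted on the training split only, which avoids leaking test information into feature selection.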

## Models

The project uses several machine learning models, including Logistic Regression, Random Forest, and Naive Bayes. The models are evaluated on performance metrics such as accuracy, recall, and ROC-AUC score.
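A comparison along these lines might look like the following sketch. It uses synthetic data rather than the project's datasets. The README does not say which Naive Bayes variant is used, so `GaussianNB` is an assumption, as are the hyperparameters.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB


def evaluate_models(seed=0):
    """Fit three classifiers and report accuracy, recall, and ROC-AUC for each."""
    # Synthetic binary classification data (assumption).
    X, y = make_classification(n_samples=400, n_features=10, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Naive Bayes": GaussianNB(),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        # ROC-AUC needs scores, so use the positive-class probability.
        scores = model.predict_proba(X_test)[:, 1]
        results[name] = {
            "accuracy": accuracy_score(y_test, predictions),
            "recall": recall_score(y_test, predictions),
            "roc_auc": roc_auc_score(y_test, scores),
        }
    return results


if __name__ == "__main__":
    for name, metrics in evaluate_models().items():
        summary = ", ".join(f"{key}={value:.3f}" for key, value in metrics.items())
        print(f"{name}: {summary}")
```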

## Visualization

The project includes visualizations generated with Matplotlib, seaborn, and WordCloud to show data trends, model performance, and word frequencies.

## Contributing

Contributions to this project are welcome. You can contribute by opening a pull request or raising issues for bug fixes, feature requests, or improvements.

## License

This project is licensed under the Creative Commons Attribution 4.0 International License - see the [LICENSE](LICENSE) file for details.