Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/yeisonmontoya1815/machine-learning_prediction_can_inflation
We aim to predict trends in the Canadian market basket using sentiment analysis techniques. Sentiment analysis involves analyzing text data to determine whether the sentiment expressed is positive, negative, or neutral.
algorithms-and-data-structures data data-analysis data-science data-visualization feature-engineering machine-learning matplotlib-pyplot numerical-analysis numpy pandas pipelines python sklearn structured-data super unsupervised-learning
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/yeisonmontoya1815/machine-learning_prediction_can_inflation
- Owner: yeisonmontoya1815
- License: cc0-1.0
- Created: 2024-04-04T20:58:27.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-06-28T06:44:01.000Z (7 months ago)
- Last Synced: 2024-06-28T07:25:47.592Z (7 months ago)
- Topics: algorithms-and-data-structures, data, data-analysis, data-science, data-visualization, feature-engineering, machine-learning, matplotlib-pyplot, numerical-analysis, numpy, pandas, pipelines, python, sklearn, structured-data, super, unsupervised-learning
- Language: HTML
- Homepage:
- Size: 28.3 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# **Prediction of Canadian Inflation through Machine Learning and Sentiment Analysis**
**Authors**

- Sergio Torres ([GitHub](https://github.com/xstorresm))
- Yeison Montoya ([GitHub](https://github.com/yeisonmontoya1815), [LinkedIn](https://www.linkedin.com/in/yeisonmontoya/))
## Description
This project combines Canadian inflation data with the sentiment of Google News articles to study the relationship between news sentiment and inflation rates in Canada. It covers data preprocessing, exploratory data analysis, feature selection, model building, and visualization using Python libraries such as pandas, scikit-learn, NLTK, spaCy, and matplotlib.
## Table of Contents
- [Usage](#usage)
- [Data](#data)
- [Features](#features)
- [References](#references)
- [Websites](#websites)
- [Models](#models)
- [Visualization](#visualization)
- [Contributing](#contributing)
- [License](#license)

## Usage
- Navigate to the project directory: `cd project`
- Start Jupyter Notebook: `jupyter notebook`
- Open the provided notebook and run the analysis.

## Data
The data used in this project includes:
- `Canada_Inflation_Market_Basket.csv`
- `news.csv`
- `news2.csv`
- `Sentiment_results.csv`

## References
- Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3), 90-95.
- Python Software Foundation. (n.d.). Matplotlib: Visualization with Python. Retrieved from [Matplotlib](https://matplotlib.org/)
- Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(Oct), 2825-2830.
- Raschka, S., & Mirjalili, V. (2019). Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2. Packt Publishing Ltd.

## Websites
- [Statistics Canada](https://news.google.com/articles/CBMiVGh0dHBzOi8vd3d3LnN0YXRjYW4uZ2MuY2EvbzEvZW4vcGx1cy8zMDk2LXNuYXBzaG90LWhvdy1pbmZsYXRpb24tYWZmZWN0aW5nLWNhbmFkaWFuc9IBAA?hl=en-CA&gl=CA&ceid=CA%3Aen)
- [Bank of Canada](https://news.google.com/articles/CBMiVGh0dHBzOi8vd3d3LmJhbmtvZmNhbmFkYS5jYS8yMDIyLzEwL3doYXRzLWhhcHBlbmluZy10by1pbmZsYXRpb24tYW5kLXdoeS1pdC1tYXR0ZXJzL9IBAA?hl=en-CA&gl=CA&ceid=CA%3Aen)
- [TD Economics](https://news.google.com/articles/CBMiMWh0dHBzOi8vZWNvbm9taWNzLnRkLmNvbS9jYS1pbmZsYXRpb24tbmV3LXZpbnRhZ2XSAQA?hl=en-CA&gl=CA&ceid=CA%3Aen)
- [CBC News](https://news.google.com/articles/CBMiQWh0dHBzOi8vd3d3LmNiYy5jYS9uZXdzL2J1c2luZXNzL2luZmxhdGlvbi1qYW51YXJ5LTIwMjQtMS43MTE5Nzk20gEgaHR0cHM6Ly93d3cuY2JjLmNhL2FtcC8xLjcxMTk3OTY?hl=en-CA&gl=CA&ceid=CA%3Aen)
- [CP24](https://news.google.com/articles/CBMidWh0dHBzOi8vd3d3LmNwMjQuY29tL25ld3MvdW5hbWJpZ3VvdXNseS1nb29kLWluZmxhdGlvbi1zbG93cy1pbi1mZWJydWFyeS1hcy1wcmljZS1ncm93dGgtdW5leHBlY3RlZGx5LWVhc2VzLTEuNjgxMzE0M9IBAA?hl=en-CA&gl=CA&ceid=CA%3Aen)

## Features
- Data Preprocessing: Handling missing values, converting data types, sorting, and filtering.
- Exploratory Data Analysis: Analyzing trends, correlations, and distributions.
- Feature Selection: Selecting relevant features for modelling using pipelines and feature selection techniques.
- Model Building: Implementing machine learning models, including Logistic Regression, Random Forest, and Naive Bayes.
- Visualization: Creating visualizations such as ROC curves, confusion matrices, and word clouds.

## Models
The project trains several machine learning models, including Logistic Regression, Random Forest, and Naive Bayes. Each model is evaluated on performance metrics such as accuracy, recall, and ROC-AUC score.
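The comparison described above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic (generated with `make_classification`) because the column layout of the project's CSV files is not documented here, and the `evaluate_models` helper, the `GaussianNB` variant of Naive Bayes, and the `k=10` feature-selection setting are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): the real features would come from
# the project's CSV files.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# The three model families named in this README.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Naive Bayes": GaussianNB(),
}


def evaluate_models(models, X_train, X_test, y_train, y_test, k=10):
    """Fit each model in a scale -> select -> classify pipeline and score it.

    Hypothetical helper written for this example only.
    """
    results = {}
    for name, model in models.items():
        pipeline = Pipeline([
            ("scale", StandardScaler()),
            ("select", SelectKBest(score_func=f_classif, k=k)),
            ("model", model),
        ])
        pipeline.fit(X_train, y_train)
        y_pred = pipeline.predict(X_test)
        # Probability of the positive class, needed for ROC-AUC.
        y_score = pipeline.predict_proba(X_test)[:, 1]
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "recall": recall_score(y_test, y_pred),
            "roc_auc": roc_auc_score(y_test, y_score),
        }
    return results


results = evaluate_models(models, X_train, X_test, y_train, y_test)
for name, metrics in results.items():
    summary = ", ".join(f"{metric}={value:.3f}" for metric, value in metrics.items())
    print(f"{name}: {summary}")
```

Wrapping scaling and feature selection inside the pipeline means both steps are fitted on the training split only, which avoids leaking information from the test split into the reported metrics.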
## Visualization
The project includes visualizations generated with Matplotlib, seaborn, and WordCloud to show data trends, model performance, and word frequencies.
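As an illustration of the model-performance plots mentioned above, the sketch below draws a ROC curve and a confusion matrix with Matplotlib and scikit-learn. It is an example under stated assumptions rather than the project's own plotting code: the data is synthetic, the classifier is a plain Logistic Regression, and the output file name `model_performance.png` is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, confusion_matrix, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption), not the project's dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# ROC curve from predicted probabilities of the positive class.
y_score = clf.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, y_score)
roc_auc = auc(fpr, tpr)

# Confusion matrix: rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_test, clf.predict(X_test))

fig, (ax_roc, ax_cm) = plt.subplots(1, 2, figsize=(10, 4))

ax_roc.plot(fpr, tpr, label=f"ROC curve (AUC = {roc_auc:.2f})")
ax_roc.plot([0, 1], [0, 1], linestyle="--", label="Chance")
ax_roc.set_xlabel("False positive rate")
ax_roc.set_ylabel("True positive rate")
ax_roc.set_title("ROC curve")
ax_roc.legend(loc="lower right")

ax_cm.imshow(cm, cmap="Blues")
ax_cm.set_xticks([0, 1])
ax_cm.set_yticks([0, 1])
ax_cm.set_xlabel("Predicted label")
ax_cm.set_ylabel("True label")
ax_cm.set_title("Confusion matrix")
# Write each cell count at column j (predicted) and row i (true).
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        ax_cm.text(j, i, str(cm[i, j]), ha="center", va="center")

fig.tight_layout()
fig.savefig("model_performance.png")
```

Inside a Jupyter Notebook the `matplotlib.use("Agg")` line and the `savefig` call can be dropped, since the figure is rendered inline.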
## Contributing
Contributions to this project are welcome. You can contribute by opening a pull request or raising issues for bug fixes, feature requests, or improvements.
## License
This project is licensed under the Creative Commons Zero v1.0 Universal (CC0 1.0) license - see the [LICENSE](LICENSE) file for details.