https://github.com/neerajcodes888/data-science

This repository is a hub for data science enthusiasts, offering a diverse collection of projects, notebooks, and resources covering topics such as data analysis, machine learning, deep learning, and generative AI. Explore innovative ideas, contribute to cutting-edge research, and enhance your skills in the dynamic field of data science.

data-analysis data-science data-visualization deep-learning deep-learning-algorithms eda genai jupyter-notebook machine-learning machine-learning-algorithms openai-api pandas plotting python3 sklearn-library streamlit


# Data Science 📊📈🤖

![Data-Science](https://wallpapercave.com/wp/wp4748478.jpg)

## Table of Contents 📑

- [Scope of Learning](#scope-of-learning)
- [Deployed Link and Repo Link](#deployed-link-and-repo-link)
- [Ideas](#ideas)
- [Vision](#vision)
- [Innovative Ideas Description](#innovative-ideas-description)
- [Prerequisites](#prerequisites)
- [LLM (Gen AI)](#llm-gen-ai)
- [Index of Content](#index-of-content)
- [List of Contents](#list-of-contents)
- [Credits](#credits)
- [Contributing](#contributing)
- [License](#license)

## Scope of Learning 🎓

This repository is aimed at providing hands-on learning experiences in the following areas:
- Data Analysis
- Machine Learning
- Deep Learning
- LLM (Gen AI)

## Deployed Link and Repo Link 🌐

| Index | Project | Deployed Link | Repository Link | Tools Used |
|-------|------------------------|-----------------------------------------------------------|------------------------------------------------------|---------------|
| **1** | Car Price Prediction | [Deployed Link](https://carpricepredict-crlkxz3lbkn.streamlit.app/) | [Repo Link](https://github.com/neerajcodes888/Data-Science/tree/main/Machine%20Learning/Car%20Price%20Prediction) | Streamlit, Scikit-learn, Pandas, NumPy |
| **2** | Car Price Prediction | [Deployed Link](https://yourcarprice.onrender.com/) | [Repo Link](https://github.com/neerajcodes888/Data-Science/tree/main/Machine%20Learning/Car%20Price%20Prediction) | Flask, Scikit-learn, Pandas, NumPy |
| **3** | Loan Approval Prediction | [Deployed Link](https://loan-approval-prediction-b5l5.onrender.com/) | [Repo Link](https://github.com/neerajcodes888/Data-Science/tree/main/Machine%20Learning/Loan%20Approval%20Prediction) | Flask, Scikit-learn, Pandas, NumPy |
| **4** | Diwali Sales Analysis | Not Deployed | [Repo Link](https://github.com/neerajcodes888/Data-Science/tree/main/Data%20Analysis/Diwali%20Sales%20Analysis) | Pandas, NumPy, Matplotlib (pyplot), Seaborn |
| **5** | Cat Vs Dog Image Classification | Not Deployed | [Repo Link](https://github.com/neerajcodes888/Data-Science/tree/main/Deep%20Learning/CatVsDog%20Image%20Classification) | TensorFlow, Keras, Matplotlib |
| **6** | Advanced Resume Tracking System | [Deployed Link](https://advancedresumetracking.onrender.com/) | [Repo Link](https://github.com/neerajcodes888/Data-Science/tree/main/LLM%20Generative%20AI/Adavanced%20Resume%20Tracking%20System) | LLM, Generative AI, PyPDF, Streamlit |

## Ideas 📋

The project ideas covered in this repository are summarized below:

| Project Idea | Description | Domain |
|--------------------------------------|---------------------------------------------------------------------------------------------------------------|-----------------------------|
| Indian Economy Analysis | Analyze various economic indicators and trends to understand the current state and predict future scenarios. | Economics, Data Analysis |
| Diwali Sales Analysis                | Analyze sales data before, during, and after Diwali to identify trends and patterns and to optimize marketing strategies. | Retail, Sales Analysis      |
| Car Price Prediction | Develop a machine learning model to predict the price of cars based on various features such as mileage, brand, etc. | Machine Learning, Automotive |
| Loan Approval Prediction | Build a machine learning model to predict whether a loan application will be approved or rejected by a financial institution. | Machine Learning, Finance |
| Cat vs Dog Classification | Create a deep learning model to classify images of cats and dogs accurately. | Deep Learning, Computer Vision |
| Advanced Resume Tracking System | Implement a comprehensive system using LLM techniques to track and analyze resumes for job matching and recruitment. | LLM (Gen AI), Human Resources |
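
As a rough illustration of the Car Price Prediction idea, the sketch below fits a one-feature linear model (price against mileage) by ordinary least squares using only the Python standard library. The data points and the `fit_line` and `predict_price` names are made up for illustration. The project itself lists scikit-learn and would use more features such as brand, so this is not its actual model.

```python
from statistics import mean

# Made-up (mileage, price) values for illustration only; not from the repository.
MILEAGE = [10_000, 30_000, 50_000, 70_000]
PRICE = [19_500, 18_500, 17_500, 16_500]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict_price(mileage, slope, intercept):
    """Predict a price from mileage using the fitted line."""
    return slope * mileage + intercept


slope, intercept = fit_line(MILEAGE, PRICE)
print(round(predict_price(40_000, slope, intercept)))
```

A negative slope here simply reflects that, in the made-up data, price falls as mileage rises. With real data, a library model with categorical encoding for features like brand is likely a better fit than a single straight line.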

## Vision 👁️

Our vision is to facilitate learning and exploration in the field of data science by providing well-documented code, tutorials, and resources. We aim to empower individuals to understand and apply data science techniques to real-world problems.

## Innovative Ideas Description 💡

We strive to incorporate innovative approaches and ideas in our projects, pushing the boundaries of traditional data science methodologies. Some of the innovative ideas explored in this repository include:
- Novel feature engineering techniques
- Advanced model architectures
- Cutting-edge visualization methods

## Prerequisites 🛠️

Before running the code in this repository, ensure you have the following dependencies installed:
- pandas
- numpy
- scikit-learn (sklearn)
- seaborn
- matplotlib
- plotly

Additionally, for deep learning models, you will need:
- TensorFlow
- Keras

For LLM (Gen AI) projects, you will also need:
- OpenAI Python library (`openai`)
- Any other generative AI libraries that the individual project imports

You can install the core dependencies using pip:
```bash
pip install pandas numpy scikit-learn seaborn matplotlib plotly tensorflow keras openai
```
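
To confirm which of the packages listed above are importable in the current environment, a small check along these lines may help. It is a minimal sketch using only the standard library. The `REQUIRED` and `OPTIONAL` groupings and the `missing_modules` helper are assumptions made for illustration, and the names are import names (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names (not pip names) for the packages listed above.
# The required/optional split is an assumption for this sketch.
REQUIRED = ["pandas", "numpy", "sklearn", "seaborn", "matplotlib", "plotly"]
OPTIONAL = ["tensorflow", "keras", "openai"]


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = missing_modules(names)
        status = "all found" if not missing else "missing: " + ", ".join(missing)
        print(f"{label}: {status}")
```

The check only looks modules up without importing them, so it stays fast even when heavy packages such as TensorFlow are installed.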

## LLM (Gen AI) 🧠🤖

The LLM (Gen AI) projects apply large language models (LLMs) and related generative AI techniques to generate new text, images, and other data, and to explore the possibilities of AI-driven creativity.

## Index of Content 📄

1. [Data Analysis](./Data%20Analysis)
2. [Machine Learning](./Machine%20Learning)
3. [Deep Learning](./Deep%20Learning)
4. [LLM (Gen AI)](./LLM%20Generative%20AI)

Each section contains detailed notebooks, code, and explanations for specific projects and concepts.

## List of Contents 📋

- `Data Analysis`: Contains notebooks and code for data analysis projects.
- `Machine Learning`: Includes notebooks and code for machine learning projects.
- `Deep Learning`: Consists of notebooks and code for deep learning projects.
- `LLM Generative AI`: Includes notebooks and code for LLM (Gen AI) projects.

Feel free to explore each section and dive into the projects to enhance your understanding of data science concepts.

## Credits 🙏

We would like to express our gratitude to the developers of the various data science tools, libraries, and models that have been instrumental in the creation of this repository:

### Tools and Libraries

- [pandas](https://pandas.pydata.org/): Developed by Wes McKinney and contributors, pandas is a powerful data manipulation and analysis library for Python.
- [NumPy](https://numpy.org/): Created by Travis Oliphant, NumPy is the fundamental package for scientific computing with Python.
- [scikit-learn](https://scikit-learn.org/): Developed by a community of contributors, scikit-learn is a versatile machine learning library for Python.
- [seaborn](https://seaborn.pydata.org/): Developed by Michael Waskom and contributors, seaborn is a Python visualization library based on matplotlib for statistical graphics.
- [matplotlib](https://matplotlib.org/): Developed by John D. Hunter (and later Michael Droettboom and contributors), matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.
- [plotly](https://plotly.com/python/): Developed by Plotly Technologies, plotly is a graphing library for Python that creates interactive, publication-quality graphs online.
- [TensorFlow](https://www.tensorflow.org/): Developed by the Google Brain team and contributors, TensorFlow is an open-source platform for machine learning and deep learning.
- [Keras](https://keras.io/): Developed by François Chollet and contributors, Keras is an open-source neural network library written in Python that serves as a high-level API for TensorFlow.
- [OpenAI](https://openai.com/): OpenAI is an artificial intelligence research organization; its Python library (`openai`) provides access to the OpenAI API.
- [Gen AI Libraries](https://gen.ai/): Developed by Gen AI, Gen AI Libraries provide tools and frameworks for Generative AI techniques, enabling the generation of novel data, images, text, etc.

### Machine Learning and Statistics Libraries

- [XGBoost](https://xgboost.readthedocs.io/en/latest/): Developed by a community of contributors, XGBoost is an optimized distributed gradient boosting library designed for speed and performance.
- [LightGBM](https://lightgbm.readthedocs.io/en/latest/): Developed by Microsoft, LightGBM is a gradient boosting framework that uses tree-based learning algorithms.
- [CatBoost](https://catboost.ai/): Developed by Yandex, CatBoost is an open-source gradient boosting library that provides state-of-the-art results out of the box.
- [SciPy](https://www.scipy.org/): Developed by a community of contributors, SciPy is a scientific computing library that builds on NumPy and provides additional functionality.
- [StatsModels](https://www.statsmodels.org/stable/index.html): Developed by a community of contributors, StatsModels is a Python module that provides classes and functions for the estimation of many different statistical models.

### Deep Learning Libraries

- [PyTorch](https://pytorch.org/): Developed by Facebook's AI Research lab (FAIR) and contributors, PyTorch is an open-source machine learning library based on the Torch library.
- [fastai](https://docs.fast.ai/): Developed by fast.ai, fastai is a deep learning library built on top of PyTorch that provides high-level abstractions for training and deploying deep learning models.

We extend our sincere appreciation to these developers and the broader open-source community for their invaluable contributions to the field of data science.

## Contributing 🤝

Contributions to this repository are welcome! Whether it's fixing a bug, adding a new project, or improving documentation, your contributions help make this resource better for everyone.

Please refer to the [contribution guidelines](CONTRIBUTING.md) before submitting your contributions.

## License 📝

This repository is licensed under the MIT License. See the [LICENSE](https://github.com/neerajcodes888/Data-Science/blob/main/LICENSE) file for details.