https://github.com/jpcadena/autochain-bot
This project is dedicated to building an intelligent data processing pipeline using AutoChain LLM and BERT ML.
- Host: GitHub
- URL: https://github.com/jpcadena/autochain-bot
- Owner: jpcadena
- License: mit
- Created: 2023-08-09T18:03:22.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-02-05T18:48:38.000Z (12 months ago)
- Last Synced: 2024-11-15T08:19:05.487Z (2 months ago)
- Topics: algorithm, artificial-intelligence, autochain, chatgpt, data-engineering, data-science, large-language-model, llm, machine-learning, model, openai, pandas, python, pytorch
- Language: Python
- Homepage: https://github.com/jpcadena/autochain-bot
- Size: 278 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 8
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
# autochain-bot
Table of Contents
- About The Project
- Getting Started
- Usage
- Testing
- Contributing
- Security
- Code of Conduct
- License
- Contact
## About The Project
![Project][project-screenshot]
This project is dedicated to building an intelligent data processing pipeline,
integrating state-of-the-art machine learning models like BERT and AutoChain.
The pipeline encompasses various stages including data preparation, engineering,
analysis, and configuration of AutoChain, forming a comprehensive and robust
data analytics solution.

The project involves several key phases:

- Data Preparation: Extracting and loading the raw data, followed by cleaning
  and preprocessing.
- Data Engineering: Extracting and transforming features for better insights
  and model compatibility.
- Data Analysis: Numerical analysis and visual analytics for in-depth data
  understanding.
- Feature Engineering: Selecting and crafting the attributes that will most
  enhance the modeling.
- Model Integration: Using the BERT model for sequence processing and
  AutoChain for automating the machine learning pipeline.

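
The phases listed above can be read as stages applied in sequence. The following is a minimal, standard-library-only sketch of that idea, not the project's actual code: the function names (`prepare`, `engineer`, `analyze`, `select_features`, `run_pipeline`) and the `text` / `n_tokens` fields are hypothetical, and the real project works with pandas, BERT, and AutoChain rather than plain dictionaries.

```python
from statistics import mean
from typing import Callable

Record = dict[str, object]
Stage = Callable[[list[Record]], list[Record]]


def prepare(records: list[Record]) -> list[Record]:
    """Data preparation: strip whitespace and drop rows with empty text."""
    cleaned = []
    for record in records:
        text = str(record.get("text") or "").strip()
        if text:
            cleaned.append({**record, "text": text})
    return cleaned


def engineer(records: list[Record]) -> list[Record]:
    """Data engineering: derive a simple numeric feature from the text."""
    return [{**r, "n_tokens": len(str(r["text"]).split())} for r in records]


def analyze(records: list[Record]) -> dict[str, float]:
    """Data analysis: summarise the engineered feature."""
    counts = [r["n_tokens"] for r in records]
    return {"rows": len(counts), "mean_tokens": mean(counts) if counts else 0.0}


def select_features(records: list[Record], keep: tuple[str, ...]) -> list[Record]:
    """Feature engineering: keep only the columns intended for the model."""
    return [{key: r[key] for key in keep} for r in records]


def run_pipeline(records: list[Record], stages: list[Stage]) -> list[Record]:
    """Apply each stage in order, feeding its output to the next one."""
    for stage in stages:
        records = stage(records)
    return records


# Hypothetical input; a model such as BERT would consume `features` afterwards.
raw = [{"text": "  hello world "}, {"text": ""}, {"text": "one two three"}]
rows = run_pipeline(raw, [prepare, engineer])
summary = analyze(rows)
features = select_features(rows, ("text", "n_tokens"))
print(summary)
print(features)
```

Keeping each phase as a separate function that takes and returns the same record shape is one way to get the modularity the project description aims for: stages can be reordered, tested in isolation, or swapped for pandas-based versions.
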
The implemented system facilitates both exploratory data analysis (EDA) and
predictive modeling, aligning with clean code principles, design patterns, and
SOLID principles. Its modular architecture enables easy scalability and
maintainability.

Designed for efficiency and reliability, this project combines cutting-edge
technologies and best practices to provide a powerful tool for data scientists,
analysts, and organizations seeking to derive actionable insights from complex
data sets. Its multifaceted nature makes it adaptable to various domains and
data types, showcasing the versatility and innovation at the heart of the
solution.

### Built with
[![Python][python-shield]][python-url] [![Pandas][pandas-shield]][pandas-url] [![OpenAI][openai-shield]][openai-url] [![Hugging Face Transformers][transformers-shield]][transformers-url] [![PyTorch][pytorch-shield]][pytorch-url] [![numpy][numpy-shield]][numpy-url] [![scikit-learn][scikit-learn-shield]][scikit-learn-url] [![Pydantic][pydantic-shield]][pydantic-url] [![Pytest][pytest-shield]][pytest-url] [![isort][isort-shield]][isort-url] [![Black][black-shield]][black-url] [![Ruff][ruff-shield]][ruff-url] [![MyPy][mypy-shield]][mypy-url] [![pre-commit][pre-commit-shield]][pre-commit-url] [![GitHub Actions][github-actions-shield]][github-actions-url] [![Pycharm][pycharm-shield]][pycharm-url] [![Visual Studio Code][visual-studio-code-shield]][visual-studio-code-url] [![Markdown][markdown-shield]][markdown-url] [![License: MIT][license-shield]][license-url]
## Getting started
### Prerequisites
* [Python 3.11][python-url]
### Installation
1. Clone the **repository**
```
git clone https://github.com/jpcadena/autochain-bot.git
```
2. Change to the project **root directory**
```
cd autochain-bot
```
3. Create a **virtual environment** *venv*
```
python3 -m venv venv
```
4. Activate the **environment** on Windows (on macOS or Linux, run `source venv/bin/activate` instead)
```
.\venv\Scripts\activate
```
5. Install the requirements with **pip**, then the CUDA 11.8 build of PyTorch
```
pip install -r requirements.txt
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

## Usage
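
Before working through the steps in this section, a short pre-flight script can confirm that the interpreter meets the Python 3.11 prerequisite, that a `torch` package is discoverable, and, once the `.env` file from step 1 exists, that its keys are filled in. This is a sketch under assumptions rather than part of the project: the helper names are hypothetical, the key names are taken from the sample `.env` in step 2, and the project itself may load its settings differently (for example through Pydantic, which is listed under Built with).

```python
import importlib.util
import sys

# Key names copied from the sample `.env` in step 2; adjust to your own file.
REQUIRED_KEYS = ("POSTGRES_USER", "SECRET_KEY")


def python_ok(minimum: tuple[int, int] = (3, 11)) -> bool:
    """Return True when the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def torch_installed() -> bool:
    """Return True when a `torch` package can be located, without importing it."""
    return importlib.util.find_spec("torch") is not None


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and `#` comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(env: dict[str, str], required: tuple[str, ...] = REQUIRED_KEYS) -> list[str]:
    """List the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


# Hypothetical file content; in practice, read it with Path(".env").read_text().
sample = "# .env file\nPOSTGRES_USER=your_database_user\nSECRET_KEY=\n"
print("python ok:", python_ok())
print("torch found:", torch_installed())
print("missing keys:", missing_keys(parse_env(sample)))
```

The parser deliberately handles only plain `KEY=VALUE` lines; quoting, `export` prefixes, and variable interpolation are left to a dedicated library if the project needs them.
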
1. **Setting up environment variables:**
If you find a `.env.sample` in the project directory, make a copy of it and
rename it to `.env`.

```
cp .env.sample .env
```

This `.env` file will be used to manage your application's environment
variables.

2. **Configuring your credentials:**
Open the `.env` file in a text editor and replace the placeholder values with
your actual credentials.

```
# .env file
POSTGRES_USER=your_database_user
SECRET_KEY=your_api_key
```

Be sure to save the file after making these changes.

3. **Executing the main script:**
To start the local project on your machine, run the following command in
your terminal:

```
python main.py
```

## Testing
1. **Running tests:**
To run all tests, run the following command in the root directory of the
project:

```
pytest
```

2. **Running a specific test:**
To run a specific test, specify the file and the test name. For example, the
following command runs only the `test_get_users` test in the `test_main.py`
file:

```
pytest tests/test_main.py::test_get_users
```

3. **Understanding test results:**
Pytest prints a summary of the test results in the console, including how many
tests passed and how many failed. For each failed test, it shows a detailed
error message that can help you identify what went wrong.

4. **Writing new tests:**
When you add new features to the application, you should also write
corresponding test cases. Each test case should be a function whose name starts
with the word 'test'. Inside the function, use `assert` statements to check
that your code is working as expected. For example:

```python
def test_add_user():
    # `add_user` is the function under test; import it from your own module.
    user = add_user("testuser", "testpass")
    assert user.name == "testuser"
    assert user.password == "testpass"
```

This function tests that the `add_user` function correctly creates a new user
with the given name and password.

Remember to update your tests whenever you update your code. Maintaining a
comprehensive test suite will help ensure the reliability and robustness of your
application.

## Contributing
[![GitHub][github-shield]][github-url]
Please read our [contributing guide](CONTRIBUTING.md) for details on our code of
conduct and the process for submitting pull requests.

## Security
For security considerations and best practices, please refer to
our [Security Guide](SECURITY.md).

## Code of Conduct
We enforce a code of conduct for all maintainers and contributors. Please read
our [Code of Conduct](CODE_OF_CONDUCT.md) to understand the expectations before
making any contributions.

## License
Distributed under the MIT License. See [LICENSE](LICENSE) for more information.
## Contact
- [![LinkedIn][linkedin-shield]][linkedin-url]
- [![Outlook][outlook-shield]](mailto:[email protected]?subject=[GitHub]autochain-bot)
[project-screenshot]: assets/static/images/project.png
[//]: # "Shields"
[linkedin-shield]: https://img.shields.io/badge/linkedin-%230077B5.svg?style=for-the-badge&logo=linkedin&logoColor=white
[outlook-shield]: https://img.shields.io/badge/Microsoft_Outlook-0078D4?style=for-the-badge&logo=microsoft-outlook&logoColor=white
[python-shield]: https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54
[pydantic-shield]: https://img.shields.io/badge/Pydantic-FF43A1?style=for-the-badge&logo=pydantic&logoColor=white
[pycharm-shield]: https://img.shields.io/badge/PyCharm-21D789?style=for-the-badge&logo=pycharm&logoColor=white
[markdown-shield]: https://img.shields.io/badge/Markdown-000000?style=for-the-badge&logo=markdown&logoColor=white
[github-shield]: https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white
[ruff-shield]: https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v1.json
[black-shield]: https://img.shields.io/badge/code%20style-black-000000.svg?style=for-the-badge&logo=appveyor
[mypy-shield]: https://img.shields.io/badge/mypy-checked-2A6DB2.svg?style=for-the-badge&logo=appveyor
[pytest-shield]: https://img.shields.io/badge/Pytest-0A9EDC?style=for-the-badge&logo=pytest&logoColor=white
[visual-studio-code-shield]: https://img.shields.io/badge/Visual_Studio_Code-007ACC?style=for-the-badge&logo=visual-studio-code&logoColor=white
[poetry-shield]: https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/python-poetry/website/main/static/badge/v0.json
[isort-shield]: https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336
[github-actions-shield]: https://img.shields.io/badge/github%20actions-%232671E5.svg?style=for-the-badge&logo=githubactions&logoColor=white
[pre-commit-shield]: https://img.shields.io/badge/pre--commit-F7B93E?style=for-the-badge&logo=pre-commit&logoColor=white
[license-shield]: https://img.shields.io/badge/License-MIT-yellow.svg
[pandas-shield]: https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge&logo=pandas&logoColor=white
[numpy-shield]: https://img.shields.io/badge/numpy-%23013243.svg?style=for-the-badge&logo=numpy&logoColor=white
[scikit-learn-shield]: https://img.shields.io/badge/scikit--learn-%23F7931E.svg?style=for-the-badge&logo=scikit-learn&logoColor=white
[openai-shield]: https://img.shields.io/badge/OpenAI-ChatGPT-blue
[transformers-shield]: https://img.shields.io/badge/Hugging%20Face-Transformers-brightgreen
[pytorch-shield]: https://img.shields.io/badge/PyTorch-red
[//]: # "URL"
[linkedin-url]: https://linkedin.com/in/juanpablocadenaaguilar
[python-url]: https://docs.python.org/3.11/
[pydantic-url]: https://docs.pydantic.dev
[pycharm-url]: https://www.jetbrains.com/pycharm/
[markdown-url]: https://daringfireball.net/projects/markdown/
[github-url]: https://github.com/jpcadena/autochain-bot
[ruff-url]: https://beta.ruff.rs/docs/
[black-url]: https://github.com/psf/black
[mypy-url]: http://mypy-lang.org/
[pytest-url]: https://docs.pytest.org/en/7.2.x/
[visual-studio-code-url]: https://code.visualstudio.com/
[poetry-url]: https://python-poetry.org/
[isort-url]: https://pycqa.github.io/isort/
[github-actions-url]: https://github.com/features/actions
[pre-commit-url]: https://pre-commit.com/
[license-url]: https://opensource.org/licenses/MIT
[pandas-url]: https://pandas.pydata.org/docs/
[numpy-url]: https://numpy.org/
[scikit-learn-url]: https://scikit-learn.org/stable/
[openai-url]: https://www.openai.com/
[transformers-url]: https://huggingface.co/transformers/
[pytorch-url]: https://pytorch.org/