https://github.com/patrickdavies100/datapipeline37

Data science practice using datasets available online. The test data is currently similar to this dataset: https://www.kaggle.com/datasets/asaniczka/amazon-uk-products-dataset-2023 but the plan is to expand.

data data-science pandas-dataframe python3



# AmazonUK Data Science Practice
A Data Science project using datasets such as https://www.kaggle.com/datasets/asaniczka/amazon-uk-products-dataset-2023

Technologies used:
IDE: PyCharm 2024.1.2

Libraries: pandas, PySide6, PyArrow

Objectives:
1. Create tools for automated data processing, including cleaning and transformation.
2. Have the application generate a working serialised representation of a pipeline.
3. Allow the user to customise that representation.
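
The objectives above could be sketched roughly as follows. This is only an illustrative outline, not the project's actual code: the step names (`rename_columns`, `to_numeric`, `drop_missing`), the `STEPS` registry, the JSON layout, and the sample rows are all assumptions made for the example. It shows a pipeline described as plain data, saved to JSON, loaded back, and applied to a pandas DataFrame.

```python
import json

import pandas as pd


# Hypothetical cleaning/transformation steps (names are illustrative only).
def rename_columns(df, mapping):
    """Rename columns according to a {old: new} mapping."""
    return df.rename(columns=mapping)


def to_numeric(df, column):
    """Convert a column to numbers; unparseable values become NaN."""
    out = df.copy()
    out[column] = pd.to_numeric(out[column], errors="coerce")
    return out


def drop_missing(df, columns):
    """Drop rows that have missing values in any of the given columns."""
    return df.dropna(subset=columns)


# Registry mapping step names (as stored in the JSON) to functions.
STEPS = {
    "rename_columns": rename_columns,
    "to_numeric": to_numeric,
    "drop_missing": drop_missing,
}


def run_pipeline(df, spec):
    """Apply each step in the spec, in order, to the DataFrame."""
    for step in spec["steps"]:
        func = STEPS[step["name"]]
        df = func(df, **step.get("params", {}))
    return df


# A user-editable pipeline description (assumed format).
spec = {
    "steps": [
        {"name": "rename_columns", "params": {"mapping": {"Price": "price"}}},
        {"name": "to_numeric", "params": {"column": "price"}},
        {"name": "drop_missing", "params": {"columns": ["price"]}},
    ]
}

# Serialise to JSON text and load it back, as a saved pipeline would be.
saved = json.dumps(spec, indent=2)
restored = json.loads(saved)

# Made-up sample rows, only to exercise the pipeline.
raw = pd.DataFrame(
    {"title": ["Kettle", "Lamp", "Desk"], "Price": ["19.99", "n/a", "120"]}
)
clean = run_pipeline(raw, restored)
```

Because the pipeline is ordinary JSON, a user could customise it by reordering, removing, or re-parameterising entries in `steps` without touching the Python code. A GUI (for example one built with PySide6) could edit the same structure.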