https://github.com/patrickdavies100/datapipeline37
Data science practice using datasets available online. The test data currently resembles this dataset: https://www.kaggle.com/datasets/asaniczka/amazon-uk-products-dataset-2023 but the plan is to expand to others.
- Host: GitHub
- URL: https://github.com/patrickdavies100/datapipeline37
- Owner: PatrickDavies100
- Created: 2024-05-31T20:51:30.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-10-01T13:19:58.000Z (over 1 year ago)
- Last Synced: 2025-02-28T16:31:24.031Z (over 1 year ago)
- Topics: data, data-science, pandas-dataframe, python3
- Language: Python
- Homepage:
- Size: 39.1 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# AmazonUK Data Science Practice
A Data Science project using datasets such as https://www.kaggle.com/datasets/asaniczka/amazon-uk-products-dataset-2023
Technologies used:
IDE: PyCharm 2024.1.2
Libraries: pandas, PySide6, PyArrow
Objectives:
1. Create tools for automated data processing, including cleaning and transformation.
2. Have the application generate a working serialised representation of a pipeline.
3. Allow the user to customise this serialisation format.
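
The objectives above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `Step` and `Pipeline` classes, the `STEP_REGISTRY` mapping, the JSON layout, and the `title`/`price` column names are all assumptions made for the example. Rows are plain dictionaries so the sketch runs with the standard library only; in the project itself the steps would presumably operate on pandas DataFrames.

```python
import json
from dataclasses import dataclass, field, asdict


# Hypothetical cleaning/transformation steps. Each takes a list of row
# dicts plus keyword parameters and returns a new list of row dicts.
def drop_missing(rows, column):
    """Remove rows where `column` is absent, None, or an empty string."""
    return [r for r in rows if r.get(column) not in (None, "")]


def to_float(rows, column):
    """Convert the value in `column` to float for every row."""
    return [{**r, column: float(r[column])} for r in rows]


def rename(rows, old, new):
    """Rename the key `old` to `new` in every row."""
    return [{(new if k == old else k): v for k, v in r.items()} for r in rows]


# Hypothetical registry: step names in the serialised file map to functions.
STEP_REGISTRY = {"drop_missing": drop_missing, "to_float": to_float, "rename": rename}


@dataclass
class Step:
    name: str
    params: dict = field(default_factory=dict)


@dataclass
class Pipeline:
    steps: list = field(default_factory=list)

    def run(self, rows):
        """Apply each step in order and return the processed rows."""
        for step in self.steps:
            rows = STEP_REGISTRY[step.name](rows, **step.params)
        return rows

    def to_json(self):
        """Serialise the pipeline; users can edit this text to customise it."""
        payload = {"version": 1, "steps": [asdict(s) for s in self.steps]}
        return json.dumps(payload, indent=2)

    @classmethod
    def from_json(cls, text):
        """Rebuild a pipeline from its serialised form."""
        data = json.loads(text)
        return cls([Step(s["name"], s.get("params", {})) for s in data["steps"]])


# Example: build a pipeline, round-trip it through JSON, then run it.
pipeline = Pipeline([
    Step("drop_missing", {"column": "price"}),
    Step("to_float", {"column": "price"}),
    Step("rename", {"old": "price", "new": "price_gbp"}),
])
rows = [{"title": "Kettle", "price": "24.99"}, {"title": "Toaster", "price": ""}]
restored = Pipeline.from_json(pipeline.to_json())
cleaned = restored.run(rows)
```

Keeping the serialised form as a list of step names and parameters means a user can reorder, remove, or re-parameterise steps by editing the JSON, which is one way the format could be made customisable.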