https://github.com/shama-llama/data-science-assignments
Repository for CoSc 6262 Course Assignments
- Host: GitHub
- URL: https://github.com/shama-llama/data-science-assignments
- Owner: shama-llama
- License: MIT
- Created: 2025-04-10T16:18:18.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-08-03T04:41:57.000Z (11 months ago)
- Last Synced: 2025-08-03T06:22:54.056Z (11 months ago)
- Topics: computer-science, cosc-6262, data-science
- Language: Jupyter Notebook
- Homepage:
- Size: 3.69 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Data Science Assignments
[Python 3.12](https://docs.python.org/3.12/)
[NumPy 2.3.0](https://numpy.org/doc/stable//release/2.3.0-notes.html)
[SciPy 1.16.0](https://docs.scipy.org/doc/scipy/release/1.16.0-notes.html)
[pandas 2.3](https://pandas.pydata.org/pandas-docs/version/2.3/index.html)
[scikit-learn 1.7](https://scikit-learn.org/stable/whats_new/v1.7.html)
[License: MIT](https://opensource.org/licenses/MIT)
This repository collects the assignments completed for the Data Science (CoSc 6262) course. The notebooks cover fundamental topics and practical applications of core Python libraries for data analysis, scientific computing, and machine learning.
## Topics
- **NumPy:** Open source project for numerical computing with Python.
- **Pandas:** Open source data analysis and manipulation tool.
- **SciPy:** Tools for array computing and specialized data structures.
- **scikit-learn:** Tools for predictive data analysis.
## Datasets
> Kaggle, “Titanic - Machine Learning from Disaster,” Kaggle Competitions, 2012. [Online]. Accessed: Apr. 25, 2025. Available: [https://www.kaggle.com/c/titanic/data](https://www.kaggle.com/c/titanic/data).
>
> C. Zhang, “IMDB 5000 Movie Dataset,” Kaggle Datasets, [Online]. Accessed: Apr. 25, 2025. Available: [https://www.kaggle.com/datasets/carolzhangdc/imdb-5000-movie-dataset](https://www.kaggle.com/datasets/carolzhangdc/imdb-5000-movie-dataset).
## Project Setup
This project uses `uv` for package management. `uv` is a fast Python package and project manager, written in Rust, that can replace `pip`, `pip-tools`, `pipx`, `poetry`, `pyenv`, `twine`, and `virtualenv`.
- **Install `uv` (macOS/Linux):**
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
- **Clone the Repository:**
```bash
git clone https://github.com/shama-llama/data-science-assignments.git
cd data-science-assignments
```
- **Create a Virtual Environment and Install Dependencies with `uv`:**
```bash
uv venv
uv pip install -e .
```
- **Activate the Virtual Environment:**
```bash
source .venv/bin/activate
```
- **Launch Jupyter Notebook:**
```bash
jupyter notebook
```
Navigate to the `notebooks/` directory to run the analysis.
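Before opening the notebooks, it can help to confirm that the libraries listed under Topics are actually installed in the active environment. The following is a minimal sketch, not part of the repository: the script, the `PACKAGES` list, and the `installed_versions` helper are illustrative names, and the list assumes the project depends on the four libraries named above. It uses only the Python standard library.

```python
"""Report which of the course libraries are installed in the active environment."""
from importlib import metadata

# Distribution names as published on PyPI (assumed from the Topics list).
PACKAGES = ["numpy", "pandas", "scipy", "scikit-learn"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:<14}{version or 'not installed'}")
```

Run it with the virtual environment activated. A package reported as `not installed` suggests the dependency installation step did not complete or a different environment is active.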
## License
This project is licensed under the terms of the [MIT](LICENSE) open source license.