https://github.com/akshint0407/ds-lab

This repository provides practical solutions and code samples for Data Science (DS) lab assignments in the 3rd year of engineering.

# Data Science Practical Codes – 3rd Year Engineering

This repository provides practical solutions and code samples for common Data Science (DS) lab assignments in the 3rd year of engineering. The aim is to help students easily access, understand, and implement DS concepts using Python.
**Before starting any experiment, please check the specific aim and instructions from the lab manual provided by your respective university.**

---

## 📑 Contents

1. **Data Wrangling I:**
Use Python (Pandas) on any open source dataset to perform data wrangling tasks such as loading, cleaning, and transforming data.

2. **Data Wrangling II:**
Create an “Academic Performance” dataset of students and perform data wrangling operations such as handling missing values, renaming columns, and data type conversions.

3. **Descriptive Statistics I:**
Compute measures of central tendency (mean, median, mode) and variability (variance, standard deviation) on any open source dataset.

4. **Descriptive Statistics II:**
Repeat the above statistical operations on another open source dataset for practice and comparison.

5. **Classification – Logistic Regression:**
Perform classification using the Logistic Regression algorithm on the given dataset (`Social_Network_Ads.csv`).

6. **Classification – Naive Bayes (I):**
Perform classification using the Naive Bayes algorithm on the `iris.csv` dataset.

7. **Text Analytics:**
Perform text extraction and preprocessing using NLTK methods.
   1. Extract sample documents and apply the following document preprocessing methods:
      - Tokenization
      - POS Tagging
      - Stop Words Removal
      - Stemming
      - Lemmatization
   2. Represent documents using:
      - **Term Frequency (TF)**
      - **Inverse Document Frequency (IDF)**

8. **Data Visualization (I) – Titanic Dataset:**
   1. Use the built-in `titanic` dataset from **Seaborn** to explore passenger data.
   2. Plot a histogram to show the distribution of **ticket fare** (`fare` column).

9. **Data Visualization (II) – Titanic Dataset:**
Additional visualizations on the Titanic dataset for deeper insights and practice.

10. **Data Visualization (III) – Iris Dataset:**
Create various visualizations (scatter plots, pair plots, etc.) on the Iris dataset to explore relationships between features.
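
The wrangling and descriptive-statistics practicals listed above (renaming columns, converting data types, handling missing values, then computing central tendency and variability) can be sketched on a tiny in-memory table. This is only an illustration under assumptions: the column names and values are invented, filling with the median is one of several possible strategies for missing values, and the repository's notebooks may approach these steps differently.

```python
import numpy as np
import pandas as pd

# Hypothetical "Academic Performance" data; the column names and values are
# made up for illustration and are not taken from the repository's notebooks.
df = pd.DataFrame({
    "Roll No": ["1", "2", "3", "4", "5"],
    "Math Score": [78.0, np.nan, 91.0, 64.0, 78.0],
    "Attendance %": ["92", "85", "77", "88", "95"],
})

# Rename columns to consistent snake_case names.
df = df.rename(columns={
    "Roll No": "roll_no",
    "Math Score": "math_score",
    "Attendance %": "attendance_pct",
})

# Convert columns that were stored as strings into numeric types.
df["roll_no"] = df["roll_no"].astype(int)
df["attendance_pct"] = df["attendance_pct"].astype(float)

# Fill the missing score with the column median (one possible strategy).
df["math_score"] = df["math_score"].fillna(df["math_score"].median())

# Measures of central tendency and variability for the cleaned column.
scores = df["math_score"]
summary = {
    "mean": scores.mean(),
    "median": scores.median(),
    "mode": scores.mode().tolist(),
    "variance": scores.var(),  # sample variance (ddof=1)
    "std": scores.std(),       # sample standard deviation (ddof=1)
}
print(summary)
```

Note that pandas computes `var()` and `std()` with `ddof=1` (sample statistics) by default; pass `ddof=0` if your lab manual asks for the population versions.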

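For the text analytics practical, the TF and IDF representations can be computed by hand to see what the numbers mean. The sketch below is an assumption-laden illustration, not the repository's code: the toy corpus is invented, tokens come from a plain lowercase split rather than NLTK, and IDF is taken as the unsmoothed natural log of N / df(t). Libraries often use smoothed variants, so their values may differ.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration. Tokens are obtained with a plain
# lowercase split; in the practical, NLTK tokenization, stop-word removal,
# and stemming or lemmatization would come first.
documents = [
    "data science is fun",
    "python is used for data science",
    "data visualization with python",
]
tokenized = [doc.lower().split() for doc in documents]


def term_frequency(tokens):
    """TF(t, d) = count of t in d / total number of terms in d."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(corpus):
    """IDF(t) = ln(N / df(t)), where df(t) is the number of documents containing t."""
    n_docs = len(corpus)
    doc_freq = Counter(term for tokens in corpus for term in set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


tf = [term_frequency(tokens) for tokens in tokenized]
idf = inverse_document_frequency(tokenized)

# TF-IDF weight of each term in each document.
tfidf = [
    {term: weight * idf[term] for term, weight in doc_tf.items()}
    for doc_tf in tf
]

# A term that appears in every document ("data") gets IDF 0, so it carries
# no weight; a term unique to one document ("fun") gets the highest IDF.
print(tf[0])
print(idf["data"], idf["fun"])
```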
---

## 🚀 How to Use

- Browse to the relevant folder or notebook for each practical.
- Read the code and comments for step-by-step explanations.
- Run the code in your local Python environment or Jupyter Notebook.
- Refer to your university’s lab manual for the exact aim and requirements before starting each experiment.

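Before running the notebooks locally, it can help to confirm that the libraries they rely on are installed. The check below is a minimal sketch: Pandas, NLTK, and Seaborn are named in this README, while NumPy, scikit-learn, and Matplotlib are assumptions about what the notebooks import, so adjust the list to match the actual code.

```python
import importlib.util

# Package import names assumed for these practicals; adjust to what the
# notebooks actually import. "sklearn" is the import name of scikit-learn.
REQUIRED = ["pandas", "numpy", "sklearn", "nltk", "seaborn", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are available.")
```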
---

## 🤝 Contributing

Contributions are welcome! If you have improved solutions or additional practicals, feel free to open a pull request.

---

## ⭐ Support

If you find this repository helpful, please star it and share it with your classmates.
For questions or suggestions, open an issue!

---

Happy Learning and Coding!

---