https://github.com/sabsar42/bangladesh-medicine-data-analysis

Host: GitHub
URL: https://github.com/sabsar42/bangladesh-medicine-data-analysis
Owner: sabsar42
Created: 2024-01-28T21:18:46.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-03-01T15:21:16.000Z (12 months ago)
Last Synced: 2024-11-10T12:09:29.465Z (3 months ago)
Language: Jupyter Notebook
Homepage: https://www.kaggle.com/datasets/shakibabsar42/bangladesh-medicine-dataset-dgda
Size: 36.5 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Bangladesh Medicine Dataset Analysis

This repository contains an analysis of a medicine dataset obtained through web scraping. The analysis explores various aspects of the dataset, providing insights into pharmaceuticals, dosage descriptions, generic names, pharmaceutical companies, and retail prices.

## Dataset Overview

The dataset contains medicine records with brand names, dosages, generic names, pharmaceutical companies, and retail prices.
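
As a rough illustration of the record shape described above, the sketch below parses a small inline JSON sample and checks that each record carries all five fields. The field names and sample values are assumptions made for this example, not the dataset's actual schema.

```python
import json

# Hypothetical field names; the real dataset's keys may differ.
EXPECTED_FIELDS = {
    "brand_name",
    "dosage_description",
    "generic_name",
    "company",
    "retail_price",
}

# Made-up sample records standing in for a JSON file under datasets/.
SAMPLE_JSON = """
[
  {"brand_name": "BrandA", "dosage_description": "Tablet",
   "generic_name": "GenericX", "company": "Company One",
   "retail_price": "10.00"},
  {"brand_name": "BrandB", "dosage_description": "Syrup",
   "generic_name": "GenericY", "company": "Company Two"}
]
"""


def load_records(text):
    """Parse a JSON array and split it into complete and incomplete records."""
    records = json.loads(text)
    complete = [r for r in records if EXPECTED_FIELDS.issubset(r)]
    incomplete = [r for r in records if not EXPECTED_FIELDS.issubset(r)]
    return complete, incomplete


complete, incomplete = load_records(SAMPLE_JSON)
print(len(complete), "complete,", len(incomplete), "incomplete")
```

Separating incomplete records up front makes it easier to decide later whether to drop them or fill in the missing fields.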

## Analysis Highlights

1. **Pharmaceutical Companies:**
   - Identification of top pharmaceutical companies based on the number of unique medicines produced.
   - Visual representation of the distribution of medicines among the top pharmaceutical companies.

2. **Dosage Descriptions:**
   - Identification of medicines with multiple dosage descriptions.
   - Exploration of the top medicines with multiple dosage descriptions.

3. **Generic Names:**
   - Analysis of the most common generic names in the dataset.
   - Identification of pharmaceutical companies producing the most unique generic names.

4. **Retail Prices:**
   - Overview of the distribution of retail prices for allopathic medicines.
   - Identification of medicines with specific dosage descriptions and their corresponding retail prices.
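
Several of the highlights above come down to counting distinct values per group. The following is a minimal, dependency-free sketch of that idea on made-up records. The keys (`company`, `brand_name`, `dosage_description`, `generic_name`) are hypothetical stand-ins for the dataset's columns, and the notebooks themselves may well use pandas instead, for example `df.groupby("company")["brand_name"].nunique()`.

```python
from collections import defaultdict

# Made-up records; the keys are hypothetical stand-ins for the dataset's columns.
records = [
    {"brand_name": "BrandA", "dosage_description": "Tablet",
     "generic_name": "GenericX", "company": "Company One"},
    {"brand_name": "BrandA", "dosage_description": "Syrup",
     "generic_name": "GenericX", "company": "Company One"},
    {"brand_name": "BrandB", "dosage_description": "Tablet",
     "generic_name": "GenericY", "company": "Company One"},
    {"brand_name": "BrandC", "dosage_description": "Capsule",
     "generic_name": "GenericX", "company": "Company Two"},
]


def unique_count_by(rows, group_key, value_key):
    """Count distinct value_key values per group_key, largest count first."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[group_key]].add(row[value_key])
    counts = {group: len(values) for group, values in groups.items()}
    # Sort by descending count, then by name so ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Companies ranked by number of unique medicines.
top_companies = unique_count_by(records, "company", "brand_name")

# Medicines that appear with more than one dosage description.
dosage_counts = unique_count_by(records, "brand_name", "dosage_description")
multi_dosage = [(brand, n) for brand, n in dosage_counts if n > 1]

# Companies ranked by number of unique generic names.
generics_per_company = unique_count_by(records, "company", "generic_name")

print(top_companies)
print(multi_dosage)
print(generics_per_company)
```

Using one helper for all three rankings keeps the grouping logic in a single place, so only the pair of keys changes between questions.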

## Methodology

The analysis was conducted in Python using pandas, Matplotlib, and Seaborn. The dataset was cleaned and filtered, and the results were then visualized.
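
To make the cleaning and filtering step more concrete, here is a small sketch that turns raw price strings into numbers and drops unusable entries before summarizing. The sample strings and the `parse_price` helper are assumptions for illustration; the README does not document how prices are actually formatted, and values with thousands separators would need extra handling.

```python
import re
import statistics

# Hypothetical raw price strings; the real formatting is not documented here.
raw_prices = ["10.00", " 25.5 ", "Tk 100", "", None, "n/a"]

# Matches an integer or a decimal number such as "100" or "25.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_price(value):
    """Return the first number found in value as a float, or None if absent."""
    if not isinstance(value, str):
        return None
    match = NUMBER.search(value)
    return float(match.group()) if match else None


# Filter out entries that could not be parsed.
prices = [p for p in map(parse_price, raw_prices) if p is not None]

print("count:", len(prices))
print("median:", statistics.median(prices))
```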

## Web Scraping

The dataset was obtained through web scraping: relevant fields were extracted from online sources and compiled into a single dataset for analysis.
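
The extraction step can be sketched with Python's standard `html.parser` module. The markup below is invented for the example and the `TableRowParser` class is hypothetical; the actual source pages, their structure, and the libraries used in the scraping notebook are not described in this README. The sketch parses an inline string and makes no network requests.

```python
from html.parser import HTMLParser

# Hypothetical markup; the real pages and their layout are not documented here.
SAMPLE_HTML = """
<table>
  <tr><th>Brand</th><th>Generic</th><th>Price</th></tr>
  <tr><td>BrandA</td><td>GenericX</td><td>10.00</td></tr>
  <tr><td>BrandB</td><td>GenericY</td><td>25.50</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows hold only <th> cells, so they stay empty
                self.rows.append([cell.strip() for cell in self._row])
            self._row = None


parser = TableRowParser()
parser.feed(SAMPLE_HTML)
rows = parser.rows
print(rows)
```

Each extracted row could then be mapped to a dictionary and written out as JSON, matching the format of the files under `datasets/`.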

## Repository Structure

- **datasets/:**
  - `.json`: Contains the raw and processed medicine datasets in JSON format.

- **notebooks/:**
  - `dgda-medicine-dataset-analysis.ipynb`: Jupyter notebook used for data analysis.
  - `web-scrapping-medicines-dgda.ipynb`: Jupyter notebook used for web scraping and data preprocessing.
  - `medicine-dataset-analysis-report.ipynb`: Jupyter notebook containing the visualizations and report generated during the analysis.