Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rid17pawar/collegecutoffexplorer
Our project's primary purpose is to provide consolidated information about all engineering colleges, along with the caste-wise cut-off list for all available branches. We used the pdfplumber and openpyxl Python libraries, along with Power BI Reports, to create the dashboard.
data-extraction powerbi powerbi-report python
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/rid17pawar/collegecutoffexplorer
- Owner: rid17pawar
- Created: 2023-05-18T14:53:53.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-11T07:53:56.000Z (about 1 year ago)
- Last Synced: 2024-10-13T02:42:45.858Z (2 months ago)
- Topics: data-extraction, powerbi, powerbi-report, python
- Language: Python
- Homepage:
- Size: 10.1 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# CollegeCutoffExplorer
### _YouTube Video Link: [https://youtu.be/SwE4mxQxhEI](https://youtu.be/z044Oe8OUII)_
#### PPT Presentation: [click here](https://drive.google.com/file/d/17AUJgyMIsWUYjG5X902cL9oM8NA0pSYc/view?usp=sharing)

## Introduction
This project is designed for admission counseling departments to analyze and generate customized lists of colleges and branches based on a student's percentage, location, and branch preferences.
We aim to reduce counselors' workload and improve student satisfaction and decision-making during the admission process by eliminating the need to go through multiple lengthy PDF files containing college cut-off details.
This project is developed exclusively for admission counseling departments in Maharashtra that handle student queries about admission cut-offs for engineering colleges in the state.

*About the technologies used in this project:*
We used the **pdfplumber** and **openpyxl** **Python libraries**, combined with regular expressions, to extract cut-off information from the PDF files.
To prepare the data, we used the **Power BI Power Query Editor** for tasks such as data cleaning, transformation, and merging. We then used **Power BI Reports** to visualize the data.
Data extraction from PDF means pulling relevant information out of PDF documents. PDF (Portable Document Format) is widely used for storing and sharing documents, but extracting data from it can be challenging because of the format's complexity and lack of structured data. Data extraction techniques automatically identify specific elements, such as text, tables, or images, and convert them into a structured format such as a spreadsheet or a database. This is usually done with specialized software (RPA tools like UiPath) or with programming scripts that analyze the PDF content and locate the desired data, and it is particularly useful when large amounts of data need to be processed and analyzed. **We have utilized the pdfplumber and openpyxl Python libraries, combined with regular expressions, to extract information from the PDF files.**
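The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the line layout matched by `CUTOFF_RE` (a category code, a rank, and a percentile in parentheses), the `Cutoff` record, and all function names are assumptions made for this example. Only `pdfplumber.open`, `page.extract_text`, and the `openpyxl` `Workbook` calls are real library APIs.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple


class Cutoff(NamedTuple):
    category: str      # category code as printed in the PDF (hypothetical format)
    rank: int          # closing merit rank
    percentile: float  # closing score printed in parentheses


# Assumed layout: "<CATEGORY> <rank> (<percentile>)", e.g. "GOPENS 1234 (98.76)".
# The real cut-off PDFs may be laid out differently; adjust the pattern to match.
CUTOFF_RE = re.compile(r"\b([A-Z]{3,10})\s+(\d+)\s+\((\d{1,3}(?:\.\d+)?)\)")


def parse_cutoffs(text: str) -> List[Cutoff]:
    """Find every category/rank/percentile triple in a block of page text."""
    return [Cutoff(c, int(r), float(p)) for c, r, p in CUTOFF_RE.findall(text)]


def extract_pdf_text(pdf_path: str) -> Iterator[str]:
    """Yield the plain text of each page, using pdfplumber."""
    import pdfplumber  # third-party; imported here so the parser works without it

    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # extract_text() can return nothing for image-only pages
            yield page.extract_text() or ""


def write_workbook(rows: Iterable[Cutoff], xlsx_path: str) -> None:
    """Write the parsed rows to an Excel sheet, using openpyxl."""
    from openpyxl import Workbook  # third-party

    workbook = Workbook()
    sheet = workbook.active
    sheet.append(["Category", "Rank", "Percentile"])  # header row
    for row in rows:
        sheet.append(list(row))
    workbook.save(xlsx_path)
```

A typical run would chain the three pieces: parse each page yielded by `extract_pdf_text("cutoffs.pdf")` (a placeholder file name) with `parse_cutoffs`, collect the results, and pass them to `write_workbook`. Keeping the regular expression separate from the PDF and Excel handling makes the parsing logic easy to check against sample text.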
Data preparation follows the extract, transform, and load (ETL) process. Before loading the data for visualization, we transformed it so that it is well organized, user-friendly, properly formatted, and validated. This improves data quality and guards against issues such as unexpected duplicates, null values, incompatible formats, and incorrect indexing.
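In this project these checks were carried out in the Power Query Editor, not in Python. Purely as an illustration of the same kinds of checks (trimming, null handling, format validation, and duplicate removal), here is a small Python sketch. The field names (`college`, `branch`, `category`, `percentile`) and the function `clean_rows` are hypothetical and are not taken from the project's data model.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical column names; a row is identified by these three fields.
KEY_FIELDS = ("college", "branch", "category")


def clean_rows(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return rows that are trimmed, complete, numeric where expected, and unique."""
    seen = set()
    cleaned = []
    for row in rows:
        # Trim stray whitespace and turn empty strings into None.
        norm = {}
        for name, value in row.items():
            if isinstance(value, str):
                value = value.strip() or None
            norm[name] = value

        # Null check: skip rows that are missing any identifying field.
        if any(norm.get(name) is None for name in KEY_FIELDS):
            continue

        # Format check: the score must be convertible to a number.
        try:
            norm["percentile"] = float(norm.get("percentile"))
        except (TypeError, ValueError):
            continue

        # Duplicate check: keep only the first row for each key.
        key = tuple(norm[name] for name in KEY_FIELDS)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(norm)
    return cleaned
```

Rows that fail a check are dropped silently here to keep the sketch short; in practice it is usually safer to log them so that extraction problems in the source PDFs can be traced.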
Data visualization is the visual representation of information. It plays a vital role in data analysis because graphs and charts communicate complex data clearly and concisely, which helps people understand and interpret it and supports better decision-making. In our case, we combined several visualizations into a Power BI report. **To prepare the data, we used the Power BI Power Query Editor for data cleaning, transformation, and merging, and Power BI Reports for visualizing the data.**

## System Architecture
![System_Diagram](https://github.com/rid17pawar/CollegeCutoffExplorer/assets/47048717/7cd77460-267e-47d1-8d46-25fc4ec89c67)

## Technologies Used-
### 1. Front end Technologies:
- Power BI Desktop
- Power Query Editor
### 2. Back end Technologies:
- Python libraries:
  - pdfplumber
  - openpyxl

## Snapshots-
Power BI Reports
![PowerBI_Report_TopN](https://github.com/rid17pawar/CollegeCutoffExplorer/assets/47048717/f0e1a132-0484-4e87-83e4-90f8665178e7)

![PowerBI_Report_Details](https://github.com/rid17pawar/CollegeCutoffExplorer/assets/47048717/b0b5802d-38d7-42f8-a88d-e896bbe72988)
### Thank You!