Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists by ManoharVit
A curated list of projects in awesome lists by ManoharVit.
https://github.com/manoharvit/movietl-data-pipeline
Built an ETL pipeline to handle 26 million user ratings and about 45,000 movies. The pipeline can potentially ingest 10,000 records per minute into AWS Redshift. Implemented a standardized data model and automated data quality checks using Airflow, contributing to a 97% success rate for regular ETL cycles.
Last synced: 19 Dec 2024
https://github.com/manoharvit/ecommerce-dive-deep-sales-analysis
In this project, we developed an ETL pipeline using Apache Airflow to process delivery data and track delayed shipments. The pipeline downloads data from an AWS S3 bucket, cleans it using Spark/Spark SQL to identify missing delivery deadlines, and uploads the cleaned dataset back to S3. This ensures efficient delivery performance tracking.
airflow airflow-dags ecommerce elt pyspark s3 s3-bucket spark sql
Last synced: 12 Oct 2024
https://github.com/manoharvit/clinical-skin-lesion-diagnosis
Evaluated custom CNN and VGG models for classifying seven distinct categories of skin lesions while addressing the challenge of data imbalance in medical datasets. Employed advanced machine learning techniques, including a comparative analysis of architectures, to enhance diagnostic accuracy in dermatology, contributing to early skin cancer detection.
Last synced: 08 Nov 2024
https://github.com/manoharvit/ecommerce-fashion-recommendation
Recommends 12 products relevant to each customer, based on past interactions and on customer and product metadata, including product description and appearance.
Last synced: 08 Nov 2024
https://github.com/manoharvit/manoharvit
Data analytics student pursuing a Master's at Northeastern University. As a technology enthusiast, I love working in a professional environment where I can contribute, build my analytical skills with the latest technologies, and work towards achieving organizational goals.
Last synced: 08 Nov 2024
https://github.com/manoharvit/crowd-sourced-mapping
Using crowdsourced satellite data and NDVI values, trained a multivariate machine learning model to classify land cover categories using 'max_ndvi' and other temporal data. Geospatial data processing using logistic regression and neural networks. Addressed class imbalance and the bias-variance tradeoff, and applied dimensionality reduction.
Last synced: 08 Nov 2024
https://github.com/manoharvit/data-driven-insights-of-elonmusk-tweets
Conduct a word frequency analysis and keyword network analysis of Elon Musk's tweets. Transform the extracted keyword information into a weighted adjacency matrix. Plot log-log graphs of word frequencies against ranks for each year to examine Zipf's law. Show bigram network graphs for each year.
Last synced: 08 Nov 2024
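The rank-frequency step described for data-driven-insights-of-elonmusk-tweets could be sketched roughly as follows. This is only an illustration under stated assumptions, not the repository's code: the function names `rank_frequency` and `log_log_points`, the simple regex tokenizer, and the sample text are all hypothetical.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def log_log_points(table):
    """Convert a rank-frequency table to (log10 rank, log10 count) pairs."""
    return [(math.log10(rank), math.log10(count)) for rank, _, count in table]


# Hypothetical sample text standing in for one year of tweets.
sample = "rocket launch today rocket landing rocket launch"
table = rank_frequency(sample)
points = log_log_points(table)
```

Under Zipf's law the points from a large corpus would fall close to a straight line with a slope near -1 on a log-log plot; a sample this small only shows the mechanics.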
https://github.com/manoharvit/personality-prediction-test
Career professionals and psychologists use this information in personality career tests for recruitment and candidate assessment. Accurate personality estimation is delicate and can be falsified to some extent by a seeker, as I did. Given that employment is frequently associated with significant fiscal and social benefits, job campaigners are incen
data-mining data-science data-visualization machine-learning
Last synced: 08 Nov 2024
https://github.com/manoharvit/motion-tracker-for-elderly-assisted-living
Leaving elderly people at home alone can cause anxiety for caretakers who are far away at their workplace. To lessen the caretaker's worries, a smart assisted-living ecosystem needs to be in place to enable the elderly to seek help in an emergency. In this work, we developed an elderly-friendly system platform capable of tracking the motion of the elderly within an indoor home environment.
Last synced: 08 Nov 2024
https://github.com/manoharvit/raymond-annual-report
Excel analysis on customer behavior and sales performance of Raymond Showroom Dataset 2022.
Last synced: 08 Nov 2024
https://github.com/manoharvit/budget-and-financials-of-mbta
Covers the MBTA, its role in public transportation, and how financial and operational data analysis may improve its services. Assesses the MBTA's revenue and expenditure and explores optimizing operations with advanced data management. Describes the analysis tools and methods, such as MySQL, NoSQL, and Python, and the report structure.
conceptual-modelling database mongodb nosql sql
Last synced: 08 Nov 2024
https://github.com/manoharvit/movie-recommendation-system-using-r
The Movie Recommendation System using R makes tailored movie suggestions via collaborative filtering. It profiles users based on ratings and movie genres. User-Based Collaborative Filtering (UBCF) creates suggestions by evaluating user similarity based on preferences. Users may search for movies by category and evaluate suggestions using metrics.
Last synced: 08 Nov 2024
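The user-based collaborative filtering idea in the movie-recommendation-system-using-r entry can be illustrated with a small sketch. The project itself is written in R; the Python below is not its code. It assumes cosine similarity over co-rated movies and a similarity-weighted average for predictions, and the names `cosine_similarity`, `predict`, `recommend`, and the toy ratings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users over the movies both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, user, movie):
    """Similarity-weighted average of other users' ratings for `movie`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[movie]
        weight_total += sim
    # None signals that no similar user has rated this movie.
    return weighted_sum / weight_total if weight_total > 0 else None


def recommend(ratings, user, top_n=3):
    """Rank the movies `user` has not rated by predicted rating."""
    seen = set(ratings[user])
    candidates = {m for r in ratings.values() for m in r} - seen
    scored = [(m, predict(ratings, user, m)) for m in candidates]
    scored = [(m, s) for m, s in scored if s is not None]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Hypothetical toy ratings on a 1-5 scale.
ratings = {
    "ana": {"Alien": 5, "Heat": 4},
    "ben": {"Alien": 5, "Heat": 4, "Up": 5, "Big": 1},
    "cy": {"Alien": 1, "Heat": 5, "Up": 2, "Big": 5},
}
recs = recommend(ratings, "ana")
```

Here "ana" rates the shared movies exactly as "ben" does, so ben's ratings dominate the predictions and "Up" ranks above "Big".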
https://github.com/manoharvit/digits-recognition-system-using-machinelearning
The approach and dataset determine the system's accuracy and efficiency. For handwritten digit recognition, this paper provides an overview of machine learning and deep learning techniques such as SVM (Support Vector Machine), CNN (Convolutional Neural Network), and KNN (K Nearest Neighbor). It compares the algorithms by accuracy and shows which is the most effective at digit recognition, so that the most accurate method with the fewest errors can be used in different handwritten digit recognition applications.
Last synced: 08 Nov 2024
https://github.com/manoharvit/sun-tracking-solar-panel
A solar panel attached to a motor can gather more solar energy by keeping the proper orientation toward the sun. The motor is electrically connected to the controller board. The system periodically scans for available solar energy from one horizon to the other. During the scan it checks which direction has the maximum incident solar energy, and hence where the sun is, and positions the solar panel in that direction. In this way the maximum power can be harnessed with the solar panel.
Last synced: 08 Nov 2024
https://github.com/manoharvit/social-media-sentiment-analysis
Evaluate six deep learning and classical machine learning models on the Affects in Tweets Dataset: CNN, Bidirectional GRU, LSTM, Logistic Regression, Support Vector Classifier, and a voting classifier. Achieve good overall accuracy through thorough model assessment and testing.
Last synced: 08 Nov 2024
https://github.com/manoharvit/facial-expression-detection
Processed 29,051 images, ensuring a balanced emotion distribution and achieving 87% real detection.
Last synced: 08 Nov 2024