Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by ManoharVit

A curated list of projects in awesome lists by ManoharVit.

https://github.com/manoharvit/movietl-data-pipeline

Crafted an ETL pipeline to handle 26 million user ratings and about 45,000 movies. The pipeline can potentially ingest data into AWS Redshift at a rate of 10,000 records per minute. Implemented a standardized data model and automated data quality checks using Airflow, contributing to a 97% success rate for regular ETL cycles.

Last synced: 19 Dec 2024

https://github.com/manoharvit/ecommerce-dive-deep-sales-analysis

In this project, we developed an ETL pipeline using Apache Airflow to process delivery data and track delayed shipments. The pipeline downloads data from an AWS S3 bucket, cleans it using Spark/Spark SQL to identify missing delivery deadlines, and uploads the cleaned dataset back to S3. This ensures efficient delivery performance tracking.

airflow airflow-dags ecommerce elt pyspark s3 s3-bucket spark sql

Last synced: 12 Oct 2024

https://github.com/manoharvit/clinical-skin-lesion-diagnosis

Evaluated custom CNN and VGG models for classifying seven distinct categories of skin lesions while addressing the challenge of data imbalance in medical datasets. Employed machine learning techniques, including a comparative analysis of architectures, to enhance diagnostic accuracy in dermatology and contribute to early skin cancer detection.

Last synced: 08 Nov 2024

https://github.com/manoharvit/ecommerce-fashion-recommendation

Recommends 12 products relevant to the customer, based on past interactions together with customer and product metadata, including product description and appearance.

Last synced: 08 Nov 2024

https://github.com/manoharvit/manoharvit

Data analytics student pursuing a Master's at Northeastern University. As a technology enthusiast, I love working in a professional environment where I can contribute, enrich my analytical skills with the latest technologies, and work towards achieving organizational goals.

Last synced: 08 Nov 2024

https://github.com/manoharvit/crowd-sourced-mapping

Using crowdsourced satellite data and NDVI values, trained a multivariate classification model to classify land cover categories from 'max_ndvi' and other temporal data. Processed the geospatial data with logistic regression and neural networks. Addressed class imbalance, the bias-variance tradeoff, and dimensionality reduction.

Last synced: 08 Nov 2024

https://github.com/manoharvit/data-driven-insights-of-elonmusk-tweets

Conducts a word frequency analysis and keyword network analysis of Elon Musk's tweets. Transforms the extracted keyword information into a weighted adjacency matrix. Plots log-log graphs of word frequency against rank for each year to compare with Zipf's law. Shows bigram network graphs for each year.

Last synced: 08 Nov 2024
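
The analysis steps named in the data-driven-insights-of-elonmusk-tweets entry can be illustrated with a minimal sketch. It is not taken from the repository: the function names, the tokenizer, and the sample text are assumptions made for illustration. It ranks words by frequency, estimates the slope of the log-log rank/frequency line (Zipf's law predicts a slope near -1), and builds an undirected, weighted adjacency matrix from bigram counts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def rank_frequency(tokens):
    """Return (rank, word, frequency) tuples, most frequent word first."""
    counts = Counter(tokens)
    return [(rank, word, freq)
            for rank, (word, freq) in enumerate(counts.most_common(), start=1)]


def zipf_slope(table):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank, _, _ in table]
    ys = [math.log(freq) for _, _, freq in table]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bigram_adjacency(tokens, vocab):
    """Symmetric matrix whose entry [i][j] counts how often words i and j are adjacent."""
    index = {word: i for i, word in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for left, right in zip(tokens, tokens[1:]):
        if left in index and right in index and left != right:
            matrix[index[left]][index[right]] += 1
            matrix[index[right]][index[left]] += 1
    return matrix


if __name__ == "__main__":
    sample = "mars rocket mars rocket launch"  # made-up text, not real tweet data
    tokens = tokenize(sample)
    table = rank_frequency(tokens)
    vocab = [word for _, word, _ in table]
    print(table)
    print(zipf_slope(table))
    print(bigram_adjacency(tokens, vocab))
```

Running the same functions once per year of tweets would give the per-year slopes and matrices; plotting is left out to keep the sketch dependency-free.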

https://github.com/manoharvit/personality-prediction-test

Career professionals and psychologists use this information in personality career tests for recruitment and candidate assessment. Accurate personality estimation is delicate and can be falsified to some extent by a seeker, as I did. Given that employment is frequently associated with significant fiscal and social benefits, job campaigners are incen

data-mining data-science data-visualization machine-learning

Last synced: 08 Nov 2024

https://github.com/manoharvit/motion-tracker-for-elderly-assisted-living

Leaving the elderly at home alone may cause anxiety for caretakers who are far away at their workplace. To lessen the caretaker's worries, a smart assisted living ecosystem needs to be in place to enable the elderly to seek help in case of emergency. In this work, we developed an elderly-friendly system platform capable of tracking the motion of the elderly within an indoor home environment.

Last synced: 08 Nov 2024

https://github.com/manoharvit/raymond-annual-report

Excel analysis on customer behavior and sales performance of Raymond Showroom Dataset 2022.

Last synced: 08 Nov 2024

https://github.com/manoharvit/budget-and-financials-of-mbta

Covers the MBTA, its role in public transportation, and how financial and operational data analysis may improve its services. Assesses the MBTA's revenue and expenditure and looks at optimizing operations with advanced data management. Describes the analysis tools and methods, such as MySQL, NoSQL, and Python, and the report structure.

conceptual-modelling database mongodb nosql sql

Last synced: 08 Nov 2024

https://github.com/manoharvit/movie-recommendation-system-using-r

The Movie Recommendation System using R makes tailored movie suggestions via collaborative filtering. It profiles users based on ratings and movie genres. User-Based Collaborative Filtering (UBCF) creates suggestions by evaluating user similarity based on preferences. Users may search for movies by category and evaluate suggestions using metrics.

Last synced: 08 Nov 2024
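
User-Based Collaborative Filtering, as named in the movie-recommendation-system-using-r entry, can be sketched as follows. The project itself is written in R; this Python version is only an illustration, and the function names, the cosine similarity measure, the neighbor count `k`, and the toy ratings are all assumptions rather than details from the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dicts (missing ratings count as 0)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[movie] * b[movie] for movie in shared)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b)


def recommend(target, ratings, k=2, n=3):
    """Score unseen movies by the similarity-weighted ratings of the k nearest users."""
    mine = ratings[target]
    # Rank the other users by similarity and keep the k most similar, positive ones.
    neighbors = sorted(
        ((cosine_similarity(mine, theirs), user)
         for user, theirs in ratings.items() if user != target),
        reverse=True,
    )
    neighbors = [(sim, user) for sim, user in neighbors[:k] if sim > 0]

    weighted, weights = {}, {}
    for sim, user in neighbors:
        for movie, rating in ratings[user].items():
            if movie in mine:
                continue  # only suggest movies the target has not rated
            weighted[movie] = weighted.get(movie, 0.0) + sim * rating
            weights[movie] = weights.get(movie, 0.0) + sim

    scores = [(movie, weighted[movie] / weights[movie]) for movie in weighted]
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:n]


if __name__ == "__main__":
    # Toy ratings invented for the example.
    ratings = {
        "ann": {"Alien": 5, "Heat": 4},
        "bob": {"Alien": 5, "Heat": 4, "Up": 5},
        "cat": {"Amelie": 5, "Up": 1},
    }
    print(recommend("ann", ratings))
```

Here "cat" shares no rated movie with "ann", so only "bob" acts as a neighbor and only his unseen movie is suggested.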

https://github.com/manoharvit/digits-recognition-system-using-machinelearning

The approach and dataset determine the system's accuracy and efficiency. For handwritten digit recognition, this paper provides an overview of machine and deep learning techniques such as SVM (Support Vector Machine), CNN (Convolutional Neural Network), and KNN (K Nearest Neighbor). It shows which algorithm is the most effective at digit recognition and compares the algorithms by accuracy, so that the most accurate method with the fewest errors can be used in handwritten digit recognition applications.

Last synced: 08 Nov 2024

https://github.com/manoharvit/sun-tracking-solar-panel

A solar panel attached to a motor can be turned toward the sun, and in the proper orientation it gathers more solar energy. The motor is electrically connected to the controller board. The system periodically checks the availability of solar energy from one horizon to the other. During the scan, it determines which direction has the maximum incident solar energy, and hence where the sun is, and positions the solar panel in that direction. In this way, the maximum power can be harnessed with the solar panel.

Last synced: 08 Nov 2024
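
The horizon-to-horizon scan described in the sun-tracking-solar-panel entry can be sketched as a simple sweep-and-pick loop. This is a hedged illustration, not the project's controller code: `scan_for_sun`, the `SimulatedPanel` class, the 0 to 180 degree range, and the 10 degree step are all hypothetical, and the light sensor is replaced by a simulation so the example runs without hardware.

```python
def scan_for_sun(move_to, read_light, start=0, stop=180, step=10):
    """Sweep the panel across the sky, then park it at the brightest angle found."""
    best_angle = start
    best_level = float("-inf")
    for angle in range(start, stop + 1, step):
        move_to(angle)        # rotate the motor to this angle
        level = read_light()  # sample incident light at this orientation
        if level > best_level:
            best_angle, best_level = angle, level
    move_to(best_angle)       # final position: direction of maximum light
    return best_angle


class SimulatedPanel:
    """Stand-in for the motor and light sensor; brightness peaks at sun_angle."""

    def __init__(self, sun_angle):
        self.sun_angle = sun_angle
        self.angle = 0

    def move_to(self, angle):
        self.angle = angle

    def read_light(self):
        # Brightness falls off linearly with angular distance from the sun.
        return 1000 - abs(self.angle - self.sun_angle)


if __name__ == "__main__":
    panel = SimulatedPanel(sun_angle=120)
    print(scan_for_sun(panel.move_to, panel.read_light))
```

On real hardware the two callables would wrap the motor driver and the sensor read, and the scan would be repeated on a timer to follow the sun through the day.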

https://github.com/manoharvit/social-media-sentiment-analysis

Evaluates six models on the Affects in Tweets Dataset, spanning deep learning and classical approaches: CNN, Bidirectional GRU, LSTM, Logistic Regression, Support Vector Classifier, and a voting classifier. Reports peak and overall accuracy from thorough model assessment and testing.

Last synced: 08 Nov 2024

https://github.com/manoharvit/facial-expression-detection

Processed 29,051 images, ensuring balanced emotion distribution, and achieved 87% real detection.

Last synced: 08 Nov 2024