Projects in Awesome Lists by CyprianFusi
A curated list of projects in awesome lists by CyprianFusi.
https://github.com/cyprianfusi/frauddetectionmodel-with-gretl
With this model, the backlog and the staff needed to do the job would be reduced significantly, processing time would be shortened, and more fraudulent transactions would be tracked down in a given amount of data processed: a more than 40% increase in efficiency!
adaboostclassifier backlog cap-curve imbalanced-data logistic-regression
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/image_classification_using_gtsrb
Image classification on the German Traffic Sign Recognition Benchmark (GTSRB) using TensorFlow 2.0
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/chat-with-your-files_rag-app
Welcome to Chat with your Files, a powerful and user-friendly RAG interface that lets you upload documents, process them with AI, and interactively ask questions, receiving context-aware, source-attributed answers in real time.
frontend javascript langchain llama-index llm nextjs rag vector-database
Last synced: 05 Oct 2025
https://github.com/cyprianfusi/using-our-sql-skills-to-answer-business-questions
Using complex SQL queries, including PostgreSQL window functions, to answer specific business questions, notably with multiple named subqueries and views that extract data from a database to address specific problems.
pandas-python postgresql sql sqlite3 window-functions-in-sql
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/building-a-food-ordering-application-in-python
An example of the functional programming paradigm, with separation of concerns in the display!
functional-programming python3 separation-of-concerns
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/developing-a-dynamic-ai-chatbot-in-python
Creating an AI chatbot that can take on different personas, keep track of conversation history, and provide coherent responses.
generative-ai large-language-models llm-prompting oops-in-python openai togetherai
Last synced: 12 Jun 2025
https://github.com/cyprianfusi/russia-2016-us-elections
Analysing the Impact of Russian Tweets on the 2016 US Presidential Elections
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/predicting-heart-disease-using-k-nearest-neighbours
Up to 90% accuracy with just 5 features, using the KNN algorithm and PCA for feature engineering. The dataset contained fewer than 1000 observations. The model's accuracy could be improved with more observations, further hyperparameter optimization, and additional feature engineering.
cap-curve feature-engineering feature-selection heart-disease knn-classifier pca pca-analysis
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/credit-card-customer-segmentation-using-k-means-algorithm
You are a data scientist working for a credit card company. You're asked to help segment a dataset containing information about the company’s clients into different groups to enable the company to apply different business strategies for each type of customer.
customer-segmentation kmeans-clustering pandas-python unsupervised-machine-learning
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/train-lenet5-for-inference
In this exercise we will train a network similar to LeNet-5 using TensorFlow 2.0 and use it for inference.
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/world-happiness-report-for-2015-2019
World Happiness Report for 2019, with strange and unexpected results for Sub-Saharan African countries! But it's the data speaking...
data-visualization pandas-python
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/your-chances-of-winning-a-lottery-jackpot-is-negligible
No reasonable number of tickets would meaningfully increase your chances of winning a lottery. Stop the addiction!
Last synced: 04 Oct 2025
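To put a number on the claim above, here is a minimal sketch of the jackpot odds when buying several distinct tickets. The 6-of-49 format and the function name `jackpot_probability` are assumptions chosen for illustration; they are not taken from the repository.

```python
import math


def jackpot_probability(tickets: int, pool: int = 49, picks: int = 6) -> float:
    """Chance that at least one of `tickets` distinct tickets hits the jackpot.

    Assumes a hypothetical lottery in which `picks` numbers are drawn without
    replacement from `pool` numbers, and every ticket is a different combination.
    """
    if tickets < 0:
        raise ValueError("tickets must be non-negative")
    combinations = math.comb(pool, picks)  # number of equally likely draws
    return min(tickets, combinations) / combinations


if __name__ == "__main__":
    for n in (1, 100, 10_000):
        print(f"{n:>6} tickets: {jackpot_probability(n):.8f}")
```

Under these assumptions, even 10,000 distinct tickets leave the chance of a jackpot below 0.1%.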
https://github.com/cyprianfusi/kaggle_bulldozer_price_prediction
Validation RMSLE obtained: 0.21163, which is lower than the RMSLE score (0.22909) that won the Kaggle competition.
bulldozer-price-prediction feature-engineering feature-selection pca random-forest-regressor rmsle
Last synced: 21 Mar 2025
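For readers unfamiliar with the metric quoted above, RMSLE (root mean squared logarithmic error) can be sketched as follows. This is a generic implementation of the standard formula, not code from the repository; the function name `rmsle` and the sample prices are made up for illustration.

```python
import math
from typing import Sequence


def rmsle(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared logarithmic error between two equal-length sequences.

    Uses log(1 + x) so that zero values are allowed; inputs must be >= 0.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


if __name__ == "__main__":
    # Hypothetical sale prices, for illustration only.
    true_prices = [10_000.0, 25_000.0, 60_000.0]
    model_prices = [12_000.0, 22_000.0, 65_000.0]
    print(f"RMSLE: {rmsle(true_prices, model_prices):.5f}")
```

Because the error is taken on a log scale, the metric penalizes relative rather than absolute differences, which suits prices spanning several orders of magnitude.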
https://github.com/cyprianfusi/sentiment-predictions-using-lstm-gru
LSTM and GRU models for sentiment predictions
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/validating_ml_models
This repo will, in due time, contain notebooks with exercises demonstrating various methods of validating ML models.
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/cyprianfusi
Config files for my GitHub profile.
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/predicting-heart-disease-using-logistic-regression-classification-algorithm
A precision of 86%, with the model's CAP curve showing an accuracy of 100%! This means it is capable of correctly identifying 100% of patients with heart disease after processing 50% of the data. The model's performance is "too good to be true"! However, with train accuracy of 86% and test accuracy of 82%, there is no visible sign of overfitting.
cap-curve false-negative false-positive heart-disease logistic-regression precision type-1-and-type-2-errors
Last synced: 21 Mar 2025
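The CAP reading described above (the share of positive cases found after processing the top 50% of cases ranked by model score) can be sketched as below. The function name `cap_capture_rate` and the toy labels and scores are hypothetical; the repository may compute its CAP curve differently.

```python
import math
from typing import Sequence


def cap_capture_rate(
    labels: Sequence[int], scores: Sequence[float], fraction: float = 0.5
) -> float:
    """Share of all positives found in the top `fraction` of cases by score.

    `labels` holds 1 for a positive case and 0 otherwise; `scores` holds the
    model's predicted probability (or any ranking score) for each case.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive case is required")
    # Rank cases from most to least likely positive according to the model.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cutoff = math.ceil(fraction * len(ranked))
    captured = sum(label for _, label in ranked[:cutoff])
    return captured / total_positives


if __name__ == "__main__":
    toy_labels = [1, 0, 1, 0, 0, 1]
    toy_scores = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7]
    print(cap_capture_rate(toy_labels, toy_scores, fraction=0.5))
```

A value of 1.0 at `fraction=0.5` corresponds to the "100% after processing 50% of the data" reading in the description.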
https://github.com/cyprianfusi/app-profiles-for-the-app-store-and-google-play-markets-using-pandas
The goal is to analyze data to help our developers understand what types of apps are likely to attract more users.
data-wrangling pandas python3 visualization
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/ai-for-sentiment-analysis
Applying AI for sentiment analysis with the IMDB dataset using TensorFlow 2.0
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/applied_stats_in_feature_engineering
Hypothesis testing and feature engineering in machine learning
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/uk-covid-19-data-via-opendata-api
Includes a recommendation to the UK government to halt all mandatory testing: tests should only be conducted on patients as part of diagnosis and treatment. The reasoning is that, when prevalence of the disease is low, most positive test results are false positives because of irreducible error in the test.
api covid-19 data-visualization pandas-python uk
Last synced: 21 Mar 2025
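The base-rate argument above follows from Bayes' theorem and can be sketched as follows. The prevalence, sensitivity, and specificity values are purely hypothetical inputs chosen for illustration; they are not figures from the repository or from any real test.

```python
def positive_predictive_value(
    prevalence: float, sensitivity: float, specificity: float
) -> float:
    """Probability that a person who tests positive actually has the disease."""
    for name, value in (
        ("prevalence", prevalence),
        ("sensitivity", sensitivity),
        ("specificity", specificity),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    if true_positives + false_positives == 0.0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical inputs: 0.1% prevalence, 95% sensitivity, 99% specificity.
    ppv = positive_predictive_value(0.001, 0.95, 0.99)
    print(f"Share of positive results that are true positives: {ppv:.1%}")
```

With these assumed inputs, fewer than one in ten positive results would be a true positive, even though the test itself is wrong only rarely.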
https://github.com/cyprianfusi/new-york-city-public-schools-and-sat-scores
One of the most controversial issues in the U.S. educational system is the efficacy of standardized tests and whether they're unfair to certain groups. We could correlate SAT scores with factors like race, gender, income, and more.
data-analysis-python data-cleaning data-visualization data-wrangling
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/building-a-word-raider---an-interactive-word-guessing-game
Applying recursive function calls, the separation of concerns principle, and functional programming to build a word-guessing game.
functional-programming game-development python3 recursive-functions separation-of-concerns
Last synced: 01 Jul 2025
https://github.com/cyprianfusi/predicting-listing-gains-in-the-indian-ipo-market-using-tensorflow
A test accuracy of 75% with just 319 observations is encouraging but not good enough. With more observations and further tweaking, a much higher accuracy could be achieved. Listing gains are the percentage increase in a company's share price from its IPO issue price on the day of listing.
binary-classification deep-learning keras neural-networks python3 regularization tensorflow
Last synced: 21 Mar 2025
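The definition of listing gains given above translates directly into a one-line formula. This is a minimal sketch; the function name `listing_gain_pct` and the sample prices are hypothetical, and which listing-day price is used (for example open or close) is an assumption left to the caller.

```python
def listing_gain_pct(issue_price: float, listing_day_price: float) -> float:
    """Percentage change from the IPO issue price to the listing-day price."""
    if issue_price <= 0:
        raise ValueError("issue_price must be positive")
    return (listing_day_price - issue_price) / issue_price * 100.0


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    print(f"{listing_gain_pct(100.0, 120.0):.1f}%")
    print(f"{listing_gain_pct(250.0, 225.0):.1f}%")
```

A negative result indicates that the share traded below its issue price on the day of listing.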
https://github.com/cyprianfusi/multi-class-classification-on-stack-overflow-questions
A naive approach to multiclass text classification of Stack Overflow questions, with almost 80% accuracy!
keras-tensorflow multiclass-classification nlp-machine-learning pandas-dataframe python3 tensorflow2 wrangling
Last synced: 29 Mar 2025
https://github.com/cyprianfusi/data-scientist-technical-exercise-10ds
Includes recommendations to the UK Department for Education on 10 Local Authorities where the National Tutoring Programme (NTP) should be intensified, and a response to the UK Secretary of Health regarding a 76% Accident and Emergency (A&E) performance target, which seems far-fetched.
data-analysis data-cleaning data-visualization hypothesis-testing pandas-python policy statistics
Last synced: 21 Sep 2025
https://github.com/cyprianfusi/market-basket-analysis-and-customers-segmentation
Data mining, market basket analysis and customer segmentation
apriori-algorithm association-rule-mining customer-segmentation data-mining market-basket-analysis
Last synced: 02 Aug 2025
https://github.com/cyprianfusi/pdf-fastapi-backend
Chat seamlessly with multiple PDF files in the AWS cloud.
Last synced: 16 Aug 2025
https://github.com/cyprianfusi/advanced-rag-chat-app
A production-ready, full-stack Retrieval-Augmented Generation (RAG) application that enables intelligent conversations with your documents using advanced AI models. Upload files, ask natural language questions, and receive real-time, AI-generated responses complete with clear source attributions.
fastapi langchain llms nodejs pgvector postgresql rag rag-chatbot react
Last synced: 25 Aug 2025
https://github.com/cyprianfusi/binati-multi-agents-research
An intelligent research chatbot built with Streamlit that leverages multiple data sources to provide comprehensive answers to your questions. The application uses LangChain agents powered by Groq's Llama model to search through ArXiv papers, Wikipedia articles, and the web.
arxiv-api groq langchain ollama python streamlit wikipedia-api
Last synced: 30 Dec 2025
https://github.com/cyprianfusi/youtube-website-summarizer
A Streamlit web application that uses LangChain and Groq AI to automatically summarize content from YouTube videos and websites. Get concise 300-word summaries of any video or article with just a URL!
generative-ai langchain llm python3 streamlit youtube youtube-api
Last synced: 30 Dec 2025
https://github.com/cyprianfusi/predict-fuel-efficiency-using-linear-regression-with-tensorflow
This project demonstrates the power of introducing non-linearity in neural network models to capture relevant patterns in data
Last synced: 02 Mar 2025
https://github.com/cyprianfusi/kaggle-multi-classs-dog-breed-classification-using-transfer-learning
An end-to-end multi-class classification model for classifying dog breeds using a pre-trained model
classification computer-vision geforce gpu mobilenetv2 multiclass-classification python3 tensorflow2 transfer-learning
Last synced: 29 Dec 2025
https://github.com/cyprianfusi/todo-backend-fastapi
Backend of a to-do app built with FastAPI and PostgreSQL
Last synced: 15 Jul 2025
https://github.com/cyprianfusi/rendering-a-dynamic-ai-chatbot-using-streamlit
OpenAI Large Language Model (LLM) powered Chatbot using Streamlit
generative-ai large-language-models llm-prompting openai streamlit togetherai
Last synced: 15 Jul 2025
https://github.com/cyprianfusi/complete-statistical-hypothesis-tests
Complete Statistical Hypothesis Test, built on real-world data, is a blueprint for hypothesis testing! It covers almost all commonly used hypothesis tests.
hypothesis-testing pandas-python statistics visualization
Last synced: 21 Mar 2025
https://github.com/cyprianfusi/scraping-ip-adresses-from-wiki-revision-history
Crawling and Scraping IP Addresses from Wikipedia Revision History Pages
Last synced: 21 Mar 2025