https://github.com/munas-git/munas-git.github.io

My data science portfolio

# Data Scientist | Machine Learning Researcher.
[LinkedIn](https://www.linkedin.com/in/einstein-ebereonwu) **|** [X (FKA Twitter)](https://x.com/einsteinmuna) **|** [YouTube](https://www.youtube.com/@einstein-munachi)
> Innovative Data Scientist and expert in Python, SQL, PySpark, and data visualisation. Specialised in machine learning, NLP (BERT, RoBERTa, GPT), and Computer Vision (YOLO, ResNet, Diffusion Models), and experienced in using vector databases for efficient similarity search and retrieval. Passionate about learning and leveraging cutting-edge AI technologies, including transformers and generative models. Skilled in building scalable, distributed solutions for complex data problems using Spark, driving insights and decision-making.

## Skills
**Databases**: MySQL, PostgreSQL, MongoDB, ChromaDB
**Programming & Data Processing**: Python, SQL, PySpark, Hadoop.
**API & MLOps**: Flask, FastAPI, Streamlit, Docker, AWS, Azure, GCP, Render.
**Data Visualisation & Analytics**: Tableau, PowerBI, Matplotlib, Plotly, Seaborn.
**Productivity Tools**: MS Suite (Word, Excel, PowerPoint, Outlook), Jira, Slack, Teams.
**Machine Learning & AI**: PyTorch, Scikit-Learn, YOLO, Roboflow, Transformers, NLP, Computer Vision.
**Non-Technical**: Communication, attention to detail, self-motivation, collaboration, eagerness to learn.

## Education
- **M.S., Data Science and Analytics | Royal Holloway University of London**
Grade: 1st (First Class Honors)
Dissertation: Deep Learning-based Medical Image Segmentation using DeepLabV3+ equipped with an attention mechanism.

- **B.S., Software Engineering | Babcock University, Nigeria.**
Grade: 2:1 (Second-Class Honors, First Division.)
Dissertation: Text summarisation, topic modelling, and language detection web application with GPT and Decision Trees.

## Research Experience
**Optimising Medical Image Segmentation Through Attention Mechanisms and Custom Loss Functions @ Royal Holloway University of London.
(June 2024 - Aug 2024)**
***Highlight*** - Intensive two-month-long computer vision research focused on improving the accuracy of breast cancer, skin lesion, and lung segmentation.

Developed and applied advanced data preprocessing and augmentation techniques to enhance model performance and increase dataset diversity. Integrated CBAM and SEBlock attention mechanisms into the DeepLabV3+ architecture using PyTorch, significantly improving segmentation accuracy. Proposed and tested a new DiceBCE loss function, boosting model convergence and performance. Conducted in-depth analysis, demonstrating improvements in segmentation accuracy for breast cancer, skin lesions, and lung images. Currently preparing a research paper for publication, focusing on the impact of attention mechanisms in medical image segmentation.

- **Tools & Tech:** Python, PyTorch, DeepLabV3+, Attention Mechanisms, CBAM, SEBlock, DiceBCE Loss Function, Data Augmentation, Medical Image Segmentation
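
The DiceBCE loss mentioned above combines binary cross-entropy with a Dice overlap term. The exact formulation used in the research is not stated here, so the following is only a minimal, framework-free sketch under assumptions: the two terms are summed with equal weight, the smoothing constant defaults to 1.0, and inputs are flattened lists of per-pixel probabilities and 0/1 labels. The name `dice_bce_loss` and its parameters are hypothetical; the research itself used PyTorch.

```python
import math


def dice_bce_loss(probs, targets, smooth=1.0, eps=1e-7):
    """Illustrative DiceBCE loss: mean binary cross-entropy plus Dice loss.

    probs   -- flattened per-pixel foreground probabilities in [0, 1]
    targets -- flattened per-pixel ground-truth labels (0 or 1)
    smooth  -- assumed smoothing constant that avoids division by zero
    eps     -- clamp applied to probabilities so log() stays finite
    """
    if not probs or len(probs) != len(targets):
        raise ValueError("probs and targets must be non-empty and equal in length")

    # Binary cross-entropy, averaged over all pixels.
    bce = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        bce -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    bce /= len(probs)

    # Dice coefficient measures overlap between prediction and mask.
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2.0 * intersection + smooth) / (sum(probs) + sum(targets) + smooth)

    # Assumption: equal-weight sum of the two terms.
    return bce + (1.0 - dice)
```

The BCE term penalises each pixel independently, while the Dice term rewards overall overlap, which can help when the foreground region is small relative to the image.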

## Work Experience
**Data Scientist @ Vault Hill (_August - October 2023_)**
Collaborated on AI and data-driven initiatives, optimising workflows and delivering measurable business outcomes. By implementing scalable pipelines and fostering collaboration through clear documentation and insight-sharing, I ensured alignment between technical solutions and organisational goals while driving innovation and success.
- **Tools & Tech:** Python, Llama-2, PyTorch, LangChain, AWS (S3, Lambda, SageMaker), ETL Pipelines, Custom Web Crawlers, Regex, Data Visualisation Tools.

**Data Analyst & Backend Developer @ SunFi (_January - June 2022_)**
Optimised system performance and streamlined workflows by implementing automation and scalable infrastructure solutions. Efforts included designing efficient database schemas, developing Slack bot templates, and integrating APIs for bug reporting, resulting in improved operational efficiency and team collaboration.

- **Tools & Tech:** Python, PostgreSQL, Docker, API Integration, Automation Tools, Jira, Slack Bot Templates, Database Schema Design.

## Projects
### [LookOutAI](https://github.com/munas-git/LookOutAI) (Multi-modal - Computer Vision & NLP)
LookOutAI is a sophisticated image recognition tool designed to enhance security and privacy by identifying individuals in photos or videos using advanced AI technology. It provides detailed descriptions of their actions or behaviour and offers versatile features such as selectively blurring faces to ensure privacy or un-censoring specific targets for clarity. Ideal for security applications, LookOutAI enables law enforcement or security teams to process video evidence with precision while safeguarding the privacy of uninvolved individuals.
- **Tools & Tech:** Python, NumPy, OpenCV, Pixtral AI, Embeddings, Gradio.

![System Output Sample](https://github.com/user-attachments/assets/b549c210-9169-4d02-ad70-232d9c8f793c)

### [Robust Medical Image Segmentation - DeepLabV3+ with Attention]() (Computer Vision)

Extensive research on optimising the DeepLabV3+ architecture for medical image segmentation. Through rigorous experimentation, I developed enhanced variants of DeepLabV3+ by integrating attention mechanisms, specifically the Convolutional Block Attention Module (CBAM) and Squeeze-and-Excitation (SEBlock) attention, applied in parallel within the encoder. These attention-equipped models consistently outperformed the base DeepLabV3+ and other published research works across multiple medical imaging datasets, including breast cancer ultrasound, lung X-rays, and ISIC 2017 skin lesions.
- **Tools & Tech:** Python, PyTorch, Attention Mechanisms, DeepLabV3+.

![Lung Segmentation Predictions](https://github.com/user-attachments/assets/d6e3591d-69e9-4afd-a603-8ab9e343b82c)
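
To illustrate the Squeeze-and-Excitation idea named in the project description, here is a rough, framework-free sketch of channel attention. It is not the project's implementation (which used PyTorch inside the DeepLabV3+ encoder): the function name `se_channel_attention`, the plain nested-list tensor layout, and the omission of bias terms are all assumptions made for brevity.

```python
import math


def se_channel_attention(feature_map, w_reduce, w_expand):
    """Illustrative Squeeze-and-Excitation channel reweighting.

    feature_map -- list of C channels, each an H x W nested list of floats
    w_reduce    -- (C // r) x C weight matrix for the bottleneck layer
    w_expand    -- C x (C // r) weight matrix restoring the channel count
    Returns (scaled_feature_map, gates). Bias terms are omitted (assumption).
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce
    ]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]

    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates
```

Each gate lies between 0 and 1, so channels judged less informative are suppressed while the spatial layout of every channel is left unchanged.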

### [Automated Attention Tracking and Reporting system](https://github.com/munas-git/Automated-Attention-Tracking-And-Reporting) (Computer Vision)

The Automated Attention Tracking and Reporting system is a sleek AI-driven dashboard that monitors and reports live distraction levels during classes, meetings, or lectures. It uses computer vision techniques to help educators and facilitators enhance focus and attention during sessions.
- **Tools & Tech:** Python, Roboflow, Streamlit.

![Distracted Image](https://github.com/user-attachments/assets/f0ee09cf-9c82-4b51-8b1f-d844f4f34ebe)

### [Extractive and Abstractive Document/Text Summarising System.](https://github.com/munas-git/text-summarization-webapp) (NLP)

The document summarising system developed as my final year project for a B.Sc. in Software Engineering at Babcock University goes beyond basic summarisation. It accurately predicts and suggests the top 2 topics for any text input, detects the language of the text and features text-to-speech functionality to read summaries aloud. Additionally, users can download their summaries as Word documents (.docx).
- **Tools & Tech:** Python, OpenAI, Scikit-learn, Decision Trees, Pandas, NumPy.

![214431479-aae08584-b96e-4934-a205-45a315d6cb94](https://github.com/user-attachments/assets/5adf37f8-d659-4a9e-b871-197beec621ab)

## Certifications (Proof of continuous learning)
- [Introduction to Cloud 101](https://www.credly.com/badges/a5d61cc6-ec06-4fdb-9cba-53d0166328c0/linked_in_profile) By [AWS](https://www.credly.com/organizations/amazon-web-services/badges)
- [Deep Learning Specialisation](https://www.coursera.org/account/accomplishments/specialization/certificate/PQ2Z3UR2CLUK) By [DeepLearning.AI](https://www.deeplearning.ai/)
- [Data Analytics Specialisation](https://www.coursera.org/account/accomplishments/specialization/certificate/8H5L372MDYLF) By [Google](https://grow.google/intl/uk/enroll-certificates/?utm_source=google&utm_medium=paidsearch&utm_campaign=ha-sem-bk-gen-exa__geo%E2%80%94UK&utm_term=google%20training%20classes&gclsrc=aw.ds&gad_source=1&gclid=CjwKCAjwreW2BhBhEiwAavLwfIqhzCRpXvxMO1WxYMrOfN5tIqhgttybbjv_kbPGMfwRXJXAqtjClBoCM-UQAvD_BwE)

## Blogs
- [REST API Implementation in Python for Model Deployment: Flask and FastAPI.](https://medium.com/@einsteinmunachiso/rest-api-implementation-in-python-for-model-deployment-flask-and-fastapi-e80a6cedff86)
- [An exposé on Retrieval-Based ChatBot.](https://medium.com/@einsteinmunachiso/building-an-ai-chatbot-in-python-retrieval-based-chatbot-9c6c7f3ef6bf)
- [Saving Your Machine Learning Model In Python: pickle.dump()](https://medium.com/mlearning-ai/saving-your-machine-learning-model-in-python-pickle-dump-b01ae60a791c)
- [Web Scraping with MS Excel and Python: Static Site Contents.](https://medium.com/@einsteinmunachiso/web-scraping-with-ms-excel-and-python-static-site-contents-4903ea08b85)
- ➡️ [More Data Science Blogs](https://medium.com/@einsteinmunachiso)

## Volunteering
- **YOLOvX app beta version tester @ [YolovX](https://yolovx.com/)**
**(_September 2024 - Present_)**
I am currently running tests on the app's beta version and will have more to say soon...

- **Logistics Coordinator @ TEDx [Royal Holloway.](https://www.linkedin.com/company/tedx-royal-holloway/)**
**(_October 2023 - September 2024_)**

Managed four bake sales that contributed to event funding and, working with a 13-member team on event planning and logistics, coordinated the attendance of over 100 participants for a TEDx talk that garnered over 41,000 cumulative YouTube views.

- **Data Scientist @ [HeatGeek](https://www.heatgeek.com/) Hackathon.**
**(_June 15th - 16th 2024_)**

I volunteered my data and AI skills at a two-day AI hackathon, where my team won the Innovation Prize. We built an AI tool to help homeowners and installers assess heat pump locations using satellite imagery and API data. Huge thanks to everyone at Heat Geek for hosting such an inspiring event!

- **Network Member @ [EPMS Royal Holloway University.](https://intranet.royalholloway.ac.uk/students/information-hub/academic/school-of-engineering-physical-and-mathematical-sciences.aspx)**
**(_October 2023 - September 2024_)**

Advocated for student voices by actively sharing opinions and contributing to meaningful discussions, ensuring that the concerns and ideas of fellow students were heard and addressed.

- **Pioneer Lead Data Scientist @ [GDSC Babcock University.](https://www.linkedin.com/company/gdsc-babcock/posts/?feedView=all)**
**(_September 2022 - August 2023_)**

As the GDSC Lead for the Data Science track, I developed a comprehensive training framework that achieved significant learning outcomes. I introduced over 300 students to Python Programming and Object-Oriented Programming (OOP) through regular in-person and online sessions. I enhanced practical skills by organising coding sessions focused on data cleaning, visualisation, machine learning model development, evaluation, and regularisation.

- **Volunteer Data Science Bootcamp Teacher @ [GDSC](https://www.linkedin.com/company/gdsc-iet-lucknow/) Nigeria.**
**(_February 2023 - March 2023_)**

I successfully led a 3-week boot camp tailored for Sub-Saharan Africa, motivating students in the tech industry with valuable advice. I also curated and shared resources to boost proficiency in data-related skills, providing continuous support to ensure a thorough understanding of the material.

- **2x Student Representative (Senator) @ [Babcock University Student's Association.](https://www.linkedin.com/company/babcock-university-students-association/)**
**(_September 2021 - August 2023_)**

I effectively resolved peer concerns by skilfully advocating their needs to relevant authorities, ensuring swift and impactful outcomes. I strengthened the school's communication efforts by guaranteeing the timely and accurate dissemination of critical information to the student body, while also managing multiple emergencies with composure and efficiency, prioritising the safety and well-being of everyone involved. Additionally, I contributed to enhancing students' overall quality of life through active engagement in various enrichment initiatives.

- **BUCC Network Member @ [Babcock University Computer Club.](https://www.linkedin.com/company/bucc-official/posts/?feedView=all)**
**(_September 2019 - June 2023_)**

During my tenure, I engaged in numerous impactful initiatives, one of the highlights being my key role in orchestrating a highly successful career fair in 2023. As a vital member of the planning committee, I helped bring in distinguished speakers and leading organisations within the tech industry. Additionally, I enhanced the event by contributing valuable insights as a panellist during discussions.

## Interests and Activities
### AI Research and Trends.
I am dedicated to staying at the forefront of artificial intelligence. I do this by regularly reading research papers and following industry trends through newsletters such as TechCrunch and TL;DR, as well as other popular tech blogs and company X and LinkedIn accounts.

### Teaching
Passionate about teaching and sharing knowledge, I actively seek opportunities to educate others through small group meetings, workshops, and voluntary initiatives. I believe in fostering a collaborative learning environment, whether within my company or through community engagement, to inspire growth and empower individuals to reach their full potential.

### Fitness and Working Out
I enjoy exploring various training methods and maintaining a balanced lifestyle through activities such as running, working out, and occasional meal prep.

### Hackathons
I enjoy hackathons. I recently took part in a two-day AI hackathon at Heat Geek, where my team won the Innovation Prize for developing an AI tool that helps homeowners and installers assess heat pump locations using satellite imagery, an internal company API, and the OpenAI API.

References available on request.