An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by dmarks84

A curated list of projects in awesome lists by dmarks84.

https://github.com/dmarks84/coursework_project_banks-web-scraping-sql

Project for IBM Data Engineering & Python course on ETL & Big Data -- Scraped website data and made API calls for additional data; wrangled and transformed this data and loaded it into a SQL database.

apis beautifulsoup databases elt etl nosql numpy pandas pipelines python sql sqlite web-scraping

Last synced: 10 Apr 2026

https://github.com/dmarks84/coursework_project_airfoil-noise-prediction

Project for IBM Data Engineering & Python course on ML & AI -- Predicted the noise of an airfoil based on various physical features

apache-spark api automation data-modeling etl linear-algebra numpy pandas pipelines python regression statistics supervised-ml

Last synced: 13 Apr 2026

https://github.com/dmarks84/ind_project_california-housing-data--kaggle

Independent Project - Kaggle Dataset -- I worked on the California Housing dataset, performing data cleaning and preparation; exploratory data analysis; feature engineering; regression model building; and model evaluation.

cross-validation data-modeling data-reporting data-visualization eda folium grid-search matplotlib model-evaluation numpy pandas pca python seaborn sklearn statistics supervised-ml unsupervised-ml

Last synced: 08 Apr 2026

https://github.com/dmarks84/coursework_coursework_project_automobile-sales-visualization

Project for IBM Data Science course on Visualization & Dashboards -- Analyzed historical sales data, performing EDA and setting up an interactive dashboard

communication dash dashboards data-modeling elt etl folium matplotlib numpy pandas pipelines plotly python scipy seaborn visualization

Last synced: 10 Apr 2026

https://github.com/dmarks84/coursework_project_ml-classifier-eval-selection

Project for University of Michigan Applied Data Science Specialization -- Predicted viewer engagement based on features related to video metrics; evaluated a large set of classifiers under different scoring metrics to select the "optimal" one.

classification cross-validation data-modeling data-reporting data-visualization databases dataframes eda grid-search matplotlib numpy pandas python scikit-learn statistics supervised-ml

Last synced: 02 Apr 2026

https://github.com/dmarks84/coursework_project_data-analysis-apache-spark

Project for IBM Data Engineering & Python course on ETL & Big Data -- Read in data, wrote to SQL database and performed queries, performed statistical analysis and issued reports

apache-sprk automation dag data-modeling eda elt etl numpy pandas pipelines python sql statistics visualization

Last synced: 11 Apr 2026

https://github.com/dmarks84/ind_project_mall-customer-clustering--kaggle

Independent Project - Kaggle Dataset -- I worked with the Mall Customer Segmentation Dataset, which provided various instances of shoppers of different ages, incomes, etc. I utilized unsupervised ML clustering algorithms to identify useful customer segments.

clustering dataframes dbscan kmeans-clustering market-segmentation mean-shift pandas python sklearn technical-analysis technical-communication unsupervised-ml

Last synced: 12 Apr 2026

https://github.com/dmarks84/coursework_capstone_spacex_predictions

Final Project for IBM Data Science Professional Certificate -- Applied all skills and methods utilized in the series of courses for this certification to predict the success of SpaceX landings; issued full report to stakeholders

api classification dash eda folium linear-algebra matplotlib mysql numpy pandas plotly probability python seaborn sql statistics supervised-ml technical-writing web-scraping

Last synced: 08 Apr 2026

https://github.com/dmarks84/coursework_project_boston-data-project

Project for IBM Data Science course on Statistics -- Read in a large data set and performed several statistical analyses and hypothesis testing

communication data-modeling data-reporting dataframes eda hypothesis-testing matplotlib numpy pandas probability python scipy seaborn statistics visualization

Last synced: 08 Apr 2026

https://github.com/dmarks84/ind_project_superstore-sales-time-series-analysis--kaggle

Independent Project - Kaggle Dataset -- I worked on the Superstore Sales Dataset, performing (as Part 1) data cleaning and preparation and exploratory data analysis. The main task was to make predictions for future sales based on time-series analysis, which is found in Part 2.

chloropleth data-modeling data-visualization eda linear-regression matplotlib numpy pandas python seaborn sklearn statistics statsmodels supervised-ml time-series-analysis

Last synced: 09 Apr 2026

https://github.com/dmarks84/ind_project_european-soccer-top-points-contributors--kaggle

Independent Project - Kaggle Dataset -- I worked on the European Soccer Dataset, using SQL (SQLite) to read in the data and then wrangling it before running statistical analysis and hypothesis testing on questions of who helps earn the most points for their team.

data-wrangling hypothesis-testing numpy p-values pandas python scipy-stats statistics t-test

Last synced: 09 Apr 2026

https://github.com/dmarks84/coursework_project_apache-airflow-kafka-on-toll-booth-data

Project for IBM Data Engineering & Python course on ETL & Big Data -- Read in live toll booth data, wrangled and transformed it, and wrote it into a SQL database

apache-airflow apache-kafka automation dags data-modeling databases eda elt etl mysql numpy pandas pipelines python sql

Last synced: 11 Apr 2026

https://github.com/dmarks84/coursework_project_linux-file-backup

Project for IBM Data Engineering & Python course on Linux & Shell Scripts -- Wrote and executed bash scripts to manipulate folders and files to create a full directory backup with automation using crontab

automation bash crontab elt etl linux pipelines python shell-scripts

Last synced: 12 Apr 2026

https://github.com/dmarks84/ind_project_obesity-multi-class-classification--kaggle

Independent Project - Kaggle Competition -- I worked on the obesity classification data set as part of a Kaggle Competition of the same name, scoring (for accuracy) above 0.9

classification correlation-analysis cross-validation data-modeling data-visualization dataframes eda gridsearchcv matplotlib multiclass-classification numpy pandas python seaborn sklearn statistics supervised-ml

Last synced: 11 Apr 2026

https://github.com/dmarks84/coursework_project_text-mining-spam-analysis

Project for University of Michigan Applied Data Science Specialization -- Performed NLP in order to build features of email messages; trained various classification models to help predict if a message was spam.

classification databases eda nlp numpy pandas python scikit seaborn sentiment-analysis statistics supervised-ml text-mining unsupervised-ml visualization

Last synced: 11 Apr 2026

https://github.com/dmarks84/coursework_project_ml-classification

Project for IBM Data Science course on Machine Learning -- Trained ML models for classification, evaluating based on a variety of metrics

classification communication data-modeling dataframes numpy pandas python scikit-learn supervised-ml

Last synced: 11 Apr 2026

https://github.com/dmarks84/dmarks84

Personal

Last synced: 09 Apr 2025

https://github.com/dmarks84/ind_project_new-topic-nlp-analysis-classification--kaggle

Independent Project - Kaggle Dataset -- I worked with the News Category Dataset, which provided a headline and description, etc. in .json format; used NLTK for NLP, tokenizing, lemmatizing, and finding part-of-speech; trained and tuned parameters on classifier models to predict news category based on headline text.

classification hyperparameter-tuning json lemmatization model-evaluation model-refinement nlp nltk pandas python sklearn supervised-ml

Last synced: 11 Apr 2026

https://github.com/dmarks84/coursework_project_text-mining-topic-modeling

Project for University of Michigan Applied Data Science Specialization -- Developed functions to score similarity between text passages.

data-modeling data-reporting data-visualization databases eda nlp numpy pandas python statistics text-mining

Last synced: 12 Apr 2026

https://github.com/dmarks84/coursework_project_sentiment-analysis

Project for University of Michigan Python Programming Specialization -- Read in tweets and analyzed their content to perform basic sentiment analysis

classification programming python sentiment-analysis statistics web-scraping

Last synced: 09 Apr 2025

https://github.com/dmarks84/coursework_project_nlp-with-nltk

Project for University of Michigan Applied Data Science Specialization -- Utilized NLTK library to process natural language, and then built several spelling recommenders for a list of misspelled words.

data-modeling databases dataframes eda nlp numpy pandas python reporting statistics text-mining visualization

Last synced: 13 Apr 2026

https://github.com/dmarks84/coursework_project_network-analysis-node-link-prediction

Project for University of Michigan Applied Data Science Specialization -- Analyzed network nodes and edges, developing custom features based on various scoring metrics; used features to train classifier model to predict node attribute (employee salary type) and future edges (employee connections)

classification cross-validation data-reporting databases eda grid-search matplotlib network-analysis numpy pandas python scikit-learn statistics supervised-ml visualization

Last synced: 13 Apr 2026

https://github.com/dmarks84/professional_certifications

A full set of the certificates achieved by the work I completed as part of various Professional Certifications, Specializations, Courses, and Projects.

independent-education

Last synced: 21 Jan 2026

https://github.com/dmarks84/coursework_project_image-text-recognition

Project for University of Michigan Python Programming Specialization -- Read in documents with images and text, and utilized CV libraries/packages to extract specific types of images and text, pairing them together

classification computer-vision image-classification numpy pandas programming python text-classification

Last synced: 14 Apr 2026

https://github.com/dmarks84/coursework_capstone_full_data_engineering

Final Project for IBM Data Engineering & Python Professional Certificate -- Applied all skills and methods utilized in the series of courses for this certification

apache-airflow apache-hadoop apache-kafka apache-spark api beautifulsoup cassandra dags etl mongodb nosql pandas plotly postgresql python scipy seaborn sql

Last synced: 25 Feb 2026

https://github.com/dmarks84/ind_project_readme-generator

Independent (personal) project in which I automatically generate README files for each of my repositories from my coursework

dataframes etl numpy pandas programming python

Last synced: 29 Apr 2026

https://github.com/dmarks84/ind_project_movie-database-sqlite

Independent Project - I joined and manipulated data from disparate tables of movie information using Python & SQLite; defined schema, created tables/views, queried data, etc. Utilized CTEs, Window Functions, and other DDL, DQL, DML, and DCL scripts.

advanced-sql cte databases dcl ddl dml dql group-by joins python query sql sqlite tables views window-functions

Last synced: 02 May 2026

https://github.com/dmarks84/ind_project_data-science-london-scikit-learn--kaggle

Independent Project - Kaggle Competition -- I worked on the Data Science London data set for the Data Science London + Scikit-learn competition.

classification cross-validation data-modeling data-reporting data-visualization dataframes eda grid-search matplotlib numpy pandas python sklearn statistics supervised-ml

Last synced: 06 Apr 2026

https://github.com/dmarks84/ind_project_docker-image-pnw-weather-app

Independent Project - I created a Docker image that stands up a website that displays live weather alerts on an interactive map.

api dash devops docker docker-images dockerfile folium geopandas json plotly python requests webapp websites

Last synced: 05 May 2026

https://github.com/dmarks84/ibm_ds

A temporary repository for the work I'm doing in the IBM Data Science course

Last synced: 09 Apr 2025

https://github.com/dmarks84/ibm-ds-capstone

Files for my capstone project for the IBM Data Science Professional Certificate

Last synced: 09 Apr 2025

https://github.com/dmarks84/coursework_project_ml-model-eval-refine

Project for IBM Data Science course on ML Models & Analysis -- Read in large dataset of home sales and utilized polynomial linear regression analysis to make predictions of future home sales prices

classification communication data-modeling dataframes machine-learning matplotlib numpy pandas programming python regression scikit-learn scipy seaborn supervised-ml visualization

Last synced: 09 Apr 2026