An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by pcpp94

A curated list of projects in awesome lists by pcpp94.

https://github.com/pcpp94/web_scraping_jwt_complex_auth

A customized web scraping tool for a specific website that utilizes JavaScript-based login and JWT security. This repository is equipped to handle JWT token management, simulate user login flows, and interact with dynamically loaded content, providing an effective solution for extracting data from complex, modern web applications.

js jwt-authentication webscraping

Last synced: 30 Apr 2026
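
The JWT token management mentioned in the entry above can be illustrated with a minimal sketch. It is not taken from the repository. It assumes a standard three-segment JWT carrying an `exp` claim, and the names `jwt_payload` and `needs_refresh` are hypothetical. The payload is decoded without verifying the signature, which is enough for a scraper deciding when to log in again but not for trusting the token's contents.

```python
import base64
import json
import time
from typing import Optional


def jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def needs_refresh(token: str, now: Optional[float] = None, leeway: int = 30) -> bool:
    """Return True if the token has no `exp` claim or expires within `leeway` seconds."""
    now = time.time() if now is None else now
    exp = jwt_payload(token).get("exp")
    return exp is None or now >= exp - leeway
```

A scraper could call `needs_refresh` before each request and repeat the login flow only when it returns True.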

https://github.com/pcpp94/demanda_from_messy_excels_thefuzz

A specialized tool for processing a designated folder of Excel files with varying sheets, formats, and inconsistent data types. This repository corrects file extensions, ensures accurate data types for each column, and compiles all data into a single, structured table with millions of rows, ready for analysis. It handles many inconsistent Excel files.

etl excel

Last synced: 22 Mar 2025

https://github.com/pcpp94/parsing_wrongly_formatted_excels

A tailored solution for a specific folder containing Excel files with varied and incorrect extensions (.xml, .xlsx, .xls). This repository automatically identifies and processes files with inconsistent formats, applies smart cleaning functions, and standardizes data structures, ensuring reliable data extraction and cleanup from an updated folder

etl excel

Last synced: 22 Mar 2025
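
One way to identify files whose extension does not match their content, as the two Excel entries above describe, is to inspect the leading bytes of each file. The sketch below is an assumption about how such a check could work, not the repository's code, and `sniff_format` and `true_format` are hypothetical names. It relies on two well-known signatures: `.xlsx` files are ZIP archives and legacy `.xls` files are OLE2 compound documents.

```python
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP container (.xlsx and other Office Open XML files)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 compound file (legacy .xls)


def sniff_format(head: bytes) -> str:
    """Guess the real spreadsheet format from the first bytes of a file."""
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    if head.startswith(OLE2_MAGIC):
        return "xls"
    # Drop an optional UTF-8 byte-order mark and leading whitespace.
    text = head.lstrip(b"\xef\xbb\xbf").lstrip()
    if text.startswith(b"<"):
        return "xml"
    return "unknown"


def true_format(path: str) -> str:
    """Read the first bytes of `path` and classify them."""
    with open(path, "rb") as fh:
        return sniff_format(fh.read(64))
```

A ZIP signature only shows that the file is some ZIP-based format, so a stricter check would also look inside the archive before renaming anything.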

https://github.com/pcpp94/nano_gpt_test

This repository provides a minimalist implementation of a GPT-style language model, crafted from scratch using transformer architectures. Inspired by OpenAI's GPT, this project builds a nano-scale version designed to demystify the mechanics of transformer-based language models.

gpt transformers

Last synced: 22 Mar 2025

https://github.com/pcpp94/prophet_gb_demand_forecasting_one

This repository provides tools for forecasting Great Britain’s power demand using Facebook’s Prophet model. It includes Python functions and Jupyter notebooks for daily and monthly forecasts, integrating weather and GDP as regressors, with hyperparameter tuning via HyperOpt.

bayesian-inference bayesian-optimization daily decomposition demand energy forecast forecasting holidays hourly mathematics monthly power prophet time-series time-series-analysis timeseries weather

Last synced: 10 Jun 2025

https://github.com/pcpp94/arcgis_api_data_parsing_double_authentication

A specialized web scraping tool for an ArcGIS-powered website, featuring support for dual-layer authentication: standard login and Microsoft Office authentication. This repository enables secure login, retrieves the necessary API tokens, and allows for comprehensive data extraction across all available layers, providing an efficient solution for ac

api arcgis webscraping

Last synced: 14 Jul 2025

https://github.com/pcpp94/elexon_pipeline_gb_demand

Guidelines and code snippets for extracting and processing Elexon gross demand data on Databricks. Provides half-hourly GB demand at sectoral (Domestic, Non-domestic) and GSP-area granularity, along with settlement demand and embedded generation. Supports non-commodity cost calculations for CfD, RO, and FiT.

data electricity elexon gb octopusenergy power powerdata pypsa uk

Last synced: 12 Jul 2025

https://github.com/pcpp94/webscraping_asp.net_form

A Python-based web scraping tool tailored for an ASP.NET website that incorporates iframes and requires basic authentication. This repository provides all necessary components for handling authentication, navigating iframes, and extracting data efficiently, making it suitable for scraping data from complex ASP.NET pages.

asp-net iframe webscraping

Last synced: 17 May 2026
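
For the ASP.NET entry above, a scraper typically needs two things from each page: the hidden form fields that ASP.NET WebForms expects to be posted back (such as `__VIEWSTATE`), and the `src` of any iframe holding the real content. The sketch below collects both using only the standard library. It is an illustrative assumption, not the repository's implementation, and `FormStateParser` and `basic_auth_header` are hypothetical names.

```python
import base64
from html.parser import HTMLParser


class FormStateParser(HTMLParser):
    """Collect hidden <input> fields and <iframe> sources from an HTML page."""

    def __init__(self):
        super().__init__()
        self.hidden = {}    # field name -> value, to be echoed back on POST
        self.iframes = []   # iframe src values, to be fetched separately

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "input" and (a.get("type") or "").lower() == "hidden" and a.get("name"):
            self.hidden[a["name"]] = a.get("value") or ""
        elif tag == "iframe" and a.get("src"):
            self.iframes.append(a["src"])


def basic_auth_header(user: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}
```

After feeding a page to the parser, `parser.hidden` can be merged into the form data of the next request, and each entry in `parser.iframes` can be requested with the same authorization header.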

https://github.com/pcpp94/streamlit_tests

This repository is dedicated to exploring Streamlit, a powerful Python library for building interactive, data-driven web applications with ease. Here, you'll find examples showcasing Streamlit’s widgets, data visualization capabilities, and layout customization options.

Last synced: 16 May 2026

https://github.com/pcpp94/oxml-fundamentals

Archive - Attending OxML Fundamentals 2024

ml oxford oxml

Last synced: 16 Mar 2025

https://github.com/pcpp94/arcgis_web_downloader

A tool for extracting all data from an ArcGIS web server via REST APIs, with support for pagination and bulk downloads.

api arcgis download etl gis rest

Last synced: 16 Mar 2026
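
The pagination mentioned in the entry above can be sketched as a loop over the ArcGIS REST `query` operation, which accepts `resultOffset` and `resultRecordCount` on layers that support pagination and sets `exceededTransferLimit` in the response when more records remain. This is a sketch under those assumptions, not the repository's code. `iter_features` and the injected `fetch_page` callable are hypothetical, and the HTTP call is deliberately left to the caller.

```python
from typing import Callable, Dict, Iterator


def iter_features(fetch_page: Callable[[Dict], Dict], page_size: int = 1000) -> Iterator[Dict]:
    """Yield every feature of a layer by paging through its query endpoint.

    `fetch_page` is a caller-supplied function that sends the given query
    parameters to the layer's /query URL and returns the parsed JSON response.
    """
    offset = 0
    while True:
        page = fetch_page({
            "where": "1=1",          # match all records
            "outFields": "*",        # return every attribute
            "f": "json",
            "resultOffset": offset,
            "resultRecordCount": page_size,
        })
        features = page.get("features", [])
        yield from features
        # Stop when the page is empty or the server reports nothing further.
        if not features or not page.get("exceededTransferLimit", False):
            break
        offset += len(features)
```

Passing the fetch function in keeps the paging logic independent of the HTTP client and of any token handling a particular server requires.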

https://github.com/pcpp94/raw_etl_pipeline

A streamlined ETL solution for ingesting and processing legacy data formats with minimal resources. Includes daily and weekly .bat scripts on Task Scheduler for automated extraction, cleaning, and normalization, turning complex files into structured data effortlessly.

etl legacy proof-of-concept

Last synced: 27 Jun 2025