Projects in Awesome Lists by pcpp94
A curated list of projects in awesome lists by pcpp94.
https://github.com/pcpp94/web_scraping_jwt_complex_auth
A customized web scraping tool for a specific website that utilizes JavaScript-based login and JWT security. This repository is equipped to handle JWT token management, simulate user login flows, and interact with dynamically loaded content, providing an effective solution for extracting data from complex, modern web applications.
js jwt-authentication webscraping
Last synced: 30 Apr 2026
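The JWT token management mentioned for web_scraping_jwt_complex_auth could be sketched roughly as follows. This is an illustrative sketch, not the repository's code: the function names `decode_jwt_payload` and `needs_refresh` and the 60-second leeway are assumptions. It reads the standard `exp` claim from a token's payload, without verifying the signature, to decide when a scraper should log in again.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload_b64 = parts[1]
    # base64url encoding drops '=' padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def needs_refresh(token: str, now: Optional[float] = None, leeway: int = 60) -> bool:
    """Return True when the `exp` claim is within `leeway` seconds of `now`.

    `leeway` is a hypothetical safety margin, not a value from the repository.
    """
    now = time.time() if now is None else now
    exp = decode_jwt_payload(token).get("exp")
    if exp is None:
        return False  # no expiry claim, so nothing to compare against
    return now >= exp - leeway
```

A scraper could call `needs_refresh(token)` before each request and repeat the login flow when it returns True.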
https://github.com/pcpp94/demanda_from_messy_excels_thefuzz
A specialized tool for processing a designated folder of Excel files with varying sheets, formats, and inconsistent data types. This repository corrects file extensions, ensures accurate data types for each column, and compiles all data into a single, structured table with millions of rows, ready for analysis. It is built to cope with a large number of inconsistently formatted workbooks.
Last synced: 22 Mar 2025
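One way to reconcile inconsistent column headers across many workbooks is fuzzy string matching. The repository name suggests it uses thefuzz. The sketch below swaps in the standard library's `difflib` so it runs without extra dependencies. The canonical column names, the function name `normalise_header`, and the 0.6 cutoff are all hypothetical.

```python
import difflib
from typing import List, Optional

# Hypothetical target schema; the real column names are not given in the listing.
CANONICAL_COLUMNS = ["date", "region", "demand_mwh"]


def normalise_header(
    raw: str,
    canonical: List[str] = CANONICAL_COLUMNS,
    cutoff: float = 0.6,
) -> Optional[str]:
    """Map a messy header to its closest canonical name, or None if nothing is close."""
    cleaned = raw.strip().lower().replace(" ", "_")
    matches = difflib.get_close_matches(cleaned, canonical, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Headers that return None would need manual review rather than a forced match.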
https://github.com/pcpp94/parsing_wrongly_formatted_excels
A tailored solution for a specific folder containing Excel files with varied and incorrect extensions (.xml, .xlsx, .xls). This repository automatically identifies and processes files with inconsistent formats, applies smart cleaning functions, and standardizes data structures, ensuring reliable data extraction and cleanup from an updated folder
Last synced: 22 Mar 2025
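When extensions cannot be trusted, a file's leading bytes can hint at its real format. The sketch below is a minimal illustration under stated assumptions, not the repository's method: the function names are hypothetical, and the check covers only the ZIP signature (used by .xlsx), the OLE2 compound-file signature (used by legacy .xls), and a leading `<` for XML.

```python
from pathlib import Path

ZIP_MAGIC = b"PK\x03\x04"  # .xlsx files are ZIP containers (so are other ZIP files)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls compound file


def sniff_excel_format(head: bytes) -> str:
    """Guess the real format from a file's first bytes, ignoring its extension."""
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    if head.startswith(OLE2_MAGIC):
        return "xls"
    # Skip a UTF-8 byte-order mark and whitespace before looking for markup.
    if head.lstrip(b"\xef\xbb\xbf \t\r\n").startswith(b"<"):
        return "xml"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 16 bytes of `path` and classify them."""
    with Path(path).open("rb") as handle:
        return sniff_excel_format(handle.read(16))
```

The result can then select the appropriate reader for each file regardless of what its extension claims.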
https://github.com/pcpp94/nano_gpt_test
This repository provides a minimalist implementation of a GPT-style language model, crafted from scratch using transformer architectures. Inspired by OpenAI's GPT, this project builds a nano-scale version designed to demystify the mechanics of transformer-based language models.
Last synced: 22 Mar 2025
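The core mechanism of a GPT-style model is causal self-attention, where each position attends only to itself and earlier positions. The sketch below shows a single attention head in dependency-free Python for illustration. It is not taken from the repository, which may use a tensor library, and the function name is hypothetical.

```python
import math
from typing import List

Vector = List[float]


def causal_self_attention(q: List[Vector], k: List[Vector], v: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v each hold T vectors; q and k share dimension d.
    """
    d = len(q[0])
    outputs = []
    for t in range(len(q)):
        # Causal mask: position t only scores against positions 0..t.
        scores = [
            sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible positions.
        peak = max(scores)
        exps = [math.exp(x - peak) for x in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is the attention-weighted average of the visible value vectors.
        outputs.append(
            [sum(w * v[s][j] for s, w in enumerate(weights)) for j in range(len(v[0]))]
        )
    return outputs
```

The first position can only attend to itself, so its output equals its own value vector.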
https://github.com/pcpp94/prophet_gb_demand_forecasting_one
This repository provides tools for forecasting Great Britain’s power demand using Facebook’s Prophet model. It includes Python functions and Jupyter notebooks for daily and monthly forecasts, integrating weather and GDP as regressors, with hyperparameter tuning via HyperOpt.
bayesian-inference bayesian-optimization daily decomposition demand energy forecast forecasting holidays hourly mathematics monthly power prophet time-series time-series-analysis timeseries weather
Last synced: 10 Jun 2025
https://github.com/pcpp94/arcgis_api_data_parsing_double_authentication
A specialized web scraping tool for an ArcGIS-powered website, featuring support for dual-layer authentication: standard login and Microsoft Office authentication. This repository enables secure login, retrieves the necessary API tokens, and allows for comprehensive data extraction across all available layers, providing an efficient solution for ac
Last synced: 14 Jul 2025
https://github.com/pcpp94/elexon_pipeline_gb_demand
Guidelines and code snippets for extracting and processing Elexon gross demand data on Databricks. Provides half-hourly GB demand at sectoral (Domestic, Non-domestic), GSP-area granularity, settlement demand, and embedded generation. Supports non-commodity cost calculations for CfD, RO, and FiT.
data electricity elexon gb octopusenergy power powerdata pypsa uk
Last synced: 12 Jul 2025
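Half-hourly GB demand data is indexed by settlement period, with period 1 starting at local midnight and 48 periods in a normal day. The sketch below converts a settlement date and period into a naive local start time. The function name is hypothetical, and clock-change days, which have 46 or 50 periods, are deliberately not handled.

```python
from datetime import date, datetime, time, timedelta


def settlement_period_start(day: date, period: int) -> datetime:
    """Return the naive local start time of a settlement period on a 48-period day.

    Assumption: `day` is not a clock-change day (those have 46 or 50 periods).
    """
    if not 1 <= period <= 48:
        raise ValueError("period must be between 1 and 48 on a normal day")
    midnight = datetime.combine(day, time(0, 0))
    return midnight + timedelta(minutes=30 * (period - 1))
```

For example, period 48 on a given day starts at 23:30 local time.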
https://github.com/pcpp94/webscraping_asp.net_form
A Python-based web scraping tool tailored for an ASP.NET website that incorporates iframes and requires basic authentication. This repository provides all necessary components for handling authentication, navigating iframes, and extracting data efficiently, making it suitable for scraping data from complex ASP.NET pages.
Last synced: 17 May 2026
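ASP.NET Web Forms pages typically carry state in hidden inputs such as `__VIEWSTATE` and `__EVENTVALIDATION`, which must be echoed back on each POST. Whether the target site uses Web Forms is an assumption based on the repository name. The sketch below collects hidden fields with the standard library's `html.parser`, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Dict


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        name = attributes.get("name")
        if is_hidden and name:
            # A missing value attribute is treated as an empty string.
            self.fields[name] = attributes.get("value") or ""


def extract_hidden_fields(html: str) -> Dict[str, str]:
    """Return every hidden form field found in `html`."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields
```

The returned dictionary can be merged with the visible form values before the form is submitted again.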
https://github.com/pcpp94/streamlit_tests
This repository is dedicated to exploring Streamlit, a powerful Python library for building interactive, data-driven web applications with ease. Here, you'll find examples showcasing Streamlit’s widgets, data visualization capabilities, and layout customization options.
Last synced: 16 May 2026
https://github.com/pcpp94/oxml-fundamentals
Archive - Attending OxML Fundamentals 2024
Last synced: 16 Mar 2025
https://github.com/pcpp94/raw_etl_pipeline
A streamlined ETL solution for ingesting and processing legacy data formats with minimal resources. Includes daily and weekly .bat scripts on Task Scheduler for automated extraction, cleaning, and normalization, turning complex files into structured data effortlessly.
Last synced: 27 Jun 2025