Projects in Awesome Lists tagged with data-collection
A curated list of projects in awesome lists tagged with data-collection.
https://github.com/naibowang/easyspider
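A visual no-code/code-free web crawler/spider. EasySpider: a visual browser automation testing, data collection, and crawler tool that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.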
batch-processing batch-script code-free crawler data-collection frontend gui html input-parameters layman parameters robotics rpa scraper spider visual visualization visualprogramming web www
Last synced: 12 May 2025
https://github.com/NaiboWang/EasySpider
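A visual no-code/code-free web crawler/spider. EasySpider: a visual browser automation testing, data collection, and crawler tool that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.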
batch-processing batch-script code-free crawler data-collection frontend gui html input-parameters layman parameters robotics rpa scraper spider visual visualization visualprogramming web www
Last synced: 20 Mar 2025
https://github.com/airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
bigquery change-data-capture data data-analysis data-collection data-engineering data-integration data-pipeline elt etl java mssql mysql pipeline postgresql python redshift s3 self-hosted snowflake
Last synced: 09 Sep 2025
https://github.com/snowplow/snowplow
The leader in Next-Generation Customer Data Infrastructure
analytics data data-collection data-pipeline marketing-analytics product-analytics snowplow snowplow-events snowplow-pipeline
Last synced: 13 May 2025
https://github.com/cloudquery/cloudquery
Data pipelines for cloud config and security data. Build cloud asset inventory, CSPM, FinOps, and vulnerability management solutions. Extract from AWS, Azure, GCP, and 70+ cloud and SaaS sources.
airbyte attack-surface-management aws azure bigquery cspm data data-analysis data-collection data-engineering data-integration elt etl etl-framework gcp github-api go google kubernetes sql
Last synced: 16 May 2026
https://github.com/firecrawl/firecrawl-mcp-server
🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
batch-processing claude content-extraction data-collection firecrawl firecrawl-ai javascript-rendering llm-tools mcp mcp-server model-context-protocol search-api web-crawler web-scraping
Last synced: 07 Apr 2026
https://github.com/jitsucom/jitsu
Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.
bigquery clickhouse data-collection data-connectors data-integration golang postgres redshift snowflake
Last synced: 18 May 2026
https://github.com/brightdata/brightdata-mcp
A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.
ai-agents ai-integrations anti-bot-detection browser-automation data-collection data-extraction llm mcp mcp-server modelcontextprotocol scraping scraping-tools structured-data web-crawling web-data web-scraping
Last synced: 16 Jan 2026
https://github.com/plan-player-analytics/Plan
Player Analytics plugin for Minecraft Server platforms - View player activity of your server with ease. :calendar:
analytics bukkit-plugin bungeecord-plugin data-collection fabric-mod hacktoberfest mysql nukkit-plugin spigot-plugin sponge-plugin sqlite statistics velocity-plugin visualization webserver
Last synced: 14 Mar 2025
https://github.com/plan-player-analytics/plan
Player Analytics plugin for Minecraft Server platforms - View player activity of your server with ease. :calendar:
analytics bukkit-plugin bungeecord-plugin data-collection fabric-mod hacktoberfest mysql nukkit-plugin spigot-plugin sponge-plugin sqlite statistics velocity-plugin visualization webserver
Last synced: 01 Mar 2026
https://github.com/getodk/collect
ODK Collect is an Android app for filling out forms. It's been used to collect billions of data points in challenging environments around the world. Contribute and make the world a better place! ✨📋✨
android data-collection global-development global-health java mhealth mobile-data-collection odk social-impact xforms
Last synced: 10 Apr 2026
https://github.com/chaoss/augur
Python library and web service for Open Source Software Health and Sustainability metrics & data collection. You can find our documentation and new contributor information easily here: https://oss-augur.readthedocs.io/en/main/
chaoss data-collection data-modeling data-visualization defined-metrics facade git github hacktoberfest hacktoberfest2020 health linux linux-foundation metrics open-source opensource python-library research sustainability unix
Last synced: 21 Jan 2026
https://github.com/pnoker/iot-dc3
IoT DC3 is a 100% open-source, distributed Internet of Things (IoT) platform built on Spring Cloud. It accelerates IoT project development and simplifies IoT device management, offering a comprehensive solution for building robust IoT systems.
data-collection dcs docker gateway iot java lwm2m modbus mqtt multi-protocol opc-ua plc rpc rtsp s7 socket spring-cloud tcp things
Last synced: 16 Jul 2025
https://github.com/chapmanjacobd/library
99+ CLI tools to build, browse, and blend your media library
broadcatching cli command-line curation data-collection datacuration datasette-tool ffmpeg ffprobe files folders gallery-dl media mpv music playlist qbittorrent-nox sqlite videos yt-dlp
Last synced: 06 Jan 2026
https://github.com/zhaoyachao/zdh_web
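Big data collection and extraction platform. zdh_web is the visual management platform for the zdh family of services, with modules for data collection, scheduling, permissions, approval workflows, private-domain marketing, and more.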
bigdata collection data data-collection datapipeline datax-web etl pipline scheduler spark sparketl
Last synced: 04 Apr 2025
https://github.com/scriptsmith/reaper
Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
api data-collection data-mining data-scraping facebook gui pinterest reddit scraping socialmedia tumblr twitter youtube
Last synced: 07 Apr 2025
https://github.com/ScriptSmith/reaper
Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
api data-collection data-mining data-scraping facebook gui pinterest reddit scraping socialmedia tumblr twitter youtube
Last synced: 04 Apr 2025
https://github.com/K3V1991/Disable-Firefox-Telemetry-and-Data-Collection
How to disable Firefox Telemetry and Data Collection
blocking browser config data data-collection disable firefox how-to list mozilla mozilla-firefox off options privacy reporting security server settings telemetry tutorial
Last synced: 13 Apr 2025
https://github.com/silverton-io/buz
Serverless multi-protocol + multi-destination event collection system.
analytics analytics-tracking cloudevents cloudevents-schema contracts data data-collection data-platform eventbridge jsonschema product-analytics redpanda redpanda-console schema-registry schema-validation snowplow-analytics streaming-analytics streaming-data webhook-receiver webhook-server
Last synced: 12 Apr 2025
https://github.com/wq/wq.app
💻📱 wq's app library: a JavaScript framework powering offline-first web & native apps for geospatial data collection, mobile surveys, and citizen science. Powered by Redux, React, Material UI and Maplibre GL.
citizen-science data-collection geospatial gis mobile mobile-app offline offline-first survey wq-framework
Last synced: 26 Mar 2025
https://github.com/CertifaiAI/classifai
:fire: One of the most comprehensive open-source data annotation platforms.
annotation annotation-tool big-data computervision data-annotation data-collection data-science deep-learning labelling machine-learning
Last synced: 11 May 2025
https://github.com/networkdynamics/pytok
A web scraper for TikTok using Playwright
data-collection tiktok tiktok-api tiktok-scraper web-scraper
Last synced: 19 Jan 2026
https://github.com/wq/wq.db
☁🌐 wq's db library, extending Django REST framework to support apps for geospatial field data collection, citizen science, and crowdsourcing.
citizen-science data-collection django django-rest-framework rest-api wq-framework
Last synced: 10 Jan 2026
https://github.com/douglasneuroinformatics/opendatacapture
An electronic data capture platform for administering remote and in-person clinical instruments
clinical data-collection electronic-data-capture esbuild form-builder full-stack monaco-editor monorepo multilingual nodejs prisma react research tailwindcss turborepo typescript
Last synced: 09 Apr 2026
https://github.com/bps-statistics/form-gear
FormGear is a framework engine for dynamic form creation and complex form processing and validation for data collection.
census data data-collection form-builder form-engine form-generator national-statistics official-statistics survey survey-builder survey-form
Last synced: 13 Oct 2025
https://github.com/Minipada/ros2_data_collection
Collect, validate and send data reliably from ROS 2 to create APIs and dashboards.
Last synced: 13 May 2025
https://github.com/ineffyble/genders.wtf
data-collection forms gender genders
Last synced: 05 Apr 2025
https://github.com/mxdldev/android-amap-track-collect
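A recent project required collecting users' movement-track data from their phones. This kind of feature is common: apps such as Codoon (咕咚) and Yuedongquan (悦动圈) record running tracks, while WeChat Sports and DingTalk Sports count the steps you take each day. Recording a track depends on phone positioning, and counting steps depends on the gyroscope (angular velocity sensor). I spent a little over a day implementing real-time collection of location data.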
android data-collection gps location motion-track
Last synced: 17 Jul 2025
https://github.com/andreztz/pyradios
A Client for the Radio Browser API
api data-collection entertainment internet-radio internet-radio-stations music open-api python radio-browser radio-stations streaming
Last synced: 07 Sep 2025
https://github.com/pantunes/xtcryptosignals
Cryptocurrencies price data collection, price tickers, signals notifications, charts, Telegram bot and more.
agregator altcoins api bitcoin crypto-currencies cryptocurrency data-collection ethereum exchange exchange-api notifications portfolio service signals-notifications ticker trading
Last synced: 30 Apr 2025
https://github.com/akvo/akvo-flow
A data collection and monitoring tool that works anywhere.
agpl akvo akvo-flow data-collection java
Last synced: 29 Aug 2025
https://github.com/melvynator/ELK_twitter
This is a data pipeline for Twitter (ETL) using the elastic stack Elasticsearch, Logstash and Kibana (version 6.1)
data-collection data-visualization elasticsearch elk elk-stack kibana logstash machine-learning natural-language-processing twitter twitter-api
Last synced: 30 Aug 2025
https://github.com/chaindexing/chaindexing-rs
Index any EVM chain and query in SQL
blockchain dapp dapps data-collection developer-tools ethereum evm indexer multichain postgres smart-contracts solidity sql sync-engine web3
Last synced: 07 Mar 2026
https://github.com/melvynator/elk_twitter
This is a data pipeline for Twitter (ETL) using the elastic stack Elasticsearch, Logstash and Kibana (version 6.1)
data-collection data-visualization elasticsearch elk elk-stack kibana logstash machine-learning natural-language-processing twitter twitter-api
Last synced: 14 Jul 2025
https://github.com/graphlit/graphlit-mcp-server
Model Context Protocol (MCP) Server for Graphlit Platform
claude content-extraction content-ingestion data-collection llm-tools mcp-server model-context-protocol search-api unstructured-data web-crawler web-scraping
Last synced: 12 Oct 2025
https://github.com/ilaria-manco/song-describer
Song Describer is a data collection platform for annotating music with textual descriptions.
annotations audio-captioning data-collection music-dataset
Last synced: 24 Sep 2025
https://github.com/getodk/javarosa
The core library that many of the ODK tools are built around. It's written in Java, implements the ODK XForms spec, and runs on mobile devices and cloud servers. ✨🏗✨
data-collection global-development global-health java mhealth mobile-data-collection odk xforms
Last synced: 26 Apr 2026
https://github.com/pzaino/thecrowler
A Content Discovery and Development Platform. Empowering Cybersecurity, AI, Marketing, and Finance professionals and researchers to discover, analyze, and interact with the web in all its dimensions.
automation blue-team-tool content-detection content-discovery crawler crawling cyber-security cybersecurity cybersecurity-tools data-collection data-science distributed-systems golang indexer indexing reconnaissance red-team-tools scraping search-engine vulnerability-detection
Last synced: 06 Feb 2026
https://github.com/ntivirikin/xeno-canto-py
Python wrapper for the xeno-canto.org API to aid in downloading and managing recordings.
api-wrapper birding birds birdsong classification data-collection data-mining json metadata python scraper song xeno-canto xenocanto
Last synced: 21 Feb 2026
https://github.com/davidberenstein1957/dataset-viber
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
data-collection data-quality evaluation human-feedback
Last synced: 06 Mar 2025
https://github.com/leogregianin/ibge
🌎 Data collection of geographical divisions of Brazil by IBGE
brasil brazil data-collection ibge json
Last synced: 23 Jul 2025
https://github.com/edgee-cloud/edgee
The full-stack edge platform for your edge oriented applications.
data-collection edge edge-computing edgee http https proxy rust wasm wasm-component webassembly
Last synced: 02 Jan 2026
https://github.com/wooster0/shifting
A privacy-focused list of alternatives to online services.
big-data browser data-collection facebook gmail google list microsoft open-source privacy privacy-enhancing-technologies privacy-protection privacy-tools search-engine security services youtube
Last synced: 05 May 2025
https://github.com/documents-brasil/ibge
🌎 Data collection of geographical divisions of Brazil by IBGE
brasil brazil data-collection ibge json
Last synced: 12 Apr 2025
https://github.com/khuangaf/itri-speech-recognition-dataset-generation
Automatic Speech Recognition Dataset Generation
automatic data-collection mask-rcnn speech-recognition
Last synced: 30 Apr 2025
https://github.com/atapas/js-collections-map-set
Repository to have example code to demonstrate JavaScript Map and Set data structures.
data-collection data-structures javascript map set
Last synced: 12 Apr 2025
https://github.com/getodk/central-frontend
Vue.js based frontend for ODK Central
data-collection global-development global-health javascript mhealth odk social-impact vuejs
Last synced: 15 Mar 2026
https://github.com/gaalcaras/mailinglistscraper
A python web scraper for public email lists.
data-collection mailinglist scraper scrapy spider webscraping
Last synced: 02 Aug 2025
https://github.com/fulldecent/google-voice-numbers
Retrieves the full list of available Google Voice numbers and finds the best ones
data-collection google-voice harvest harvest-data spider telephone-number telephony
Last synced: 30 Dec 2025
https://github.com/gidim/babler
Data Collection System For NLP/Speech Recognition
blogs data-collection forums language-modeling machine-learning nlp scraping
Last synced: 16 May 2025
https://github.com/nuhmanpk/webtrench
A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of images, text, videos, metadata, and more. Ideal for machine learning and deep learning engineers. Download and extract data with just one line of code.
audio-datasets data data-collection data-science dataset-generation deep-learning image-data-generator machine-learning python scarper text-datasets
Last synced: 21 Mar 2025
https://github.com/nuhmanpk/Webtrench
A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of images, text, videos, metadata, and more. Ideal for machine learning and deep learning engineers. Download and extract data with just one line of code.
audio-datasets data data-collection data-science dataset-generation deep-learning image-data-generator machine-learning python scarper text-datasets
Last synced: 08 Jul 2025
https://github.com/lcsrodriguez/ecocal
Worldwide economic calendar Python package (details, estimates, market news, ...)
data-collection economic-calendar financial-events multithreaded python webscraping
Last synced: 17 May 2026
https://github.com/alttch/pulr
pull devices and transform data into events
automation data data-collection data-conversion ethernet-ip industiral modbus plc plc-programming snmp
Last synced: 28 Apr 2025
https://github.com/esri/data-collection-dotnet
Data collection application built using the .NET Runtime SDK.
arcgis data-collection dotnet offline online open-source-app popup related-records runtime runtime-sdk wpf
Last synced: 07 Jul 2025
https://github.com/eurostat/pyrostat
API (Python) for Eurostat data collections upload
Last synced: 05 Feb 2026
https://github.com/lironmiz/pcep-30-0x
PCEP™ – Certified Entry-Level Python Programmer certification shows that the individual is familiar with universal computer programming concepts like data types, containers, functions, conditions, loops, as well as Python programming language syntax, semantics, and the runtime environment.
certificate control-flow course data-collection data-types education exceptions functions input-output learning-by-doing literals numeral-systems operations operators pcap practice python-syntax-and-semantics python3 runtime-environment variables
Last synced: 18 Mar 2025
https://github.com/mahtafetrat/manatts-persian-speech-dataset
ManaTTS is the largest open Persian speech dataset with 100+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.
data-collection data-preprocessing dataset-preparation forced-alignment mana-tts persian persian-speech speech-corpus speech-data-collection speech-dataset speech-processing speech-synthesis text-to-speech text-to-speech-dataset tts tts-dataset
Last synced: 08 Apr 2025
https://github.com/qarmin/system-info-collector
App to collect RAM/CPU usage from the OS and show it in pretty graphs
Last synced: 06 Jul 2025
https://github.com/robotology/wearables
Code moved to https://github.com/robotology/human-dynamics-estimation
data-collection force-torque-sensor framework imu sensor wearable wearable-devices
Last synced: 16 Mar 2026
https://github.com/cph-cachet/carp.core-kotlin
Infrastructure-agnostic framework for distributed data collection.
data-collection ddd distributed-computing hacktoberfest mhealth research research-platform
Last synced: 19 Apr 2025
https://github.com/unicornunicode/FACT
FACT is a tool to collect, process and visualise forensic data from clusters of machines running in the cloud or on-premise.
cloud data-collection forensics
Last synced: 12 Jul 2025
https://github.com/akvo/akvo-flow-mobile
Akvo Flow app
akvo-flow android data-collection gplv3 java
Last synced: 29 Aug 2025
https://github.com/xsser01/phantomcollect
Advanced stealth web data collection framework for security
cybersecurity-tools data-collection ethical-hacking fingerprinting monitoring monitoring-tools open-source osint-tools privacy python recon-tools tools web-security
Last synced: 12 Mar 2026
https://github.com/MahtaFetrat/ManaTTS-Persian-Speech-Dataset
ManaTTS is the largest open Persian speech dataset with 86+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.
data-collection data-preprocessing dataset-preparation forced-alignment mana-tts persian persian-speech speech-corpus speech-data-collection speech-dataset speech-processing speech-synthesis text-to-speech text-to-speech-dataset tts tts-dataset
Last synced: 01 Mar 2025
https://github.com/gear5sh/gear5
High-performance, better alternative to Airbyte, Singer, and Meltano
airbyte data data-collection data-engineering data-engineering-pipeline data-ingestion elt etl etl-framework g5 gear5 golang meltano singer singer-io singer-tap
Last synced: 14 Jan 2026
https://github.com/cardi/aws-spot-price-history
Automating AWS spot price history retrieval
aws-ec2 data-collection spot-instances
Last synced: 12 May 2025
https://github.com/sowinskibraeden/dayz-reforger
A general purpose Discord bot to handle DayZ Killfeed, stats, alarms and factions' armbands using Nitrado log files.
data-analytics data-collection dayz discord discord-bot discord-js fetch-api mongodb nitrado regex scalability
Last synced: 09 Jul 2025
https://github.com/pvernier/pykobo
A Python module to fetch data from the Kobo API
api data-collection kobo kobo-toolbox kobocollect kobotoolbox xlsform
Last synced: 07 Feb 2026
https://github.com/irfanalidv/trustpilot_scraper
A Python library for scraping Trustpilot reviews.
beautifulsoup data-collection data-extraction etl-pipeline review-scraper text-mining trustpilot web-scraping-python
Last synced: 14 Jan 2026
https://github.com/dadosjusbr/alba
System for scheduling and orchestrating executions, aimed at automating DadosJusBR processes
coleta-de-dados dados-abertos dadosabertos data-collection hacktoberfest open-data opendata
Last synced: 14 Jan 2026
https://github.com/abeltavares/marketpipe
🛠 Containerized and configurable Airflow ETL pipeline for collecting and storing stock and cryptocurrency market data.
airflow aws ci-cd cryptocurrency data-analysis data-collection data-storage docker iac oop pgadmin pipeline postgresql python sql stocks unit-testing
Last synced: 22 Apr 2025
https://github.com/harisbinzia/mastodoner
Mastodoner is a command line tool (and Python library) for archiving Mastodon, a decentralized micro-blogging social network.
data-collection mastodon social-network
Last synced: 12 Apr 2025
https://github.com/mumarshahbaz/oscilloscope-online-v2
Web Serial Plotter with as much customization as possible. Custom Colors, Automatic Timescale, Live data visualization! Plot as many graphs as you can with just a click of a button! Truly, an online Oscilloscope!
arduino automatic-time customizable data-collection data-visualization esp experiment online oscilloscope serial-plotter timescale web
Last synced: 13 Sep 2025
https://github.com/redayzarra/sleepapneadetection
My capstone project explores machine learning, hardware, and web development to create a smart home system for monitoring the health of homebound patients suffering from sleep apnea. The system includes data collection through sensors, embedded ML (TinyML) to analyze data, and web development for creating a medical dashboard.
arduino arduino-ide capstone capstone-project data-collection embedded-systems machine-learning machine-learning-algorithms medical mern mern-project mern-stack python tinyml web-development
Last synced: 08 Oct 2025
https://github.com/ikstream/dalec
Dalec is a project that aims to provide a privacy-preserving data collection method. It uses DNS for client/server separation and transmits data encrypted.
collection data data-collection dns exfiltration shell
Last synced: 11 Aug 2025
https://github.com/dsacms/metrics
Experimentations in Open Source Repository Metrics
cmsgov data-collection data-visualization git github github-pages health html-css-javascript metrics opensource python website
Last synced: 13 Apr 2025
https://github.com/munroe-meyer-institute-vr-laboratory/cometrics
Clinical tool for coregistration of frequency and duration based behavior, physiological signals, and video data. Session tracking features streamline multi-session clinical data recording.
behavior behavior-analysis behavioral-sciences biometrics clinical-research data-annotation-machine-learning data-annotation-tools data-collection empatica-e4
Last synced: 17 Jan 2026
https://github.com/mahtafetrat/gptinformal-persian-speech-dataset
A free licensed Persian TTS dataset including 6+ hours of audio-text pairs with subject
data-collection data-preprocessing dataset-preparation forced-alignment mana-tts manatts persian persian-speech speech-corpus speech-data-collection speech-dataset speech-processing speech-synthesis text-to-speech text-to-speech-dataset tts tts-dataset
Last synced: 19 Jan 2026
https://github.com/sodascience/social_science_inferences_with_llms
Addressing LLM-related measurement error in social science modeling research.
data-collection inference large-language-models llms
Last synced: 30 Jan 2026
https://github.com/exloud/windows-telemetry-disabler
PowerShell/Batch utility that permanently disables Windows telemetry and data-collection services on Windows 10/11.
batch-script data-collection disable disable-services disabler exloud log nsudo powershell-script privacy telemetry windows windows7 windows7-windows11 windows8-1
Last synced: 06 Oct 2025
https://github.com/ivopetkov/data-object
A familiar and powerful Data Object abstraction for PHP.
data-collection data-list data-object filter sort
Last synced: 26 Oct 2025
https://github.com/sowinskibraeden/dayz-data-collection
Collects Log Data from Nitrado DayZ server.
Last synced: 09 Jul 2025
https://github.com/sferez/twitter_toolbox
Complete toolbox for scraping, streaming, interacting with the API, cleaning, preprocessing, and applying NLP to Twitter data
data-collection data-science nlp preprocessing twitter twitter-api twitter-scraping twitter-streaming-api
Last synced: 10 Apr 2025
https://github.com/aymane-maghouti/real-time-data-pipeline-using-kafka
This project implements a real-time data pipeline using Apache Kafka, Python's psutil library for metric collection, and SQL Server for data storage. The pipeline collects metrics data from the local computer, processes it through Kafka brokers, and loads it into a SQL Server database. Additionally, a real-time dashboard is created using Power BI.
apache-kafka data-collection data-streaming data-visualization powerbi python real-time real-time-data-pipeline
Last synced: 05 Jul 2025
https://github.com/beingvirus/jobminer
JobMiner – A Python-based web scraping toolkit for extracting and organizing job listings from multiple websites into structured data.
automation beautifulsoup career crawler data-collection data-mining hacktoberfest hacktoberfest-accepted hacktoberfest2025 job-scraper jobs open-source python selenium web-scraping
Last synced: 10 Oct 2025
https://github.com/kartta-labs/noter-frontend
Photo annotation tool
annotation crowdsourcing data-collection historical-data historical-maps photos
Last synced: 16 Aug 2025
https://github.com/amacsmith/macfly
The idea of the project is to maintain a list of URLs that most likely serve live data. The scrapers do an initial scrape of each site and send the result, along with a prompt, to an AI model, which generates a page that displays charts or explains the retrieved data. The scrapers then keep running and push live scraped data to the AI-generated page.
ai data-collection ml possibilities scraper website-generator websockets
Last synced: 01 Feb 2026
https://github.com/thatsinewave/guardianwatch-bot
Simple Discord bot that grabs all the public data about each user inside a server and outputs a list
administrative-tools csv-export data-collection discord-api discord-bot discord-py discord-token educational good-first-contribution good-first-issue good-first-pr good-first-project google-sheets-api member-information mit-license open-source python server-management thatsinewave user-analytics
Last synced: 27 Sep 2025
https://github.com/shuyib/teaching_data_collection
Learn data collection by putting a couple of things into consideration
best-practices data-collection data-science data-structures data-visualization makefile matplotlib pandas-dataframe polars-dataframe
Last synced: 22 Mar 2025
https://github.com/yousefkotp/local-leads-finder
Local Leads Finder helps you uncover nearby business prospects in minutes: enter a keyword and city, watch real-time progress, and download clean lead lists ready for outreach. Perfect for agencies, freelancers, and growth teams who need consistent, enriched local data without the heavy work.
api-integration business-intelligence data-collection flask google-maps lead-finder lead-generation lead-generation-bot lead-generation-data lead-generation-tool leads local-business local-businesses marketing-automation prospecting python sales-tools web-scraping web-scraping-python
Last synced: 06 Apr 2026
https://github.com/solrikk/datadigger
DataDigger is a powerful and intuitive web application designed to extract and analyze data from web pages.
business-intelligence content-extraction data-analysis data-collection data-extraction data-mining go golang-api html-parser marketing-tools metadata-extraction research-tools seo-tools web-application web-crawling web-scraping web-tools
Last synced: 15 Apr 2025
https://github.com/alitahir4024/data-collecting-project
This is a simple data collection project for practising JavaScript local storage skills
data-collection html-css-javascript localstorage
Last synced: 19 Mar 2026
https://github.com/dmdhrumilmistry/githubprofilescraper
Scrapes github profiles and stores data in json format
data-collection dmdhrumilmistry github-scraper python3 scraper
Last synced: 18 Jul 2025
https://github.com/benweare/givemeed
A Digital Micrograph script to collect three-dimensional electron diffraction data.
crystallography data-collection digital-micrograph electron-crystallography electron-diffraction electron-microscopy micro-ed three-dimensional-electron-diffraction transmission-electron-microscopy
Last synced: 07 Apr 2026
https://github.com/tuvalabs/django-inapp-survey
In App Survey/Announcement for Django Application
angular announcement-banner campaign data-collection django django-application django-inapp-survey django-rest-framework question-and-answer survey survey-app
Last synced: 26 Oct 2025
https://github.com/kmader/easy_dash
A library for making Dash apps easier to build and particularly focusing on common data collection use cases
data-collection python reactive-programming scientific-computing web-gui widgets
Last synced: 25 Oct 2025
https://github.com/artucuno/guild-network-map
Create a map of all mutual guilds that members share with you on Discord.
data-collection data-visualization discord
Last synced: 26 Feb 2026
https://github.com/firefly-cpp/succulent
Collect POST requests
data-collection data-preprocessing-pipelines data-science esp32 machine-learning raspberry-pi
Last synced: 02 Aug 2025