{"id":23758399,"url":"https://github.com/orville-wright/ticker_data","last_synced_at":"2025-09-05T04:34:02.496Z","repository":{"id":37527499,"uuid":"220338725","full_name":"orville-wright/ticker_data","owner":"orville-wright","description":"Extract live market data from Yahoo Finance, Nasdaq.com and Bigcharts.marketwatch.com","archived":false,"fork":false,"pushed_at":"2025-06-24T11:51:01.000Z","size":2155,"stargazers_count":12,"open_issues_count":0,"forks_count":4,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-06-24T12:28:16.706Z","etag":null,"topics":["finance","pandas","python","stock-data","stock-market","stocks","ticker-data","ticker-symbol","trading"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/orville-wright.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-11-07T22:15:10.000Z","updated_at":"2025-06-24T11:51:05.000Z","dependencies_parsed_at":"2024-12-15T22:22:18.227Z","dependency_job_id":"8a79cfed-2114-4e6b-ac42-881a2118178e","html_url":"https://github.com/orville-wright/ticker_data","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/orville-wright/ticker_data","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orville-wright%2Fticker_data","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orville-wright%2Fticker_data/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orville-wri
ght%2Fticker_data/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orville-wright%2Fticker_data/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/orville-wright","download_url":"https://codeload.github.com/orville-wright/ticker_data/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orville-wright%2Fticker_data/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273713396,"owners_count":25154609,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-05T02:00:09.113Z","response_time":402,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["finance","pandas","python","stock-data","stock-market","stocks","ticker-data","ticker-symbol","trading"],"created_at":"2024-12-31T19:54:26.303Z","updated_at":"2025-09-05T04:33:57.474Z","avatar_url":"https://github.com/orville-wright.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ticker_data\n\nDate: 06 Oct 2021\u003cbr\u003e\n**Major updates**\u003cbr\u003e\nBigcharts did a large data schema update to its quote zone, which broke the data extractors badly. Interestingly, Bigcharts is a subsidiary of marketwatch.com, which is a subsidiary of Dow Jones \u0026 Company, which is a property of News Corp (OMG). 
It's also odd that this schema update happened 1 week after nasdaq.com did their major data schema update. The extractors and data cleaners/wranglers are now fully updated and re-aligned with the new schema (which had some non-trivial internal changes).\n\u003cbr\u003e\n\u003cbr\u003e\nNASDAQ.com did a major release of their Live Quote API data model. NASDAQ pushed their update out on Sept 31st and it became live on Oct 1. This broke a lot of quote-related code, as NASDAQ.com has divided their Live Quote data model into multiple API zones. The fix has been completed and the code is now re-aligned with NASDAQ.com's new data model (which is a bit messy under the covers, as it now has 4+ API zones \u0026 has inconsistencies in the JSON data structures across those zones).\u003cbr\u003e\n\u003cbr\u003e\n\u003cbr\u003e\n28 Sept 2021:\u003cbr\u003e\nYAHOO.com did a major rewrite of their internal page data structures (see disclaimer below). This broke the finance.yahoo.com core data scraper badly. The code is now fully aware of Yahoo's enhancements. The logic works well (again). I've started investigating the query1.yahoo.com API interface as an alternative to scraping.\u003cbr\u003e\n- The news ML (NLP) preparation functionality (i.e. the -n \u003csymbol\u003e and -a command-line options) is now stable. The NLP prep code runs without errors \u0026 the new hinter/confidence logic is complete. All this NLP pre-work code is necessary to prepare the machine to NLP read a corpus of stock news articles etc. 
We need to know which articles are 'real news reports', which articles are junk ads or bogus links to junk ads, and where the final target article lives in the real world.\n\u003cbr\u003e\n\u003cbr\u003e\nOlder news...\u003cbr\u003e\n- ML NLP (Natural Language Processing) hacking continues - The machine wants to NLP read the news articles for a stock and guess/infer sentiment.\n- The NLP prep system now scans the Yahoo Finance NEWS feeds of multiple stocks and infers (with confidence) which new articles are fake/credible, their type \u0026 their true locality.\n- This is pretty complex, i.e. deciding what a real news item is (article, research report, video story, op-ed article etc.) and then learning the final target locality of the article you want the machine to NLP read. All this NLP prep code is very finance.yahoo.com centric, but now that it's complete...it won't be difficult to port to other news data sources.\u003cbr\u003e\n\u003cbr\u003e\n\u003cbr\u003e\n\u003cbr\u003e\n**DISCLAIMER**\u003cbr\u003e\nThis code is still in heavy development \u0026 design. Much of it works well, but a few areas are still early in their prototyping phase (e.g. ML \u0026 A.I). Also, the overall strategy behind the design is subject to change as code in key areas becomes more functional. - Use at your own risk.\u003cbr\u003e\n\u003cbr\u003e\n**SYNOPSIS**\u003cbr\u003e\nI built this App to extract live stock data (the raw data info) from various real-time Market web feeds.\u003cbr\u003e\nIts main objective is to literally *get at the raw underlying data*, so you can do much more interesting things with it.\u003cbr\u003e\nSee the WIKI for more info: https://github.com/orville-wright/ticker_data/wiki\n\nThe code currently supports the following data sources, data extraction methods and APIs...\n  1. yahoo.com/finance - BS4 web scraper/data extractor - (S/M/L/X sector/mkt-cap stats, top gainers/losers)\n  2. 
yahoo.com/news - BS4 web scraper/data extractor - (all news for a ticker)\n  3. alpaca.markets.com - native Python API (live stock quotes, live 60-second O/H/L/C/V candlestick bars)\n  4. bigcharts.marketwatch.com\n      * live quotes (15 mins delayed)\n      * live company ticker details\n      * All data comes via BS4 web scraper/data extractor\n  5. nasdaq.com - Native API \u0026 JSON extractor\n     * live real-time quotes\n     * No more BS4 scraping needed (deprecated)\n     * The old NASDAQ Unusual Volume website is officially dead. The website is now a fancy/complex JavaScript site.\n     * The new site is more difficult to read as it's 100% JavaScript. The new code works with the native NASDAQ API \u0026 gets pure JSON data.\n     * WARN: page is slow at the market open because unusual volume needs to build up momentum (for 5/10 mins) before being flagged as 'unusual'.\n  6. marketwatch.com\n     * live news feed processor/reader to assist ML and AI intelligence code\n     * marketwatch.com new scraper module is not yet fully working, although JavaScript scraping is now working in general.\n     * marketwatch.com is a very bloated rich media site, so it's slow-ish, but it has nice 'real-time data' and lots of rich info for ML \u0026 AI.\n     * Site is paranoid about JS validation/checking early in the client connection setup, so needs JS hack treatment.\n\nOnce I've extracted the data, I package it into a few formats...\n1. pandas DataFrames\n2. NumPy arrays\n3. Native Python dicts {}\n4. scikit-learn - ML Sentiment analysis of news - (ML schemes \u003e\u003e CountVectorizer, Termdoc vocab matrix, NLTK stopwords)\n\nThis is not a Backtesting framework (yawn...boring) or Day Trading trade execution platform (yawn...) 
or a Portfolio position dashboard (boring).\n\nThis tool's goal is to extract tons of data in real time about the market (on any day, at any moment right now) and build up a\nlarge corpus of live data to leverage as a feed into Machine Learning, Data Science \u0026 Statistics algorithms...in order to support\ntrade strategies.\n\nThere are many websites that provide considerable data and analytics in their beautifully rich web pages, but they are slow,\nover-inflated with useless bloat, riddled with targeted ads and pointless news headlines. They are unusable as a DS/ML tool for a\ntrader who is executing trades in real time....but the data they show is delicious and wonderful. - That's all you really need from\ntheir websites - their data.\n\nSo this tool's objective is to take their data, package it into internal API methods and focus it into ONE single pool of information.\n\u003cbr\u003e\n\u003cbr\u003e\n\n**DISCLAIMER**: Most websites do not like or appreciate data scraping apps/robots or apps that treat their website as a source of raw data (by extracting data from their underlying platform). Using this App might not be well-aligned with some websites' usage 'Terms \u0026 Conditions'. - Caveat emptor.\u003cbr\u003e\n\u003cbr\u003e\nThis code works on production websites/pages \u0026 APIs that are in constant development. The data scraping and API extraction is coded for the internal structures of each source webpage \u0026 API at a *point-in-time*. Those pages may change at any time as the sites do updates, enhancements \u0026 optimizations. - This may result in the data extraction \u0026 data wrangling code breaking.\n\n\nRegards,\u003cbr\u003e\n**~Orville**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Forville-wright%2Fticker_data","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Forville-wright%2Fticker_data","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Forville-wright%2Fticker_data/lists"}