{"id":15671760,"url":"https://github.com/defgsus/billion-bubbles","last_synced_at":"2025-03-30T05:24:33.060Z","repository":{"id":76001728,"uuid":"456300535","full_name":"defgsus/billion-bubbles","owner":"defgsus","description":"tool and website for graphing the top shareholders/insiders in (north american) capitalism","archived":false,"fork":false,"pushed_at":"2023-01-02T02:36:41.000Z","size":608,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-05T07:31:57.890Z","etag":null,"topics":["archive","financial","graph","nasdaq","relations","sec-edgar"],"latest_commit_sha":null,"homepage":"https://defgsus.github.io/billion-bubbles/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/defgsus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-02-06T23:34:29.000Z","updated_at":"2022-02-28T02:59:31.000Z","dependencies_parsed_at":"2023-07-04T04:06:03.784Z","dependency_job_id":null,"html_url":"https://github.com/defgsus/billion-bubbles","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/defgsus%2Fbillion-bubbles","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/defgsus%2Fbillion-bubbles/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/defgsus%2Fbillion-bubbles/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/defgsus%2Fbillion-bubbles/mani
fests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/defgsus","download_url":"https://codeload.github.com/defgsus/billion-bubbles/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246280100,"owners_count":20752072,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["archive","financial","graph","nasdaq","relations","sec-edgar"],"created_at":"2024-10-03T15:04:52.777Z","updated_at":"2025-03-30T05:24:33.038Z","avatar_url":"https://github.com/defgsus.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## graphing the top shareholders\n\nusing the **nasdaq.com** API, which itself aggregates the\n**sec.gov** filings API.\n\nSome of the latest findings can be viewed at\n[defgsus.github.io/billion-bubbles/](https://defgsus.github.io/billion-bubbles/)\n\nThe German *billion* is actually equal to the English *trillion*.\nThat's the range in which the very top companies operate. So\nthis repo could just as well be called *trillion troubles*\ninstead of *billion bubbles*. The bubbles, though, were\npicked as the means of representing companies\nand shareholders.\n\n\n### usage\n\nDo the typical *python env and pip requirements* stuff, then run,\nfor example:\n\n```bash\npython bubble.py --company MSFT \\\n    --depth 23 --min-share-value 10_000_000 \\\n    --output graph.graphml\n```\n\n... 
to start at **Microsoft** and follow all shareholders and insiders\nand the respective companies connected to them, up to a\nbranching level of **23**, while ignoring all shareholders\nbelow a position of **10 million** dollars market value.\nFinally, everything is rendered into a portable graph format.\n\nThe `output` filename determines the format.\nigraph supports [many formats](https://igraph.org/python/doc/tutorial/tutorial.html#igraph-and-the-outside-world).\nI personally suggest `graphml` because it preserves all the\nvertex and edge attributes. `gml` is also good, but the\nigraph reader messes up integers larger than 32 bits, which\nis a problem because this is *trillion-trouble* data.\n\nUnfortunately, this will run for several days and puts some\nstress on the nasdaq.com database. In fact, querying the complete\nlist of company holders or holder positions can lead to request\ntimeouts of 40 seconds, even though the page sizes\nare relatively small. Failed requests are repeated 3 times\nuntil they work or the whole scraper fails,\nwhich has not happened yet. But it was close!\n\nThe sqlite file grows a lot. After visiting roughly\n5000 companies and their connected entries it is about 3.5 GB.\nIgnoring the stock charts would probably save a lot of space,\nbut I deem them to be quite useful at some point.\n\n\n#### exporting/importing the database\n\nThis is useful if you run parallel scrapes on different\nmachines. 
You can merge databases together like this:\n\nExport the sqlite to compressed newline-delimited json:\n\n```bash\npython db.py export -o export.ndjson.gz -v\n```\n\nImport (all new) objects from the ndjson into sqlite:\n\n```bash\npython db.py import -i export.ndjson.gz -v\n```\n\nOmit the `.gz` extension to store uncompressed ndjson files.\n\n\n### interesting other sources\n\n- sec.gov *EDGAR*\n- https://www.cbetta.com/\n- https://anitab.org/research-and-impact/top-companies/2021-results/\n- https://fortune.com/fortune500/\n- https://www.allsides.com/\n- https://www.bilderbergmeetings.org/background/steering-committee/steering-committee\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdefgsus%2Fbillion-bubbles","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdefgsus%2Fbillion-bubbles","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdefgsus%2Fbillion-bubbles/lists"}