{"id":15651829,"url":"https://github.com/leoncvlt/etf4u","last_synced_at":"2025-04-14T03:33:50.733Z","repository":{"id":40246386,"uuid":"336879089","full_name":"leoncvlt/etf4u","owner":"leoncvlt","description":"📊 Python tool to scrape real-time information about ETFs from the web and mixing them together by proportionally distributing their assets allocation","archived":false,"fork":false,"pushed_at":"2024-09-23T16:02:10.000Z","size":61,"stargazers_count":37,"open_issues_count":4,"forks_count":5,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-27T17:35:30.143Z","etag":null,"topics":["etfs","investing","python","scraping","stock-market"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/leoncvlt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-07T20:09:54.000Z","updated_at":"2024-12-10T21:55:56.000Z","dependencies_parsed_at":"2024-10-23T06:53:08.850Z","dependency_job_id":null,"html_url":"https://github.com/leoncvlt/etf4u","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leoncvlt%2Fetf4u","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leoncvlt%2Fetf4u/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leoncvlt%2Fetf4u/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leoncvlt%2Fetf4u/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/owners/leoncvlt","download_url":"https://codeload.github.com/leoncvlt/etf4u/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248815504,"owners_count":21165935,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["etfs","investing","python","scraping","stock-market"],"created_at":"2024-10-03T12:40:18.953Z","updated_at":"2025-04-14T03:33:50.703Z","avatar_url":"https://github.com/leoncvlt.png","language":"Python","funding_links":["https://www.buymeacoffee.com/leoncvlt"],"categories":[],"sub_categories":[],"readme":"# etf4u\n\nA Python tool that scrapes real-time information about ETF holdings from the web and can blend multiple ETFs together by proportionally distributing their asset allocations, optionally clamping the maximum number of assets in the blended fund.\n\n## Installation \u0026 Requirements\n\nMake sure you're in your virtual environment of choice, then run\n- `poetry install --no-dev` if you have [Poetry](https://python-poetry.org/) installed\n- `pip install -r requirements.txt` otherwise\n\n## Usage\n\n```\netf4u [-h] [--clamp CLAMP] [--minimum MINIMUM] [--exclude EXCLUDE [EXCLUDE ...]] [--include INCLUDE [INCLUDE ...]] [--out-file OUT_FILE] [--no-cache] [-v] funds [funds ...]\n\npositional arguments:\n  funds                 A list of ETF symbols (or a single one) to scrape\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --clamp CLAMP         Clamp the number of maximum assets to this value, redistributing weights\n  --minimum MINIMUM     Remove all 
assets with allocation smaller than this number after redistribution\n  --exclude EXCLUDE [EXCLUDE ...]\n                        A list of tickers to exclude from the scraped portfolio. Pass the tickers directly to the argument (e.g. --exclude AAA BBB CCC) Or pass the path to a text file containing the tickers\n  --include INCLUDE [INCLUDE ...]\n                        Only include assets whose ticker appear in this list. Pass the tickers directly to the argument (e.g. --include AAA BBB CCC) Or pass the path to a text file containing the tickers\n  --out-file OUT_FILE   Exports the holdings list to this comma-separated (.csv) file\n  --no-cache            Don't use cache files to load or store data\n  -v, --verbose         Increase output log verbosity\n  ```\n\n## Explanation\n\nThe tool produces the list of assets as a simple `{ [asset_symbol] : [weight] }` dictionary. Use the `--out-file` option to export it to a .csv file.\n\nFor each ETF symbol provided, the script checks whether a bespoke adapter is defined to fetch information for that specific fund (some ETFs provide the full list of holdings on their website). If none is found, it falls back to a generic adapter that scrapes https://etfdb.com/. The public version of that website only publishes the top 15 holdings for a fund, so the script combines several requests with different sorting criteria to gather as much data as possible.\n\nAll data is cached on a daily basis, so using the same fund in multiple commands scrapes the real-time information only once a day and re-uses the data from disk afterwards. 
Use the `--no-cache` flag to always query real-time data.\n\nYou can also use the tool to scrape a single ETF by passing only one symbol as the `funds` argument and not supplying the `--clamp` option.\n\n## Creating an adapter\n\nSimply create a new `.py` file in the `adapters` folder, implementing the following:\n\n- A `FUNDS` variable in the module's scope containing a list of ETF symbols that should be processed with this adapter\n- A `fetch()` function which takes a `fund` parameter (the ETF symbol) and returns a dictionary of assets and their weights in the `{ [asset_symbol] : [weight] }` format\n\nThere is no need to add anything else: the script automatically checks all modules in the `adapters` folder when processing funds. For practical examples, check the existing adapters.\n\n## Example usage\n`python etf4u ARKK ARKW ARKQ ARKF ARKG --clamp 50 --out-file blend_ark.csv`\nAdds together all holdings of the 5 ARK Active ETFs, keeps only the top 50 holdings on the list, rebalances all weights proportionally, and exports the asset list to the `blend_ark.csv` file.\n\n## Support [![Buy me a coffee](https://img.shields.io/badge/-buy%20me%20a%20coffee-lightgrey?style=flat\u0026logo=buy-me-a-coffee\u0026color=FF813F\u0026logoColor=white \"Buy me a coffee\")](https://www.buymeacoffee.com/leoncvlt)\nIf this tool has proven useful to you, consider [buying me a coffee](https://www.buymeacoffee.com/leoncvlt) to support development of this and [many other projects](https://github.com/leoncvlt?tab=repositories).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleoncvlt%2Fetf4u","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fleoncvlt%2Fetf4u","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleoncvlt%2Fetf4u/lists"}