{"id":15133088,"url":"https://github.com/bellingcat/vk-url-scraper","last_synced_at":"2026-02-04T00:39:15.486Z","repository":{"id":41121423,"uuid":"504525540","full_name":"bellingcat/vk-url-scraper","owner":"bellingcat","description":"Scrape VK URLs to fetch info and media - python API or command line tool. ","archived":false,"fork":false,"pushed_at":"2024-07-16T15:40:07.000Z","size":317,"stargazers_count":42,"open_issues_count":0,"forks_count":6,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-29T20:51:01.794Z","etag":null,"topics":["command-line","media-downloader","open-source-research","python","scraper","vk"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/vk-url-scraper/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bellingcat.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-06-17T12:25:26.000Z","updated_at":"2024-10-26T05:00:00.000Z","dependencies_parsed_at":"2024-07-16T18:37:45.117Z","dependency_job_id":null,"html_url":"https://github.com/bellingcat/vk-url-scraper","commit_stats":{"total_commits":104,"total_committers":3,"mean_commits":"34.666666666666664","dds":0.08653846153846156,"last_synced_commit":"ea834c37e253d040f372f8d92b3b8b35d9edd994"},"previous_names":[],"tags_count":39,"template":false,"template_full_name":"allenai/python-package-template","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bellingcat%2Fvk-url-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bellingcat%2Fvk-url-scr
aper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bellingcat%2Fvk-url-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bellingcat%2Fvk-url-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bellingcat","download_url":"https://codeload.github.com/bellingcat/vk-url-scraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":233990110,"owners_count":18762135,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["command-line","media-downloader","open-source-research","python","scraper","vk"],"created_at":"2024-09-26T05:00:26.977Z","updated_at":"2025-09-23T17:31:46.876Z","avatar_url":"https://github.com/bellingcat.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# vk-url-scraper\nPython library to scrape data, and especially media links like videos and photos, from vk.com URLs.\n\n\u003e This repo has been archived because it relies on a fixed git commit of the vk_api library, which means we can no longer publish it to pypi; see the [issue](https://github.com/bellingcat/vk-url-scraper/issues/66). You can still install the latest release. 
This archived state may change if a solution is found to publish the library to pypi again.\n\n[![PyPI version](https://badge.fury.io/py/vk-url-scraper.svg)](https://badge.fury.io/py/vk-url-scraper)\n[![PyPI download month](https://img.shields.io/pypi/dm/vk-url-scraper.svg)](https://pypi.python.org/pypi/vk-url-scraper/)\n[![Documentation Status](https://readthedocs.org/projects/vk-url-scraper/badge/?version=latest)](https://vk-url-scraper.readthedocs.io/en/latest/?badge=latest)\n\n\nYou can use it via the [command line](#command-line-usage) or as a [python library](#python-library-usage); see the **[documentation](https://vk-url-scraper.readthedocs.io/en/latest/)** for details.\n\n## Installation\nYou can install the most recent release from [pypi](https://pypi.org/project/vk-url-scraper/) via `pip install vk-url-scraper`.\n\nCurrently you need to manually uninstall and re-install one dependency (as it is installed from github and not pypi):\n```bash\npip uninstall vk-api\npip install git+https://github.com/python273/vk_api.git@b99dac0ec2f832a6c4b20bde49869e7229ce4742\n```\n\nTo use the library you will need a valid username/password combination for vk.com. 
\n\n## Command line usage\n```bash\n# run this to learn more about the parameters\nvk_url_scraper --help\n\n# scrape a URL and get the JSON result in the console\nvk_url_scraper --username \"username here\" --password \"password here\" --urls https://vk.com/wall12345_6789\n# OR\nvk_url_scraper -u \"username here\" -p \"password here\" --urls https://vk.com/wall12345_6789\n# you can also have multiple urls\nvk_url_scraper -u \"username here\" -p \"password here\" --urls https://vk.com/wall12345_6789 https://vk.com/photo-12345_6789 https://vk.com/video12345_6789\n\n# you can pass a token as well to avoid always authenticating \n# and possibly getting captcha prompts\n# you can fetch the token from the generated vk_config.v2.json file by searching for \"access_token\"\nvk_url_scraper -u \"username\" -p \"password\" -t \"vktoken goes here\" --urls https://vk.com/wall12345_6789\n\n# save the JSON output into a file\nvk_url_scraper -u \"username here\" -p \"password here\" --urls https://vk.com/wall12345_6789 \u003e output.json\n\n# download any photos or videos found in these URLs\n# this will use or create an output/ folder and dump the files there\nvk_url_scraper -u \"username here\" -p \"password here\" --download --urls https://vk.com/wall12345_6789\n# or\nvk_url_scraper -u \"username here\" -p \"password here\" -d --urls https://vk.com/wall12345_6789\n```\n\n## Python library usage\n```python\nfrom vk_url_scraper import VkScraper\n\nvks = VkScraper(\"username\", \"password\")\n\n# scrape any \"photo\" URL\nres = vks.scrape(\"https://vk.com/photo1_278184324?rev=1\")\n\n# scrape any \"wall\" URL\nres = vks.scrape(\"https://vk.com/wall-1_398461\")\n\n# scrape any \"video\" URL\nres = vks.scrape(\"https://vk.com/video-6596301_145810025\")\nprint(res[0][\"text\"]) # e.g. get the text of the first scraped item\n```\n\n```python\n# Every scrape* function returns a list of dicts like\n{\n\t\"id\": \"wall_id\",\n\t\"text\": \"text in this post\" ,\n\t\"datetime\": utc 
datetime of post,\n\t\"attachments\": {\n\t\t# if photo, video, link exists\n\t\t\"photo\": [list of urls with max quality],\n\t\t\"video\": [list of urls with max quality],\n\t\t\"link\": [list of urls with max quality],\n\t},\n\t\"payload\": \"original JSON response converted to dict which you can parse for more data\"\n}\n```\n\nSee the [docs](https://vk-url-scraper.readthedocs.io/en/latest/) for all available functions. \n\n### TODO\n* scrape album links\n* scrape profile links\n* docs online from sphinx\n\n## Development\n(more info in [CONTRIBUTING.md](CONTRIBUTING.md)).\n\n1. Set up the dev environment with `pipenv install --dev`\n1. Set up the environment with `pipenv install -r requirements.txt`\n1. Activate the environment with `pipenv shell` (or prepend `pipenv run` to all commands)\n2. To run all checks use `make run-checks` (fixes style), or run them individually\n   1. To fix style: `black .` and `isort .` -\u003e `flake8 .` to validate lint\n   2. To do type checking: `mypy .`\n   3. To test: `pytest .` (`pytest -v --color=yes --doctest-modules tests/ vk_url_scraper/` to use verbose, colors, and test docstring examples)\n3. `make docs` to generate sphinx docs -\u003e edit [conf.py](docs/source/conf.py) if needed\n\nTo test the command line interface available in [__main__.py](vk_url_scraper/__main__.py) you need to pass the `-m` option to python like so: `python -m vk_url_scraper -u \"\" -p \"\" --urls ...`\n\n\n## Releasing new version\n1. edit [version.py](vk_url_scraper/version.py) with proper versioning\n2. make sure to run `pipenv run pip freeze \u003e requirements.txt` if you manage libs with pipenv\n   1. if the hardcoded version of [vk_api](https://github.com/python273/vk_api) is still being used, then you must comment/remove that line from the generated requirements file and instruct users to manually install the version from the source as pypi does not allow repo/commit tags. Additionally, add the latest released version, currently `vk-api==11.9.9`. \n3. 
run `./scripts/release.sh` to create a tag and push; alternatively:\n   1. `git tag vx.y.z` to tag version\n   2. `git push origin vx.y.z` -\u003e this will trigger workflow and put project on [pypi](https://pypi.org/project/vk-url-scraper/)\n4. go to https://readthedocs.org/ to deploy new docs version (if the webhook is not set up)\n\n### Fixing a failed release\n\nIf for some reason the GitHub Actions release workflow failed with an error that needs to be fixed, you'll have to delete both the tag and corresponding release from GitHub. After you've pushed a fix, refresh the tags in your local clone (this deletes all local tags and re-fetches them from the remote) with\n\n```bash\ngit tag -l | xargs git tag -d \u0026\u0026 git fetch -t\n```\n\nThen repeat the steps above.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbellingcat%2Fvk-url-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbellingcat%2Fvk-url-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbellingcat%2Fvk-url-scraper/lists"}