{"id":16379895,"url":"https://github.com/justdoom/archiveenginebackend","last_synced_at":"2026-06-19T12:32:20.176Z","repository":{"id":195310679,"uuid":"619078195","full_name":"JustDoom/ArchiveEngineBackend","owner":"JustDoom","description":null,"archived":false,"fork":false,"pushed_at":"2025-06-04T07:46:37.000Z","size":129,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-04T14:12:04.507Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JustDoom.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-03-26T07:41:43.000Z","updated_at":"2025-06-04T07:46:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"cbee5768-b02a-46cd-9743-4f8db466043a","html_url":"https://github.com/JustDoom/ArchiveEngineBackend","commit_stats":null,"previous_names":["justdoom/archiveenginebackend"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/JustDoom/ArchiveEngineBackend","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustDoom%2FArchiveEngineBackend","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustDoom%2FArchiveEngineBackend/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustDoom%2FArchiveEngineBackend/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustDoom%2FArchiveEngineBackend/manifests","owner_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/owners/JustDoom","download_url":"https://codeload.github.com/JustDoom/ArchiveEngineBackend/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JustDoom%2FArchiveEngineBackend/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34532253,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-19T02:00:06.005Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-11T03:49:53.586Z","updated_at":"2026-06-19T12:32:20.161Z","avatar_url":"https://github.com/JustDoom.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ArchiveEngine\n\n[![Discord Server](https://img.shields.io/discord/979589333524820018?color=7289da\u0026label=EimerArchive\u0026style=flat-square\u0026logo=appveyor)](https://discord.gg/k8RcgxpnBS)\n\nThis project aims to provide a basic search engine for the Wayback Machine part of archive.org. You are able to look for\nother urls that begin with a specific url but that is limited to 10,000 results which may not be enough in most cases.\nSo I decided to try and make a \"search engine\" for it myself. 
Currently, it only uses data from the URL itself.\n\n## Why\n\nArchive.org/The Wayback Machine is an amazing service, but it is really lacking in the API and searchability\ndepartment. I thought that if there was something that let me search for archived links with a certain word in the URL,\nit would make finding certain lost files so much easier (mainly Minecraft-related ones).\n\n## How it works\n\nThis uses the [CDX API](https://github.com/internetarchive/wayback/tree/master/wayback-cdx-server) to fetch URLs to be\nindexed. It used to use some other endpoint, but it was extremely slow and unreliable. This new system is at least 10\ntimes faster in most cases, and a lot more reliable.\n\nThe new system also finds URLs under any of the domain's subdomains, even if you didn't know it had any!\n\n## TODO\n\n### API\n\n- [ ] Write the TODO :P\n\n### Indexer\n\n- [x] Indexing of links from a certain domain\n- [x] Retry indexing of failed requests (status code errors or even just timeouts)\n- [x] Basic multithreading for fetching and indexing URLs\n- [x] Stop duplicate links\n- [ ] Even more speed somehow\n- [ ] Queue up domains to search through\n\n## Contributing\n\nTODO","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjustdoom%2Farchiveenginebackend","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjustdoom%2Farchiveenginebackend","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjustdoom%2Farchiveenginebackend/lists"}