Internet Archive
The Internet Archive (archive.org) is a digital library whose organization also runs the Wayback Machine.
- GitHub: https://github.com/topics/internet-archive
- Aliases: wayback-machine
- Last updated: 2026-06-13 00:16:14 UTC
https://github.com/internetarchive/openlibrary
One webpage for every book ever published!
books hacktoberfest internet-archive library-catalogue open-source
Last synced: 09 Jun 2026
https://github.com/wabarc/wayback
An archiving tool with an IM-style interface that prioritizes privacy and accessibility, integrated with various archival services including Internet Archive, archive.today, Ghostarchive, IPFS, Telegraph, and file systems.
archive har heroku internet-archive ipfs irc mastodon matrix memento nostr notion save-the-internet screenshot self-hosted snapshot snapshot-webpage telegram telegraph twitter wayback-machine
Last synced: 11 Apr 2025
https://github.com/akamhy/waybackpy
Wayback Machine API interface & a command-line tool
archive-webpage archive-webpages cdx-api internet-archive internet-archiving osint savepagenow wayback-machine wayback-machine-api wayback-machine-python web-archiving webarchiving
Last synced: 15 May 2025
https://github.com/bibanon/tubeup
Use yt-dlp to download video/metadata and upload to the Internet Archive.
archival gplv3 internet-archive preservation python video youtube youtube-dl yt-dlp
Last synced: 14 May 2025
https://github.com/oduwsdl/archivenow
A Tool To Push Web Resources Into Web Archives
internet-archive web-archiving
Last synced: 05 Apr 2025
https://github.com/VerifiedJoseph/Save-to-the-Wayback-Machine
Browser extension for quickly saving web pages to the Internet Archive's Wayback Machine.
brave-extension browser-extension chrome-extension firefox-addon internet-archive opera-extension vivaldi-extension wayback-machine
Last synced: 27 Mar 2025
https://github.com/helgeho/ArchiveSpark
An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.
archivespark internet-archive spark spark-framework warc web-archiving webarchive
Last synced: 08 Apr 2025
https://github.com/crissyfield/troll-a
Drill into WARC web archives
command-line-tool common-crawl internet-archive security security-tools warc
Last synced: 16 Jan 2026
https://github.com/tiben/ia-rcade
Use MAME with roms from archive.org
chd emulator ia-mame internet-archive java mame mess retrocomputing rom
Last synced: 28 Jul 2025
https://github.com/claromes/waybacktweets
Retrieves archived tweets from Wayback Machine in HTML, CSV, and JSON
internet-archive osint osint-tools socmint twitter wayback-machine wayback-tweets x
Last synced: 04 Apr 2025
https://github.com/gdamdam/iagitup
A command line tool to archive a git repository from GitHub to the Internet Archive.
archive archiving cli git github internet-archive internetarchive
Last synced: 22 Aug 2025
https://github.com/erlange/wbm-dl
Wayback Machine Downloader. 🔥 Download your entire archived websites from the Internet Archive Wayback Machine.
command-line-app command-line-parser command-line-tool console console-app console-application csharp internet internet-archive internet-wayback-machine wayback-machine wayback-machine-downloader website-scraper
Last synced: 05 Apr 2025
https://github.com/hrbrmstr/newsflash
Tools to Work with the Internet Archive and GDELT Television Explorer in R
gdelt-television-explorer internet-archive r r-cyber rstats
Last synced: 19 Jul 2025
https://github.com/ticky/wayback-classic
🕸 A frontend for the Wayback Machine which works on old browsers
1990s 2000s cgi cgi-application internet-archive ruby wayback wayback-machine
Last synced: 11 Apr 2025
https://github.com/wimpysworld/ia-get
File downloader for archive.org ⬇️
download-manager downloader hacktoberfest internet-archive rust wayback-machine
Last synced: 26 Apr 2025
https://github.com/buren/wayback_archiver
Ruby gem to send URLs to Wayback Machine
internet-archive ruby rubygem wayback-archiver wayback-machine
Last synced: 04 Apr 2025
https://github.com/machawk1/mink
Chrome extension that uses Memento to indicate that a page a user is viewing on the live web has an archived copy and to give the user access to the copy
chrome extension internet-archive memento memento-rfc
Last synced: 17 Nov 2025
https://github.com/hrbrmstr/wayback
⏪ Tools to Work with the Various Internet Archive Wayback Machine APIs
internet-archive memento r r-cyber rstats wayback wayback-machine web-scraping
Last synced: 21 Mar 2025
https://github.com/internetarchive/internet-archive-voice-apps
Voice Apps (Actions on Google, Alexa Skill) of Internet Archive. Just say: "Ok Google, Ask Internet Archive to Play Jazz" or "Alexa, Ask Internet Archive to play Instrumental Music"
actions-on-google alexa-skill dialog-flow internet-archive voice-assistant
Last synced: 09 Apr 2025
https://github.com/8ensmith/mcp-open-library
A Model Context Protocol (MCP) server for the Internet Archive's Open Library API that enables AI assistants to search for book and author information.
ai assistants authors books internet-archive mcp mcp-server modelcontextprotocol open-library open-library-api openlibrary
Last synced: 14 Jan 2026
https://github.com/wabarc/archive.is
A command-line tool and Go package for wayback web pages to archive.today
anticensorship archive archiveis archivetoday golang internet-archive internet-freedom memento no-more-404 tor wayback wayback-machine
Last synced: 16 May 2025
https://github.com/dbz/webcache
Chrome extension to view the cached version of the current webpage
cache-extension cache-service chrome-extension coral-cdn google-cache google-chrome-extension internet-archive javascript wayback-machine webcache
Last synced: 29 Oct 2025
https://github.com/metabrainz/artwork-redirect
URL redirect service for the coverartarchive.org
cover-art internet-archive python
Last synced: 28 Jul 2025
https://github.com/wabarc/cairn
NPM package and CLI tool for saving web page as single HTML file
archive base64 cli html html-files internet-archive javascript memento mhtml node nodejs npm-package obelisk single-file typescript wayback wayback-archiver webpage
Last synced: 11 Apr 2025
https://github.com/saveweb/wikiteam3
archiving MediaWikis (and uploading wikidump to the Internet Archive)
internet-archive mediawiki wiki
Last synced: 29 Sep 2025
https://github.com/mearman/mcp-wayback-machine
MCP server and CLI tool for interacting with the Internet Archive's Wayback Machine
archival cli internet-archive mcp mcp-server model-context-protocol wayback-machine web-archive
Last synced: 31 May 2026
https://github.com/JamieMagee/wayback
Save pages to the Wayback Machine as part of your CI/CD pipeline
actions github-actions internet-archive wayback wayback-machine
Last synced: 27 Mar 2025
https://github.com/saveweb/dokuwiki-dumper
A tool for archiving DokuWiki
archive dokuwiki internet-archive
Last synced: 08 Apr 2026
https://github.com/caltechlibrary/waystation
Automatically archive your repository's GitHub Pages in the Wayback Machine.
archiving automation documentation github-action github-actions github-automation github-pages internet-archive preservation wayback-machine
Last synced: 12 Jun 2025
https://github.com/alopezrivera/anchorage
Save your bookmark collection in the Internet Archive, or locally.
archiving internet-archive permanence web
Last synced: 29 Apr 2025
https://github.com/internetarchive/newsum
Daily TV News Summary using GPT
gdelt gpt internet-archive news-summarization openapi python summarization tv tv-news
Last synced: 07 May 2025
https://github.com/rchrd2/iajs
Internet Archive JavaScript Client which supports reading and writing data in NodeJS and the Browser
api-client archive-org gificites internet-archive login metadata s3 sdk upload wayback wayback-machine
Last synced: 07 May 2025
https://github.com/noql-net/archiver
Archive censorship tools to the Internet Archive
amnezia censorship filternet internet-archive sing-box wireguard xray
Last synced: 17 Jan 2026
https://github.com/webis-de/archive-query-log
📜 The Archive Query Log.
information-retrieval information-retrieval-history internet-archive query-log search-engine-result-page serp wayback-machine web-archive
Last synced: 10 Feb 2026
https://github.com/madjin/internet-archive-vr
Multiplayer virtual reality worlds of the Internet Archive in SF
3d-reconstruction internet-archive janusxr javascript threejs webgl webxr
Last synced: 31 Oct 2025
https://github.com/wikisource/ia-upload
Tool to import files from the Internet Archive to Wikimedia Commons.
djvu internet-archive pdf wikimedia-commons wikisource
Last synced: 27 Jan 2026
https://github.com/internetarchive/iacopilot
Summarize and ask questions about items in the Internet Archive
cli copilot gpt iacopilot internet-archive python repl
Last synced: 07 May 2025
https://github.com/foxcouncil/vintagehive
This project tries to help alter the modern internet to work on really old computers and systems...
archival education fun internet-archive nostalgia preservation proxy proxy-server retro
Last synced: 06 Oct 2025
https://github.com/emijrp/internet-archive
Scripts for Internet Archive
archive archiving crawler digital-preservation internet-archive webpage website
Last synced: 21 Jun 2025
https://github.com/AnandChowdhary/archiver
🗄️ Auto-archive your webpages on the Internet Archive
archive-dot-org automation internet-archive nodejs typescript wayback-machine
Last synced: 29 Jul 2025
https://github.com/httpreserve/linkstat
CLI implementation of httpreserve that can test links and retrieve internet archive replacements
archives cli code4lib digipres digital-preservation glam internet-archive link-checker wayback-machine web-archiving
Last synced: 17 Jan 2026
https://github.com/mineo/gocaa
Client for coverartarchive.org
cover-art go internet-archive musicbrainz
Last synced: 09 Apr 2025
https://github.com/bcongdon/wayback-archiver
🗄 CLI archival tool for the Wayback Machine
cli internet-archive wayback-machine
Last synced: 15 Apr 2025
https://github.com/itsliamdowd/WaybackBrowserMacOS
Pick a date and explore websites from the early days of the internet to now all in an easy-to-use browser format! 💻
application browser coding developer html internet internet-archive internet-archiving js learn macos macos-app macos-application macos-menubar macos-swift storyboard swift swiftapp wayback-archiver wayback-machine
Last synced: 18 Jul 2025
https://github.com/plibither8/wayback-sitemap-archive
🏛️ Archive all pages specified in the webpage's sitemap to Internet Archive's Wayback Machine
internet-archive sitemap wayback-machine
Last synced: 12 Apr 2025
https://github.com/janheinrichmerker/wayback-api
🕰️ Java wrapper for the Internet Archive's Wayback API.
internet-archive java java-wrapper kotlin kotlin-coroutines kotlin-jvm kotlin-library wayback wayback-api wayback-machine wayback-machine-downloader
Last synced: 07 Apr 2025
https://github.com/jhu-library-applications/vocab-apis
Python scripts to pull and convert data between Library of Congress vocabularies and other external vocabularies (GeoNames, VIAF, etc)
aat fast-api geonames getty-vocabularies internet-archive lcnaf linked-data oclc
Last synced: 27 Jul 2025
https://github.com/internetarchive/ia
A JS interface to archive.org
api download internet-archive javascript json metadata search
Last synced: 07 May 2025
https://github.com/plibither8/wayback
🏛 Microservice that redirects to archived version of the URL if found, otherwise saves it to the Internet Archive
archive internet-archive microservice now wayback wayback-machine zeit
Last synced: 22 Jun 2025
https://github.com/geiserx/wayback-archive
Download complete websites from the Wayback Machine with full asset preservation for offline viewing
archive cdn content-preservation css digital-preservation google-fonts html internet-archive minification offline-browsing python recursive-download static-site-generator url-rewriting wayback-machine web-archiving web-crawler web-scraping website-backup website-downloader
Last synced: 06 Apr 2026
https://github.com/mrrfv/webarchive
Crawls websites and saves found URLs to a file.
archive archiveteam archiving crawler crawling ia internet-archive scraper web-archiving web-scraping
Last synced: 18 Mar 2025
https://github.com/codyogden/inbox-by-gmail
An archived version of the Inbox by Gmail product announcement website provided for research and commentary.
google history inbox-by-gmail internet-archive
Last synced: 25 Jul 2025
https://github.com/wabarc/archive.org
A command-line tool and Go package for wayback web pages to archive.org
anticensorship archive archive-dot-org golang internet-archive internet-freedom memento no-more-404 wayback wayback-machine
Last synced: 16 May 2025
https://github.com/caltechlibrary/eprints2archives
Send records from an EPrints server to the Internet Archive and other web archives
archiving eprints internet-archive memento preservation python terminal web-archives web-archiving
Last synced: 14 Apr 2025
https://github.com/nemobis/beic
Misc tools for BEIC (Biblioteca Europea di Informazione e Cultura)
geocoding internet-archive library-automation library-catalogue marc21 marcxml mets python ruby scraper unimarc wikimedia wikimedia-commons zotero
Last synced: 29 Jun 2025
https://github.com/atomotic/iafc
Mirror an Internet Archive item to a local Fedora Commons repository
fedora-repository internet-archive
Last synced: 08 Sep 2025
https://github.com/rob-sve/iadownloader
Auto-download files and collections from Internet Archive
download downloader internet-archive python tqdm
Last synced: 18 Jul 2025
https://github.com/utrechtuniversity/ia-webscraping
An AWS workflow for collecting webpages from the Internet Archive
aws internet-archive python terraform web-scraping
Last synced: 13 Jul 2025
https://github.com/jjfiv/poetry-identification
Poetry Identification Code from my dissertation runs on zip files containing DJVUXML from the Internet Archive.
digital-humanities djvuxml internet-archive machine-learning poetry random-forests
Last synced: 26 Aug 2025
https://github.com/thaliaarchi/urlhero
Link resolver for current and defunct URL shorteners
archiveteam internet-archive url-shortener url-unshortener urlteam
Last synced: 03 Aug 2025
https://github.com/lucasintel/wayback.cr
Internet Archive's ⏳Wayback Machine interface for Crystal
crystal data-hoarder internet-archive osint wayback-machine
Last synced: 04 Jun 2026
https://github.com/janforman/aspseek
ASPseek is a full-featured medium-to-large scale SQL-based Internet search engine. It consists of an indexing robot, search daemon and search frontend (CGI program). These programs are written in C++ using the STL library.
full-text-search indexing-engine internet-archive search-engine searching spider
Last synced: 27 Mar 2025
https://github.com/openvoiceos/awesome-ocp-skills
Media skills for OCP, music, movies, radio, audiobooks and more!
audio audiobooks internet-archive internet-radio media movies music mycroft openvoiceos ovos podcasts skill youtube
Last synced: 28 Dec 2025
https://github.com/uvalib/emma
Education Materials Made Accessible
accessibility accessible-content-e-portal internet-archive
Last synced: 25 Jan 2026
https://github.com/gentlecat/caa
Go library for Cover Art Archive
api-wrapper cover-art-archive go golang internet-archive musicbrainz
Last synced: 12 Jan 2026
https://github.com/jhu-library-applications/is-this-digitized
google-books-api hathitrust internet-archive oclc
Last synced: 30 Mar 2025
https://github.com/bac0id/save-page-now-api
A wrapper for Internet Archive Wayback Machine's Save Page Now.
internet-archive python save-page-now wayback-machine
Last synced: 28 Feb 2026
https://github.com/thelovinator1/feedvault.se
FeedVault is an open-source web application that allows users to archive and search their favorite web feeds.
archive archivebox atom-feed backup feed-archive hacktoberfest internet-archive internet-archiving rss rss-aggregator rss-archive wayback-machine
Last synced: 02 Sep 2025
https://github.com/proinsias/wayback-utils
Utilities for submitting URLs to the Internet Archive's Wayback Machine
internet-archive urls wayback-machine
Last synced: 22 Mar 2025
https://github.com/internetarchive/tracey
Tracey Jaquith, Internet Archive 🏛️, talks and slides
cicd devops internet-archive javascript markdown slides
Last synced: 11 Dec 2025
https://github.com/tqdv/save-page-now
Save Page Now to the Wayback Machine
firefox-addon internet-archive wayback-machine
Last synced: 06 Apr 2026
https://github.com/overbrowsing/wasteback-machine
JavaScript library for measuring the size and composition of archived web pages.
internet-archive sustainable-web-design wayback-machine web-archives webpage-metrics website-analysis
Last synced: 27 Oct 2025
https://github.com/bocaletto-luca/githubtointernetarchive
GitHub Archiver to Internet Archive: a single-file Python tool (main.py) that mirrors every repository of a GitHub user or organization and uploads each mirror as a tarball to the Internet Archive. Metadata (description, license, topics) is automatically pulled from GitHub and attached to each upload. @bocaletto-luca
bocaletto-luca github-to-internet-archive internet-archive linux opensource python terminal
Last synced: 11 May 2026
https://github.com/internetarchive/public-domain-day-film-contest
Internet Archive Public Domain Day Film Contest 2024 Entries
contest films internet-archive public-domain
Last synced: 19 Mar 2026
https://github.com/mobiwn/com-eng-archive
A simple, user-friendly archive of the now-unavailable website, recreated for easy local access and community use.
internet-archive local-deployment urmia-university uucessc wayback-machine web-scraping
Last synced: 19 Mar 2026
https://github.com/janheinrichmerker/wayback-gradle-plugin
🕰️ Gradle plugin for the Internet Archive's Wayback API.
gradle gradle-plugin internet-archive wayback wayback-api wayback-gradle-plugin wayback-machine wayback-machine-downloader
Last synced: 30 Apr 2026
https://github.com/bac0id/wayback-machine-auto-save
A crawler to save web pages on list to Save Page Now of Internet Archive's Wayback Machine.
crawler internet-archive python save-page-now wayback-machine
Last synced: 28 May 2026
https://github.com/yankeexe/wayback-machine-browser-extension
Use the Wayback Machine to jump to snapshots taken at a given point in time | Unlock Paywalls
browser-extension chrome-extension firefox-extension internet-archive opensource wayback-machine
Last synced: 20 May 2026
https://github.com/exurd/roblox_wb_proxy
The Wayback Machine as a Roblox API proxy
internet-archive luau roblox roblox-api roblox-api-wrapper roblox-apis roblox-dev roblox-development roblox-lua roblox-studio wayback-machine
Last synced: 11 Feb 2026
https://github.com/hymkor/hp.vector.co.jp_va009797
A restoration of the content under hxxp://hp.vector.co.jp/VA009797 as it was around 2000-11-05
Last synced: 14 Feb 2026
https://github.com/harrisonpage/slides
Posting 35mm slide scans to Internet Archive and Bluesky
bluesky internet-archive photography
Last synced: 22 Apr 2026
https://github.com/aashir-athar/crate
Offline-first React Native music player for Creative Commons & public-domain music. Paste a Jamendo, Internet Archive, or Audius link, download legally, play offline. Expo SDK 56, TypeScript.
android audio-player audius creative-commons cross-platform expo expo-router internet-archive ios jamendo mobile-app music-player music-streaming offline-first offline-music public-domain react-native tanstack-query typescript zustand
Last synced: 07 Jun 2026
https://github.com/sirhenricus/the-yahoolinator
Randomized Yahooligans Wayback Machine Pages
for-fun html internet-archive javascript makeshift oldweb random-number-generators wayback-machine
Last synced: 03 Apr 2025
https://github.com/david0z/wayback-machine-downloader
Terminal UI tool for collective download of resources from Archive.org Wayback Machine
archive-org console-app console-application internet-archive internet-wayback-machine wayback-machine wayback-machine-downloader website-scraper
Last synced: 29 Oct 2025
https://github.com/hfiguiere/iatile
Set the tile page for an Internet Archive text item
Last synced: 22 May 2026
https://github.com/iamthenerdnextdoor/iastatus
A very basic status checker for IA servers in C.
ia internet-archive internetarchive internetarchiveattack status status-checker
Last synced: 04 Jun 2026
https://github.com/faithvoid/plugin.video.medusa
Internet Archive streaming add-on for XBMC.
internet-archive internetarchive x4x xbmc xbmc-addon xbmc-plugin xbmc-video-plugin xbmc4xbox xboxmediacenter
Last synced: 05 Apr 2025
https://github.com/neilvallon/arct
Tab completion for the Internet Archive
internet-archive tab-completion web-crawling
Last synced: 18 Oct 2025
https://github.com/exurd/yita
YouTube Into The Archive
chrome-extension ia internet-archive mirrortube youtube yt
Last synced: 21 Apr 2026