{"id":19892576,"url":"https://github.com/compscidr/ipfs_indexer","last_synced_at":"2025-05-02T18:31:46.083Z","repository":{"id":37938595,"uuid":"412630136","full_name":"compscidr/ipfs_indexer","owner":"compscidr","description":"An ipfs indexer / search engine built in rust","archived":false,"fork":false,"pushed_at":"2025-03-24T16:58:35.000Z","size":482,"stargazers_count":6,"open_issues_count":4,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-23T19:29:43.196Z","etag":null,"topics":["hacktoberfest","indexer","ipfs","rust","rust-lang"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/compscidr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-01T22:12:49.000Z","updated_at":"2025-03-24T16:58:20.000Z","dependencies_parsed_at":"2023-02-13T14:25:21.148Z","dependency_job_id":"b5adef89-02e1-4fe2-9322-1a02c7fac9bd","html_url":"https://github.com/compscidr/ipfs_indexer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/compscidr%2Fipfs_indexer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/compscidr%2Fipfs_indexer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/compscidr%2Fipfs_indexer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/compscidr%2Fipfs_indexer/manifests","owner_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/owners/compscidr","download_url":"https://codeload.github.com/compscidr/ipfs_indexer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252088582,"owners_count":21692823,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hacktoberfest","indexer","ipfs","rust","rust-lang"],"created_at":"2024-11-12T18:24:16.004Z","updated_at":"2025-05-02T18:31:43.571Z","avatar_url":"https://github.com/compscidr.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ipfs_indexer\n[![.github/workflows/build-and-test.yml](https://github.com/compscidr/ipfs_indexer/actions/workflows/build-and-test.yml/badge.svg)](https://github.com/compscidr/ipfs_indexer/actions/workflows/build-and-test.yml)\n\nAn ipfs indexer / search engine built in rust.\n\n## Build notes\nOn Ubuntu you may need to apt install: `libssl-dev`.\n\nAs of libp2p 0.44.0, it seems to require rust nightly: https://stackoverflow.com/questions/69848319/unable-to-specify-edition2021-in-order-to-use-unstable-packages-in-rust\n\nDid init as a \"binary\" - not sure if this makes sense, or if other people think we should split this into a library\nbit and an application bit. 
I suppose we can always change it later as it grows.\n\nFollowing this guide for libp2p:\nhttps://github.com/libp2p/rust-libp2p/blob/master/src/tutorial.rs\n\nTrying to follow best practices from here:\nhttps://doc.rust-lang.org/cargo/commands/cargo-init.html\n\nAdding Cargo.lock to version control - it *seems* like it might be best practice for a binary (app):\nhttps://stackoverflow.com/questions/62861623/should-cargo-lock-be-committed-when-the-crate-is-both-a-rust-library-and-an-exec\n\n## Building on mac\n\nRequirements: `xcode`\n\nRun `xcode-select --install` if you do not have xcode installed, need to update xcode, or run into xcode-related build errors.\n\n## Running with logging output\n- Run `cargo build` to build\n- Run `RUST_LOG=info ./target/debug/ipfs_indexer` to see logging output (adjust level accordingly)\n- Run `RUST_LOG=info ./target/debug/ipfs_indexer 127.0.0.1:8080` to use your own ipfs gateway instead of ipfs.io\n\nBy default it runs an endpoint on `0.0.0.0:9090`, so you can go to:\n- http://localhost:9090/status\n- http://localhost:9090/enqueue/somecid\n- http://localhost:9090/search/somequery\n\n## Running with docker\nFrom the docker directory, run `docker-compose up`. 
Currently the image is only ~26MB.\n\n## CI\nI set up a workflow that should run a build at least on push, but it doesn't run any tests because I have no idea how test\nsuites work yet for rust.\n\n\n## What Needs to be Done\n- Discover content to be indexed, add it to the index queue\n  - [ ] Listen in on the gossip protocol **Jason working on**\n  - [X] Start from some collection of pages on ipfs.io/ipfs\n- Implement an index queue processor\n  - [X] Fetch the ipfs content\n  - [X] Process the page for more ipfs links, and add those links into the index queue\n  - Index the pages somehow\n    - Ranked keywords by frequency or something?\n    - Need to update to support more than just html content (look at header and index files)\n    - Update the excerpt to be flexible - for images, it could be a small crop render of the original image, for videos it could be a gif preview render\n  - Store the index somehow (start with in-memory, then figure out how to do storage later) - **Conor working on**\n    - A hashmap of map[keyword] -\u003e sorted tree where the entries are sorted by keyword frequency and entries contain ipfs hash? - **Conor working on**\n    - Will probably want to think of an ejection mechanism sooner rather than later so we can eject to storage (least recently used? oldest? 
who knows?)\n    - Farther out - need to think about how the store will be designed\n   - [X] Probably also want to store an excerpt and the page title to present to the front-end\n- [X] Implement a backend API which a future front-end can use, and in the short term we can use to debug\n  - search -\u003e search result\n    - ordered list of \u003ctitle, link, excerpts\u003e, possibly grouped by text, images, videos, other\n  - stats:\n    - indexed entries\n    - outstanding index queue\n    - memory used / free\n    - storage used / free\n- Implement a front-end which queries the index storage and displays the page title, ipfs.io/ipfs link to the page and excerpt\n  from the browser\n- Feedback loop from what people click on more often to rank those higher\n- [X] Might want a docker container:\n  - [X] deploy will auto restart itself on crash (will also make it easy to see consumed memory with docker stats and other tools)\n  - [X] will be able to deploy with a local ipfs instance all ready to go within the container\n  - can artificially restrict memory so we can test things like ejection mechanisms\n- Farther out -\u003e hook into papertrail or some logging service so we can see what's up if it dies\n- Tests! - **Conor working on**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompscidr%2Fipfs_indexer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcompscidr%2Fipfs_indexer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompscidr%2Fipfs_indexer/lists"}