Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/axelgard/seek-ipns
a search engine for IPNS
golang ipfs ipns ipns-ipfs python scikit-learn search search-engine
- Host: GitHub
- URL: https://github.com/axelgard/seek-ipns
- Owner: AxelGard
- License: mit
- Created: 2023-04-24T07:43:38.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2023-11-12T07:20:29.000Z (about 1 year ago)
- Last Synced: 2024-08-20T14:47:58.587Z (6 months ago)
- Topics: golang, ipfs, ipns, ipns-ipfs, python, scikit-learn, search, search-engine
- Language: Jupyter Notebook
- Homepage: https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-198457
- Size: 53.9 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# seekIPNS
A search engine for IPNS records.

## Disclaimer
This is part of my master's thesis, so pull requests will not be accepted.
However, feel free to fork it and use it as you wish.

## The thesis
The full thesis can be found **[HERE](https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-198457)**.
My presentation for the thesis can be found **[HERE](./doc/TQDT33%20-%20Final%20presentation.pdf)**
### Search
A part of the thesis was building a search engine, which can also be found in this repo, [here](./server/).
The search model uses [TF-IDF](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html).

I also made a very simple frontend for it, which can be found [here](./frontend/).
![](./doc/Screenshot_seekIPNS_result.png)
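
To illustrate how TF-IDF ranking works in general, here is a minimal, dependency-free sketch. It is not the code in `./server/`, which is linked to scikit-learn's `TfidfVectorizer`. The sketch uses the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` and L2 normalization, as in scikit-learn's documented defaults, but the tokenizer, the function names (`tokenize`, `build_index`, `vectorize`, `search`) and the sample documents are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(tokens, idf):
    """TF-IDF weights for known terms, scaled to unit L2 length."""
    tf = Counter(t for t in tokens if t in idf)
    weights = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def build_index(docs):
    """Return (idf, vectors) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, so rare terms weigh more than common ones.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [vectorize(tokens, idf) for tokens in tokenized]
    return idf, vectors


def search(query, idf, vectors, top_k=3):
    """Rank documents by cosine similarity to the query; drop zero scores."""
    q = vectorize(tokenize(query), idf)
    scores = [
        (sum(w * vec.get(t, 0.0) for t, w in q.items()), i)
        for i, vec in enumerate(vectors)
    ]
    ranked = sorted((s for s in scores if s[0] > 0), reverse=True)
    return [(i, score) for score, i in ranked[:top_k]]


# Hypothetical documents, e.g. the file names found under one IPNS record each.
docs = [
    "index.html style.css logo.png",
    "thesis.pdf presentation.pdf notes.txt",
    "song.mp3 cover.png",
]
idf, vectors = build_index(docs)
results = search("thesis pdf", idf, vectors)
```

`results` is a list of `(document index, score)` pairs, best match first. Documents that share no term with the query are left out.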
### Some results
For each IPNS record found, the following was evaluated:
* Peer ID
* CID
* All file names
* All file paths
* All file sizes
* All file formats
* Number of files
* Number of peers hosting the CID
* Time when the peer was found

Some of the records found on the network were saved and used for search.
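
The fields listed above could be held in a structure like the following. This is only a sketch: the class and field names (`IPNSRecord`, `FileEntry`, `provider_count`, `found_at`), the byte unit for sizes, and deriving the file format from the file extension are all assumptions made for illustration, not the layout used in the thesis.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import List


@dataclass
class FileEntry:
    path: str  # path of the file below the record's root CID
    size: int  # size in bytes (assumed unit)

    @property
    def name(self) -> str:
        return PurePosixPath(self.path).name

    @property
    def format(self) -> str:
        # Assumption: the format is taken from the file extension.
        return PurePosixPath(self.path).suffix.lstrip(".").lower()


@dataclass
class IPNSRecord:
    peer_id: str
    cid: str
    found_at: datetime       # time when the peer was found
    provider_count: int = 0  # number of peers hosting the CID
    files: List[FileEntry] = field(default_factory=list)

    @property
    def file_count(self) -> int:
        return len(self.files)

    def formats(self) -> List[str]:
        """Distinct file formats in the record, sorted."""
        return sorted({f.format for f in self.files if f.format})

    def search_text(self) -> str:
        """Join all file paths into one string that a search index can use."""
        return " ".join(f.path for f in self.files)


# Placeholder values only; these are not real identifiers from the crawl.
record = IPNSRecord(
    peer_id="PEER_ID_PLACEHOLDER",
    cid="CID_PLACEHOLDER",
    found_at=datetime(2023, 5, 1, tzinfo=timezone.utc),
    provider_count=2,
    files=[FileEntry("docs/thesis.pdf", 1024), FileEntry("notes.txt", 42)],
)
```

Deriving `file_count`, `formats()` and `search_text()` from the file list avoids storing values that could drift out of sync with it.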
35K unique peers were crawled, and of those, 333 had a nonempty default IPNS record.

![peer crawled](./charts/peers_time.png)

83099 files were found during the crawl.

![number of files](./charts/files_per_peer_log.png)

The decentralization of the content was also looked at:
![decentralization](./charts/peers_hosting_cids.png)