{"id":13456703,"url":"https://github.com/raphaelsty/knowledge","last_synced_at":"2025-10-11T21:49:03.517Z","repository":{"id":154840679,"uuid":"605316812","full_name":"raphaelsty/knowledge","owner":"raphaelsty","description":"Open-source personal bookmarks search engine","archived":false,"fork":false,"pushed_at":"2025-03-17T00:12:28.000Z","size":11954482,"stargazers_count":615,"open_issues_count":1,"forks_count":31,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-17T07:54:13.715Z","etag":null,"topics":["bookmarks","github","hacker-news","knowledge-base","search-engine","twitter","zotero"],"latest_commit_sha":null,"homepage":"https://raphaelsty.github.io/knowledge/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/raphaelsty.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-22T22:45:41.000Z","updated_at":"2025-03-17T00:12:31.000Z","dependencies_parsed_at":"2023-07-02T06:41:50.769Z","dependency_job_id":"218c16fd-5269-451f-9d3b-53dc2055742a","html_url":"https://github.com/raphaelsty/knowledge","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raphaelsty%2Fknowledge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raphaelsty%2Fknowledge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raphaelsty%2Fknowledge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raphaelsty%2Fkno
wledge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/raphaelsty","download_url":"https://codeload.github.com/raphaelsty/knowledge/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245260809,"owners_count":20586475,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bookmarks","github","hacker-news","knowledge-base","search-engine","twitter","zotero"],"created_at":"2024-07-31T08:01:26.358Z","updated_at":"2025-10-11T21:49:03.495Z","avatar_url":"https://github.com/raphaelsty.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\n# Knowledge\n\n\u003c/div\u003e\n\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://raphaelsty.github.io/knowledge/\"\u003e\u003cstrong\u003ePersonal Knowledge Base\u003c/strong\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"img/demo.gif\" alt=\"Demonstration GIF\" style=\"width:100%; border-radius:10px; box-shadow:0 4px 8px rgba(0,0,0,0.1);\"\u003e\n\u003c/p\u003e\n\n**Knowledge** is a web application that automatically transforms the digital footprint into a personal search engine. 
It fetches content you interact with from various platforms—**GitHub**, **HackerNews**, and **Zotero**—and organizes it into a navigable knowledge graph.\n\n---\n\n## 🌟 Features\n\n- **🤖 Automatic Aggregation:** Daily, automated extraction of GitHub stars, HackerNews upvotes, and Zotero library.\n\n- **🔍 Powerful Search:** A built-in search engine to instantly find any item you've saved or interacted with.\n\n- **🕸️ Knowledge Graph:** Navigate bookmarks through a graph of automatically extracted topics and their connections.\n\nMy Personal Knowledge Base is available at [raphaelsty.github.io/knowledge](https://raphaelsty.github.io/knowledge/).\n\n---\n\n## 🛠️ How It Works\n\nA GitHub Actions workflow runs twice a day to perform the following tasks:\n\n1.  **Extracts Content** from specified accounts:\n    - GitHub Stars\n    - HackerNews Upvotes\n    - Zotero Records\n2.  **Processes and Stores Data** in the `database/` directory:\n    - `database.json`: Contains all the raw records.\n    - `triples.json`: Stores the knowledge graph data (topics and relationships).\n    - `retriever.pkl`: The serialized search engine model.\n3.  **Deploys Updates**:\n    - The backend API is automatically updated and pushed to the Fly.io instance.\n    - The frontend on GitHub Pages is refreshed with the latest data.\n\nThe backend is built with FastAPI and deployed on Fly.io, which offers a free tier suitable for this project. The frontend is a static site hosted on GitHub Pages. The search engine is powered by multiple [cherche](https://github.com/raphaelsty/cherche) lexical models and features a final [pylate-rs](https://github.com/lightonai/pylate-rs) model, which is compiled from Rust to WebAssembly (WASM) to run directly in the client's browser.\n\n## 🚀 Getting Started: Installation \u0026 Deployment\n\nFollow these steps to deploy your own instance of Knowledge.\n\n### 1\\. 
Fork \u0026 Clone\n\nFirst, fork this repository to your own GitHub account and then clone it to your local machine.\n\n### 2\\. Configuration\n\n#### A. Configure Secrets\n\nThe application requires API keys and credentials to function. These must be set as **Repository secrets** in your forked repository's settings (`Settings` \u003e `Secrets and variables` \u003e `Actions`).\n\n\u003cbr\u003e\n\n\u003ctable style=\"width:100%; border-collapse: collapse;\"\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth style=\"text-align:left; padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eSecret\u003c/th\u003e\n\u003cth style=\"text-align:left; padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eService\u003c/th\u003e\n\u003cth style=\"text-align:center; padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eRequired\u003c/th\u003e\n\u003cth style=\"text-align:left; padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eDescription\u003c/th\u003e\n\u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003e\u003ccode\u003eFLY_API_TOKEN\u003c/code\u003e\u003c/td\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003e\u003ca href=\"https://fly.io\"\u003eFly.io\u003c/a\u003e\u003c/td\u003e\n\u003ctd style=\"text-align:center; padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eYes\u003c/td\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eAllows the GitHub Action to deploy your application. 
See the Fly.io section for instructions.\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003e\u003ccode\u003eZOTERO_API_KEY\u003c/code\u003e\u003c/td\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003e\u003ca href=\"https://www.zotero.org/settings/keys\"\u003eZotero Settings\u003c/a\u003e\u003c/td\u003e\n\u003ctd style=\"text-align:center; padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eOptional\u003c/td\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eAn API key to access your Zotero library.\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003e\u003ccode\u003eZOTERO_LIBRARY_ID\u003c/code\u003e\u003c/td\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003e\u003ca href=\"https://www.zotero.org\"\u003eZotero\u003c/a\u003e\u003c/td\u003e\n\u003ctd style=\"text-align:center; padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eOptional\u003c/td\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eThe ID of the Zotero group library you want to index.\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003e\u003ccode\u003eHACKERNEWS_USERNAME\u003c/code\u003e\u003c/td\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003e\u003ca href=\"https://news.ycombinator.com\"\u003eHacker News\u003c/a\u003e\u003c/td\u003e\n\u003ctd style=\"text-align:center; padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eOptional\u003c/td\u003e\n\u003ctd style=\"padding:8px; border-bottom: 1px solid \\#ddd;\"\u003eHackerNews username to fetch upvoted posts.\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd style=\"padding:8px;\"\u003e\u003ccode\u003eHACKERNEWS_PASSWORD\u003c/code\u003e\u003c/td\u003e\n\u003ctd style=\"padding:8px;\"\u003e\u003ca 
href=\"https://news.ycombinator.com/\"\u003eHacker News\u003c/a\u003e\u003c/td\u003e\n\u003ctd style=\"text-align:center; padding:8px;\"\u003eOptional\u003c/td\u003e\n\u003ctd style=\"padding:8px;\"\u003eHackerNews password.\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\n\n#### B. Specify Sources\n\nNext, edit the `sources.yml` file at the root of the repository to specify which GitHub users' starred repositories you want to track.\n\n```yml\ngithub:\n  - \"raphaelsty\"\n  - \"gbolmier\"\n  - \"MaxHalford\"\n```\n\n### 3\\. Deployment\n\n#### A. Deploy the API to Fly.io\n\n1.  **Install `flyctl`**, the Fly.io command-line tool. Instructions can be found [here](https://fly.io/docs/hands-on/install-flyctl/).\n2.  **Sign up and log in** to Fly.io via the command line:\n    ```sh\n    flyctl auth signup\n    flyctl auth login\n    ```\n3.  **Get API token** and add it to your GitHub repository secrets as `FLY_API_TOKEN`:\n    ```sh\n    flyctl auth token\n    ```\n4.  **Launch the app.** Follow the [Fly.io launch documentation](https://fly.io/docs/hands-on/launch-app/). This will generate a `fly.toml` file. You won't need a database.\n\n\u003e ⚠️ **Update API URLs**\n\u003e After deploying, you must replace all instances of `https://knowledge.fly.dev` in the `docs/index.html` file with your own Fly.io app URL (e.g., `https://app_name.fly.dev`).\n\n#### B. Set up GitHub Pages\n\n1.  Go to your forked repository's settings (`Settings` \u003e `Pages`).\n2.  
Under `Build and deployment`, select the **Source** as `Deploy from a branch` and choose the `main` branch with the `/docs` folder.\n\n\u003e ⚠️ **Update CORS Origins**\n\u003e After your GitHub Pages site is live, you must add its URL to the `origins` list in the `api/api.py` file to allow cross-origin requests.\n\n```python\norigins = [\n    \"https://your-github-username.github.io\", # Add your GitHub Pages URL here\n]\n```\n\n---\n\n## 💸 Cost Management\n\nThis project is designed to be affordable, but you are responsible for the costs incurred on Fly.io. Here is how to keep them in check:\n\n\u003e ⚠️ **Bound Fly.io Concurrency**\n\u003e To prevent costs from scaling unexpectedly, define connection limits in the `fly.toml` file.\n\n```toml\n[services.concurrency]\n  hard_limit = 6\n  soft_limit = 3\n  type = \"connections\"\n```\n\n\u003e ⚠️ **Select a modest Fly.io VM**\n\u003e A small virtual machine is sufficient. A **shared-cpu-1x@1024MB** is a good starting point.\n\n---\n\n## 💻 Local Development\n\nTo run the API on your local machine for development, run the following command from the root of the repository:\n\n```sh\nmake launch\n```\n\n---\n\n## 🔌 Zotero Integration\n\nThe Zotero integration allows you to save academic papers, articles, and other documents, which will then be automatically indexed by your search engine.\n\n- **Browser Extension:** Use the Zotero Connector extension for your browser to easily save documents from the web.\n\n- **Mobile App:** The Zotero mobile app lets you add documents on the go. 
Any uploads will be indexed within a few hours.\n\n  \u003cdiv style=\"display: flex; justify-content: space-around; align-items: center; gap: 10px;\"\u003e\n  \u003cimg src=\"./img/arxiv_1.png\" alt=\"Zotero mobile app\" style=\"width: 30%;\"\u003e\n  \u003cimg src=\"./img/arxiv_2.png\" alt=\"Zotero mobile app\" style=\"width: 30%;\"\u003e\n  \u003cimg src=\"./img/arxiv_3.png\" alt=\"Zotero mobile app\" style=\"width: 30%;\"\u003e\n  \u003c/div\u003e\n\n---\n\n## 💡 Acknowledgements\n\nMy personal Knowledge Base is inspired by and extracts resources from the Knowledge Base of François-Paul Servant, namely [Semanlink](http://www.semanlink.net/sl/home).\n\n## 📜 License\n\nThis project is licensed under the **GNU General Public License v3.0**.\n\nKnowledge Copyright (C) 2023 Raphaël Sourty\n","funding_links":[],"categories":["Python","twitter"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraphaelsty%2Fknowledge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fraphaelsty%2Fknowledge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraphaelsty%2Fknowledge/lists"}