https://github.com/mubelotix/simrepo

Web extension enhancing GitHub by showing similar projects in a repository's sidebar

git github machine-learning recommendations repositories


# SimRepo

[link-rgh]: https://github.com/sindresorhus/refined-github
[link-ngh]: https://github.com/sindresorhus/notifier-for-github
[link-hfog]: https://github.com/sindresorhus/hide-files-on-github
[link-tsconfig]: https://github.com/sindresorhus/tsconfig
[link-options-sync]: https://github.com/fregante/webext-options-sync
[link-cws-keys]: https://github.com/fregante/chrome-webstore-upload-keys
[link-amo-keys]: https://addons.mozilla.org/en-US/developers/addon/api/key

> Enhances GitHub by showing similar projects in a repository's sidebar

![Preview](media/previewer.png)

## Features

- Uses Manifest v3.
- Uses npm dependencies thanks to Parcel 2.
- [Auto-syncing options](#auto-syncing-options).
- [Auto-publishing](#publishing) with auto-versioning and support for manual releases.

## Installation

Firefox

## Technical details

Recommendations are generated by locating the nearest neighbors of a given repository within a vector space, where similar repositories are positioned close to each other. This vector space was built by training an SVC model on a large dataset containing over 300 million GitHub stars. To keep the model up to date, the dataset is refreshed incrementally: one-twelfth is updated each month.
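The monthly one-twelfth refresh can be pictured with a small sketch. How repositories are actually split into twelfths is not documented here, so the modulo-based sharding below, along with the names `SHARD_COUNT`, `shardOf`, and `reposToRefresh`, is purely an assumption for illustration.

```typescript
// Hypothetical scheme: assign each repository to one of 12 shards by its
// numeric ID, then refresh the shard that matches the current month.
// The real pipeline may partition the dataset differently.
const SHARD_COUNT = 12;

// Maps a repository ID to a shard number in the range 0..11.
function shardOf(repoId: number): number {
  return repoId % SHARD_COUNT;
}

// Returns the IDs due for a refresh in the given month.
// `month` is 0-based (0 = January), as returned by Date.prototype.getUTCMonth().
function reposToRefresh(repoIds: number[], month: number): number[] {
  return repoIds.filter((id) => shardOf(id) === month % SHARD_COUNT);
}
```

With a scheme like this, every repository is refreshed exactly once per year, and each monthly run touches roughly one-twelfth of the dataset.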

Metadata for all supported repositories is packed in [a compressed JSON file](https://github.com/Mubelotix/SimRepo/blob/main/static/repos-json-gz), which is updated monthly.

The entire dataset is compacted into just 110MB, enabling all recommendation logic to run locally. This ensures fast performance and complete privacy.

At present, nearest neighbor searches use a simple brute-force method. While this is adequate for the current scale of around 300,000 repositories, future improvements may include more efficient approximate search algorithms such as [Annoy](https://github.com/spotify/annoy).
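As a rough illustration of the brute-force approach, the sketch below scores every repository against the query and keeps the `k` best matches. It assumes cosine similarity as the distance measure and an in-memory array of `Float32Array` embeddings. Both are assumptions, and the names `Neighbor`, `cosineSimilarity`, and `nearestNeighbors` are hypothetical rather than taken from the extension's code.

```typescript
// Hypothetical types and names; the extension's real data layout may differ.
interface Neighbor {
  index: number; // position of the repository in the embedding table
  score: number; // cosine similarity to the query, in [-1, 1]
}

// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force search: compare the query against every other repository,
// sort by similarity (highest first), and return the top k.
function nearestNeighbors(
  embeddings: Float32Array[],
  queryIndex: number,
  k: number,
): Neighbor[] {
  const query = embeddings[queryIndex];
  const scored: Neighbor[] = [];
  for (let index = 0; index < embeddings.length; index++) {
    if (index === queryIndex) continue; // skip the repository itself
    scored.push({ index, score: cosineSimilarity(query, embeddings[index]) });
  }
  scored.sort((x, y) => y.score - x.score);
  return scored.slice(0, k);
}
```

Each query costs time proportional to the number of repositories multiplied by the vector dimension. Approximate indexes such as Annoy avoid that full scan, trading some accuracy for speed.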

**Note**: _Because training the full model is computationally expensive, only a smaller version intended for testing is included in this repository and is licensed under GPL-3.0. The full-scale model is not included and remains proprietary._

## Getting started

### Requirements

- Node and npm installed
- A UNIX-like operating system

### 🛠 Build locally

1. Run `npm install` to install all required dependencies
2. Run `npm run build`

The build step creates the `distribution` folder, which contains the generated extension.

### 🏃 Run the extension

Using [web-ext](https://extensionworkshop.com/documentation/develop/getting-started-with-web-ext/) is recommended for automatic reloading and running in a dedicated browser instance. Alternatively, you can load the extension manually (see below).

1. Run `npm run watch` to watch for file changes and build continuously
2. Run `npm install --global web-ext` (only needed the first time)
3. In another terminal, run `web-ext run -t chromium`
4. Check that the extension is loaded by opening the extension options ([in Firefox](media/extension_options_firefox.png) or [in Chrome](media/extension_options_chrome.png)).

#### Manually

You can also [load the extension manually in Chrome](https://www.smashingmagazine.com/2017/04/browser-extension-edge-chrome-firefox-opera-brave-vivaldi/#google-chrome-opera-vivaldi) or [Firefox](https://www.smashingmagazine.com/2017/04/browser-extension-edge-chrome-firefox-opera-brave-vivaldi/#mozilla-firefox).