Best-of Generator


πŸ†Β  Generates a ranked markdown list of awesome libraries and tools.









Getting Started • Documentation • Support • Report a Bug • Contribution • Changelog

The best-of-generator is a CLI tool to generate a markdown page of ranked open-source projects based on a list of projects defined in a `yaml` file. It is integrated with different package managers - such as PyPI, NPM, Conda, and Docker Hub - to automatically collect a variety of project metadata and calculate project-quality scores. It also comes with a GitHub Action workflow for a fully automated update process.

> πŸ§™β€β™‚οΈ Create your own best-of list in just 3 minutes with [this guide](https://github.com/best-of-lists/best-of/blob/main/create-best-of-list.md).

## Highlights

- 📇 Generates a beautiful markdown page from a `yaml` list.
- 🔌 Integrates various package managers (npm, pypi, conda ...).
- 🥇 Calculates a project-quality score based on a variety of metrics.
- 📈 Identifies trending projects based on collected metrics.
- 🔄 GitHub Action workflow for automated weekly updates.

## Getting Started

> πŸ§™β€β™‚οΈ If you want to create your own best-of list, we strongly recommend to follow [this guide](https://github.com/best-of-lists/best-of/blob/main/create-best-of-list.md) instead of setting up best-of manually. With the guide, it will only take about 3 minutes to get you started. It is already set-up to automatically run the best-of generator via our GitHub Action and includes other useful template files. Installing the best-of CLI tool is not required.

1. Install best-of generator via pip:
```bash
pip install best-of
```
2. Create a `projects.yaml` file based on the [documented structure](#projectsyaml-structure). This file should contain at least one project. For example:
```yaml
projects:
  - name: "best-of-ml-python"
    github_id: "ml-tooling/best-of-ml-python"
```
3. Run best-of generator via command-line:
```bash
best-of generate -g <GITHUB_API_KEY> ./projects.yaml
```

You can find further information on how to configure the `projects.yaml` file and additional features in the [documentation section](#documentation) below.

## Support & Feedback

This project is maintained by [Benjamin Räthlein](https://twitter.com/raethlein), [Lukas Masuch](https://twitter.com/LukasMasuch), [Jan Kalkan](https://www.linkedin.com/in/jan-kalkan-b5390284/), and [Johannes Rieke](https://twitter.com/jrieke). Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.

| Type | Channel |
| ------------------------ | ------------------------------------------------------ |
| 🚨 **Bug Reports** | |
| 🎁 **Feature Requests** | |
| 👩‍💻 **Usage Questions** | |
| 📢 **Announcements** | |
| ❓ **Other Requests** | |

## Documentation


YAML Structure • Projects • Categories • Labels • Configuration • Project Quality Score • Trending Projects • CLI • GitHub Action • Python API

The best-of generator is a CLI tool to generate a markdown page from a list of projects configured in a `yaml` file. The documentation sections below provide information on the [`projects.yaml` structure](#projectsyaml-structure), on its different sections ([projects](#projects), [labels](#labels), [categories](#categories) & [configuration](#configuration)), on some of the best-of features (e.g. [project-quality score](#project-quality-score) & [trending projects](#trending-projects)), and instructions on how to run the markdown generation [via the command-line interface](#generation-via-cli) or [via GitHub Actions](#generation-via-github-action).

### `projects.yaml` Structure

The `projects.yaml` file has the following structure:

- `configuration` (optional): Can be used to overwrite the default configuration of the best-of list. More information in the [configuration section](#configuration).
- `categories` (required): All used categories should be listed here with at least a descriptive title. More information in the [categories section](#categories).
- `labels` (optional): Used labels can be added here to extend the label with additional aspects (e.g. URL, image, description). More information in the [labels section](#labels).
- `projects` (required): All projects that are supposed to be shown in the generated markdown page should be listed here. More information in the [projects section](#projects).

The following `yaml` shows a small example:

```yaml
# Optional: change the default configuration
configuration:
  markdown_header_file: "config/header.md"
  markdown_footer_file: "config/footer.md"

# Optional: add categories
categories:
  - category: "data-engineering"
    title: "Machine Learning & Data Engineering"
    subtitle: "Best-of lists about machine learning, data engineering, data science, or other topics related to big data."

# Optional: add labels
labels:
  - label: "python"
    image: "https://www.python.org/static/favicon.ico"
    description: "Best-of list with Python projects"

# Required: list of all projects
projects:
  - name: "best-of-ml-python"
    github_id: "ml-tooling/best-of-ml-python"
    labels: ["python"]
    category: "data-engineering"
```
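
As a rough illustration of the structure described above, the following Python sketch checks an already parsed `projects.yaml` (represented here as a plain dictionary) for the fields this documentation names: at least one project, a unique `name` per project, and a `category` ID plus `title` per category. The function `check_structure` and its messages are hypothetical and not part of the best-of generator.

```python
from typing import Any, Dict, List, Set


def check_structure(data: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed projects.yaml dictionary."""
    problems: List[str] = []

    # The file should contain at least one project.
    projects = data.get("projects") or []
    if not projects:
        problems.append("'projects' must list at least one project")

    # Every project needs a name that is unique on the list.
    seen_names: Set[str] = set()
    for index, project in enumerate(projects):
        name = project.get("name")
        if not name:
            problems.append(f"project #{index} has no 'name'")
        elif name in seen_names:
            problems.append(f"project name '{name}' is not unique")
        else:
            seen_names.add(name)

    # Every listed category needs an ID and a descriptive title.
    for index, category in enumerate(data.get("categories") or []):
        for key in ("category", "title"):
            if not category.get(key):
                problems.append(f"category #{index} has no '{key}'")

    return problems


example = {
    "categories": [
        {"category": "data-engineering", "title": "Machine Learning & Data Engineering"}
    ],
    "projects": [
        {"name": "best-of-ml-python", "github_id": "ml-tooling/best-of-ml-python"}
    ],
}
print(check_structure(example))  # prints []
```

A check like this could run before the generation step to catch typos early; how the generator itself validates the file is not covered here.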

### Projects

A project is the main component of a best-of list. In most cases, a project is hosted on GitHub and released on different package managers. Such a project should be added with the `github_id` and the IDs of all the package managers it is released to. However, it is also possible to add projects which are not hosted on GitHub or released on a package manager, as shown in the example below.

#### Project Examples

```yaml
projects:
  # Projects with different package managers:
  - name: "Tensorflow"
    github_id: "tensorflow/tensorflow"
    pypi_id: "tensorflow"
    conda_id: "conda-forge/tensorflow"
    dockerhub_id: "tensorflow/tensorflow"
  - name: "Best-of Generator"
    pypi_id: "best-of"
    github_id: "best-of-lists/best-of-generator"
  # Link to another project collection:
  - name: "Best-of Overview"
    homepage: "https://best-of.org"
    resource: True
  # Project that is not on GitHub:
  - name: "Quart"
    pypi_id: "quart"
    homepage: "https://gitlab.com/pgjones/quart"
    description: "Quart is a Python ASGI web microframework with the same API as Flask."
    license: "MIT"
    star_count: 772
    show: True
```

The example above will be rendered as shown below:

![Projects Example](./docs/images/best-of-generated-projects-framed.png)

Every project can also be expanded to show additional project information (by clicking on the project), for example:

![Project Body Example](./docs/images/best-of-project-body-framed.png)

#### Project Properties


| Property | Description |
| --- | --- |
| `name` | Name of the project. This name must be unique within the best-of list. |
| **Optional Properties:** | |
| `github_id` | GitHub ID of the project based on user or organization and the repository name, e.g. `best-of-lists/best-of-generator`. If the project is hosted on GitLab, please use the `gitlab_id` property. |
| `category` | Category that this project is most related to. You can find all available category IDs in the `projects.yaml` file. The project will be sorted into the `Others` category if no category is provided. |
| `labels` | List of labels that this project is related to. You can find all available label IDs in the `projects.yaml` file. |
| `license` | License of the project. If set, license information from GitHub or package managers will be overwritten. Can be a custom URL pointing to more information in case it is not a standard license. `allowed_licenses` must be set to "all" or contain the URL in order to show the project. |
| `description` | Short description of the project. If set, the description from GitHub or package managers will be overwritten. |
| `homepage` | Homepage URL of the project. Only use this property if the project homepage is different from the GitHub URL. |
| `docs_url` | Documentation URL of the project. Only use this property if the project documentation site is different from the GitHub URL. |
| `resource` | If `True`, the project will be marked as a resource. Resources are not ranked and will always be shown on top of the category. You can use this to link to another best-of list section or website that contains additional projects. |
| `group` | If `True`, the project will be used as the top project for grouping a set of related projects. `group_id` also needs to be set to the shared group ID. |
| `group_id` | Group ID that can be used to group this project with other projects. For every group, there needs to be one project with `group` set to `True`. |
| `show` | If `True`, the project will always be shown even when it would otherwise be hidden (e.g. dead project, risky license, too few stars...). Only use this property if you are sure that this project needs to be shown. |
| `ignore` | If `True`, the project will be ignored. This also means that it will not be included in the hidden projects section. However, the project metadata will still be collected. |
| **Supported Integrations:** | |
| `pypi_id` | Project ID on the Python Package Index (PyPI). |
| `conda_id` | Project ID on the conda package manager. If the main package is provided on a different channel, prefix the ID with the given channel, e.g. `conda-forge/tensorflow`. |
| `npm_id` | Project ID on the Node package manager (npm). |
| `dockerhub_id` | Project ID on the Docker Hub container registry. |
| `maven_id` | Artifact ID on Maven Central, e.g. `org.apache.flink:flink-core`. |
| `github_id` | GitHub ID of the project based on user or organization and the repository name, e.g. `best-of-lists/best-of-generator`. |
| `gitlab_id` | GitLab ID of the project based on user or organization and the repository name, e.g. `best-of-lists/best-of-generator`. |

While you can theoretically overwrite all project metadata, we suggest setting only the properties that the best-of generator is not able to find on GitHub or the configured package managers. There are also other undocumented properties, but for most projects those properties should not be overwritten.

Additional undocumented project metadata:

- created_at
- update_at
- github_url
- github_release_downloads
- github_dependent_project_count
- last_commit_pushed_at
- star_count
- commit_count
- dependent_project_count
- contributor_count
- fork_count
- monthly_downloads
- open_issue_count
- closed_issue_count
- release_count
- latest_stable_release_published_at
- latest_stable_release_number
- trending
- helm_id
- brew_id
- apt_id
- yum_id
- snap_id
- maven_id
- dnf_id
- yay_id
- _url
- _latest_release_published_at
- _dependent_project_count

### Categories

A category adds structure to the best-of list by grouping related projects together. Every project is grouped into exactly one category. If no category is provided with the project metadata, the project will be categorized into `Others`.

#### Category Example

```yaml
categories:
  - category: "data-engineering"
    title: "Machine Learning & Data Engineering"
    subtitle: "Best-of lists about machine learning, data engineering, data science, or other topics related to big data."

projects:
  - name: "best-of-ml-python"
    github_id: "ml-tooling/best-of-ml-python"
    category: "data-engineering"
```

The example above will be rendered as shown below:

![Category Example](./docs/images/best-of-category-example-framed.png)
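
The one-category-per-project rule with its `Others` fallback can be sketched in a few lines of Python. This is only an illustration over plain dictionaries: the function `group_by_category` is hypothetical, and using the string `"Others"` as the fallback key is an assumption about how the default category is identified.

```python
from collections import defaultdict
from typing import Dict, List

# Assumption: the fallback category is keyed by its displayed name.
DEFAULT_CATEGORY = "Others"


def group_by_category(projects: List[dict]) -> Dict[str, List[str]]:
    """Map each category ID to the names of the projects assigned to it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for project in projects:
        # A project without a category falls back to the default one.
        category = project.get("category") or DEFAULT_CATEGORY
        grouped[category].append(project["name"])
    return dict(grouped)


projects = [
    {"name": "best-of-ml-python", "category": "data-engineering"},
    {"name": "Quart"},
]
print(group_by_category(projects))
# prints {'data-engineering': ['best-of-ml-python'], 'Others': ['Quart']}
```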

#### Category Properties


| Property | Description |
| --- | --- |
| `category` | ID of the category. This ID should also be used for adding a project to this category. |
| `title` | Category name used as the header of the category section. |
| **Optional Properties:** | |
| `subtitle` | Short description about the category shown under the title. |
| `ignore` | If `True`, the category and all its projects will be ignored. |

### Labels

A label highlights similarities or special features shared between projects. Unlike categories, a project can have any number of labels. The labels are shown as badges attached to the project description. A label can have only an image (favicons are recommended), only a name, or both. We recommend using image labels (or only very short names) since labels shorten the visible description text of a project.

#### Label Example

```yaml
labels:
  - label: "python"
    image: "https://www.python.org/static/favicon.ico"
    description: "Best-of list with Python projects"
  - label: "libraries"
    name: "libraries"

projects:
  - name: "best-of-ml-python"
    github_id: "ml-tooling/best-of-ml-python"
    labels: ["libraries", "python"]
    category: "data-engineering"
```

The example above will be rendered as shown below:

![Label Example](./docs/images/best-of-label-example-framed.png)

#### Label Properties


| Property | Description |
| --- | --- |
| `label` | ID of the label. This ID should also be used for adding the label to a project. |
| **Optional Properties:** | |
| `image` | URL to an image. If a valid URL is provided, the image will be shown wherever the label is used. |
| `name` | Name of the label. If a name is provided, the name will be shown wherever the label is used. |
| `description` | Short description of the label. If the `show_labels_in_legend` configuration is `True` and an image is set, this description will also be shown in the legend (explanations). |
| `ignore` | If `True`, the label will not be shown anywhere. |
| `url` | If `url` is set, the label will be rendered as a link wherever it is used. |

### Configuration

Many aspects of the best-of list can be configured. Since most default values are selected to support the widest range of different lists, changing the default configuration is not required for most cases.

#### Configuration Example

```yaml
configuration:
  min_stars: 0
  min_projectrank: 0
  allowed_licenses: ["all"]
  markdown_header_file: "config/header.md"
  markdown_footer_file: "config/footer.md"
```

The configuration example above changes the default configuration to show all projects regardless of star count (via `min_stars`), projectrank (via `min_projectrank`), or license (via `allowed_licenses`). It also configures a header file (via `markdown_header_file`) and a footer file (via `markdown_footer_file`) whose markdown content will be attached to the generated content.

#### Configuration Options


| Config | Description | Default |
| --- | --- | --- |
| `output_file` | The markdown output file. | `./README.md` |
| `markdown_header_file` | Path to a markdown file that will be attached above the generated content. | |
| `markdown_footer_file` | Path to a markdown file that will be attached below the generated content. | |
| `output_generator` | Select the markdown generator to use for generating the output markdown page. Currently, only `markdown-list` is supported. | `markdown-list` |
| `project_inactive_months` | Number of months without activity until a project is marked as inactive. | `6` |
| `project_dead_months` | Number of months without activity until a project is marked as dead. | `12` |
| `project_new_months` | Number of months since creation during which a project is marked as a newcomer. | `6` |
| `min_projectrank` | Project will be hidden if it has a smaller projectrank (quality score). | `10` |
| `min_stars` | Project will be hidden if it has fewer stars on GitHub. | `100` |
| `require_license` | If `True`, all projects without a detected license will be hidden. | `True` |
| `require_repo` | If `True`, all projects without a source repository (configured via `github_id` or `gitlab_id`) will be hidden. | `False` |
| `min_description_length` | The minimum length of the project description. If the description is shorter, the project will not be shown. | `10` |
| `max_description_length` | The maximum length of the project description. | `55` |
| `ascii_description` | If `True`, all non-ASCII characters in the project description will be removed. Useful for filtering out distracting emoji, but harmful for non-English descriptions. (Note: GitHub emoji commands, e.g. `:smile:`, are always removed.) | `True` |
| `projects_history_folder` | The folder used for storing history files (csv files with project metadata). If `null`, no history files will be created. | `./history` |
| `generate_install_hints` | If `False`, the install hint code block for the package managers will not be shown. | `True` |
| `generate_toc` | If `True`, generate a table of contents with all categories. | `True` |
| `category_heading` | How category headings are generated. If `simple`, headings will be `## Category`, and IDs are set by GitHub. If `robust`, headings will be `<h2 id='category-id'>Category</h2>`. (The TOC relies on these IDs.) If all of your categories' names are ASCII, use `simple`. | `simple` |
| `generate_legend` | If `True`, generate a legend containing explanations for the used emojis. | `True` |
| `sort_by` | The project property used to sort the projects within a category. | `projectrank` |
| `max_trending_projects` | The number of trending projects to show for trending up as well as down. | `5` |
| `hide_empty_categories` | If `True`, empty categories will not be shown. | `False` |
| `hide_project_license` | If `True`, the project license badge will not be shown. | `False` |
| `hide_license_risk` | If `True`, the risk indicator for uncommon or risky licenses will not be shown. | `False` |
| `show_labels_in_legend` | If `True`, image labels will be listed in the legend (explanation) if they also have a description. | `True` |
| `allowed_licenses` | List of allowed licenses (SPDX format). A project with a different license will be hidden. Use `["all"]` to allow all licenses. | Selection of common open-source licenses |
| `extension_script` | Path to a Python script which is loaded before project collection or markdown generation to allow extensibility. | |
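
To make the interplay between some of these options and the `show` project property more concrete, here is a minimal Python sketch of a visibility check. It only covers `min_stars`, `min_projectrank`, and `require_license` with the defaults listed above. The function `is_hidden`, the order of the checks, and the treatment of missing values as zero are assumptions for illustration, not the generator's actual implementation.

```python
from typing import Optional

# Defaults as listed in the configuration options above.
DEFAULTS = {"min_stars": 100, "min_projectrank": 10, "require_license": True}


def is_hidden(project: dict, config: Optional[dict] = None) -> bool:
    """Decide whether a project would be hidden under the given configuration."""
    settings = {**DEFAULTS, **(config or {})}

    # `show: True` forces a project to be shown regardless of other rules.
    if project.get("show"):
        return False
    # Assumption: missing star counts and scores are treated as zero.
    if project.get("star_count", 0) < settings["min_stars"]:
        return True
    if project.get("projectrank", 0) < settings["min_projectrank"]:
        return True
    if settings["require_license"] and not project.get("license"):
        return True
    return False


quart = {"name": "Quart", "license": "MIT", "star_count": 772, "projectrank": 5}
print(is_hidden(quart))                             # prints True (score below 10)
print(is_hidden({**quart, "show": True}))           # prints False
print(is_hidden(quart, {"min_projectrank": 0}))     # prints False
```

The `projectrank` value of 5 is made up for the example; it is not Quart's real score.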

### Project Quality Score

All projects in a best-of list are ranked and sorted by a project-quality score (also called `projectrank`). The score is calculated based on various metrics automatically collected from GitHub and different package managers. The score is just a sum of points which a project collects for various aspects and metrics. The score only has a meaning when it is compared to the project-quality score of other projects. We currently use the following aspects to calculate the score:

> This calculation is just chosen by experience. There is no scientific proof that this really reflects the quality of a project.

- Has homepage link & description: `+ 1`
- Has an existing GitHub repository: `+ 1`
- Has a license: `+ 1`
- Has a commonly used license (e.g. MIT): `+ 1`
- Has multiple releases: `+ 1`
- Has stable releases based on semantic version: `+ 1`
- Has a release that is less than 6 months old: `+ 1`
- Repo was updated in the last 3 months: `+ 1`
- Is older than 6 months: `+ 1`
- Metrics from GitHub & package managers:
  - Number of stars: `+ log(COUNT / 2)`
  - Number of contributors: `+ log(COUNT / 2) - 1`
  - Number of commits: `+ log(COUNT / 2) - 1`
  - Number of forks: `+ log(COUNT / 2)`
  - Number of monthly downloads: `+ log(COUNT / 2) - 1`
  - Number of dependent projects: `+ log(COUNT / 1.5)`
  - Number of watchers: `+ log(COUNT / 2) - 1`
  - Number of closed issues: `+ log(COUNT / 2) - 1`
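
The list above can be read as a simple sum, which the following Python sketch illustrates. Several details are assumptions because the documentation does not state them: `log` is taken to be the natural logarithm, missing or zero counts contribute nothing, and negative contributions are clamped to zero. The key `watchers_count` and the function names are hypothetical; the other keys follow the metadata names listed earlier.

```python
import math
from typing import Dict, Optional, Tuple

# (divisor, offset) per metric, taken from the list above.
METRIC_RULES: Dict[str, Tuple[float, float]] = {
    "star_count": (2, 0),
    "contributor_count": (2, -1),
    "commit_count": (2, -1),
    "fork_count": (2, 0),
    "monthly_downloads": (2, -1),
    "dependent_project_count": (1.5, 0),
    "watchers_count": (2, -1),  # hypothetical key name
    "closed_issue_count": (2, -1),
}


def metric_points(count: Optional[float], divisor: float, offset: float) -> float:
    """Points for one metric: log(COUNT / divisor) + offset, never below zero."""
    if not count or count <= 0:
        return 0.0  # assumption: missing or zero counts add nothing
    # Assumption: natural logarithm, negative results clamped to zero.
    return max(0.0, math.log(count / divisor) + offset)


def quality_score(one_point_aspects: int, metrics: Dict[str, float]) -> float:
    """Sum the fulfilled `+ 1` aspects and the points of all known metrics."""
    score = float(one_point_aspects)
    for key, (divisor, offset) in METRIC_RULES.items():
        score += metric_points(metrics.get(key), divisor, offset)
    return score


# A project fulfilling all nine `+ 1` aspects, with 200 stars and 20 forks.
print(round(quality_score(9, {"star_count": 200, "fork_count": 20}), 2))  # prints 15.91
```

Because the logarithm flattens large counts, going from 200 to 400 stars adds less than one point under these assumptions, which matches the note that the score is only meaningful in comparison with other projects.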

### Trending Projects

The best-of list is able to automatically identify trending projects by comparing the [project-quality scores](#project-quality-score) of the current generation with those in the latest history file. If the history is activated (`projects_history_folder` is not set to `null`), the best-of generation will automatically create a `_changes.md` file in the configured history folder for every update and a `latest-changes.md` file in the folder of the generated markdown page. These files contain a list of projects that are trending up (higher quality score since last update) and down (lower quality score since last update) as well as a list of all added projects since the last update, as shown in the following example:

![Trending project example](./docs/images/best-of-trending-projects-framed.png)

The [GitHub Action workflow](#generation-via-github-action) uses these markdown files to automatically create releases for every update. This preserves a useful changelog over many updates and lets readers receive email updates whenever the list is updated (by watching for release events).
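
The comparison described in this section can be sketched as follows, assuming the previous and current scores are available as plain name-to-score dictionaries. The function `find_trending` is hypothetical, and ordering the results by the size of the score change is an assumption; only the cap of 5 entries mirrors the documented `max_trending_projects` default.

```python
from typing import Dict, List, Tuple


def find_trending(
    previous: Dict[str, float],
    current: Dict[str, float],
    max_trending: int = 5,
) -> Tuple[List[str], List[str], List[str]]:
    """Return (trending up, trending down, newly added) project names."""
    # Score change for every project present in both snapshots.
    changes = {
        name: score - previous[name]
        for name, score in current.items()
        if name in previous
    }
    # Assumption: the largest changes are listed first.
    up = sorted((n for n, d in changes.items() if d > 0),
                key=lambda n: changes[n], reverse=True)
    down = sorted((n for n, d in changes.items() if d < 0),
                  key=lambda n: changes[n])
    added = sorted(n for n in current if n not in previous)
    return up[:max_trending], down[:max_trending], added


previous = {"alpha": 10, "beta": 20, "gamma": 5}
current = {"alpha": 12, "beta": 18, "gamma": 5, "delta": 7}
print(find_trending(previous, current))
# prints (['alpha'], ['beta'], ['delta'])
```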

### Generation via CLI

> To use the CLI, you need to have the best-of generator installed via pip:
> `pip install best-of`

```bash
best-of generate [OPTIONS] PATH
```

Generates a best-of markdown page from a `yaml` file.

**Arguments**:

* `PATH`: Path to the `yaml` file containing the best-of metadata (e.g. `./projects.yaml`).

**Options**:

* `-g`, `--github-key` `TEXT`: GitHub API Token (from https://github.com/settings/tokens).
* `-l`, `--libraries-key` `TEXT`: Libraries.io API Key (from https://libraries.io/api).
* `--help`: Show this message and exit.
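
If the CLI is called from a script, the documented options can be assembled programmatically. The Python sketch below only builds the argument list from the flags listed above and prints it. The function `build_command` and the environment variable names `GITHUB_API_KEY` and `LIBRARIES_API_KEY` are hypothetical choices for this example.

```python
import os
import shlex
from typing import List, Optional


def build_command(
    yaml_path: str,
    github_key: Optional[str] = None,
    libraries_key: Optional[str] = None,
) -> List[str]:
    """Assemble a `best-of generate` call from the documented options."""
    command = ["best-of", "generate"]
    if github_key:
        command += ["--github-key", github_key]
    if libraries_key:
        command += ["--libraries-key", libraries_key]
    # PATH is the positional argument and goes last.
    command.append(yaml_path)
    return command


# Read the keys from the environment instead of hard-coding them.
command = build_command(
    "./projects.yaml",
    github_key=os.environ.get("GITHUB_API_KEY"),
    libraries_key=os.environ.get("LIBRARIES_API_KEY"),
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the best-of generator is installed.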

### Generation via GitHub Action

> πŸ§™β€β™‚οΈ If you want to create your own best-of list, we strongly recommend to follow [this guide](https://github.com/best-of-lists/best-of/blob/main/create-best-of-list.md). With the guide, it will only take about 3 minutes to get you started. It already includes this GitHub Action and some other useful template files. Further manual steps for setting up the GitHub Action are not required.

The [best-of-update-action](https://github.com/marketplace/actions/best-of-update-action) makes it very easy to set up automated scheduled updates for your best-of markdown page. Please refer to the [best-of-update-action documentation](https://github.com/marketplace/actions/best-of-update-action) for more detailed information about the GitHub Action and the workflow.

### Generation via Python API

> _Usage of the Python API is not well documented yet and currently not recommended._

The best-of generator can also be used and integrated via its Python API. The full Python API documentation can be found [here](https://github.com/best-of-lists/best-of-generator/blob/main/docs/README.md).

### Updating Best-of Generator

## Known Issues

The generated README file is not displayed completely:

GitHub only renders the first 512 kb of the main `README.md` file and will cut off the rendered version as soon as it has processed the first 512 kb of the raw markdown content. The rendering is only cut off when viewing the readme on the main repo page. If you directly select the `README.md` file, it will render in its entirety. To mitigate this issue, we optimized the markdown generation to require the minimum number of characters. However, if you have a very large list of projects (more than 800), you might reach the 512 kb limit (check the file size of the generated `README.md` file). In this case, we suggest extracting some of the categories or projects into smaller best-of lists.
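
Checking the file size can be automated with a few lines of Python. The sketch below assumes that "512 kb" means 512 * 1024 bytes; the function `readme_size_report` is hypothetical, and the temporary file only stands in for a generated `README.md`.

```python
import os
import tempfile
from typing import Tuple

# Assumption: the limit mentioned above corresponds to 512 * 1024 bytes.
RENDER_LIMIT_BYTES = 512 * 1024


def readme_size_report(path: str) -> Tuple[int, bool]:
    """Return the file size in bytes and whether it exceeds the limit."""
    size = os.path.getsize(path)
    return size, size > RENDER_LIMIT_BYTES


# Demonstration with a small stand-in file instead of a real README.md.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "README.md")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("# Example best-of list\n")
    size, too_large = readme_size_report(sample)
    print(f"{size} bytes, exceeds limit: {too_large}")
```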

## Contribution

- Pull requests are encouraged and always welcome. Read our [contribution guidelines](https://github.com/best-of-lists/best-of-generator/tree/main/CONTRIBUTING.md) and check out [help-wanted](https://github.com/best-of-lists/best-of-generator/issues?utf8=%E2%9C%93&q=is%3Aopen+is%3Aissue+label%3A"help+wanted"+sort%3Areactions-%2B1-desc+) issues.
- Submit GitHub issues for any [feature request and enhancement](https://github.com/best-of-lists/best-of-generator/issues/new?assignees=&labels=feature&template=02_feature-request.md&title=), [bugs](https://github.com/best-of-lists/best-of-generator/issues/new?assignees=&labels=bug&template=01_bug-report.md&title=), or [documentation](https://github.com/best-of-lists/best-of-generator/issues/new?assignees=&labels=documentation&template=03_documentation.md&title=) problems.
- By participating in this project, you agree to abide by its [Code of Conduct](https://github.com/best-of-lists/best-of-generator/blob/main/.github/CODE_OF_CONDUCT.md).
- The [development section](#development) below contains information on how to build and test the project after you have implemented some changes.

## Development

> _**Requirements**: [Docker](https://docs.docker.com/get-docker/) and [Act](https://github.com/nektos/act#installation) are required to be installed on your machine to execute the containerized build process._

To simplify the process of building this project from scratch, we provide build-scripts - based on [universal-build](https://github.com/ml-tooling/universal-build) - that run all necessary steps (build, check, test, and release) within a containerized environment. To build and test your changes, execute the following command in the project root folder:

```bash
act -b -j build
```

Refer to our [contribution guides](https://github.com/best-of-lists/best-of-generator/blob/main/CONTRIBUTING.md#development-instructions) for more detailed information on our build scripts and development process.

---

Licensed **MIT**. Created and maintained with ❤️ by developers from Berlin.