{"id":16163470,"url":"https://github.com/marianfoo/bestofcapjs-data","last_synced_at":"2025-10-18T15:22:48.024Z","repository":{"id":207534242,"uuid":"719481184","full_name":"marianfoo/bestofcapjs-data","owner":"marianfoo","description":null,"archived":false,"fork":false,"pushed_at":"2024-10-30T03:53:17.000Z","size":743,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-30T06:26:34.134Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/marianfoo.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-16T09:03:40.000Z","updated_at":"2024-10-19T18:14:19.000Z","dependencies_parsed_at":"2023-11-16T10:27:47.979Z","dependency_job_id":"996e8e4f-84f0-404b-98ed-67551f1e01d6","html_url":"https://github.com/marianfoo/bestofcapjs-data","commit_stats":null,"previous_names":["marianfoo/bestofcapjs-data"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marianfoo%2Fbestofcapjs-data","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marianfoo%2Fbestofcapjs-data/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marianfoo%2Fbestofcapjs-data/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marianfoo%2Fbestofcapjs-data/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mari
anfoo","download_url":"https://codeload.github.com/marianfoo/bestofcapjs-data/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243954893,"owners_count":20374365,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-10T02:35:51.173Z","updated_at":"2025-10-18T15:22:42.977Z","avatar_url":"https://github.com/marianfoo.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Best of UI5 Data\n\n\"Best of UI5\" is the new entry page for the ui5-community.  \nThis repository crawls and supplies the data for the website.\n\n## Add your package\n\nCreate an [issue with this template in the `bestofui5-data repo`](https://github.com/ui5-community/bestofui5-data/issues/new?assignees=marianfoo\u0026labels=new%20package\u0026template=new_package.md\u0026title=Add%20new%20Package:) for your package and check that you meet the prerequisites.  \n\n## Description\n\nThe crawler is written in TypeScript and fetches the latest data every day with a GitHub action workflow.  \nIt looks at every package defined in the [`sources.json`](https://github.com/ui5-community/bestofui5-data/blob/main/sources.json) file.\nCurrently it collects data from GitHub and NPM.  \nIf you're looking for the latest data files, they are in the `live-data` branch, in the [`data`](https://github.com/ui5-community/bestofui5-data/tree/live-data/data) folder.  
\n\n# Technical\n\n## General\n\nThe source code is written in TypeScript and lives in the [`src`](https://github.com/ui5-community/bestofui5-data/tree/main/src) folder.  \nThe workflow runs every day as a GitHub action and triggers the `build` command in the `package.json` file.  \nData is collected via the GitHub and NPM APIs. For GitHub, only authenticated access makes sense because the unauthenticated API rate limit is 60 requests per hour.  \nIt collects metadata from GitHub and NPM, the README, and historical download counts.  \n\n## index.ts\n\n`index.ts` is the entry point for all following processes.  \nIt starts by reading the `sources.json` file to determine which packages have to be read.\nSince the file is based on the GitHub repositories, the process starts by reading them in `gh-repos.ts`.  \nThe next step is to enrich the data from GitHub with NPM data.\nFrom the returned data, the unique types and tags are selected, as well as the individual versions.  \nThe package data and types/tags are written to `data.json` and the versions to `versions.json`.\n\n## gh-repos.ts\n\nThe GitHub process starts with the `get` method.\nBefore the data is retrieved, a distinction must be made whether the repo is a mono or single repo.\nFor example, the repo [ui5-ecosystem-showcase](https://github.com/ui5-community/ui5-ecosystem-showcase) is a monorepo with many middlewares and tasks.\nWith `getRepoInfo` the metadata is retrieved from the GitHub repo.\nThis is done using the [GitHub Repositories API](https://docs.github.com/en/rest/repos/repos).\nAdditionally, `updatedAt` is determined by when the last commit was made on the default branch (currently only for `generators`).\nWith `fetchRepo` data is retrieved directly from the repository.\nHere the `package.json` and the `README.md` are fetched for later display on the web page.\n\nThe JSDoc is also retrieved with `getJsdoc` if it exists. 
Currently this is done for the types \"task\" and \"middleware\".\nFor correct processing the `ui5.yaml` is also parsed here.\n\nBecause packages of type `generator` are not on NPM, an attempt is made to generate a key figure from the cloning statistics.  \nThis happens in the method `updateCloningStats`. This API needs special permissions, so a special GitHub token must be used (`WORKFLOW_CRAWL_GITHUB_TOKEN`) which has more permissions than the default token.\n\n## npm.ts\n\nThe class `NpmProvider` is there to enrich the GitHub data.\nTherefore the packages array is passed here.\nIt retrieves the metadata from NPM, as well as the historical download counts.\nTo optimize the downloads they are combined for bulk retrieval and retrieved with `getDownloadsBulk`.\nThe following download numbers are currently retrieved:\n\n- current fortnight\n- last fortnight\n- last 30 days\n- last year\n- last year per month\n\nMetadata is also retrieved. Currently the following data is used from NPM:\n\n- created At\n- updated At\n- all versions\n\n## Workflow\n\nTo retrieve the clone statistics from GitHub, a special GitHub token is used in the workflow (`WORKFLOW_CRAWL_GITHUB_TOKEN`).  \nThis token has more permissions than the default token.  \n\nThe workflow uses `ubuntu-latest` with Node 16.  \n\nBasically, the latest data is always published on the `live-data` branch. This data is partly rebuilt from scratch (`data.json`, `versions.json`) and partly enriched (`clones.json`).  \nFirst the `main` branch and then the `live-data` branch are cloned to perform a rebase in the `main` branch.  \nThis ensures that the data is reused.  
\n\nAfter that the module is installed and the TypeScript script is executed with `npm run build`.\nThis updates the data files.\nFinally the committed update is pushed to the `live-data` branch.\n\n## Files\n\n### sources.json\n\nThis file has a [type definition](https://github.com/ui5-community/bestofui5-data/blob/5860fd33a980bcdc8c23fb3b7bf25d7c36607ebe/src/types.d.ts#L64-L73) that describes how its content should look.  \n\nA source entry uses the following fields:\n\n- owner --\u003e username or organization name\n- repo --\u003e Repository name\n- subpath --\u003e for monorepos, path where the subpackages are located\n- subpackages --\u003e for monorepos, list of subpackages\n- addedToBoUI5 --\u003e timestamp when this package was added to BestofUI5\n- type --\u003e type of the package, see enum [`BoUI5Types`](https://github.com/ui5-community/bestofui5-data/blob/5860fd33a980bcdc8c23fb3b7bf25d7c36607ebe/src/types.d.ts#L1-L8) for this\n- tags --\u003e list of tags\n\n### data.json\n\nThere are two arrays in this file.  \nThe array Packages contains all packages with their information.  \nThe second array contains all types/tags. It is used in the [Tags View](https://bestofui5.org/#/tags).\n\n### versions.json\n\nThis file is generated from all NPM packages and their versions.  \nIt is used in the [Timeline View](https://bestofui5.org/#/timeline).\n\n### clones.json\n\nThis file is used only for the generators. Since these do not have an NPM package, an attempt is made to collect a measure via the number of clones of the repository.  
\nSince the API only displays the last 15 days, this file stores the historical data.\n\n## Run locally\n\ngit clone:  \n`git clone https://github.com/ui5-community/bestofui5-data`\n\ninstall:  \n`npm install`\n\nset GitHub token (check which one is for your OS):  \n`export GITHUB_TOKEN=\u003cyour token\u003e`  \n`set GITHUB_TOKEN=\u003cyour token\u003e`  \n`$env:GITHUB_TOKEN=\"\u003cyour token\u003e\"`  \n\nrun crawl:  \n`npm run build`\n\nWhen you run the build command without a GitHub token, the crawl will probably soon run into a rate limit.  \n\nThe crawl will probably also fail when retrieving the clone statistics.\nHowever, this section is in a try/catch and will only show the error.\nThe rest should go through normally.\n\n## License\n\nThis project is licensed under the Apache Software License, version 2.0 except as noted otherwise in the [LICENSE](LICENSE) file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarianfoo%2Fbestofcapjs-data","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarianfoo%2Fbestofcapjs-data","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarianfoo%2Fbestofcapjs-data/lists"}