https://github.com/matthiaszepper/nfcore_stats_backend
An application to gather statistics for the nf-core community, built with FastAPI, Pydantic, Celery and SQLAlchemy.
- Host: GitHub
- URL: https://github.com/matthiaszepper/nfcore_stats_backend
- Owner: MatthiasZepper
- Created: 2022-07-18T19:06:58.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-09-09T19:23:05.000Z (almost 3 years ago)
- Last Synced: 2025-01-20T10:23:04.726Z (5 months ago)
- Topics: celery, docker, docker-compose, fastapi, microservices, nf-core, postgresql, pydantic, redis, sqlmodel
- Language: Python
- Size: 141 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
# nf-core stats backend
The new _nf-core stats backend_ provides simple-to-use REST and GraphQL web services to query and retrieve statistics and metadata about the [nf-core community](https://nf-co.re) and its pipelines. The data is aggregated daily from GitHub, Twitter, Slack and YouTube. It is designed with an emphasis on simplicity and performance.
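To give a flavour of the REST side, here is a minimal, self-contained sketch of what such an endpoint could look like. The path, model fields and placeholder data are illustrative assumptions, not the repository's actual schema:

```python
# Illustrative sketch only: endpoint path and model fields are assumptions,
# not the actual schema of the nf-core stats backend.
from fastapi import FastAPI
from pydantic import BaseModel


class PipelineStats(BaseModel):
    name: str
    stargazers: int
    releases: int


app = FastAPI(title="nf-core stats backend (sketch)")


@app.get("/stats/pipelines/{name}", response_model=PipelineStats)
async def pipeline_stats(name: str) -> PipelineStats:
    # A real implementation would query PostgreSQL here;
    # this demo returns a fixed placeholder record instead.
    return PipelineStats(name=name, stargazers=0, releases=0)
```

Served with `uvicorn main:app`, a request such as `GET /stats/pipelines/rnaseq` would then return a JSON document validated against the Pydantic model.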
## Development
### Roadmap
- [x] Scaffold initial project and repo structure: Poetry setup and Docker compose file with containers for the services.
- [x] Decide on the tech stack to use: FastAPI, SQLModel, Pydantic, Celery, PostgreSQL and Redis.
- [x] Write a first simple, scheduled task as a demo: an uptime checker for nf-co.re (see the sketch after this list).
- [x] Write a first API to retrieve the uptime status of nf-co.re as a demo.
- [x] _Derive data models and suitable database table structure (Work in progress: 1/4 done)_
- [x] _Write CRUD logic for the various data types and sources (Work in progress: 1/4 done)_
- [x] Include `api.routers` and split endpoints into subfiles.
- [ ] Write scheduled tasks to interact with the GitHub API, Twitter API and Slack API to gather stats and other information.
- [ ] Ingest output of the schedulers into the database.
- [ ] Write REST APIs to retrieve the data.
- [ ] Write GraphQL APIs to retrieve the data.
- [ ] Add authentication to the endpoints.
- [ ] Write documentation.
- [ ] Include convenience functions, e.g. the ability to add new domains or accounts to monitor via API calls.
- [ ] Write tests.
- [ ] Integrate and configure Alembic for database migrations?
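For the uptime-checker demo mentioned in the roadmap, a scheduled Celery task could look roughly like the following sketch. The broker URL, schedule and task name are assumptions, not the repository's actual configuration:

```python
# Hypothetical sketch of the kind of scheduled task described above.
# Broker URL, task name and interval are assumptions, not the repo's config.
import httpx
from celery import Celery

app = Celery("nfcore_stats", broker="redis://redis:6379/0")

# Run the uptime check every five minutes via Celery beat.
app.conf.beat_schedule = {
    "check-nf-core-uptime": {
        "task": "tasks.check_uptime",
        "schedule": 300.0,
        "args": ("https://nf-co.re",),
    }
}


@app.task(name="tasks.check_uptime")
def check_uptime(url: str) -> bool:
    """Return True if the site answers with a non-error status code."""
    response = httpx.get(url, follow_redirects=True, timeout=10.0)
    return response.status_code < 400
```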
### Debugging
To enable debugging code, the container _nfcore_stats_api_ has `stdin_open` and `tty` set to `true`, so that a terminal can be attached to it. This is most useful in conjunction with `set_trace()`. Put
```python
import pdb; pdb.set_trace()
```
anywhere within the body of a function. If that function is executed, you will be able to step through every command and also interactively explore the variables. To do so, you need to first attach a new terminal to the API container
```bash
docker container attach nfcore_stats_api
```
and then send requests to the API to trigger the function execution.
### Importing existing data into the database
The new backend has dedicated APIs meant to import the existing JSON files scraped by the current website. To import those into the database, navigate into the folder containing the existing JSON files and send them as request bodies to the respective endpoints:
```bash
cd /path/to/your/json/files
curl --data-binary "@pipelines.json" -H "Content-Type: application/json" -X PUT http://localhost:8000/import/pipelines
```
Mind the `@` symbol preceding the file name. You can also specify `--data-binary "@/path/to/your/json/files/pipelines.json"` if you are dispatching the request from outside the folder.
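For orientation, here is a minimal sketch of what such an import endpoint might look like with FastAPI and SQLModel. The table columns, connection string and flat-list payload shape are illustrative assumptions, not the repository's actual models:

```python
# Illustrative sketch: table columns, connection string and payload shape
# are assumptions, not the actual models of the nf-core stats backend.
from typing import List, Optional

from fastapi import FastAPI
from sqlmodel import Field, Session, SQLModel, create_engine


class Pipeline(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    name: str
    stargazers_count: int = 0


engine = create_engine("postgresql://user:pass@db/nfcore_stats")
SQLModel.metadata.create_all(engine)  # create the table if it does not exist

app = FastAPI()


@app.put("/import/pipelines")
def import_pipelines(payload: List[Pipeline]) -> dict:
    # Insert every record from the JSON request body into the database.
    with Session(engine) as session:
        for pipeline in payload:
            session.add(pipeline)
        session.commit()
    return {"imported": len(payload)}
```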
## Production deployment