{"id":34585188,"url":"https://github.com/gitname/sediment-api","last_synced_at":"2026-04-07T16:31:03.331Z","repository":{"id":228576337,"uuid":"617153228","full_name":"gitname/sediment-api","owner":"gitname","description":"CSV file parser and web server for WHONDRS sediment data (exercise)","archived":false,"fork":false,"pushed_at":"2023-04-11T17:23:41.000Z","size":99,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-14T08:39:03.022Z","etag":null,"topics":["docker","fastapi","mongodb","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gitname.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-03-21T19:59:40.000Z","updated_at":"2023-07-12T00:01:45.000Z","dependencies_parsed_at":"2024-03-19T14:07:14.319Z","dependency_job_id":null,"html_url":"https://github.com/gitname/sediment-api","commit_stats":null,"previous_names":["gitname/sediment-api"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/gitname/sediment-api","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gitname%2Fsediment-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gitname%2Fsediment-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gitname%2Fsediment-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gitname%2Fsediment-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gitname","download_url":"https://codeload.github.com/gitname/sediment-api/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gitname%2Fsediment-api/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31520384,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T16:28:08.000Z","status":"ssl_error","status_checked_at":"2026-04-07T16:28:06.951Z","response_time":105,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","fastapi","mongodb","python"],"created_at":"2025-12-24T10:29:48.885Z","updated_at":"2026-04-07T16:31:03.309Z","avatar_url":"https://github.com/gitname.png","language":"Python","readme":"# Sediment API\n\n\u003c!-- Reference: https://docs.github.com/en/actions/monitoring-and-troubleshooting-workflows/adding-a-workflow-status-badge --\u003e\n![test](https://github.com/gitname/sediment-api/actions/workflows/test.yml/badge.svg)\n\n\u003e Welcome to the Sediment API repository! I wrote the code in this repository as part of an exercise.\n\u003e The exercise prompt was, in summary, to (a) write a Python script someone could use to extract data records\n\u003e from a CSV file and store them in a MongoDB database;\n\u003e and (b) create an HTTP API that someone could use to retrieve a data record in JSON format.\n\n## Table of contents\n\n\u003c!-- TOC --\u003e\n* [Sediment API](#sediment-api)\n  * [Table of contents](#table-of-contents)\n  * [Overview](#overview)\n  * [Usage](#usage)\n    * [Exercise-specific](#exercise-specific)\n    * [General](#general)\n  * [Development](#development)\n    * [Environment](#environment)\n    * [Testing](#testing)\n    * [Static type checking](#static-type-checking)\n    * [Code formatting](#code-formatting)\n    * [Dependencies](#dependencies)\n  * [Roadmap](#roadmap)\n\u003c!-- TOC --\u003e\n\n## Overview\n\nThis repository contains Python scripts people can use to extract data from CSV files,\nstore that data in a MongoDB database,\nand provide access to that data via an HTTP API.\n\nThe scripts are:\n\n1. `parser/parser.py`: A file parser people can use to extract data from a CSV file and insert it into a database\n2. `server/server.py`: A web server people can use to provide access to that data via an HTTP API\n\nHere's a diagram showing how data flows into, between, and out of those scripts.\n\n```mermaid\n%% This is a flowchart written using Mermaid syntax.\n%% GitHub will render it as an image.\n%%\n%% References: \n%% - https://mermaid.js.org/syntax/flowchart.html\n%% - https://github.blog/2022-02-14-include-diagrams-markdown-files-mermaid/\n\nflowchart LR\n    parser[[parser.py]]\n    db[(Database)]\n    file[CSV File]\n    client[HTTP Client]\n    server[[server.py]]\n\n    parser --\u003e db\n    db --\u003e server\n\n    subgraph File Parser\n        parser\n    end\n\n    file -. CSV .-\u003e parser\n\n    subgraph Web Server\n        server\n    end\n\n    server -. JSON .-\u003e client\n```\n\n## Usage\n\n### Exercise-specific\n\nHere's how you can produce the behavior described in the exercise prompt.\n\n1. Install [Docker](https://www.docker.com/) onto your computer.\n2. Clone (or download and extract) this repository onto your computer.\n3. Open a console in the root folder of the repository.\n4. Copy the example config file and name the copy \"`.env`\".\n   ```shell\n   cp .env.example .env\n   ```\n5. Start the web server and MongoDB (in Docker containers).\n   ```shell\n   docker-compose up\n   ```\n   \u003e **Note:** That command will run the containers in the foreground, taking over your console. You can open a new\n   \u003e console to issue the remaining commands.\n6. Run the parser (in the `app` container).\n   ```shell\n   docker exec -it app python parser/parser.py parser/example_data/WHONDRS_S19S_Sediment_GrainSize.csv\n   ```\n7. In a web browser, visit http://localhost:8000/samples/S19S_0001_BULK-D\n   - The web browser will show a sample in JSON format.\n\n### General\n\nHere's how you can use the system in general.\n\n1. Do steps 1-5 shown in the \"Exercise-specific\" section above.\n2. (Optional) Put a CSV file you want to parse, anywhere within the repository's file tree.\n   \u003e **Note:** All files within the repository's file tree are accessible within the `app` container\n   \u003e (within the `app` container, the root folder of the repository is located at `/code`).\n3. Run the parser, specifying the path to the CSV file you want to parse.\n   ```shell\n   # Specify the path as it would be specified within the `app` container.\n   docker exec -it app python parser/parser.py \u003cpath_to_csv_file\u003e\n   ```\n   \u003e **Note:** You can specify the path as either an absolute path, using `/code` to refer to the root folder of the\n   \u003e repository (e.g. `/code/path/to/file.csv`); or a relative path, relative to the root folder of the repository\n   \u003e (e.g. `./path/to/file.csv`).\n4. Submit an HTTP GET request to a URL having the format: `http://localhost:8000/samples/\u003csample_id\u003e`\n5. (Optional) Visit the **interactive API documentation** at http://localhost:8000/docs\n\nYou can also **run tests**, **perform static type checking**, and **format the code**.\nInstructions for doing those things are in the \"Development\" section below.\n\n## Development\n\n\u003e **Note:** You can issue all the commands shown in this section from the root folder of the repository.\n\n### Environment\n\nThis repository contains a Docker-based development environment.\n\nYou can configure the development environment (and the Python scripts) by copying the `.env.example` file\nand naming it `.env`.\n\n```shell\ncp .env.example .env\n```\n\n\u003e **Note:** The default values in `.env.example` are adequate for running the Python scripts in\n\u003e the development environment.\n\nYou can then instantiate the development environment by issuing the following command:\n\n```shell\ndocker-compose up\n\n# Or, if you've made changes to the Dockerfile or to `requirements.txt`:\ndocker-compose up --build\n```\n\n\u003e **Note:** That will cause Docker to instantiate a container for each service described in `docker-compose.yml`.\n\u003e - The `mongo` container will automatically start running MongoDB.\n\u003e - The `app` container, which has all the Python scripts' dependencies installed,\n\u003e   will automatically start running the web server.\n\nWith the development environment up and running, you can access a `bash` shell running on the `app` container\nby issuing the following command:\n\n```shell\ndocker exec -it app bash\n```\n\n### Testing\n\nThe tests in this repository were written using [pytest](https://docs.pytest.org/), a Python test framework and test\nrunner.\n\nWith the development environment up and running, you can **run all the tests** in the repository by issuing the\nfollowing command:\n\n```shell\n# From the `app` container:\npytest\n\n# Or, from the Docker host:\ndocker exec -it app pytest\n```\n\n\u003e **Note:** You can invoke `pytest` with the `-v` option to see a list of the tests that were run.\n\nIn addition, you can use the tool, [coverage](https://coverage.readthedocs.io/), to measure code coverage while running\nthe tests—and to subsequently display a code coverage report—by issuing the following command:\n\n```shell\n# From the `app` container:\ncoverage run -m pytest \u0026\u0026 coverage report\n\n# Or, from the Docker host:\ndocker exec -it app bash -c \"coverage run -m pytest \u0026\u0026 coverage report\"\n```\n\n\u003e **Note:** You can invoke `coverage report` with the `-m` option (as in, `coverage report -m`) to see which lines of\n\u003e code were \"missed\" (i.e. not executed).\n\n### Static type checking\n\nYou can use [mypy](https://mypy.readthedocs.io/en/latest/) to perform static type checking on the Python code in this\nrepository.\n\nWith the development environment up and running, you can **perform static type checking** by issuing the following\ncommand:\n\n```shell\n# From the `app` container:\nmypy\n\n# Or, from the Docker host:\ndocker exec -it app mypy\n```\n\n\u003e **Note:** When you run `mypy` as shown above, it will run according to the configuration specified in `mypy.ini`.\n\n### Code formatting\n\nThe Python code in this repository is formatted using [Black](https://black.readthedocs.io/en/stable/), which is\nan \"[opinionated](https://black.readthedocs.io/en/stable/the_black_code_style/index.html)\"—but still PEP 8-compliant—code\nformatter.\n\nWith the development environment up and running, you can **format all the Python code** in the repository by issuing the\nfollowing command:\n\n```shell\n# From the `app` container:\nblack .\n\n# Or, from the Docker host:\ndocker exec -it app black .\n```\n\n### Dependencies\n\nI wrote the Python scripts in this repository using Python 3.10.\n\nThe `requirements.txt` file contains a list of all the dependencies of the Python scripts in this repository.\nI generated the file by issuing the following command:\n\n```shell\n# From the `app` container:\npip freeze \u003e requirements.txt\n\n# Or, from the Docker host:\ndocker exec -it app pip freeze \u003e requirements.txt\n```\n\nThe table below contains the names of all the packages I explicitly installed via `pip install \u003cname\u003e`:\n\n| Name                | Description                    | I use it to...                  | References                                                     |\n|---------------------|--------------------------------|---------------------------------|----------------------------------------------------------------|\n| `black`             | Code formatter                 | Format Python code              | [Documentation](https://black.readthedocs.io/en/stable)        |\n| `coverage`          | Code coverage measurement tool | Measure test coverage           | [Documentation](https://coverage.readthedocs.io/)                 |\n| `fastapi`           | HTTP API framework             | Process HTTP requests           | [Documentation](https://fastapi.tiangolo.com/)                 |\n| `httpx`             | HTTP client                    | Submit HTTP requests (in tests) | [Documentation](https://www.python-httpx.org/)                 |\n| `mypy`              | Static type checker            | Verify data type consistency    | [Documentation](https://mypy.readthedocs.io/en/latest/)        |\n| `pymongo`           | Synchronous MongoDB driver     | Interact with the database      | [Documentation](https://www.mongodb.com/docs/drivers/pymongo/) |\n| `pytest`            | Test framework                 | Run the tests                   | [Documentation](https://docs.pytest.org/en/7.2.x/)             |\n| `python-dotenv`     | Configuration loader           | Read the `.env` file            | [Documentation](https://pypi.org/project/python-dotenv/)       |\n| `typer[all]`        | CLI framework                  | Process CLI input and output    | [Documentation](https://typer.tiangolo.com/)                   |\n| `uvicorn[standard]` | ASGI web server                | Serve the FastAPI app           | [Documentation](https://www.uvicorn.org/)                      |\n\n\u003e **Note:** Packages listed in `requirements.txt` that are not listed above, are packages that were automatically\n\u003e installed by `pip` when I installed the packages listed above. In other words, they are \"dependencies of\n\u003e dependencies\" (i.e. dependencies of the packages listed above).\n\n## Roadmap\n\nHere are some additional things I'm thinking about doing in this repository:\n\n1. Create a Pydantic model representing the \"sample\" object and use it to\n   (a) [validate and sanitize](https://docs.pydantic.dev/usage/validators/) the data extracted from the CSV file\n   (e.g. `\"-9999\" → None`); (b) display the API response's\n   [JSON schema](https://fastapi.tiangolo.com/tutorial/response-model/#see-it-in-the-docs) in the API docs and\n   (c) [filter out](https://fastapi.tiangolo.com/tutorial/response-model/#fastapi-data-filtering)\n   the `_id` field from the API response. Item (a) would happen in `parser.py` and items (b) and (c) would happen\n   in `server.py`. Items (a) and (c) are already happening, but not via a Pydantic model.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgitname%2Fsediment-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgitname%2Fsediment-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgitname%2Fsediment-api/lists"}