{"id":42010463,"url":"https://github.com/nasa/concept-tagging-api","last_synced_at":"2026-01-26T02:23:14.987Z","repository":{"id":45630965,"uuid":"271365544","full_name":"nasa/concept-tagging-api","owner":"nasa","description":"Contains code for the API that takes in text and predicts concepts \u0026 keywords from a list of standardized NASA keywords.  API is for exposing models created with the repository `concept-tagging-training`.  ","archived":false,"fork":false,"pushed_at":"2024-10-25T19:36:19.000Z","size":194,"stargazers_count":20,"open_issues_count":7,"forks_count":13,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-09-25T14:03:11.959Z","etag":null,"topics":["api","concept-tag","flask-application","machine-learning","nasa","nasa-api","nlp","nlp-machine-learning","usg-artificial-intelligence"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nasa.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-06-10T19:21:45.000Z","updated_at":"2024-11-20T19:27:43.000Z","dependencies_parsed_at":"2022-09-10T12:22:26.921Z","dependency_job_id":null,"html_url":"https://github.com/nasa/concept-tagging-api","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/nasa/concept-tagging-api","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nasa%2Fconcept-tagging-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nasa%2Fconcept-tagging-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nasa%2Fconcept-taggin
g-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nasa%2Fconcept-tagging-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nasa","download_url":"https://codeload.github.com/nasa/concept-tagging-api/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nasa%2Fconcept-tagging-api/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28764947,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-26T00:37:26.264Z","status":"online","status_checked_at":"2026-01-26T02:00:08.215Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","concept-tag","flask-application","machine-learning","nasa","nasa-api","nlp","nlp-machine-learning","usg-artificial-intelligence"],"created_at":"2026-01-26T02:23:14.305Z","updated_at":"2026-01-26T02:23:14.982Z","avatar_url":"https://github.com/nasa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OCIO STI Concept Tagging Service\n\nAn API for exposing models created with [STI concept training](https://github.com/nasa/concept-tagging-training). This project was written about [here](https://strategy.data.gov/proof-points/2019/05/28/improving-data-access-and-data-management-artificial-intelligence-generated-metadata-tags-at-nasa/) for the Federal Data Strategy Incubator Project. 
A running version of this API may be found [here](http://go.nasa.gov/concepttagger). However, this is a temporary instance for demo purposes. It may not be available long-term. Please do not use it in production or at scale.\n\n### What is Concept Tagging\nBy concept tagging, we mean you can supply text, for example:\n`Volcanic activity, or volcanism, has played a significant role in the geologic evolution of Mars.[2] Scientists have known since the Mariner 9 mission in 1972 that volcanic features cover large portions of the Martian surface.` and get back predicted keywords, like `volcanology, mars surface, and structural properties`, as well as topics, like `space sciences, geosciences`, from a standardized list of several thousand NASA concepts with a probability score for each prediction.\n\n## Index\n1. [Using Endpoint](#using-endpoint)\n    1. [Request](#request)\n    2. [Response](#response)\n2. [Running Your Own Instance](#running-your-own-instance)\n    1. [Installation](#installation)\n        1. [Pull Docker Image](#pull-docker-image)\n        2. [Build Docker Image](#build-docker-image)\n        3. [With Local Python](#with-local-python)\n    2. [Downloading Models](#downloading-models)\n    3. [Running Service](#running-service)\n        1. [Using Docker](#using-docker)\n        2. [Using Local Python](#using-local-python)\n\n## Using Endpoint\n### Request\nThe endpoint accepts a few fields, shown in this example:\n```json\n{\n    \"text\": [\n        \"Astronauts go on space walks.\",\n        \"Basalt rocks and minerals are on earth.\"\n    ], \n    \"probability_threshold\":\"0.5\",\n    \"topic_threshold\":\"0.9\", \n    \"request_id\":\"example_id10\"\n}\n```\n- **text** *(string or list of strings)* -- The text(s) to be tagged.\n- **probability_threshold** *(float in [0, 1])* -- A threshold under which a concept tag will not be returned by the API. 
For example, if the threshold is set to 0.8 and a concept only scores 0.5, the concept will be omitted from the response. Setting it to 1 will yield no results. Setting it to 0 will yield all of the classifiers and their scores, no matter how low.\n- **topic_threshold** *(float in [0, 1])* -- A probability threshold for categories. If a category falls under this threshold, its respective suite of models will not be utilized for prediction. If you set this value to 1, only the generalized concept models will be used for tagging, yielding significant speed gains.\n- **request_id** *(string)* -- An optional ID for your request.  \n\nYou might send this request using curl. In the command below:\n1. Replace `example_payload_multiple.json` with the path to your JSON request.\n2. Replace `http://0.0.0.0:5000/` with the address of the API instance.\n```\ncurl -X POST -H \"Content-Type: application/json\" -d @example_payload_multiple.json http://0.0.0.0:5000/findterms/\n```\n### Response\nYou will then receive a response like [this one](docs/multiple_response.json). In the `payload`, you will see multiple fields, including:\n- **features** -- Words and phrases directly extracted from the document. \n- **sti_keywords** -- Concepts and their prediction scores. \n- **topic_probability** -- Model scores for all of the categories.\n\n## Running Your Own Instance\n### Installation\nFor most people, the simplest installation entails [building the docker image](#build-docker-image), [downloading the models](#downloading-models), and [running the docker container](#using-docker).\n\n\n#### Build Docker Image\nFirst, clone this repository and enter its root.\nNow, you can build the image with:\n```\ndocker build -t concept_tagging_api:example .\n```\n\\* Developers should look at the `make build` command in the [Makefile](Makefile). 
It has an automated process for tagging the image with useful metadata.\n\n#### With Local Python\n\\* tested with python:3.7  \nFirst, clone this repository and enter its root.  \nNow, create a virtual environment. For example, using [venv](https://docs.python.org/3/library/venv.html):\n```\npython -m venv venv\nsource venv/bin/activate\n```\nNow install the requirements with:\n```\nmake requirements\n```\n\n### Downloading Models\nThen, you need to download the machine learning models upon which the service relies. \n\nYou can find a zipped file containing all of the models [here](https://data.nasa.gov/docs/datasets/public/concept_tagging_models/10_23_2019.zip). Now, to get the models in the right place and unzip them:\n```bash\nmkdir models\nmv \u003cYOUR_ZIPPED_MODELS_NAME\u003e.zip models\ncd models\nunzip \u003cYOUR_ZIPPED_MODELS_NAME\u003e.zip\n```\nAlternatively, the models can be downloaded from data.nasa.gov, where they are named \u003ca href='https://data.nasa.gov/Software/STI-Tagging-Models/jd6d-mr3p'\u003eSTI Tagging Models\u003c/a\u003e. However, they download more slowly from that location.\n\n### Running Service\n\n#### Using Docker\nWith the docker image and model files in place, you can now run the service with a simple docker command. In the command below, be sure to:\n 1. Replace `concept_tagging_api:example` with the name of your image.\n 2. Replace `$(pwd)/models/10_23_2019` with the path to your models directory. \n 3. Replace `5001` with the port on your local machine from which you wish to access the API.\n```\ndocker run -it \\\n    -p 5001:5000 \\\n    -v $(pwd)/models/10_23_2019:/home/service/models/experiment \\\n    concept_tagging_api:example\n```\n\nNote that you may experience permission errors when you start the container. To resolve this issue, set the user and group of your `models` directory to 999. 
This is the uid for the user \n\n**optional**\nThe entrypoint to the docker image is [gunicorn](https://docs.gunicorn.org/en/stable/index.html), a Python WSGI HTTP server which runs our Flask app. You can optionally pass additional arguments to gunicorn. For example:\n```bash\ndocker run -it \\\n    -p 5001:5000 \\\n    -v $(pwd)/models/10_23_2019:/home/service/models/experiment \\\n    concept_tagging_api:example --timeout 9000 \n```\nSee [here](https://docs.gunicorn.org/en/stable/design.html#async-workers) for more information about design considerations for these gunicorn settings.\n\n#### Pitfalls \u0026 Gotchas to Remember\n- If you run this on a cloud service and run an upgrade on everything out of date for security reasons, you may need to run `sudo service docker stop`\nand then `sudo service docker start` to get docker going again. You'll also have to find the docker container that was last running and restart it.\n- If you run the docker container as described above, remember to include the proper port at the end of your service's URL. \n\n#### Using Local Python\nWith the requirements installed and the model files in place, you can now run the service with python locally. \nIn the command below, replace `models/test` with the path to your models directory. 
For example, if you followed the example from [Downloading Models](#downloading-models), it will be `models/10_23_2019`.\n```\nexport MODELS_DIR=models/test; \\\npython service/app.py\n```\n#### If you were a part of the legacy concept tagger API development team and need access to the test server that's no longer available as of 11/9/23, please email us [here](mailto:hq-open-innovation@mail.nasa.gov).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnasa%2Fconcept-tagging-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnasa%2Fconcept-tagging-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnasa%2Fconcept-tagging-api/lists"}