{"id":34516800,"url":"https://github.com/clementsicard/un-semun","last_synced_at":"2026-03-16T06:08:36.324Z","repository":{"id":185122763,"uuid":"669757420","full_name":"ClementSicard/un-semun","owner":"ClementSicard","description":"Repository for SemUN 🇺🇳 project","archived":false,"fork":false,"pushed_at":"2023-11-02T16:35:52.000Z","size":340,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2023-11-03T17:02:22.913Z","etag":null,"topics":["full-stack","graph-db","ner","nlp","united-nations"],"latest_commit_sha":null,"homepage":"","language":"Makefile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ClementSicard.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-07-23T10:31:33.000Z","updated_at":"2023-11-02T17:21:28.000Z","dependencies_parsed_at":"2023-07-31T20:13:10.800Z","dependency_job_id":"518d1bf2-700f-43d6-b33c-40443465b7ef","html_url":"https://github.com/ClementSicard/un-semun","commit_stats":null,"previous_names":["clementsicard/un-semun"],"tags_count":1,"template":null,"template_full_name":null,"purl":"pkg:github/ClementSicard/un-semun","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClementSicard%2Fun-semun","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClementSicard%2Fun-semun/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClementSicard%2Fun-semun/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClementSicard%2Fun-semun/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Clement
Sicard","download_url":"https://codeload.github.com/ClementSicard/un-semun/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ClementSicard%2Fun-semun/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30570235,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-16T06:02:37.763Z","status":"ssl_error","status_checked_at":"2026-03-16T06:02:14.913Z","response_time":96,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["full-stack","graph-db","ner","nlp","united-nations"],"created_at":"2025-12-24T04:23:40.492Z","updated_at":"2026-03-16T06:08:36.314Z","avatar_url":"https://github.com/ClementSicard.png","language":"Makefile","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🇺🇳 SemUN repository\n\nRepository for SemUN project. 
It is composed of a docker-compose stack, with:\n\n- An API ([`un-semun-api`](un-semun-api))\n- A frontend ([`un-semun-front`](un-semun-front))\n- An NLP pipeline ([`un-ml-pipeline`](un-semun-db))\n- A Neo4j graph database (`neo4j` service in [`docker-compose.yml`](docker-compose.yml))\n- Scripts to populate the database ([`un-semun-misc`](un-semun-misc))\n- A scraper for the United Nations Digital Library\n\n[![Built with OrbStack](https://img.shields.io/badge/built%20with-OrbStack-pink.svg)](https://orbstack.dev) [![Built with neo4j](https://img.shields.io/badge/built%20with-Neo4j-purple.svg)](https://neo4j.com) ![Packages](https://img.shields.io/badge/package%20manager-poetry-blue) ![Linter](https://img.shields.io/badge/built%20with-React-orange) ![Version](https://img.shields.io/github/v/release/ClementSicard/un-semun?display_name=tag\u0026label=version\u0026logo=python\u0026logoColor=white)[![Built with HuggingFace](https://img.shields.io/badge/built%20with-Hugging%20Face%20🤗-cyan.svg)](https://huggingface.co) [![Built with spaCy](https://img.shields.io/badge/built%20with-spaCy-09a3d5.svg)](https://spacy.io)\n\n## Table of Contents\n\n- [🇺🇳 SemUN repository](#-semun-repository)\n  - [Table of Contents](#table-of-contents)\n    - [Description \\\u0026 Paper](#description--paper)\n    - [Running the project](#running-the-project)\n      - [Install requirements](#install-requirements)\n      - [Run the project](#run-the-project)\n      - [Stop the stack](#stop-the-stack)\n    - [Ingest documents using the ML pipeline API](#ingest-documents-using-the-ml-pipeline-api)\n\n### Description \u0026 Paper\n\n- For more information on the project, refer to the [project proposal](docs/project-proposal.pdf)\n- For more details about the final result, refer to the [paper](https://github.com/ClementSicard/un-semun-paper/blob/main/paper.pdf)\n\n### Running the project\n\n#### Install requirements\n\nYou need to have Docker installed. I'm using 
[OrbStack](https://orbstack.dev/) as a Docker desktop client for macOS, but a regular Docker installation works perfectly fine as well.\n\n#### Run the project\n\nOnce Docker is set up, run:\n\n```bash\n# Start the containers\ndocker-compose up -d\n```\n\nOpen the frontend at [http://localhost:8080/](http://localhost:8080) if using Docker Desktop or [http://un-semun-frontend.un-semun.orb.local/](http://un-semun-frontend.un-semun.orb.local/) if using OrbStack.\n\n#### Stop the stack\n\nTo stop the stack, run:\n\n```bash\ndocker-compose down\n```\n\nYou are all set! 🎉\n\n### Ingest documents using the ML pipeline API\n\nTo ingest documents, you can use the ML pipeline API. You can find more information about it in the [`README.md`](https://github.com/ClementSicard/un-ml-pipeline/blob/main/README.md) of the `un-ml-pipeline` folder.\n\nSend a `POST` request to the `/run` endpoint at URL `http://un-semun-api.un-semun.orb.local` with a JSON body of the following form:\n\n```json\n[\n    {\"recordId\": \"\u003crecord_id_0\u003e\"},\n    {\"recordId\": \"\u003crecord_id_1\u003e\"},\n    {\"recordId\": \"\u003crecord_id_2\u003e\"},\n    ...\n]\n```\n\nYou can also send a `POST` request to the `/run_search` endpoint, at the same URL, with a natural language query to the UN Digital Library. The API will then scrape the results and ingest them into the database.\n\n```json\n{\n  \"q\": \"\u003cquery\u003e\"\n}\n```\n\nYou can also limit the number of results to scrape by adding a field `\"n\": \u003cvalue\u003e` to the payload.\n\nFor instance:\n\n```json\n{\n  \"q\": \"Women in peacekeeping\",\n  \"n\": 256\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclementsicard%2Fun-semun","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fclementsicard%2Fun-semun","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclementsicard%2Fun-semun/lists"}