{"id":27629976,"url":"https://github.com/dkeygit/elqm","last_synced_at":"2026-05-15T08:34:35.794Z","repository":{"id":289170630,"uuid":"970291632","full_name":"dkeyGit/elqm","owner":"dkeyGit","description":"Energy-Law Query-Master: A highly modular end-to-end RAG-based question answering system for legal documents from EUR-Lex.","archived":false,"fork":false,"pushed_at":"2025-04-21T22:03:22.000Z","size":4600,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-23T16:16:51.346Z","etag":null,"topics":["eur-lex","gradio","langchain-python","llm","ollama","question-answering","retrieval-augmented-generation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dkeyGit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-21T19:34:30.000Z","updated_at":"2025-04-21T22:03:26.000Z","dependencies_parsed_at":"2025-04-22T04:48:03.055Z","dependency_job_id":null,"html_url":"https://github.com/dkeyGit/elqm","commit_stats":null,"previous_names":["dkeygit/elqm"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkeyGit%2Felqm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkeyGit%2Felqm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkeyGit%2Felqm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkeyGit%2Felqm/manifests","o
wner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dkeyGit","download_url":"https://codeload.github.com/dkeyGit/elqm/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250468277,"owners_count":21435453,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["eur-lex","gradio","langchain-python","llm","ollama","question-answering","retrieval-augmented-generation"],"created_at":"2025-04-23T16:16:55.072Z","updated_at":"2025-10-14T18:34:37.214Z","avatar_url":"https://github.com/dkeyGit.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e\n    \u003cimg style=\"width: 150px\" src=\"elqm_icon.png\" alt=\"Illustration icon: A modern light bulb design, with its filament shaped as a balance scale representing law. Encapsulating the bulb is a speech bubble, with a question mark and an answer tick, symbolizing the Q\u0026A aspect.\"\u003e\n\u003c/h1\u003e\n\n\n\u003ch1 align=\"center\" style=\"margin-top: 0px;\"\u003eELQM: Energy-Law Query-Master\u003c/h1\u003e\n\u003ch2 align=\"center\" style=\"margin-top: 0px;\"\u003eNatural Language Processing with Transformers\u003c/h2\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n\u003c/div\u003e\n\n# Introduction\nWe develop ELQM, a RAG-based question answering system for European energy law acquired from [EUR-Lex](https://eur-lex.europa.eu/search.html?name=browse-by%3Alegislation-in-force\u0026type=named\u0026displayProfile=allRelAllConsDocProfile\u0026qid=1696858573178\u0026CC_1_CODED=12). 
ELQM comprises a full end-to-end pipeline, including data scraping, preprocessing, splitting, vectorization, storage, retrieval, and answer generation with chat-based LLMs. Our work also focuses on usability, providing three access points and linking to source documents for transparency.\n# Requirements\n\n### Hardware\n- 16 GB RAM\n- 12 GB VRAM\n    - by default CUDA is used\n- 25 GB storage space\n    - 6 GB cache for all configurations\n    - 7 GB environment\n    - ~5 GB for *each* `llama2` and `mistral` model\n\n### Software\n- Python 3.10\n- `pip` \u003e= [24.0](https://github.com/google/sentencepiece/issues/378)\n- [Ollama](https://ollama.ai/download)\n- Ubuntu \u003e= 22.04 (optional, for `GPT4AllEmbeddings` which requires glibc)\n- For SparkNLP: Java OpenJDK or similar (see https://pypi.org/project/spark-nlp/)\n\n# Getting Started\n### 1. Clone the repository\n\n```sh\ngit clone https://github.com/dkeyGit/elqm\ncd elqm\n```\n\n### 2. Install the package\n\nOptional: Create a virtual environment:\n\n**conda:**\n\n```sh\nconda create -n elqm python=3.11 [ipykernel]\nconda activate elqm\n```\n\nOptional: Install ipykernel to use the environment in Jupyter Notebook\n\n**venv:**\n\n```bash\npython3 -m venv elqmVenv\nsource elqmVenv/bin/activate\n```\n\nThen, install the package via\n\n```sh\npip install --upgrade pip\npip install -e .\n```\n\n### 3. Scrape the data\nScrape the EUR-Lex data with\n\n```sh\nelqm scrape-data\n```\n\nAlternatively, you can download the scraped data from our [Hugging Face dataset](https://huggingface.co/datasets/ELQM/elqm-raw) and move its contents into `/data`.\n\n### 4. Install Ollama models\n\n1. Run the Ollama backend via\n\n```sh\nollama serve\n```\n\n2. Pull the desired Ollama model, e.g. 
`mistral`\n\n```sh\nollama pull mistral\n```\n\nTo generate the oracle dataset, we use the `llama2` model:\n\n```sh\nollama pull llama2\n```\n\n# Usage\n\n**Gradio Frontend**\n```sh\nelqm gui -c configs/prompts/256_5_5_nlc_bge_fn_mistral_h2.yaml\n```\n\n**CLI**\n```sh\nelqm run -c configs/prompts/256_5_5_nlc_bge_fn_mistral_h2.yaml\n```\n\n**Python API**\n```python\nfrom dynaconf import Dynaconf\nimport os\n\nfrom elqm import ELQMPipeline\nfrom elqm.utils import get_dir\n\nconfig = Dynaconf(settings_files=os.path.join(get_dir(\"configs\", \"prompts\"), \"256_5_5_nlc_bge_fn_mistral_h2.yaml\"))\nelqm = ELQMPipeline(config)\n\nprint(elqm.answer(\"Which CIE LUV does a model supporting greater than 99 % of the sRGB colour space translate to?\"))\n```\n\n\n# Development\n\n### Setup\nTo set up the development environment, run the following commands:\n\n```sh\npip install -e .[dev]\npre-commit install\n```\n\n### Tests\n\nTo run the tests locally, run the following commands:\n\n```sh\nollama serve\npytest tests --cov src\n```\n\n# Citation\nIf you use ELQM: Energy-Law Query-Master for your research, please cite it using the following\n\n```bibtex\n@software{elqm_energy_law_query_master_2023,\n    author = {Daniel Knorr and Paul Saegert and Nikita Tatsch},\n    title = {ELQM: Energy-Law Query-Master},\n    month = mar,\n    year = 2024,\n    publisher = {GitHub},\n    version = {1.0.0},\n    url = {https://github.com/dkeyGit/elqm}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdkeygit%2Felqm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdkeygit%2Felqm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdkeygit%2Felqm/lists"}