https://github.com/engisalor/lmf
LMF-CLI: run LLM tasks with LangChain
https://github.com/engisalor/lmf
applied-linguistics langchain-python language-model
Last synced: about 1 month ago
JSON representation
LMF-CLI: run LLM tasks with LangChain
- Host: GitHub
- URL: https://github.com/engisalor/lmf
- Owner: engisalor
- License: gpl-3.0
- Created: 2025-06-17T11:48:54.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-11-19T09:26:23.000Z (7 months ago)
- Last Synced: 2026-05-27T10:38:38.461Z (about 1 month ago)
- Topics: applied-linguistics, langchain-python, language-model
- Language: Python
- Homepage:
- Size: 159 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# LMF: a CLI for generative language model experiments with LangChain
This repo is for designing, organizing and running LLM experiments with Python and [LangChain](https://docs.langchain.com/oss/python) (a language model framework, LMF). It has a modular structure for building just about any type of chatbot or generative LLM task supported by LangChain.
We use LMF for doing applied linguistics research. See our conference article for [eLex 2025](https://elex.link/elex2025/wp-content/uploads/eLex2025-50-Isaacs_etal.pdf).
## Introduction
The repo is managed with a [Makefile](./Makefile) and the [UV python package](https://docs.astral.sh/uv/). The Makefile has a few commands for defining dependencies, running tests, and getting started with an example project and configurations.
So far, running [Ollama](https://ollama.com/) and [HuggingFace](https://huggingface.co/) (locally) and [OpenAI](https://openai.com/) (paid API) is implemented.
### CLI basics
The main command `lmf` is available with a virtual environment `.venv` activated and the repo's dependencies installed.
See `lmf --help` for overall usage and `lmf --help` for individual commands.
`lmf` has a few primary commands:
- `prepare` to prepare final prompts: from no-frills system prompts to more advanced techniques like few-shot semantic similarity example selection with a separate embeddings model and vector store
- `query` to send prompts to models: configurable to allow for multiple runs, hyperparameters, chat model types, and other options
- `clear` to delete data generated by `prepare` and `query`
### Design and stability
We designed LMF to conduct applied linguistics research, with all its specific needs and quirks. Hopefully it's easy to use and modify, but it **should not be considered a stable dependency**. Forking the repo and reviewing new commits would be prudents.
## Understanding projects
Any LLM experiment/job/set of input data is referred to as a project. Projects are located in the `project/` directory and generally require three files:
- `examples.yml` with examples for compiling few-shot tasks (left empty if none)
- `inputs.yml` with human prompt(s) to send to the LLM
- `system.yml` with a system prompt
Projects are independent from other configurations to allow for easily swapping LLMs, changing configurations, and testing performance. Each project directory is a self-contained set of data.
## Example project
Here is what the `wizard-of-math` project looks like. It's adapted from LangChain's dynamic example selector documentation.
This project is intended for use with a semantic similarity selector, with the goal of showing an LLM how to do math with the `+` symbol replaced with a bird emoji and to respond to a question about horses.
An embeddings model (separate from the chat model) is used to provide the best examples to each input. With the `lmf prepare` command, the final prompts are compiled, including the system prompt and dynamically selected examples for each input. Then `lmf query` is executed, sending final prompts to the desired chat model.
### Initial project data
```yml
# examples.yml
- input: 2 🦜 2
output: "4"
- input: 2 🦜 3
output: "5"
- input: 2 🦜 4
output: "6"
- input: What did the cow say to the moon?
output: Nothing at all.
- input: Write me a poem about the moon.
output: One for the moon, and one for me, who are we to talk about the moon?
- input: Tell me about horses.
output: Horses are mammals.
# inputs.yml
- input: About horses...
- input: What's 3 🦜 3?
# system.yml
You are a wondrous wizard of math.
```
### Task execution
To run a task, first download the required models. LMF's default embeddings model is from HuggingFace and the chat model is from Ollama.
```bash
ollama pull qwen3:1.7b
huggingface-cli download Qwen/Qwen3-Embedding-4B
```
The task can then be executed in a single line:
```bash
lmf -r 3 -p wizard-of-math -f temperature-0 prepare query --temperature 0.0
```
Configuration:
- `-r 3` defines how many runs (repeated executions) should be completed
- `-p wizard-of-math` defines the current project directory
- `-f temperature-0` sets a filename prefix for the current command
- `prepare` generates the final prompts
- `query --temperature 0.0` runs the task with the default model with a temperature of 0
Modified versions of the task using different models or other parameters can also be run. Outputs are saved to `/project/wizard-of-math/output/`. For example:
```bash
lmf -p wizard-of-math -f gemma3-temp0.5 prepare query --model gemma3:12b --temperature 0.5
```
Each run of each version of the executed task is saved separately: just make sure filenames are set to be unique, as existing files get overwritten.
To do a more systematic evaluation of how LLMs complete a task, run a series of commands, where each command tests one configuration. For example, these commands generate final prompts with `lmf prepare` using a number of LLMs. We can inspect the generated prompts to determine which embeddings model achieves the best dynamic example selector results.
```bash
lmf -p wizard-of-math clear
lmf -p wizard-of-math -f 1-qwen3-e-0.6B prepare
lmf -p wizard-of-math -f 2-nomic-embed-text prepare --embeddings Ollama --model nomic-embed-text:latest
lmf -p wizard-of-math -f 3-ollama-qwen3-1.7b prepare --embeddings Ollama --model qwen3:1.7b
```
## Components and recipes
The example project gets us started and doesn't require writing any code or changing underlying components of LMF. For more in-depth modifications, run `lmf COMMAND --help` to see what can be defined by each command. A few components are available as-is, but adding new ones to the Python modules is straightforward.
For example, `query` accepts different chat model providers (`Ollama`, `OpenAI`), which must be set to access the models each provider has. Also, default outputs are unstructured (a typical chatbot conversation), but structured outputs can be set to return data as Python objects/JSON data. For example, the `SemanticRelationTriple` structured output could be used for entity-relation extraction tasks.
More likely, you'll need to define your own component classes. New components can be added to the respective Python module, such as [schema.py](./src/lmf/schema.py) for structured outputs. Append your own modifications above the line `### add new classes above this line ###`, using the default classes as a reference, and your new component will automatically be available in the CLI, e.g., by executing `lmf ... query --output-structure MyNewStructuredOutput`.
### `query` arguments and underlying components
```bash
Usage: lmf query [OPTIONS]
Executes LLM final prompts with a model, model provider and output
structure.
Options:
-m, --model TEXT Name of model (download models beforehand)
[default: qwen3:1.7b]
--chat-model CHAT_MODEL.PY A chat model chat model class from chat.py
[default: Ollama]
--chat-model-param TEXT A parameter to pass to the chat model in the
format 'key=value'
-o, --output-structure SCHEMA.PY
A structured output class from schema.py
[default: Unstructured]
--sample INTEGER Sample size (run first N prompts in a file;
0 == all) [default: 0]
--random / --no-random Toggle sample randomization [default: no-
random]
--temperature FLOAT RANGE Model temperature (0.0 = more deterministic
/ 1.0 = more variable) [default: 0.0;
0.0<=x<=1.0]
--timeout INTEGER Response timeout (for cloud providers)
[default: 300]
--max-tokens INTEGER Model maximum tokens per response [default:
10000]
--think / --no-think Toggle model thinking [default: no-think]
--rate-limiter RATE_LIMITER.PY A rate limiter class from rate_limiter.py
[default: NoRateLimiter]
--help Show this message and exit.
RECIPES *case insensitive*
Chat_models:
- Ollama
- OpenAI
Output_structures:
- Unstructured
- UnstructuredThink
- Hypernym
- Entity
- EntityList
- SemanticRelationTriple
- EntityRelationExtractor
Rate_limiters:
- NoRateLimiter
- Memory
```
## Environment variables
Setting environment variables may be necessary, like the example below.
```bash
# use huggingface offline
HF_HUB_OFFLINE=1
# pytorch settings
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
# API keys for external providers
OPENAI_API_KEY=
```
## Citing
Please cite this paper:
```bibtex
@inproceedings{
address = {Bled, Slovenia},
title = {Inductive {Categorization} for {Conceptual} {Analysis} with {LLMs}: {A} {Case} {Study} from the {Humanitarian} {Encyclopedia}},
url = {https://elex.link/elex2025/wp-content/uploads/eLex2025-50-Isaacs_etal.pdf},
booktitle = {Electronic lexicography in the 21st century ({eLex} 2023): {Intelligent} lexicography. {Proceedings} of the {eLex} 2025 conference},
publisher = {Lexical Computing},
author = {Isaacs, Loryn and Chambó, Santiago and León-Araúz, Pilar},
editor = {Kosem, Iztok and Jakubíček, Miloš and Medveď, Marek and Zgaga, Karolina and Arhar Holdt, Špela and Munda, Tina and Salgado, Ana},
year = {2025},
pages = {866--887},
}
```