https://github.com/chigwell/hist-pred-extractor
Extracts & structures historical predictions from text using llmatch-messages.
- Host: GitHub
- URL: https://github.com/chigwell/hist-pred-extractor
- Owner: chigwell
- Created: 2025-12-21T19:50:24.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2025-12-21T19:50:36.000Z (about 2 months ago)
- Last Synced: 2025-12-23T08:14:51.873Z (about 2 months ago)
- Topics: archiveresearch, contextanalysis, dataenrichment, historiantools, historicalpredictions, llmatchmessages, nlp, outcomeassessment, predictionvalidation, predictiveanalysis, researchassistant, semanticanalysis, structuredoutput, temporalinference, textextraction
- Language: Python
- Homepage: https://pypi.org/project/hist-pred-extractor/
- Size: 3.91 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# hist-pred-extractor
[PyPI](https://badge.fury.io/py/hist-pred-extractor)
[License: MIT](https://opensource.org/licenses/MIT)
[Downloads](https://pepy.tech/project/hist-pred-extractor)
[LinkedIn](https://www.linkedin.com/in/eugene-evstafev-716669181/)
**hist-pred-extractor** is a tiny Python package that extracts and structures historical predictions (foresights, prophecies, or prognostications) from free‑form text.
Given a passage about a historical figure or event, the tool uses an LLM to locate statements that contain a prediction, validates them against a strict regex pattern, and returns a clean, structured list of the extracted predictions together with their context.
The package is useful for historians, researchers, educators, and anyone who wants to systematically analyse historical foresights.
---
## Features
- **One‑function API** – just pass a string (and optionally a custom LLM or API key).
- **LLM‑agnostic** – defaults to `ChatLLM7` (via `langchain_llm7`) but works with any LangChain chat model.
- **Regex-validated output** – ensures extracted data follows the pattern defined in `prompts.pattern` (see the sketch after this list).
- **Zero‑configuration** – works out of the box with the free tier of LLM7.
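
The validation step can be illustrated with a minimal, self-contained sketch. The `PREDICTION_PATTERN` regex, the prompt text, and the `extract_predictions` helper below are assumptions made for illustration only – they are not the package's actual internals, which delegate this work to `llmatch-messages` and `prompts.pattern`. The sketch just shows the general idea: constrain the LLM response with tags, then keep only the spans that match a pattern.

```python
import re
from typing import List

from langchain_core.language_models.chat_models import BaseChatModel

# Illustrative only -- NOT the package's real `prompts.pattern`.
# The LLM is asked to wrap each prediction in <prediction>...</prediction>
# tags so the spans can be pulled out of the raw response with a regex.
PREDICTION_PATTERN = re.compile(r"<prediction>(.*?)</prediction>", re.DOTALL)
PROMPT_TEMPLATE = (
    "Find every explicit prediction in the text below and wrap each one in "
    "<prediction>...</prediction> tags.\n\nText:\n{text}"
)


def extract_predictions(text: str, llm: BaseChatModel) -> List[str]:
    """Ask the LLM for tagged predictions, keep only spans matching the regex."""
    response = llm.invoke(PROMPT_TEMPLATE.format(text=text))
    return [match.strip() for match in PREDICTION_PATTERN.findall(response.content)]
```

In the package itself this prompt/validate loop is handled through `llmatch-messages`, so you never write the prompt or the pattern yourself.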
---
## Installation
```bash
pip install hist-pred-extractor
```
---
## Quick Start
```python
from hist_pred_extractor import hist_pred_extractor
# Simple call – uses the default ChatLLM7 and the LLM7_API_KEY environment variable
text = """
In 1846, the astronomer John Herschel wrote: "In the next fifty years, the continents will drift apart,
forming the Atlantic as we know it today." This prediction was later confirmed by plate tectonics.
"""
predictions = hist_pred_extractor(user_input=text)
print(predictions)
# Example output:
# [
# "In the next fifty years, the continents will drift apart, forming the Atlantic as we know it today."
# ]
```
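
Because the return value is a plain `List[str]`, the output drops straight into ordinary research workflows. A small follow-up sketch (the `predictions.json` filename and the sample passage are arbitrary choices, not part of the package):

```python
import json

from hist_pred_extractor import hist_pred_extractor

passage = (
    "In 1903, an editorial declared that practical heavier-than-air flight "
    "was at least a million years away."
)

predictions = hist_pred_extractor(user_input=passage)

# Keep the source passage next to its extracted predictions so the pair can
# be reviewed, shared, or enriched later.
with open("predictions.json", "w", encoding="utf-8") as fh:
    json.dump({"source": passage, "predictions": predictions}, fh, indent=2)
```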
---
## API Reference
### `hist_pred_extractor(user_input, llm=None, api_key=None) → List[str]`
| Parameter | Type | Description |
|-----------|------|-------------|
| **user_input** | `str` | The raw text containing historical statements to be analysed. |
| **llm** | `Optional[BaseChatModel]` | A LangChain chat model instance. If omitted, the function creates a `ChatLLM7` instance automatically. |
| **api_key** | `Optional[str]` | API key for LLM7. If not supplied, the function reads the `LLM7_API_KEY` environment variable, falling back to the default free‑tier key. |
**Returns:** `List[str]` – a list of extracted prediction strings that match the internal regex pattern.
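
A combined example using the documented parameters (the key string is a placeholder; `llm` is omitted, so the default `ChatLLM7` instance is created):

```python
from hist_pred_extractor import hist_pred_extractor

passage = (
    "In 1865, one commentator predicted that the new transatlantic cable "
    "would make war between Britain and America impossible."
)

# `llm` is omitted, so a ChatLLM7 instance is built internally; the API key
# here is a placeholder and could instead come from the LLM7_API_KEY
# environment variable.
predictions = hist_pred_extractor(user_input=passage, api_key="your_api_key_here")

for i, prediction in enumerate(predictions, start=1):
    print(f"{i}. {prediction}")
```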
---
## Using a Custom LLM
You can supply any LangChain‑compatible chat model, e.g. OpenAI, Anthropic, or Google Gemini.
### OpenAI
```python
from langchain_openai import ChatOpenAI
from hist_pred_extractor import hist_pred_extractor
llm = ChatOpenAI(model="gpt-4o-mini")
predictions = hist_pred_extractor(user_input="...", llm=llm)
```
### Anthropic
```python
from langchain_anthropic import ChatAnthropic
from hist_pred_extractor import hist_pred_extractor
llm = ChatAnthropic(model="claude-3-5-sonnet-latest")  # a model name is required
predictions = hist_pred_extractor(user_input="...", llm=llm)
```
### Google Gemini
```python
from langchain_google_genai import ChatGoogleGenerativeAI
from hist_pred_extractor import hist_pred_extractor
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
predictions = hist_pred_extractor(user_input="...", llm=llm)
```
---
## API Key & Rate Limits
- The default free tier of **LLM7** provides enough calls for typical research tasks.
- For higher throughput, set your own key:
```bash
export LLM7_API_KEY="your_api_key_here"
```
or pass it directly:
```python
predictions = hist_pred_extractor(user_input="...", api_key="your_api_key_here")
```
You can obtain a free API key by registering at **https://token.llm7.io/**.
---
## Contributing & Support
- **Issue Tracker:** [GitHub Issues](https://github.com/chigwell/hist-pred-extractor/issues)
- Feel free to open a GitHub issue for bugs, feature requests, or questions.
---
## License
This project is licensed under the MIT License.
---
## Author
**Eugene Evstafev** – GitHub: [chigwell](https://github.com/chigwell)
---
## Acknowledgements
- **ChatLLM7** from the `langchain_llm7` package.
- **LangChain** framework for unified LLM access.