
# hist-pred-extractor
[![PyPI version](https://badge.fury.io/py/hist-pred-extractor.svg)](https://badge.fury.io/py/hist-pred-extractor)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://static.pepy.tech/badge/hist-pred-extractor)](https://pepy.tech/project/hist-pred-extractor)
[![LinkedIn](https://img.shields.io/badge/LinkedIn-blue)](https://www.linkedin.com/in/eugene-evstafev-716669181/)

**hist-pred-extractor** is a tiny Python package that extracts and structures historical predictions (foresights, prophecies, or prognostications) from free‑form text.
Given a passage about a historical figure or event, the tool uses an LLM to locate statements that contain a prediction, validates them against a strict regex pattern, and returns a clean, structured list of the extracted predictions together with their context.

The package is useful for historians, researchers, educators, and anyone who wants to systematically analyse historical foresights.

---

## Features

- **One‑function API** – just pass a string (and optionally a custom LLM or API key).
- **LLM‑agnostic** – defaults to `ChatLLM7` (via `langchain_llm7`) but works with any LangChain chat model.
- **Regex‑validated output** – ensures extracted data follows the pattern defined in `prompts.pattern`.
- **Zero‑configuration** – works out of the box with the free tier of LLM7.
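The regex-validation step can be pictured as follows. The real pattern lives in `prompts.pattern`; the pattern and helper below are illustrative stand-ins only, not the package's actual implementation:

```python
import re
from typing import List

# Illustrative stand-in -- the package defines its own pattern in prompts.pattern.
PREDICTION_PATTERN = re.compile(r'"([^"]+)"')

def validate_candidates(candidates: List[str]) -> List[str]:
    """Keep only candidate statements that match the expected pattern."""
    validated = []
    for candidate in candidates:
        match = PREDICTION_PATTERN.search(candidate)
        if match:
            # Extract just the quoted prediction text.
            validated.append(match.group(1))
    return validated

print(validate_candidates([
    'Herschel wrote: "The continents will drift apart."',
    "No quoted prediction here.",
]))
```

Statements the LLM returns that do not match the pattern are silently dropped, which is why the function's output is always a clean list of strings.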

---

## Installation

```bash
pip install hist-pred-extractor
```

---

## Quick Start

```python
from hist_pred_extractor import hist_pred_extractor

# Simple call – uses the default ChatLLM7 and the LLM7_API_KEY environment variable
text = """
In 1846, the astronomer John Herschel wrote: "In the next fifty years, the continents will drift apart,
forming the Atlantic as we know it today." This prediction was later confirmed by plate tectonics.
"""
predictions = hist_pred_extractor(user_input=text)

print(predictions)
# Example output:
# [
# "In the next fifty years, the continents will drift apart, forming the Atlantic as we know it today."
# ]
```

---

## API Reference

### `hist_pred_extractor(user_input, llm=None, api_key=None) → List[str]`

| Parameter | Type | Description |
|-----------|------|-------------|
| **user_input** | `str` | The raw text containing historical statements to be analysed. |
| **llm** | `Optional[BaseChatModel]` | A LangChain chat model instance. If omitted, the function creates a `ChatLLM7` instance automatically. |
| **api_key** | `Optional[str]` | API key for LLM7. If not supplied, the function reads the `LLM7_API_KEY` environment variable, falling back to the default free‑tier key. |

**Returns:** `List[str]` – a list of extracted prediction strings that match the internal regex pattern.
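The `api_key` resolution order described above (explicit argument, then environment variable, then the free-tier default) can be sketched like this; the names below are illustrative, not the package's internal ones:

```python
import os
from typing import Optional

# Stand-in for the package's built-in free-tier fallback key.
DEFAULT_FREE_TIER_KEY = "unused"

def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Resolve the LLM7 key: explicit argument wins, then LLM7_API_KEY, then free tier."""
    if api_key:
        return api_key
    return os.environ.get("LLM7_API_KEY", DEFAULT_FREE_TIER_KEY)
```

Passing `llm=` skips this resolution entirely, since the supplied chat model carries its own credentials.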

---

## Using a Custom LLM

You can supply any LangChain‑compatible chat model, e.g. OpenAI, Anthropic, or Google Gemini.

### OpenAI

```python
from langchain_openai import ChatOpenAI
from hist_pred_extractor import hist_pred_extractor

llm = ChatOpenAI(model="gpt-4o-mini")
predictions = hist_pred_extractor(user_input="...", llm=llm)
```

### Anthropic

```python
from langchain_anthropic import ChatAnthropic
from hist_pred_extractor import hist_pred_extractor

llm = ChatAnthropic(model="claude-3-5-sonnet-latest")  # model is required
predictions = hist_pred_extractor(user_input="...", llm=llm)
```

### Google Gemini

```python
from langchain_google_genai import ChatGoogleGenerativeAI
from hist_pred_extractor import hist_pred_extractor

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
predictions = hist_pred_extractor(user_input="...", llm=llm)
```

---

## API Key & Rate Limits

- The default free tier of **LLM7** provides enough calls for typical research tasks.
- For higher throughput, set your own key:

```bash
export LLM7_API_KEY="your_api_key_here"
```

or pass it directly:

```python
predictions = hist_pred_extractor(user_input="...", api_key="your_api_key_here")
```

You can obtain a free API key by registering at **https://token.llm7.io/**.

---

## Contributing & Support

- **Issue Tracker:** feel free to open a [GitHub issue](https://github.com/chigwell/hist-pred-extractor/issues) for bugs, feature requests, or questions.

---

## License

This project is licensed under the MIT License.

---

## Author

**Eugene Evstafev** –
GitHub: [chigwell](https://github.com/chigwell)

---

## Acknowledgements

- **ChatLLM7** from the `langchain_llm7` package.
- **LangChain** framework for unified LLM access.