https://github.com/sepinetam/texiv
A machine learning–based package for transforming text into instrumental variables (IV).
- Host: GitHub
- URL: https://github.com/sepinetam/texiv
- Owner: SepineTam
- License: agpl-3.0
- Created: 2025-06-29T18:31:31.000Z (12 months ago)
- Default Branch: master
- Last Pushed: 2025-08-09T04:01:14.000Z (10 months ago)
- Last Synced: 2026-03-25T06:15:16.419Z (3 months ago)
- Topics: econometrics, instrumental-variables, iv, iv-2sls, machine-learning, stata, texiv
- Language: Python
- Size: 3.3 MB
- Stars: 5
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
README
---
A machine learning–based package for transforming text into instrumental variables (IV).


[PyPI](https://pypi.org/project/texiv/) | [Downloads](https://pepy.tech/projects/texiv) | [License](LICENSE) | [New issue](https://github.com/sepinetam/texiv/issues/new) | [Wiki](https://github.com/sepinetam/texiv/wiki) | [DeepWiki](https://deepwiki.com/SepineTam/TexIV)
---
## 🌰 Example
For a step-by-step walkthrough, see the [Stata example file](source/example/dofiles/main.do) and the [Python example file](source/example/pyscript/main.py).
## ✨ Features
- Supports multiple Chinese word segmentation and embedding methods
- Customizable stopwords
- Supports keyword relevance filtering and two-stage filtering
- Output includes frequency, total count, and ratio statistics
## 📦 Requirements
- Python 3.11+
- A virtual environment (e.g., `venv` or `conda`) is recommended
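
To confirm that the active interpreter meets the version requirement before installing, a minimal check along these lines can help. It uses only the standard library; the message text is illustrative and not part of TexIV.

```python
import sys

# Minimum interpreter version listed under Requirements.
REQUIRED = (3, 11)

# sys.version_info compares element-wise against a plain tuple.
meets_requirement = sys.version_info >= REQUIRED

current = f"{sys.version_info.major}.{sys.version_info.minor}"
if meets_requirement:
    print(f"Python {current} satisfies the 3.11+ requirement")
else:
    print(f"Python {current} is too old; 3.11 or newer is required")
```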
## 🚀 Quick Start
### Install
```bash
pip install texiv
```
### Usage
```python
from texiv import TexIV

# Create a TexIV instance.
texiv = TexIV()

content: str = "This is a test text..."
keywords: list[str] = ["keyword1", "keyword2", "keyword3"]

# Compute the keyword statistics for the text and print them.
result = texiv.texiv_it(content, keywords)
print(result)
```
Output example:
```
{'freq': 7, 'count': 34, 'rate': 0.20588235294117646}
```
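
The numbers in the sample output are consistent with `rate` being `freq` divided by `count`. The short check below illustrates that relationship using the sample values only. It is an inference from this one example, not a statement of how TexIV computes the statistics internally.

```python
# Sample result copied from the output example above.
result = {"freq": 7, "count": 34, "rate": 0.20588235294117646}

# Assumption: rate = freq / count (inferred from the sample numbers only).
recomputed = result["freq"] / result["count"]

# Compare with a tolerance rather than exact float equality.
matches = abs(recomputed - result["rate"]) < 1e-12
print(f"recomputed rate: {recomputed:.4f}, matches sample: {matches}")
```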
## 🖥️ Command Line Tool
The project also provides a command-line interface that can be used directly after installation:
```bash
texiv --help
```
## 🛠️ Configuration
All models and parameters can be adjusted through the configuration file at `~/.texiv/config.toml`.
## 📄 License
This project is licensed under the GNU Affero General Public License v3.0. See [LICENSE](LICENSE) for details.
**Note:** Commercial use requires compliance with AGPL-3.0 terms, including source code disclosure for network services.