https://github.com/nicolay-r/AREkit
Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML
- Host: GitHub
- URL: https://github.com/nicolay-r/AREkit
- Owner: nicolay-r
- License: mit
- Created: 2019-12-03T20:20:46.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2024-09-29T16:31:34.000Z (about 2 months ago)
- Last Synced: 2024-09-29T16:34:47.396Z (about 2 months ago)
- Topics: bert, datasets, frames, language-models, neural-networks, nlp, pandas, pandas-dataframe, prompt, prompting, relation-extraction, sentiment-analysis, tensorflow
- Language: Python
- Homepage: https://nicolay-r.github.io/arekit-page/
- Size: 22.4 MB
- Stars: 56
- Watchers: 6
- Forks: 3
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
# AREkit 0.25.0
![](https://img.shields.io/badge/Python-3.9+-brightgreen.svg)
**AREkit** (Attitude and Relation Extraction Toolkit) is a Python toolkit devoted to document-level Attitude and Relation Extraction between text objects in mass-media news.

## Description
This toolkit aims at memory-efficient data processing in Relation Extraction (RE) related tasks.
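To illustrate what memory-efficient processing can look like in practice, here is a minimal sketch in plain Python that streams sentences through generators and groups them into batches, so the whole collection is never held in memory at once. The function names `iter_sentences` and `iter_batches` and the naive period-based sentence splitting are assumptions made for this example; they are not part of the AREkit API.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_sentences(docs: Iterable[str]) -> Iterator[str]:
    """Lazily yield sentences, one document at a time (naive split on '.')."""
    for doc in docs:
        for sentence in doc.split("."):
            sentence = sentence.strip()
            if sentence:
                yield sentence


def iter_batches(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Group a lazy stream into fixed-size batches without materializing it."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Hypothetical toy collection; a real one would be read lazily from disk.
docs = ["Alice met Bob. Bob thanked Alice.", "Carol criticized Dave."]
batches = list(iter_batches(iter_sentences(docs), size=2))
```

Because each stage is a generator, only one batch of sentences exists in memory at any time, regardless of how many documents the input iterable produces.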
> Figure: AREkit pipelines design. More in the
> **[ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction](https://www.ecir2024.org/accepted-paper/)** paper.

In particular, this framework provides the following features:
* ➿ [pipelines](https://github.com/nicolay-r/AREkit/wiki/Pipelines:-Text-Opinion-Annotation) and iterators for serializing large-scale collections without out-of-memory issues,
* 🔗 EL (entity-linking) API support for objects,
* ➰ avoidance of cyclic connections,
* 📏 distance consideration between relation participants (in `terms` or `sentences`),
* 📑 relation annotation and filtering rules,
* *️⃣ entity formatting or masking, and more.

The core functionality includes:
* API for document presentation with EL (Entity Linking, i.e. Object Synonymy) support
for preparing sentence-level relations (dubbed contexts);
* API for context extraction;
* Transferring relations from the sentence level to the document level, and more.

## Installation
```bash
pip install git+https://github.com/nicolay-r/[email protected]
```

## Usage
Please follow the **[tutorial section on project Wiki](https://github.com/nicolay-r/AREkit/wiki/Tutorials)** for more details.
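For readers who want an intuition before opening the tutorials, the sketch below illustrates three of the ideas listed in the description: limiting the distance between relation participants (measured in terms), skipping pairs that would link an object to itself or to its synonym (one possible reading of "cyclic connections"), and masking entities in the resulting context. It is written in plain Python and does not use the AREkit API; the `Entity` class, the `group` field, the `iter_pairs` and `mask` helpers, and the `[SUBJ]`/`[OBJ]` labels are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Entity:
    term_index: int  # position of the entity among the sentence terms
    value: str       # surface form, e.g. "Alice"
    group: int       # hypothetical synonym-group id (entity-linking result)


def iter_pairs(entities: List[Entity], max_terms: int) -> Iterator[Tuple[Entity, Entity]]:
    """Yield ordered (source, target) pairs that satisfy both constraints."""
    for source, target in permutations(entities, 2):
        if source.group == target.group:
            continue  # would connect an object with itself or its synonym
        if abs(source.term_index - target.term_index) > max_terms:
            continue  # participants are too far apart
        yield source, target


def mask(terms: List[str], source: Entity, target: Entity) -> str:
    """Replace the two participants with placeholder labels."""
    masked = list(terms)
    masked[source.term_index] = "[SUBJ]"
    masked[target.term_index] = "[OBJ]"
    return " ".join(masked)


terms = "Alice praised Bob but later Alice ignored Carol".split()
entities = [
    Entity(0, "Alice", group=0),
    Entity(2, "Bob", group=1),
    Entity(5, "Alice", group=0),
    Entity(7, "Carol", group=2),
]
pairs = list(iter_pairs(entities, max_terms=3))
first_context = mask(terms, *pairs[0])
```

With `max_terms=3`, the pair formed by the first "Alice" and "Carol" is dropped because the two are seven terms apart, and the two mentions of "Alice" never form a pair because they share a synonym group.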
## How to cite
Great research is also accompanied by faithful references.
If you use or extend our work, please cite as follows:

```bibtex
@inproceedings{rusnachenko2024arelight,
title={ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction},
author={Rusnachenko, Nicolay and Liang, Huizhi and Kolomeets, Maxim and Shi, Lei},
booktitle={European Conference on Information Retrieval},
year={2024},
organization={Springer}
}
```