https://github.com/nicolay-r/AREkit

Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML

bert datasets frames language-models neural-networks nlp pandas pandas-dataframe prompt prompting relation-extraction sentiment-analysis tensorflow


# AREkit 0.25.0

![](https://img.shields.io/badge/Python-3.9+-brightgreen.svg)



**AREkit** (Attitude and Relation Extraction Toolkit) is a Python toolkit for document-level Attitude and Relation Extraction between text objects in mass-media news.

## Description

This toolkit aims at memory-efficient data processing in Relation Extraction (RE) tasks.
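As a rough illustration of what memory-efficient processing can mean in practice, the sketch below chains plain Python generators so that only one small batch of sentences is held in memory at a time. This is not AREkit's API: the function names `read_documents`, `split_sentences`, and `batched` are hypothetical, and the sentence splitter is deliberately naive.

```python
from typing import Iterable, Iterator, List


def read_documents(texts: Iterable[str]) -> Iterator[str]:
    """Yield documents one at a time instead of loading the whole collection."""
    for text in texts:
        yield text


def split_sentences(docs: Iterable[str]) -> Iterator[str]:
    """Naive splitter on '. ' (an assumption; a real pipeline would use a tokenizer)."""
    for doc in docs:
        for sentence in doc.split(". "):
            sentence = sentence.strip().rstrip(".")
            if sentence:
                yield sentence


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Group a lazy stream into lists of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch
```

Because every stage is a generator, `batched(split_sentences(read_documents(source)), 32)` pulls documents from `source` only as batches are consumed, so `source` can itself be a lazy reader over files on disk.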



> Figure: AREkit pipeline design. For more details, see the paper
> **[ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction](https://www.ecir2024.org/accepted-paper/)**.

In particular, the framework provides the following features:
* ➿ [pipelines](https://github.com/nicolay-r/AREkit/wiki/Pipelines:-Text-Opinion-Annotation) and iterators for serializing large-scale collections without out-of-memory issues,
* 🔗 EL (entity-linking) API support for objects,
* ➰ avoidance of cyclic connections,
* 📏 distance consideration between relation participants (in `terms` or `sentences`),
* 📑 relation annotation and filtering rules,
* *️⃣ entity formatting or masking, and more.
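To make the distance and masking features more concrete, here is a small self-contained sketch in plain Python. It does not use AREkit's API: the `Entity` class, `candidate_pairs`, `mask_terms`, and the `[SUBJECT]`/`[OBJECT]` mask strings are all hypothetical names chosen for illustration. Pairs are built only from distinct entities, which is one simple reading of avoiding self-referencing connections, and distance is measured in terms.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class Entity:
    value: str
    term_index: int  # position of the entity among the sentence terms


def candidate_pairs(
    entities: Sequence[Entity], max_terms: int
) -> Iterator[Tuple[Entity, Entity]]:
    """Yield ordered (source, target) pairs that are at most `max_terms` apart."""
    # permutations(..., 2) never pairs an element with itself
    for src, tgt in permutations(entities, 2):
        if abs(src.term_index - tgt.term_index) <= max_terms:
            yield src, tgt


def mask_terms(
    terms: Sequence[str],
    src: Entity,
    tgt: Entity,
    src_mask: str = "[SUBJECT]",
    tgt_mask: str = "[OBJECT]",
) -> List[str]:
    """Return a copy of `terms` with both relation participants masked."""
    masked = list(terms)
    masked[src.term_index] = src_mask
    masked[tgt.term_index] = tgt_mask
    return masked
```

Masking the participants keeps a downstream model from memorizing entity names, while the distance limit keeps the number of candidate pairs manageable for long sentences.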

The core functionality includes:
* API for document presentation with EL (Entity Linking, i.e. Object Synonymy) support
for preparing sentence-level relations (referred to as contexts);
* API for context extraction;
* Transfer of relations from the sentence level to the document level, and more.
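One way to picture the transfer from sentence-level contexts to document-level relations is label aggregation over entity pairs, with synonymous mentions merged first. The sketch below uses majority voting, which is an assumption made for illustration and not necessarily the strategy AREkit applies; `to_document_level` and its arguments are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Context = Tuple[str, str, str]  # (source mention, target mention, label)


def to_document_level(
    contexts: Iterable[Context], synonyms: Dict[str, str]
) -> Dict[Tuple[str, str], str]:
    """Aggregate sentence-level labels into one label per entity pair.

    `synonyms` maps a mention to its canonical name (a stand-in for entity
    linking); mentions missing from the mapping are used as-is.
    """
    votes: Dict[Tuple[str, str], Counter] = defaultdict(Counter)
    for source, target, label in contexts:
        pair = (synonyms.get(source, source), synonyms.get(target, target))
        votes[pair][label] += 1
    # Majority vote per pair (assumed aggregation rule)
    return {pair: counts.most_common(1)[0][0] for pair, counts in votes.items()}
```

Merging mentions before voting is what makes the result document-level: contexts that mention "United States" and "USA" contribute to the same pair once they share a canonical name.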

## Installation

```bash
pip install git+https://github.com/nicolay-r/[email protected]
```

## Usage

Please follow the **[tutorial section on the project Wiki](https://github.com/nicolay-r/AREkit/wiki/Tutorials)** for more details.

## How to cite
Good research is accompanied by faithful references.
If you use or extend our work, please cite it as follows:

```bibtex
@inproceedings{rusnachenko2024arelight,
title={ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction},
author={Rusnachenko, Nicolay and Liang, Huizhi and Kolomeets, Maxim and Shi, Lei},
booktitle={European Conference on Information Retrieval},
year={2024},
organization={Springer}
}
```