https://github.com/nicolay-r/AREkit

Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML

batching bulk-operation frames language-model nlp pipelines pipelines-library relation-extraction relationship-extraction sentiment-analysis

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/nicolay-r/AREkit
Owner: nicolay-r
License: mit
Created: 2019-12-03T20:20:46.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2025-01-18T11:40:41.000Z (6 months ago)
Last Synced: 2025-03-30T06:04:43.561Z (3 months ago)
Topics: batching, bulk-operation, frames, language-model, nlp, pipelines, pipelines-library, relation-extraction, relationship-extraction, sentiment-analysis
Language: Python
Homepage: https://nicolay-r.github.io/arekit-page/
Size: 22.4 MB
Stars: 63
Watchers: 5
Forks: 3
Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-sentiment-attitude-extraction - [github - applicable-paper]](https://arxiv.org/pdf/2006.13730.pdf) (Frameworks)

README

# AREkit 0.25.1

![](https://img.shields.io/badge/Python-3.9+-brightgreen.svg)

[![PyPI downloads](https://img.shields.io/pypi/dm/arekit.svg)](https://pypistats.org/packages/arekit)



    



**AREkit** (Attitude and Relation Extraction Toolkit) is a Python toolkit for document-level attitude and relation extraction between text objects in mass-media news.

## Description

This toolkit aims at memory-efficient data processing in [Relation Extraction (RE)](https://nlpprogress.com/english/relationship_extraction.html) related tasks.



    



> Figure: AREkit pipelines design. For more details, see the **[ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction](https://link.springer.com/chapter/10.1007/978-3-031-56069-9_23)** paper.

In particular, this framework provides the following features:

* ➿ [pipelines](https://github.com/nicolay-r/AREkit/wiki/Pipelines:-Text-Opinion-Annotation) and iterators for serializing large-scale collections without out-of-memory issues,
* 🔗 EL (entity-linking) API support for objects,
* ➰ avoidance of cyclic connections,
* :straight_ruler: distance consideration between relation participants (in `terms` or `sentences`),
* 📑 relation annotation and filtering rules,
* *️⃣ entity formatting or masking, and more.
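Several of the features above (lazy iteration, distance limits, cycle avoidance, and entity masking) can be illustrated with a minimal, library-independent sketch. It does not use the AREkit API: the `Entity` class, the function names, and the `[SUBJECT]`/`[OBJECT]` mask tokens are all hypothetical, and "cyclic connection" is interpreted here simply as pairing an entity with itself.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity mention: its text and its term (token) index."""
    value: str
    term_index: int


def iter_pairs(entities: List[Entity],
               max_term_distance: int) -> Iterator[Tuple[Entity, Entity]]:
    """Lazily yield ordered (source, target) pairs within the distance limit.

    permutations() only combines distinct list positions, so an entity is
    never paired with itself (the assumed meaning of a cyclic connection).
    """
    for source, target in permutations(entities, 2):
        if abs(source.term_index - target.term_index) <= max_term_distance:
            yield source, target


def mask_pair(terms: List[str], source: Entity, target: Entity) -> List[str]:
    """Return a copy of the terms with both relation participants masked."""
    masked = list(terms)
    masked[source.term_index] = "[SUBJECT]"
    masked[target.term_index] = "[OBJECT]"
    return masked


def iter_samples(sentences: Iterable[Tuple[List[str], List[Entity]]],
                 max_term_distance: int = 5) -> Iterator[List[str]]:
    """Stream masked samples one at a time instead of building a full list."""
    for terms, entities in sentences:
        for source, target in iter_pairs(entities, max_term_distance):
            yield mask_pair(terms, source, target)


sentences = [
    (["Alice", "criticized", "Bob", "during", "the", "long", "meeting", "with", "Carol"],
     [Entity("Alice", 0), Entity("Bob", 2), Entity("Carol", 8)]),
]

# list() is used only to inspect the result; a real consumer would iterate.
samples = list(iter_samples(sentences, max_term_distance=3))
for sample in samples:
    print(" ".join(sample))
```

Because every stage is a generator, only one sentence and one sample are held in memory at a time, which is the general idea behind processing large collections without out-of-memory issues.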

The core functionality includes:

* API for document presentation with EL (Entity Linking, i.e. Object Synonymy) support for preparing sentence-level relations (referred to as contexts);
* API for context extraction;
* Transferring relations from the sentence level to the document level, and more.
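As a rough illustration of the last point, sentence-level (context) labels can be aggregated into one label per entity pair for a document. The sketch below uses majority voting, which is an assumption made for illustration only; it is not the AREkit API, and the toolkit's own transfer strategy may differ. The function name and the label strings are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]


def to_document_level(context_labels: Iterable[Tuple[str, str, str]]) -> Dict[Pair, str]:
    """Aggregate (source, target, label) context records by majority vote.

    On a tie, the label seen first for that pair wins, because
    Counter.most_common() preserves insertion order among equal counts.
    """
    votes: Dict[Pair, Counter] = defaultdict(Counter)
    for source, target, label in context_labels:
        votes[(source, target)][label] += 1
    return {pair: counter.most_common(1)[0][0] for pair, counter in votes.items()}


# Hypothetical sentence-level predictions for a single document.
contexts = [
    ("Alice", "Bob", "negative"),
    ("Alice", "Bob", "negative"),
    ("Alice", "Bob", "positive"),
    ("Bob", "Carol", "positive"),
]

document_relations = to_document_level(contexts)
print(document_relations)
```

The input can be any iterable, including a generator, so the aggregation keeps only per-pair vote counts in memory.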

## Installation 

```bash
pip install git+https://github.com/nicolay-r/[email protected]
```

## Usage

Please follow the **[tutorial section on the project Wiki](https://github.com/nicolay-r/AREkit/wiki/Tutorials)** for more details.

## How to cite

Great research is accompanied by faithful references. If you use or extend our work, please cite it as follows:

```bibtex
@inproceedings{rusnachenko2024arelight,
  title={ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction},
  author={Rusnachenko, Nicolay and Liang, Huizhi and Kolomeets, Maxim and Shi, Lei},
  booktitle={European Conference on Information Retrieval},
  year={2024},
  organization={Springer}
}
```
