https://github.com/dargones/sentence_alignment_tools
A collection of tools for sentence alignment
- Host: GitHub
- URL: https://github.com/dargones/sentence_alignment_tools
- Owner: Dargones
- Created: 2018-06-05T23:20:10.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2019-01-18T14:54:13.000Z (over 7 years ago)
- Last Synced: 2025-04-05T01:51:12.324Z (about 1 year ago)
- Topics: alignment, complex-word-identification, cwi, newsela, nlp, text-simplification
- Language: Python
- Homepage:
- Size: 105 KB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: newselautil.py
README
# Sentence Alignment Tools
This repository contains code for aligning sentences across articles adapted to different readability levels.
The current code is designed to be used specifically with the [Newsela Data](https://newsela.com/data/). It implements a modified version of the algorithm described by [Paetzold and Specia](https://arxiv.org/pdf/1612.04113.pdf) and can be used to obtain datasets for CWI (Complex Word Identification) tasks. Please note that the
newer version of this code is now part of [another repository](https://github.com/seanderson/lexical-simplification/tree/master/src).
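
To give a rough idea of what similarity-based sentence alignment looks like, here is a minimal sketch. It is not the code from this repository and not the full algorithm from the paper: it simply pairs each sentence of a more complex article version with its most similar sentence in a simpler version, using TF-IDF cosine similarity. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `align`), the tokenizer, and the `threshold` default are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def tfidf_vectors(sentences):
    """Build one sparse TF-IDF vector (a dict) per sentence."""
    token_lists = [tokenize(s) for s in sentences]
    # Document frequency: in how many sentences each token occurs.
    doc_freq = Counter(tok for toks in token_lists for tok in set(toks))
    n = len(sentences)
    vectors = []
    for toks in token_lists:
        counts = Counter(toks)
        # Smoothed IDF keeps weights positive even for very common tokens.
        vectors.append({
            tok: count * (math.log((1 + n) / (1 + doc_freq[tok])) + 1)
            for tok, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(tok, 0.0) for tok, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align(complex_sents, simple_sents, threshold=0.3):
    """Greedily pair each complex sentence with its best simple match.

    Returns (complex_index, simple_index, score) tuples. Sentences whose
    best score falls below `threshold` are left unaligned.
    """
    if not simple_sents:
        return []
    vectors = tfidf_vectors(complex_sents + simple_sents)
    complex_vecs = vectors[:len(complex_sents)]
    simple_vecs = vectors[len(complex_sents):]
    pairs = []
    for i, complex_vec in enumerate(complex_vecs):
        scores = [cosine(complex_vec, simple_vec) for simple_vec in simple_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            pairs.append((i, best, scores[best]))
    return pairs
```

Calling `align(complex_sentences, simple_sentences)` on two lists of sentence strings returns the index pairs that could then be compared word by word, for example to find words that were replaced during simplification. A real aligner would also need to handle sentence splits and merges, which this greedy one-to-one sketch ignores.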