https://github.com/madrugado/language-identification

A few models for language identification and code-switching tasks

This is my attempt at language identification.

### Problems

1. Given a sentence, identify which of three languages it is written in: Spanish (ES), Portuguese (PT-PT) or English (EN).

2. Given a sentence, tell apart two variants of Portuguese: European Portuguese (PT-PT) and Brazilian Portuguese (PT-BR).

3. Given tweets written in a mix of English and Spanish, assign each token in a tweet to one of the classes 'en', 'es' or 'other'.
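To make the sentence-level setup of problems 1 and 2 concrete, here is a minimal sketch of a character trigram baseline. It is not the model used in this repository; the function names (`char_ngrams`, `train`, `identify`) and the three training sentences are made up for illustration, and a real system would need far more data.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Count character n-grams of a lowercased, space-padded sentence."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def train(samples):
    """Build one n-gram count profile per language label."""
    profiles = {}
    for label, sentence in samples:
        profiles.setdefault(label, Counter()).update(char_ngrams(sentence))
    return profiles

def identify(sentence, profiles):
    """Return the label whose profile overlaps most with the sentence."""
    grams = char_ngrams(sentence)

    def score(label):
        profile = profiles[label]
        total = sum(profile.values())
        # Weight each shared n-gram by its relative frequency in the profile.
        return sum(count * profile[g] / total for g, count in grams.items())

    return max(profiles, key=score)

# Hypothetical toy training data, one sentence per language.
SAMPLES = [
    ("EN", "the quick brown fox jumps over the lazy dog"),
    ("ES", "el rápido zorro marrón salta sobre el perro perezoso"),
    ("PT-PT", "a rápida raposa marrom salta sobre o cão preguiçoso"),
]

profiles = train(SAMPLES)
print(identify("the dog jumps", profiles))
print(identify("el perro salta", profiles))
```

The same interface would apply to problem 2 by training on PT-PT and PT-BR sentences instead, although distinguishing close variants usually needs much richer features than a handful of trigrams.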

There is additional information on problems 1 and 2 in the [readme](./langid/README.md) under the langid folder, and on the third problem in the [readme](./code-switching/README.md) under the code-switching folder.
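Problem 3 differs from the first two in that the output is one label per token rather than one per sentence. The sketch below only illustrates that input and output shape with a naive word-list lookup. The lexicons and the `label_tokens` helper are hypothetical and are not taken from this repository.

```python
# Hypothetical toy lexicons, for illustration only.
EN_WORDS = {"i", "love", "this", "song", "the", "is", "my"}
ES_WORDS = {"me", "encanta", "esta", "canción", "la", "es", "mi"}

def label_tokens(tweet):
    """Label each whitespace-separated token as 'en', 'es' or 'other'."""
    labels = []
    for token in tweet.split():
        # Normalise case and drop surrounding punctuation before lookup.
        word = token.lower().strip(".,!?")
        if word in EN_WORDS:
            label = "en"
        elif word in ES_WORDS:
            label = "es"
        else:
            # Mentions, hashtags, URLs and unknown words fall through here.
            label = "other"
        labels.append((token, label))
    return labels

for token, label in label_tokens("I love esta canción @maria"):
    print(token, label)
```

A lookup like this cannot handle words shared by both languages or words outside the lists, which is why the task is normally treated as sequence labelling with context.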