https://github.com/pharo-ai/Polyglot
A library for Natural Language Processing
natural-language-processing nlp pharo
- Host: GitHub
- URL: https://github.com/pharo-ai/Polyglot
- Owner: pharo-ai
- License: mit
- Archived: true
- Created: 2019-05-31T12:20:14.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-09-25T12:47:04.000Z (about 1 year ago)
- Last Synced: 2024-05-18T21:52:26.916Z (6 months ago)
- Topics: natural-language-processing, nlp, pharo
- Language: Smalltalk
- Homepage:
- Size: 81.1 KB
- Stars: 7
- Watchers: 6
- Forks: 3
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-pharo-ml - Polyglot - a natural language processing library implemented in Pharo by Nikhil Pinnaparaju during Google Summer of Code. (Natural Language Processing)
README
# Polyglot
[![Build Status](https://travis-ci.org/PolyMathOrg/Polyglot.svg?branch=master)](https://travis-ci.org/PolyMathOrg/Polyglot)
[![Build status](https://ci.appveyor.com/api/projects/status/nk84odcludj242lw?svg=true)](https://ci.appveyor.com/project/nikhilpinnaparaju/Polyglot)
[![Coverage Status](https://coveralls.io/repos/github/PolyMathOrg/Polyglot/badge.svg?branch=master)](https://coveralls.io/github/PolyMathOrg/Polyglot?branch=master)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://raw.githubusercontent.com/PolyMathOrg/Polyglot/master/LICENSE)
[![Pharo version](https://img.shields.io/badge/Pharo-6.1-%23aac9ff.svg)](https://pharo.org/download)
[![Pharo version](https://img.shields.io/badge/Pharo-7.0-%23aac9ff.svg)](https://pharo.org/download)
[![Pharo version](https://img.shields.io/badge/Pharo-8.0-%23aac9ff.svg)](https://pharo.org/download)

This repository is marked as a public archive and will be deleted in the future. It contains duplicated code that can be found in the other pharo-ai repositories. We encourage you to look into the other NLP repositories inside pharo-ai.

A library for Natural Language Processing implemented in Pharo. For more information, check out the [Polyglot Booklet](https://github.com/SquareBracketAssociates/Booklet-Polyglot).

## Installation
To install Polyglot, open the Playground (`Ctrl+OW`) in a fresh Pharo image and execute the following Metacello script (select it and press the Do-it button or `Ctrl+D`):

```smalltalk
"Load the Polyglot baseline from the src directory of the GitHub repository"
Metacello new
  baseline: 'Polyglot';
  repository: 'github://PolyMathOrg/Polyglot/src';
  load.
```

## List of Supported Features
- Tokenization
- N-grams
- Term Frequency-Inverse Document Frequency Scoring
- N-Gram Language Modelling
- Stemming
- Part of Speech Tagging
- Named Entity Recognizer
- Dependency Parser
- Modified Atlas Bridge
- Common Vector Metrics

## Google Summer of Code 2019 Report
### Author: Nikhil Pinnaparaju

Organisation: [Pharo](https://pharo.org/)

Project: [Polyglot](https://github.com/PolyMathOrg/Polyglot)

Mentors: Oleksandr Zaitsev, Alexandre Bergel

A library for Natural Language Processing implemented in Pharo.
## Features Implemented
- Tokenization
- N-grams
- Term Frequency-Inverse Document Frequency Scoring
- N-Gram Language Modelling
- Stemming
- Part of Speech Tagging
- Named Entity Recognizer
- Dependency Parser
- Modified Atlas Bridge
- Common Vector Metrics

## Code Contribution
- [Commits - Polyglot](https://github.com/PolyMathOrg/Polyglot/commits/master?author=nikhilpinnaparaju).
- [Pull Requests - Polyglot](https://github.com/PolyMathOrg/Polyglot/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Anikhilpinnaparaju+).
- [Issues Raised - PolyMath](https://github.com/PolyMathOrg/PolyMath/issues?utf8=%E2%9C%93&q=is%3Aissue+author%3Anikhilpinnaparaju+).
- [Pull Requests - PolyMath](https://github.com/PolyMathOrg/PolyMath/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Anikhilpinnaparaju+).
## Documentation
### Blog Posts
- [Representing Documents as Vectors and Visualizing them Using Polyglot in Pharo](https://medium.com/@nikhilpinnaparaju/representing-documents-as-vectors-and-visualizing-them-using-polyglot-in-pharo-73887e8bb418)
- [Stemming in Polyglot](https://medium.com/@nikhilpinnaparaju/stemming-in-polyglot-2672a349e15)
- [Working with the Atlas Pharo-Python Bridge](https://medium.com/@nikhilpinnaparaju/working-with-the-atlas-pharo-python-bridge-1ad6ba356f7)
- [Polyglot for Large Corpora](https://medium.com/@nikhilpinnaparaju/polyglot-for-large-corpora-71267c525876)
- [Introducing Polyglot](https://link.medium.com/XrrMmBsfPX)
- [Tokenization — GSoC with Pharo Consortium](https://link.medium.com/YlaK5QtfPX)
- [Community Bonding Period — GSoC with Pharo Consortium](https://link.medium.com/WyjlwqwfPX)
- [Architecture Design For an NLP Library](https://link.medium.com/Az8fikxfPX)
- [PCA in Pharo using PolyMath, DataFrame and Roassal](https://link.medium.com/qcRTM0yfPX)
- [My Journey Into Google Summer of Code — 2019](https://link.medium.com/PZ6Zd4zfPX)

### Booklets
- [The Polyglot Booklet](https://github.com/SquareBracketAssociates/Booklet-Polyglot)
- [Documentation for Polyglot](https://github.com/nikhilpinnaparaju/Polyglot-Documentation)

## Project Demonstration/Presentation
- [Polyglot ESUG Presentation V1.0](https://drive.google.com/file/d/18J2bgdrJ6Dhbaxg1-N3hC8-ZD7y0P9yp/view?usp=sharing)
- [Polyglot ESUG Presentation V2.0](https://drive.google.com/file/d/1PgNS1xpwos1txeclhbzFfVlbrbs54pGs/view?usp=sharing)