https://github.com/waikato/wordcontextmatrix
Final Repository for Honours Project
- Host: GitHub
- URL: https://github.com/waikato/wordcontextmatrix
- Owner: Waikato
- License: gpl-3.0
- Created: 2018-08-13T04:44:53.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-11-05T20:11:39.000Z (over 7 years ago)
- Last Synced: 2025-10-10T17:14:18.191Z (9 months ago)
- Topics: java, moa, online-learning
- Language: Java
- Size: 36.1 KB
- Stars: 0
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Learning an Incremental Opinion Lexicon from Twitter Streams
This project takes two inputs: a seed lexicon of words with known polarities, which the
approach assumes is available, and a file of arbitrary size that simulates an input stream.
The program builds a word context matrix by sliding a window over the input stream, and each
row of that matrix serves as a word vector. Once a word has been seen a specified number of
times, its vector is sent to a classifier to be trained or tested.
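
The sliding-window step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name `WordContextSketch`, the `minCount` threshold parameter, and the use of hashing to fold context words into a fixed-size vector are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a streaming word-context matrix (not the project's code).
class WordContextSketch {
    private final int windowSize;        // context words taken on each side of the target
    private final int contextVectorSize; // fixed length of every word vector
    private final int minCount;          // occurrences needed before a word is handed on
    private final Map<String, int[]> vectors = new HashMap<>();
    private final Map<String, Integer> counts = new HashMap<>();

    WordContextSketch(int windowSize, int contextVectorSize, int minCount) {
        this.windowSize = windowSize;
        this.contextVectorSize = contextVectorSize;
        this.minCount = minCount;
    }

    // Processes one tokenized tweet and returns the words that reached
    // minCount occurrences while processing it.
    List<String> update(String[] tokens) {
        List<String> ready = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            String target = tokens[i];
            int[] vector = vectors.computeIfAbsent(target, k -> new int[contextVectorSize]);
            int lo = Math.max(0, i - windowSize);
            int hi = Math.min(tokens.length - 1, i + windowSize);
            for (int j = lo; j <= hi; j++) {
                if (j == i) {
                    continue; // the target is not part of its own context
                }
                // Assumption: context words are hashed into a fixed number of columns.
                vector[Math.floorMod(tokens[j].hashCode(), contextVectorSize)]++;
            }
            int seen = counts.merge(target, 1, Integer::sum);
            if (seen == minCount) {
                ready.add(target); // a real system would pass the vector to a classifier here
            }
        }
        return ready;
    }

    int[] vectorOf(String word) {
        return vectors.get(word);
    }

    public static void main(String[] args) {
        WordContextSketch sketch = new WordContextSketch(1, 8, 2);
        List<String> ready = sketch.update(new String[] {"good", "day", "good"});
        System.out.println("Words ready for the classifier: " + ready);
    }
}
```

Hashing keeps memory bounded regardless of how many distinct context words appear in the stream, at the cost of occasional collisions between columns. Whether the project bounds its vectors this way is not stated in this README.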
The hope is that applying common techniques in this novel way will yield results comparable
to those of established methods while using significantly fewer resources.
## Required files and inputs
* Seed lexicon (words and polarities)
* File for streaming of tweets (the content)
## Usage
Once compiled, the program is run from the command line with the following arguments, in this order:
`[SeedLexicon] [InputFile] [OutputFileName] [VocabularySize] [ContextVectorSize] [WindowSize]`
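
As an illustration of how the six positional arguments might be read, here is a small sketch. The class name `StreamLexiconArgs` and the assumption that the last three arguments are integers are hypothetical and not taken from the project's source.

```java
// Hypothetical holder for the six command-line arguments listed above.
class StreamLexiconArgs {
    final String seedLexicon;
    final String inputFile;
    final String outputFileName;
    final int vocabularySize;
    final int contextVectorSize;
    final int windowSize;

    StreamLexiconArgs(String[] args) {
        if (args.length != 6) {
            throw new IllegalArgumentException(
                "Expected 6 arguments: SeedLexicon InputFile OutputFileName "
                + "VocabularySize ContextVectorSize WindowSize");
        }
        seedLexicon = args[0];
        inputFile = args[1];
        outputFileName = args[2];
        // Assumption: the three size parameters are plain integers.
        vocabularySize = Integer.parseInt(args[3]);
        contextVectorSize = Integer.parseInt(args[4]);
        windowSize = Integer.parseInt(args[5]);
    }

    public static void main(String[] args) {
        StreamLexiconArgs parsed = new StreamLexiconArgs(args);
        System.out.println("Window size: " + parsed.windowSize);
    }
}
```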