https://github.com/wellecks/word_aligner

Word Aligner for Machine Translation
Host: GitHub
URL: https://github.com/wellecks/word_aligner
Owner: wellecks
Created: 2014-01-26T21:11:45.000Z (almost 11 years ago)
Default Branch: master
Last Pushed: 2014-02-07T04:11:50.000Z (almost 11 years ago)
Last Synced: 2024-10-29T20:12:56.642Z (2 months ago)
Language: FORTRAN
Homepage:
Size: 16 MB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

##Word Aligner

####CIS526, Machine Translation, HW1

**Sean Welleck**

This project implements word alignment for machine translation: given a sentence pair in a source and a target language, the models predict which words are translations of each other.

The project contains three models:

- IBM Model 1

- IBM Model 2

- Bayesian Aligner

It also includes symmetrization, which combines the alignments produced by two models.
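To make the first model concrete, here is a minimal sketch of IBM Model 1 trained with EM. It is not the code in ```model.py```: the names ```train_ibm_model1``` and ```align_sentence``` are hypothetical, the NULL word is left out for brevity, and alignment links are assumed to be ```(source_index, target_index)``` pairs.

```python
from collections import defaultdict


def train_ibm_model1(bitext, num_iters=10):
    """Estimate translation probabilities t(e | f) with EM.

    bitext: list of (f_words, e_words) sentence pairs, each a list of tokens.
    Returns a dict-like mapping (e_word, f_word) -> probability.
    Simplification: no NULL source word is modelled.
    """
    e_vocab = {e for _, e_sent in bitext for e in e_sent}
    # Uniform initialisation over the target vocabulary.
    t = defaultdict(lambda: 1.0 / len(e_vocab))

    for _ in range(num_iters):
        count = defaultdict(float)  # expected count of each (e, f) link
        total = defaultdict(float)  # expected count of links leaving f
        for f_sent, e_sent in bitext:
            for e in e_sent:
                # E-step: posterior over which source word generated e.
                norm = sum(t[(e, f)] for f in f_sent)
                for f in f_sent:
                    p = t[(e, f)] / norm
                    count[(e, f)] += p
                    total[f] += p
        # M-step: renormalise the expected counts per source word.
        t = defaultdict(
            float, {(e, f): c / total[f] for (e, f), c in count.items()}
        )
    return t


def align_sentence(t, f_sent, e_sent):
    """Link each target word to its most probable source word."""
    links = []
    for i, e in enumerate(e_sent):
        j = max(range(len(f_sent)), key=lambda k: t[(e, f_sent[k])])
        links.append((j, i))
    return links
```

On a toy corpus such as ```[(["la", "maison"], ["the", "house"]), (["la", "fleur"], ["the", "flower"])]```, the co-occurrence of "la" with "the" in both pairs pulls probability mass away from the competing links, which is what lets "house" attach to "maison".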

Run ```python run_alignment.py > output.txt``` to train the models and output alignments to output.txt.

-----

#####model.py

Contains the model implementations: ```IBMM1()```, ```IBMM2()```, and ```BayesM()```.

Each model extends the ```Model()``` class and must implement the ```train()``` and ```align()``` methods.
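The contract described above could look roughly like the sketch below. The method signatures, the ```(source_index, target_index)``` link format, and the ```DiagonalModel``` baseline are all assumptions made for illustration; the actual ```Model()``` class in ```model.py``` may differ.

```python
from abc import ABC, abstractmethod


class Model(ABC):
    """Hypothetical base class: subclasses must supply train() and align()."""

    @abstractmethod
    def train(self, data, num_iters):
        """Fit parameters on a list of (f_words, e_words) sentence pairs."""

    @abstractmethod
    def align(self, f_sent, e_sent):
        """Return (source_index, target_index) links for one sentence pair."""


class DiagonalModel(Model):
    """Toy baseline: link each target word to the source word at the
    same relative position. It has no parameters to learn."""

    def train(self, data, num_iters=0):
        return self  # nothing to estimate

    def align(self, f_sent, e_sent):
        links = []
        for i in range(len(e_sent)):
            # Scale the target position onto the source sentence length.
            j = min(len(f_sent) - 1, i * len(f_sent) // len(e_sent))
            links.append((j, i))
        return links
```

Using an abstract base class means a subclass that forgets ```train()``` or ```align()``` fails at instantiation rather than partway through a run.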

#####aligner.py

Contains top level functions for using the models:

```python
import aligner

# loading data
data = aligner.load_input(e_file, f_file, num_sents)
```

```python
from model import IBMM1, IBMM2

# training models
ibm_model1 = aligner.train_model(IBMM1(), data, num_iters)
ibm_model2 = aligner.train_model(IBMM2(), data, num_iters)
```

```python
# getting alignments using a trained model
m1_alignments = aligner.align(ibm_model1, data)
m2_alignments = aligner.align(ibm_model2, data)
```

```python
# symmetrizing output from two models
sym_alignments = aligner.symmetrize_all(m1_alignments, m2_alignments)
```
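One common way to symmetrize is to keep only the links that both models agree on (intersection), or every link that either model proposes (union). The sketch below illustrates that idea; it is not necessarily the heuristic ```symmetrize_all``` uses, and the function names and the ```(source_index, target_index)``` link format are assumptions.

```python
def symmetrize(links_a, links_b, mode="intersection"):
    """Combine two alignments of the same sentence pair.

    links_a, links_b: iterables of (source_index, target_index) tuples.
    mode: "intersection" favours precision, "union" favours recall.
    """
    set_a, set_b = set(links_a), set(links_b)
    if mode == "intersection":
        combined = set_a & set_b
    elif mode == "union":
        combined = set_a | set_b
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return sorted(combined)


def symmetrize_corpus(all_a, all_b, mode="intersection"):
    """Apply symmetrize() to each pair of per-sentence alignments."""
    return [symmetrize(a, b, mode) for a, b in zip(all_a, all_b)]
```

Intersection tends to give fewer but more reliable links, while union keeps more links at the cost of more errors.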

```python
# printing alignments
print_output(sym_alignments)
```
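The README does not specify the output format. A widely used convention for word alignments is one line per sentence pair, with each link written as ```i-j```. The sketch below assumes that format and assumes each alignment is a list of ```(source_index, target_index)``` tuples; ```format_alignments``` is a hypothetical helper, not the repository's ```print_output```.

```python
def format_alignments(all_links):
    """Render alignments as text: one sentence pair per line,
    links separated by spaces and written as 'source-target'."""
    lines = []
    for links in all_links:
        lines.append(" ".join(f"{j}-{i}" for j, i in links))
    return "\n".join(lines)
```

Printing the returned string and redirecting stdout would then produce a file in the same spirit as ```output.txt```.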