https://github.com/camertron/tmx-parser

tmx-parser
=================

Parser for the Translation Memory eXchange (.tmx) file format.

## Installation

`gem install tmx-parser`

## Usage

```ruby
require 'tmx-parser'
```

## Functionality

Got a .tmx file you need to parse? Just use the `TmxParser.load` method. It'll return an enumerable `TmxParser::Document` object for your iterating pleasure:

```ruby
doc = TmxParser.load(File.open('path/to/my.tmx'))
doc.each do |unit|
  # ...
end
```

You can also pass a string to `.load`:

```ruby
doc = TmxParser.load(File.read('path/to/my.tmx'))
```

The parser works in a streaming fashion, meaning it tries not to hold the entire source document in memory all at once. It will instead yield each translation unit incrementally.
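To make the streaming idea concrete, here is a minimal sketch of the general technique using REXML's stream listener from Ruby's bundled libraries. It is not this gem's implementation. `SketchUnit`, `SketchTmxListener`, `each_unit_streaming` and the sample data are hypothetical names made up for illustration.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical stand-in for a translation unit; not the gem's own class.
SketchUnit = Struct.new(:tuid, :segments)

# Collects one <tu> at a time and hands it to the callback when </tu> is seen.
class SketchTmxListener
  include REXML::StreamListener

  def initialize(&on_unit)
    @on_unit = on_unit
    @unit = nil
    @locale = nil
    @in_seg = false
  end

  def tag_start(name, attrs)
    case name
    when 'tu'
      @unit = SketchUnit.new(attrs['tuid'], {})
    when 'tuv'
      @locale = attrs['xml:lang']
    when 'seg'
      @in_seg = true
      @unit.segments[@locale] = +''
    end
  end

  # Only text inside a <seg> is kept; whitespace between tags is ignored.
  def text(data)
    @unit.segments[@locale] << data if @in_seg
  end

  def tag_end(name)
    case name
    when 'seg'
      @in_seg = false
    when 'tu'
      @on_unit.call(@unit) # yield the finished unit, then drop it
      @unit = nil
    end
  end
end

# Accepts a String or an IO and yields each unit as soon as it is complete.
def each_unit_streaming(source, &block)
  REXML::Document.parse_stream(source, SketchTmxListener.new(&block))
end

# Illustrative sample data only.
SAMPLE_TMX = <<~XML
  <tmx version="1.4">
    <body>
      <tu tuid="79b371014a8382a3b6efb86ec6ea97d9" segtype="block">
        <tuv xml:lang="en-US"><seg>6 hours</seg></tuv>
        <tuv xml:lang="de-DE"><seg>6 Stunden</seg></tuv>
      </tu>
    </body>
  </tmx>
XML

each_unit_streaming(SAMPLE_TMX) do |unit|
  puts "#{unit.tuid}: #{unit.segments.inspect}"
end
```

Because each unit is handed to the block as soon as its closing tag arrives, only the unit under construction is held by the listener.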

## Translation Units

Translation units are simple Ruby objects that contain properties (tmx `prop` elements) and variants (tmx `tuv` elements). You can also retrieve the tuid (translation unit id) and segtype (segment type). Given this document:

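```xml
<tmx version="1.4">
  <body>
    <tu tuid="79b371014a8382a3b6efb86ec6ea97d9" segtype="block">
      <prop type="x-segment-id">0</prop>
      <prop type="x-some-property">six.hours</prop>
      <tuv xml:lang="en-US"><seg>6 hours</seg></tuv>
      <tuv xml:lang="de-DE"><seg>6 Stunden</seg></tuv>
    </tu>
  </body>
</tmx>
```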

Here's what you can do:

```ruby
doc.each do |unit|
  unit.tuid     # => "79b371014a8382a3b6efb86ec6ea97d9"
  unit.segtype  # => "block"

  unit.properties.keys                   # => ["x-segment-id", "x-some-property"]
  unit.properties['x-segment-id'].value  # => "0"

  variant = unit.variants.first
  variant.locale    # => "en-US"
  variant.elements  # => ["6 hours"]
end
```
```
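If you want a quick lookup table of translations, something like the following might work. This is only a sketch: `StandInUnit` and `StandInVariant` are hypothetical stand-ins that mirror the accessors shown above (`tuid`, `variants`, `locale`, `elements`) so the example runs without a .tmx file, and `translation_table` is a made-up helper, not part of the gem.

```ruby
# Hypothetical stand-ins that mirror only the accessors used in this README;
# they are not the gem's classes.
StandInVariant = Struct.new(:locale, :elements)
StandInUnit = Struct.new(:tuid, :variants)

# Builds { tuid => { locale => text } } from any enumerable of units.
def translation_table(units)
  units.each_with_object({}) do |unit, table|
    table[unit.tuid] = unit.variants.to_h do |variant|
      # Keep only plain strings here; other element types are skipped.
      [variant.locale, variant.elements.grep(String).join]
    end
  end
end

# Illustrative sample data only.
units = [
  StandInUnit.new('79b371014a8382a3b6efb86ec6ea97d9', [
    StandInVariant.new('en-US', ['6 hours']),
    StandInVariant.new('de-DE', ['6 Stunden'])
  ])
]

table = translation_table(units)
puts table['79b371014a8382a3b6efb86ec6ea97d9']['en-US']
```

Since the helper only relies on `each_with_object`, it should accept any enumerable that yields objects with those accessors.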

## Placeholders

Let's consider a different document:

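```xml
<tmx version="1.4">
  <body>
    <tu>
      <prop type="x-segment-id">0</prop>
      <tuv xml:lang="en-US">
        <seg><ph>{0}</ph> sessions</seg>
      </tuv>
      <tuv xml:lang="de-DE">
        <seg><ph>{0}</ph> Einheiten</seg>
      </tuv>
    </tu>
  </body>
</tmx>
```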

The placeholders will be added to the variant's `elements` array:

```ruby
doc.each do |unit|
  variant = unit.variants.first
  variant.elements  # => [#<...>, " sessions"] (a placeholder object, then the text)
end
```

Begin paired tags (tmx `bpt` elements) and end paired tags (tmx `ept` elements) are handled the same way.
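Since `elements` mixes strings with non-string objects, you may want to flatten it back into a single string. Here's one possible sketch. `StandInPlaceholder` and its `text` accessor are hypothetical (this README doesn't show the real placeholder class or its methods), and `render_elements` is a made-up helper that leaves the rendering of non-string elements to a block.

```ruby
# Hypothetical stand-in for whatever object the parser puts in `elements`
# for a <ph>; the real class and its accessors are not shown in this README.
StandInPlaceholder = Struct.new(:text)

# Joins a variant's elements into one string. Strings pass through as-is;
# every other element is handed to the block, which decides how to render it.
def render_elements(elements)
  elements.map { |el| el.is_a?(String) ? el : yield(el) }.join
end

# Illustrative sample data only.
elements = [StandInPlaceholder.new('{0}'), ' sessions']

puts render_elements(elements) { |ph| ph.text } # keep the original marker
puts render_elements(elements) { |_ph| '%d' }   # swap in a printf-style marker
```

Passing a block keeps the helper independent of how placeholders, begin paired tags and end paired tags are actually represented.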

## See Also

* TMX file format: [http://www.gala-global.org/oscarStandards/tmx/tmx14b.html](http://www.gala-global.org/oscarStandards/tmx/tmx14b.html)

## Requirements

No external requirements.

## Running Tests

`bundle exec rspec` should do the trick :)

## Authors

* Cameron C. Dutro: http://github.com/camertron