https://github.com/bluedynamics/vdexcsv

Converts CSV files to IMS VDEX XML (Vocabulary Definition Exchange Format)
Host: GitHub
URL: https://github.com/bluedynamics/vdexcsv
Owner: bluedynamics
License: other
Created: 2011-06-20T17:56:11.000Z (almost 15 years ago)
Default Branch: master
Last Pushed: 2014-10-11T22:10:12.000Z (over 11 years ago)
Last Synced: 2025-07-23T13:34:18.434Z (11 months ago)
Language: Python
Homepage: http://pypi.python.org/pypi/vdexcsv/
Size: 247 KB
Stars: 0
Watchers: 7
Forks: 4
Open Issues: 0
Metadata Files:
- Readme: README.rst
- Changelog: HISTORY.rst
- License: LICENSE.rst
Converter from CSV file to a multilingual IMS VDEX vocabulary XML file
======================================================================

VDEX is a well-standardized format for multilingual vocabularies, ontologies,
and similar data. Writing its XML by hand is tedious, and editor support is
poor. Almost everybody, however, has Excel or at least knows how to create
tables. So let the user create a sheet with one column holding the key of each
term and, for each language, one column holding the translated term value.

A flat vocabulary
-----------------

=== ======= ======== =========
key english german   italian
=== ======= ======== =========
k01 ant     Ameise   formica
k02 bee     Biene    ape
k03 wasp    Wespe    vespa
k04 hornet  Hornisse calabrone
=== ======= ======== =========

As a CSV this looks like::

    "key";"english";"german";"italian"
    "k01";"ant";"Ameise";"formica"
    "k02";"bee";"Biene";"ape"
    "k03";"wasp";"Wespe";"vespa"
    "k04";"hornet";"Hornisse";"calabrone"

Run it through csv2vdex like so::

    csv2vdex insects 'insects,Insekten,insetto' \
             insects.csv insects.xml --languages en,de,it --startrow 1

This results in the following VDEX XML::

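    <vdex xmlns="http://www.imsglobal.org/xsd/imsvdex_v1p0" orderSignificant="true">
      <vocabIdentifier>insects</vocabIdentifier>
      <vocabName>
        <langstring language="en">insects</langstring>
        <langstring language="de">Insekten</langstring>
        <langstring language="it">insetto</langstring>
      </vocabName>
      <term>
        <termIdentifier>k01</termIdentifier>
        <caption>
          <langstring language="en">ant</langstring>
          <langstring language="de">Ameise</langstring>
          <langstring language="it">formica</langstring>
        </caption>
      </term>
      <term>
        <termIdentifier>k02</termIdentifier>
        <caption>
          <langstring language="en">bee</langstring>
          <langstring language="de">Biene</langstring>
          <langstring language="it">ape</langstring>
        </caption>
      </term>
      <term>
        <termIdentifier>k03</termIdentifier>
        <caption>
          <langstring language="en">wasp</langstring>
          <langstring language="de">Wespe</langstring>
          <langstring language="it">vespa</langstring>
        </caption>
      </term>
      <term>
        <termIdentifier>k04</termIdentifier>
        <caption>
          <langstring language="en">hornet</langstring>
          <langstring language="de">Hornisse</langstring>
          <langstring language="it">calabrone</langstring>
        </caption>
      </term>
    </vdex>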

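The mapping from rows to terms can be sketched in a few lines of Python. This is
a minimal illustration under stated assumptions, not the code vdexcsv actually
runs: the function name ``flat_csv_to_vdex`` and its parameters are
hypothetical, the element names and namespace follow the IMS VDEX
specification, and only the standard library is used.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace of IMS VDEX 1.0 (assumed here; set as a plain attribute for brevity).
VDEX_NS = "http://www.imsglobal.org/xsd/imsvdex_v1p0"

INSECTS_CSV = '''"key";"english";"german";"italian"
"k01";"ant";"Ameise";"formica"
"k02";"bee";"Biene";"ape"
"k03";"wasp";"Wespe";"vespa"
"k04";"hornet";"Hornisse";"calabrone"
'''


def flat_csv_to_vdex(csv_text, vocab_id, names, languages,
                     startrow=0, delimiter=";"):
    """Build a flat VDEX element tree from CSV text (hypothetical helper)."""
    root = ET.Element("vdex", {"xmlns": VDEX_NS, "orderSignificant": "true"})
    ET.SubElement(root, "vocabIdentifier").text = vocab_id

    # One langstring per language for the vocabulary name.
    vocab_name = ET.SubElement(root, "vocabName")
    for lang, name in zip(languages, names):
        ET.SubElement(vocab_name, "langstring", {"language": lang}).text = name

    rows = list(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    for row in rows[startrow:]:
        if not row:
            continue  # skip empty lines
        term = ET.SubElement(root, "term")
        # Column 0 holds the key, the following columns one caption per language.
        ET.SubElement(term, "termIdentifier").text = row[0]
        caption = ET.SubElement(term, "caption")
        for lang, value in zip(languages, row[1:]):
            ET.SubElement(caption, "langstring", {"language": lang}).text = value
    return root


root = flat_csv_to_vdex(
    INSECTS_CSV, "insects",
    names=["insects", "Insekten", "insetto"],
    languages=["en", "de", "it"],
    startrow=1,  # skip the header row, like --startrow 1
)
print(ET.tostring(root, encoding="unicode"))
```

The sketch skips the header row the same way ``--startrow 1`` does and pairs
each caption column with the language at the same position in the language
list.
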
A tree vocabulary
-----------------

If we want a tree-like vocabulary, the key defines the level of each term.
Here a dot is used as the delimiter.

===== ====================
key   term value
===== ====================
nwe   North-west of Europe
nwe.1 A. m. iberica
nwe.2 A. m. intermissa
nwe.3 A. m. lihzeni
nwe.4 A. m. mellifera
nwe.5 A. m. sahariensis
swe   South-west of Europe
swe.1 A. m. carnica
swe.2 A. m. cecropia
swe.3 A. m. ligustica
swe.4 A. m. macedonica
swe.5 A. m. ruttneri
swe.6 A. m. sicula
===== ====================

As a CSV it looks like::

    "key";"term value"
    "nwe";"North-west of Europe"
    "nwe.1";"A. m. iberica"
    "nwe.2";"A. m. intermissa"
    "nwe.3";"A. m. lihzeni"
    "nwe.4";"A. m. mellifera"
    "nwe.5";"A. m. sahariensis"
    "swe";"South-west of Europe"
    "swe.1";"A. m. carnica"
    "swe.2";"A. m. cecropia"
    "swe.3";"A. m. ligustica"
    "swe.4";"A. m. macedonica"
    "swe.5";"A. m. ruttneri"
    "swe.6";"A. m. sicula"

Run it through csv2vdex like so::

    csv2vdex beeeurope 'European Honey Bees' bees.csv bees.xml -r 1

The result is::

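    <vdex xmlns="http://www.imsglobal.org/xsd/imsvdex_v1p0" orderSignificant="true">
      <vocabIdentifier>beeeurope</vocabIdentifier>
      <vocabName>
        <langstring language="en">European Honey Bees</langstring>
      </vocabName>
      <term>
        <termIdentifier>nwe</termIdentifier>
        <caption>
          <langstring language="en">North-west of Europe</langstring>
        </caption>
        <term>
          <termIdentifier>nwe.1</termIdentifier>
          <caption>
            <langstring language="en">A. m. iberica</langstring>
          </caption>
        </term>
        <term>
          <termIdentifier>nwe.2</termIdentifier>
          <caption>
            <langstring language="en">A. m. intermissa</langstring>
          </caption>
        </term>
        <term>
          <termIdentifier>nwe.3</termIdentifier>
          <caption>
            <langstring language="en">A. m. lihzeni</langstring>
          </caption>
        </term>
        <term>
          <termIdentifier>nwe.4</termIdentifier>
          <caption>
            <langstring language="en">A. m. mellifera</langstring>
          </caption>
        </term>
        <term>
          <termIdentifier>nwe.5</termIdentifier>
          <caption>
            <langstring language="en">A. m. sahariensis</langstring>
          </caption>
        </term>
      </term>
      <term>
        <termIdentifier>swe</termIdentifier>
        <caption>
          <langstring language="en">South-west of Europe</langstring>
        </caption>
        <term>
          <termIdentifier>swe.1</termIdentifier>
          <caption>
            <langstring language="en">A. m. carnica</langstring>
          </caption>
        </term>
        <term>
          <termIdentifier>swe.2</termIdentifier>
          <caption>
            <langstring language="en">A. m. cecropia</langstring>
          </caption>
        </term>
        <term>
          <termIdentifier>swe.3</termIdentifier>
          <caption>
            <langstring language="en">A. m. ligustica</langstring>
          </caption>
        </term>
        <term>
          <termIdentifier>swe.4</termIdentifier>
          <caption>
            <langstring language="en">A. m. macedonica</langstring>
          </caption>
        </term>
        <term>
          <termIdentifier>swe.5</termIdentifier>
          <caption>
            <langstring language="en">A. m. ruttneri</langstring>
          </caption>
        </term>
        <term>
          <termIdentifier>swe.6</termIdentifier>
          <caption>
            <langstring language="en">A. m. sicula</langstring>
          </caption>
        </term>
      </term>
    </vdex>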

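How a delimited key turns into a position in the tree can be sketched as
follows. This is an illustrative approach, not necessarily how vdexcsv
implements it: the function name ``build_tree`` and the dictionary layout are
hypothetical, and the sketch assumes that every parent row appears before its
children, as in the table above.

```python
def build_tree(rows, treedelimiter="."):
    """Nest (key, caption) pairs by splitting each key on the delimiter.

    Hypothetical helper: assumes a parent row always precedes its children.
    """
    root = {"key": None, "caption": None, "children": []}
    index = {(): root}  # maps a key path to its node
    for key, caption in rows:
        path = tuple(key.split(treedelimiter))
        node = {"key": key, "caption": caption, "children": []}
        # The parent is the node whose path is this path minus its last part.
        parent = index[path[:-1]]  # raises KeyError if the parent row is missing
        parent["children"].append(node)
        index[path] = node
    return root["children"]


rows = [
    ("nwe", "North-west of Europe"),
    ("nwe.1", "A. m. iberica"),
    ("nwe.2", "A. m. intermissa"),
    ("swe", "South-west of Europe"),
    ("swe.1", "A. m. carnica"),
]
tree = build_tree(rows)
for top in tree:
    print(top["key"], [child["key"] for child in top["children"]])
```

Keeping an index from key path to node means each row is placed with a single
dictionary lookup; a missing parent row surfaces as a ``KeyError`` rather than
a silently misplaced term.
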
A tree vocabulary with descriptions
-----------------------------------

================== ================ ===================================================
key                english          description
================== ================ ===================================================
field_work_terms   Field work terms
field_work_terms.1 Acidification    Acidification is a process. It happens naturall ...
field_work_terms.2 Aquifer          If you get a shovel and dig at the ground below ...
field_work_terms.3 Biodiversity     This has many contentious meanings but for our ...
================== ================ ===================================================

As a CSV this looks like::

    field_work_terms,Field work terms,
    field_work_terms.1,Acidification,"Acidification is a process. It happens naturally ..."
    field_work_terms.2,Aquifer,"If you get a shovel and dig at the ground below your ..."
    field_work_terms.3,Biodiversity,"This has many contentious meanings but for our ..."

Run it through csv2vdex like so::

    csv2vdex --description True --csvdelimiter "," terms "Terminology" terms.csv terms.xml

This results in the following VDEX XML::

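    <vdex xmlns="http://www.imsglobal.org/xsd/imsvdex_v1p0" orderSignificant="true">
      <vocabIdentifier>terms</vocabIdentifier>
      <vocabName>
        <langstring language="en">Terminology</langstring>
      </vocabName>
      <term>
        <termIdentifier>field_work_terms</termIdentifier>
        <caption>
          <langstring language="en">Field work terms</langstring>
        </caption>
        <description>
          <langstring language="en"/>
        </description>
        <term>
          <termIdentifier>field_work_terms.1</termIdentifier>
          <caption>
            <langstring language="en">Acidification</langstring>
          </caption>
          <description>
            <langstring language="en">Acidification is a process. It happens naturally ...</langstring>
          </description>
        </term>
        <term>
          <termIdentifier>field_work_terms.2</termIdentifier>
          <caption>
            <langstring language="en">Aquifer</langstring>
          </caption>
          <description>
            <langstring language="en">If you get a shovel and dig at the ground below your ...</langstring>
          </description>
        </term>
        <term>
          <termIdentifier>field_work_terms.3</termIdentifier>
          <caption>
            <langstring language="en">Biodiversity</langstring>
          </caption>
          <description>
            <langstring language="en">This has many contentious meanings but for our ...</langstring>
          </description>
        </term>
      </term>
    </vdex>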

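With descriptions enabled, each language occupies two columns instead of one.
The column arithmetic might look like the sketch below. The helper
``split_row`` is hypothetical, and the caption-then-description ordering
within each language pair is an assumption based on the wording of the help
text, not on vdexcsv's source.

```python
def split_row(row, languages, keycolumn=0, startcolumn=1, description=False):
    """Split one CSV row into key, captions and descriptions per language.

    Hypothetical helper; assumes caption comes before description in each pair.
    """
    step = 2 if description else 1  # columns used per language
    key = row[keycolumn]
    captions, descriptions = {}, {}
    for i, lang in enumerate(languages):
        col = startcolumn + i * step
        # Missing trailing cells are treated as empty strings.
        captions[lang] = row[col] if col < len(row) else ""
        if description:
            descriptions[lang] = row[col + 1] if col + 1 < len(row) else ""
    return key, captions, descriptions


row = ["field_work_terms.1", "Acidification", "Acidification is a process."]
key, captions, descriptions = split_row(row, ["en"], description=True)
print(key, captions, descriptions)
```

For two languages the same arithmetic reads columns 1 and 2 for the first
language and columns 3 and 4 for the second, counting from 0 with the default
start column of 1.
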
Help Text
=========

::

    $ ./bin/csv2vdex --help
    usage: csv2vdex [-h] [--languages [LANGUAGES]] [--startrow [STARTROW]]
                    [--description [DESCRIPTION]] [--keycolumn [KEYCOLUMN]]
                    [--startcolumn [STARTCOLUMN]]
                    [--ordered [ORDERED]] [--dialect [DIALECT]]
                    [--csvdelimiter [CSVDELIMITER]]
                    [--treedelimiter [TREEDELIMITER]] [--encoding [ENCODING]]
                    id name source target

    Converts CSV files to VDEX XML

    positional arguments:
      id                    unique identifier of vocabulary
      name                  Human readable name of vocabulary. If more than one
                            language is given separate each langstring by a comma
                            and provide same order as argument --languages
      source                CSV file to read from
      target                XML target file

    optional arguments:
      -h, --help            show this help message and exit
      --languages [LANGUAGES], -l [LANGUAGES]
                            Comma separated list of ISO-language codes. Default:
                            en
      --description
                            Whether the terms have descriptions. If so, each term takes
                            up two columns per language: one for the caption and one for
                            the description.
      --startrow [STARTROW], -r [STARTROW]
                            number of row in CSV file where to begin reading,
                            starts with 0, default 0.
      --keycolumn [KEYCOLUMN], -k [KEYCOLUMN]
                            number of column with the keys of the vocabulary,
                            start with 0, default 0.
      --startcolumn [STARTCOLUMN], -s [STARTCOLUMN]
                            number of column with the first langstring of the
                            vocabulary. It assumes n + number languages of columns
                            after this, starts counting with 0, default 1.
                            If terms include description, it assumes two columns
                            per language.
      --ordered [ORDERED], -o [ORDERED]
                            Whether vocabulary is ordered or not, Default: True
      --dialect [DIALECT]   CSV dialect, default excel.
      --csvdelimiter [CSVDELIMITER]
                            CSV delimiter of the source file, default semicolon.
      --treedelimiter [TREEDELIMITER]
                            Delimiter used to split the key the vocabulary into a
                            path to determine the position in the tree, default
                            dot.
      --encoding [ENCODING], -e [ENCODING]
                            Encoding of input file. Default: utf-8

  

Source Code
===========

.. image:: https://travis-ci.org/bluedynamics/vdexcsv.png?branch=master
   :target: https://travis-ci.org/bluedynamics/vdexcsv

The sources are kept in a Git repository, with its main branches on
`GitHub <https://github.com/bluedynamics/vdexcsv>`_.

We'd be happy to see many forks and pull requests to make vdexcsv even better.

Contributors
============

- Jens W. Klein
- Peter Holzer
- Jean Jordaan