https://github.com/montrealcorpustools/spade

Anything SPADE-related not covered in another repository, including scripts for analyzing SPADE datasets using PolyglotDB.
https://github.com/montrealcorpustools/spade

Last synced: 7 months ago
JSON representation

Anything SPADE-related not covered in another repository, including scripts for analyzing SPADE datasets using PolyglotDB.

Host: GitHub
URL: https://github.com/montrealcorpustools/spade
Owner: MontrealCorpusTools
License: mit
Created: 2017-09-29T17:48:25.000Z (over 8 years ago)
Default Branch: main
Last Pushed: 2022-12-27T02:17:00.000Z (over 3 years ago)
Last Synced: 2025-05-07T21:03:40.774Z (about 1 year ago)
Language: R
Homepage:
Size: 4.45 MB
Stars: 1
Watchers: 7
Forks: 4
Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

          # SPADE

Basic setup

===========

Linux (Ubuntu is used as the example) is the recommended system for setting up and running

analysis scripts.  Windows 10 can use the Linux Subsystem.

Setting up Python

1. Install Miniconda (https://conda.io/miniconda.html)

2. Create polyglot environment (`conda create -n polyglot python=3.6 numpy scipy`)

3. Activate polyglot environment (Mac/Linux: `source activate polyglot`, Windows: `activate polyglot`)

4. Install PolyglotDB (`pip install polyglotdb pyyaml`)

Ensure Praat is on the path

1. Download Praat if necessary (http://www.fon.hum.uva.nl/praat/download_linux.html, barren edition recommended, rename to praat)

2. The command `which praat` should point to the barren edition downloaded.

Setting up Polyglot-server

==========================

Follow instructions here: http://polyglot-server.readthedocs.io/en/latest/getting_started.html

Use local instance

Follow instructions here: http://polyglotdb.readthedocs.io/en/latest/getting_started.html#set-up-local-database

Running analysis scripts

========================

All meta information about a corpus is stored in YAML files.  These YAML files specify where the corpus is located on your computer, where any enrichment information (from unisyn, or speaker CSV files), and necessary sets of segments for particular analyses (i.e., what are vowels, what are any other syllabic segments, what are the sibilant segments to analyze).

Using Raleigh as example

1. Clone this repo (`git clone https://github.com/MontrealCorpusTools/SPADE.git`) or download (https://github.com/MontrealCorpusTools/SPADE/archive/master.zip)

2. Modify YAML config files to specify correct directories for corpus (see, e.g., Raleigh.yaml in the Raleigh directory), and for `unisyn_spade` repo (see https://github.com/mlml/unisyn_spade)

   for the csv files)

3. In temerminal, `cd /path/to/SPADE`

4. Run formant analysis script (`python formant.py Raleigh`)

5. Run sibilant analysis script (`python sibilant.py Raleigh`)

Running analysis scripts on a new corpus

========================================

1. Make a new directory for the new corpus

2. Create a new yaml file with the same name

3. Populate the fields (corpus_directory, input_format, dialect_code, unisyn_spade_directory, speaker_enrichment_file, speakers, vowel_inventory, stressed_vowels, sibilant_segments)

4. Run as above

Issues

======

If any issues are encountered, first try upgrading the PolyglotDB package (`pip install -U polyglotdb`).

If this does not solve the problem, or if no update is available, please see the Polyglot-users forum (https://groups.google.com/forum/#!forum/polyglot-users)

and post a new question if the issue has not already been addressed.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/montrealcorpustools/spade

Awesome Lists containing this project

README