https://github.com/fbkarsdorp/nnfit
Classifying Evolutionary Forces in Language Change
cultural-evolution drift language-change neural-network
- Host: GitHub
- URL: https://github.com/fbkarsdorp/nnfit
- Owner: fbkarsdorp
- Created: 2020-02-10T14:51:51.000Z
- Default Branch: master
- Last Pushed: 2023-07-06T22:01:49.000Z
- Last Synced: 2024-04-16T04:51:02.402Z
- Topics: cultural-evolution, drift, language-change, neural-network
- Language: Jupyter Notebook
- Homepage: https://doi.org/10.1017/ehs.2020.52
- Size: 5.41 MB
- Stars: 2
- Watchers: 4
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
# [Classifying Evolutionary Forces in Language Change Using Neural Networks](https://doi.org/10.1017/ehs.2020.52)
A fundamental problem in research into language and cultural change is the difficulty of
distinguishing processes of stochastic drift (also known as neutral evolution) from
processes that are subject to certain selection pressures. In this article, we describe a
new technique based on Deep Neural Networks, in which we reformulate the detection of
evolutionary forces in cultural change as a binary classification task. Using Residual
Networks for time series trained on artificially generated samples of cultural change, we
demonstrate that this technique is able to efficiently, accurately and consistently learn
which aspects of the time series are distinctive for drift and selection. We compare the
model with a recently proposed statistical test, the Frequency Increment Test, and show
that the neural time series classification system provides a possible solution to some of
the key problems of this test.
DOI: https://doi.org/10.1017/ehs.2020.52
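
To make the drift-versus-selection setup concrete, here is a minimal sketch that assumes a Wright-Fisher-style simulation and the usual formulation of the Frequency Increment Test (a one-sample t-test on rescaled frequency increments). The function names, the selection coefficient `s`, and all default values are illustrative assumptions rather than the repository's API. The code in `src/` may generate and evaluate time series differently.

```python
import math
import random
import statistics


def simulate_wright_fisher(n_agents=1000, n_steps=50, start=0.5, s=0.0, seed=None):
    """Return variant frequencies over time under a Wright-Fisher-style model.

    `s` is a hypothetical selection coefficient: s == 0 is pure drift,
    s > 0 favours the variant being tracked.
    """
    rng = random.Random(seed)
    freqs = [start]
    for _ in range(n_steps - 1):
        x = freqs[-1]
        # Selection biases the sampling probability; with s == 0 it equals x.
        p = x * (1 + s) / (x * (1 + s) + (1 - x))
        # Each agent independently adopts the variant with probability p.
        count = sum(rng.random() < p for _ in range(n_agents))
        freqs.append(count / n_agents)
    return freqs


def fit_statistic(freqs, times=None):
    """Return (t_statistic, degrees_of_freedom) for the rescaled increments."""
    times = list(range(len(freqs))) if times is None else list(times)
    increments = []
    for i in range(1, len(freqs)):
        prev, curr = freqs[i - 1], freqs[i]
        dt = times[i] - times[i - 1]
        # The rescaling is undefined once the variant is lost or fixed.
        # Simplification assumed here: skip those increments.
        if 0 < prev < 1:
            increments.append((curr - prev) / math.sqrt(2 * prev * (1 - prev) * dt))
    n = len(increments)
    if n < 2:
        raise ValueError("need at least two usable increments")
    sd = statistics.stdev(increments)
    if sd == 0:
        raise ValueError("increments have zero variance")
    # One-sample t statistic against a mean of zero (the drift expectation).
    return statistics.mean(increments) / (sd / math.sqrt(n)), n - 1


drift = simulate_wright_fisher(s=0.0, seed=1)
selection = simulate_wright_fisher(start=0.1, s=0.1, seed=1)
print(fit_statistic(drift))
print(fit_statistic(selection))
```

A large absolute t statistic, judged against Student's t distribution with the returned degrees of freedom, is evidence against pure drift. If SciPy is installed, `scipy.stats.ttest_1samp(increments, 0.0)` gives the corresponding p-value directly.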
## Getting started
See the [supplementary materials](https://doi.org/10.5281/zenodo.4061776) for a brief tutorial
describing how to train your own models.
## Data
Code to reconstruct the past-tense data set can be obtained from
https://github.com/mnewberry/ldrift. To run the past-tense analysis in
`notebooks/past-tense.ipynb`, save the frequency list under `data/coha-past-tense.txt`.
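
Before opening the notebook, it can help to confirm that the frequency list is where the notebook expects it. The helper below is a small sketch, not part of the repository. `missing_inputs` is a hypothetical name, and it assumes you run it from the repository root (or pass the root explicitly).

```python
from pathlib import Path


def missing_inputs(repo_root="."):
    """Return the expected past-tense files that are not present."""
    root = Path(repo_root)
    expected = [
        root / "data" / "coha-past-tense.txt",      # frequency list you save yourself
        root / "notebooks" / "past-tense.ipynb",    # notebook shipped with the repository
    ]
    return [path for path in expected if not path.is_file()]


for path in missing_inputs():
    print(f"missing: {path}")
```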
## Requirements
All code is implemented in Python 3.7. The full list of dependencies is in the
`requirements.txt` file. This repository may change after publication. To work with the
exact code used for the analyses in the paper, download the submission release (v1.0):
https://github.com/fbkarsdorp/nnfit/releases/tag/v1.0
## Training
To train your own models, run `src/train.py` and follow the instructions therein.
---

This work is licensed under a Creative Commons Attribution 4.0 International License.