Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/quadrismegistus/cadence
Rhythm analysis toolkit in Python
- Host: GitHub
- URL: https://github.com/quadrismegistus/cadence
- Owner: quadrismegistus
- License: mit
- Created: 2021-02-19T15:18:45.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2023-09-29T00:11:12.000Z (about 1 year ago)
- Last Synced: 2024-10-04T16:17:52.580Z (about 2 months ago)
- Topics: nlp, python, rhythm
- Language: Jupyter Notebook
- Homepage:
- Size: 3.96 MB
- Stars: 12
- Watchers: 5
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.ipynb
- License: LICENSE
README
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cadence\n",
"\n",
"A rhythm analysis toolkit, gathering multiple parsing engines:\n",
"* [Prosodic](https://github.com/quadrismegistus/prosodic) for fast English and Finnish metrical scansion.\n",
"* Cadence itself for slower but exhaustive, MaxEnt-able metrical scansion."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Quickstart"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install\n",
"\n",
"#### 1. Install python package\n",
"```\n",
"# install from pypi\n",
"pip install -U cadences # \"cadence\" was taken :-/\n",
"\n",
"# or install the very latest version from GitHub\n",
"pip install -U git+https://github.com/quadrismegistus/cadence\n",
"```\n",
"\n",
"#### 2. Install espeak (TTS)\n",
"\n",
"Install espeak, a free text-to-speech (TTS) program, to 'sound out' unknown words. See [here](http://espeak.sourceforge.net/download.html) for all downloads. On Linux or macOS, you can use:\n",
"```\n",
"apt-get install espeak # linux\n",
"brew install espeak # mac\n",
"```\n",
"If you're on a Mac and don't have Homebrew installed, get it [here](https://brew.sh/)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# this should work following installation\n",
"import cadence as cd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load texts"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# Note: this text is Shakespeare's Sonnet 28 (XXVIII), despite the variable name\n",
"sonnetXIV = \"\"\"\n",
"How can I then return in happy plight,\n",
"That am debarred the benefit of rest?\n",
"When day’s oppression is not eased by night,\n",
"But day by night and night by day oppressed,\n",
"And each, though enemies to either’s reign,\n",
"Do in consent shake hands to torture me,\n",
"The one by toil, the other to complain\n",
"How far I toil, still farther off from thee.\n",
"I tell the day, to please him thou art bright,\n",
"And dost him grace when clouds do blot the heaven:\n",
"So flatter I the swart-complexiond night,\n",
"When sparkling stars twire not thou gildst the even.\n",
"But day doth daily draw my sorrows longer,\n",
"And night doth nightly make grief’s length seem stronger.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# These two calls are equivalent\n",
"sonnet = cd.Verse(sonnetXIV)\n",
"sonnet = cd.Text(sonnetXIV, linebreaks=True, phrasebreaks=False)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th colspan=\"6\"></th><th>word_ispunc</th></tr>\n",
"<tr><th>para_i</th><th>sent_i</th><th>sentpart_i</th><th>line_i</th><th>word_i</th><th>word_str</th><th></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>1</th><th>1</th><th>1</th><th>1</th><th>1</th><th>How</th><td>0</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>2</th><th>can</th><td>0</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>3</th><th>I</th><td>0</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>4</th><th>then</th><td>0</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>5</th><th>return</th><td>0</td></tr>\n",
"<tr><th></th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><td>...</td></tr>\n",
"<tr><th></th><th>4</th><th>17</th><th>14</th><th>15</th><th>grief's</th><td>0</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>16</th><th>length</th><td>0</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>17</th><th>seem</th><td>0</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>18</th><th>stronger</th><td>0</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>19</th><th>.</th><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>135 rows × 1 columns</p>\n",
"</div>"
],
"text/plain": [
" word_ispunc\n",
"para_i sent_i sentpart_i line_i word_i word_str \n",
"1 1 1 1 1 How 0\n",
" 2 can 0\n",
" 3 I 0\n",
" 4 then 0\n",
" 5 return 0\n",
"... ...\n",
" 4 17 14 15 grief's 0\n",
" 16 length 0\n",
" 17 seem 0\n",
" 18 stronger 0\n",
" 19 . 1\n",
"\n",
"[135 rows x 1 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Tokenize\n",
"sonnet.words()"
]
},
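The `word_ispunc` column above is 1 for punctuation tokens and 0 for words. As a rough illustration of that idea only, here is a minimal standalone sketch using Python's `re` module. The function name `tokenize_with_punc_flag` and the regular expression are assumptions made for this example; this is not Cadence's actual tokenizer, which may split text differently.

```python
import re

# Hypothetical pattern: a run of letters (optionally joined by ' or -),
# or any single character that is neither whitespace nor a word character.
# Digits and underscores are simply skipped in this sketch.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*|[^\w\s]")


def tokenize_with_punc_flag(line):
    """Return (token, ispunc) pairs; ispunc is 1 for punctuation, else 0."""
    line = line.replace("\u2019", "'")  # normalise curly apostrophes
    return [(tok, 0 if tok[0].isalpha() else 1) for tok in TOKEN_RE.findall(line)]


for token, ispunc in tokenize_with_punc_flag("make grief\u2019s length seem stronger."):
    print(token, ispunc)
```

Keeping punctuation as separate, flagged tokens (rather than discarding it) matches the shape of the table above, where the final `.` appears as its own row.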
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th colspan=\"14\"></th><th>prom_strength</th><th>prom_stress</th><th>prom_weight</th><th>word_isfunc</th><th>word_ispunc</th><th>word_nsyll</th></tr>\n",
"<tr><th>para_i</th><th>sent_i</th><th>sentpart_i</th><th>line_i</th><th>word_i</th><th>word_str</th><th>word_tok</th><th>word_ipa_i</th><th>word_ipa</th><th>syll_i</th><th>syll_str</th><th>syll_ipa</th><th>syll_stress</th><th>syll_weight</th><th colspan=\"6\"></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>1</th><th>1</th><th>1</th><th>1</th><th>1</th><th>How</th><th>how</th><th>1</th><th>haʊ</th><th>1</th><th>How</th><th>haʊ</th><th>U</th><th>H</th><td>NaN</td><td>0.0</td><td>1.0</td><td>1.0</td><td>0</td><td>1</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>2</th><th>can</th><th>can</th><th>1</th><th>kæn</th><th>1</th><th>can</th><th>kæn</th><th>U</th><th>H</th><td>NaN</td><td>0.0</td><td>1.0</td><td>1.0</td><td>0</td><td>1</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>3</th><th>I</th><th>i</th><th>1</th><th>'aɪ</th><th>1</th><th>I</th><th>'aɪ</th><th>P</th><th>H</th><td>1.0</td><td>1.0</td><td>1.0</td><td>0.0</td><td>0</td><td>1</td></tr>\n",
"<tr><th colspan=\"7\"></th><th>2</th><th>aɪ</th><th>1</th><th>I</th><th>aɪ</th><th>U</th><th>H</th><td>0.0</td><td>0.0</td><td>1.0</td><td>1.0</td><td>0</td><td>1</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>4</th><th>then</th><th>then</th><th>1</th><th>'ðɛn</th><th>1</th><th>then</th><th>'ðɛn</th><th>P</th><th>H</th><td>1.0</td><td>1.0</td><td>1.0</td><td>0.0</td><td>0</td><td>1</td></tr>\n",
"<tr><th></th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th></th><th>4</th><th>17</th><th>14</th><th>16</th><th>length</th><th>length</th><th>1</th><th>'lɛŋkθ</th><th>1</th><th>length</th><th>'lɛŋkθ</th><th>P</th><th>H</th><td>NaN</td><td>1.0</td><td>1.0</td><td>0.0</td><td>0</td><td>1</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>17</th><th>seem</th><th>seem</th><th>1</th><th>'siːm</th><th>1</th><th>seem</th><th>'siːm</th><th>P</th><th>H</th><td>NaN</td><td>1.0</td><td>1.0</td><td>0.0</td><td>0</td><td>1</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>18</th><th>stronger</th><th>stronger</th><th>1</th><th>'strɔːŋ.ɛː</th><th>1</th><th>stron</th><th>'strɔːŋ</th><th>P</th><th>H</th><td>1.0</td><td>1.0</td><td>1.0</td><td>0.0</td><td>0</td><td>2</td></tr>\n",
"<tr><th colspan=\"9\"></th><th>2</th><th>ger</th><th>ɛː</th><th>U</th><th>L</th><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0</td><td>2</td></tr>\n",
"<tr><th colspan=\"4\"></th><th>19</th><th>.</th><th></th><th>0</th><th></th><th>0</th><th>.</th><th></th><th>NaN</th><th>NaN</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>1</td><td>0</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>186 rows × 6 columns</p>\n",
"</div>"
],
"text/plain": [
" prom_strength ... word_nsyll\n",
"para_i sent_i sentpart_i line_i word_i word_str word_tok word_ipa_i word_ipa syll_i syll_str syll_ipa syll_stress syll_weight ... \n",
"1 1 1 1 1 How how 1 haʊ 1 How haʊ U H NaN ... 1\n",
" 2 can can 1 kæn 1 can kæn U H NaN ... 1\n",
" 3 I i 1 'aɪ 1 I 'aɪ P H 1.0 ... 1\n",
" 2 aɪ 1 I aɪ U H 0.0 ... 1\n",
" 4 then then 1 'ðɛn 1 then 'ðɛn P H 1.0 ... 1\n",
"... ... ... ...\n",
" 4 17 14 16 length length 1 'lɛŋkθ 1 length 'lɛŋkθ P H NaN ... 1\n",
" 17 seem seem 1 'siːm 1 seem 'siːm P H NaN ... 1\n",
" 18 stronger stronger 1 'strɔːŋ.ɛː 1 stron 'strɔːŋ P H 1.0 ... 2\n",
" 2 ger ɛː U L 0.0 ... 2\n",
" 19 . 0 0 . NaN NaN NaN ... 0\n",
"\n",
"[186 rows x 6 columns]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Syllabify\n",
"sonnet.sylls()"
]
},
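In the rows above, `word_ipa` separates syllables with `.` and marks a stressed syllable with a leading `'`; `syll_stress` is `P` for those syllables and `U` for the rest. The sketch below reproduces only that relationship, as read off the displayed rows. `split_ipa` is a hypothetical helper, not part of Cadence, and it assumes no other stress marks occur.

```python
def split_ipa(word_ipa):
    """Split an IPA word on '.' and label each syllable 'P' or 'U'."""
    sylls = []
    for syll in word_ipa.split("."):
        # Assumption: a leading apostrophe marks a stressed ('P') syllable;
        # anything else is treated as unstressed ('U').
        stress = "P" if syll.startswith("'") else "U"
        sylls.append((syll, stress))
    return sylls


print(split_ipa("'strɔːŋ.ɛː"))  # the two syllables of "stronger"
print(split_ipa("kæn"))         # a single unstressed syllable
```

This covers stress only; the `syll_weight` column (`H`/`L`) depends on syllable structure and is not modelled here.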
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th colspan=\"4\"></th><th>dep_head</th><th>dep_type</th><th>pos_case</th><th>pos_definite</th><th>pos_degree</th><th>pos_gender</th><th>pos_mood</th><th>pos_number</th><th>pos_person</th><th>pos_polarity</th><th>pos_poss</th><th>pos_prontype</th><th>pos_tense</th><th>pos_upos</th><th>pos_verbform</th><th>pos_voice</th><th>pos_xpos</th><th>word_depth</th></tr>\n",
"<tr><th>para_i</th><th>sent_i</th><th>word_i</th><th>word_str</th><th colspan=\"18\"></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>1</th><th>1</th><th>1</th><th>How</th><td>5</td><td>advmod</td><td colspan=\"9\"></td><td>Int</td><td></td><td>ADV</td><td colspan=\"2\"></td><td>WRB</td><td>4</td></tr>\n",
"<tr><th colspan=\"2\"></th><th>2</th><th>can</th><td>5</td><td>aux</td><td colspan=\"11\"></td><td>AUX</td><td>Fin</td><td></td><td>MD</td><td>4</td></tr>\n",
"<tr><th colspan=\"2\"></th><th>3</th><th>I</th><td>5</td><td>nsubj</td><td>Nom</td><td colspan=\"4\"></td><td>Sing</td><td>1</td><td colspan=\"2\"></td><td>Prs</td><td></td><td>PRON</td><td colspan=\"2\"></td><td>PRP</td><td>5</td></tr>\n",
"<tr><th colspan=\"2\"></th><th>4</th><th>then</th><td>5</td><td>advmod</td><td colspan=\"9\"></td><td>Dem</td><td></td><td>ADV</td><td colspan=\"2\"></td><td>RB</td><td>5</td></tr>\n",
"<tr><th colspan=\"2\"></th><th>5</th><th>return</th><td>0</td><td>root</td><td colspan=\"11\"></td><td>VERB</td><td>Inf</td><td></td><td>VB</td><td>5</td></tr>\n",
"<tr><th></th><th>...</th><th>...</th><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th></th><th>4</th><th>15</th><th>grief's</th><td>16</td><td>compound</td><td colspan=\"5\"></td><td>Plur</td><td colspan=\"5\"></td><td>NOUN</td><td colspan=\"2\"></td><td>NNS</td><td>8</td></tr>\n",
"<tr><th colspan=\"2\"></th><th>16</th><th>length</th><td>14</td><td>obj</td><td colspan=\"5\"></td><td>Sing</td><td colspan=\"5\"></td><td>NOUN</td><td colspan=\"2\"></td><td>NN</td><td>8</td></tr>\n",
"<tr><th colspan=\"2\"></th><th>17</th><th>seem</th><td>14</td><td>xcomp</td><td colspan=\"11\"></td><td>VERB</td><td>Inf</td><td></td><td>VB</td><td>10</td></tr>\n",
"<tr><th colspan=\"2\"></th><th>18</th><th>stronger</th><td>17</td><td>xcomp</td><td colspan=\"2\"></td><td>Cmp</td><td colspan=\"8\"></td><td>ADJ</td><td colspan=\"2\"></td><td>JJR</td><td>12</td></tr>\n",
"<tr><th colspan=\"2\"></th><th>19</th><th>.</th><td>5</td><td>punct</td><td colspan=\"11\"></td><td>PUNCT</td><td colspan=\"2\"></td><td>.</td><td>3</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>135 rows × 18 columns</p>\n",
"</div>"
],
"text/plain": [
" dep_head dep_type ... pos_xpos word_depth\n",
"para_i sent_i word_i word_str ... \n",
"1 1 1 How 5 advmod ... WRB 4\n",
" 2 can 5 aux ... MD 4\n",
" 3 I 5 nsubj ... PRP 5\n",
" 4 then 5 advmod ... RB 5\n",
" 5 return 0 root ... VB 5\n",
"... ... ... ... ... ...\n",
" 4 15 grief's 16 compound ... NNS 8\n",
" 16 length 14 obj ... NN 8\n",
" 17 seem 14 xcomp ... VB 10\n",
" 18 stronger 17 xcomp ... JJR 12\n",
" 19 . 5 punct ... . 3\n",
"\n",
"[135 rows x 18 columns]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Syntax-parse\n",
"sonnet.syntax()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"CadenceMetricalTree('ROOT/0', [CadenceMetricalTree('SBARQ/0', [CadenceMetricalTree('WHADVP/0', [CadenceMetricalTree('WRB/0', ['How'])]), CadenceMetricalTree('SQ/0', [CadenceMetricalTree('MD/0', ['can']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('PRP/0', ['I'])]), CadenceMetricalTree('ADVP/0', [CadenceMetricalTree('RB/0', ['then'])]), CadenceMetricalTree('VP/0', [CadenceMetricalTree('VB/0', ['return']), CadenceMetricalTree('PP/0', [CadenceMetricalTree('IN/0', ['in']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('NP/0', [CadenceMetricalTree('JJ/0', ['happy']), CadenceMetricalTree('NN/0', ['plight'])]), CadenceMetricalTree(',/0', [',']), CadenceMetricalTree('SBAR/0', [CadenceMetricalTree('WHNP/0', [CadenceMetricalTree('WDT/0', ['That'])]), CadenceMetricalTree('S/0', [CadenceMetricalTree('VP/0', [CadenceMetricalTree('VBP/0', ['am']), CadenceMetricalTree('VP/0', [CadenceMetricalTree('VBN/0', ['debarred']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('NP/0', [CadenceMetricalTree('DT/0', ['the']), CadenceMetricalTree('NN/0', ['benefit'])]), CadenceMetricalTree('PP/0', [CadenceMetricalTree('IN/0', ['of']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('NN/0', ['rest'])])])])])])])])])])])]), CadenceMetricalTree('./0', ['?'])])])"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Show sentences\n",
"sentence = sonnet.sent(1)\n",
"sentence.mtree()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Stress grid of sentence inferred from syntactic tree\n",
"# using metricaltree\n",
"sentence.grid()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Parse text"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "68801072ee594378800c0b07337c961e",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Metrically parsing line units: 0%| | 0/14 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" How can I then return in happy plight,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" That am debarred the benefit of rest?"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" When day's oppression is not eased by night,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" But day by night and night by day oppressed,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" And each, though enemies to either's reign,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" Do in consent shake hands to torture me,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" The one by toil, the other to complain"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" How far I toil, still farther off from thee."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" I tell the day, to please him thou art bright,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" And dost him grace when clouds do blot the heaven:"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" So flatter I the swart- complexiond night,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" When sparkling stars twire not thou gildst the even."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" But day doth daily draw my sorrows longer,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" And night doth nightly make grief's length seem stronger."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n","
"\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"\n",
"\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" *total\n",
" *s_unstressed\n",
" *unres_across\n",
" *unres_within\n",
" *w_peak\n",
" *w_stressed\n",
" dep_head\n",
" dep_type\n",
" mtree_ishead\n",
" num_parses\n",
" pos_case\n",
" pos_definite\n",
" pos_degree\n",
" pos_gender\n",
" pos_mood\n",
" pos_number\n",
" pos_person\n",
" pos_polarity\n",
" pos_poss\n",
" pos_prontype\n",
" pos_tense\n",
" pos_upos\n",
" pos_verbform\n",
" pos_voice\n",
" pos_xpos\n",
" prom_lstress\n",
" prom_pstrength\n",
" prom_pstress\n",
" prom_strength\n",
" prom_stress\n",
" prom_tstress\n",
" prom_weight\n",
" word_depth\n",
" word_isfunc\n",
" word_ispunc\n",
" word_nsyll\n",
" \n",
" \n",
" para_i\n",
" unit_i\n",
" parse_rank\n",
" is_troch\n",
" parse_i\n",
" parse\n",
" parse_str\n",
" sent_i\n",
" sentpart_i\n",
" line_i\n",
" combo_i\n",
" slot_i\n",
" slot_meter\n",
" syll_str_parse\n",
" word_i\n",
" word_str\n",
" word_tok\n",
" word_ipa_i\n",
" word_ipa\n",
" syll_i\n",
" syll_str\n",
" syll_ipa\n",
" syll_stress\n",
" syll_weight\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" 1\n",
" 1\n",
" 1\n",
" 0\n",
" 1\n",
" wwSSwSwSwS\n",
" 𝖧𝗈𝗐 𝖼𝖺𝗇 𝗜 𝙩𝙝𝙚𝙣 𝗋𝖾𝘁𝘂𝗿𝗻 𝗂𝗇 𝗵𝗮𝗽𝗉𝗒 𝗽𝗹𝗶𝗴𝗵𝘁,\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" w\n",
" 𝖧𝗈𝗐\n",
" 1\n",
" How\n",
" how\n",
" 1\n",
" haʊ\n",
" 1\n",
" How\n",
" haʊ\n",
" U\n",
" H\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 5\n",
" advmod\n",
" NaN\n",
" 4\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" Int\n",
" \n",
" ADV\n",
" \n",
" \n",
" WRB\n",
" 0.0\n",
" NaN\n",
" 0.0\n",
" NaN\n",
" 0.0\n",
" 0.000000\n",
" 1.0\n",
" 4\n",
" 1.0\n",
" 0\n",
" 1\n",
" \n",
" \n",
" 2\n",
" w\n",
" 𝖼𝖺𝗇\n",
" 2\n",
" can\n",
" can\n",
" 1\n",
" kæn\n",
" 1\n",
" can\n",
" kæn\n",
" U\n",
" H\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 5\n",
" aux\n",
" NaN\n",
" 4\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" AUX\n",
" Fin\n",
" \n",
" MD\n",
" 0.0\n",
" NaN\n",
" 0.0\n",
" NaN\n",
" 0.0\n",
" 0.333333\n",
" 1.0\n",
" 4\n",
" 1.0\n",
" 0\n",
" 1\n",
" \n",
" \n",
" 3\n",
" s\n",
" 𝗜\n",
" 3\n",
" I\n",
" i\n",
" 1\n",
" 'aɪ\n",
" 1\n",
" I\n",
" 'aɪ\n",
" P\n",
" H\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 5\n",
" nsubj\n",
" NaN\n",
" 4\n",
" Nom\n",
" \n",
" \n",
" \n",
" \n",
" Sing\n",
" 1\n",
" \n",
" \n",
" Prs\n",
" \n",
" PRON\n",
" \n",
" \n",
" PRP\n",
" 0.0\n",
" NaN\n",
" 0.0\n",
" 1.0\n",
" 1.0\n",
" 0.000000\n",
" 1.0\n",
" 5\n",
" 0.0\n",
" 0\n",
" 1\n",
" \n",
" \n",
" 4\n",
" s\n",
" 𝙩𝙝𝙚𝙣\n",
" 4\n",
" then\n",
" then\n",
" 1\n",
" 'ðɛn\n",
" 1\n",
" then\n",
" 'ðɛn\n",
" P\n",
" H\n",
" 1.0\n",
" 0.0\n",
" 1.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 5\n",
" advmod\n",
" NaN\n",
" 4\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" Dem\n",
" \n",
" ADV\n",
" \n",
" \n",
" RB\n",
" 0.0\n",
" NaN\n",
" 0.0\n",
" 1.0\n",
" 1.0\n",
" 0.000000\n",
" 1.0\n",
" 5\n",
" 0.0\n",
" 0\n",
" 1\n",
" \n",
" \n",
" 5\n",
" w\n",
" 𝗋𝖾\n",
" 5\n",
" return\n",
" return\n",
" 1\n",
" rɪ.'tɛːn\n",
" 1\n",
" re\n",
" rɪ\n",
" U\n",
" L\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0\n",
" root\n",
" NaN\n",
" 4\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" VERB\n",
" Inf\n",
" \n",
" VB\n",
" NaN\n",
" NaN\n",
" NaN\n",
" 0.0\n",
" 0.0\n",
" NaN\n",
" 0.0\n",
" 5\n",
" 0.0\n",
" 0\n",
" 2\n",
" \n",
" \n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
" \n",
" \n",
" 14\n",
" 2\n",
" 0\n",
" 1\n",
" wSwSwSSwSSw\n",
" 𝖠𝗇𝖽 𝗻𝗶𝗴𝗵𝘁 𝖽𝗈𝗍𝗁 𝗻𝗶𝗴𝗵𝗍𝗅𝗒 𝗺𝗮𝗸𝗲 𝙜𝙧𝙞𝙚𝙛'𝙨 𝘭𝘦𝘯𝘨𝘵𝘩 𝘀𝗲𝗲𝗺 𝙨𝙩𝙧𝙤𝙣𝗀𝖾𝗋.\n",
" 4\n",
" 17\n",
" 14\n",
" 1\n",
" 8\n",
" w\n",
" 𝘭𝘦𝘯𝘨𝘵𝘩\n",
" 16\n",
" length\n",
" length\n",
" 1\n",
" 'lɛŋkθ\n",
" 1\n",
" length\n",
" 'lɛŋkθ\n",
" P\n",
" H\n",
" 1.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 1.0\n",
" 14\n",
" obj\n",
" NaN\n",
" 2\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" Sing\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" NOUN\n",
" \n",
" \n",
" NN\n",
" 1.0\n",
" 0.0\n",
" 0.0\n",
" NaN\n",
" 1.0\n",
" 0.750000\n",
" 1.0\n",
" 8\n",
" 0.0\n",
" 0\n",
" 1\n",
" \n",
" \n",
" 9\n",
" s\n",
" 𝘀𝗲𝗲𝗺\n",
" 17\n",
" seem\n",
" seem\n",
" 1\n",
" 'siːm\n",
" 1\n",
" seem\n",
" 'siːm\n",
" P\n",
" H\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 14\n",
" xcomp\n",
" NaN\n",
" 2\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" VERB\n",
" Inf\n",
" \n",
" VB\n",
" 1.0\n",
" 0.0\n",
" 0.0\n",
" NaN\n",
" 1.0\n",
" 0.750000\n",
" 1.0\n",
" 10\n",
" 0.0\n",
" 0\n",
" 1\n",
" \n",
" \n",
" 10\n",
" s\n",
" 𝙨𝙩𝙧𝙤𝙣\n",
" 18\n",
" stronger\n",
" stronger\n",
" 1\n",
" 'strɔːŋ.ɛː\n",
" 1\n",
" stron\n",
" 'strɔːŋ\n",
" P\n",
" H\n",
" 1.0\n",
" 0.0\n",
" 1.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 17\n",
" xcomp\n",
" NaN\n",
" 2\n",
" \n",
" \n",
" Cmp\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" ADJ\n",
" \n",
" \n",
" JJR\n",
" 1.0\n",
" 1.0\n",
" 1.0\n",
" 1.0\n",
" 1.0\n",
" 1.000000\n",
" 1.0\n",
" 12\n",
" 0.0\n",
" 0\n",
" 2\n",
" \n",
" \n",
" 11\n",
" w\n",
" 𝗀𝖾𝗋\n",
" 18\n",
" stronger\n",
" stronger\n",
" 1\n",
" 'strɔːŋ.ɛː\n",
" 2\n",
" ger\n",
" ɛː\n",
" U\n",
" L\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 0.0\n",
" 17\n",
" xcomp\n",
" NaN\n",
" 2\n",
" \n",
" \n",
" Cmp\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" ADJ\n",
" \n",
" \n",
" JJR\n",
" NaN\n",
" NaN\n",
" NaN\n",
" 0.0\n",
" 0.0\n",
" NaN\n",
" 0.0\n",
" 12\n",
" 0.0\n",
" 0\n",
" 2\n",
" \n",
" \n",
" 12\n",
" NaN\n",
" .\n",
" 19\n",
" .\n",
" \n",
" 0\n",
" \n",
" 0\n",
" .\n",
" \n",
" NaN\n",
" NaN\n",
" 0.0\n",
" NaN\n",
" NaN\n",
" NaN\n",
" NaN\n",
" NaN\n",
" 5\n",
" punct\n",
" NaN\n",
" 2\n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" \n",
" PUNCT\n",
" \n",
" \n",
" .\n",
" NaN\n",
" NaN\n",
" NaN\n",
" NaN\n",
" NaN\n",
" NaN\n",
" NaN\n",
" 3\n",
" NaN\n",
" 1\n",
" 0\n",
" \n",
" \n",
"\n",
"303 rows × 36 columns
\n",
"
],
"text/plain": [
" *total ... word_nsyll\n",
"para_i unit_i parse_rank is_troch parse_i parse parse_str sent_i sentpart_i line_i combo_i slot_i slot_meter syll_str_parse word_i word_str word_tok word_ipa_i word_ipa syll_i syll_str syll_ipa syll_stress syll_weight ... \n",
"1 1 1 0 1 wwSSwSwSwS 𝖧𝗈𝗐 𝖼𝖺𝗇 𝗜 𝙩𝙝𝙚𝙣 𝗋𝖾𝘁𝘂𝗿𝗻 𝗂𝗇 𝗵𝗮𝗽𝗉𝗒 𝗽𝗹𝗶𝗴𝗵𝘁, 1 1 1 1 1 w 𝖧𝗈𝗐 1 How how 1 haʊ 1 How haʊ U H 0.0 ... 1\n",
" 2 w 𝖼𝖺𝗇 2 can can 1 kæn 1 can kæn U H 0.0 ... 1\n",
" 3 s 𝗜 3 I i 1 'aɪ 1 I 'aɪ P H 0.0 ... 1\n",
" 4 s 𝙩𝙝𝙚𝙣 4 then then 1 'ðɛn 1 then 'ðɛn P H 1.0 ... 1\n",
" 5 w 𝗋𝖾 5 return return 1 rɪ.'tɛːn 1 re rɪ U L 0.0 ... 2\n",
"... ... ... ...\n",
" 14 2 0 1 wSwSwSSwSSw 𝖠𝗇𝖽 𝗻𝗶𝗴𝗵𝘁 𝖽𝗈𝗍𝗁 𝗻𝗶𝗴𝗵𝗍𝗅𝗒 𝗺𝗮𝗸𝗲 𝙜𝙧𝙞𝙚𝙛'𝙨 𝘭𝘦𝘯𝘨𝘵𝘩 𝘀𝗲𝗲𝗺... 4 17 14 1 8 w 𝘭𝘦𝘯𝘨𝘵𝘩 16 length length 1 'lɛŋkθ 1 length 'lɛŋkθ P H 1.0 ... 1\n",
" 9 s 𝘀𝗲𝗲𝗺 17 seem seem 1 'siːm 1 seem 'siːm P H 0.0 ... 1\n",
" 10 s 𝙨𝙩𝙧𝙤𝙣 18 stronger stronger 1 'strɔːŋ.ɛː 1 stron 'strɔːŋ P H 1.0 ... 2\n",
" 11 w 𝗀𝖾𝗋 18 stronger stronger 1 'strɔːŋ.ɛː 2 ger ɛː U L 0.0 ... 2\n",
" 12 NaN . 19 . 0 0 . NaN NaN 0.0 ... 0\n",
"\n",
"[303 rows x 36 columns]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Parse lines (verse)\n",
"sonnet.parse()"
]
},
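The `*`-prefixed columns above count constraint violations per syllable. Judging only from the rows shown, `*w_stressed` is 1 where a stressed syllable (`syll_stress` of `P`) sits in a weak slot (`slot_meter` of `w`), as with `length` in the last line. The sketch below tallies that constraint and, by analogy, `*s_unstressed`. The mirror-image definition of `*s_unstressed` is an assumption that none of the displayed rows can confirm, and `tally_violations` is a hypothetical helper, not Cadence's implementation.

```python
def tally_violations(slots):
    """Count two constraint violations over (slot_meter, syll_stress) pairs."""
    tally = {"*w_stressed": 0, "*s_unstressed": 0}
    for meter, stress in slots:
        if meter == "w" and stress == "P":
            # Stressed syllable in a weak position
            tally["*w_stressed"] += 1
        elif meter == "s" and stress == "U":
            # Assumed mirror case: unstressed syllable in a strong position
            tally["*s_unstressed"] += 1
    return tally


# Slots 8-11 of the last line, copied from the table above:
# length (w, P), seem (s, P), stron (s, P), ger (w, U)
last_slots = [("w", "P"), ("s", "P"), ("s", "P"), ("w", "U")]
print(tally_violations(last_slots))
```

The other constraint columns (`*unres_across`, `*unres_within`, `*w_peak`) depend on word boundaries and syllable weight, so they are left out of this sketch.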
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prose"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"melville=\"\"\"Is it that by its indefiniteness it shadows forth the heartless voids\n",
"and immensities of the universe, and thus stabs us from behind with the thought of annihilation,\n",
"when beholding the white depths of the milky way? Or is it, that as in essence\n",
"whiteness is not so much a colour as the visible absence of colour; and at the same time the concrete of all colours;\n",
"is it for these reasons that there is such a dumb blankness, full of meaning,\n",
"in a wide landscape of snows: a colourless, all-colour of atheism from which we shrink?\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"# These two calls are also equivalent\n",
"text = cd.Text(melville, linebreaks=False, phrasebreaks=True)\n",
"text = cd.Prose(melville)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"CadenceMetricalTree('ROOT/0', [CadenceMetricalTree('SQ/0', [CadenceMetricalTree('VBZ/0', ['Is']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('PRP/0', ['it'])]), CadenceMetricalTree('NP/0', [CadenceMetricalTree('IN/0', ['that'])]), CadenceMetricalTree('PP/0', [CadenceMetricalTree('IN/0', ['by']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('PRP$/0', ['its']), CadenceMetricalTree('NN/0', ['indefiniteness'])])]), CadenceMetricalTree('NP/0', [CadenceMetricalTree('PRP/0', ['it'])]), CadenceMetricalTree('VP/0', [CadenceMetricalTree('VP/0', [CadenceMetricalTree('VBZ/0', ['shadows']), CadenceMetricalTree('PRT/0', [CadenceMetricalTree('RP/0', ['forth'])]), CadenceMetricalTree('NP/0', [CadenceMetricalTree('NP/0', [CadenceMetricalTree('DT/0', ['the']), CadenceMetricalTree('JJ/0', ['heartless']), CadenceMetricalTree('NNS/0', ['voids'])]), CadenceMetricalTree('CC/0', ['and']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('NP/0', [CadenceMetricalTree('NNS/0', ['immensities'])]), CadenceMetricalTree('PP/0', [CadenceMetricalTree('IN/0', ['of']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('DT/0', ['the']), CadenceMetricalTree('NN/0', ['universe'])])])])])]), CadenceMetricalTree(',/0', [',']), CadenceMetricalTree('CC/0', ['and']), CadenceMetricalTree('ADVP/0', [CadenceMetricalTree('RB/0', ['thus'])]), CadenceMetricalTree('VP/0', [CadenceMetricalTree('VBZ/0', ['stabs']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('PRP/0', ['us'])]), CadenceMetricalTree('PP/0', [CadenceMetricalTree('IN/0', ['from']), CadenceMetricalTree('PP/0', [CadenceMetricalTree('IN/0', ['behind']), CadenceMetricalTree('PP/0', [CadenceMetricalTree('IN/0', ['with']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('NP/0', [CadenceMetricalTree('DT/0', ['the']), CadenceMetricalTree('NN/0', ['thought'])]), CadenceMetricalTree('PP/0', [CadenceMetricalTree('IN/0', ['of']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('NN/0', ['annihilation'])])])])])])]), CadenceMetricalTree(',/0', 
[',']), CadenceMetricalTree('SBAR/0', [CadenceMetricalTree('WHADVP/0', [CadenceMetricalTree('WRB/0', ['when'])]), CadenceMetricalTree('S/0', [CadenceMetricalTree('VP/0', [CadenceMetricalTree('VBG/0', ['beholding']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('NP/0', [CadenceMetricalTree('DT/0', ['the']), CadenceMetricalTree('JJ/0', ['white']), CadenceMetricalTree('NNS/0', ['depths'])]), CadenceMetricalTree('PP/0', [CadenceMetricalTree('IN/0', ['of']), CadenceMetricalTree('NP/0', [CadenceMetricalTree('DT/0', ['the']), CadenceMetricalTree('JJ/0', ['milky']), CadenceMetricalTree('NN/0', ['way'])])])])])])])])]), CadenceMetricalTree('./0', ['?'])])])"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"text.sent(1).mtree()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "5f42d3049c974a33931ff3e256f3dbb9",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Metrically parsing line units: 0%| | 0/14 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" Is it that by its indefiniteness it shadows forth"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" the heartless voids and immensities of the universe,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" and thus"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" stabs us from behind with the thought of annihilation,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" when beholding the white depths of the milky way?"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" Or is it,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" that as in essence whiteness is not so"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" much a colour as the visible absence of colour;"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" and at the same time the concrete of all colours;"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" is it for these reasons that there is such a dumb blankness,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" full of meaning,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" in a wide landscape of snows:"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" a colourless,"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/markdown": [
" all- colour of atheism from which we shrink?"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p>DataFrame: 1241 rows × 35 columns</p>\n",
"<p>Index levels: para_i, unit_i, parse_rank, is_troch, parse_i, parse, parse_str, sent_i, sentpart_i, line_i, combo_i, slot_i, slot_meter, syll_str_parse, word_i, word_str, word_tok, word_ipa_i, word_ipa, syll_i, syll_str, syll_ipa, syll_stress, syll_weight</p>\n",
"<p>Columns: *total, *s_unstressed, *unres_across, *unres_within, *w_peak, *w_stressed, dep_head, dep_type, mtree_ishead, num_parses, pos_case, pos_definite, pos_degree, pos_gender, pos_mood, pos_number, pos_person, pos_polarity, pos_poss, pos_prontype, pos_tense, pos_upos, pos_verbform, pos_xpos, prom_lstress, prom_pstrength, prom_pstress, prom_strength, prom_stress, prom_tstress, prom_weight, word_depth, word_isfunc, word_ispunc, word_nsyll</p>"
],
"text/plain": [
" *total ... word_nsyll\n",
"para_i unit_i parse_rank is_troch parse_i parse parse_str sent_i sentpart_i line_i combo_i slot_i slot_meter syll_str_parse word_i word_str word_tok word_ipa_i word_ipa syll_i syll_str syll_ipa syll_stress syll_weight ... \n",
"1 1 1 0 1 wSwwSwSwSwwSSw 𝖨𝗌 𝗶𝘁 𝗍𝗁𝖺𝗍 𝖻𝗒 𝙞𝙩𝙨 𝗂𝗇𝗱𝗲𝖿𝗂𝙣𝙞𝗍𝖾𝗇𝖾𝗌𝗌 𝘪𝘵 𝘀𝗵𝗮𝗱𝗼𝘄𝘀 𝘧𝘰𝘳𝘵𝘩 1 1 1 12 1 w 𝖨𝗌 1 Is is 2 ɪz 1 Is ɪz U H 0.0 ... 1\n",
" 2 s 𝗶𝘁 2 it it 1 'ɪt 1 it 'ɪt P H 0.0 ... 1\n",
" 3 w 𝗍𝗁𝖺𝗍 3 that that 2 ðət 1 that ðət U H 0.0 ... 1\n",
" 4 w 𝖻𝗒 4 by by 1 baɪ 1 by baɪ U H 0.0 ... 1\n",
" 5 s 𝙞𝙩𝙨 5 its its 1 ɪts 1 its ɪts U H 1.0 ... 1\n",
"... ... ... ...\n",
" 14 8 0 8 wSwwSwSwwSwS 𝖺𝗅𝗅- 𝗰𝗼𝗅𝗈𝗎𝗋 𝘰𝘧 𝗮𝘁𝗁𝖾𝗶𝗌𝗆 𝘧𝘳𝘰𝘮 𝘄𝗵𝗶𝗰𝗵 𝗐𝖾 𝘀𝗵𝗿𝗶𝗻𝗸? 2 11 6 14 10 w 𝘧𝘳𝘰𝘮 66 from from 1 frʌm 1 from frʌm U H 1.0 ... 1\n",
" 11 s 𝘄𝗵𝗶𝗰𝗵 67 which which 1 'wɪʧ 1 which 'wɪʧ P H 0.0 ... 1\n",
" 12 w 𝗐𝖾 68 we we 2 wiː 1 we wiː U L 0.0 ... 1\n",
" 13 s 𝘀𝗵𝗿𝗶𝗻𝗸 69 shrink shrink 1 'ʃrɪŋk 1 shrink 'ʃrɪŋk P H 0.0 ... 1\n",
" 14 NaN ? 70 ? 0 0 ? NaN NaN 0.0 ... 0\n",
"\n",
"[1241 rows x 35 columns]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"text.parse()"
]
}
],
"metadata": {
"interpreter": {
"hash": "96e96d1fcde428da9c8322daedfd0e8890a2dfa3c4fb6b7de685db4b856c7b39"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.11"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}