https://github.com/codersales/msc-y1-s1-w10-thu-lang-eng-python-12-2h-lecture
MSc-Y1-S1-W10-Thu-Lang-Eng-Python-12-2h-Lecture | Summary attempt
- Host: GitHub
- URL: https://github.com/codersales/msc-y1-s1-w10-thu-lang-eng-python-12-2h-lecture
- Owner: CoderSales
- License: mit
- Created: 2023-11-16T13:09:27.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2023-11-16T13:21:13.000Z (12 months ago)
- Last Synced: 2023-11-16T14:29:52.883Z (12 months ago)
- Size: 7.81 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# MSc-Y1-S1-W10-Thu-Lang-Eng-Python-12-2h-Lecture
## Description
MSc-Y1-S1-W10-Thu-Lang-Eng-Python-12-2h-Lecture | Summary attempt
## Content
NLTK
Tokenisation
To find the relevant Wikipedia page:
Step 1: start at the data-security article, whose note reads:
- [Not to be confused with tokenization (lexical analysis).](https://en.wikipedia.org/wiki/Tokenization_(data_security))
Step 2: follow the "lexical analysis" link in that note, which leads to the relevant section of the Lexical analysis page:
- "A lexical token is a string with an assigned and thus identified meaning, in contrast to the probabilistic token used in large language models." [Lexical analysis > Lexical token and lexical tokenization | Wikipedia](https://en.wikipedia.org/wiki/Lexical_analysis#Tokenization)
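As a rough illustration of tokenisation in the lexical sense, the sketch below splits a sentence into word and punctuation tokens using only the standard library. The pattern `TOKEN_PATTERN` and the function `tokenise` are hypothetical names chosen for this example; this is not NLTK's tokeniser, which handles many more cases.

```python
import re

# Hypothetical pattern (an assumption for illustration): a run of word
# characters, optionally followed by an apostrophe and more word characters,
# or else any single character that is neither a word character nor whitespace.
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)?|[^\w\s]")


def tokenise(text: str) -> list[str]:
    """Split text into word tokens and single-character punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenise("The cats weren't running, were they?"))
```

Contractions such as "weren't" stay as one token here; that is a design choice of this sketch, and other tokenisers may split them differently.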
Stemming - "reducing inflected (or sometimes derived) words to their word stem, base or root form—generally a written word form." [Stemming | Wikipedia](https://en.wikipedia.org/wiki/Stemming)
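To make the idea of reducing inflected words to a stem concrete, here is a deliberately crude suffix-stripping sketch. The suffix list `SUFFIXES` and the function `toy_stem` are assumptions made up for this example; real stemmers, such as those in NLTK's `nltk.stem` module, apply far more rules.

```python
# Hypothetical suffix list (an assumption for illustration), checked in order.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


for word in ["walking", "walked", "walks", "boxes", "sing"]:
    print(word, "->", toy_stem(word))
```

With this toy rule set, "running" becomes "runn", which shows that a stem need not be a dictionary word.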
Lemmatization - "the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word's lemma, or dictionary form." [Lemmatization | Wikipedia](https://en.wikipedia.org/wiki/Lemmatization)
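The grouping described in that definition can be sketched with a small lookup table. The table `LEMMAS` and the functions `toy_lemma` and `group_by_lemma` are hypothetical and hand-written for this example; a real lemmatiser would consult a full dictionary and usually the word's part of speech.

```python
from collections import defaultdict

# Hypothetical hand-written table (an assumption for illustration) mapping
# inflected forms to their dictionary form.
LEMMAS = {
    "am": "be", "is": "be", "are": "be", "was": "be", "were": "be",
    "mice": "mouse", "better": "good", "best": "good",
}


def toy_lemma(word: str) -> str:
    """Return the lemma from the table, or the lowercased word if unknown."""
    lowered = word.lower()
    return LEMMAS.get(lowered, lowered)


def group_by_lemma(words: list[str]) -> dict[str, list[str]]:
    """Group words so that all forms sharing a lemma appear together."""
    groups = defaultdict(list)
    for word in words:
        groups[toy_lemma(word)].append(word)
    return dict(groups)


print(group_by_lemma(["is", "were", "mice", "mouse", "better", "good"]))
```

Unlike the suffix stripping used in stemming, the lookup maps irregular forms such as "mice" and "better" to real dictionary entries.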
## References
Language Engineering Module