https://github.com/asadiahmad/ngram-spark-wikipedia
Calculating n-grams with PySpark for Wikipedia text
big-data ngram nlp pyspark spark wikipedia-dataset
- Host: GitHub
- URL: https://github.com/asadiahmad/ngram-spark-wikipedia
- Owner: AsadiAhmad
- License: mit
- Created: 2024-06-03T19:00:14.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-06-03T20:11:01.000Z (over 1 year ago)
- Last Synced: 2025-07-10T19:27:00.631Z (3 months ago)
- Topics: big-data, ngram, nlp, pyspark, spark, wikipedia-dataset
- Language: Jupyter Notebook
- Homepage:
- Size: 101 KB
- Stars: 29
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Ngram-Spark-Wikipedia
Calculating n-grams over Wikipedia text with PySpark. [Colab notebook](https://colab.research.google.com/drive/1aevaYj5zy76PU1YvxThGnT0L5a05Ycvx?usp=sharing)
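
For readers new to the idea, here is a minimal pure-Python sketch of n-gram counting. It is not taken from the repository's notebook: the helper names (`tokenize`, `ngrams`, `count_ngrams`), the regex-based tokenization, and the two sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of word characters.

    Assumption: this simple regex stands in for whatever cleaning
    the actual notebook applies to Wikipedia text.
    """
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous window of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(lines, n):
    """Count n-grams line by line, so windows never span two lines."""
    counts = Counter()
    for line in lines:
        counts.update(ngrams(tokenize(line), n))
    return counts


# Hypothetical sample input, not real Wikipedia data.
sample = [
    "Spark is a unified analytics engine",
    "Spark is fast",
]
bigram_counts = count_ngrams(sample, 2)
print(bigram_counts.most_common(1))  # [(('spark', 'is'), 2)]
```

In PySpark, one plausible way to distribute the same logic is to apply the per-line step with `rdd.flatMap(lambda line: ngrams(tokenize(line), 2))`, map each n-gram to `(ngram, 1)`, and aggregate with `reduceByKey(lambda a, b: a + b)`. The `pyspark.ml.feature.NGram` transformer is another option when working with DataFrames.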