https://github.com/shruthimohan03/basic-sentence-generation-using-ngram
Generating similar sentences given input sentences using n-gram approach
- Host: GitHub
- URL: https://github.com/shruthimohan03/basic-sentence-generation-using-ngram
- Owner: shruthimohan03
- Created: 2024-04-10T14:34:00.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-21T08:20:58.000Z (8 months ago)
- Last Synced: 2025-02-21T09:28:32.460Z (8 months ago)
- Topics: natural-language-processing, ngrams, sentence-generation
- Language: Jupyter Notebook
- Homepage:
- Size: 2.74 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Sentence Generation using N-grams from Text Data
This code demonstrates the process of generating sentences using an n-gram model based on a dataset of text. Here's a brief description of each part:
1. Reading and Preprocessing Data:
- The code reads a dataset from a CSV file named 'Context.csv' using pandas.
- It selects only the 'Text' column from the dataset and takes a subset of the data (first 3000 rows).
- The text data is converted to lowercase and tokenized using NLTK's `word_tokenize` function.
2. Creating the N-gram Model:
- The code defines a function named `generate_ngram` that creates n-grams (bigrams in this case) from a list of words and stores them in a dictionary named `master_ngram`.
- For each sentence in the preprocessed data, the `generate_ngram` function is called to create and store the n-grams in the `master_ngram` dictionary.
3. Generating Sentences:
- Another function named `generate_sentence` is defined to generate sentences based on the provided n-gram model.
- Given an input sentence, the code preprocesses it, creates its n-grams, and adds them to the `master_ngram` dictionary.
- The `generate_sentence` function then uses the input sentence and the n-gram model to generate a new sentence.
4. Example Usage:
- The code provides three example input sentences: "I want to understand", "I want to know how someone", and "I want to make a statement".
- Each input sentence undergoes preprocessing, n-gram creation, and sentence generation using the previously defined functions.
- The generated sentences based on each input are printed out.

Overall, this code demonstrates a basic approach to generating sentences using an n-gram model trained on a dataset of text. It shows how to preprocess the data, create the n-gram model, and use it to generate sentences based on user-provided input.
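The four steps above can be sketched end to end as follows. This is a minimal illustration, not the notebook's actual code: the function names `generate_ngram` and `generate_sentence` and the `master_ngram` dictionary come from this README, but their signatures, the dictionary layout (context tuple mapped to a list of following words), the random choice of the next word, and the `max_new_words` cutoff are all assumptions. To stay self-contained, a small in-memory `corpus` stands in for the 'Text' column of 'Context.csv', and a regex-based `tokenize` helper stands in for NLTK's `word_tokenize`.

```python
import random
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a simple regex stand-in for NLTK's word_tokenize.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def generate_ngram(words, master_ngram, n=2):
    """Store every n-gram of `words` as: (n-1)-word context -> following word."""
    for i in range(len(words) - n + 1):
        context = tuple(words[i:i + n - 1])
        master_ngram[context].append(words[i + n - 1])


def generate_sentence(seed, master_ngram, n=2, max_new_words=10, rng=None):
    """Extend `seed` by repeatedly sampling a word that followed the current context."""
    rng = rng or random.Random()
    words = tokenize(seed)
    for _ in range(max_new_words):
        context = tuple(words[len(words) - (n - 1):])
        followers = master_ngram.get(context)
        if not followers:  # context never seen with a continuation: stop
            break
        words.append(rng.choice(followers))
    return " ".join(words)


# Hypothetical stand-in for the first rows of the 'Text' column.
corpus = [
    "I want to understand how language models work",
    "I want to know how someone learns a new language",
    "I want to make a statement about data",
]

master_ngram = defaultdict(list)
for sentence in corpus:
    generate_ngram(tokenize(sentence), master_ngram)

seed = "I want to understand"
# As described in step 3, the input's own n-grams are added before generating.
generate_ngram(tokenize(seed), master_ngram)
print(generate_sentence(seed, master_ngram, rng=random.Random(0)))
```

Storing followers in a list (with repeats) means that sampling uniformly from the list picks each next word in proportion to how often it followed the context in the data. Passing a seeded `random.Random` makes a run repeatable.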