{"id":22752042,"url":"https://github.com/zeuscoderbe/next_word_predicting","last_synced_at":"2025-03-30T06:43:38.785Z","repository":{"id":235050354,"uuid":"786304919","full_name":"ZeusCoderBE/Next_word_predicting","owner":"ZeusCoderBE","description":"NLP bacsic","archived":false,"fork":false,"pushed_at":"2024-06-04T14:43:17.000Z","size":6382,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-05T08:51:28.638Z","etag":null,"topics":["deep-learning","generative-ai","machine-learning","nlp-machine-learning","rnn-tensorflow","text-generator"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ZeusCoderBE.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-14T03:26:43.000Z","updated_at":"2024-06-04T14:43:21.000Z","dependencies_parsed_at":"2024-04-22T04:23:26.684Z","dependency_job_id":"ab5665b9-48d5-4ed4-b980-6e5f0e636561","html_url":"https://github.com/ZeusCoderBE/Next_word_predicting","commit_stats":null,"previous_names":["zeuscoderbe/next_word_predicting"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZeusCoderBE%2FNext_word_predicting","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZeusCoderBE%2FNext_word_predicting/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZeusCoderBE%2FNext_word_predicting/releases","manifests_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/ZeusCoderBE%2FNext_word_predicting/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ZeusCoderBE","download_url":"https://codeload.github.com/ZeusCoderBE/Next_word_predicting/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246285668,"owners_count":20752953,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","generative-ai","machine-learning","nlp-machine-learning","rnn-tensorflow","text-generator"],"created_at":"2024-12-11T05:09:25.441Z","updated_at":"2025-03-30T06:43:38.769Z","avatar_url":"https://github.com/ZeusCoderBE.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Applying Artificial Neural Networks to Build Text Generation Models as Part of the Generative AI Problem \n\n## WorkFlow\n\n![text_generation (2)](https://github.com/ZeusCoderBE/Next_word_predicting/assets/117000361/ad93bb7e-158b-4214-860d-e54ab19de370)\n\n\n\n### Visualization after word embedding\n![image](https://github.com/ZeusCoderBE/Next_word_predicting/assets/117000361/396a132a-3b41-4032-89f3-7546abffdf31)\n\n\n### Environment setup\n1. **Install Python libraries:** `numpy`,`tensorflow`,`scikit-learn`, `regex` .\n2. **Dataset:**\n- Text data is used from \"Land Law\"\n  \n- Link: https://thuvienphapluat.vn/van-ban/Bat-dong-san/Luat-Dat-dai-2024-31-2024-QH15-523642.aspx\n\n### Steps to build a model in the project\n1. 
**Preprocessing:**\n   \n   - Preprocess the data: extract sentences, match meaningful words, remove extra whitespace and punctuation, build the word dictionary, and generate input sequences using the n-gram method.\n   \n2. **Word Embedding:**\n   \n   - I use a word embedding layer to reduce the size of the word representations, which improves computational efficiency and learning ability.\n   \n   - I represent words in a multidimensional vector space to capture semantic relationships.\n\n3. **Recurrent Neural Network (RNN):**\n   \n   - I built a deep learning architecture, including embedding and SimpleRNN layers, to train the model.\n   \n   - I used the TensorFlow and Keras libraries to develop and evaluate the model.\n   \n4. **Performance evaluation:**\n\n   - I used metrics such as accuracy, precision, recall, and F1 score to evaluate the performance of the RNN model on the test set.\n\n### Libraries and Technology\n- **Programming language:** Python\n- **Main libraries:** numpy, scikit-learn, TensorFlow, regex\n- **Model:** RNN\n\n\n### Future development\n- I will use a larger dataset to build the model.\n- I will use other models such as LSTM and GRU to overcome the exploding and vanishing gradient problems that the RNN model suffers from during backpropagation.\n- I will use a context-aware word segmentation method for better performance.\n### Conclusion\n  - This project provides an overview of different NLP techniques and how to implement them for text processing and analysis. 
The models were trained and evaluated on real-world text data to ensure their effectiveness in handling natural language tasks.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzeuscoderbe%2Fnext_word_predicting","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzeuscoderbe%2Fnext_word_predicting","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzeuscoderbe%2Fnext_word_predicting/lists"}