https://github.com/kmock930/natural-language-processing
This project contains code and written work for the course CSI5386 at the University of Ottawa (taught by Professor Diana Inkpen).
- Host: GitHub
- URL: https://github.com/kmock930/natural-language-processing
- Owner: kmock930
- Created: 2025-01-13T21:34:16.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-05-02T00:56:04.000Z (5 months ago)
- Last Synced: 2025-05-02T01:34:50.300Z (5 months ago)
- Topics: bert, bigram-modeling, corpus-linguistics, distilbert, fasttext-embeddings, glove-embeddings, hugging-face-transformers, large-language-models, lemmatizer, logistic-regression, macro-micro-f1, natural-language-processing, paraphrase-minilm, pos-tagging, roberta-large, sbert, stopwords, text-embedding-ada-002, universal-sentence-encoder, word-tokenizer
- Language: Jupyter Notebook
- Homepage:
- Size: 39.5 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Natural Language Processing Work
## [Assignment 1](./Assignment%201/README.md) - Corpus analysis and sentence embeddings
## [Assignment 2](./Assignment%202/README.md) - Machine-Generated Text Detection

## [Seminar Research](./Seminar%20Paper/Paper%20Presentation%20-%20Group%202.pdf) - Depression Detection
As social media grows in popularity, so does the risk of negative impacts such as cyberbullying, which can cause mental health distress for some users. We therefore explored depression detection with the **DORIS framework** proposed by Lan X., Cheng Y., Sheng L., Gao C., and Li Y. This work also forms the basis for our project, which aims to build an NLP-based model for suicide detection.

## [Project](./Project/README.md)
### Summary of Our Work
* [Project Proposal](./Project/CSI5386_Natural_Language_Processing_Project_Proposal.pdf)
* [Presentation from the NLP Perspective](./Project/Project%20Presentation%20-%20NLP%20Aspects.pdf)
* [Our Report](./Project/CSI5386_NLP_Project_Report___Kelvin__Jenifer__Sabrina.pdf)

Our project analyzes suicidal intent in posts from popular social media platforms and trains several models to find the best one for suicide detection. These are the models we used:
### Baseline Model
### Fine-Tuning a Deep Learning based Transformer - DistilBERT

### Added Custom Layers on top of Fine-Tuned DistilBERT

### LLM-based Model

## Execution Guide
* [**TMUX**](tmux.md) for keeping long-running executions alive in a detached session