# machine-learning

## Initializations

```bash
conda create -p venv python=3.12  # create a local environment in ./venv
conda activate venv/              # activate the environment by path
```

## Tokenization

- Paragraph to sentences (tokenization into sentences)
- Sentence to words/vocabulary (tokenization into words)

## NLTK (Tokenization library)

- Alternative: `spacy` (see the sketch below)
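
- A minimal sketch of the same tokenization with `spacy`, assuming the `en_core_web_sm` model has been downloaded (`python -m spacy download en_core_web_sm`):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Hello, world! This is spaCy's tokenizer.")

words = [token.text for token in doc]          # word-level tokens
sentences = [sent.text for sent in doc.sents]  # sentence spans
```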

- Install NLTK and NumPy with `pip`:

```bash
pip install nltk
pip install numpy
```

- Or install these libraries only into the Conda environment:

```bash
conda install -p venv/ nltk
conda install -p venv/ numpy
```

- [Install scikit-learn](https://scikit-learn.org/stable/install.html). This may pull in `numpy` and `scipy` as dependencies.

```bash
conda uninstall numpy # just in case
conda install -c conda-forge scikit-learn
```

- Or install with `pip` inside the Conda environment. If there is a stack error, uninstall `numpy` first:

```bash
pip uninstall numpy
pip install --force-reinstall scikit-learn
```
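
- A quick check that the install works:

```python
import sklearn

print(sklearn.__version__)  # prints the installed scikit-learn version
```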

- One-time action to download the `punkt` tokenizer data (`punkt_tab` on newer NLTK releases), in Python code:

```python
import nltk
nltk.download('punkt')      # tokenizer models (older NLTK releases)
nltk.download('punkt_tab')  # required by newer NLTK releases

from nltk.tokenize import word_tokenize, sent_tokenize

text = "Hello, world! This is NLTK's tokenizer."
words = word_tokenize(text)      # tokenizes into words
sentences = sent_tokenize(text)  # tokenizes into sentences
```
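
With this input, `words` splits punctuation into separate tokens (`['Hello', ',', 'world', '!', ...]`) and `sentences` contains the two sentences.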

### Lemmatization

- Requires downloading the WordNet corpus first:

```python
import nltk
nltk.download('wordnet')  # WordNet corpus used by the lemmatizer
```
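
- A minimal usage sketch with NLTK's `WordNetLemmatizer` (the `pos` argument defaults to noun; `"v"` treats the word as a verb):

```python
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("running", pos="v"))  # -> run
print(lemmatizer.lemmatize("geese"))             # -> goose
```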