Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tbhaxor/password-strength-clf
An example of Jax usage with the new Keras 3.0 API release
deep-learning jax keras keras-models keras-neural-networks keras3 machine-learning neural-network python python3
Last synced: about 1 month ago
JSON representation
An example of Jax usage with the new Keras 3.0 API release
- Host: GitHub
- URL: https://github.com/tbhaxor/password-strength-clf
- Owner: tbhaxor
- License: mit
- Created: 2023-12-02T19:10:32.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-29T01:52:11.000Z (12 months ago)
- Last Synced: 2024-05-02T06:08:42.750Z (8 months ago)
- Topics: deep-learning, jax, keras, keras-models, keras-neural-networks, keras3, machine-learning, neural-network, python, python3
- Language: Python
- Homepage:
- Size: 119 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Password Strength Classifier
An example of Jax usage with the new Keras 3.0 API release
## Why this Project?
As a fan of the [Keras framework](https://keras.io/about/), I found the announcement of [Keras 3](https://keras.io/keras_3/) to be a big YEAH! moment. I can now experiment with other deep learning frameworks, like [PyTorch](https://pytorch.org/) or [JAX](https://jax.readthedocs.io/), with just two or three lines of modification in the current codebase. Keras is no longer limited to TensorFlow alone!
Any Keras model that only uses [built-in layers](https://keras.io/2.15/api/layers/) will immediately work with all supported backends. In fact, your existing tf.keras models that only use [built-in layers](https://www.tensorflow.org/api_docs/python/tf/keras/layers) can start running in JAX and PyTorch right away! That's right, your codebase just gained a whole new set of capabilities.
I've been using JAX because it's the only backend that works on my local CUDA 12 setup without extra cuDNN downloads. This simple password strength classifier project shows how easy it is to configure the JAX backend for Keras.
## TL;DR
1. Install a CUDA 12.x compatible JAX and the latest version of Keras from pip
```console
pip install -U pip
pip install -U "jax[cuda12_local]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
pip install keras
```
2. Configure the backend before importing Keras
```python
import os
os.environ["KERAS_BACKEND"] = "jax"

import keras
```

> **Note** If you set the environment variable outside the code, you can call `keras.backend.backend()` to retrieve the name of the active backend.
## What's new
Keras 3 is a total rewrite of the previous codebase. Here are a few of the highlights I want to share with you.
1. **Framework agnostic** — You can pick the framework that suits you best, and switch from one to another based on your current goals.
2. **Model parallelism** — It was [previously accomplished with TensorFlow](https://www.tensorflow.org/guide/distributed_training), but Keras 3 now ships a [distribution namespace](https://keras.io/guides/distribution/) that makes it simple to perform model parallelism, data parallelism, and combinations of the two.
3. **Universal data pipelines** — The Keras 3 `fit()` / `evaluate()` / `predict()` routines are compatible with [`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) objects, PyTorch [`DataLoader`](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html) objects, [NumPy arrays](https://numpy.org/doc/stable/reference/generated/numpy.array.html), and [Pandas dataframes](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), regardless of the backend you're using.
4. **Ops namespace** — In case you don't want to extend the existing layers, you can use [`keras.ops`](https://keras.io/api/ops/) to create components (like arbitrary custom layers or pretrained models) that work across all backends. It provides the NumPy API (not something "_NumPy-like_", just literally the NumPy API) and neural-network functions (softmax, conv, and so on).

[**And more...**](https://keras.io/keras_3/)
## Requirements
- Python 3.11.x
- (Optional) Poetry 1.7.0
- CUDA 12.2, cuDNN 8.9, NCCL 2.16

## Setup
1. Clone the repository
```console
git clone --depth=1 https://github.com/tbhaxor/password-strength-clf.git
cd password-strength-clf
```
2. (Optional: Using pip) Configure and activate the virtual environment
```console
virtualenv venv
source venv/bin/activate
```
3. Install the dependencies
```console
pip install -r requirements.txt
```
Or, with poetry
```console
poetry install
```

> **Note** Poetry automatically creates a virtual environment if one does not exist, and installs the packages into it.
4. Download the [dataset](https://www.kaggle.com/datasets/bhavikbb/password-strength-classifier-dataset/data) from Kaggle, or create a CSV with the following two columns
|Column Name|Order|Description|
|:--:|:---:|:---:|
|`password`|1|Textual password in raw format; hashed passwords are not allowed.|
|`strength`|2|Strength class of the corresponding password, where **0** means WEAK, **1** means MODERATE, and **2** means STRONG.|

## Training the Model
The [`train.py`](#training-customization) script takes the path of the input CSV file as a required argument:
```console
$ python train.py /path/to/data.csv --passwords HELLO hello h3ll000 H3ll00W@rld h3LL00@@102030
....
Running user provided tests
Password: HELLO Strength: WEAK
Password: hello Strength: WEAK
Password: h3ll000 Strength: WEAK
Password: H3ll00W@rld Strength: MODERATE
Password: h3LL00@@102030 Strength: STRONG
```

> **Note** At a bare minimum, only the path of the input CSV file is required to begin training and save the best model. However, I recommend using the `--passwords` argument to predict the classes of custom passwords with the best model.
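Before downloading the full Kaggle dataset, you can generate a tiny CSV in the expected format as a smoke test. The rows below are made-up examples, not real data from the dataset:

```python
import csv

# Made-up sample rows in the required format: 0 = WEAK, 1 = MODERATE, 2 = STRONG
rows = [
    ("hello", 0),
    ("H3ll00W@rld", 1),
    ("h3LL00@@102030", 2),
]

with open("data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["password", "strength"])  # header row with the two expected columns
    writer.writerows(rows)
```

A real training run still needs the full dataset; a three-row file is only useful for checking that the script parses your CSV.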
## Training Customization
Use `python train.py -h` to list the arguments provided by the train.py script:
```console
usage: train.py [-h] [--model-path] [--learning-rate] [--dropout-rate]
                [--epochs] [--batch-size] [--vocab CHARACTERS]
                [--validation-split] [--passwords PASSWORD [PASSWORD ...]]
                file

Password strength classifier trainer script

positional arguments:
  file                  path of the csv dataset file

options:
  -h, --help            show this help message and exit
  --model-path          path to save only weights of the best model (default: trained-models/best_model)
  --learning-rate       learning rate for adam optimizer (default: 0.01)
  --dropout-rate        dropout rate to be used between last hidden layer and the output layer (default: 0.3)
  --epochs              number of epochs to train for (default: 5)
  --batch-size          batch size to use (default: 32)
  --vocab CHARACTERS    vocabulary for the passwords to train the model on (default: None)
  --validation-split    validation split during model training (default: 0.2)
  --passwords PASSWORD [PASSWORD ...]
                        sample passwords to predict and show the performance (default: None)
```

## Using Pretrained Model
If you've already [trained](#training-the-model) the model and want to use it in your application, use the following code.
```py
from pathlib import Path

from keras import ops

from utils import CLASS_NAMES, get_model, get_vectorized

saved_model = Path("/path/to/weights")
if not saved_model.exists():
    raise FileNotFoundError(f"Path '{saved_model}' does not exist!")

# CHARSET must be the same character vocabulary the model was trained with
X = get_vectorized(CHARSET, ["your password"])
model = get_model(X.shape[1:], model_path=saved_model)

y_predicted = model.predict(X, verbose=False)
y_class_id = ops.argmax(y_predicted, axis=1)
y_class = CLASS_NAMES[y_class_id[0]]
```

## Contact Me
Email: tbhaxor _at_ gmail _dot_ com
Discord: @tbhaxor.com
Twitter: @tbhaxor
LinkedIn: @tbhaxor