Multilingual, Multimodal, Multidomain (M3) Model
- Host: GitHub
- URL: https://github.com/huu4ontocord/m3rlin
- Owner: huu4ontocord
- License: apache-2.0
- Created: 2023-10-19T19:18:15.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-22T22:28:51.000Z (about 1 year ago)
- Last Synced: 2024-02-16T02:29:21.384Z (9 months ago)
- Language: Python
- Size: 52.7 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# M3rlin
Multilingual, Multimodal, Multidomain (M3) Model

### Details
- We use the OpenLM and FMengine training code, which can run on JUWELS.
- The code in this repo is the M3rlin-specific code: data loading, interleaving of embeddings, and an extra MSE loss.
- Clone or pip install the OpenLM or FMengine code directly to use it for training.

### TODO:
- add extraction of image embeddings from a Hugging Face dataset or JSONL (JSONL is usually faster)
- test that the embeddings are saved in WebDataset format
- test loading embeddings and token IDs in train.py (see the WebDataset sketch after this list)
- write the code to insert the placeholder token_id into the token sequences (sketched below)
- embeddings should be saved in a 3D tensor (batch, embedding_id, embedding_dim) and returned
- positions are an array of 3D tensors (batch, sequence_id, column_id)
- need to add the MSE loss (see the loss sketch below)
- confirm and test the MSE loss
- Add up-projection and down-projection for the embeddings input and output embedding (see the projection sketch below)
- Add PEFT and freezing of the base model (see the PEFT sketch below)
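
### Sketches

The snippets below are illustrative sketches of the TODO items above, not code from this repo; module names, shapes, and field keys are assumptions.

One way the embedding extraction and WebDataset round trip could look, using the `webdataset` library; the shard pattern and the `emb.npy`/`tokens.json` field names are made up for the example:

```python
# Sketch: save precomputed image embeddings plus token IDs to WebDataset shards
# and read them back. Field names and shapes are assumptions, not the repo's code.
import io
import json

import numpy as np
import webdataset as wds


def write_shard(samples, shard_path="embeddings-000000.tar"):
    """samples: iterable of (embedding ndarray, list-of-token-ids) pairs."""
    with wds.TarWriter(shard_path) as sink:
        for i, (emb, token_ids) in enumerate(samples):
            buf = io.BytesIO()
            np.save(buf, emb.astype(np.float32))  # serialize embedding as .npy bytes
            sink.write({
                "__key__": f"{i:08d}",
                "emb.npy": buf.getvalue(),
                "tokens.json": json.dumps(token_ids).encode("utf-8"),
            })


def load_shards(pattern="embeddings-{000000..000009}.tar"):
    """Yield (embedding, token_ids) pairs, decoding the raw bytes ourselves."""
    ds = wds.WebDataset(pattern).to_tuple("emb.npy", "tokens.json")
    for emb_bytes, tok_bytes in ds:
        emb = np.load(io.BytesIO(emb_bytes))
        token_ids = json.loads(tok_bytes)
        yield emb, token_ids
```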
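A possible shape for inserting a placeholder token wherever an embedding should be interleaved and recording its position; the `EMB_TOKEN_ID` value and the helper name are hypothetical:

```python
# Sketch: interleave a placeholder <emb> token into the token sequence and
# record where each one lands, so embeddings can be injected at those positions.
import torch

EMB_TOKEN_ID = 32000  # hypothetical ID of the placeholder <emb> token


def insert_embedding_tokens(token_ids, insert_points):
    """token_ids: list[int]; insert_points: sorted indices where an embedding goes.

    Returns the new token list and the index of every placeholder in it."""
    out, positions = [], []
    next_insert = 0
    for i, tok in enumerate(token_ids):
        if next_insert < len(insert_points) and i == insert_points[next_insert]:
            positions.append(len(out))
            out.append(EMB_TOKEN_ID)
            next_insert += 1
        out.append(tok)
    return out, positions


# Batched embeddings then live in a 3D tensor (batch, embedding_id, embedding_dim),
# with a matching index tensor for the placeholder positions, e.g.:
batch_embeddings = torch.zeros(2, 4, 768)               # (batch, embedding_id, embedding_dim)
batch_positions = torch.zeros(2, 4, dtype=torch.long)   # position of each embedding in its sequence
```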
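A sketch of the extra MSE loss between the hidden states at the placeholder positions and the target embeddings; the weighting and the exact tensors involved are assumptions:

```python
# Sketch: MSE term added on top of the usual language-modeling loss, comparing
# the model's hidden states at the placeholder positions with the target embeddings.
import torch
import torch.nn.functional as F


def embedding_mse_loss(hidden_states, target_embeddings, positions):
    """hidden_states: (batch, seq_len, hidden_dim)
    target_embeddings: (batch, n_emb, hidden_dim)
    positions: (batch, n_emb) index of each placeholder token in its sequence."""
    batch_idx = torch.arange(hidden_states.size(0)).unsqueeze(-1)   # (batch, 1)
    predicted = hidden_states[batch_idx, positions]                 # (batch, n_emb, hidden_dim)
    return F.mse_loss(predicted, target_embeddings)


# total_loss = lm_loss + mse_weight * embedding_mse_loss(hidden, target_emb, positions)
```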
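A sketch of the up-projection and down-projection between the external embedding dimension and the model's hidden size; the dimensions are placeholders:

```python
# Sketch: up-project external embeddings into the model's hidden size before
# interleaving, and down-project hidden states back out for the MSE loss.
import torch.nn as nn


class EmbeddingProjector(nn.Module):
    def __init__(self, emb_dim=768, hidden_dim=4096):
        super().__init__()
        self.up = nn.Linear(emb_dim, hidden_dim)    # external embedding -> model hidden size
        self.down = nn.Linear(hidden_dim, emb_dim)  # model hidden size -> external embedding

    def project_in(self, embeddings):   # (batch, n_emb, emb_dim) -> (batch, n_emb, hidden_dim)
        return self.up(embeddings)

    def project_out(self, hidden):      # (batch, n_emb, hidden_dim) -> (batch, n_emb, emb_dim)
        return self.down(hidden)
```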
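A sketch of freezing the base model and adding LoRA adapters with the Hugging Face `peft` library; the target module names and ranks are assumptions:

```python
# Sketch: freeze all base-model parameters, then attach LoRA adapters so only
# the adapters (and any separately added projection layers) are trained.
from peft import LoraConfig, get_peft_model


def wrap_with_lora(model):
    # freeze every base-model parameter first
    for param in model.parameters():
        param.requires_grad = False

    lora_config = LoraConfig(
        r=8,
        lora_alpha=16,
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # assumed attention projection names
        task_type="CAUSAL_LM",
    )
    return get_peft_model(model, lora_config)
```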