https://github.com/Zheng-Chong/FashionMatrix

Fashion Matrix is dedicated to bridging various visual and language models and continuously refining its capabilities as a comprehensive fashion AI assistant. This project will continue to update new features and optimization effects.

Host: GitHub
URL: https://github.com/Zheng-Chong/FashionMatrix
Owner: Zheng-Chong
License: mit
Created: 2023-07-08T11:27:00.000Z (almost 2 years ago)
Default Branch: v2-0
Last Pushed: 2023-08-28T04:21:06.000Z (almost 2 years ago)
Last Synced: 2023-08-28T05:47:41.934Z (almost 2 years ago)
Language: Jupyter Notebook
Homepage: https://zheng-chong.github.io/FashionMatrix/
Size: 56.7 MB
Stars: 51
Watchers: 4
Forks: 1
Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

Awesome-Segment-Anything - [code

README

# Fashion Matrix: Editing Photos by Just Talking

[![Framework: PyTorch](https://img.shields.io/badge/Framework-PyTorch-orange.svg)](https://pytorch.org/)

[![License](https://img.shields.io/badge/License-MIT-red.svg)](https://opensource.org/licenses/MIT)

[[`Project page`](https://zheng-chong.github.io/FashionMatrix/)]

[[`ArXiv`](https://arxiv.org/abs/2307.13240)]

[[`PDF`](https://arxiv.org/pdf/2307.13240.pdf)]

[[`Video`](https://www.youtube.com/watch?v=1z-v0RSleMg&t=3s)]

[[`Demo(temporarily offline)`](https://0742dc8730a5a94a7a.gradio.live)]

Fashion Matrix is dedicated to bridging various visual and language models and continuously refining its capabilities as a comprehensive fashion AI assistant. 

The project will continue to receive new features and optimizations.



  



## Updates

- **`2023/08/01`**: **Code** of v2.0 is released.

- **`2023/08/01`**: **Code** of v1.1 is released. Some details differ from the original version described in the paper.

- **`2023/08/01`**: [**Demo(Label) v1.1**](https://0742dc8730a5a94a7a.gradio.live) with the new *AI model* function and security updates is released.

- **`2023/07/28`**: Demo(Label) v1.0 is released.

- **`2023/07/26`**: [**Video**](https://www.youtube.com/watch?v=1z-v0RSleMg&t=3s) and [**Project Page**](https://zheng-chong.github.io/FashionMatrix/) are released.

- **`2023/07/25`**: [**Arxiv Preprint**](https://arxiv.org/abs/2307.13240) is released.

## Versions

**August 28, 2023**

*Fashion Matrix (Label version) v2.0*

We simplified the use of the support models: this version needs fewer models and less GPU memory, while retaining the original image resolution (up to 1024x1024).

**August 01, 2023**

*Fashion Matrix (Label version) v1.1*

We updated the use of ControlNet, which currently uses the inpaint, openpose, lineart, and (softedge) models.

+ Added the **AI model** task, which replaces the model (the person) while keeping the pose and outfit.

+ Added **NSFW (Not Safe For Work) detection** to prevent inappropriate use.

**July 28, 2023**

*Fashion Matrix (Label version) v1.0*

+ Basic functions: replace, remove, add, and recolor.

## Installation

Follow the steps in the [Installation Guide](INSTALL.md) for environment configuration and model deployment. All models except the LLM can be deployed on a single GPU with 13G+ VRAM.

(At the cost of some functions, a simplified version of Fashion Matrix can be realized without the LLM. This simplified version may be released in the future.)
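Before deploying the models, it can help to confirm that a GPU meets the VRAM requirement stated above. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes that "13G" means 13 GiB and that the NVIDIA driver's `nvidia-smi` tool is installed.

```python
import subprocess

# Assumption: "13G" is read as 13 GiB; nvidia-smi reports memory in MiB.
REQUIRED_MIB = 13 * 1024


def gpus_with_enough_vram(smi_output: str, required_mib: int = REQUIRED_MIB) -> list[int]:
    """Return the indices of GPUs whose total memory meets the requirement.

    `smi_output` is the text printed by
    `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`,
    which is one integer (MiB) per line, one line per GPU.
    """
    totals = [int(line) for line in smi_output.splitlines() if line.strip()]
    return [index for index, total in enumerate(totals) if total >= required_mib]


def query_total_memory() -> str:
    """Ask the NVIDIA driver for each GPU's total memory in MiB."""
    completed = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

On a machine with an NVIDIA driver, `gpus_with_enough_vram(query_total_memory())` returns the indices of the GPUs that qualify. Keeping the parsing separate from the `nvidia-smi` call means the threshold logic can be checked without a GPU.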

## Acknowledgement

Our work is based on the following excellent works:

[Realistic Vision](https://civitai.com/models/4201/realistic-vision-v20) is a fine-tuned model derived from [Stable Diffusion](https://github.com/Stability-AI/stablediffusion) v1.5, designed to enhance the realism of generated images, with a particular focus on human portraits.

[ControlNet](https://github.com/lllyasviel/ControlNet-v1-1-nightly) v1.1 offers more comprehensive and user-friendly conditional control models, enabling [the concurrent utilization of multiple ControlNets](https://huggingface.co/docs/diffusers/v0.18.2/en/api/pipelines/controlnet#diffusers.StableDiffusionControlNetPipeline). This significantly broadens the potential and applicability of text-to-image techniques.

[BLIP](https://github.com/salesforce/BLIP) provides rapid visual question answering within our system.

[Grounded-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything) creates a very interesting demo by combining [Grounding DINO](https://github.com/IDEA-Research/GroundingDINO) and [Segment Anything](https://github.com/facebookresearch/segment-anything), which aims to detect and segment anything with text inputs.

[Matting Anything Model (MAM)](https://github.com/SHI-Labs/Matting-Anything) is an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.

[Detectron2](https://github.com/facebookresearch/detectron2) is a next-generation library that provides state-of-the-art detection and segmentation algorithms. The DensePose code we adopted is based on Detectron2.

[Graphonomy](https://github.com/Gaoyiminggithub/Graphonomy) provides fast and easy parsing of the different regions of the human body.
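The concurrent use of multiple ControlNets mentioned above can be sketched with the `diffusers` library, whose `StableDiffusionControlNetPipeline` accepts a list of ControlNets. This is an illustrative sketch, not the repository's own code: the function names, the two checkpoint IDs, and the conditioning weights are assumptions, and the base model ID is left for the caller to supply.

```python
# Hugging Face model IDs for two ControlNet v1.1 checkpoints (assumed here
# for illustration; the repository may load its weights differently).
CONTROLNET_IDS = [
    "lllyasviel/control_v11p_sd15_openpose",
    "lllyasviel/control_v11p_sd15_lineart",
]


def build_multi_controlnet_pipeline(base_model_id: str):
    """Load a Stable Diffusion v1.5-style pipeline driven by several ControlNets."""
    # Imported lazily so that defining this function does not need the GPU stack.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    controlnets = [
        ControlNetModel.from_pretrained(model_id, torch_dtype=torch.float16)
        for model_id in CONTROLNET_IDS
    ]
    # Passing a list of ControlNets enables multi-condition control.
    return StableDiffusionControlNetPipeline.from_pretrained(
        base_model_id, controlnet=controlnets, torch_dtype=torch.float16
    )


def generate(pipe, prompt: str, pose_image, lineart_image):
    """Run one generation; condition images follow the order of CONTROLNET_IDS."""
    result = pipe(
        prompt,
        image=[pose_image, lineart_image],
        controlnet_conditioning_scale=[1.0, 0.8],  # illustrative weights
        num_inference_steps=30,
    )
    return result.images[0]
```

The caller would typically move the pipeline to the GPU with `pipe.to("cuda")` before calling `generate`. Each condition image is paired with the ControlNet at the same list position, so the order of the images must match the order of the checkpoints.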

## Citation

```bibtex
@misc{chong2023fashion,
  title={Fashion Matrix: Editing Photos by Just Talking},
  author={Zheng Chong and Xujie Zhang and Fuwei Zhao and Zhenyu Xie and Xiaodan Liang},
  year={2023},
  eprint={2307.13240},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
