https://github.com/rudy2steiner/animephotomaker
About PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
anime maker photo
- Host: GitHub
- URL: https://github.com/rudy2steiner/animephotomaker
- Owner: rudy2steiner
- Created: 2024-02-24T15:22:22.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-02-24T15:53:05.000Z (9 months ago)
- Last Synced: 2024-02-24T16:43:31.924Z (9 months ago)
- Topics: anime, maker, photo
- Language: TypeScript
- Homepage: https://www.animephotomaker.com
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
README
## PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
[[Paper](https://huggingface.co/papers/2312.04461)] [[Project Page](https://photo-maker.github.io)] [[Model Card](https://huggingface.co/TencentARC/PhotoMaker)][[🤗 Demo (Realistic)](https://huggingface.co/spaces/TencentARC/PhotoMaker)] [[🤗 Demo (Stylization)](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style)]
If the ID fidelity is not enough for you, please try our [stylization application](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style); you may be pleasantly surprised.
---
Official implementation of **[PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding](https://huggingface.co/papers/2312.04461)**.
### 🌠 **Key Features:**
1. Rapid customization **within seconds**, with no additional LoRA training.
2. Ensures impressive ID fidelity, offering diversity, promising text controllability, and high-quality generation.
3. Can serve as an **Adapter** to collaborate with other Base Models alongside LoRA modules in the community.

---
![photomaker_demo_fast](https://github.com/TencentARC/PhotoMaker/assets/21050959/e72cbf4d-938f-417d-b308-55e76a4bc5c8)
## 🚩 **New Features/Updates**
- ✅ Jan. 15, 2024. We release PhotoMaker.

---
## 🔥 **Examples**
### Realistic generation
- [![Huggingface PhotoMaker](https://img.shields.io/static/v1?label=Demo&message=Huggingface%20Gradio&color=orange)](https://huggingface.co/spaces/TencentARC/PhotoMaker)
- [**PhotoMaker notebook demo**](photomaker_demo.ipynb)
### Stylization generation
Note: compared with realistic generation, this only changes the base model and adds LoRA modules for better stylization.
- [![Huggingface PhotoMaker-Style](https://img.shields.io/static/v1?label=Demo&message=Huggingface%20Gradio&color=orange)](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style)
- [**PhotoMaker-Style notebook demo**](photomaker_style_demo.ipynb)
# 🔧 Dependencies and Installation
- Python >= 3.8 ([Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html) is recommended)
- [PyTorch >= 2.0.0](https://pytorch.org/)
```bash
pip install -r requirements.txt
```

# ⏬ Download Models
The model will be downloaded automatically by the following two lines:

```python
from huggingface_hub import hf_hub_download
photomaker_path = hf_hub_download(repo_id="TencentARC/PhotoMaker", filename="photomaker-v1.bin", repo_type="model")
```

You can also download it manually from this [url](https://huggingface.co/TencentARC/PhotoMaker).
# 💻 How to Test
## Use like [diffusers](https://github.com/huggingface/diffusers)
- Dependency
```py
import torch
import os
from diffusers.utils import load_image
from diffusers import EulerDiscreteScheduler
from photomaker.pipeline import PhotoMakerStableDiffusionXLPipeline

# Assumptions (not defined in the original snippet): an example SDXL-based
# checkpoint and a device choice. `photomaker_path` comes from the download step.
base_model_path = "SG161222/RealVisXL_V3.0"
device = "cuda" if torch.cuda.is_available() else "cpu"

### Load base model
pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(
    base_model_path,  # can change to any base model based on SDXL
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
    variant="fp16"
).to(device)

### Load PhotoMaker checkpoint
pipe.load_photomaker_adapter(
    os.path.dirname(photomaker_path),
    subfolder="",
    weight_name=os.path.basename(photomaker_path),
    trigger_word="img"  # define the trigger word
)

pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

### Also can cooperate with other LoRA modules
# pipe.load_lora_weights(os.path.dirname(lora_path), weight_name=lora_model_name, adapter_name="xl_more_art-full")
# pipe.set_adapters(["photomaker", "xl_more_art-full"], adapter_weights=[1.0, 0.5])

pipe.fuse_lora()
```

- Input ID Images
```py
### define the input ID images
input_folder_name = './examples/newton_man'
image_basename_list = os.listdir(input_folder_name)
image_path_list = sorted([os.path.join(input_folder_name, basename) for basename in image_basename_list])

input_id_images = []
for image_path in image_path_list:
    input_id_images.append(load_image(image_path))
```

- Generation
```py
# Note that the trigger word `img` must follow the class word for personalization
prompt = "a half-body portrait of a man img wearing the sunglasses in Iron man suit, best quality"
negative_prompt = "(asymmetry, worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), open mouth, grayscale"
num_steps = 50  # assumption: sampling step count, not set in the original snippet
generator = torch.Generator(device=device).manual_seed(42)
# `.images[0]` is a single PIL image, so save it directly
image = pipe(
    prompt=prompt,
    input_id_images=input_id_images,
    negative_prompt=negative_prompt,
    num_images_per_prompt=1,
    num_inference_steps=num_steps,
    start_merge_step=10,
    generator=generator,
).images[0]
image.save('out_photomaker.png')
```

## Start a local gradio demo
Run the following command:

```bash
python gradio_demo/app.py
```

You can customize the demo by editing [this file](gradio_demo/app.py).
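
The generation example requires the trigger word `img` to come right after the class word (for example `man img`). The sketch below is one way to check a prompt before running the pipeline. It is an illustrative helper, not PhotoMaker's own validation: the function name `trigger_follows_class_word` and its word-splitting rule are assumptions.

```python
import re


def trigger_follows_class_word(prompt: str, class_word: str, trigger_word: str = "img") -> bool:
    """Return True if `trigger_word` occurs exactly once, directly after `class_word`.

    Hypothetical pre-flight check: words are split on anything that is not a
    letter, digit, underscore, or hyphen, so trailing commas are ignored.
    """
    tokens = re.findall(r"[A-Za-z0-9_-]+", prompt)
    positions = [i for i, token in enumerate(tokens) if token == trigger_word]
    if len(positions) != 1:
        return False  # missing trigger word, or more than one occurrence
    index = positions[0]
    return index > 0 and tokens[index - 1].lower() == class_word.lower()


prompt = "a half-body portrait of a man img wearing the sunglasses in Iron man suit, best quality"
if not trigger_follows_class_word(prompt, "man"):
    raise ValueError("The trigger word `img` must directly follow the class word.")
```

Checking the prompt up front gives a clear error message before any model weights are loaded.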
## Usage Tips:
- Upload more photos of the person to be customized to improve ID fidelity. If the input is an Asian face, consider adding 'asian' before the class word, e.g., `asian woman img`.
- If the generated face looks too realistic when stylizing, adjust the Style strength to 30-50. The larger the number, the lower the ID fidelity but the better the stylization. You could also try other base models or LoRAs with good stylization effects.
- For faster speed, reduce the number of generated images and sampling steps. Note, however, that reducing the sampling steps may compromise ID fidelity.

# 🤗 Acknowledgements
- T2I-Adapter is co-hosted by Tencent ARC Lab and Nankai University [MCG-NKU](https://mmcheng.net/cmm/).
- Inspired by many excellent demos and repos, including [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter), [multimodalart/Ip-Adapter-FaceID](https://huggingface.co/spaces/multimodalart/Ip-Adapter-FaceID), [FastComposer](https://github.com/mit-han-lab/fastcomposer), and [T2I-Adapter](https://github.com/TencentARC/T2I-Adapter). Thanks for their great work!
- Thanks to the Venus team at Tencent PCG for their feedback and suggestions.

# Disclaimer
This project strives to positively impact the domain of AI-driven image generation. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and use it responsibly. The developers do not assume any responsibility for potential misuse by users.

# BibTeX
If you find PhotoMaker useful for your research and applications, please cite using this BibTeX:

```bibtex
@article{li2023photomaker,
  title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
  author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
  journal={arXiv preprint arXiv:2312.04461},
  year={2023}
}
```