---
tags:
- text-to-image
- layout-to-image
- stable-diffusion
- controlnet
license: agpl-3.0
language:
- en
---
| |
<h1 style="font-size:1.5em;" align="center"> Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive (ICLR 2024) </h1>

<div align="center">

[**Project Page**](https://yumengli007.github.io/ALDM/) **|** [**ArXiv**](https://arxiv.org/abs/2401.08815) **|** [**Code**](https://github.com/boschresearch/ALDM)
</div>
|
|
<div align="center">
This model repo contains checkpoints trained on the Cityscapes and ADE20K datasets using the method proposed in <a href="https://yumengli007.github.io/ALDM/">ALDM</a>.
For usage instructions, please refer to our <a href="https://github.com/boschresearch/ALDM">GitHub repository</a>.
</div>
|
|
## Model information

[ade20k_step9.ckpt](ade20k_step9.ckpt) and [cityscapes_step9.ckpt](cityscapes_step9.ckpt) are the pretrained diffusion model checkpoints for inference on ADE20K and Cityscapes, respectively.
|
|
[encoder_epoch_50.pth](encoder_epoch_50.pth), [decoder_epoch_50_20cls.pth](decoder_epoch_50_20cls.pth), and [decoder_epoch_50_151cls.pth](decoder_epoch_50_151cls.pth)
are the segmentation models used to initialize the discriminator during training,
adopted from the pretrained UperNet101 available [here](https://github.com/CSAILVision/semantic-segmentation-pytorch).
|
|