---
tags:
- text-to-image
- layout-to-image
- stable-diffusion
- controlnet
license: agpl-3.0
language:
- en
---
| |
<h1 style="font-size:1.5em;" align="center"> Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive (ICLR 2024) </h1>

<div align="center">

[**Project Page**](https://yumengli007.github.io/ALDM/) **|** [**ArXiv**](https://arxiv.org/abs/2401.08815) **|** [**Code**](https://github.com/boschresearch/ALDM)
</div>
|
|
<div align="center">
This model repo contains checkpoints trained on the Cityscapes and ADE20K datasets using the method proposed in <a href="https://yumengli007.github.io/ALDM/">ALDM</a>.
For usage instructions, please refer to our <a href="https://github.com/boschresearch/ALDM">GitHub repository</a>.
</div>
|
|
## Model information

[ade20k_step9.ckpt](ade20k_step9.ckpt) and [cityscapes_step9.ckpt](cityscapes_step9.ckpt) are the pretrained diffusion model checkpoints for inference on ADE20K and Cityscapes, respectively.
|
|
[encoder_epoch_50.pth](encoder_epoch_50.pth), [decoder_epoch_50_20cls.pth](decoder_epoch_50_20cls.pth), and [decoder_epoch_50_151cls.pth](decoder_epoch_50_151cls.pth)
are the segmentation models used to initialize the discriminator during training,
adopted from the pretrained UperNet101 available [here](https://github.com/CSAILVision/semantic-segmentation-pytorch).
|
|