🌈 Qwen-Image-Edit-MeiTu

Qwen-Image-Edit-MeiTu is an improved variant of Qwen/Qwen-Image-Edit. It is built by fine-tuning the DiT (Diffusion Transformer) architecture to enhance visual consistency, aesthetic quality, and structural alignment in complex edits.

Developed by Valiant Cat AI Lab, this version aims to further close the gap between high-fidelity semantic editing and coherent artistic rendering, producing more natural, professional-looking output across a wide range of prompts and subjects.

✨ Key Improvements

Enhanced Consistency:
Utilizes DiT (Diffusion Transformer) fine-tuning to ensure structural stability between input and edited regions, maintaining global spatial coherence.
Aesthetic Optimization:
Trained with aesthetic discriminators and curated aesthetic score datasets, producing more pleasing colors, contrast, and light balance.
Better Detail Preservation:
Improved low-level reconstruction for fine details such as textures, faces, and typography.
Broader Scene Adaptability:
Performs well on portraits, environments, product photos, and illustrations, supporting both semantic and appearance-based editing.

🖼️ Showcase

Below are examples of consistency and aesthetic improvement in complex editing scenarios:

Input & Output

💬 Recommended Prompts

Try these prompts to explore the model’s strengths:

“make the lighting soft and cinematic with better balance”
“enhance the photo’s composition and maintain realism”
“refine skin tone and texture consistency”
“improve the global color tone and aesthetic harmony”
“increase photo realism and clarity without changing content”

🧩 Integration with ComfyUI

This model works with a modified ComfyUI Qwen-Image-Edit workflow.
Load it in the workflow's UNet node to edit images.
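
A minimal sketch of the file-placement step, assuming a default ComfyUI layout in which the UNet (diffusion model) loader node reads from `models/diffusion_models` (older installs may use `models/unet` instead). The function name, paths, and file name here are hypothetical and are not part of this release; adjust them to your setup.

```python
from pathlib import Path
import shutil


def install_weights(weights_file, comfyui_root, subfolder="diffusion_models"):
    """Copy a downloaded .safetensors file into <comfyui_root>/models/<subfolder>.

    Assumption: the UNet loader node lists files from this folder.
    Pass subfolder="unet" if your ComfyUI install uses the older layout.
    """
    src = Path(weights_file)
    if src.suffix != ".safetensors":
        raise ValueError(f"expected a .safetensors file, got: {src.name}")

    target_dir = Path(comfyui_root) / "models" / subfolder
    target_dir.mkdir(parents=True, exist_ok=True)

    dest = target_dir / src.name
    if not dest.exists():  # do not overwrite an existing copy
        shutil.copy2(src, dest)
    return dest


# Example call (hypothetical paths):
# install_weights("Downloads/Qwen-Image-Edit-MeiTu.safetensors", "ComfyUI")
```

After copying, restart ComfyUI or refresh the node list so the file appears in the loader's dropdown.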

📥 Download Model

Weights available in Safetensors format:

👉 Download Qwen-Image-Edit-MeiTu
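
To sanity-check a downloaded file before loading it, the sketch below reads only the Safetensors header using the Python standard library. It relies on the published Safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function names and the example file name are hypothetical, and the tensor names or dtypes in this particular checkpoint are not documented here.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        raw_len = f.read(8)
        if len(raw_len) != 8:
            raise ValueError("file too short to be a safetensors file")
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", raw_len)
        header_bytes = f.read(header_len)
        if len(header_bytes) != header_len:
            raise ValueError("truncated header; the download may be incomplete")
    return json.loads(header_bytes.decode("utf-8"))


def summarize_header(header):
    """Return (tensor_count, sorted list of dtypes), skipping the metadata entry."""
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    dtypes = sorted({info["dtype"] for info in tensors.values()})
    return len(tensors), dtypes


# Example call (hypothetical file name):
# count, dtypes = summarize_header(read_safetensors_header("Qwen-Image-Edit-MeiTu.safetensors"))
# print(count, dtypes)
```

A truncated or corrupted download typically fails at the header read or JSON parse step, which is a cheaper check than loading the full model in a workflow.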

🧠 Training

This model was trained and optimized by the
AI Laboratory of Chongqing Valiant Cat Technology Co., LTD.
Visit https://vvicat.com/ for business collaborations or research partnerships.

📄 Related Paper

This model is part of the Qwen-Edit+ research line and is associated with the following preprint:

Fan Tang, Siyuan Li
Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation.
Research Square, Version 1, 08 April 2026.
DOI: 10.21203/rs.3.rs-9352857/v1

📚 Citation

If you use this model, please cite:

@article{tang2026qweneditplus,
  author  = {Fan Tang and Siyuan Li},
  title   = {Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation},
  journal = {Research Square},
  year    = {2026},
  doi     = {10.21203/rs.3.rs-9352857/v1},
  url     = {https://doi.org/10.21203/rs.3.rs-9352857/v1}
}

📜 License

Licensed under Apache 2.0.

💼 Join Us

We are hiring research engineers and creative ML practitioners at
Chongqing Valiant Cat Technology Co., LTD — reach out via
📧 tommy@vvicat.com