comfyui

Comfy Use

#5
by Ripcurlsurf - opened

Can this be used or is it under development with Comfy to maximize its potential? I understand some third party nodes are available but quality isn't the best.

Comfy Org org

It is now available in the ComfyUI nightly version. There is no official example workflow yet, but I have a simple test workflow available in the PR: https://github.com/Comfy-Org/ComfyUI/pull/13817

The biggest quality issues are in the model itself, though. We have some workarounds such as the seam smoothing, and with the native implementation you have access to all the different samplers etc., so I'm sure we can find better ways to use the model. Still, the final quality is going to be limited, at least without further training.

Personally, the most interesting use so far has been the reference-based image generation.

It's faster, but I don't think it's better than hidream1, except for the editing stuff, which has come a long way. But what use is that if it's blurry and stitched? Good that it's being worked on, but the competition works just fine.

@kijai. Just curious: are you part of the Comfy team or a very strong, talented supporter? Your name comes up a lot.

Comfy Org org

Started as just custom node dev, but I've been full time part of the official backend team since January now.

ComfyUI_temp_izzoj_00008_

It's really good with text.... and composition etc.
It leaves a little to be desired when it comes to overall quality perhaps (face, hands, details etc).
But these were just the first few attempts ;-) It might be possible to tweak it a bit: prompt, sampler, steps etc., or even a refiner 2nd pass (with the same model or another model).

(haven't tried the ref image part much yet, but that looks really good as well)

(and the spelling in the title is all my fault.. ComfyUI, not Comfy-UI.. prompted a bit too fast.. haha.)

Everything photo-like is blurry and low detail. Reference images seem to work fine for composition and editing, but that doesn't help a weak result. And it doesn't work better than flux.2 or qwen edit.
Also fairly samey images from the same prompt, even with the full model. Edit: that's down to the workflow not really having random seeds, only adding noise in the sampler.

Anatomy seems to be less messy than in the flux.2 models, but that's a low bar that only Ernie could beat (with a shovel)... tbh I keep coming back to qwen edit as a reliable workhorse: in all but the resolution it's nearly as good, and just less fiddly than the others. The first image uses its own prompt. hidream o1 full: hidro1-_00009_ It's not shiny as usual; it's kind of fuzzy, even when prompted as a high quality photo...
Same prompt for all the following:
Hidream o1, full, mxfp8 (the wf says it's higher quality). If you make it small it looks ok; a tiny bit closer and everything is fuzzy (faces). Also kind of boring, the guys look almost cloned (that's the reason I changed the prompt to "different guys"):
hidro1-_00003_

Flux.2 klein9b. Amazing visuals, can't count arms (give the right guy the benefit of the doubt that he is jumping):
flux2-_03062_
klein4b for a change the better:
flux2-_03061_

hidream-i1 Full (the old hidream). I wouldn't say it is better than O1, but it's less fuzzy: ComfyUI_16755_(1)
and for some comedy, Ernie (how has this cf of a model so many likes?) Not just the phantoms but arms almost always look weird :
ComfyUI_16753_

ComfyUI_temp_eqpcy_00024_

ComfyUI_temp_eqpcy_00030_

ComfyUI_temp_eqpcy_00042_

everything photo is blurry and low detail.

Seems ok here. Is it the best model ever? Probably not.. but it's really good with text, and decent at compositions.
The ref image feature is the standout feature though..

Been trying different samplers, undecided on what works best

Comfy Org org

Some of my outputs I've liked:

banodoco_panda

hidream_purple_witch

hidream_pixel

ComfyUI_temp_krebt_00020_

hidream_nier

Some observations, (mostly using reference images):

  • Base model way better, but requires using the seam fix workaround for the tiling issue
  • You can use higher res than the default
  • deis or res_multistep with beta has worked nicely for me, but too many options here to choose best

Also got some good results with res_multistep. Maybe a good candidate ;-)

ComfyUI_temp_eqpcy_00044_

Deis works really well, even got a bit of skin blemishes and details (that was in the prompt)

Comfy Org org

You can also try adding some of the dev distill as a LoRA, not too much or it will burn it: https://huggingface.co/Kijai/hidream-O1-image_comfy/blob/main/loras/hidream_o1_dev_lora_rank_64_bf16_pruned_v1.safetensors

yes that helped a bit as well

tbc, the images as such are ok (though my three guys were still clones) as long as you don't see the fuzziness: far enough out (or small enough), like the size we see here, and apparently outside of photorealism.
But it is meant to do 2048x2048. It's promising but not great. We'll see what people do with it. And thanks for working on it. Now that Qwen Image seems to be going closed (and small), alternatives are good. Except ernie...

Also found that running your prompt through the Gemma 4 text generation in Comfy, together with the instruction from HiDream, vastly improved the output.
I used the prompt instruction from here: https://github.com/HiDream-ai/HiDream-O1-Image/blob/main/prompt_agent.py (but I translated it to English)
It produces a JSON prompt that the model seems to like a lot ;-)

image

I opened the lady 2 posts up in full size. The skin is pure blur. With the black and white old guy further up, even in small size, the hair looks ok but the skin is completely blurry.
Are they just overselling the resolutions it can do? Maybe someone will make an anti-blur or skin lora. That seems to be by far the biggest problem, at least in photorealistic images. And it's a technical problem of the model. You can prompt as much as you like about no blur, sharpness, skin details or no dof, or change samplers, and it still does it.
The black and white guy is a good example of prompting the hell out of it (tons of prompted skin, hair and face details to try to fix the problem), and you end up with these typical 100 year olds, even if you prompt someone aged 40. It's the only way the poor model can cram all these prompted details into a face. But the skin is STILL blurry.

With enhanced prompts like this, btw, I find it a double-edged sword anyway. I'm not buying this new meta of short-story-sized prompts that started with Z-Image, because it's very hard to stop it from changing too much. And even Z-Image does just fine with a simpler prompt; it just always does the same thing with it. If you want a randomized / fancied-up version of a core idea, a long, flowery prompt is great. But for something precise it's often more annoying.
And if a model has problems, like F2 and Ernie with anatomy or HiDream O1 with blur, even a long prompt doesn't change the fundamental flaw. Hence the b/w 100 year old guy.

This model has its strengths and weaknesses ... as any other model, I guess.
But it's open source, so the community will build on it if they want: make fine-tuned models, loras and what not ;-)

So the most important part is that it's open source

Ah, I see a default workflow has been added inside Comfy now,
with a prompt enhancer and more

Try that perhaps. Gives better results

image

Comfy Org org

@RuneXX I noticed you had a Shift adjustment in one of your workflows, and realized I had made a mistake in the initial ModelNoiseScale node that caused two buggy behaviours with shift adjustments:

  • If the Shift node was after ModelNoiseScale, it reset the noise scale to the model default (8.0), making the node adjustment do nothing
  • If ModelNoiseScale was after the Shift node, it reset the shift to the model default

A PR that fixes this has been merged now, and it should work both ways.
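
For anyone wondering why node order mattered, here is a minimal sketch of the general pattern. It is not ComfyUI's actual code: `SamplingConfig`, the setter functions and the default shift of 3.0 are all hypothetical names and values chosen for illustration; only the 8.0 noise scale default comes from the post above. The buggy setters rebuild the settings from model defaults, while the fixed ones start from whatever is currently set.

```python
from dataclasses import dataclass, replace

# Hypothetical stand-in for a model's sampling settings (not a real ComfyUI class).
@dataclass(frozen=True)
class SamplingConfig:
    shift: float = 3.0        # assumed default, for illustration only
    noise_scale: float = 8.0  # model default mentioned in the thread

DEFAULTS = SamplingConfig()

def set_noise_scale_buggy(cfg, noise_scale):
    # Rebuilds from the model defaults, so an earlier shift change is lost.
    return replace(DEFAULTS, noise_scale=noise_scale)

def set_shift_buggy(cfg, shift):
    # Same mistake in the other direction: the noise scale falls back to default.
    return replace(DEFAULTS, shift=shift)

def set_noise_scale(cfg, noise_scale):
    # Starts from the current settings, so the other field survives.
    return replace(cfg, noise_scale=noise_scale)

def set_shift(cfg, shift):
    return replace(cfg, shift=shift)

# Fixed setters give the same result in either order.
a = set_shift(set_noise_scale(DEFAULTS, 6.0), 5.0)
b = set_noise_scale(set_shift(DEFAULTS, 5.0), 6.0)

# Buggy setters: whichever runs last wipes out the other adjustment.
buggy = set_shift_buggy(set_noise_scale_buggy(DEFAULTS, 6.0), 5.0)
```

With the fixed setters `a` and `b` are identical, while `buggy` ends up with the default noise scale again.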

Yes i was just experimenting. trying the shift to see how things improved or got worse ;-)
will try again

The model has some serious strengths (composition, text, and more .. it looks really "artistic" sometimes).
It does lack a bit in the finer-grained details, skin etc., but that might come with community iterations and improvements

I don't know if it's just me, but I really like some of the outputs. They remind me of the days when I did black and white photography. When you do close-up photos, not everything is in focus.
That makes it look more real to me... But I do see why some say the skin is plastic etc. (but that has been said about AI images since sdxl, flux etc.)

To me it looks a bit more like something you'd find in a photography art gallery, while Z-Image looks more like a magazine photo.. or something like that ;-)

(images below are stock comfyui workflow with the fp16 full model and a small dash of Kijai's lora (0.3), with res_multistep sampler... if i remember correctly)

hidream_o1_00014_

hidream_o1_00009_

Did they already release an updated model btw?
https://huggingface.co/HiDream-ai/HiDream-O1-Image-Dev-2604

From the "sales pitch", it sounds like it depends on the prompt refiner, but i guess thats also true for the previous ones

Comfy Org org

The strength of the model is the reference image mode really, as text to image it's just too lacking as it is.

The new model is aimed at improving pose following when using something like an openpose rig as one of the references. Otherwise my initial impression is that it just seems... worse in details, even blurrier etc... and it's dev only. I could still be doing something wrong, I haven't done any extensive tests yet. It definitely does follow the pose more.

Screenshot 2026-05-14 004340

Screenshot 2026-05-14 003836

Screenshot 2026-05-14 011022

ComfyUI_temp_mcrju_00004_

image

(just a low res test run)

yes the ref image way is for sure good fun. And if you have a character, easy to put into different scenes, different clothing, etc etc.

Hey kijai,
I've been trying to get a "Detail Daemon" effect (per-step sigma modulation) working with the HiDream-O1 dev model.
Since my nodes for the model rely on a vendored pipeline.py with custom flow-matching schedulers (FlashFlowMatch / UniPC) rather than ComfyUI's native KSampler infrastructure, standard Detail Daemon hooks completely miss it. We've tried directly modifying the denoising loop and monkey-patching SIGMA_SCHEDULE_MAP to warp the schedule, but it consistently causes stability issues and tensor blowouts.
Is it possible to natively implement support for this kind of sigma modulation directly within your custom denoising loop? Alternatively, is there a recommended, safe way to hook into the pipeline to modulate sigmas per-step without breaking the flow-matching shift math?

I feel this is something the community could benefit from, and it could revitalize the model entirely if it can be executed properly!

Here are my nodes if you want to take a look 🤷‍♂️ Claude just isn't getting it done for me and I keep hitting limits lol. I removed the detail injector (essentially a custom-mapped Detail Daemon) because it was giving grey outputs, and I feel any pipeline changes just ruin the flow entirely. But I have the code if you want to look at that as well. I tried implementing it in the sampler node AND attempted a separate node entirely, with the same greyed results.

https://github.com/RealRebelAI/Rebels_HiDream-01_Image_Dev_NODES/tree/main
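
To make "per-step sigma modulation" concrete, here is a rough sketch of the idea. It is not the Detail Daemon code or anything from the nodes linked above. It assumes the effect boils down to reporting a slightly lower sigma to the model during the middle steps while leaving the actual step schedule untouched; the function names, the half-sine window and the 0.1 amount are all made up for illustration.

```python
import math

def detail_multipliers(num_steps, amount=0.1, start=0.2, end=0.8):
    """Per-step multipliers <= 1.0 with a smooth bump between start and end."""
    mults = []
    for i in range(num_steps):
        t = i / max(num_steps - 1, 1)  # progress through sampling, 0.0 to 1.0
        if start <= t <= end:
            # Half-sine window: 0 at the edges of the range, 1 in the middle.
            w = math.sin(math.pi * (t - start) / (end - start))
        else:
            w = 0.0
        mults.append(1.0 - amount * w)
    return mults

def modulated_sigmas(sigmas, mults):
    """Sigma reported to the model at each step; the step sizes stay as they were."""
    return [s * m for s, m in zip(sigmas, mults)]

steps = 10
schedule = [1.0 - i / steps for i in range(steps + 1)]  # plain linear 1.0 -> 0.0
mults = detail_multipliers(steps)
model_sigmas = modulated_sigmas(schedule[:-1], mults)
```

The first and last steps are left alone here, which is one way to avoid disturbing the start and end of the schedule; whether that is enough to keep a custom flow-matching loop stable is a separate question.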

and for some comedy, Ernie (how has this cf of a model so many likes?) Not just the phantoms but arms almost always look weird :
ComfyUI_16753_

ernie is really good with prompted skin detail, but yes, the ghost limbs are really bad. I initially thought they were resolution dependent, which is not the case; they just break from time to time. Let's also not talk about the training data bias... but ernie is also mostly uncensored, or can at least display normal nudity (no hardcore stuff), whereas hidream o1 has never seen a nipple... That might not be important for a lot of people, but for creating character images it is nice if the base model can do stuff like that...

00001 - a Menah is a caucasian human youthful adult

00003 - a Link (Legend of Zelda ) is a genderbend version
just some test images

Comfy Org org

Here's the new dev checkpoint as a LoRA to experiment with, it's slightly weaker but honestly that's just better... reducing strength helps it not destroy the background too:

https://huggingface.co/Kijai/hidream-O1-image_comfy/blob/main/loras/hidream_o1_image_dev_2604_lora_avg_rankg_224_bf16.safetensors

Comfy Org org

Detail Daemon already works with the base model in ComfyUI with the native implementation though? Just tested it and it's fine. It doesn't really work with the dev model though as that model just smooths everything out so aggressively.

I understand, but I was attempting to address the dev model specifically for that purpose, as the model does wash everything out pretty badly. I was trying to figure out a different way to achieve the detail injection and reject some of the aggressive smoothing without causing hallucinations or forcing the smoothing regardless. It seems it doesn't work that well as is.

I have a set of detailing prompts and ran them through the full model. It gives some variation, but there is still a bit of smoothing happening in the last few steps of the image generation. It also needs a bit more variety in its results, but we can prompt for that as well for the moment. So far each model I tested had its go-to face, and it always helped to prompt in some ethnicity and more detail. Unlike Chroma or Flux (as well as older models), which are limited to a certain prompt length, newer models can be given a lot of detail in the prompt.
Some more examples:
00029 - wearing SPORTSWEAR, athletic tank top and running
00030 - wearing SPORTSWEAR, halter-neck sports top and
00031 - wearing SPORTSWEAR, volleyball jersey and spandex
00032 - wearing SPORTSWEAR, compression crop top and biker

Comfy Org org

It smooths everything out on the last (low) sigmas. If you end the schedule early there's a bit more detail, but also the same patch grid artifacts as with the base model. It looks to me like the dev model has been trained (either on purpose or as a side effect) to smooth out the grid artifacts, which also ends up losing a ton of normal detail. Just a theory, I don't know anything for sure. I have tried various methods to get more quality out of it, and the only really worthwhile approach seems to be some sort of hybrid using the base model with the dev as a LoRA at lower strength.
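
As a rough illustration of "ending the schedule early", here is a sketch that builds a shifted flow-matching sigma schedule and drops the last low-sigma steps. It assumes a plain linear schedule warped by the common time-shift formula `shift * s / (1 + (shift - 1) * s)`; the shift of 3.0, the step count and the helper names are arbitrary choices for the example, not values taken from the model or from ComfyUI.

```python
def shifted_sigmas(num_steps, shift=3.0):
    """Linear 1 -> 0 schedule warped by the usual flow-matching time shift."""
    out = []
    for i in range(num_steps + 1):
        s = 1.0 - i / num_steps
        out.append(shift * s / (1.0 + (shift - 1.0) * s))
    return out

def end_early(sigmas, skip_last=2):
    """Drop the final low-sigma entries so sampling stops above sigma = 0."""
    return sigmas[: len(sigmas) - skip_last]

full = shifted_sigmas(30)
short = end_early(full, skip_last=2)
```

Sampling over `short` stops with a little noise still in the image, which is where the extra detail (and the remaining grid artifacts) would come from.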

Can this be used or is it under development with Comfy to maximize its potential? I understand some third party nodes are available but quality isn't the best.

Works great with WAN2GP. They tend to have day 0 support for weird stuff more often than ComfyUI these days.
It's more memory efficient than Comfy too, so you don't have to make quality tradeoffs

hidream_o1_00058_

Here's the new dev checkpoint as a LoRA to experiment with, it's slightly weaker but honestly that's just better... reducing strength helps it not destroy the background too:
https://huggingface.co/Kijai/hidream-O1-image_comfy/blob/main/loras/hidream_o1_image_dev_2604_lora_avg_rankg_224_bf16.safetensors

Works nicely with that lora ;-)

I don't really know if it helps, but when adding something like "Ultra-realistic, high detail skin texture" at the top of the prompt, the model seems to follow it

Comfy Org org

This model was available unofficially way before Wan2GP, and officially around the same time. There are no memory issues either.
And "they"? Aren't you working on that project yourself? This is a ComfyUI repository and a ComfyUI thread; you basically came here to advertise. Bad look.

To be fair, putting a person into a new image/clothing works very well with flux.2 or qwen image. Add one or two reference images and just prompt some scene like in a normal image model, and they will use the reference person, even more so if you reference them in the prompt. People who use even just editing models (QIE2511), let alone all-in-ones like Flux.2 or O1, only to edit an image are criminally underusing them.

Talking of NSFW, hidream I1 (what's with the naming scheme?) could do a lot of stuff that Flux.1 couldn't. Not outright pron of course, but kinda kinky stuff. Or just simply dirty or damaged stuff. It was generally more flexible and I liked it. Flux only caught up with Krea (underrated). Hidream I1 dev or full was just really slow on my hardware at the time. So I don't want O1 to fail, and I'm glad they seem to be working on it.

RuneXX: Nope that skin is still pure blur, just with some color grading. This is something a lora at the very least needs to fix, not a prompt. Better a model update. Or artsifying it instead of trying photorealism. There are some examples here of non photo stuff where it seems to be quite good.

non photo stuff where it seems to be quite good.

Yes, I noticed it was very good at things that are not aiming for photorealism, and at text, schematics, infographics, advertisement shots etc.
So it definitely has its use cases. And the ref image feature, I think, must be close to the best out there. I haven't tried it a lot in Klein and Qwen, but I can't remember it being that accurate.

And since open source, some derivative models or loras might come ;-)

The odd thing is that photorealistic shots also look great at "normal" size (in posts or sized down). But if you blow one up to 100% and peek at it at 2048x2048, you will see some smudges and blurs.
Perhaps photorealism at 2048px was aiming too far, but who knows.. it's early stages, new model ;-)

Comfy Org org

Another thing I noticed is that using even a little of the dev as a LoRA, even when using cfg, gets rid of the worst of the patch grid artifacts.

Examples with er_sde/beta, 30 steps, cfg 2.0, base + dev 2604 at 0.2 strength, no seam fix.

hidreamcity

hidream_test

Not really optimal still, but I do think there's a good balance to be found like this.
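
For readers unfamiliar with what "dev as a LoRA at 0.2 strength" means numerically, here is a tiny sketch of the standard LoRA merge, `W' = W + strength * (alpha / rank) * (up @ down)`, on toy matrices. The helper names and the numbers are invented for the example; it only illustrates how a low strength nudges the base weights a fraction of the way toward the dev behaviour.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(weight, down, up, strength, alpha=None):
    """Return weight + strength * (alpha / rank) * (up @ down)."""
    rank = len(down)  # down has shape (rank, in_features)
    scale = strength * ((alpha / rank) if alpha is not None else 1.0)
    delta = matmul(up, down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]

base = [[1.0, 0.0], [0.0, 1.0]]  # toy 2x2 base weight
down = [[1.0, 1.0]]              # rank 1, shape (1, 2)
up = [[2.0], [0.0]]              # shape (2, 1)

light = apply_lora(base, down, up, strength=0.2)
full_strength = apply_lora(base, down, up, strength=1.0)
```

At strength 0.2 the weights move only a fifth of the way that the full-strength merge would move them.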

Talking of NSFW, hidream I1 (what's with the naming scheme?) could do a lot of stuff that Flux.1 couldn't. Not outright pron of course, but kinda kinky stuff. Or just simply dirty or damaged stuff. It was generally more flexible and I liked it. Flux only caught up with Krea (underrated). Hidream I1 dev or full was just really slow on my hardware at the time. So I don't want O1 to fail, and I'm glad they seem to be working on it.

It was sad that people did not work more with it. There was an uncensored model on civit, but the base was just overlooked and nothing more came of it. Maybe people were scared off because the stock config wanted 4 text encoders, even though only the llama one was actually needed... Come to think of it, hidream I1 was the first model to work with an LLM instead of just a clip or t5... That part was picked up by other models, but sadly hidream I1 was kind of dead in the water... It was really good at multi-character prompts and such. It outshone a lot of the models at the time, but was not really adopted by the community...

00033 - a lone, single androgynous giant. A colossal,
the workflow with dev lora seems to be a bit better but it still produces way too smooth skin, might need to look into better prompting as well to add more micro detail and not just the macro ones like in the above pic.
00032 - a lone, single sexy female with small breasts,

00031 - a lone, single female blood_elf. a slender,
if we go to full size, it still has tiling and is blocky, especially on the transitions around the characters

I agree with you on many things, but not on LLMs as clips. I don't think it hurts, but in my experience it makes no bloody difference (as CLIP!). Chroma, Wan or Flux Krea, all with T5, work AS well as llm-clip models. I call bs on this llm hype (for clip) that has taken over (each model with a different llm, of course). You prompt slightly differently for t5xl (with much less effort, btw), but that's it. I think Hidream1 was just better trained for new stuff; that was its advantage. Which is proven by Flux.1 Krea, still with a t5 clip, bringing flux to the same level (complex compositions, dirty and wet stuff, text placement...).

I can't believe I missed such an exciting discussion! I'm officially 'debugging' this model, or 'trial and error' might be a more fitting term. Right now, I'm using the RES4LFY 'chain' scheduling method, as shown in the image below. I tried injecting a bit of eta at the initial stage and having it execute one less step at the end. This way, I got an image with a bit of grain, which I personally feel looks a bit more realistic.
By the way, I've quantized the Prompt-Refine model into GGUF format, which can be loaded and inferred using LM Studio. Link: https://huggingface.co/tuolaku/Prompt-Refine-GGUF/tree/main
image
ComfyUI_temp_qvbbp_00005_

Quick question for everyone: how do you handle the issue of facial consistency? I suspect it might be due to the model's limitations. When predicting pixel blocks, the patches are relatively large, which leads to less information and makes it hard to maintain consistency. However, I can't quite explain why consistency is much better with other objects.

ComfyUI_temp_remnm_00001_

Comfy Org org

I made a node (available for testing in KJNodes) to see each step's result more easily. It will also show you a full resolution preview if you want; since this is a pixel space model, that works especially well.

For photorealism, what are the best settings for the clownshark sampler? I find the standard settings give me doll-like skin. The loras set at .7 help a bit, but nothing like flux standards. I have 32 GB of VRAM, so I have a lot of room to play with

I think we're fairly sure now that this is just the model's weakness. Prompting and samplers can only help so much. It needs a real fix (lora) for its skin issues.

I've seen Ai Toolkit has been updated to support it. Let's see what it can do with some help.

I found a simple solution in a workflow from civitai that works remarkably well. It takes the output image from hidream and puts it through a z-image turbo run with 4 steps (I mostly use 6) and 0.35 denoising, while copying in the same prompt as for the hidream image. The results are pretty good even for the blurry or shiny skin, even with the upscaling step at the end disabled (which takes longer than the z-image run)..
I changed it for myself to avoid another custom node (and add my own ones :P), but this is the principle:

image

the og civitai workflow site: https://civitai.com/models/2629261/hidream-o1-dev-2604-z-image-turbo-refiner?modelVersionId=2952028
Although i'm fairly sure the negative prompt doesn't do anything on a cfg1 distilled model like ZIT.
before and after (6 step z.image, end-upscaling disabled):
hidro1-_00252_
ComfyUI_16951_(1)
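
A rough sketch of what "6 steps at 0.35 denoising" means for the second pass, for anyone who wants the arithmetic. This is a simplified illustration, not the workflow's or ComfyUI's actual code: it assumes a plain linear flow-matching schedule, the common trick of keeping only the tail of a longer schedule, and a flow-matching style noise blend; all function names are made up.

```python
import random

def linear_sigmas(total_steps):
    """Linear schedule from 1.0 (pure noise) down to 0.0 (clean image)."""
    return [1.0 - i / total_steps for i in range(total_steps + 1)]

def partial_schedule(steps, denoise):
    """Keep only the tail of a longer schedule, so `steps` steps cover roughly `denoise` of it."""
    total = int(steps / denoise)
    return linear_sigmas(total)[-(steps + 1):]

def add_noise(values, sigma, rng):
    """Flow-matching style blend: (1 - sigma) * image + sigma * gaussian noise."""
    return [(1.0 - sigma) * v + sigma * rng.gauss(0.0, 1.0) for v in values]

sigmas = partial_schedule(steps=6, denoise=0.35)
rng = random.Random(0)
noisy = add_noise([0.5, 0.25, 0.75], sigmas[0], rng)  # toy stand-in for the first-pass image
```

The refiner starts at a sigma of roughly 0.35, so about two thirds of the first-pass image survives the noising, and the second model only has to redo the fine detail.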

thats actually my dual pass workflow haha <3

I discovered that the zit upscaler is more or less available as a template in Comfy, but it never crossed my mind to just attach it (and zit is so fast), or to realize how good it is against the new King of blurry ... everything.
I remember I even have an old flux1+zit workflow somewhere, to use my old flux loras. But it works great.
But btw, I'm not convinced by the upscale-downscale thing at the end. I seem to get worse results with it (it gets a bit too noisy). I tried a few variations. But the main thing is usually very nearly perfect.

My new problem: in all workflows that use the checkpoint trick, my lora (trained with AI Toolkit) doesn't work. At all. Even cranked up to 3.x weight. It only works with the workflow that uses the special Hidream-o1 nodes. But that hidream-o1 lora loader doesn't connect to normal nodes (I doubt it would help).
