Effect of training the LLM Adapter for LoRAs
According to the model page and Circlestone Labs' own recommendation, you are not supposed to train the LLM Adapter.
However, after training around 20 different LoRAs across Anima Preview 1/2/3, with and without 'llm_adapter_lr=0', I honestly can't tell which approach is better.
I've had some LoRAs where I liked the results more with the adapter trained, and some where I liked them more with adapter training disabled. For reference, I am using sd-scripts for the LoRA training, mostly with anime tag captioning.
Does anyone have any further insight on this?
Are you sure you trained the LLM Adapter? You need to set --network_args "train_llm_adapter=True" to make sd-scripts train the LLM Adapter.
I didn't specify the flag, yet the LoRAs produce different results at the same epochs despite using an identical dataset, config, and seed for training. According to Claude, there may be some influence on the LoRA training conditions even if 'train_llm_adapter=True' isn't specified. That could also just be an AI hallucination, though.
There is also the case of circlestone_labs themselves recommending that LLM adapter training be disabled with 'llm_adapter_lr=0':
https://civitai.com/models/2536147/greg-rutkowski-style-anima
Of course, this is for the diffusion-pipe trainer, so I can't confirm whether it also applies to sd-scripts. But if we assume that the default Anima behavior is for the LLM adapter not to be trained in any capacity, then explicitly adding llm_adapter_lr=0 for LoRA training would seem kind of pointless.
Please stop asking AI and start reading the documentation properly :)
https://github.com/kohya-ss/sd-scripts/blob/main/docs/anima_train_network.md
Obviously, diffusion-pipe != sd-scripts
Diffusion-pipe works differently from sd-scripts: in diffusion-pipe you have to specify that setting, otherwise it trains the adapter. In sd-scripts the adapter won't be trained unless you specify train_llm_adapter=True. This is stated in the anima_train_network.md docs in sd-scripts, and you can see it in the LoRA code: train_llm_adapter = kwargs.get("train_llm_adapter", "false") (line 239 of lora_anima.py).
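To make the opt-in behavior concrete, here is a minimal Python sketch of how a string-valued network argument can gate which modules receive LoRA layers. It is not the sd-scripts implementation: the helpers parse_bool_arg and select_lora_targets and the module names in both lists are hypothetical, and only the 'train_llm_adapter' key and its "false" default come from the line quoted above.

```python
# Illustrative sketch only; not copied from sd-scripts.
# Both lists use made-up module-class names for the example.
EXAMPLE_DIT_TARGETS = ["ExampleDiTBlock"]
EXAMPLE_ADAPTER_TARGETS = ["ExampleAdapterBlock"]


def parse_bool_arg(kwargs: dict, key: str, default: str = "false") -> bool:
    """Read a flag that arrives as a string, e.g. from a key=value argument."""
    value = kwargs.get(key, default)
    return str(value).strip().lower() == "true"


def select_lora_targets(kwargs: dict) -> list:
    """Return the module classes that would receive LoRA layers."""
    targets = list(EXAMPLE_DIT_TARGETS)
    # The adapter blocks are only added when the flag is explicitly enabled.
    if parse_bool_arg(kwargs, "train_llm_adapter"):
        targets += EXAMPLE_ADAPTER_TARGETS
    return targets


print(select_lora_targets({}))
print(select_lora_targets({"train_llm_adapter": "True"}))
```

With a gate of this shape, leaving the flag out means no LoRA modules exist for the adapter at all, so there would be nothing for an adapter learning rate to act on.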
LoRAs will never be identical, even if the dataset, seed, and parameters are all the same. You could maybe get deterministic results if you turned off all attention types, but that's unrealistic.
I've read the documentation :)
Documentation != code
You are free to explain in exact details how these flags interact based on the code.
- The llm_adapter_lr flag is unused by anima_train_network.py and is only used by the full fine-tuning script anima_train.py. Instead, to adjust the LLM Adapter lr, you need to adjust it with something like "network_reg_lrs=.*llm_adapter.*=5e-5".
- You need to set --network_args "train_llm_adapter=True" to make sd-scripts train the LLM Adapter. train_llm_adapter is False by default. If train_llm_adapter is False, LoRANetwork.ANIMA_ADAPTER_TARGET_REPLACE_MODULE is excluded from LoRA creation and no LoRA modules are created for the adapter blocks: https://github.com/kohya-ss/sd-scripts/blob/502cc3fab2aa22c106580e2e05c4692cfde5e5ff/networks/lora_anima.py#L539-L540
TLDR: llm_adapter_lr flag is unused by the lora training script, and sd-scripts does not train the LLM Adapter by default.
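A value such as ".*llm_adapter.*=5e-5" pairs a regular expression over module names with a learning rate. The sketch below shows that general idea in plain Python. The parsing rules (comma-separated items, split on the last '='), the use of full-string matching, the function names, and the example module names are all assumptions for illustration and may differ from what sd-scripts actually does.

```python
import re


def parse_reg_lrs(spec: str) -> list:
    """Parse 'pattern=lr,pattern=lr' into (compiled regex, lr) pairs."""
    rules = []
    for item in spec.split(","):
        pattern, _, lr = item.rpartition("=")  # split on the last '='
        rules.append((re.compile(pattern), float(lr)))
    return rules


def lr_for(name: str, rules: list, default_lr: float) -> float:
    """First matching rule wins; otherwise fall back to the default lr."""
    for pattern, lr in rules:
        if pattern.fullmatch(name):
            return lr
    return default_lr


rules = parse_reg_lrs(".*llm_adapter.*=5e-5")
# Hypothetical module names, chosen only to show one match and one miss.
for name in ["lora_llm_adapter_blocks_0_attn", "lora_blocks_3_mlp"]:
    print(name, lr_for(name, rules, default_lr=1e-4))
```

A rule like this can only change the rate of modules that exist, which is why it matters that adapter LoRA modules are created only when train_llm_adapter is enabled.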
Yeah, I have reviewed the docs, but given the relatively substantial differences when comparing the LoRAs, I wasn't sure whether this flag fully disabled any and all adapter training. Thanks for pointing it out.
Getting truly deterministic results is actually relatively easy in sd-scripts. All you have to do is modify a couple of torch-related flags in the train util config; I am still using that for my Illustrious training setup to this day. I am aware that two LoRAs trained the same way won't be identical 1:1 without such a setup, but typically the differences are mostly in minute details. The bigger differences that I've noticed could be something specific to Anima, though.
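One way to check whether two runs really produced different weights, rather than judging from sample images, is to compare the saved files directly. The sketch below uses only the Python standard library and relies on the published safetensors layout: an 8-byte little-endian header length, a JSON header, then the raw tensor bytes. It skips the free-form '__metadata__' entry on the assumption that trainers may store run-specific values such as timestamps there. The function names are hypothetical.

```python
import hashlib
import json
import struct


def weights_digest(path: str) -> str:
    """Hash tensor layout and raw tensor bytes of a .safetensors file, ignoring metadata."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64 header size
        header = json.loads(f.read(header_len))
        header.pop("__metadata__", None)  # free-form strings, not weights
        digest = hashlib.sha256(json.dumps(header, sort_keys=True).encode("utf-8"))
        # The remainder of the file is the tensor byte buffer.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_weights(path_a: str, path_b: str) -> bool:
    """True only if both files hold bit-identical tensors with the same layout."""
    return weights_digest(path_a) == weights_digest(path_b)
```

Calling same_weights on two checkpoints from the same epoch of two runs tells you whether the setup was bit-for-bit deterministic. A False result only says the weights differ somewhere, not by how much.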