Colab notebook inference

by Jokality - opened Aug 14, 2025

Discussion

Jokality

Aug 14, 2025

This comment has been hidden (marked as Resolved)

jimimased

Aug 14, 2025

edited Aug 14, 2025

from google.colab import drive
drive.mount('/content/drive')
!pip uninstall xformers -y
!pip uninstall diffusers transformers torch torchvision torchaudio -y

Install PyTorch first

!pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu121

Install diffusers and transformers WITHOUT xformers

!pip install diffusers==0.21.4 transformers==4.35.0 accelerate

Install other dependencies

!pip install opencv-python librosa soundfile Pillow numpy matplotlib tqdm einops omegaconf safetensors huggingface-hub audio-separator mediapipe scipy imageio[ffmpeg] moviepy

Set environment to disable xformers

%env XFORMERS_DISABLED=1
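
A note on the step above: each `!` line in Colab runs in its own subshell, so a variable exported there is gone by the next line. A minimal sketch of setting it from Python instead, so that the kernel and any process it launches afterwards see it. Whether StableAvatar or its dependencies actually read `XFORMERS_DISABLED` is an assumption carried over from this post, not something verified here.

```python
import os
import subprocess
import sys

# Set the variable on the notebook kernel's own environment.
os.environ["XFORMERS_DISABLED"] = "1"

# Child processes started afterwards inherit it; confirm by asking one.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('XFORMERS_DISABLED'))"],
    capture_output=True,
    text=True,
    check=True,
)
child_value = result.stdout.strip()
print(child_value)
```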

Clone fresh

!cd /content && rm -rf StableAvatar
!git clone https://github.com/Francis-Rings/StableAvatar.git
%cd StableAvatar

Download models

!pip install "huggingface_hub[cli]"
!huggingface-cli download FrancisRing/StableAvatar --local-dir ./checkpoints
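
Before moving on, it may be worth checking that the download produced the three paths the inference command in this post points at. A small sketch using only the standard library; `missing_checkpoints` and `EXPECTED` are hypothetical names made up for illustration, and the relative paths are copied from the command's arguments.

```python
from pathlib import Path

# Paths relative to the checkpoint directory, taken from the inference command in this post.
EXPECTED = (
    "Wan2.1-Fun-V1.1-1.3B-InP",
    "StableAvatar-1.3B/transformer3d-square.pt",
    "wav2vec2-base-960h",
)


def missing_checkpoints(root="./checkpoints"):
    """Return the expected entries that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]


# Run from the StableAvatar directory, where ./checkpoints was downloaded.
missing = missing_checkpoints()
print("missing:", missing or "none")
```

An empty list means all three paths are in place. Anything listed suggests the download was interrupted or went to a different directory.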

%cd /content/StableAvatar
!pip install -r requirements.txt

Use the official inference.sh parameters from the repository

!CUDA_VISIBLE_DEVICES=0 python inference.py \
  --config_path="deepspeed_config/wan2.1/wan_civitai.yaml" \
  --pretrained_model_name_or_path="./checkpoints/Wan2.1-Fun-V1.1-1.3B-InP" \
  --transformer_path="./checkpoints/StableAvatar-1.3B/transformer3d-square.pt" \
  --pretrained_wav2vec_path="./checkpoints/wav2vec2-base-960h" \
  --validation_reference_path="/content/drive/MyDrive/StableAvatar/images/person7.jpg" \
  --validation_driven_audio_path="/content/drive/MyDrive/StableAvatar/audio/speech2.wav" \
  --output_dir="/content/drive/MyDrive/StableAvatar/output_official" \
  --validation_prompts="A stunning anime female singer with colorful hair performing with electric guitar, passionate singing expression, futuristic tropical cyberpunk environment with neon palm trees and holographic elements, Japanese anime art style, vibrant pink and blue lighting, sci-fi paradise setting" \
  --width=512 \
  --height=512 \
  --sample_steps=50 \
  --overlap_window_length=15 \
  --clip_sample_n_frames=81 \
  --motion_frame=60 \
  --GPU_memory_mode="model_full_load" \
  --sample_text_guide_scale=8.0 \
  --sample_audio_guide_scale=8.0 \
  --seed=42
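
If the long shell line is awkward to edit, the same call can be assembled in Python as an argument list. This is only a sketch: `build_inference_command` is a hypothetical helper, the flags and values are copied from the command above, and nothing is executed here. Passing the list to `subprocess.run(cmd, check=True)` from the StableAvatar directory would be one way to launch it, with `CUDA_VISIBLE_DEVICES` set through the `env` argument if needed.

```python
import shlex


def build_inference_command(reference_image, audio, output_dir, prompt,
                            gpu_memory_mode="model_full_load",
                            checkpoints="./checkpoints"):
    """Assemble the inference.py call as an argument list (no shell quoting needed)."""
    return [
        "python", "inference.py",
        "--config_path=deepspeed_config/wan2.1/wan_civitai.yaml",
        f"--pretrained_model_name_or_path={checkpoints}/Wan2.1-Fun-V1.1-1.3B-InP",
        f"--transformer_path={checkpoints}/StableAvatar-1.3B/transformer3d-square.pt",
        f"--pretrained_wav2vec_path={checkpoints}/wav2vec2-base-960h",
        f"--validation_reference_path={reference_image}",
        f"--validation_driven_audio_path={audio}",
        f"--output_dir={output_dir}",
        f"--validation_prompts={prompt}",
        "--width=512",
        "--height=512",
        "--sample_steps=50",
        "--overlap_window_length=15",
        "--clip_sample_n_frames=81",
        "--motion_frame=60",
        f"--GPU_memory_mode={gpu_memory_mode}",
        "--sample_text_guide_scale=8.0",
        "--sample_audio_guide_scale=8.0",
        "--seed=42",
    ]


cmd = build_inference_command(
    reference_image="/content/drive/MyDrive/StableAvatar/images/person7.jpg",
    audio="/content/drive/MyDrive/StableAvatar/audio/speech2.wav",
    output_dir="/content/drive/MyDrive/StableAvatar/output_official",
    prompt="A stunning anime female singer with colorful hair",  # shortened for the example
)
# Print a shell-quoted version for inspection.
print(shlex.join(cmd))
```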

Jokality

Aug 16, 2025

Thanks for the notebook, but my runtime keeps running out of memory before it even loads the model. Judging from the model's specs, I'd guess it should be able to run on the Colab T4 GPU with "sequential_cpu_offload".
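
When a runtime dies before the model is loaded, it can help to see whether system RAM or GPU memory is the tighter limit. A rough diagnostic sketch, assuming a Linux runtime such as Colab; `total_ram_gib` and `gpu_memory_report` are hypothetical helper names, and the GPU part just returns `None` when `nvidia-smi` is not available.

```python
import os
import shutil
import subprocess


def total_ram_gib():
    """Total physical RAM in GiB, via POSIX sysconf (available on Linux runtimes)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30


def gpu_memory_report():
    """Return nvidia-smi's name/total/used line(s), or None if it cannot be queried."""
    if shutil.which("nvidia-smi") is None:
        return None
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total,memory.used",
         "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    return out.stdout.strip() if out.returncode == 0 else None


print(f"system RAM: {total_ram_gib():.1f} GiB")
print("GPU:", gpu_memory_report() or "not detected")
```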

Jokality changed discussion status to closed 21 days ago
