Colab notebook inference

#2
by Jokality - opened

from google.colab import drive
drive.mount('/content/drive')
!pip uninstall xformers -y
!pip uninstall diffusers transformers torch torchvision torchaudio -y

Install PyTorch first

!pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu121

Install diffusers and transformers WITHOUT xformers

!pip install diffusers==0.21.4 transformers==4.35.0 accelerate
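
Since a later install step can silently replace a pinned package, it may help to confirm what actually ended up installed. The following is a minimal sketch, not part of the original notebook; the `EXPECTED` table and the `check_pins` helper are illustrative names, with the version numbers copied from the install cells above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install cells above.
EXPECTED = {
    "torch": "2.4.0",
    "torchvision": "0.19.0",
    "torchaudio": "2.4.0",
    "diffusers": "0.21.4",
    "transformers": "4.35.0",
}


def check_pins(expected):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, pin in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu121" are ignored when comparing.
        ok = installed is not None and installed.split("+")[0] == pin
        report[name] = (installed, ok)
    return report


for name, (installed, ok) in check_pins(EXPECTED).items():
    print(f"{name}: installed={installed} pinned={EXPECTED[name]} ok={ok}")
```

Running the same cell again after `pip install -r requirements.txt` shows whether that step changed any of the pins.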

Install other dependencies

!pip install opencv-python librosa soundfile Pillow numpy matplotlib tqdm einops omegaconf safetensors huggingface-hub audio-separator mediapipe scipy "imageio[ffmpeg]" moviepy

Set environment to disable xformers

%env XFORMERS_DISABLED=1
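
In Colab, every `!` command runs in its own subshell, so a variable exported that way is gone as soon as the command returns. As a minimal sketch, the variable can also be set from Python, where later `!` commands and subprocesses inherit it. The helper `child_sees` is an illustrative name, and whether StableAvatar or diffusers actually reads `XFORMERS_DISABLED` is not confirmed here; the variable name is taken from the notebook as-is.

```python
import os
import subprocess
import sys

# Set the variable in the notebook's own process; child processes inherit it.
os.environ["XFORMERS_DISABLED"] = "1"


def child_sees(name):
    """Return the value of `name` as seen by a freshly spawned Python process."""
    result = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.environ.get({name!r}, ''))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(child_sees("XFORMERS_DISABLED"))
```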

Clone fresh

%cd /content
!rm -rf StableAvatar
!git clone https://github.com/Francis-Rings/StableAvatar.git
%cd StableAvatar

Download models

!pip install "huggingface_hub[cli]"
!huggingface-cli download FrancisRing/StableAvatar --local-dir ./checkpoints
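
Before launching inference, it can be worth confirming that the download produced the files the command expects. A minimal sketch, assuming it runs from the repository root; the `REQUIRED` list is copied from the paths in this notebook's inference command, and `missing_paths` is an illustrative helper.

```python
from pathlib import Path

# Paths copied from the inference command in this notebook, relative to the repo root.
REQUIRED = [
    "checkpoints/Wan2.1-Fun-V1.1-1.3B-InP",
    "checkpoints/StableAvatar-1.3B/transformer3d-square.pt",
    "checkpoints/wav2vec2-base-960h",
]


def missing_paths(root, relative_paths):
    """Return the entries of `relative_paths` that do not exist under `root`."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).exists()]


missing = missing_paths(".", REQUIRED)
print("all checkpoint paths present" if not missing else f"missing: {missing}")
```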

%cd /content/StableAvatar
!pip install -r requirements.txt

%cd /content/StableAvatar

Use the official inference.sh parameters from the repository

!CUDA_VISIBLE_DEVICES=0 python inference.py \
  --config_path="deepspeed_config/wan2.1/wan_civitai.yaml" \
  --pretrained_model_name_or_path="./checkpoints/Wan2.1-Fun-V1.1-1.3B-InP" \
  --transformer_path="./checkpoints/StableAvatar-1.3B/transformer3d-square.pt" \
  --pretrained_wav2vec_path="./checkpoints/wav2vec2-base-960h" \
  --validation_reference_path="/content/drive/MyDrive/StableAvatar/images/person7.jpg" \
  --validation_driven_audio_path="/content/drive/MyDrive/StableAvatar/audio/speech2.wav" \
  --output_dir="/content/drive/MyDrive/StableAvatar/output_official" \
  --validation_prompts="A stunning anime female singer with colorful hair performing with electric guitar, passionate singing expression, futuristic tropical cyberpunk environment with neon palm trees and holographic elements, Japanese anime art style, vibrant pink and blue lighting, sci-fi paradise setting" \
  --width=512 \
  --height=512 \
  --sample_steps=50 \
  --overlap_window_length=15 \
  --clip_sample_n_frames=81 \
  --motion_frame=60 \
  --GPU_memory_mode="model_full_load" \
  --sample_text_guide_scale=8.0 \
  --sample_audio_guide_scale=8.0 \
  --seed=42
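
Long multi-line shell commands are easy to break in a notebook cell. As an alternative, here is a sketch that builds the same call as a Python argument list. `build_inference_command` is an illustrative helper, not something the repository provides; the option names and values are copied from the command in this notebook, and the prompt is a placeholder to be replaced with the full text.

```python
import shlex


def build_inference_command(options, script="inference.py"):
    """Turn a dict of options into ["python", script, "--key=value", ...]."""
    return ["python", script] + [f"--{key}={value}" for key, value in options.items()]


options = {
    "config_path": "deepspeed_config/wan2.1/wan_civitai.yaml",
    "pretrained_model_name_or_path": "./checkpoints/Wan2.1-Fun-V1.1-1.3B-InP",
    "transformer_path": "./checkpoints/StableAvatar-1.3B/transformer3d-square.pt",
    "pretrained_wav2vec_path": "./checkpoints/wav2vec2-base-960h",
    "validation_reference_path": "/content/drive/MyDrive/StableAvatar/images/person7.jpg",
    "validation_driven_audio_path": "/content/drive/MyDrive/StableAvatar/audio/speech2.wav",
    "output_dir": "/content/drive/MyDrive/StableAvatar/output_official",
    "validation_prompts": "<paste the full prompt here>",  # placeholder, not the real prompt
    "width": 512,
    "height": 512,
    "sample_steps": 50,
    "overlap_window_length": 15,
    "clip_sample_n_frames": 81,
    "motion_frame": 60,
    "GPU_memory_mode": "model_full_load",
    "sample_text_guide_scale": 8.0,
    "sample_audio_guide_scale": 8.0,
    "seed": 42,
}

cmd = build_inference_command(options)
print(shlex.join(cmd))  # shell-quoted form, for inspection only
```

To launch it from the repository root, something like `subprocess.run(cmd, check=True, env={**os.environ, "CUDA_VISIBLE_DEVICES": "0"})` should work. Because the arguments are passed as a list, no shell quoting or line continuations are involved.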

Thanks for the notebook, but my runtime keeps running out of memory before it even loads the model. Judging by the model's specs, I'd guess it should be able to run on the Colab T4 GPU with --GPU_memory_mode="sequential_cpu_offload" rather than "model_full_load".
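
To tell whether the crash comes from system RAM or from GPU memory, it may help to print both before loading anything. This is a rough sketch with illustrative helper names; it assumes a Linux runtime (as on Colab) and uses the standard `nvidia-smi --query-gpu` CSV output.

```python
import os
import shutil
import subprocess


def parse_gpu_line(line):
    """Parse one 'name, memory.total' CSV line from nvidia-smi into (name, MiB)."""
    name, mem = (part.strip() for part in line.rsplit(",", 1))
    return name, int(mem.split()[0])


def total_ram_gib():
    """Total system RAM in GiB, read via sysconf (Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30


def gpu_info():
    """Return a list of (name, MiB), or [] when nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]


print(f"system RAM: {total_ram_gib():.1f} GiB")
print("GPUs:", gpu_info() or "none detected")
```

If the process dies while system RAM is the smaller of the two numbers, the bottleneck is probably host memory rather than the GPU, and changing the GPU memory mode alone may not be enough.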

Jokality changed discussion status to closed
