Delanoe Pirard
Deploy to HuggingFace Spaces
18b382b

A newer version of the Gradio SDK is available: 6.1.0

Upgrade

πŸš€ Depth Anything 3 Command Line Interface

πŸ“‹ Table of Contents

πŸ“– Overview

The Depth Anything 3 CLI provides a comprehensive command-line toolkit supporting image depth estimation, video processing, COLMAP dataset handling, and web applications.

The backend service enables cache model to GPU so that we do not need to reload model for each command.

⚑ Quick Start

The CLI can run fully offline or connect to the backend for cached weights and task scheduling:

# πŸ”§ Start backend service (optional, keeps model resident in GPU memory)
da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE

# πŸš€ Use auto mode to process input
da3 auto path/to/input --export-dir ./workspace/scene001

# ♻️ Reuse backend for next job
da3 auto path/to/video.mp4 \
    --export-dir ./workspace/scene002 \
    --use-backend \
    --backend-url http://localhost:8008

Each export directory contains scene.glb, scene.jpg, and optional extras such as depth_vis/ or gs_video/ depending on the requested format.

πŸ“š Command Reference

πŸ€– auto - Auto Mode

Automatically detect input type and dispatch to the appropriate handler.

Usage:

da3 auto INPUT_PATH [OPTIONS]

Input Type Detection:

  • πŸ–ΌοΈ Single image file (.jpg, .png, .jpeg, .webp, .bmp, .tiff, .tif)
  • πŸ“ Image directory
  • 🎬 Video file (.mp4, .avi, .mov, .mkv, .flv, .wmv, .webm, .m4v)
  • πŸ“ COLMAP directory (containing images/ and sparse/ subdirectories)

Parameters:

Parameter Type Default Description
INPUT_PATH str Required Input path (image, directory, video, or COLMAP)
--model-dir str Default model Model directory path
--export-dir str debug Export directory
--export-format str glb Export format (supports mini_npz, glb, feat_vis, etc., can be combined with hyphens)
--device str cuda Device to use
--use-backend bool False Use backend service for inference
--backend-url str http://localhost:8008 Backend service URL
--process-res int 504 Processing resolution
--process-res-method str upper_bound_resize Processing resolution method
--export-feat str "" Export features from specified layers, comma-separated (e.g., "0,1,2")
--auto-cleanup bool False Automatically clean export directory without confirmation
--fps float 1.0 [Video] Frame sampling FPS
--sparse-subdir str "" [COLMAP] Sparse reconstruction subdirectory (e.g., "0" for sparse/0/)
--align-to-input-ext-scale bool True [COLMAP] Align prediction to input extrinsics scale
--use-ray-pose bool False Use ray-based pose estimation instead of camera decoder
--ref-view-strategy str saddle_balanced Reference view selection strategy: first, middle, saddle_balanced, saddle_sim_range. See docs
--conf-thresh-percentile float 40.0 [GLB] Lower percentile for adaptive confidence threshold
--num-max-points int 1000000 [GLB] Maximum number of points in the point cloud
--show-cameras bool True [GLB] Show camera wireframes in the exported scene
--feat-vis-fps int 15 [FEAT_VIS] Frame rate for output video

Examples:

# πŸ–ΌοΈ Auto-process an image
da3 auto path/to/image.jpg --export-dir ./output

# 🎬 Auto-process a video
da3 auto path/to/video.mp4 --fps 2.0 --export-dir ./output

# πŸ”§ Use backend service
da3 auto path/to/input \
    --export-format mini_npz-glb \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

πŸ–ΌοΈ image - Single Image Processing

Process a single image for camera pose and depth estimation.

Usage:

da3 image IMAGE_PATH [OPTIONS]

Parameters:

Parameter Type Default Description
IMAGE_PATH str Required Input image file path
--model-dir str Default model Model directory path
--export-dir str debug Export directory
--export-format str glb Export format
--device str cuda Device to use
--use-backend bool False Use backend service for inference
--backend-url str http://localhost:8008 Backend service URL
--process-res int 504 Processing resolution
--process-res-method str upper_bound_resize Processing resolution method
--export-feat str "" Export feature layer indices (comma-separated)
--auto-cleanup bool False Automatically clean export directory
--use-ray-pose bool False Use ray-based pose estimation instead of camera decoder
--ref-view-strategy str saddle_balanced Reference view selection strategy. See docs
--conf-thresh-percentile float 40.0 [GLB] Confidence threshold percentile
--num-max-points int 1000000 [GLB] Maximum number of points
--show-cameras bool True [GLB] Show cameras
--feat-vis-fps int 15 [FEAT_VIS] Video frame rate

Examples:

# ✨ Basic usage
da3 image path/to/image.png --export-dir ./output

# ⚑ With backend acceleration
da3 image path/to/image.png \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

# πŸ” Export feature visualization
da3 image image.jpg \
    --export-format feat_vis \
    --export-feat "9,19,29,39" \
    --export-dir ./results

πŸ—‚οΈ images - Image Directory Processing

Process a directory of images for batch depth estimation.

Usage:

da3 images IMAGES_DIR [OPTIONS]

Parameters:

Parameter Type Default Description
IMAGES_DIR str Required Directory path containing images
--image-extensions str png,jpg,jpeg Image file extensions to process (comma-separated)
--model-dir str Default model Model directory path
--export-dir str debug Export directory
--export-format str glb Export format
--device str cuda Device to use
--use-backend bool False Use backend service for inference
--backend-url str http://localhost:8008 Backend service URL
--process-res int 504 Processing resolution
--process-res-method str upper_bound_resize Processing resolution method
--export-feat str "" Export feature layer indices
--auto-cleanup bool False Automatically clean export directory
--use-ray-pose bool False Use ray-based pose estimation instead of camera decoder
--ref-view-strategy str saddle_balanced Reference view selection strategy. See docs
--conf-thresh-percentile float 40.0 [GLB] Confidence threshold percentile
--num-max-points int 1000000 [GLB] Maximum number of points
--show-cameras bool True [GLB] Show cameras
--feat-vis-fps int 15 [FEAT_VIS] Video frame rate

Examples:

# πŸ“ Process directory (defaults to png/jpg/jpeg)
da3 images ./image_folder --export-dir ./output

# 🎯 Custom extensions
da3 images ./dataset --image-extensions "png,jpg,webp" --export-dir ./output

# πŸ”§ Use backend service
da3 images ./dataset \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

🎬 video - Video Processing

Process video by extracting frames for depth estimation.

Usage:

da3 video VIDEO_PATH [OPTIONS]

Parameters:

Parameter Type Default Description
VIDEO_PATH str Required Input video file path
--fps float 1.0 Frame extraction sampling FPS
--model-dir str Default model Model directory path
--export-dir str debug Export directory
--export-format str glb Export format
--device str cuda Device to use
--use-backend bool False Use backend service for inference
--backend-url str http://localhost:8008 Backend service URL
--process-res int 504 Processing resolution
--process-res-method str upper_bound_resize Processing resolution method
--export-feat str "" Export feature layer indices
--auto-cleanup bool False Automatically clean export directory
--use-ray-pose bool False Use ray-based pose estimation instead of camera decoder
--ref-view-strategy str saddle_balanced Reference view selection strategy. See docs
--conf-thresh-percentile float 40.0 [GLB] Confidence threshold percentile
--num-max-points int 1000000 [GLB] Maximum number of points
--show-cameras bool True [GLB] Show cameras
--feat-vis-fps int 15 [FEAT_VIS] Video frame rate

Examples:

# ✨ Basic video processing
da3 video path/to/video.mp4 --export-dir ./output

# βš™οΈ Control frame sampling and resolution
da3 video path/to/video.mp4 \
    --fps 2.0 \
    --process-res 1024 \
    --export-dir ./output

# πŸ”§ Use backend service
da3 video path/to/video.mp4 \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

πŸ“ colmap - COLMAP Dataset Processing

Run pose-conditioned depth estimation on COLMAP data.

Usage:

da3 colmap COLMAP_DIR [OPTIONS]

Parameters:

Parameter Type Default Description
COLMAP_DIR str Required COLMAP directory containing images/ and sparse/ subdirectories
--sparse-subdir str "" Sparse reconstruction subdirectory (e.g., "0" for sparse/0/)
--align-to-input-ext-scale bool True Align prediction to input extrinsics scale
--model-dir str Default model Model directory path
--export-dir str debug Export directory
--export-format str glb Export format
--device str cuda Device to use
--use-backend bool False Use backend service for inference
--backend-url str http://localhost:8008 Backend service URL
--process-res int 504 Processing resolution
--process-res-method str upper_bound_resize Processing resolution method
--export-feat str "" Export feature layer indices
--auto-cleanup bool False Automatically clean export directory
--use-ray-pose bool False Use ray-based pose estimation instead of camera decoder
--ref-view-strategy str saddle_balanced Reference view selection strategy. See docs
--conf-thresh-percentile float 40.0 [GLB] Confidence threshold percentile
--num-max-points int 1000000 [GLB] Maximum number of points
--show-cameras bool True [GLB] Show cameras
--feat-vis-fps int 15 [FEAT_VIS] Video frame rate

Examples:

# πŸ“ Process COLMAP dataset
da3 colmap ./colmap_dataset --export-dir ./output

# 🎯 Use specific sparse subdirectory and align scale
da3 colmap ./colmap_dataset \
    --sparse-subdir 0 \
    --align-to-input-ext-scale \
    --export-dir ./output

# πŸ”§ Use backend service
da3 colmap ./colmap_dataset \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

πŸ”§ backend - Backend Service

Start model backend service with integrated gallery.

Usage:

da3 backend [OPTIONS]

Parameters:

Parameter Type Default Description
--model-dir str Default model Model directory path
--device str cuda Device to use
--host str 127.0.0.1 Host address to bind to
--port int 8008 Port number to bind to
--gallery-dir str Default gallery dir Gallery directory path (optional)

Features:

  • 🎯 Keeps model resident in GPU memory
  • πŸ”Œ Provides REST inference API
  • πŸ“Š Integrated dashboard and status monitoring
  • πŸ–ΌοΈ Optional gallery browser (if --gallery-dir is provided)

Available Endpoints:

  • 🏠 / - Home page
  • πŸ“Š /dashboard - Dashboard
  • βœ… /status - API status
  • πŸ–ΌοΈ /gallery/ - Gallery browser (if enabled)

Examples:

# πŸš€ Basic backend service
da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE

# πŸ–ΌοΈ Backend with gallery
da3 backend \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --device cuda \
    --host 0.0.0.0 \
    --port 8008 \
    --gallery-dir ./workspace

# πŸ’» Use CPU
da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE --device cpu

🎨 gradio - Gradio Application

Launch Depth Anything 3 Gradio interactive web application.

Usage:

da3 gradio [OPTIONS]

Parameters:

Parameter Type Default Description
--model-dir str Required Model directory path
--workspace-dir str Required Workspace directory path
--gallery-dir str Required Gallery directory path
--host str 127.0.0.1 Host address to bind to
--port int 7860 Port number to bind to
--share bool False Create a public link
--debug bool False Enable debug mode
--cache-examples bool False Pre-cache all example scenes at startup
--cache-gs-tag str "" Tag to match scene names for high-res+3DGS caching

Examples:

# 🎨 Basic Gradio application
da3 gradio \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --workspace-dir ./workspace \
    --gallery-dir ./gallery

# 🌐 Enable sharing and debug
da3 gradio \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --workspace-dir ./workspace \
    --gallery-dir ./gallery \
    --share \
    --debug

# ⚑ Pre-cache examples
da3 gradio \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --workspace-dir ./workspace \
    --gallery-dir ./gallery \
    --cache-examples \
    --cache-gs-tag "dl3dv"

πŸ–ΌοΈ gallery - Gallery Server

Launch standalone Depth Anything 3 Gallery server.

Usage:

da3 gallery [OPTIONS]

Parameters:

Parameter Type Default Description
--gallery-dir str Default gallery dir Gallery root directory
--host str 127.0.0.1 Host address to bind to
--port int 8007 Port number to bind to
--open-browser bool False Open browser after launch

Note: The gallery expects each scene folder to contain at least scene.glb and scene.jpg, with optional subfolders such as depth_vis/ or gs_video/.

Examples:

# πŸ–ΌοΈ Basic gallery server
da3 gallery --gallery-dir ./workspace

# 🌐 Custom host and port
da3 gallery \
    --gallery-dir ./workspace \
    --host 0.0.0.0 \
    --port 8007

# πŸš€ Auto-open browser
da3 gallery --gallery-dir ./workspace --open-browser

βš™οΈ Parameter Details

πŸ”§ Common Parameters

  • --export-dir: Output directory, defaults to debug

  • --export-format: Export format, supports combining multiple formats with hyphens:

    • πŸ“¦ mini_npz: Compressed NumPy format
    • 🎨 glb: glTF binary format (3D scene)
    • πŸ” feat_vis: Feature visualization
    • Example: mini_npz-glb exports both formats
  • --process-res / --process-res-method: Control preprocessing resolution strategy

    • process-res: Target resolution (default 504)
    • process-res-method: Resize method (default upper_bound_resize)
  • --auto-cleanup: Remove existing export directory without confirmation

  • --use-backend / --backend-url: Reuse running backend service

    • ⚑ Reduces model loading time
    • 🌐 Supports distributed processing
  • --export-feat: Layer indices for exporting intermediate features (comma-separated)

    • Example: "9,19,29,39"

🎨 GLB Export Parameters

  • --conf-thresh-percentile: Lower percentile for adaptive confidence threshold (default 40.0)

    • Used to filter low-confidence points
  • --num-max-points: Maximum number of points in point cloud (default 1,000,000)

    • Controls output file size and performance
  • --show-cameras: Show camera wireframes in exported scene (default True)

πŸ” Feature Visualization Parameters

  • --feat-vis-fps: Frame rate for feature visualization output video (default 15)

🎬 Video-Specific Parameters

  • --fps: Video frame extraction sampling rate (default 1.0 FPS)
    • Higher values extract more frames

πŸ“ COLMAP-Specific Parameters

  • --sparse-subdir: Sparse reconstruction subdirectory

    • Empty string uses sparse/ directory
    • "0" uses sparse/0/ directory
  • --align-to-input-ext-scale: Align prediction to input extrinsics scale (default True)

    • Ensures depth estimation is consistent with COLMAP scale

πŸ’‘ Usage Examples

1️⃣ Basic Workflow

# πŸ”§ Start backend service
da3 backend --model-dir depth-anything/DA3NESTED-GIANT-LARGE --host 0.0.0.0 --port 8008

# πŸ–ΌοΈ Process single image
da3 image image.jpg --export-dir ./output1 --use-backend

# 🎬 Process video
da3 video video.mp4 --fps 2.0 --export-dir ./output2 --use-backend

# πŸ“ Process COLMAP dataset
da3 colmap ./colmap_data --export-dir ./output3 --use-backend

2️⃣ Using Auto Mode

# πŸ€– Auto-detect and process
da3 auto ./unknown_input --export-dir ./output

# ⚑ With backend acceleration
da3 auto ./unknown_input \
    --use-backend \
    --backend-url http://localhost:8008 \
    --export-dir ./output

3️⃣ Multi-Format Export

# πŸ“¦ Export both NPZ and GLB formats
da3 auto assets/examples/SOH \
    --export-format mini_npz-glb \
    --export-dir ./workspace/soh

# πŸ” Export feature visualization
da3 image image.jpg \
    --export-format feat_vis \
    --export-feat "9,19,29,39" \
    --export-dir ./results

4️⃣ Advanced Configuration

# βš™οΈ Custom resolution and point cloud density
da3 image image.jpg \
    --process-res 1024 \
    --num-max-points 2000000 \
    --conf-thresh-percentile 30.0 \
    --export-dir ./output

# πŸ“ COLMAP advanced options
da3 colmap ./colmap_data \
    --sparse-subdir 0 \
    --align-to-input-ext-scale \
    --process-res 756 \
    --export-dir ./output

5️⃣ Batch Processing Workflow

# πŸ”§ Start backend
da3 backend \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --device cuda \
    --host 0.0.0.0 \
    --port 8008 \
    --gallery-dir ./workspace

# πŸ”„ Batch process multiple scenes
for scene in scene1 scene2 scene3; do
    da3 auto ./data/$scene \
        --export-dir ./workspace/$scene \
        --use-backend \
        --auto-cleanup
done

# πŸ–ΌοΈ Launch gallery to view results
da3 gallery --gallery-dir ./workspace --open-browser

6️⃣ Web Applications

# 🎨 Launch Gradio application
da3 gradio \
    --model-dir depth-anything/DA3NESTED-GIANT-LARGE \
    --workspace-dir workspace/gradio \
    --gallery-dir ./gallery \
    --host 0.0.0.0 \
    --port 7860 \
    --share

7️⃣ Transformer Feature Visualization

# πŸ” Export Transformer features
# πŸ“¦ Combined with numerical output
da3 auto video.mp4 \
    --export-format glb-feat_vis \
    --export-feat "11,21,31" \
    --export-dir ./debug \
    --use-backend

πŸ“ Notes

  1. πŸ”§ Backend Service: Recommended for processing multiple tasks to improve efficiency
  2. πŸ’Ύ GPU Memory: Be mindful of GPU memory usage when processing high-resolution inputs
  3. πŸ“ Export Directory: Use --auto-cleanup to avoid manual confirmation for deletion
  4. πŸ”€ Format Combination: Multiple export formats can be combined with hyphens (e.g., mini_npz-glb-feat_vis)
  5. πŸ“ COLMAP Data: Ensure COLMAP directory structure is correct (contains images/ and sparse/ subdirectories)

❓ Getting Help

View detailed help for any command:

# πŸ“– View main help
da3 --help

# πŸ” View specific command help
da3 auto --help
da3 image --help
da3 backend --help