---
title: SAM2 Video Background Remover
emoji: 🎥
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - computer-vision
  - video
  - segmentation
  - sam2
  - background-removal
  - object-tracking
---
# 🎥 SAM2 Video Background Remover

Remove backgrounds from videos by tracking objects using Meta's Segment Anything Model 2 (SAM2).
## Features

- ✨ **Background Removal**: Automatically remove backgrounds and keep only tracked objects
- 🎯 **Object Tracking**: Track multiple objects across video frames
- 🖥️ **Interactive UI**: Easy-to-use Gradio interface
- 🌐 **REST API**: Programmatic access via API endpoints
- ⚡ **GPU Accelerated**: Fast processing with CUDA support
## How It Works

SAM2 is a foundation model for video segmentation that can:

- Segment objects based on point or box annotations
- Track objects automatically across all video frames
- Handle occlusions and object reappearance
- Process multiple objects simultaneously
## Usage

### 🖱️ Simple Mode (Web UI)

1. Upload your video
2. Specify the X,Y coordinates of the object you want to track (measured on the first frame)
3. Click "Process Video"
4. Download the result with the background removed!

**Example**: For a 640x480 video with a person in the center, use X=320, Y=240.
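Rather than guessing pixel coordinates, you can compute them from the frame size. A minimal sketch of the arithmetic in the example above (the helper name is ours, not part of the Space):

```python
def point_from_fraction(width: int, height: int, fx: float = 0.5, fy: float = 0.5):
    """Convert a fractional position (0.0-1.0 on each axis) into pixel coordinates."""
    return int(width * fx), int(height * fy)

# Center of a 640x480 frame, as in the example above:
print(point_from_fraction(640, 480))  # (320, 240)
```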
### 🔧 Advanced Mode (JSON Annotations)

For more control, use JSON annotations:

```json
[
  {
    "frame_idx": 0,
    "object_id": 1,
    "points": [[320, 240]],
    "labels": [1]
  }
]
```
**Parameters:**

- `frame_idx`: Frame number to annotate (0 = first frame)
- `object_id`: Unique ID for each object (1, 2, 3, ...)
- `points`: List of [x, y] coordinates on the object
- `labels`: `1` for a foreground point, `0` for a background point
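Before submitting, it can help to sanity-check annotations against this schema. A small sketch (the `validate_annotations` helper is illustrative, not part of the Space's API):

```python
def validate_annotations(annotations):
    """Check a list of annotation dicts against the schema described above."""
    for i, ann in enumerate(annotations):
        if not {"frame_idx", "object_id", "points", "labels"} <= ann.keys():
            raise ValueError(f"annotation {i}: missing required keys")
        if len(ann["points"]) != len(ann["labels"]):
            raise ValueError(f"annotation {i}: points and labels must have equal length")
        if any(label not in (0, 1) for label in ann["labels"]):
            raise ValueError(f"annotation {i}: labels must be 0 or 1")
        if any(len(point) != 2 for point in ann["points"]):
            raise ValueError(f"annotation {i}: each point must be [x, y]")
    return annotations
```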
## 💡 API Usage

You can call this Space programmatically using the Gradio Client.

### Python Example

```python
from gradio_client import Client
import json

# Connect to the Space
client = Client("YOUR_USERNAME/sam2-video-bg-remover")

# Define what to track
annotations = [
    {
        "frame_idx": 0,
        "object_id": 1,
        "points": [[320, 240]],  # x, y coordinates
        "labels": [1]            # 1 = foreground
    }
]

# Process video
result = client.predict(
    video_file="./input_video.mp4",
    annotations_json=json.dumps(annotations),
    remove_background=True,
    max_frames=300,  # Limit frames for faster processing
    api_name="/segment_video_api"
)

print(f"Output video saved to: {result}")
```
### Track Multiple Objects

```python
annotations = [
    # First object (person)
    {
        "frame_idx": 0,
        "object_id": 1,
        "points": [[320, 240]],
        "labels": [1]
    },
    # Second object (ball)
    {
        "frame_idx": 0,
        "object_id": 2,
        "points": [[500, 300]],
        "labels": [1]
    }
]
```
### Refine Segmentation with Background Points

```python
annotations = [
    {
        "frame_idx": 0,
        "object_id": 1,
        "points": [
            [320, 240],  # Point ON the object
            [100, 100]   # Point on the background to exclude
        ],
        "labels": [1, 0]  # 1 = foreground, 0 = background
    }
]
```
## 🌐 HTTP API

You can also call the API directly via HTTP:

```bash
curl -X POST https://YOUR_USERNAME-sam2-video-bg-remover.hf.space/api/predict \
  -F "video_file=@input_video.mp4" \
  -F 'annotations_json=[{"frame_idx":0,"object_id":1,"points":[[320,240]],"labels":[1]}]' \
  -F "remove_background=true" \
  -F "max_frames=300"
```
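The same form fields can be assembled in Python and sent with any HTTP client. A sketch that mirrors the curl call above (the endpoint path and field names are taken from that example; adjust them if your Space exposes a different route, and note that `build_predict_fields` is our own helper):

```python
import json

def build_predict_fields(annotations, remove_background=True, max_frames=None):
    """Build the non-file form fields for the /api/predict call shown above."""
    fields = {
        "annotations_json": json.dumps(annotations),
        "remove_background": "true" if remove_background else "false",
    }
    if max_frames is not None:
        fields["max_frames"] = str(max_frames)
    return fields

# Usage with the `requests` package (video is sent as the multipart file field):
# import requests
# fields = build_predict_fields(
#     [{"frame_idx": 0, "object_id": 1, "points": [[320, 240]], "labels": [1]}],
#     max_frames=300,
# )
# with open("input_video.mp4", "rb") as f:
#     response = requests.post(
#         "https://YOUR_USERNAME-sam2-video-bg-remover.hf.space/api/predict",
#         files={"video_file": f},
#         data=fields,
#     )
```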
## Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `video_file` | File | - | Input video file (required) |
| `annotations_json` | String | - | JSON array of annotations (required) |
| `remove_background` | Boolean | `true` | Remove background or just highlight objects |
| `max_frames` | Integer | `null` | Limit frames for faster processing |
## Tips & Best Practices

### 🎯 Getting Good Results

- **Choose Clear Points**: Click on the center or most distinctive part of your object
- **Add Multiple Points**: For complex objects, add 2-3 points on different parts
- **Use Background Points**: Add points with `label: 0` on areas you DON'T want
- **Annotate Key Frames**: If the object changes significantly, add annotations on multiple frames
### ⚡ Performance Tips

- **Limit Frames**: Use the `max_frames` parameter for long videos
- **Use a Smaller Model**: The default is `sam2.1-hiera-tiny` for speed
- **Process Shorter Clips**: Split long videos into segments
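Splitting a long video into `max_frames`-sized chunks is simple index arithmetic. A sketch (the `frame_segments` helper is ours; pair it with your video tool of choice to cut the actual clips):

```python
def frame_segments(total_frames: int, max_frames: int):
    """Split a frame count into [start, end) ranges no longer than max_frames."""
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")
    return [(start, min(start + max_frames, total_frames))
            for start in range(0, total_frames, max_frames)]

# A 750-frame video processed in 300-frame chunks:
print(frame_segments(750, 300))  # [(0, 300), (300, 600), (600, 750)]
```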
### 🐛 Troubleshooting

| Issue | Solution |
|---|---|
| Object not tracked | Add more points on different parts of the object |
| Background leakage | Add background points with `label: 0` |
| Slow processing | Reduce `max_frames` or use a shorter video |
| Wrong object tracked | Be more precise with point coordinates |
## Model Information

This Space uses `facebook/sam2.1-hiera-tiny` for efficient processing. Other available models:

- `facebook/sam2.1-hiera-tiny` - Fastest, good quality ⚡
- `facebook/sam2.1-hiera-small` - Balanced
- `facebook/sam2.1-hiera-base-plus` - Higher quality
- `facebook/sam2.1-hiera-large` - Best quality, slower 🎯
## Use Cases

- 🎬 **Video Production**: Remove backgrounds for green screen effects
- 📊 **Sports Analysis**: Isolate athletes for motion analysis
- 🎮 **Content Creation**: Extract game characters or objects
- 🔬 **Research**: Track objects in scientific videos
- 📱 **Social Media**: Create engaging content with background removal
## Limitations

- Video length affects processing time (longer = slower)
- A GPU is recommended for videos longer than 10 seconds
- Very fast-moving objects may require multiple annotations
- Extreme lighting changes can affect tracking quality
## Citation

If you use this Space, please cite the SAM2 paper:

```bibtex
@article{ravi2024sam2,
  title={SAM 2: Segment Anything in Images and Videos},
  author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and others},
  journal={arXiv preprint arXiv:2408.00714},
  year={2024}
}
```
## License

Apache 2.0

## Links

- 📖 SAM2 Documentation
- 🤗 Model on Hugging Face
- 📄 Research Paper
- 💻 Original Repository

---

Built with ❤️ using Transformers and Gradio