---
license: cc-by-4.0
library_name: pytorch
tags:
- computer-vision
- object-tracking
- spiking-neural-networks
- visual-streaming-perception
- energy-efficient
- cvpr-2025
pipeline_tag: object-detection
---

# ViStream: Law-of-Charge-Conservation Inspired Spiking Neural Network for Visual Streaming Perception

**ViStream** is a novel energy-efficient framework for Visual Streaming Perception (VSP) that leverages Spiking Neural Networks (SNNs) with a Law-of-Charge-Conservation (LoCC) property.

## Model Details

### Model Description

- **Developed by:** Kang You, Ziling Wei, Jing Yan, Boning Zhang, Qinghai Guo, Yaoyu Zhang, Zhezhi He
- **Model type:** Spiking Neural Network for Visual Streaming Perception
- **Framework:** PyTorch
- **License:** CC-BY-4.0
- **Paper:** [CVPR 2025](https://openaccess.thecvf.com/content/CVPR2025/papers/You_VISTREAM_Improving_Computation_Efficiency_of_Visual_Streaming_Perception_via_Law-of-Charge-Conservation_CVPR_2025_paper.pdf)
- **Repository:** [GitHub](https://github.com/Intelligent-Computing-Research-Group/ViStream)

### Model Architecture

ViStream introduces two key innovations:
1. **Law of Charge Conservation (LoCC)** property in ST-BIF neurons
2. **Differential Encoding (DiffEncode)** scheme for temporal optimization

The framework significantly reduces computation while matching the accuracy of its ANN counterparts.

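As a rough intuition for the DiffEncode scheme above, the sketch below encodes a video clip as its first frame plus inter-frame differences, so static regions encode to zero, and recovers the frames by cumulative summation. ViStream applies this idea to spike signals inside the network; the function names and tensor shapes here are illustrative assumptions, not the repository's actual API.

```python
import torch

def diff_encode(frames: torch.Tensor) -> torch.Tensor:
    # Keep the first frame as-is; store every later frame as the change
    # from its predecessor, so unchanged pixels encode to zero.
    diffs = frames.clone()
    diffs[1:] = frames[1:] - frames[:-1]
    return diffs

def diff_decode(encoded: torch.Tensor) -> torch.Tensor:
    # The running sum of differences recovers each frame exactly,
    # mirroring how a conserved "charge" accumulates over time.
    return torch.cumsum(encoded, dim=0)

frames = torch.rand(4, 3, 8, 8)  # T x C x H x W video clip
encoded = diff_encode(frames)
decoded = diff_decode(encoded)
print(torch.allclose(decoded, frames, atol=1e-5))  # True
```

Because consecutive frames in a video stream are highly correlated, most entries of the encoded tensor are near zero, which is what lets a spiking network skip computation on unchanged regions.
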
## Uses

### Direct Use

ViStream can be directly used for:
- **Multiple Object Tracking (MOT)**
- **Single Object Tracking (SOT)**
- **Video Object Segmentation (VOS)**
- **Multiple Object Tracking and Segmentation (MOTS)**
- **Pose Tracking**

### Downstream Use

The model can be fine-tuned for various visual streaming perception tasks in:
- Autonomous driving
- UAV navigation
- AR/VR applications
- Real-time surveillance

## Bias, Risks, and Limitations

### Limitations
- Requires hardware-specific optimization to realize the full energy benefits
- Performance may vary across frame rates
- Limited to visual perception tasks

### Recommendations
- Test thoroughly on target hardware before deployment
- Consider the computational constraints of edge devices
- Validate performance on domain-specific datasets

## How to Get Started with the Model

```python
from huggingface_hub import hf_hub_download
import torch

# Download the checkpoint from the Hugging Face Hub
checkpoint_path = hf_hub_download(
    repo_id="AndyBlocker/ViStream",
    filename="checkpoint-90.pth"
)

# Load the checkpoint; instantiating the model itself requires the
# ViStream implementation from the GitHub repository
checkpoint = torch.load(checkpoint_path, map_location="cpu")
```

For complete usage examples, see the [GitHub repository](https://github.com/Intelligent-Computing-Research-Group/ViStream).

## Training Details

### Training Data

The model was trained on multiple datasets covering visual streaming perception tasks, including object tracking, video object segmentation, and pose tracking.

### Training Procedure

- Framework: PyTorch
- Optimization: Energy-efficient SNN training with the Law of Charge Conservation
- Architecture: ResNet-based backbone with spike quantization layers

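The "spike quantization layers" mentioned above can be pictured with a minimal bipolar integrate-and-fire module: input accumulates as membrane potential, and a signed spike fires whenever a threshold is crossed, with the emitted charge subtracted back so nothing is lost. This is a simplified stand-in under assumed semantics (signed spikes, unit threshold), not the ST-BIF implementation from the repository.

```python
import torch
import torch.nn as nn

class SpikeQuantize(nn.Module):
    """Toy bipolar integrate-and-fire layer (illustrative, not ST-BIF).

    Accumulates input as membrane potential and emits +1/-1 spikes when
    the potential crosses +/- threshold; the fired charge is subtracted
    from the potential, so total charge is conserved over time.
    """

    def __init__(self, threshold: float = 1.0):
        super().__init__()
        self.threshold = threshold
        self.potential = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.potential is None:
            self.potential = torch.zeros_like(x)
        self.potential = self.potential + x
        pos = (self.potential >= self.threshold).float()
        neg = (self.potential <= -self.threshold).float()
        spikes = pos - neg
        # Subtract the emitted charge, leaving the residual potential.
        self.potential = self.potential - spikes * self.threshold
        return spikes

layer = SpikeQuantize(threshold=1.0)
o1 = layer(torch.tensor([0.6, -0.4]))  # potentials 0.6 / -0.4: no spikes
o2 = layer(torch.tensor([0.6, -0.8]))  # potentials 1.2 / -1.2: +1 and -1 spikes
```

Because the residual potential carries over between time steps, the running spike count tracks the accumulated input, which is the charge-conservation intuition behind ViStream's exact ANN-SNN equivalence claim.
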
## Evaluation

The model achieves competitive accuracy across multiple visual streaming perception tasks while delivering significant energy-efficiency gains over traditional ANN-based approaches. Detailed evaluation results are available in the [CVPR 2025 paper](https://openaccess.thecvf.com/content/CVPR2025/papers/You_VISTREAM_Improving_Computation_Efficiency_of_Visual_Streaming_Perception_via_Law-of-Charge-Conservation_CVPR_2025_paper.pdf).

## Model Card Authors

Kang You, Ziling Wei, Jing Yan, Boning Zhang, Qinghai Guo, Yaoyu Zhang, Zhezhi He

## Model Card Contact

For questions about this model, please open an issue in the [GitHub repository](https://github.com/Intelligent-Computing-Research-Group/ViStream).

## Citation

```bibtex
@inproceedings{you2025vistream,
  title={VISTREAM: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network},
  author={You, Kang and Wei, Ziling and Yan, Jing and Zhang, Boning and Guo, Qinghai and Zhang, Yaoyu and He, Zhezhi},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={8796--8805},
  year={2025}
}
```