Random patch cropping was replaced with the TREAD method.
The spatial encoding loss was computed only on the unskipped layers, as explained elsewhere.
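As a rough illustration of the routing idea, here is a minimal pure-Python sketch, not this model's training code. It assumes that "unskipped layers" means the layers that every token passes through, and that routing works by letting a random subset of tokens bypass a contiguous range of layers. The function `tread_route`, its parameters (`start`, `end`, `keep_ratio`), and the toy `add_one` layer are all hypothetical names chosen for the example.

```python
import random


def tread_route(tokens, layers, start, end, keep_ratio, rng):
    """Run `layers` over `tokens`; a random subset bypasses layers[start:end].

    Hypothetical sketch: each layer is a function mapping a list of token
    states to a list of the same length.
    """
    n = len(tokens)
    n_keep = max(1, int(round(n * keep_ratio)))
    # Indices of the tokens that still go through the routed layers.
    kept = sorted(rng.sample(range(n), n_keep))
    full_layer_ids = []  # layers that processed every token
    hidden = list(tokens)
    for i, layer in enumerate(layers):
        if start <= i < end:
            # Only the kept tokens are processed; the rest pass through unchanged.
            out = layer([hidden[j] for j in kept])
            for j, h in zip(kept, out):
                hidden[j] = h
        else:
            hidden = layer(hidden)
            full_layer_ids.append(i)
    return hidden, kept, full_layer_ids


def add_one(xs):
    """Toy stand-in for a transformer block."""
    return [x + 1 for x in xs]


hidden, kept, full_layer_ids = tread_route(
    tokens=[0.0] * 8,
    layers=[add_one] * 4,
    start=1,
    end=3,
    keep_ratio=0.5,
    rng=random.Random(0),
)
```

Under this reading, an auxiliary loss such as the spatial encoding loss would be attached only to the layers listed in `full_layer_ids`, since those are the only ones where all token positions are present.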
References
- 2501.04765
- 2601.08584 (layer pruning)