Dyuti Dasmahapatra committed on
Commit
a090f9b
·
1 Parent(s): 0101a8b

docs: document added models (ResNet, Swin, DeiT, EfficientNet) and EfficientNet fallback

Files changed (3)
  1. PROJECT_SUMMARY.md +1 -1
  2. QUICKSTART.md +6 -2
  3. README.md +15 -7
PROJECT_SUMMARY.md CHANGED
@@ -279,7 +279,7 @@ To understand the codebase:

  Things you might want to add later:

- - [ ] More ViT model variants (DeiT, BEiT, Swin)
+ - [x] More ViT model variants (DeiT, Swin) — added ResNet, Swin, DeiT, EfficientNet support in `model_loader.py`
  - [ ] Batch image processing
  - [ ] Export results as PDF report
  - [ ] Save/load analysis sessions
QUICKSTART.md CHANGED
@@ -97,9 +97,13 @@ http://localhost:7860

  ### Step 3: Load a Model

- 1. In the **"Select Model"** dropdown, choose `ViT-Base`
+ 1. In the **"Select Model"** dropdown, choose a model (examples: `ViT-Base`, `ViT-Large`, `ResNet-50`, `Swin Transformer`, `DeiT`, `EfficientNet`)
  2. Click the **"🔄 Load Model"** button
- 3. Wait for the confirmation: `✅ Model loaded: google/vit-base-patch16-224`
+ 3. Wait for the confirmation message, e.g. `✅ Model loaded: google/vit-base-patch16-224`
+
+ Notes:
+ - For ViT/DeiT models you can use Attention Visualization (patch-level attention maps). For ResNet, Swin, and EfficientNet, use GradCAM or GradientSHAP (the UI will still show the attention option, but attention maps are ViT-specific).
+ - EfficientNet may fall back to a `timm` loader automatically if the Hugging Face download triggers a torch security restriction; no torch upgrade is required.

  ### Step 4: Analyze Your First Image
 
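The fallback behavior mentioned in the QUICKSTART notes can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the real logic lives in `src/model_loader.py`, and the names `load_with_fallback`, `hf_loader`, and `timm_loader` are assumptions made here for clarity.

```python
def load_with_fallback(model_id, hf_loader, timm_loader):
    """Try the Hugging Face loader first; fall back to timm if it raises.

    Hypothetical sketch of the fallback described in the notes above.
    The loaders are injected as callables so the control flow can be
    exercised without downloading any weights.
    """
    try:
        return hf_loader(model_id), "huggingface"
    except Exception:
        # e.g. torch.load rejecting the checkpoint format on an older
        # torch version; timm has its own pretrained-weight loading path,
        # so no torch upgrade is needed.
        return timm_loader(model_id), "timm"
```

In practice `hf_loader` would wrap something like `AutoModelForImageClassification.from_pretrained` and `timm_loader` would wrap `timm.create_model(..., pretrained=True)`; injecting them keeps the fallback decision testable in isolation.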
README.md CHANGED
@@ -373,16 +373,24 @@ Compares performance across subgroups to identify:

  ---

- ## 🔧 Supported Models
-
- Currently supported Vision Transformer models from Hugging Face:
-
- | Model | Parameters | Input Size | Accuracy (ImageNet) |
- |-------|-----------|------------|---------------------|
- | `google/vit-base-patch16-224` | 86M | 224×224 | ~81.3% |
- | `google/vit-large-patch16-224` | 304M | 224×224 | ~82.6% |
-
- **Easy to extend**: Add any Hugging Face ViT model to `src/model_loader.py`
+ ### 🔧 Supported Models
+
+ The dashboard now supports multiple architectures (the ViT family and others). The models currently exposed in the UI are:
+
+ | Display name | Hugging Face ID | Notes |
+ |--------------|-----------------|-------|
+ | ViT-Base | `google/vit-base-patch16-224` | ViT — attention visualizations and GradCAM supported |
+ | ViT-Large | `google/vit-large-patch16-224` | ViT — attention visualizations and GradCAM supported |
+ | ResNet-50 | `microsoft/resnet-50` | CNN — GradCAM supported; attention visualization not applicable |
+ | Swin Transformer | `microsoft/swin-base-patch4-window7-224` | Swin — GradCAM supported; attention visualization is limited to ViT-style models |
+ | DeiT | `facebook/deit-base-patch16-224` | ViT-like — attention visualizations and GradCAM supported |
+ | EfficientNet-B7 | `google/efficientnet-b7` | CNN — loaded via Hugging Face when possible; if loading triggers a torch.load restriction, the app falls back to `timm` (no torch upgrade required). GradCAM supported; attention visualization not applicable |
+
+ Notes:
+ - Attention visualizations (patch-level attention maps) are meaningful for ViT-style models (ViT, DeiT). For CNNs (ResNet, EfficientNet) and some hierarchical transformers (Swin), the dashboard uses GradCAM or a last-conv fallback instead of patch attention.
+ - EfficientNet on the Hugging Face Hub can trigger a torch.load security restriction in older torch versions. The toolkit transparently falls back to a `timm`-based loader to avoid requiring a torch upgrade; this is handled automatically in `src/model_loader.py`.
+
+ **Easy to extend**: Add more models to `src/model_loader.py` under `SUPPORTED_MODELS` and they will appear in the app dropdown.

  ---
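The extension point the README describes can be sketched as follows. The `SUPPORTED_MODELS` name comes from the README text above, but the per-entry schema and the `available_explainers` helper are hypothetical, shown only to illustrate how the table of models and the "attention is ViT-specific" rule might fit together.

```python
# Hypothetical shape of the SUPPORTED_MODELS registry (schema assumed):
# display name -> Hugging Face ID plus a coarse architecture family tag.
SUPPORTED_MODELS = {
    "ViT-Base": {"hf_id": "google/vit-base-patch16-224", "family": "vit"},
    "ViT-Large": {"hf_id": "google/vit-large-patch16-224", "family": "vit"},
    "ResNet-50": {"hf_id": "microsoft/resnet-50", "family": "cnn"},
    "Swin Transformer": {"hf_id": "microsoft/swin-base-patch4-window7-224", "family": "swin"},
    "DeiT": {"hf_id": "facebook/deit-base-patch16-224", "family": "vit"},
    "EfficientNet-B7": {"hf_id": "google/efficientnet-b7", "family": "cnn"},
}

def available_explainers(display_name):
    """Patch-level attention maps only make sense for ViT-style models;
    CNNs and hierarchical transformers get gradient-based methods."""
    family = SUPPORTED_MODELS[display_name]["family"]
    if family == "vit":
        return ["attention", "gradcam", "gradient_shap"]
    return ["gradcam", "gradient_shap"]
```

With a registry like this, adding a new model is one dict entry, and the UI dropdown and explainer menu can both be derived from it.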