Dyuti Dasmahapatra committed · a090f9b
Parent(s): 0101a8b

docs: document added models (ResNet, Swin, DeiT, EfficientNet) and EfficientNet fallback

Files changed:
- PROJECT_SUMMARY.md +1 -1
- QUICKSTART.md +6 -2
- README.md +15 -7
PROJECT_SUMMARY.md
CHANGED

@@ -279,7 +279,7 @@ To understand the codebase:
 
 Things you might want to add later:
 
-- [ ] More ViT model variants (DeiT, Swin)
+- [x] More ViT model variants (DeiT, Swin) → added ResNet, Swin, DeiT, EfficientNet support in `model_loader.py`
 - [ ] Batch image processing
 - [ ] Export results as PDF report
 - [ ] Save/load analysis sessions
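The checked-off item above says the new architectures were wired into `model_loader.py`. As an illustration only, a registry of that kind might look like the following sketch; the dictionary name `SUPPORTED_MODELS` and the Hugging Face IDs come from this commit's docs, but the value structure and the `dropdown_choices` helper are assumptions, not the project's actual code.

```python
# Hypothetical sketch of a model registry like the one described for
# src/model_loader.py; the real structure may differ.
SUPPORTED_MODELS = {
    "ViT-Base": {"hf_id": "google/vit-base-patch16-224", "family": "vit"},
    "ViT-Large": {"hf_id": "google/vit-large-patch16-224", "family": "vit"},
    "ResNet-50": {"hf_id": "microsoft/resnet-50", "family": "cnn"},
    "Swin Transformer": {"hf_id": "microsoft/swin-base-patch4-window7-224", "family": "swin"},
    "DeiT": {"hf_id": "facebook/deit-base-patch16-224", "family": "vit"},
    "EfficientNet-B7": {"hf_id": "google/efficientnet-b7", "family": "cnn"},
}

def dropdown_choices():
    """Display names shown in the app's model dropdown."""
    return list(SUPPORTED_MODELS)
```

With a registry like this, adding one dictionary entry is enough to surface a new model in the UI dropdown.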
QUICKSTART.md
CHANGED

@@ -97,9 +97,13 @@ http://localhost:7860
 
 ### Step 3: Load a Model
 
-1. In the **"Select Model"** dropdown, choose `ViT-Base`
+1. In the **"Select Model"** dropdown, choose a model (examples: `ViT-Base`, `ViT-Large`, `ResNet-50`, `Swin Transformer`, `DeiT`, `EfficientNet`)
 2. Click the **"Load Model"** button
-3. Wait for the confirmation
+3. Wait for the confirmation message, e.g. `Model loaded: google/vit-base-patch16-224`
+
+Notes:
+- For ViT/DeiT models you can use Attention Visualization (patch-level attention maps). For ResNet, Swin, and EfficientNet, use GradCAM or GradientSHAP (the UI will still show all options, but attention maps are ViT-specific).
+- EfficientNet may fall back to a `timm` loader automatically if the Hugging Face download triggers a torch security restriction; no torch upgrade is required.
 
 ### Step 4: Analyze Your First Image
 
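The automatic EfficientNet fallback mentioned in the QUICKSTART note can be sketched as a try/except around the primary loader. This is a minimal illustration, not the actual code in `src/model_loader.py`: the real project calls the Hugging Face and `timm` loaders, which are simulated here with stub functions so the control flow is visible.

```python
# Minimal sketch (assumed, not the project's actual code) of the EfficientNet
# fallback this commit documents: try the Hugging Face loader first; if it
# fails (e.g. a torch.load security restriction in older torch versions),
# retry via a timm-based loader instead of requiring a torch upgrade.

def load_with_fallback(load_hf, load_timm):
    """Return (model, source); source records which loader succeeded."""
    try:
        return load_hf(), "huggingface"
    except RuntimeError:
        # Primary loader tripped a weights-loading restriction; use timm.
        return load_timm(), "timm"

# Stub loaders standing in for the real transformers/timm calls:
def fake_hf_loader():
    raise RuntimeError("torch.load security restriction")

def fake_timm_loader():
    return "efficientnet-b7 (timm)"

model, source = load_with_fallback(fake_hf_loader, fake_timm_loader)
print(source)  # timm
```

The point of the design is that the caller never sees the failure: the dropdown entry behaves the same whichever backend ends up supplying the weights.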
README.md
CHANGED

@@ -373,16 +373,24 @@ Compares performance across subgroups to identify:
 
 ---
 
-
-
-
-
-| `google/vit-base-patch16-224` |
-| `google/vit-large-patch16-224` |
-
+### Supported Models
+
+The dashboard now supports multiple architectures (ViT family and others). The models currently exposed in the UI are:
+
+| Display name | Hugging Face ID | Notes |
+|--------------|-----------------|-------|
+| ViT-Base | `google/vit-base-patch16-224` | ViT – attention visualization and GradCAM supported |
+| ViT-Large | `google/vit-large-patch16-224` | ViT – attention visualization and GradCAM supported |
+| ResNet-50 | `microsoft/resnet-50` | CNN – GradCAM supported; attention visualization not applicable |
+| Swin Transformer | `microsoft/swin-base-patch4-window7-224` | Swin – GradCAM supported; attention visualization limited to ViT-style models |
+| DeiT | `facebook/deit-base-patch16-224` | ViT-like – attention visualization and GradCAM supported |
+| EfficientNet-B7 | `google/efficientnet-b7` | CNN – loaded via Hugging Face when possible; if HF loading triggers a `torch.load` restriction, the app falls back to `timm` (no torch upgrade required). GradCAM supported; attention visualization not applicable |
+
+Notes:
+- Attention visualizations (patch-level attention maps) are meaningful for ViT-style models (ViT, DeiT). For CNNs (ResNet, EfficientNet) and some hierarchical transformers (Swin), the dashboard will use GradCAM or a last-conv fallback instead of patch attention.
+- EfficientNet on the Hugging Face Hub can trigger a `torch.load` security restriction in older torch versions. The toolkit will transparently fall back to a `timm`-based loader to avoid requiring a torch upgrade; this is handled automatically in `src/model_loader.py`.
+
+**Easy to extend**: Add more models to `src/model_loader.py` under `SUPPORTED_MODELS` and they will appear in the app dropdown.
 
 ---
 
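The README notes above amount to a simple applicability rule: patch-level attention maps only make sense for ViT-style models, while GradCAM covers CNNs and Swin. A hedged sketch of such a rule follows; the function name, the `family` strings, and the inclusion of GradientSHAP are illustrative assumptions, not the dashboard's actual API.

```python
# Illustrative sketch (assumed names) of the explainer-applicability rule
# described in this commit's docs: attention maps for ViT-style models only,
# GradCAM/GradientSHAP for everything else.

VIT_STYLE = {"vit", "deit"}

def available_explainers(family):
    """Explainers offered for a model family string like 'vit' or 'cnn'."""
    if family in VIT_STYLE:
        return ["attention", "gradcam", "gradient_shap"]
    # ResNet/EfficientNet (CNN) and Swin get no patch-attention maps;
    # a last-conv GradCAM fallback applies instead.
    return ["gradcam", "gradient_shap"]

print(available_explainers("deit"))  # ['attention', 'gradcam', 'gradient_shap']
print(available_explainers("swin"))  # ['gradcam', 'gradient_shap']
```

Keeping the rule in one place like this would let the UI grey out inapplicable options instead of merely documenting the restriction.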