



What Is This?

HybridEmotionNet is a dual-branch neural network for real-time facial emotion recognition that fuses EfficientNet-B0 appearance features with MediaPipe 3D landmark geometry via bidirectional cross-attention.

The pipeline processes webcam frames at 30+ FPS, extracts 478 3D landmarks per face, crops the face, and classifies it into one of 7 emotions, with temporal smoothing across frames.
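The smoothing method is not described here. As a minimal sketch, assuming an exponential moving average over per-frame class probabilities (the class name `EmaSmoother` and the `alpha` value are hypothetical, not taken from the repo), temporal smoothing could look like this:

```python
from typing import List, Optional

EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprised"]


class EmaSmoother:
    """Exponential moving average over per-frame class probabilities.

    Assumption: the repo may smooth differently (e.g. a sliding window).
    """

    def __init__(self, alpha: float = 0.3) -> None:
        # alpha is a hypothetical smoothing factor: higher reacts faster.
        self.alpha = alpha
        self.state: Optional[List[float]] = None

    def update(self, probs: List[float]) -> List[float]:
        """Blend the newest frame into the running average and return it."""
        if self.state is None:
            self.state = list(probs)  # first frame initialises the average
        else:
            self.state = [
                self.alpha * p + (1.0 - self.alpha) * s
                for p, s in zip(probs, self.state)
            ]
        return self.state

    def label(self) -> str:
        """Return the emotion with the highest smoothed probability."""
        if self.state is None:
            raise ValueError("call update() at least once first")
        best = max(range(len(self.state)), key=self.state.__getitem__)
        return EMOTIONS[best]
```

With this scheme a single outlier frame does not flip the displayed label; the prediction changes only once the new emotion persists for a few frames.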


Architecture

Face crop (224×224) ──► EfficientNet-B0 ──► [B, 256] appearance
478 landmarks (xyz)  ──► MLP encoder    ──► [B, 256] geometry
                               │
               Bidirectional Cross-Attention (4 heads each)
               ┌──────────────────────────────────────────┐
               │  coord → CNN  (geometry queries appear.) │
               │  CNN  → coord (appear. queries geometry) │
               └──────────────────────────────────────────┘
                               │
               Fusion MLP: 512 → 384 → 256 → 128
                               │
               Classifier:   128 → 7 emotions

| Component | Detail |
| --- | --- |
| CNN branch | EfficientNet-B0, ImageNet init, blocks 0-2 frozen |
| Coord branch | MLP 1434 → 512 → 384 → 256, BN + Dropout |
| Fusion | Bidirectional cross-attention + MLP |
| Parameters | 6.2M total / 5.75M trainable |
| Model size | 72 MB |
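
The coord branch input width of 1434 follows from 478 landmarks with 3 coordinates each. The sketch below checks that arithmetic and counts the weights and biases of the three fully connected layers. It is only an illustration: the helper names are hypothetical, and batch-norm parameters are left out because their exact placement is not stated here.

```python
NUM_LANDMARKS = 478
COORDS_PER_LANDMARK = 3  # x, y, z
LAYER_SIZES = [NUM_LANDMARKS * COORDS_PER_LANDMARK, 512, 384, 256]


def flatten_landmarks(landmarks):
    """Flatten [(x, y, z), ...] into one list of length 1434."""
    flat = [value for point in landmarks for value in point]
    if len(flat) != LAYER_SIZES[0]:
        raise ValueError(f"expected {LAYER_SIZES[0]} values, got {len(flat)}")
    return flat


def linear_param_count(sizes):
    """Weights plus biases of the fully connected layers only."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


print(LAYER_SIZES[0])                   # 1434
print(linear_param_count(LAYER_SIZES))  # 1030272
```

Under these assumptions the linear layers of the coord branch account for roughly 1.03M of the 6.2M total parameters.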

Files in This Repo

| File | Size | Required |
| --- | --- | --- |
| models/weights/hybrid_best_model.pth | 72 MB | Yes (model weights) |
| models/scalers/hybrid_coordinate_scaler.pkl | 18 KB | Yes (landmark scaler) |
| Architecture digram.png | - | No (docs only) |
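
Before running inference it can help to confirm that both required files are in place. A small sketch using only the standard library (the function name `missing_files` is hypothetical and not part of the repo):

```python
from pathlib import Path

# Paths taken from the table above, relative to the repo root.
REQUIRED_FILES = [
    "models/weights/hybrid_best_model.pth",
    "models/scalers/hybrid_coordinate_scaler.pkl",
]


def missing_files(root="."):
    """Return the required files that are not present under root."""
    base = Path(root)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required files are present.")
```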

Quick Start

1. Clone the code

git clone https://github.com/Huuffy/VisageCNN.git
cd VisageCNN
# Windows activation shown; on Linux/macOS use: source venv/bin/activate
python -m venv venv && venv\Scripts\activate
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt

2. Download weights

from huggingface_hub import hf_hub_download
import shutil, pathlib

for remote, local in [
    ("models/weights/hybrid_best_model.pth",        "models/weights/hybrid_best_model.pth"),
    ("models/scalers/hybrid_coordinate_scaler.pkl", "models/scalers/hybrid_coordinate_scaler.pkl"),
]:
    src = hf_hub_download(repo_id="Huuffy/VisageCNN", filename=remote)
    pathlib.Path(local).parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(src, local)

Or with the HF CLI:

hf download Huuffy/VisageCNN models/weights/hybrid_best_model.pth --local-dir .
hf download Huuffy/VisageCNN models/scalers/hybrid_coordinate_scaler.pkl --local-dir .

3. Run inference

python inference/run_hybrid.py

Press Q to quit.


Emotion Classes

| Label | Emotion | Key Signals |
| --- | --- | --- |
| 0 | Angry | Furrowed brows, tightened jaw |
| 1 | Disgust | Raised upper lip, wrinkled nose |
| 2 | Fear | Wide eyes, raised brows, open mouth |
| 3 | Happy | Raised cheeks, open smile |
| 4 | Neutral | Relaxed, no strong deformation |
| 5 | Sad | Lowered brow corners, downturned lips |
| 6 | Surprised | Raised brows, wide eyes, dropped jaw |
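
To turn the classifier's 7 raw outputs into one of these labels, something like the following could be used. This is a sketch under the assumption that the model emits one logit per class in the label order above; `softmax` and `decode` are illustrative helpers, not functions from the repo.

```python
import math

# Index in this list equals the label id from the table above.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprised"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Map 7 logits to (emotion name, probability of that emotion)."""
    if len(logits) != len(EMOTIONS):
        raise ValueError(f"expected {len(EMOTIONS)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs[best]
```

For example, `decode([0, 0, 0, 5, 0, 0, 0])` picks label 3, "Happy".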

Training Dataset

~30k clean images, with FER2013 noise removed across all classes:

| Class | Images | Sources |
| --- | --- | --- |
| Angry | 6,130 | RAF-DB + AffectNet + AffectNet-Short + CK+ |
| Surprised | 5,212 | RAF-DB + AffectNet |
| Sad | 4,941 | RAF-DB + AffectNet + AffectNet-Short + CK+ |
| Disgust | 3,782 | AffectNet-Short + RAF-DB + CK+ |
| Neutral | 3,475 | RAF-DB + AffectNet |
| Fear | 3,418 | AffectNet-Short + RAF-DB + CK+ |
| Happy | 3,124 | RAF-DB + AffectNet |

Max class imbalance: ~1.96× (6,130 Angry vs. 3,124 Happy)
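
The totals and the imbalance ratio can be reproduced from the per-class counts listed above. The inverse-frequency weights at the end are purely illustrative: nothing here says the training script uses class weights.

```python
# Per-class image counts copied from the dataset table.
CLASS_COUNTS = {
    "Angry": 6130,
    "Surprised": 5212,
    "Sad": 4941,
    "Disgust": 3782,
    "Neutral": 3475,
    "Fear": 3418,
    "Happy": 3124,
}

total = sum(CLASS_COUNTS.values())
imbalance = max(CLASS_COUNTS.values()) / min(CLASS_COUNTS.values())

# Hypothetical inverse-frequency weights: total / (num_classes * count).
weights = {
    name: total / (len(CLASS_COUNTS) * count)
    for name, count in CLASS_COUNTS.items()
}

print(total)                # 30082
print(round(imbalance, 2))  # 1.96
```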


Training Config

| Setting | Value |
| --- | --- |
| Loss | Focal Loss (γ=2.0) + label smoothing 0.12 |
| Optimizer | AdamW, weight decay 0.05 |
| LR | OneCycleLR: CNN 5e-5, fusion 5e-4 |
| Batch | 128 + gradient accumulation ×2 (effective 256) |
| Augmentation | CutMix + noise + rotation + zoom |
| Mixed precision | torch.amp (AMP) |
| Early stopping | patience=40 on val accuracy |
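
How the training script combines focal loss with label smoothing is not spelled out here. One common formulation, shown as a single-sample sketch in plain Python, applies the focal factor (1 - p)^γ to each class term of a cross-entropy against smoothed targets. The function names are hypothetical and the repo's implementation may differ.

```python
import math


def smoothed_targets(label, num_classes=7, smoothing=0.12):
    """One-hot target with `smoothing` mass spread evenly over all classes."""
    off = smoothing / num_classes
    return [
        off + (1.0 - smoothing) * (1.0 if k == label else 0.0)
        for k in range(num_classes)
    ]


def focal_loss(probs, label, gamma=2.0, smoothing=0.12):
    """Focal-weighted cross-entropy for one sample.

    probs: predicted class probabilities (after softmax).
    Assumed form: -sum_k q_k * (1 - p_k)^gamma * log(p_k).
    """
    targets = smoothed_targets(label, len(probs), smoothing)
    eps = 1e-12  # guards against log(0)
    return -sum(
        t * (1.0 - p) ** gamma * math.log(max(p, eps))
        for t, p in zip(targets, probs)
    )
```

With `gamma=0` and `smoothing=0` this reduces to ordinary cross-entropy, which is a quick way to sanity-check the function.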

Retrain From Scratch

# Build dataset (downloads ~30k clean images from HuggingFace)
pip install datasets
python scripts/prepare_dataset.py

# Delete old cache and train (Windows syntax shown; on Linux/macOS use: rm -rf models/cache)
rmdir /s /q models\cache
python scripts/train_hybrid.py

Full training guide: GitHub README


Built with curiosity and a lot of training runs
