
Neural Mathrock: Multimodal Emotion and Personality Analysis

This repository hosts a multimodal deep learning framework specialized in the affective and psychological analysis of Math Rock and Midwest Emo music. By integrating lyrical semantics and acoustic patterns, the system provides a comprehensive profile of a track's emotional and personality-based characteristics.

Project Objectives

The research prioritizes emotional resonance and genre-specific complexity across three tasks:

  1. Emotion Recognition: Identifying affective states (e.g., Sadness, Tension, Joy) through the synergy of vocal delivery and lyrical themes.
  2. Personality (MBTI) Profiling: Correlating complex musical arrangements and introspective lyrics with personality archetypes (e.g., INFP, INTJ, INTP).
  3. Acoustic Feature Extraction: Analyzing technical attributes of Math Rock, including non-standard time signatures, syncopation, and clean guitar timbres.

Technical Architecture: Late Fusion Multimodal

The system utilizes a Late Fusion approach to process distinct data modalities:

1. Lyrical Stream (NLP)

  • Encoder: xlm-roberta-base
  • Logic: Extracts high-level semantic embeddings from song lyrics. The encoder is frozen to maintain stable pre-trained representations given the specialized nature of the dataset.
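The freezing strategy described above can be sketched in PyTorch. A tiny stand-in module replaces the real xlm-roberta-base encoder so the sketch runs without downloading weights; the 768-dim output and the frozen-parameter pattern are what matter here.

```python
import torch
import torch.nn as nn

# Stand-in for the text encoder (the real system uses xlm-roberta-base
# from Hugging Face transformers; a tiny module is substituted here so
# the sketch is self-contained).
encoder = nn.Sequential(nn.Embedding(1000, 768), nn.Linear(768, 768))

# Freeze the encoder: its pre-trained weights receive no gradient updates,
# keeping representations stable on a small, specialized dataset.
for p in encoder.parameters():
    p.requires_grad = False

token_ids = torch.randint(0, 1000, (1, 16))  # one sequence of 16 token ids
emb = encoder(token_ids)                     # (1, 16, 768) token embeddings
pooled = emb.mean(dim=1)                     # mean-pool to a 768-dim vector
```

Freezing keeps only the downstream layers trainable, which reduces overfitting when the fine-tuning corpus is far smaller than the pre-training corpus.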

2. Acoustic Stream (DSP)

  • Model: 1D-Convolutional Neural Network (CNN)
  • Input: 20-channel Mel-frequency cepstral coefficients (MFCC).
  • Logic: Captures the "twinkly" guitar textures and erratic drum patterns common in Midwest Emo and Math Rock.
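A minimal sketch of such an acoustic stream follows. Only the 20-channel MFCC input and the 256-dim output match the description above; the intermediate layer sizes and kernel widths are illustrative assumptions, not the repository's exact configuration.

```python
import torch
import torch.nn as nn

# 1D CNN over MFCC frames: channels = MFCC coefficients, width = time.
# Layer sizes are assumptions; input (20 ch) and output (256-dim) follow
# the model-card description.
audio_cnn = nn.Sequential(
    nn.Conv1d(20, 64, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Conv1d(64, 128, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),  # collapse the time axis to one frame
    nn.Flatten(),
    nn.Linear(128, 256),      # 256-dim audio embedding
)

mfcc = torch.randn(1, 20, 400)  # (batch, mfcc_channels, time_frames)
audio_emb = audio_cnn(mfcc)     # (1, 256)
```

Treating the 20 MFCC coefficients as input channels lets the convolutions slide along time, which suits the irregular rhythmic patterns the stream is meant to capture.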

3. Fusion Layer

  • Method: Feature concatenation (768-dim Text + 256-dim Audio).
  • Heads: Multi-task fully connected layers for joint Emotion and Personality classification.
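The fusion step can be sketched as plain concatenation followed by per-task heads. The class counts (6 emotions, 16 MBTI types) are illustrative assumptions; the 768- and 256-dim inputs follow the description above.

```python
import torch
import torch.nn as nn

# Late fusion: concatenate modality embeddings, then branch into
# task-specific heads. Class counts below are assumptions.
text_emb = torch.randn(1, 768)   # from the frozen text encoder
audio_emb = torch.randn(1, 256)  # from the acoustic CNN

fused = torch.cat([text_emb, audio_emb], dim=-1)  # (1, 1024)

emotion_head = nn.Linear(1024, 6)    # e.g. 6 affective states
mbti_head = nn.Linear(1024, 16)      # 16 MBTI archetypes

emotion_logits = emotion_head(fused)
mbti_logits = mbti_head(fused)
```

Because both heads share the fused representation, the two tasks regularize each other while still being scored independently.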

Performance Summary (Weighted Evaluation)

The model was optimized using Weighted Cross-Entropy Loss to mitigate significant class imbalances.
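Weighted cross-entropy can be sketched as follows; the per-class counts are invented for illustration, and inverse-frequency weighting is one common choice (the repository may compute weights differently).

```python
import torch
import torch.nn as nn

# Illustrative per-class sample counts (invented); rarer classes get
# proportionally larger weights so they are not drowned out.
counts = torch.tensor([500.0, 100.0, 20.0])
weights = counts.sum() / (len(counts) * counts)  # inverse-frequency weights

criterion = nn.CrossEntropyLoss(weight=weights)
logits = torch.randn(8, 3)                # (batch, num_classes)
targets = torch.randint(0, 3, (8,))       # ground-truth class indices
loss = criterion(logits, targets)         # scalar, imbalance-aware
```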

  Metric            Score
  Accuracy          0.81
  Weighted Avg F1   0.76
  Macro Avg F1      0.45
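The gap between weighted F1 (0.76) and macro F1 (0.45) reflects class imbalance: weighted F1 scales each class's score by its support, while macro F1 averages all classes equally. The per-class values and supports below are invented purely to illustrate the arithmetic.

```python
# Invented per-class F1 scores and supports (not the model's real numbers).
f1 = {"INFP": 0.89, "INTJ": 1.00, "ISTP": 0.53, "rare": 0.00}
support = {"INFP": 70, "INTJ": 5, "ISTP": 15, "rare": 10}

macro = sum(f1.values()) / len(f1)                       # equal class weight
total = sum(support.values())
weighted = sum(f1[c] * support[c] / total for c in f1)   # support-weighted
# A strong majority class (INFP) lifts weighted F1 above macro F1.
```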

Class Highlights:

  • INTJ: 1.00 F1-score (Highly distinctive acoustic-lyrical signatures).
  • INFP: 0.89 F1-score (Robust detection of the genre's majority archetype).
  • ISTP: 0.53 F1-score (Moderate detection of niche, instrumental-leaning patterns).

Dataset

Trained on the anggars/neural-mathrock dataset, containing specialized annotations for emotion, MBTI, and musical features.

Academic Context

This project is an undergraduate thesis developed at Sekolah Tinggi Teknologi Cipasung (STTC), Informatics Department, Class of 2022. It explores the intersection of Music Information Retrieval (MIR) and psychological profiling.

How to Use

import torch

# Sketch only; the model class name below is hypothetical and depends on
# this repository's code files.
# model = NeuralMathrockModel()
# model.load_state_dict(torch.load("pytorch_model.bin", map_location="cpu"))
# model.eval()