TextEmbedding3SmallSentimentHead

For when you need a sentiment classifier on top of embeddings from OpenAI's text-embedding-3-small model.

Model Description

  • What this is: A compact PyTorch classifier head trained on top of text-embedding-3-small (1536-dim) to predict sentiment: negative, neutral, positive.
  • Data: Preprocessed from the Kaggle Sentiment Analysis Dataset.
  • Metrics (val): F1 macro ≈ 0.89, Accuracy ≈ 0.89 on a held-out validation split.
  • Architecture: Simple MLP head (256 hidden units, dropout 0.2), trained for 5 epochs with Adam.

Input/Output

  • Input: Float32 tensor of shape [batch, 1536] (OpenAI text-embedding-3-small embeddings).
  • Output: Logits over 3 classes. Argmax → {0: negative, 1: neutral, 2: positive}; see the decoding sketch below.
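
A minimal sketch of turning the logits into label strings and confidences. The id2label dict and the decode helper are illustrative, not part of the checkpoint:

import torch
import torch.nn.functional as F

id2label = {0: "negative", 1: "neutral", 2: "positive"}  # class mapping from this card

def decode(logits: torch.Tensor):
    # logits: [batch, 3] -> list of (label, confidence) pairs
    probs = F.softmax(logits, dim=-1)
    conf, idx = probs.max(dim=-1)
    return [(id2label[i.item()], c.item()) for i, c in zip(idx, conf)]

# Example: decode(torch.randn(2, 3)) might return [("neutral", 0.52), ("positive", 0.61)]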

Usage

from transformers import AutoModel
import torch

# Load model
model = AutoModel.from_pretrained(
    "marcovise/TextEmbedding3SmallSentimentHead", 
    trust_remote_code=True
).eval()

# Your 1536-dim OpenAI embeddings
embeddings = torch.randn(4, 1536)  # batch of 4 examples

# Predict sentiment
with torch.no_grad():
    logits = model(inputs_embeds=embeddings)["logits"]  # [batch, 3]
    predictions = logits.argmax(dim=1)  # [batch]
    # 0=negative, 1=neutral, 2=positive

print(predictions)  # e.g. tensor([1, 0, 2, 1]); actual values depend on the input embeddings
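
To classify real text instead of random tensors, you can generate the embeddings with the official openai Python client. This is a sketch, assuming the openai package is installed, OPENAI_API_KEY is set in the environment, and the example texts are placeholders:

from openai import OpenAI
import torch

client = OpenAI()  # reads OPENAI_API_KEY from the environment

texts = ["I love this!", "It was okay.", "Terrible experience."]  # illustrative inputs
resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
embeddings = torch.tensor([d.embedding for d in resp.data], dtype=torch.float32)  # [3, 1536]

with torch.no_grad():
    logits = model(inputs_embeds=embeddings)["logits"]  # [3, 3]
    predictions = logits.argmax(dim=1)  # 0=negative, 1=neutral, 2=positive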

Training Details

  • Training data: Kaggle Sentiment Analysis Dataset
  • Preprocessing: Text → OpenAI embeddings → 3-class labels {negative: 0.0, neutral: 0.5, positive: 1.0}
  • Architecture: 1536 → 256 → ReLU → Dropout(0.2) → 3 classes
  • Optimizer: Adam (lr=1e-3, weight_decay=1e-4)
  • Loss: CrossEntropyLoss with label smoothing (0.05)
  • Epochs: 5
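
Putting the bullets above together, a minimal sketch of an equivalent head and training loop. This is not the exact training script; train_embeds/train_labels stand in for the preprocessed dataset and the batch size is an assumption:

import torch
import torch.nn as nn

# 1536 -> 256 -> ReLU -> Dropout(0.2) -> 3 classes, as described above
head = nn.Sequential(
    nn.Linear(1536, 256),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(256, 3),
)

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss(label_smoothing=0.05)

# train_embeds: [N, 1536] float32, train_labels: [N] int64 (placeholders for the preprocessed data)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(train_embeds, train_labels),
    batch_size=64,  # batch size not stated on this card; 64 is an assumption
    shuffle=True,
)

for epoch in range(5):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(head(x), y)
        loss.backward()
        optimizer.step()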

Intended Use

  • Quick, lightweight sentiment classification for short text once embeddings are available.
  • Works best on general sentiment analysis text similar to the training distribution.

Limitations

  • Trained on a specific sentiment dataset; may have domain bias.
  • Requires OpenAI text-embedding-3-small embeddings as input.
  • Not intended for safety-critical applications; evaluate on your own data before production use.
  • May reflect biases present in the training data.

License

MIT
