BEAT: Behavioral Encoder for Action Trajectories

A foundation transformer model that encodes sequences of human behavioral events into dense, reusable embeddings.

What is BEAT?

Most teams that predict churn, recommend products, or segment users start by hand-engineering features from behavioral data (RFM scores, click counts, session metrics). These features compress each user's history into a few aggregates, so much of the sequential and temporal signal can be lost at this step.

BEAT is designed to remove that step. Feed it raw event sequences (page views, purchases, searches, support tickets) and it returns a 768-dimensional embedding that captures the user's behavioral state.

Key Innovation

Unlike text transformers (BERT, GPT) that encode language, BEAT is designed specifically for action sequences with temporal dynamics:

Temporal encoding: Learns from time gaps between events (a purchase 1 day after browsing means something different from a purchase 30 days after)
Action vocabulary: Encodes event types, not words
Behavioral context: Understands that the same action means different things in different sequences
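
To make the temporal-encoding idea concrete, the sketch below maps a time gap in days to a fixed sinusoidal feature vector, clipping gaps at 90 days to match the window listed under Architecture. The function name, the frequency schedule (borrowed from standard Transformer positional encodings), and the 16-dimensional output are assumptions for illustration. BEAT's actual encoding also has a learned component and may differ in every detail.

```python
import numpy as np

def sinusoidal_time_encoding(gaps_days, dim=16, max_days=90.0):
    """Map time gaps (in days) to sinusoidal features of size `dim`.

    Hypothetical helper, not BEAT's implementation: gaps are clipped to
    [0, max_days] and encoded with the sin/cos frequency schedule used for
    positions in the original Transformer. `dim` must be even.
    """
    gaps = np.clip(np.asarray(gaps_days, dtype=np.float64), 0.0, max_days)
    # Geometric frequency schedule: 1 / 10000^(2i/dim)
    freqs = 1.0 / (10000.0 ** (np.arange(0, dim, 2) / dim))
    angles = gaps[..., None] * freqs           # shape [..., dim / 2]
    enc = np.empty(gaps.shape + (dim,))
    enc[..., 0::2] = np.sin(angles)            # even slots hold sines
    enc[..., 1::2] = np.cos(angles)            # odd slots hold cosines
    return enc

# A 1-day gap, a 30-day gap, and a 400-day gap (clipped to 90 days)
enc = sinusoidal_time_encoding([1.0, 30.0, 400.0])
print(enc.shape)  # (3, 16)
```

In a full model, a vector like this would typically be added to (or concatenated with) the action and property embeddings before the transformer layers, so that the same action at different time gaps yields a different input representation.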

Usage

from transformers import AutoModel
import torch

# Load the model. If the repository ships custom modeling code,
# from_pretrained may also need trust_remote_code=True.
model = AutoModel.from_pretrained("your-org/beat-encoder")

# Encode one behavioral sequence (batch size 1, 8 events)
action_ids = torch.tensor([[1, 2, 3, 5, 1, 6, 2, 5]])  # page_view, product_view, cart, purchase...
property_ids = torch.tensor([[12, 45, 45, 45, 8, 3, 22, 22]])  # category/property context
time_gaps = torch.tensor([[0.0, 0.1, 0.5, 1.2, 3.0, 3.1, 7.0, 7.5]])  # days between events

# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(action_ids, property_ids, time_gaps)
embedding = outputs["embedding"]  # [1, 768], user behavioral state
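
Real event logs usually arrive as timestamped records rather than id lists. The following is a minimal sketch of one way to turn such a log into the three parallel inputs used above. The vocabulary mapping is hypothetical (it only mirrors the order of the names in the comment on `action_ids`), and `build_inputs` is not part of the released model. The example time values above are non-decreasing, which reads like time elapsed since the first event; that interpretation is an assumption here and should be checked against the paper before use.

```python
from datetime import datetime

# Hypothetical vocabulary: the real action ids are defined by the model's
# training setup and are not documented in this README.
ACTION_VOCAB = {"page_view": 1, "product_view": 2, "cart": 3, "purchase": 5}

def build_inputs(events, max_len=256):
    """Convert (timestamp, action, property_id) tuples into parallel lists.

    Times are expressed in days elapsed since the first kept event
    (an assumption about what the model expects). Only the most recent
    `max_len` events are kept, matching the 256-event sequence limit.
    """
    if not events:
        raise ValueError("need at least one event")
    events = sorted(events, key=lambda e: e[0])[-max_len:]
    start = events[0][0]
    action_ids = [ACTION_VOCAB[action] for _, action, _ in events]
    property_ids = [prop for _, _, prop in events]
    times = [(ts - start).total_seconds() / 86400.0 for ts, _, _ in events]
    return action_ids, property_ids, times

events = [
    (datetime(2024, 1, 1, 0, 0), "page_view", 12),
    (datetime(2024, 1, 1, 12, 0), "product_view", 45),
    (datetime(2024, 1, 2, 0, 0), "cart", 45),
    (datetime(2024, 1, 4, 0, 0), "purchase", 45),
]
action_ids, property_ids, times = build_inputs(events)
print(action_ids)  # [1, 2, 3, 5]
print(times)       # [0.0, 0.5, 1.0, 3.0]
```

Wrapping each list as `torch.tensor([values])` adds the batch dimension the model call expects.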

Pre-training Objectives

Masked Event Prediction: Randomly mask 15% of events, predict the action type (like MLM in BERT)
Next Event Prediction: Given a sequence, predict what action comes next
Contrastive Learning: Different time windows of the same user should produce similar embeddings
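
Two of these objectives can be sketched in a few lines of NumPy. The code below is illustrative only: `MASK_ID`, the `-100` "ignore" label (a convention borrowed from PyTorch's cross-entropy loss), the InfoNCE form of the contrastive loss, and the temperature are all assumptions, since this README does not specify the exact losses used to train BEAT.

```python
import numpy as np

MASK_ID = 0  # hypothetical id reserved for a masked event

def mask_events(action_ids, mask_prob=0.15, rng=None):
    """Masked event prediction: hide ~15% of events and keep their labels.

    Returns (masked_ids, labels); labels are -100 where nothing was masked,
    so those positions can be skipped when computing the loss.
    """
    if rng is None:
        rng = np.random.default_rng()
    ids = np.asarray(action_ids)
    mask = rng.random(ids.shape) < mask_prob
    masked = np.where(mask, MASK_ID, ids)
    labels = np.where(mask, ids, -100)
    return masked, labels

def info_nce(view_a, view_b, temperature=0.1):
    """Contrastive loss: row i of view_a should match row i of view_b."""
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

ids = [1, 2, 3, 5, 1, 6, 2, 5]
masked, labels = mask_events(ids, rng=np.random.default_rng(0))

# Stand-in embeddings for two time windows of four users
x = np.random.default_rng(1).normal(size=(4, 8))
aligned, shuffled = info_nce(x, x), info_nce(x, x[::-1])
print(aligned < shuffled)  # True
```

The last line shows the intended behavior of the contrastive term: the loss is lower when each user's two windows are paired correctly than when they are paired with another user's window.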

Downstream Tasks

BEAT embeddings can be used for:

Task	Method	Expected Improvement
Churn prediction	Linear probe on embedding	+8-15% AUC vs. manual features
User segmentation	Cluster embeddings	More stable, interpretable clusters
Next-best-action	Fine-tune prediction head	Captures temporal patterns manual features miss
Personalization	Nearest-neighbor in embedding space	Real behavioral similarity, not just demographics
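
As an illustration of the nearest-neighbor row, the sketch below ranks users by cosine similarity in embedding space. It uses random vectors as stand-ins for real BEAT embeddings, and `nearest_neighbors` is a hypothetical helper, not an API shipped with the model. For large user bases an approximate nearest-neighbor index would likely replace the brute-force search.

```python
import numpy as np

def nearest_neighbors(query, embeddings, k=3):
    """Return indices and cosine similarities of the k rows closest to `query`."""
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = e @ q                      # cosine similarity to every user
    top = np.argsort(-sims)[:k]       # highest similarity first
    return top, sims[top]

# Stand-in data: random vectors in place of real [num_users, 768] embeddings
rng = np.random.default_rng(0)
user_embeddings = rng.normal(size=(100, 768))

idx, sims = nearest_neighbors(user_embeddings[7], user_embeddings, k=3)
print(idx[0])  # 7, since a user is always its own nearest neighbor
```

In practice the query user would be dropped from the results, and the remaining neighbors would supply candidate items or actions.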

Training Data

Pre-trained on the REES46 e-commerce behavioral dataset (20M+ events from a multi-category online store):

50,000 users, 18,401 behavioral sequences
10,350 training steps across 10 epochs
Training loss decreased from 0.83 to 0.42
Hardware: 2× NVIDIA T4 GPU (~27 minutes)

The model is intended to be adapted to other behavioral domains through fine-tuning.
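
The training figures above imply a few derived quantities, worked out below as a quick sanity check. Only the step count, epoch count, sequence count, and wall-clock time come from this README. The per-step sequence count is an upper bound that assumes every sequence is used for training in every epoch, which may not hold if part of the data was held out.

```python
steps, epochs = 10_350, 10
minutes = 27
sequences = 18_401

steps_per_epoch = steps / epochs             # 1035.0
steps_per_second = steps / (minutes * 60)    # about 6.4 across both GPUs

# Assumption: all sequences are seen once per epoch (no held-out split)
seqs_per_step = sequences / steps_per_epoch  # about 17.8

print(steps_per_epoch)
print(round(steps_per_second, 2))
print(round(seqs_per_step, 1))
```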

Architecture

Parameter	Value
Hidden size	768
Layers	12
Attention heads	12
Parameters	86.4M
Embedding output	768-dim
Max sequence length	256 events
Temporal encoding	Learned + sinusoidal (90-day window)
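
The reported parameter count can be roughly cross-checked against the table. The sketch below counts the weights of a standard transformer encoder stack with the listed hidden size and depth. The feed-forward width (4 × hidden) and the layer layout (biases on all linear layers, two LayerNorms per layer) are assumptions, since this README does not state them.

```python
def encoder_params(hidden=768, layers=12, ffn=None):
    """Parameter count of a standard transformer encoder stack.

    Assumes a BERT-style layer: Q/K/V/output projections with biases,
    a two-layer feed-forward block, and two LayerNorms. Embedding tables
    and task heads are not included.
    """
    if ffn is None:
        ffn = 4 * hidden                        # assumed feed-forward width
    attn = 4 * (hidden * hidden + hidden)       # Q, K, V, output projections
    mlp = hidden * ffn + ffn + ffn * hidden + hidden
    norms = 2 * 2 * hidden                      # scale and shift, twice per layer
    return layers * (attn + mlp + norms)

total = encoder_params()
print(total)  # 85054464
```

Under these assumptions the encoder stack accounts for about 85.1M of the reported 86.4M parameters, which would leave roughly 1.3M for the action, property, and temporal embeddings plus any output heads.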

Paper

📄 BEAT: A Foundation Model for Human Behavioral Sequences
Published on Zenodo, DOI: 10.5281/zenodo.20774886

Citation

@article{dhanani2026beat,
  title     = {BEAT: A Foundation Model for Human Behavioral Sequences},
  author    = {Dhanani, Brijesh},
  year      = {2026},
  doi       = {10.5281/zenodo.20774886},
  url       = {https://doi.org/10.5281/zenodo.20774886},
  publisher = {Zenodo}
}

License

Apache 2.0

Model weights: Safetensors, 86.4M params, tensor type F32