DADOES

Do Androids Dream of Electric Sheep?

DADOES is a microsecond-level text-to-mood inference engine for Rust projects.

The crate can be used as a library by other Rust projects. It also includes CLI and training binaries for local experimentation.

DADOES is not positioned as a replacement for transformer GoEmotions models on full-label accuracy. Its advantage is deployment shape: a 1.7 MB checkpoint, no Python, no PyTorch, no GPU, no Transformers runtime, and local Rust inference at 281,972.5 texts/s, or 3.546 us/text, on the current benchmark.

Use it as a cheap first mood signal for agent reports, chat logs, issue/support routing, local log or inbox triage, and high-volume text-stream pre-filtering.

The current implementation is a compact inference baseline:

  • multi-label mood taxonomy
  • hashed text features
  • linear sigmoid classifier
  • SGD training on a public mixed English mood dataset assembled from GoEmotions-compatible sources and DADOES seed examples
  • validation-loss early stopping
  • 1.7 MB binary checkpoint
  • binary checkpoint save/load
  • JSON CLI output

The classifier core is intentionally small. Dataset import uses structured CSV and JSON parsers at the training boundary, while the public Rust interface stays stable for downstream projects.
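The combination of hashed text features, a linear sigmoid classifier, and SGD listed above can be sketched in a few dozen lines. This is a minimal illustration, not the crate's actual internals: the bucket count, the FNV-1a hash, the tokenizer, and the names `featurize`, `LinearModel`, and `sgd_step` are all assumptions made for the example.

```rust
// Illustrative sketch: bucket count, hash, and tokenizer are assumptions,
// not the DADOES crate's actual internals.
const BUCKETS: usize = 1 << 10;

/// FNV-1a hash of a token, reduced to a bucket index.
fn bucket(token: &str) -> usize {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in token.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    (hash % BUCKETS as u64) as usize
}

/// Lowercased alphanumeric tokens mapped to hashed bucket indices.
fn featurize(text: &str) -> Vec<usize> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|token| !token.is_empty())
        .map(|token| bucket(&token.to_lowercase()))
        .collect()
}

fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// One independent sigmoid output per label (multi-label, not softmax).
struct LinearModel {
    weights: Vec<Vec<f32>>, // indexed as [label][bucket]
    bias: Vec<f32>,         // indexed as [label]
}

impl LinearModel {
    fn new(labels: usize) -> Self {
        Self {
            weights: vec![vec![0.0; BUCKETS]; labels],
            bias: vec![0.0; labels],
        }
    }

    fn scores(&self, features: &[usize]) -> Vec<f32> {
        self.weights
            .iter()
            .zip(&self.bias)
            .map(|(row, bias)| sigmoid(features.iter().map(|&i| row[i]).sum::<f32>() + bias))
            .collect()
    }

    /// One SGD step on per-label binary cross-entropy for a single example.
    fn sgd_step(&mut self, features: &[usize], targets: &[f32], lr: f32) {
        let scores = self.scores(features);
        for (label, (&p, &y)) in scores.iter().zip(targets).enumerate() {
            let grad = p - y; // d(loss)/d(logit) for sigmoid + cross-entropy
            for &i in features {
                self.weights[label][i] -= lr * grad;
            }
            self.bias[label] -= lr * grad;
        }
    }
}

fn main() {
    let features = featurize("I felt frustrated and exhausted");
    let mut model = LinearModel::new(2); // two hypothetical labels
    for _ in 0..20 {
        model.sgd_step(&features, &[1.0, 0.0], 0.1);
    }
    println!("{:?}", model.scores(&features));
}
```

Because each label has its own sigmoid, several moods can be active for the same text, which is what makes the taxonomy multi-label.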

Why DADOES

| Property | Current Value |
| --- | --- |
| Default checkpoint | 1,704,033 bytes |
| Mean classification latency | 3.546 us/text |
| Throughput | 281,972.5 texts/s |
| Runtime dependencies | no Python, PyTorch, GPU, or Transformers runtime |
| Integration shape | Rust library, CLI, embedded default checkpoint |

Library Usage

After the first crates.io release, add DADOES as a normal dependency:

[dependencies]
dadoes = "0.1.0"

For this checkout before release, use a local dependency:

[dependencies]
dadoes = { path = "/Users/ryan/DADOES" }

Use the embedded default checkpoint:

use dadoes::{DadoesClassifier, EmotionClassifier};

fn main() -> Result<(), dadoes::ModelIoError> {
    let classifier = DadoesClassifier::from_default_model()?;
    let analysis = classifier.classify(
        "I missed the deadline again and felt frustrated and exhausted.",
    );

    if let Some(primary) = analysis.primary_mood() {
        println!("primary={} score={:.3}", primary.mood.as_str(), primary.score);
    }

    for score in classifier.active_moods(&analysis) {
        println!("active={} score={:.3}", score.mood.as_str(), score.score);
    }

    Ok(())
}

Load a custom checkpoint when you train one:

let classifier = dadoes::DadoesClassifier::from_path("models/custom.dadoes")?;

Runnable example:

cargo run --release --example library_usage

Run

cargo run --release --bin dadoes -- "I missed the deadline again and felt frustrated and exhausted tonight."

The command prints:

{"primary_mood":"tired","moods":[...]}

The CLI loads models/goemotions-linear.dadoes when it exists. If no local checkpoint exists, it falls back to the embedded default checkpoint.
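The fallback rule can be pictured as a small resolution step. The sketch below is an assumption about the shape of that logic, not the CLI's real code: `CheckpointSource` and `resolve_checkpoint` are hypothetical names, and the real binary may test for the file differently.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical type for illustration; not part of the dadoes public API.
#[derive(Debug, PartialEq)]
enum CheckpointSource {
    Local(PathBuf),
    Embedded,
}

/// Prefer a local checkpoint file when present, else the embedded default.
fn resolve_checkpoint(local: &Path) -> CheckpointSource {
    if local.is_file() {
        CheckpointSource::Local(local.to_path_buf())
    } else {
        CheckpointSource::Embedded
    }
}

fn main() {
    let source = resolve_checkpoint(Path::new("models/goemotions-linear.dadoes"));
    println!("{source:?}");
}
```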

Current Evaluation

The headline metric for DADOES is inference cost. Accuracy is reported here to make the current signal quality inspectable, not to claim state-of-the-art mood understanding.

This is a multi-label classifier, so a single "accuracy" number is less useful than F1 and exact multi-label match rate. The closest strict accuracy measure is exact_match, which counts an example as correct only when its predicted label set equals the reference label set exactly.
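To make the two measures concrete, here is a minimal sketch that computes micro precision, recall, F1, and exact match from predicted and reference label sets. Encoding each label set as a bitmask, and the names `Summary` and `summarize`, are assumptions chosen for brevity; they are not the format used by the evaluate binary.

```rust
// Label sets are bitmasks here (bit k = label k); this encoding is an
// illustrative assumption, not the DADOES report format.
#[derive(Debug)]
struct Summary {
    micro_precision: f64,
    micro_recall: f64,
    micro_f1: f64,
    exact_match: f64,
}

/// Safe division that reports 0.0 when the denominator is zero.
fn ratio(num: f64, den: f64) -> f64 {
    if den == 0.0 { 0.0 } else { num / den }
}

/// `pairs` holds (predicted, reference) label bitmasks, one per example.
fn summarize(pairs: &[(u16, u16)]) -> Summary {
    let (mut tp, mut fp, mut fneg, mut exact) = (0u32, 0u32, 0u32, 0u32);
    for &(predicted, gold) in pairs {
        tp += (predicted & gold).count_ones(); // labels predicted and present
        fp += (predicted & !gold).count_ones(); // labels predicted but absent
        fneg += (!predicted & gold).count_ones(); // labels missed
        exact += u32::from(predicted == gold); // whole set must agree
    }
    let (tp, fp, fneg) = (f64::from(tp), f64::from(fp), f64::from(fneg));
    let precision = ratio(tp, tp + fp);
    let recall = ratio(tp, tp + fneg);
    Summary {
        micro_precision: precision,
        micro_recall: recall,
        micro_f1: ratio(2.0 * precision * recall, precision + recall),
        exact_match: ratio(f64::from(exact), pairs.len() as f64),
    }
}

fn main() {
    // Three toy examples over three labels.
    let pairs = [(0b011, 0b001), (0b100, 0b100), (0b000, 0b010)];
    println!("{:?}", summarize(&pairs));
}
```

Micro averaging pools true and false positives across all labels before dividing, so frequent labels dominate the score, while exact match penalizes any single wrong or missing label in an example.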

Current default checkpoint:

| Split | Examples | Loss | Micro Precision | Micro Recall | Micro F1 | Exact Match |
| --- | --- | --- | --- | --- | --- | --- |
| Public mixed validation | 9,423 | 0.3474 | 0.7763 | 0.5359 | 0.6341 | 0.2971 |
| Public mixed test | 10,032 | 0.3445 | 0.7832 | 0.5467 | 0.6439 | 0.3069 |

Training stopped at epoch 28, with the best validation checkpoint selected from epoch 24.
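Validation-loss early stopping of this kind can be sketched as follows. The patience value is an assumption: a patience of 4 is consistent with stopping at epoch 28 after a best epoch of 24, but the trainer's actual setting is not stated here, and `early_stop` is a hypothetical helper name.

```rust
/// Returns (best_epoch, epochs_trained), both 1-based, or None for no epochs.
/// Stops once `patience` epochs pass without a new lowest validation loss.
/// Assumes finite loss values.
fn early_stop(val_losses: &[f64], patience: usize) -> Option<(usize, usize)> {
    if val_losses.is_empty() {
        return None;
    }
    let mut best_epoch = 0;
    let mut best_loss = f64::INFINITY;
    for (index, &loss) in val_losses.iter().enumerate() {
        let epoch = index + 1;
        if loss < best_loss {
            best_loss = loss;
            best_epoch = epoch; // a real trainer would save the checkpoint here
        }
        if epoch - best_epoch >= patience {
            return Some((best_epoch, epoch));
        }
    }
    Some((best_epoch, val_losses.len()))
}

fn main() {
    // Toy loss curve: improves until epoch 3, then plateaus.
    let losses = [0.50, 0.40, 0.30, 0.35, 0.36, 0.37, 0.38, 0.20];
    println!("{:?}", early_stop(&losses, 4));
}
```

The checkpoint that ships is the one saved at the best epoch, not the one from the final epoch trained.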

These numbers are for the current linear hashed-feature baseline trained on the public mixed dataset. GoEmotions is one source in that mix; optional external sources under data/raw/external and DADOES seed examples are also used. They should be treated as a baseline, not a production ceiling.

Signal quality is uneven across labels. The current checkpoint is strongest on happy, satisfied, hopeful, and lonely, with lonely measured on a small 153-example supervised slice. Weaker labels include frustrated at F1 0.2255, curious at F1 0.3839, angry at F1 0.4187, sad at F1 0.4328, and excited at F1 0.4362. The bored and tired labels do not yet have supervised examples in the loaded mixed test, so they require held-out reviewed data before per-label accuracy can be reported.

Mixed per-mood diagnostic metrics at threshold 0.35:

| Mood | Supervised Examples | Positives | Accuracy | Precision | Recall | F1 | Coverage |
| --- | --- | --- | --- | --- | --- | --- | --- |
| happy | 9,879 | 3,534 | 0.8476 | 0.9135 | 0.6338 | 0.7484 | loaded mixed test |
| satisfied | 9,879 | 4,087 | 0.8198 | 0.8122 | 0.7343 | 0.7713 | loaded mixed test |
| excited | 9,879 | 1,504 | 0.8712 | 0.6543 | 0.3271 | 0.4362 | loaded mixed test |
| curious | 5,427 | 677 | 0.8841 | 0.5698 | 0.2895 | 0.3839 | loaded mixed test |
| anxious | 9,879 | 934 | 0.9288 | 0.6567 | 0.5182 | 0.5793 | loaded mixed test |
| frustrated | 5,427 | 699 | 0.8734 | 0.5319 | 0.1431 | 0.2255 | loaded mixed test |
| sad | 9,879 | 1,028 | 0.9167 | 0.7423 | 0.3054 | 0.4328 | loaded mixed test |
| angry | 9,879 | 1,063 | 0.9120 | 0.7245 | 0.2944 | 0.4187 | loaded mixed test |
| lonely | 153 | 89 | 0.7190 | 0.8286 | 0.6517 | 0.7296 | loaded mixed test |
| hopeful | 9,879 | 2,943 | 0.8631 | 0.8299 | 0.6799 | 0.7475 | loaded mixed test |
| neutral | 9,879 | 2,736 | 0.7973 | 0.6869 | 0.4931 | 0.5740 | loaded mixed test |

The loaded mixed test currently has no supervised examples for bored or tired; those labels should not be published as 0/0 metrics. They require held-out reviewed examples before a per-label accuracy can be reported.
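One way to avoid publishing 0/0 metrics is to make "no supervised examples" an explicit absent result. The sketch below assumes a per-label view of the test set in which each example carries a score and an optional reference label. `LabelMetrics` and `label_metrics` are hypothetical names, and treating a score equal to the threshold as active is also an assumption.

```rust
const THRESHOLD: f32 = 0.35; // diagnostic threshold used for the per-mood table

#[derive(Debug, PartialEq)]
struct LabelMetrics {
    supervised: usize,
    positives: usize,
    precision: f64,
    recall: f64,
    f1: f64,
}

/// Safe division that reports 0.0 when the denominator is zero.
fn ratio(num: usize, den: usize) -> f64 {
    if den == 0 { 0.0 } else { num as f64 / den as f64 }
}

/// `examples` holds (score, reference) pairs for ONE label.
/// A reference of None means the source dataset does not annotate this label.
fn label_metrics(examples: &[(f32, Option<bool>)]) -> Option<LabelMetrics> {
    let (mut tp, mut fp, mut fneg, mut supervised) = (0usize, 0usize, 0usize, 0usize);
    for &(score, gold) in examples {
        let Some(gold) = gold else { continue }; // skip unsupervised examples
        supervised += 1;
        match (score >= THRESHOLD, gold) {
            (true, true) => tp += 1,
            (true, false) => fp += 1,
            (false, true) => fneg += 1,
            (false, false) => {}
        }
    }
    if supervised == 0 {
        return None; // nothing to measure: report no metrics rather than 0/0
    }
    let precision = ratio(tp, tp + fp);
    let recall = ratio(tp, tp + fneg);
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    Some(LabelMetrics { supervised, positives: tp + fneg, precision, recall, f1 })
}

fn main() {
    let tired = [(0.8, None), (0.2, None)]; // no reviewed labels yet
    println!("{:?}", label_metrics(&tired));
}
```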

Microsecond inference benchmark:

| Benchmark | Value |
| --- | --- |
| Build | cargo run --release --bin evaluate |
| Platform | local arm64, rustc 1.95.0 |
| Model load time | 1.356 ms |
| Test examples | 10,032 |
| Repeats | 50 |
| Total classifications | 501,600 |
| Total classification time | 1,778.897 ms |
| Throughput | 281,972.5 texts/s |
| Mean classification latency | 3.546 us/text |

Benchmark numbers measure classification after loading the model and should be treated as local-machine measurements, not portable latency guarantees.
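The throughput and latency figures follow directly from the total classification count and the total classification time. The sketch below shows that arithmetic plus a generic timing loop. `bench` and `throughput_and_latency` are hypothetical helpers, not the evaluate binary's code, and the closure passed in `main` is a stand-in for a real classifier call.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Returns (texts per second, microseconds per text).
fn throughput_and_latency(total: u64, elapsed: Duration) -> (f64, f64) {
    let seconds = elapsed.as_secs_f64();
    (total as f64 / seconds, seconds * 1e6 / total as f64)
}

/// Times `repeats` passes over `texts`; model loading stays outside the timer.
fn bench<F: FnMut(&str)>(texts: &[&str], repeats: u32, mut classify: F) -> (u64, Duration) {
    let start = Instant::now();
    for _ in 0..repeats {
        for text in texts {
            classify(text);
        }
    }
    (texts.len() as u64 * u64::from(repeats), start.elapsed())
}

fn main() {
    // Reproduce the reported figures from the table's raw numbers.
    let (per_second, micros) =
        throughput_and_latency(10_032 * 50, Duration::from_micros(1_778_897));
    println!("{per_second:.1} texts/s, {micros:.3} us/text");

    // Stand-in workload; black_box keeps the optimizer from removing it.
    let texts = ["I missed the deadline again.", "All good here."];
    let (total, elapsed) = bench(&texts, 50, |text| {
        black_box(text.len());
    });
    println!("{total} calls in {elapsed:?}");
}
```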

Comparison

DADOES should not be read as "more accurate than BERT." The useful comparison is operational: when a Rust project needs cheap local text-to-mood inference, DADOES avoids the Python/PyTorch/Transformers deployment stack and runs from a small binary checkpoint.

The table below is a reference comparison, not a leaderboard. DADOES uses a smaller 13-label runtime taxonomy and shows local Rust performance. Most public GoEmotions models publish the original 28-label task, often with different thresholds, hardware, and evaluation scripts.

Accuracy comparison:

| Model | Evaluation Setup | Reported Result | Notes |
| --- | --- | --- | --- |
| DADOES public mixed linear baseline | 13 DADOES labels, public mixed test | micro F1 0.6439 | Local result from cargo run --release --bin train; smaller label space than GoEmotions. |
| GoEmotions BERT baseline | Original GoEmotions taxonomy | average F1 0.46 | Official GoEmotions paper baseline. |
| SamLowe/roberta-base-go_emotions | GoEmotions 28-label test, threshold 0.5 | F1 0.450 | Model-card aggregate metric. |
| tasinhoque/distilbert-go-emotions | GoEmotions evaluation set | F1 0.4702 | Model-card aggregate metric. |
| sangkm/go-emotions-fine-tuned-distilroberta | GoEmotions, threshold 0.5 | micro F1 0.5790 | Model-card metrics include macro F1 0.4502. |
| sangkm/augmented-go-emotions-plus-other-datasets-fine-tuned-distilroberta-v3 | GoEmotions plus augmented data | micro F1 0.6288 | Model-card metrics include macro F1 0.4950. |
| cirimus/modernbert-base-go-emotions | GoEmotions test | micro F1 0.607 | Model-card metrics include macro F1 0.550. |

Microsecond inference performance:

| Model | Runtime / Artifact | Reported Performance | Notes |
| --- | --- | --- | --- |
| DADOES public mixed linear baseline | Rust native checkpoint, 1.7 MB | 281,972.5 texts/s; 3.546 us/text | Local arm64 benchmark from cargo run --release --bin evaluate. |

Adoption Roadmap

  • Publish the crate to crates.io; the manifest no longer disables publishing.
  • Add a Hugging Face Space demo that accepts text and returns DADOES JSON.
  • Run same-test-set baselines for fastText, TF-IDF logistic regression, DistilRoBERTa, ModernBERT, and DADOES.
  • Add DADOES-owned reviewed labels for bored, tired, frustrated, and lonely.
  • Keep the public positioning focused on microsecond-level Rust-native text-to-mood inference, not transformer-level full-taxonomy accuracy.

Train

Prepare the base GoEmotions split files in data/raw/goemotions, then run:

cargo run --release --bin train

Optional external datasets are read from data/raw/external. Public raw-file sources are downloaded by the Rust data-preparation binary and included automatically when their files exist:

cargo run --release --features dataset-download --bin prepare_external_datasets
cargo run --release --bin train

Non-commercial sources such as EmpatheticDialogues and FIG-Loneliness are kept behind an explicit research switch. The trainer can read their exported local CSV/JSONL files, but DADOES does not depend on Python or Hugging Face datasets to produce them:

DADOES_INCLUDE_NON_COMMERCIAL=1 cargo run --release --bin train

The trainer skips missing optional files, but fails on malformed files that are present. Use DADOES_EXTERNAL_DATA_DIR=/path/to/external to override the external-data root.
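The two environment switches can be read with the standard library alone. The sketch below is a guess at reasonable semantics, not the trainer's actual parsing: it assumes that only the exact value `1` enables non-commercial sources and that an empty override falls back to the default root. The helper names are hypothetical.

```rust
use std::env;
use std::path::PathBuf;

/// Default root for optional external datasets, as described above.
const DEFAULT_EXTERNAL_DIR: &str = "data/raw/external";

/// Assumption: an unset or empty override falls back to the default root.
fn external_data_dir(override_dir: Option<String>) -> PathBuf {
    override_dir
        .filter(|value| !value.is_empty())
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from(DEFAULT_EXTERNAL_DIR))
}

/// Assumption: only the exact value "1" opts in to non-commercial sources.
fn include_non_commercial(flag: Option<String>) -> bool {
    flag.as_deref() == Some("1")
}

fn main() {
    let dir = external_data_dir(env::var("DADOES_EXTERNAL_DATA_DIR").ok());
    let research = include_non_commercial(env::var("DADOES_INCLUDE_NON_COMMERCIAL").ok());
    println!("external root: {}", dir.display());
    println!("non-commercial sources enabled: {research}");
}
```

Keeping the parsing in functions that take `Option<String>` makes the defaults easy to check without mutating the process environment.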

To regenerate the evaluation and benchmark tables:

cargo run --release --bin evaluate

The default trainer:

  • uses train.tsv for gradient updates
  • uses dev.tsv for early stopping
  • builds the final evaluation set from test.tsv plus any optional external test splits
  • repeats the local seed examples 4 times to cover DADOES labels that GoEmotions does not annotate, such as lonely, bored, and tired
  • mixes optional public external sources when prepared under data/raw/external
  • only mixes non-commercial sources when DADOES_INCLUDE_NON_COMMERCIAL=1

Current checkpoint summary:

best_epoch=24
epochs_trained=28
validation_micro_f1=0.6341
validation_exact_match=0.2971
test_micro_f1=0.6439
test_exact_match=0.3069

The full report is saved at models/goemotions-linear.report.json.

License

DADOES is licensed under the GNU Affero General Public License v3.0 only.

Dataset Direction

Use a public mixed dataset assembled from GoEmotions and compatible external sources, then add curated DADOES-domain text. Keep the runtime labels smaller than GoEmotions:

  • happy
  • satisfied
  • excited
  • curious
  • anxious
  • frustrated
  • sad
  • angry
  • lonely
  • bored
  • tired
  • hopeful
  • neutral

The high-priority dimensions that GoEmotions does not directly annotate are lonely, bored, and tired. The current public mixed checkpoint adds direct lonely supervision from public loneliness data. The bored and tired labels still have only weak coverage from the small seed set until DADOES-owned reviewed labels are added.

The project data format is documented in data/README.md. The public dataset survey and coverage matrix are documented in data/DATASETS.md.
