DADOES
Do Androids Dream of Electric Sheep.
DADOES is a microsecond-level text-to-mood inference engine for Rust projects.
The crate can be used as a library by other Rust projects. It also includes CLI and training binaries for local experimentation.
DADOES is not positioned as a replacement for transformer GoEmotions models on full-label accuracy. Its advantage is deployment shape: a 1.7 MB checkpoint, no Python, no PyTorch, no GPU, no Transformers runtime, and local Rust inference at 281,972.5 texts/s, or 3.546 us/text, on the current benchmark.
Use it as a cheap first mood signal for agent reports, chat logs, issue/support routing, local log or inbox triage, and high-volume text-stream pre-filtering.
The current implementation is a compact inference baseline:
- multi-label mood taxonomy
- hashed text features
- linear sigmoid classifier
- SGD training on a public mixed English mood dataset assembled from GoEmotions-compatible sources and DADOES seed examples
- validation-loss early stopping
- 1.7 MB binary checkpoint
- binary checkpoint save/load
- JSON CLI output
The classifier core is intentionally small. Dataset import uses structured CSV and JSON parsers at the training boundary, while the public Rust interface stays stable for downstream projects.
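To make the "hashed text features" plus "linear sigmoid classifier" combination concrete, here is a minimal sketch using only the Rust standard library. It is not the DADOES implementation: the bucket count, the whitespace tokenizer, the use of DefaultHasher, and the names hashed_features and LabelModel are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Number of hash buckets. Illustrative value, not the DADOES setting.
const BUCKETS: usize = 1 << 10;

/// Hash each lowercase whitespace-separated token into a bucket and count hits.
/// The real feature extractor may tokenize and hash differently.
fn hashed_features(text: &str) -> Vec<f32> {
    let mut features = vec![0.0_f32; BUCKETS];
    for token in text.to_lowercase().split_whitespace() {
        let mut hasher = DefaultHasher::new();
        token.hash(&mut hasher);
        let bucket = (hasher.finish() % BUCKETS as u64) as usize;
        features[bucket] += 1.0;
    }
    features
}

fn sigmoid(z: f32) -> f32 {
    1.0 / (1.0 + (-z).exp())
}

/// Hypothetical per-label model: one weight per bucket plus a bias.
/// In a multi-label setup each label is scored independently.
struct LabelModel {
    weights: Vec<f32>,
    bias: f32,
}

impl LabelModel {
    fn score(&self, features: &[f32]) -> f32 {
        let dot: f32 = self.weights.iter().zip(features).map(|(w, x)| w * x).sum();
        sigmoid(dot + self.bias)
    }
}

fn main() {
    let features = hashed_features("felt frustrated and exhausted");
    // Untrained (all-zero) weights, so the score is sigmoid(0).
    let untrained = LabelModel { weights: vec![0.0; BUCKETS], bias: 0.0 };
    println!("score={:.3}", untrained.score(&features));
}
```

Because every label is one dot product over a fixed-size vector, inference cost depends on the token count and the label count rather than on vocabulary size, which is the kind of property that keeps per-text latency low.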
Why DADOES
| Property | Current Value |
|---|---|
| Default checkpoint | 1,704,033 bytes |
| Mean classification latency | 3.546 us/text |
| Throughput | 281,972.5 texts/s |
| Runtime dependencies | no Python, PyTorch, GPU, or Transformers runtime |
| Integration shape | Rust library, CLI, embedded default checkpoint |
Library Usage
After the first crates.io release, add DADOES as a normal dependency:
[dependencies]
dadoes = "0.1.0"
For this checkout before release, use a local dependency:
[dependencies]
dadoes = { path = "/Users/ryan/DADOES" }
Use the embedded default checkpoint:
use dadoes::{DadoesClassifier, EmotionClassifier};

fn main() -> Result<(), dadoes::ModelIoError> {
    let classifier = DadoesClassifier::from_default_model()?;
    let analysis = classifier.classify(
        "I missed the deadline again and felt frustrated and exhausted.",
    );
    if let Some(primary) = analysis.primary_mood() {
        println!("primary={} score={:.3}", primary.mood.as_str(), primary.score);
    }
    for score in classifier.active_moods(&analysis) {
        println!("active={} score={:.3}", score.mood.as_str(), score.score);
    }
    Ok(())
}
Load a custom checkpoint when you train one:
let classifier = dadoes::DadoesClassifier::from_path("models/custom.dadoes")?;
Runnable example:
cargo run --release --example library_usage
Run
cargo run --release --bin dadoes -- "I missed the deadline again and felt frustrated and exhausted tonight."
The command prints:
{"primary_mood":"tired","moods":[...]}
The CLI loads models/goemotions-linear.dadoes when it exists. If no local
checkpoint exists, it falls back to the embedded default checkpoint.
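The fallback rule described above can be sketched as a small decision function. This is an illustration only: CheckpointSource and choose_checkpoint are hypothetical names, and the actual CLI may check for the file differently.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical description of where the model bytes come from.
#[derive(Debug, PartialEq)]
enum CheckpointSource {
    /// A checkpoint file found on disk (would map to DadoesClassifier::from_path).
    Local(PathBuf),
    /// The checkpoint compiled into the crate (would map to from_default_model).
    Embedded,
}

/// Prefer the local checkpoint when it exists as a file, else use the embedded one.
fn choose_checkpoint(local: &Path) -> CheckpointSource {
    if local.is_file() {
        CheckpointSource::Local(local.to_path_buf())
    } else {
        CheckpointSource::Embedded
    }
}

fn main() {
    let source = choose_checkpoint(Path::new("models/goemotions-linear.dadoes"));
    println!("{:?}", source);
}
```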
Current Evaluation
The headline metric for DADOES is inference cost. Accuracy is reported here to make the current signal quality inspectable, not to claim state-of-the-art mood understanding.
This is a multi-label classifier, so a single "accuracy" number is less useful
than F1 and exact multi-label match rate. The closest strict accuracy measure is
exact_match, which counts an example as correct only when its predicted label
set equals the reference label set exactly.
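As a rough illustration of how micro-averaged precision, recall, F1, and exact match relate, the sketch below computes them from label bitmasks. The bitmask encoding, the evaluate function, and the three toy examples are assumptions for this example, not the DADOES evaluation code or data.

```rust
/// Micro-averaged metrics plus exact-match rate for multi-label predictions.
struct Metrics {
    precision: f64,
    recall: f64,
    f1: f64,
    exact_match: f64,
}

fn ratio(num: u64, den: u64) -> f64 {
    if den == 0 { 0.0 } else { num as f64 / den as f64 }
}

/// Each example is a bitmask: bit i set means label i is present.
fn evaluate(predicted: &[u32], gold: &[u32]) -> Metrics {
    assert_eq!(predicted.len(), gold.len());
    let (mut tp, mut fp, mut fn_, mut exact) = (0u64, 0u64, 0u64, 0u64);
    for (&p, &g) in predicted.iter().zip(gold) {
        tp += (p & g).count_ones() as u64; // predicted and correct
        fp += (p & !g).count_ones() as u64; // predicted but wrong
        fn_ += (!p & g).count_ones() as u64; // missed labels
        if p == g {
            exact += 1; // whole label set matches
        }
    }
    let precision = ratio(tp, tp + fp);
    let recall = ratio(tp, tp + fn_);
    // Micro F1 pools counts over all labels before dividing.
    let f1 = ratio(2 * tp, 2 * tp + fp + fn_);
    Metrics { precision, recall, f1, exact_match: ratio(exact, gold.len() as u64) }
}

fn main() {
    // Toy data: three examples over three labels.
    let predicted = [0b011, 0b100, 0b001];
    let gold = [0b001, 0b100, 0b110];
    let m = evaluate(&predicted, &gold);
    println!(
        "precision={:.3} recall={:.3} f1={:.3} exact_match={:.3}",
        m.precision, m.recall, m.f1, m.exact_match
    );
}
```

Only the second toy example matches exactly, even though half of all predicted labels are correct. That gap is why exact match sits well below micro F1 for a multi-label classifier.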
Current default checkpoint:
| Split | Examples | Loss | Micro Precision | Micro Recall | Micro F1 | Exact Match |
|---|---|---|---|---|---|---|
| Public mixed validation | 9,423 | 0.3474 | 0.7763 | 0.5359 | 0.6341 | 0.2971 |
| Public mixed test | 10,032 | 0.3445 | 0.7832 | 0.5467 | 0.6439 | 0.3069 |
Training stopped at epoch 28, with the best validation checkpoint selected from epoch 24.
These numbers are for the current linear hashed-feature baseline trained on the
public mixed dataset. GoEmotions is one source in that mix; optional external
sources under data/raw/external and DADOES seed examples are also used. They
should be treated as a baseline, not a production ceiling.
Signal quality is uneven across labels. The current checkpoint is strongest on
happy, satisfied, hopeful, and lonely, with lonely measured on a
small 153-example supervised slice; weaker labels include
frustrated at F1 0.2255, curious at F1 0.3839, angry at F1 0.4187,
sad at F1 0.4328, and excited at F1 0.4362. bored and tired do not yet
have supervised examples in the loaded mixed test, so they require held-out
reviewed data before per-label accuracy can be reported.
Mixed per-mood diagnostic metrics at threshold 0.35:
| Mood | Supervised Examples | Positives | Accuracy | Precision | Recall | F1 | Coverage |
|---|---|---|---|---|---|---|---|
| happy | 9,879 | 3,534 | 0.8476 | 0.9135 | 0.6338 | 0.7484 | loaded mixed test |
| satisfied | 9,879 | 4,087 | 0.8198 | 0.8122 | 0.7343 | 0.7713 | loaded mixed test |
| excited | 9,879 | 1,504 | 0.8712 | 0.6543 | 0.3271 | 0.4362 | loaded mixed test |
| curious | 5,427 | 677 | 0.8841 | 0.5698 | 0.2895 | 0.3839 | loaded mixed test |
| anxious | 9,879 | 934 | 0.9288 | 0.6567 | 0.5182 | 0.5793 | loaded mixed test |
| frustrated | 5,427 | 699 | 0.8734 | 0.5319 | 0.1431 | 0.2255 | loaded mixed test |
| sad | 9,879 | 1,028 | 0.9167 | 0.7423 | 0.3054 | 0.4328 | loaded mixed test |
| angry | 9,879 | 1,063 | 0.9120 | 0.7245 | 0.2944 | 0.4187 | loaded mixed test |
| lonely | 153 | 89 | 0.7190 | 0.8286 | 0.6517 | 0.7296 | loaded mixed test |
| hopeful | 9,879 | 2,943 | 0.8631 | 0.8299 | 0.6799 | 0.7475 | loaded mixed test |
| neutral | 9,879 | 2,736 | 0.7973 | 0.6869 | 0.4931 | 0.5740 | loaded mixed test |
bored and tired are omitted from the table because the loaded mixed test
currently has no supervised examples for them. Those labels should not be
published as 0/0 metrics; they require held-out reviewed examples before a
per-label accuracy can be reported.
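One way to avoid publishing 0/0 metrics is to make the per-label report optional. The sketch below returns None when a label has no supervised examples. The Confusion and LabelReport types and the counts in main are synthetic and purely illustrative; they are not taken from the DADOES evaluator or its test split.

```rust
/// Confusion counts for one label over its supervised examples.
#[derive(Clone, Copy)]
struct Confusion {
    tp: u64,
    fp: u64,
    fn_: u64,
    tn: u64,
}

struct LabelReport {
    accuracy: f64,
    precision: f64,
    recall: f64,
    f1: f64,
}

fn ratio(num: u64, den: u64) -> f64 {
    if den == 0 { 0.0 } else { num as f64 / den as f64 }
}

/// Returns None when the label has no supervised examples, so callers
/// cannot accidentally print a 0/0 row.
fn label_report(c: Confusion) -> Option<LabelReport> {
    let supervised = c.tp + c.fp + c.fn_ + c.tn;
    if supervised == 0 {
        return None;
    }
    Some(LabelReport {
        accuracy: ratio(c.tp + c.tn, supervised),
        precision: ratio(c.tp, c.tp + c.fp),
        recall: ratio(c.tp, c.tp + c.fn_),
        f1: ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn_),
    })
}

fn main() {
    // Synthetic counts, not taken from the tables in this README.
    let rows = [
        ("synthetic_label", Confusion { tp: 6, fp: 2, fn_: 4, tn: 8 }),
        ("unsupervised_label", Confusion { tp: 0, fp: 0, fn_: 0, tn: 0 }),
    ];
    for (name, counts) in rows.iter() {
        match label_report(*counts) {
            Some(r) => println!(
                "{} accuracy={:.4} precision={:.4} recall={:.4} f1={:.4}",
                name, r.accuracy, r.precision, r.recall, r.f1
            ),
            None => println!("{} has no supervised examples; metrics omitted", name),
        }
    }
}
```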
Microsecond inference benchmark:
| Benchmark | Value |
|---|---|
| Build | cargo run --release --bin evaluate |
| Platform | local arm64, rustc 1.95.0 |
| Model load time | 1.356 ms |
| Test examples | 10,032 |
| Repeats | 50 |
| Total classifications | 501,600 |
| Total classification time | 1,778.897 ms |
| Throughput | 281,972.5 texts/s |
| Mean classification latency | 3.546 us/text |
Benchmark numbers measure classification after loading the model and should be treated as local-machine measurements, not portable latency guarantees.
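The throughput and latency rows follow directly from the totals in the table. The short check below reproduces the arithmetic from the reported example count, repeat count, and total time. The function names are illustrative and this is not the evaluate binary.

```rust
/// Classifications per second, given a count and elapsed milliseconds.
fn throughput_per_second(total: u64, elapsed_ms: f64) -> f64 {
    total as f64 / (elapsed_ms / 1000.0)
}

/// Mean microseconds per classification.
fn mean_latency_us(total: u64, elapsed_ms: f64) -> f64 {
    elapsed_ms * 1000.0 / total as f64
}

fn main() {
    // Values copied from the benchmark table.
    let examples: u64 = 10_032;
    let repeats: u64 = 50;
    let elapsed_ms = 1_778.897;

    let total = examples * repeats; // 501,600 classifications
    println!("total={}", total);
    println!("throughput={:.1} texts/s", throughput_per_second(total, elapsed_ms));
    println!("latency={:.3} us/text", mean_latency_us(total, elapsed_ms));
}
```

Since the reported total time is rounded to three decimals, the recomputed throughput agrees with the table only up to rounding.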
Comparison
DADOES should not be read as "more accurate than BERT." The useful comparison is operational: when a Rust project needs cheap local text-to-mood inference, DADOES avoids the Python/PyTorch/Transformers deployment stack and runs from a small binary checkpoint.
The table below is a reference comparison, not a leaderboard. DADOES uses a smaller 13-label runtime taxonomy, and its figures are local Rust measurements. Most public GoEmotions models report results on the original 28-label task, often with different thresholds, hardware, and evaluation scripts, so the F1 values are not directly comparable.
Accuracy comparison:
| Model | Evaluation Setup | Reported Result | Notes |
|---|---|---|---|
| DADOES public mixed linear baseline | 13 DADOES labels, public mixed test | micro F1 0.6439 | Local result from cargo run --release --bin train; smaller label space than GoEmotions. |
| GoEmotions BERT baseline | Original GoEmotions taxonomy | average F1 0.46 | Official GoEmotions paper baseline. |
| SamLowe/roberta-base-go_emotions | GoEmotions 28-label test, threshold 0.5 | F1 0.450 | Model-card aggregate metric. |
| tasinhoque/distilbert-go-emotions | GoEmotions evaluation set | F1 0.4702 | Model-card aggregate metric. |
| sangkm/go-emotions-fine-tuned-distilroberta | GoEmotions, threshold 0.5 | micro F1 0.5790 | Model-card metrics include macro F1 0.4502. |
| sangkm/augmented-go-emotions-plus-other-datasets-fine-tuned-distilroberta-v3 | GoEmotions plus augmented data | micro F1 0.6288 | Model-card metrics include macro F1 0.4950. |
| cirimus/modernbert-base-go-emotions | GoEmotions test | micro F1 0.607 | Model-card metrics include macro F1 0.550. |
Microsecond inference performance:
| Model | Runtime / Artifact | Reported Performance | Notes |
|---|---|---|---|
| DADOES public mixed linear baseline | Rust native checkpoint, 1.7 MB | 281,972.5 texts/s; 3.546 us/text | Local arm64 benchmark from cargo run --release --bin evaluate. |
Adoption Roadmap
- Publish the crate to crates.io; the manifest no longer disables publishing.
- Add a Hugging Face Space demo that accepts text and returns DADOES JSON.
- Run same-test-set baselines for fastText, TF-IDF logistic regression, DistilRoBERTa, ModernBERT, and DADOES.
- Add DADOES-owned reviewed labels for bored, tired, frustrated, and lonely.
- Keep the public positioning focused on microsecond-level Rust-native text-to-mood inference, not transformer-level full-taxonomy accuracy.
Train
Prepare the base GoEmotions split files in data/raw/goemotions, then run:
cargo run --release --bin train
Optional external datasets are read from data/raw/external. Public raw-file
sources are downloaded by the Rust data-preparation binary and included
automatically when their files exist:
cargo run --release --features dataset-download --bin prepare_external_datasets
cargo run --release --bin train
Non-commercial sources such as EmpatheticDialogues and FIG-Loneliness are kept
behind an explicit research switch. The trainer can read their exported local
CSV/JSONL files, but DADOES does not depend on Python or Hugging Face
datasets to produce them:
DADOES_INCLUDE_NON_COMMERCIAL=1 cargo run --release --bin train
The trainer skips missing optional files, but fails on malformed files that are
present. Use DADOES_EXTERNAL_DATA_DIR=/path/to/external to override the
external-data root.
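The two environment switches could be read along these lines. This is a sketch under stated assumptions: the helper names are hypothetical, treating an empty override as unset is a guess, and accepting only the literal value 1 for the non-commercial switch is a guess. The actual trainer may parse these variables differently.

```rust
use std::env;
use std::path::PathBuf;

/// Default external-data root named in this README.
const DEFAULT_EXTERNAL_DIR: &str = "data/raw/external";

/// Resolve the external-data root, honoring an optional override.
/// Assumption: an empty override is treated the same as no override.
fn external_data_dir(override_value: Option<String>) -> PathBuf {
    match override_value {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(DEFAULT_EXTERNAL_DIR),
    }
}

/// Assumption: only the exact value "1" enables non-commercial sources.
fn include_non_commercial(value: Option<String>) -> bool {
    value.as_deref() == Some("1")
}

fn main() {
    let dir = external_data_dir(env::var("DADOES_EXTERNAL_DATA_DIR").ok());
    let research = include_non_commercial(env::var("DADOES_INCLUDE_NON_COMMERCIAL").ok());
    println!("external_dir={} non_commercial={}", dir.display(), research);
}
```

Passing the variable values in as arguments, rather than reading the environment inside the helpers, keeps them easy to test without mutating process state.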
To regenerate the evaluation and benchmark tables:
cargo run --release --bin evaluate
The default trainer:
- uses train.tsv for gradient updates
- uses dev.tsv for early stopping
- starts from test.tsv and mixes optional external test splits before final evaluation
- repeats the local seed examples 4 times to cover DADOES labels that GoEmotions does not annotate, such as lonely, bored, and tired
- mixes optional public external sources when prepared under data/raw/external
- only mixes non-commercial sources when DADOES_INCLUDE_NON_COMMERCIAL=1
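Repeating the seed examples is a simple form of oversampling. A minimal sketch, assuming the copies are appended to the base training examples before shuffling, is shown below. The mix_with_seeds helper and the placeholder strings are hypothetical, and the real trainer may interleave or weight the seeds differently.

```rust
/// Append `repeats` copies of every seed example to the base training set.
fn mix_with_seeds<T: Clone>(base: &[T], seeds: &[T], repeats: usize) -> Vec<T> {
    let mut mixed = Vec::with_capacity(base.len() + seeds.len() * repeats);
    mixed.extend_from_slice(base);
    for _ in 0..repeats {
        mixed.extend_from_slice(seeds);
    }
    mixed
}

fn main() {
    // Placeholder strings standing in for labeled examples.
    let base = ["base example 1", "base example 2"];
    let seeds = ["seed example"];
    let mixed = mix_with_seeds(&base, &seeds, 4);
    println!("{} examples after mixing", mixed.len());
}
```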
Current checkpoint summary:
best_epoch=24
epochs_trained=28
validation_micro_f1=0.6341
validation_exact_match=0.2971
test_micro_f1=0.6439
test_exact_match=0.3069
The full report is saved at models/goemotions-linear.report.json.
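The gap between best_epoch and epochs_trained is what validation-loss early stopping with a patience window would produce. The sketch below is a generic version of that rule, not the DADOES trainer. A patience of 4 is consistent with the summary above but is an assumption, and the loss series in main is synthetic.

```rust
/// Generic early stopping on validation loss.
/// Returns (best_epoch, epochs_trained), both 1-based.
fn early_stop(validation_losses: &[f64], patience: usize) -> (usize, usize) {
    let mut best_epoch = 0;
    let mut best_loss = f64::INFINITY;
    for (index, &loss) in validation_losses.iter().enumerate() {
        let epoch = index + 1;
        if loss < best_loss {
            // A real trainer would snapshot the weights here.
            best_loss = loss;
            best_epoch = epoch;
        } else if epoch - best_epoch >= patience {
            // No improvement for `patience` epochs in a row: stop.
            return (best_epoch, epoch);
        }
    }
    (best_epoch, validation_losses.len())
}

fn main() {
    // Synthetic losses: 24 improving epochs, then a plateau that is worse.
    let mut losses: Vec<f64> = (0..24).map(|i| 1.0 - 0.02 * i as f64).collect();
    losses.extend(std::iter::repeat(0.6).take(10));

    let (best_epoch, epochs_trained) = early_stop(&losses, 4);
    println!("best_epoch={} epochs_trained={}", best_epoch, epochs_trained);
}
```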
License
DADOES is licensed under the GNU Affero General Public License v3.0 only.
Dataset Direction
Use a public mixed dataset assembled from GoEmotions and compatible external sources, then add curated DADOES-domain text. Keep the runtime labels smaller than GoEmotions:
- happy
- satisfied
- excited
- curious
- anxious
- frustrated
- sad
- angry
- lonely
- bored
- tired
- hopeful
- neutral
The missing high-priority dimensions are lonely, bored, and tired.
GoEmotions does not directly annotate them. The current public mixed checkpoint
adds direct lonely supervision from public loneliness data, while bored and
tired still only have weak coverage from the small seed set unless
DADOES-owned reviewed labels are added.
The project data format is documented in data/README.md. The public dataset
survey and coverage matrix are documented in data/DATASETS.md.