DADOES

Do Androids Dream of Electric Sheep?

DADOES is a microsecond-level text-to-mood inference engine for Rust projects.

The crate can be used as a library by other Rust projects. It also includes CLI and training binaries for local experimentation.

DADOES is not positioned as a replacement for transformer GoEmotions models on full-label accuracy. Its advantage is deployment shape: a 1.7 MB checkpoint, no Python, no PyTorch, no GPU, no Transformers runtime, and local Rust inference at 281,972.5 texts/s, or 3.546 us/text, on the current benchmark.

Use it as a cheap first mood signal for agent reports, chat logs, issue/support routing, local log or inbox triage, and high-volume text-stream pre-filtering.

The current implementation is a compact inference baseline:

multi-label mood taxonomy
hashed text features
linear sigmoid classifier
SGD training on a public mixed English mood dataset assembled from GoEmotions-compatible sources and DADOES seed examples
validation-loss early stopping
1.7 MB binary checkpoint
binary checkpoint save/load
JSON CLI output

The classifier core is intentionally small. Dataset import uses structured CSV and JSON parsers at the training boundary, while the public Rust interface stays stable for downstream projects.
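The combination of hashed text features and a linear sigmoid classifier can be illustrated with a minimal sketch. This is not the DADOES implementation: the bucket count, the FNV-1a hash, the whitespace tokenizer, and the names `bucket`, `featurize`, and `LinearLabel` are all assumptions chosen for illustration.

```rust
// Hypothetical sketch of hashed features + one sigmoid per label.
// Bucket count, hash function, and tokenizer are assumptions, not DADOES internals.
const BUCKETS: usize = 1 << 12;

/// FNV-1a hash of a token, reduced to a bucket index.
fn bucket(token: &str) -> usize {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in token.bytes() {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    (hash % BUCKETS as u64) as usize
}

/// Map text to the sorted, de-duplicated set of active bucket indices.
fn featurize(text: &str) -> Vec<usize> {
    let mut indices: Vec<usize> = text
        .split_whitespace()
        .map(|token| bucket(&token.to_lowercase()))
        .collect();
    indices.sort_unstable();
    indices.dedup();
    indices
}

fn sigmoid(z: f32) -> f32 {
    1.0 / (1.0 + (-z).exp())
}

/// One independent linear scorer per mood label (multi-label, not softmax).
struct LinearLabel {
    weights: Vec<f32>,
    bias: f32,
}

impl LinearLabel {
    /// Sum the weights of the active buckets, add the bias, squash to (0, 1).
    fn score(&self, features: &[usize]) -> f32 {
        let z = self.bias + features.iter().map(|&i| self.weights[i]).sum::<f32>();
        sigmoid(z)
    }
}

fn main() {
    // Hand-set weights stand in for trained ones.
    let mut tired = LinearLabel { weights: vec![0.0; BUCKETS], bias: -1.0 };
    tired.weights[bucket("exhausted")] = 3.0;

    let features = featurize("I felt frustrated and exhausted");
    println!("tired score = {:.3}", tired.score(&features));
}
```

Because each label has its own sigmoid, several moods can be active for the same text, which is what makes the taxonomy multi-label.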

Why DADOES

Property	Current Value
Default checkpoint	1,704,033 bytes
Mean classification latency	3.546 us/text
Throughput	281,972.5 texts/s
Runtime dependencies	no Python, PyTorch, GPU, or Transformers runtime
Integration shape	Rust library, CLI, embedded default checkpoint

Library Usage

After the first crates.io release, add DADOES as a normal dependency:

[dependencies]
dadoes = "0.1.0"

For this checkout before release, use a local path dependency, replacing the path with the location of your own checkout:

[dependencies]
dadoes = { path = "/Users/ryan/DADOES" }

Use the embedded default checkpoint:

use dadoes::{DadoesClassifier, EmotionClassifier};

fn main() -> Result<(), dadoes::ModelIoError> {
    let classifier = DadoesClassifier::from_default_model()?;
    let analysis = classifier.classify(
        "I missed the deadline again and felt frustrated and exhausted.",
    );

    if let Some(primary) = analysis.primary_mood() {
        println!("primary={} score={:.3}", primary.mood.as_str(), primary.score);
    }

    for score in classifier.active_moods(&analysis) {
        println!("active={} score={:.3}", score.mood.as_str(), score.score);
    }

    Ok(())
}

Load a custom checkpoint when you train one:

let classifier = dadoes::DadoesClassifier::from_path("models/custom.dadoes")?;

Runnable example:

cargo run --release --example library_usage

Run

cargo run --release --bin dadoes -- "I missed the deadline again and felt frustrated and exhausted tonight."

The command prints JSON of this form (the moods array is abbreviated here):

{"primary_mood":"tired","moods":[...]}

The CLI loads models/goemotions-linear.dadoes when it exists. If no local checkpoint exists, it falls back to the embedded default checkpoint.
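The fallback rule described above can be sketched as a small resolution function. The enum `CheckpointSource` and the function `resolve_checkpoint` are hypothetical names for illustration; the actual CLI may structure this differently.

```rust
use std::path::{Path, PathBuf};

/// Where the model bytes would come from (hypothetical type, not a DADOES API).
#[derive(Debug, PartialEq)]
enum CheckpointSource {
    Local(PathBuf),
    EmbeddedDefault,
}

/// Prefer the local checkpoint file when it exists, otherwise fall back
/// to the checkpoint embedded in the binary.
fn resolve_checkpoint(local: &Path) -> CheckpointSource {
    if local.is_file() {
        CheckpointSource::Local(local.to_path_buf())
    } else {
        CheckpointSource::EmbeddedDefault
    }
}

fn main() {
    let source = resolve_checkpoint(Path::new("models/goemotions-linear.dadoes"));
    println!("{:?}", source);
}
```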

Current Evaluation

The headline metric for DADOES is inference cost. Accuracy is reported here to make the current signal quality inspectable, not to claim state-of-the-art mood understanding.

This is a multi-label classifier, so a single "accuracy" number is less useful than F1 and the exact multi-label match rate. The closest strict accuracy measure is exact_match: the fraction of examples whose predicted label set equals the reference label set exactly.
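To make the two aggregate metrics concrete, here is a minimal sketch that computes micro precision, recall, F1, and exact match over label bitmasks. Encoding label sets as `u16` bitmasks and the names `Summary`, `f1`, and `summarize` are assumptions for illustration, not the evaluation code in this repository.

```rust
/// Aggregate multi-label metrics (hypothetical type for illustration).
struct Summary {
    precision: f64,
    recall: f64,
    f1: f64,
    exact_match: f64,
}

/// Harmonic mean of precision and recall; 0.0 when both are zero.
fn f1(precision: f64, recall: f64) -> f64 {
    if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    }
}

/// Each pair is (predicted labels, reference labels), one bit per label.
/// Micro averaging pools true/false positives across all labels and examples.
fn summarize(pairs: &[(u16, u16)]) -> Summary {
    let (mut tp, mut fp, mut fn_, mut exact) = (0u32, 0u32, 0u32, 0u32);
    for &(pred, gold) in pairs {
        tp += (pred & gold).count_ones();
        fp += (pred & !gold).count_ones();
        fn_ += (!pred & gold).count_ones();
        if pred == gold {
            exact += 1;
        }
    }
    let ratio = |n: u32, d: u32| if d == 0 { 0.0 } else { n as f64 / d as f64 };
    let precision = ratio(tp, tp + fp);
    let recall = ratio(tp, tp + fn_);
    Summary {
        precision,
        recall,
        f1: f1(precision, recall),
        exact_match: ratio(exact, pairs.len() as u32),
    }
}

fn main() {
    // Three toy examples: exact hit, one missed label, one wrong label.
    let s = summarize(&[(0b011, 0b011), (0b001, 0b011), (0b100, 0b010)]);
    println!(
        "precision={:.4} recall={:.4} f1={:.4} exact_match={:.4}",
        s.precision, s.recall, s.f1, s.exact_match
    );
}
```

The toy data shows why exact match is much stricter than micro F1: a prediction that gets one of two labels right still contributes a true positive to F1 but counts as a miss for exact match.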

Current default checkpoint:

Split	Examples	Loss	Micro Precision	Micro Recall	Micro F1	Exact Match
Public mixed validation	9,423	0.3474	0.7763	0.5359	0.6341	0.2971
Public mixed test	10,032	0.3445	0.7832	0.5467	0.6439	0.3069

Training stopped at epoch 28, with the best validation checkpoint selected from epoch 24.
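Validation-loss early stopping of this kind can be sketched as follows. The patience-based rule and the function name `early_stop` are assumptions: a patience of 4 would be consistent with stopping at epoch 28 after a best epoch of 24, but the trainer's actual stopping criterion is not stated here.

```rust
/// Given per-epoch validation losses, return (best_epoch, epochs_trained),
/// both 1-indexed. Training stops once `patience` epochs have passed
/// without a new best validation loss (assumed rule, for illustration).
fn early_stop(val_losses: &[f64], patience: usize) -> (usize, usize) {
    let mut best_epoch = 0;
    let mut best_loss = f64::INFINITY;
    let mut epochs_trained = 0;
    for (i, &loss) in val_losses.iter().enumerate() {
        let epoch = i + 1;
        epochs_trained = epoch;
        if loss < best_loss {
            // New best: this is the checkpoint that would be kept.
            best_loss = loss;
            best_epoch = epoch;
        } else if epoch - best_epoch >= patience {
            break;
        }
    }
    (best_epoch, epochs_trained)
}

fn main() {
    // Toy loss curve: improves until epoch 3, then plateaus.
    let losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.36, 0.38, 0.30];
    let (best, trained) = early_stop(&losses, 4);
    println!("best_epoch={} epochs_trained={}", best, trained);
}
```

Note that the final 0.30 in the toy curve is never seen: stopping trades a possible later improvement for a bounded training budget.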

These numbers are for the current linear hashed-feature baseline trained on the public mixed dataset. GoEmotions is one source in that mix; optional external sources under data/raw/external and DADOES seed examples are also used. Treat the numbers as a baseline, not a production ceiling.

Signal quality is uneven across labels. The current checkpoint is strongest on happy, satisfied, hopeful, and lonely, although lonely is measured on a small 153-example supervised slice. Weaker labels include frustrated (F1 0.2255), curious (F1 0.3839), angry (F1 0.4187), sad (F1 0.4328), and excited (F1 0.4362). The bored and tired labels do not yet have supervised examples in the loaded mixed test split, so they require held-out reviewed data before per-label accuracy can be reported.

Mixed per-mood diagnostic metrics at threshold 0.35:

Mood	Supervised Examples	Positives	Accuracy	Precision	Recall	F1	Coverage
happy	9,879	3,534	0.8476	0.9135	0.6338	0.7484	loaded mixed test
satisfied	9,879	4,087	0.8198	0.8122	0.7343	0.7713	loaded mixed test
excited	9,879	1,504	0.8712	0.6543	0.3271	0.4362	loaded mixed test
curious	5,427	677	0.8841	0.5698	0.2895	0.3839	loaded mixed test
anxious	9,879	934	0.9288	0.6567	0.5182	0.5793	loaded mixed test
frustrated	5,427	699	0.8734	0.5319	0.1431	0.2255	loaded mixed test
sad	9,879	1,028	0.9167	0.7423	0.3054	0.4328	loaded mixed test
angry	9,879	1,063	0.9120	0.7245	0.2944	0.4187	loaded mixed test
lonely	153	89	0.7190	0.8286	0.6517	0.7296	loaded mixed test
hopeful	9,879	2,943	0.8631	0.8299	0.6799	0.7475	loaded mixed test
neutral	9,879	2,736	0.7973	0.6869	0.4931	0.5740	loaded mixed test

The loaded mixed test split currently has no supervised examples for bored or tired, so those labels are omitted from the table rather than published as 0/0 metrics.
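The per-mood columns above, and the rule of not reporting labels without supervision, can be sketched like this. The row encoding (a score plus an optional reference label), the `>=` comparison at the threshold, and the names `LabelMetrics` and `label_metrics` are assumptions for illustration.

```rust
/// Per-label diagnostic metrics (hypothetical type for illustration).
#[derive(Debug)]
struct LabelMetrics {
    supervised: usize,
    positives: usize,
    accuracy: f64,
    precision: f64,
    recall: f64,
    f1: f64,
}

/// Each row is (model score, reference label); `None` means the example
/// carries no supervision for this label and is skipped.
/// Returns `None` instead of 0/0 metrics when no row is supervised.
fn label_metrics(rows: &[(f32, Option<bool>)], threshold: f32) -> Option<LabelMetrics> {
    let (mut tp, mut fp, mut fn_, mut tn) = (0usize, 0usize, 0usize, 0usize);
    for &(score, gold) in rows {
        if let Some(gold) = gold {
            match (score >= threshold, gold) {
                (true, true) => tp += 1,
                (true, false) => fp += 1,
                (false, true) => fn_ += 1,
                (false, false) => tn += 1,
            }
        }
    }
    let supervised = tp + fp + fn_ + tn;
    if supervised == 0 {
        return None;
    }
    let ratio = |n: usize, d: usize| if d == 0 { 0.0 } else { n as f64 / d as f64 };
    let precision = ratio(tp, tp + fp);
    let recall = ratio(tp, tp + fn_);
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    Some(LabelMetrics {
        supervised,
        positives: tp + fn_,
        accuracy: ratio(tp + tn, supervised),
        precision,
        recall,
        f1,
    })
}

fn main() {
    let rows = [
        (0.90, Some(true)),
        (0.40, Some(false)),
        (0.20, Some(true)),
        (0.10, Some(false)),
        (0.80, None), // unsupervised for this label, ignored
    ];
    println!("{:?}", label_metrics(&rows, 0.35));
    println!("{:?}", label_metrics(&[(0.80, None)], 0.35));
}
```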

Microsecond inference benchmark:

Benchmark	Value
Build	`cargo run --release --bin evaluate`
Platform	local `arm64`, `rustc 1.95.0`
Model load time	1.356 ms
Test examples	10,032
Repeats	50
Total classifications	501,600
Total classification time	1,778.897 ms
Throughput	281,972.5 texts/s
Mean classification latency	3.546 us/text

Benchmark numbers measure classification after loading the model and should be treated as local-machine measurements, not portable latency guarantees.
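The throughput and latency rows follow directly from the total classification count and total time in the table. A minimal sketch of that arithmetic, with hypothetical helper names `throughput_per_sec` and `mean_latency_us`:

```rust
/// Classifications per second from a total count and elapsed milliseconds.
fn throughput_per_sec(total: u64, elapsed_ms: f64) -> f64 {
    total as f64 / (elapsed_ms / 1000.0)
}

/// Mean microseconds per classification.
fn mean_latency_us(total: u64, elapsed_ms: f64) -> f64 {
    elapsed_ms * 1000.0 / total as f64
}

fn main() {
    // Values taken from the benchmark table above.
    let total: u64 = 10_032 * 50; // test examples x repeats
    let elapsed_ms = 1_778.897;

    println!("total classifications = {}", total);
    println!("throughput = {:.1} texts/s", throughput_per_sec(total, elapsed_ms));
    println!("mean latency = {:.3} us/text", mean_latency_us(total, elapsed_ms));
}
```

Model load time (1.356 ms) is excluded from both figures, matching the note that the benchmark measures classification after loading.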

Comparison

DADOES should not be read as "more accurate than BERT." The useful comparison is operational: when a Rust project needs cheap local text-to-mood inference, DADOES avoids the Python/PyTorch/Transformers deployment stack and runs from a small binary checkpoint.

The table below is a reference comparison, not a leaderboard. DADOES uses a smaller 13-label runtime taxonomy and reports local Rust performance. Most public GoEmotions models report results on the original 28-label task, often with different thresholds, hardware, and evaluation scripts, so the rows are not directly comparable.

Accuracy comparison:

Model	Evaluation Setup	Reported Result	Notes
DADOES public mixed linear baseline	13 DADOES labels, public mixed test	micro F1 0.6439	Local result from `cargo run --release --bin train`; smaller label space than GoEmotions.
GoEmotions BERT baseline	Original GoEmotions taxonomy	average F1 0.46	Official GoEmotions paper baseline.
SamLowe/roberta-base-go_emotions	GoEmotions 28-label test, threshold 0.5	F1 0.450	Model-card aggregate metric.
tasinhoque/distilbert-go-emotions	GoEmotions evaluation set	F1 0.4702	Model-card aggregate metric.
sangkm/go-emotions-fine-tuned-distilroberta	GoEmotions, threshold 0.5	micro F1 0.5790	Model-card metrics include macro F1 0.4502.
sangkm/augmented-go-emotions-plus-other-datasets-fine-tuned-distilroberta-v3	GoEmotions plus augmented data	micro F1 0.6288	Model-card metrics include macro F1 0.4950.
cirimus/modernbert-base-go-emotions	GoEmotions test	micro F1 0.607	Model-card metrics include macro F1 0.550.

Microsecond inference performance:

Model	Runtime / Artifact	Reported Performance	Notes
DADOES public mixed linear baseline	Rust native checkpoint, 1.7 MB	281,972.5 texts/s; 3.546 us/text	Local `arm64` benchmark from `cargo run --release --bin evaluate`.

Adoption Roadmap

Publish the crate to crates.io; the manifest no longer disables publishing.
Add a Hugging Face Space demo that accepts text and returns DADOES JSON.
Run same-test-set baselines for fastText, TF-IDF logistic regression, DistilRoBERTa, ModernBERT, and DADOES.
Add DADOES-owned reviewed labels for bored, tired, frustrated, and lonely.
Keep the public positioning focused on microsecond-level Rust-native text-to-mood inference, not transformer-level full-taxonomy accuracy.

Train

Prepare the base GoEmotions split files in data/raw/goemotions, then run:

cargo run --release --bin train

Optional external datasets are read from data/raw/external. Public raw-file sources are downloaded by the Rust data-preparation binary and included automatically when their files exist:

cargo run --release --features dataset-download --bin prepare_external_datasets
cargo run --release --bin train

Non-commercial sources such as EmpatheticDialogues and FIG-Loneliness are kept behind an explicit research switch. The trainer can read their exported local CSV/JSONL files, but DADOES does not depend on Python or Hugging Face datasets to produce them:

DADOES_INCLUDE_NON_COMMERCIAL=1 cargo run --release --bin train

The trainer skips missing optional files, but fails on malformed files that are present. Use DADOES_EXTERNAL_DATA_DIR=/path/to/external to override the external-data root.
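The two environment switches can be sketched with the standard library alone. The helper names, the rule that an empty override is treated as unset, and the rule that only the exact value `1` enables non-commercial sources are assumptions for illustration; the trainer's real parsing may differ.

```rust
use std::env;
use std::path::PathBuf;

/// Default external-data root, as documented above.
const DEFAULT_EXTERNAL_DIR: &str = "data/raw/external";

/// Resolve the external-data root from an optional override value.
/// Assumption: an empty override is treated the same as an unset one.
fn external_data_dir(override_dir: Option<&str>) -> PathBuf {
    match override_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(DEFAULT_EXTERNAL_DIR),
    }
}

/// Assumption: only the exact value "1" opts in to non-commercial sources.
fn include_non_commercial(flag: Option<&str>) -> bool {
    flag == Some("1")
}

fn main() {
    let dir_var = env::var("DADOES_EXTERNAL_DATA_DIR").ok();
    let flag_var = env::var("DADOES_INCLUDE_NON_COMMERCIAL").ok();

    println!(
        "external root = {}",
        external_data_dir(dir_var.as_deref()).display()
    );
    println!(
        "non-commercial sources = {}",
        include_non_commercial(flag_var.as_deref())
    );
}
```

Taking the raw values as `Option<&str>` keeps the decision logic separate from the process environment, which makes it easy to check without setting variables.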

To regenerate the evaluation and benchmark tables:

cargo run --release --bin evaluate

The default trainer:

uses train.tsv for gradient updates
uses dev.tsv for early stopping
runs the final evaluation on test.tsv mixed with any optional external test splits
repeats the local seed examples 4 times to cover DADOES labels that GoEmotions does not annotate, such as lonely, bored, and tired
mixes optional public external sources when prepared under data/raw/external
only mixes non-commercial sources when DADOES_INCLUDE_NON_COMMERCIAL=1

Current checkpoint summary:

best_epoch=24
epochs_trained=28
validation_micro_f1=0.6341
validation_exact_match=0.2971
test_micro_f1=0.6439
test_exact_match=0.3069

The full report is saved at models/goemotions-linear.report.json.

License

DADOES is licensed under the GNU Affero General Public License v3.0 only.

Dataset Direction

Use a public mixed dataset assembled from GoEmotions and compatible external sources, then add curated DADOES-domain text. Keep the runtime labels smaller than GoEmotions:

happy
satisfied
excited
curious
anxious
frustrated
sad
angry
lonely
bored
tired
hopeful
neutral

The high-priority dimensions that GoEmotions does not directly annotate are lonely, bored, and tired. The current public mixed checkpoint adds direct lonely supervision from public loneliness data, while bored and tired still have only weak coverage from the small seed set unless DADOES-owned reviewed labels are added.

The project data format is documented in data/README.md. The public dataset survey and coverage matrix are documented in data/DATASETS.md.
