Meet Hito

Hito 1.7B

Brain, Heart, and a Really Good Memory


EXPERIMENTAL MODEL - PROOF OF CONCEPT

This 1.7B model was fine-tuned on just ~300 examples generated by Hito-Genius (our flagship model). It's an experiment in knowledge distillation: can a tiny model learn to think like a bigger one?

Don't expect production quality. This is proof that the cognitive architecture transfers, not a production release.

For the real deal, use our API at platform.hitonet.com.


🧪 The Experiment

Question: Can we teach a 1.7B model to think like our flagship Hito-Genius?

Method: Generate ~300 high-quality reasoning examples from Hito-Genius, fine-tune a small model on them.

Result: It actually works. Kind of. The cognitive patterns transfer, even with minimal data.

| What This Proves | What This Doesn't Prove |
|---|---|
| Cognitive architecture can be distilled | That 300 examples is enough |
| Small models can learn structured thinking | That this is production-ready |
| Tree-reasoning transfers from teacher | That it matches Hito-Genius quality |
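
The data-generation half of the method is easy to sketch. The snippet below is illustrative only: the prompts, file name, and the assumption that the Hitonet API is OpenAI-compatible (as the curl example further down suggests) are ours, not part of the actual Hitonet pipeline; the card only states that ~300 examples were generated by Hito-Genius.

import json
from openai import OpenAI

# Illustrative sketch of the teacher-side data generation. Prompts and the
# output file name are hypothetical; only the "teacher generates, student is
# fine-tuned" idea comes from this card.
client = OpenAI(base_url="https://hitonet.com/v1", api_key="YOUR_API_KEY")

prompts = [
    "A bat and a ball cost $1.10 together. The bat costs $1.00 more than the ball. How much does the ball cost?",
    "How many times does the letter r appear in 'strawberry'?",
]

with open("hito_genius_examples.jsonl", "w") as f:
    for prompt in prompts:
        reply = client.chat.completions.create(
            model="hito-genius",
            messages=[{"role": "user", "content": prompt}],
        )
        record = {"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": reply.choices[0].message.content},
        ]}
        f.write(json.dumps(record) + "\n")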

📈 Benchmark Results (December 2025)

We tested Hito 1.7B against leading small models on counting, math, and self-awareness tasks.

Size vs Performance

Summary Results

| Model | Params | Accuracy | Counting | Math |
|---|---|---|---|---|
| GPT-5-mini | ~8B | 100% | 100% | 100% |
| Claude Haiku 4.5 | ~8B | 90% | 67% | 100% |
| Hito 1.7B | 1.7B | 80% | 67% | 100% |
| GPT-4o-mini | ~8B | 80% | 33% | 100% |
| Claude 3.5 Haiku | ~8B | 70% | 33% | 100% |
| Qwen3 1.7B base | 1.7B | 17% | 0% | 17% |

The Bat and Ball Test (Cognitive Bias)

"A bat and a ball cost $1.10 together. The bat costs $1.00 more than the ball. How much does the ball cost?"

Most AI (and humans) answer 10 cents. That's wrong.

| Model | Answer | Correct |
|---|---|---|
| Hito 1.7B | $0.05 | ✅ |
| Qwen3 1.7B (base) | $0.10 | ❌ |
| GPT-4o-mini | $0.10 | ❌ |

Why? The <doubt> Tag in Action

<think>
<understand>Ball + Bat = $1.10, Bat = Ball + $1.00</understand>
<doubt>Intuition says 10 cents... but let me verify.</doubt>
<logic>
If ball = $0.10, bat = $1.10, total = $1.20. WRONG.
Let ball = x: x + (x + 1) = 1.10, 2x = 0.10, x = 0.05
</logic>
<verify>Ball $0.05 + Bat $1.05 = $1.10 ✓</verify>
</think>
The ball costs five cents.

The cognitive training teaches the model to doubt intuition and verify algebraically.
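
Because the thinking is emitted as plain tags, it is easy to separate from the visible answer. The helper below is a minimal sketch, not an official Hitonet utility; only the tag names are taken from this card.

import re

# Minimal sketch: split a Hito-style response into its <think> block, the
# visible answer, and the cognitive tags it used.
def split_hito_response(text):
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    thinking = match.group(1).strip() if match else ""
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    tags = re.findall(r"<(understand|doubt|logic|verify|honest|limits|confidence)>", thinking)
    return thinking, answer, tags

sample = "<think><doubt>Intuition says 10 cents...</doubt><verify>$0.05 + $1.05 = $1.10</verify></think>The ball costs five cents."
thinking, answer, tags = split_hito_response(sample)
print(answer)  # The ball costs five cents.
print(tags)    # ['doubt', 'verify']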


📚 Prior Art & Independent Development

Statement on Independent Development

The Nested Cognitive Reasoning (NCR) architecture used in Hito was developed independently, without knowledge of or inspiration from the works cited below. The author discovered these related approaches only after completing the development of NCR, during the literature review phase. We include these citations to properly situate our work within the broader research landscape and to acknowledge concurrent or prior explorations of related ideas, but emphasize that NCR was conceived and implemented without reference to these methods.

Here's what came before us (discovered after our development was complete):

| Research | What They Did | How Hito Differs |
|---|---|---|
| Chain-of-Thought (Wei et al., 2022) | Prompting with "Let's think step by step" | We TRAIN the model to think, not just prompt |
| OpenAI o1/o3 (2024-2025) | Hidden thinking tokens | Our thinking is TRANSPARENT and OPEN |
| Reflexion (Shinn et al., 2023) | Agents reflecting on mistakes | Self-reflection is IN the weights, not external |
| Tree of Thoughts (Yao et al., 2023) | Branching paths via search | Our branching is LEARNED, not algorithmic |
| Emotional AI (WASABI, BELBIC) | Emotion classification/simulation | We simulate emotional CONTEXT in responses |

What Makes Hito Different?

  1. Combined Approach: Cognitive + emotional + self-doubt in ONE framework
  2. Tiny Model: 1.7B params, not 100B+
  3. Open Weights: Run locally, see how it thinks
  4. Trained, Not Prompted: Behavior is in the weights
  5. Humble by Design: Says "I might be wrong" when uncertain
  6. Independent Innovation: Developed without reference to prior methods

We stand on the shoulders of giants, but we built our ladder independently. Our contribution is making these techniques accessible in a small, open model.


📊 Training Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-1.7B |
| Training Examples | ~300 |
| Data Source | Generated by Hito-Genius |
| Method | Supervised Fine-Tuning (SFT) |
| Purpose | Proof of Concept |

Yes, only 300 examples. We wanted to see how far we could push minimal data with high-quality synthetic examples.
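
For a rough idea of what a run like this looks like, here is a sketch using TRL's SFTTrainer. The hyperparameters and data file name are illustrative, not Hitonet's actual recipe.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Rough sketch of the SFT setup described above; values are illustrative.
dataset = load_dataset("json", data_files="hito_genius_examples.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B",           # base model from the table above
    train_dataset=dataset,             # ~300 teacher-generated chat examples
    args=SFTConfig(
        output_dir="hito-1.7b-sft",
        num_train_epochs=3,
        per_device_train_batch_size=2,
        learning_rate=2e-5,
    ),
)
trainer.train()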


🎯 The Problem We're Solving

Most AI models are confidently wrong. They hallucinate, make up facts, and never question themselves.

We're fixing this by teaching AI to understand its own limitations.


🔍 Hito Knows Its Weaknesses

| Limitation | Why It Happens | How Hito Handles It |
|---|---|---|
| Can't count reliably | "I process tokens, not characters." | Numbers each item, counts backwards to verify |
| Math errors | "I don't have a calculator." | Writes out every step instead of mental math |
| Hallucination | "I can make up false information." | Uses <doubt> and <verify> tags |
| Overconfidence | "I can sound sure when wrong." | <confidence> tag rates certainty |
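
For example, the counting mitigation tends to look like this in the tag format (illustrative output in the card's own style, not an actual transcript):

<think>
<understand>Count the letter r in "strawberry".</understand>
<logic>s(1) t(2) r(3)→r#1 a(4) w(5) b(6) e(7) r(8)→r#2 r(9)→r#3 y(10)</logic>
<verify>Backwards: y, r#3, r#2, r#1... still three r's.</verify>
</think>
There are 3 r's in "strawberry".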

Example: Self-Correcting Math

<logic>
  15% of 200 = 15 × 200 = 3000
  <doubt>Wait... that's way too high for a percentage.</doubt>
</logic>

<honest>I multiplied instead of calculating percentage.</honest>

<verify>
  15% = 0.15
  0.15 × 200 = 30 ✓
</verify>

🧠 Cognitive Architecture

Distilled from Hito-Genius into this tiny model.


Four Cognitive States

| State | Focus |
|---|---|
| Analytical | Logic, accuracy |
| Creative | Imagination, exploration |
| Empathetic | Feelings, perspectives |
| Reflective | Depth, meaning |

🌳 Tree-Structured Reasoning

Not linear chain-of-thought. Tags nest, branch, and recurse.
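
To see the nesting concretely, here is a small sketch (assumed format, not an official parser) that reads the tags into a tree so nested branches stay nested:

import re

# Sketch only: turn Hito's XML-ish cognitive tags into a tree of nodes.
# The tag format is taken from the examples in this card.
TOKEN = re.compile(r"</?([a-z]+)>|([^<]+)", re.DOTALL)

def parse_tree(text):
    root = {"tag": "root", "text": "", "children": []}
    stack = [root]
    for m in TOKEN.finditer(text):
        tag, chunk = m.group(1), m.group(2)
        if chunk is not None:
            stack[-1]["text"] += chunk            # free text inside the current tag
        elif m.group(0).startswith("</"):
            if len(stack) > 1:
                stack.pop()                       # close the current branch
        else:
            node = {"tag": tag, "text": "", "children": []}
            stack[-1]["children"].append(node)    # open a nested branch
            stack.append(node)
    return root

tree = parse_tree("<think><logic>15% of 200<doubt>too high?</doubt></logic></think>")
print(tree["children"][0]["children"][0]["children"][0]["tag"])  # doubt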


🎨 Creative Flow

(Creative flow diagram)

🛡️ The Humble Tags

| Tag | Purpose |
|---|---|
| <doubt> | Question assumptions |
| <honest> | Admit errors |
| <limits> | Acknowledge gaps |
| <confidence> | Rate certainty |
| <verify> | Double-check work |

📦 Available Files

This Repository (Safetensors)

| File | Description | Size |
|---|---|---|
| model.safetensors | HuggingFace Transformers format | 3.4 GB |

Use this for Python/Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("hitonet/hito-1.7b")
tokenizer = AutoTokenizer.from_pretrained("hitonet/hito-1.7b")

GGUF Quantizations (Separate Repository)

For Ollama, LM Studio, llama.cpp, and other local inference:

GGUF Repository

13 quantization options available (Q2_K to F16, 742 MB to 3.3 GB)

| Recommended | Size | Use Case |
|---|---|---|
| Q4_K_M | 1.1 GB | Best balance of size and quality |
| Q8_0 | 1.8 GB | Highest quality quantization |
| F16 | 3.3 GB | Full precision |

View all quantizations →


⚡ Quick Start

Python (Transformers)

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("hitonet/hito-1.7b")
tokenizer = AutoTokenizer.from_pretrained("hitonet/hito-1.7b")

messages = [{"role": "user", "content": "A bat and a ball cost $1.10 together. The bat costs $1.00 more than the ball. How much does the ball cost?"}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))

Ollama (GGUF)

Get GGUF files from hitonet/hito-1.7b-GGUF:

wget https://huggingface.co/hitonet/hito-1.7b-GGUF/resolve/main/hito-1.7b-Q4_K_M.gguf

cat > Modelfile << 'EOF'
FROM hito-1.7b-Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER stop "<|im_end|>"
EOF

ollama create hito -f Modelfile
ollama run hito

API (The Real Hito-Genius)

curl https://hitonet.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "hito-genius", "messages": [{"role": "user", "content": "Hello!"}]}'

Try the real thing at platform.hitonet.com — $1 free credit!


🔮 What's Coming

This 1.7B experiment proves the concept. Our foundational model is in development:

  • Full cognitive architecture at scale
  • Thousands of training examples
  • Production-ready reliability
  • The next evolution of Hito

This is just the beginning.


📄 Research Paper

For the full technical details, methodology, and formal analysis, see our research paper:

Nested Cognitive Reasoning: A Tree-Structured Approach to Language Model Thinking

Hitonet Research (2025).


⚖️ Licensing

| Component | License | Commercial Use |
|---|---|---|
| Model Weights | Apache 2.0 | ✅ Free to use |
| NCR Method/Architecture | CC BY-NC-ND | ❌ Requires paid license |

Commercial Licensing Required

The model weights are open source (Apache 2.0); use them freely.

The Nested Cognitive Reasoning methodology (the cognitive tags, tree-structured thinking, humble tags system) is protected under CC BY-NC-ND.

Commercial use of the NCR method requires a license.

Contact: legal@hitonet.com


Made with genuine curiosity by Hitonet

Trained on 300 examples. Learned to doubt itself. That's pretty cool.

By: Hitonet Research
