---
title: ThingsAI
type: org
tags:
- slm
- llm
- pytorch
- amdl
- math
- code
---


# Welcome to ThingsAI

Building highly efficient, logic-driven Small Language Models that run anywhere.

## Our Models

* **Quark-135M**
  A lightweight bilingual (Italian + English) language model with 135M parameters. Features GQA, SwiGLU, RMSNorm, and RoPE. Trained on 50B+ tokens of curated data.
* **Quark-270M**
  Our scaled-up small model with 270M parameters, 32 layers, a hidden size of 768, and a 65K vocabulary. Designed for extended bilingual capabilities.
* **Quark-Math-Code (~36M)**
  Our ultra-compact, deep-thin architecture (~36M parameters, 14 layers, 65K vocabulary) engineered specifically for STEM, coding, and mathematical reasoning. Currently pre-training toward a 5B-token target on a mix of hardened Chain-of-Thought (CoT) data, OpenWebMath, and pure code.
* **Quark-Mod**
  A multi-label moderation model covering 9 categories for safe AI deployment: toxic, severe_toxic, obscene, threat, insult, identity_hate, cyberbullying, hate_speech, offensive.
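
As a rough illustration of how multi-label output differs from single-label classification, the sketch below applies an independent sigmoid to each of the nine category logits and keeps every label whose score clears a threshold. The logits, the 0.5 threshold, and the `decode_labels` helper are assumptions made for illustration, not Quark-Mod's actual interface.

```python
import math

# The nine categories listed for Quark-Mod.
LABELS = [
    "toxic", "severe_toxic", "obscene", "threat", "insult",
    "identity_hate", "cyberbullying", "hate_speech", "offensive",
]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_labels(logits, threshold=0.5):
    """Return every label whose independent sigmoid score meets the threshold."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    scores = {label: sigmoid(z) for label, z in zip(LABELS, logits)}
    return [label for label, score in scores.items() if score >= threshold]


# Hypothetical logits for one input; real values would come from the model.
example_logits = [2.1, -3.0, -1.2, -4.5, 1.3, -2.8, -0.7, -3.3, 0.4]
flagged = decode_labels(example_logits)
print(flagged)  # ['toxic', 'insult', 'offensive']
```

Because each label is scored independently, one input can trigger several categories at once, and the threshold can be tuned per deployment.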

## What We Focus On

* **Hyper-Efficient Architectures:** Mastering the sub-1B parameter space using Grouped-Query Attention (GQA) and deep-thin layer scaling.
* **Embedded Chain-of-Thought (CoT):** Embedding step-by-step reasoning traces directly in the pre-training data of tiny models, aiming for stronger results on logic benchmarks than their size would suggest.
* **Bilingual & Specialty Data:** Multi-source streaming pipelines fusing Italian, English, high-density mathematics, and code.
* **Open-Source & Real-World Deployable:** Everything from weights to datasets is open. Designed for high throughput on consumer GPUs and edge hardware.
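
To make the GQA point concrete, the sketch below shows how query heads share key/value heads and how that shrinks the key/value cache. The head counts, head dimension, sequence length, and 2-byte values are hypothetical; only the 32-layer depth is borrowed from the Quark-270M description, and none of this should be read as a published Quark configuration.

```python
def kv_head_for(query_head: int, n_query_heads: int, n_kv_heads: int) -> int:
    """Map a query head to the key/value head it shares under GQA."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be divisible by n_kv_heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size of the key/value cache for one sequence (keys plus values)."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration: 12 query heads sharing 4 key/value heads.
N_LAYERS, N_QUERY_HEADS, N_KV_HEADS, HEAD_DIM, SEQ_LEN = 32, 12, 4, 64, 2048

mapping = [kv_head_for(h, N_QUERY_HEADS, N_KV_HEADS) for h in range(N_QUERY_HEADS)]
print(mapping)  # [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]

# Standard multi-head attention keeps one key/value head per query head.
mha = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
gqa = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, SEQ_LEN)
print(mha // 2**20, gqa // 2**20)  # 192 64 (MiB)
```

Under these assumed numbers the cache shrinks by the ratio of query heads to key/value heads (3x here), which is the kind of saving that matters on memory-constrained edge hardware.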

## Resources

* **Quark-135M-Bilingual:** Our flagship general-purpose bilingual model.
* **Quark-Mod:** Multi-label content moderation for production pipelines.
* **HuggingFace Community:** All our released models, tokenizers, and custom datasets.
* **GitHub Open Source:** Training scripts, custom multi-source streaming iterators, and deployment tools.
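
A multi-source streaming iterator of the kind mentioned above can be approximated by sampling from several iterators according to fixed weights. The following is a minimal sketch under stated assumptions: the `mix_streams` function, the source names, the sample data, and the weights are hypothetical and do not reflect the actual ThingsAI training scripts.

```python
import random
from typing import Dict, Iterable, Iterator, Tuple


def mix_streams(
    sources: Dict[str, Iterable[str]],
    weights: Dict[str, float],
    seed: int = 0,
) -> Iterator[Tuple[str, str]]:
    """Yield (source_name, sample) pairs, picking a source by weight each step.

    Exhausted sources are dropped; iteration ends when all are exhausted.
    """
    rng = random.Random(seed)
    active = {name: iter(stream) for name, stream in sources.items()}
    while active:
        names = list(active)
        name = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
        try:
            sample = next(active[name])
        except StopIteration:
            del active[name]  # this source has run dry
            continue
        yield name, sample


# Tiny in-memory stand-ins for streamed corpora (hypothetical data).
sources = {
    "italian": ["ciao mondo", "buongiorno"],
    "english": ["hello world", "good morning"],
    "math": ["2 + 2 = 4"],
    "code": ["print('hi')"],
}
weights = {"italian": 0.3, "english": 0.3, "math": 0.2, "code": 0.2}

mixed = list(mix_streams(sources, weights))
for name, sample in mixed:
    print(f"{name}: {sample}")
```

Each source is consumed lazily and in its own order, so the same pattern works with generators that read from disk or the network instead of in-memory lists.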