---
license: apache-2.0
base_model: Dream-org/Dream-v0-Instruct-7B
tags:
- diffusion
- reasoning
- reversethought
- dream
datasets:
- ianncity/KIMI-K2.5-1000000x
pipeline_tag: text-generation
---

# Bridge-7b-Diffusion

A fine-tuned [DREAM 7B](https://huggingface.co/Dream-org/Dream-v0-Instruct-7B) masked diffusion language model trained with the **ReverseThought** objective.

## What is ReverseThought?

Given a question and its answer, the model learns to produce the step-by-step reasoning chain that bridges the question to the answer. This trains the model to generate coherent chain-of-thought reasoning via DREAM's masked diffusion process.

- **Input**: Question + Answer
- **Output**: Detailed reasoning trace connecting them

## Training Details

- **Base model**: Dream-org/Dream-v0-Instruct-7B
- **Training data**: 75,000 examples from [KIMI-K2.5-1000000x](https://huggingface.co/datasets/ianncity/KIMI-K2.5-1000000x) (General-Distillation subset)
- **Objective**: DREAM masked diffusion with CART time reweighting
- **Hardware**: 8x NVIDIA H100 80GB
- **Epochs**: 3
- **Batch size**: 128
- **Learning rate**: 2e-6 (cosine schedule)
- **Max sequence length**: 2048 tokens
- **Precision**: bf16 mixed precision (FSDP)

## Usage

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("WilhelmH/Bridge-7b-Diffusion", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("WilhelmH/Bridge-7b-Diffusion", trust_remote_code=True)
```

## Architecture

This is a **masked diffusion language model** (not autoregressive). It uses bidirectional attention and generates text by iteratively denoising masked tokens. See the [DREAM paper](https://arxiv.org/abs/2508.15487) for details.
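The ReverseThought input/output layout described above can be sketched as a small formatting helper. Note that the field names and template below are a hypothetical illustration of "condition on question + answer, train on the reasoning trace", not the exact prompt format used to train Bridge-7b-Diffusion:

```python
def format_reversethought(question: str, answer: str, reasoning: str = ""):
    """Build a (prompt, target) pair for a ReverseThought-style objective.

    The model is conditioned on both the question AND the final answer,
    and learns to fill in the reasoning chain connecting them.
    NOTE: this layout is illustrative, not the card's actual template.
    """
    prompt = (
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reasoning:"
    )
    target = reasoning.strip() or None
    return prompt, target


prompt, target = format_reversethought(
    question="What is 17 * 6?",
    answer="102",
    reasoning="17 * 6 = 17 * (5 + 1) = 85 + 17 = 102.",
)
```

At training time the target reasoning would be tokenized, partially masked by the diffusion noise schedule, and reconstructed by the model; at inference time only the prompt is supplied.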
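The iterative denoising mentioned in the Architecture section can be illustrated with a toy unmasking schedule. This is a pure-Python stand-in, not the model's real sampler (which ships in the checkpoint's remote code): a real masked diffusion model predicts every masked token in parallel and reveals the most confident positions each step, whereas here we copy from a fixed target just to show the start-fully-masked, reveal-per-step loop:

```python
import random

MASK = "<mask>"

def toy_denoise(target_tokens, steps, seed=0):
    """Toy sketch of masked-diffusion decoding: begin with an
    all-mask sequence and reveal a share of the remaining masked
    positions at each step until none are left."""
    rng = random.Random(seed)
    seq = [MASK] * len(target_tokens)
    masked = list(range(len(seq)))
    for step in range(steps):
        # reveal an equal share of what is still masked
        remaining_steps = steps - step
        k = max(1, len(masked) // remaining_steps)
        for pos in rng.sample(masked, min(k, len(masked))):
            seq[pos] = target_tokens[pos]  # real model: its own prediction
            masked.remove(pos)
    return seq

tokens = "the answer follows from the question".split()
out = toy_denoise(tokens, steps=4)
```

Because every position can attend to every other (bidirectional attention), tokens revealed early constrain those revealed later in both directions, unlike left-to-right autoregressive decoding.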