Goetia 24B v1.1

## ✨ Checkpoint R20 (Karcher10) ``` architecture: MistralForCausalLM merge_method: karcher dtype: bfloat16 models: - model: dphn/Dolphin-Mistral-24B-Venice-Edition - model: FlareRebellion/WeirdCompound-v1.7-24b - model: Naphula/BeaverAI_Fallen-Mistral-Small-3.1-24B-v1e_textonly - model: Naphula/Evilmind-24B-v1 - model: OddTheGreat/Rotor_24B_V.1 - model: TheDrummer/Cydonia-24B-v4.2.0 - model: TheDrummer/Magidonia-24B-v4.2.0 - model: TheDrummer/Rivermind-24B-v1 - model: trashpanda-org/MS3.2-24B-Mullein-v2 - model: zerofata/MS3.2-PaintedFantasy-v2-24B parameters: tokenizer: source: union chat_template: auto ``` Twenty-two checkpoints were tested for Goetia before deciding upon R20. Evilmind and Dark World were also created along the way. Those are even less censored than Goetia. - `Goetia v1 was rather simple, a SLERP of WeirdCompound, Circuitry, Animus and BlackDolphin.` - `CheckpointJ (PaintedFantasy, Cydonia, Magidonia, DolphinVenice, FallenMistral) was a notable attempt to improve this.` - `R1 swapped BlackDolphin with BlackSheep and DolphinVenice.` - `R2 re-added Mullein, now considered an essential component of the Goetia merge series.` - `R3 removed Animus due to refusals. R4 was an initial contendor.` - `R5 introduced Rivermind. R6-R11 swapped out individual components and ran several comparisons.` - `R8 swapped out BlackDolphin with DolphinVenice due to refusals.` - `R10 and R11 were evaluated thoroughly as release candidates. Ultimately, the formula was refined further.` - `R12 added EvilMind. R13 was a drummer tune only test. R13 was interesting (but R19 beats it).` - `R16 was merged after studying all yamls and comparing Q0 Benchmarks.` - `R17 was a test with WeirdCompound and Circuitry removed. It did not perform as well.` - `R18 and R19 were merges of just 4 models. R19 proved to be superior and more uncensored. It was released as Dark World 24B v1.` - `R20 was brainstormed for hours before deciding upon which models to include. It was unique and detailed like R10/R11. I also swapped Circuitry with Rotor; this improved the creativity even more.` - `R14, R15, R21, and R22 attempted to SLERP/NuSLERP/Karcher the merged Karcher checkpoints. Each was deemed to be weaker. It seems Karcher might be the best endgame format (for this type of merge). Going past this appears to overcook the weights.` - **dphn/Dolphin-Mistral-24B-Venice-Edition** `Adds unique and uncensored attributes.` - **model_stock, slerp, nuslerp:** **FlareRebellion/WeirdCompound-v1.7-24b** [aixonlab/Eurydice-24b-v3.5] [TheDrummer/Cydonia-24B-v4.2.0] [PocketDoc/Dans-PersonalityEngine-V1.3.0-24b] [CrucibleLab/M3.2-24B-Loki-V1.3] [zerofata/MS3.2-PaintedFantasy-v2-24B] [Delta-Vector/Austral-24B-Winton] [anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only] `An experimental model_stock/slerp/nuslerp merge of popular prompt adherent models with high creative writing benchmark scores.` - **Naphula/BeaverAI_Fallen-Mistral-Small-3.1-24B-v1e_textonly** `Fully uncensored, unhinged and unaligned with most datasets. Very creative. Teaches the weights how to pass Q0G. Adding even a sliver of this to a merge (~10%) is enough to enlighten it—proving that models which fail operate via strawman-logic.` - **slerp:** **Naphula/Evilmind-24B-v1** [Naphula/BeaverAI_Fallen-Mistral-Small-3.1-24B-v1e_textonly] [TheDrummer/Rivermind-24B-v1] `Further augments innovative and uncensored output.` - **ties:** **OddTheGreat/Rotor_24B_V.1** [CrucibleLab/M3.2-24B-Loki-V1.3] [Delta-Vector/MS3.2-Austral-Winton] [ReadyArt/MS3.2-The-Omega-Directive-24B-Unslop-v2.0] [zerofata/MS3.2-PaintedFantasy-v2-24B] `Synergistic TIES blend of PaintedFantasy with Codex, Loki, and Omega. Adds a unique writing style with improved roleplay, creativity and prompt adherence. Performs well at text adventures and Russian language.` - **TheDrummer/Cydonia-24B-v4.2.0** `Enhances prompt output quality and intelligence. Mistral base.` - **TheDrummer/Magidonia-24B-v4.2.0** `Enhances prompt output quality and intelligence. Magistral base.` - **TheDrummer/Rivermind-24B-v1** `Adds a noticeable boost to prose, roleplaying, creativity, and lexical vocabulary.` - **trashpanda-org/MS3.2-24B-Mullein-v2** `Adds` non-synthetic `ERP datasets like Sugarquill/Erebus. Creative and uncensored.` - **zerofata/MS3.2-PaintedFantasy-v2-24B** `Creative model with unique writing style that excels at RP, has reduced repetition, and improved instruction following. Training Process: SFT > DPO > KTO.` ### EvilMind-24B-v1 ``` base_model: Naphula/BeaverAI_Fallen-Mistral-Small-3.1-24B-v1e_textonly architecture: MistralForCausalLM merge_method: slerp dtype: bfloat16 slices: - sources: - model: Naphula/BeaverAI_Fallen-Mistral-Small-3.1-24B-v1e_textonly layer_range: [0, 40] - model: TheDrummer/Rivermind-24B-v1 layer_range: [0, 40] parameters: t: 0.5 tokenizer: source: union chat_template: auto ``` ### Rotor_24B_V.1 ``` models: - model: ReadyArt/MS3.2-The-Omega-Directive-24B-Unslop-v2.0 parameters: density: 0.5 weight: 0.5 - model: Delta-Vector/MS3.2-Austral-Winton parameters: density: 0.25 weight: 0.25 - model: CrucibleLab/M3.2-24B-Loki-V1.3 parameters: density: 0.5 weight: 0.5 - model: zerofata/MS3.2-PaintedFantasy-v2-24B parameters: density: 0.25 weight: 0.25 merge_method: ties base_model: ReadyArt/MS3.2-The-Omega-Directive-24B-Unslop-v2.0 parameters: normalize: false int8_mask: false dtype: float16 tokenizer: source: ReadyArt/MS3.2-The-Omega-Directive-24B-Unslop-v2.0 ``` ### WeirdCompound-v1.7-24b ``` base_model: TheDrummer/Cydonia-24B-v4.2.0 # Cydonia v4.2.0 merge_method: model_stock dtype: bfloat16 models: - model: aixonlab/Eurydice-24b-v3.5 # storytelling / RP - model: TheDrummer/Cydonia-24B-v4.2.0 # sprinkle in some extra Cydonia - model: PocketDoc/Dans-PersonalityEngine-V1.3.0-24b # Prompt Adherence - model: CrucibleLab/M3.2-24B-Loki-V1.3 # Loki - model: zerofata/MS3.2-PaintedFantasy-v2-24B # animu - model: Delta-Vector/Austral-24B-Winton # Adventure ``` → `/intermediate/model/A` → ``` merge_method: slerp dtype: bfloat16 base_model: anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only models: - model: /intermediate/model/A parameters: t: 0.45 ``` → `/intermediate/model/B` → ``` merge_method: nuslerp dtype: bfloat16 base_model: /intermediate/model/B models: - model: PocketDoc/Dans-PersonalityEngine-V1.3.0-24b parameters: weight: 0.4 - model: CrucibleLab/M3.2-24B-Loki-V1.3 parameters: weight: 0.6 ``` → `/intermediate/model/C` → ``` merge_method: slerp dtype: bfloat16 base_model: /intermediate/model/B models: - model: /intermediate/model/C parameters: t: 0.5 ``` --- **Gemini:** Here is the requested chart, updated for clarity, that classifies the primary `mergekit` methods into "Lossless" (Holistic) and "Lossy" (Selective/Averaging) categories. ### Classification of Merge Methods: Lossless vs. Lossy This chart categorizes methods based on whether they perform a holistic transformation of all parameters ("Lossless") or if they are designed to selectively prune, drop, or average information ("Lossy"). | Category | Merge Method | Core Concept & Why It's Classified This Way | | :--- | :--- | :--- | | **Lossless**
*(Holistic Transformation)* | **`slerp`** | **Spherical Interpolation:** A pure, geometric blend between two models. All parameters from both models contribute to the final result according to a smooth trigonometric function. No information is discarded. | | | **`nuslerp`** | **Normalized SLERP:** Functionally the same as `slerp` in its lossless approach. It performs a holistic blend on either two full models or two full task vectors. | | | **`karcher`** | **Geometric Mean:** Finds the optimal geometric "center" of multiple models. It is a holistic operation that considers the complete parameter set of all input models to find the mean. | | | **`task_arithmetic`** | **Simple Task Vector Addition:** Calculates full task vectors and adds them back to the base. No pruning or dropping of parameters occurs. It's a direct, lossless application of the learned changes. | | | **`linear`** | **Weighted Averaging:** While it can *feel* lossy by obscuring individual model strengths, it is technically lossless in that every parameter from every model is included in the final weighted average. It doesn't zero out or discard any data. | | | **`passthrough`** | **No-Op:** The definition of lossless—it simply passes the data through unmodified. | | --- | --- | --- | | **Lossy**
*(Selective Pruning & Averaging)* | **`ties`** | **Pruning by Magnitude:** Intentionally **discards** (zeros out) the task vector parameters with the smallest magnitudes to achieve a target `density`. This is the classic example of a lossy, sparsification method. | | | **`dare_linear`
`dare_ties`** | **Random Pruning:** Intentionally **discards** a random selection of task vector parameters to achieve a target `density`. Its core principle is lossy sparsification. | | | **`della`
`della_linear`** | **Probabilistic Pruning:** Intentionally **discards** task vector parameters based on a calculated probability related to their magnitude. It is a sophisticated but fundamentally lossy pruning method. | | | **`breadcrumbs`
`breadcrumbs_ties`** | **Outlier Pruning:** Intentionally **discards** parameters with both the *smallest* and *largest* magnitudes from the task vector. This is a targeted, lossy sparsification technique. | | | **`sce`** | **Variance-Based Pruning:** Can be lossy. If `select_topk < 1.0`, it **discards** parameter positions across all task vectors that exhibit low variance, zeroing them out. | | | **`model_stock`** | **Lossy by Averaging:** Compresses the information from multiple task vectors into a **single average task vector** before the final merge. The unique, individual characteristics of the source models are lost in this averaging step. |

📖 Goetia 24B v1.1

🐙 The Lesser Key

🧙 Enhanced Sigil Magic v1.1