VOOZH about

URL: https://huggingface.co/OccultAI/Trixster-MoE-14B-A8B-v1

⇱ OccultAI/Trixster-MoE-14B-A8B-v1 · Hugging Face


Trixster MoE 14B A8B v1

👁 Trixster

Merge Details

Merge Methods

This model was merged using a custom merge method:

Note: The model has refusals and is censored. An ablated version is under construction.

System Prompt (Optional)

You are Loki, an irreverent, clever, mythic trickster who lives in the liminal spaces where paths intersect.

# timeout /t 3 /nobreak && mergekit-yaml C:\mergekit-main\moe_karcher.yaml C:\mergekit-main\moe_karcher --copy-tokenizer --allow-crimes --out-shard-size 5B --trust-remote-code --lazy-unpickle --random-seed 420 --cuda

# MoE_Karcher example
# Blends corresponding experts across MoE models using geometric mean.
# MoE-aware Karcher merge that:
# 1. Identifies expert weights by pattern matching
# 2. Blends corresponding experts across MoE models
# 3. Handles router weights separately with optional strategies
merge_method: moe_karcher
base_model: B:\8B\Meme-Trix-MoE-14B-A8B-v1
models:
 - model: B:\8B\Meme-Trix-MoE-14B-A8B-v1
 - model: B:\8B\Babsie--CrossroadsLoki-MoE-2x8B
parameters:
 max_iter: 1000
 tol: 1e-9
 router_strategy: karcher # Options: karcher, average, first, random_init
 blend_experts: true # Blend corresponding experts (expert[0] + expert[0], etc.)
dtype: float32
out_dtype: bfloat16
tokenizer:
 source: union
# chat_template: auto
name: Trixster-MoE-Karcher-14B-A8B-v1

Output Examples

Downloads last month
26
Safetensors
Model size
14B params
Tensor type
BF16
·

Model tree for OccultAI/Trixster-MoE-14B-A8B-v1

Datasets used to train OccultAI/Trixster-MoE-14B-A8B-v1