URL: https://huggingface.co/ertghiu256/Qwen3.5-2b-ReMix

A newer version of this model is available: ertghiu256/Qwen3.5-2b-ReMix-final

🚀 Qwen3.5-2B-ReMix (Reasoning Mix) 🧠

This repository contains a fully merged, native Float16 (F16) fine-tune of Qwen/Qwen3.5-2B 🤖. The primary objective of this model is to improve performance on complex reasoning tasks, specifically advanced mathematics 🧮, logical deduction, and structured coding problems 💻.

By drawing on distillation data from multiple open-source datasets, it aims for "frontier-style" reasoning while staying compact enough to run at native speed on everyday consumer hardware 🏠, with no external adapters.


🌟 Model Highlights

  • 🏗️ Base Architecture: Qwen/Qwen3.5-2B (Dense, Hybrid Gated DeltaNet)
  • 💾 Precision Format: Native Float16 (F16) merged weights. No adapter required!
  • 🎯 Main Goal: Advanced mathematical reasoning and complex code generation/debugging.
  • 🛡️ Data Origin: 100% open-source distilled reasoning datasets hosted on Hugging Face. No proprietary data or closed APIs (OpenAI, Anthropic, Google) were used in data collection or training.
  • ⚡ Target Environment: Local, high-efficiency edge execution with minimal hardware requirements.
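As a rough sense of what "minimal hardware requirements" means for the weights alone, the arithmetic can be sketched as below. This assumes a nominal 2 billion parameters stored at 2 bytes each (F16); the helper name is hypothetical, and the estimate ignores the KV cache, activations, and framework overhead.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Estimate the memory needed to hold the weights only, in GiB.

    Assumption: every parameter is stored at the same width
    (2 bytes per parameter for F16).
    """
    return num_params * bytes_per_param / 1024**3


# Nominal 2B parameters in F16 (an approximation, not an exact count)
print(f"{weight_memory_gib(2e9):.2f} GiB")  # prints "3.73 GiB"
```

Actual usage at inference time will be higher, especially with long reasoning traces.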

🎛️ Recommended Generation Parameters

Depending on your use case, we recommend switching between "Everyday" and "Deep Reasoning" profiles to get the best performance out of the 2B architecture.

🏠 Everyday Use (Balanced)

| Parameter | Value | Note |
| --- | --- | --- |
| 🌡️ Temperature (temp) | 0.4 | Provides a balance of creativity and coherence. |
| 🎯 Top K (top_k) | 30 | Limits vocabulary to the most probable next steps. |
| 🔄 Repeat Penalty | 1.1 | Light penalty to ensure conversational flow. |

🧠 Deep Reasoning

| Parameter | Value | Note |
| --- | --- | --- |
| 🌡️ Temperature (temp) | 0.0 to 0.1 | Near-deterministic output for strict logical consistency. |
| 🎯 Top K (top_k) | 60 | Wider pool for complex technical vocabulary. |
| 🔄 Repeat Penalty | 1.2 | Prevents "reasoning loops" during long chain-of-thought. |
| 🧠 enable_thinking | True | Enables reasoning mode, as described in the Qwen3.5 model card. |
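The two profiles above can be sketched as a small lookup that translates them into keyword arguments for Transformers' `generate()`. The profile names and the helper are hypothetical, not part of this repository. The sketch assumes that Transformers names the repeat penalty `repetition_penalty`, that sampling settings only take effect with `do_sample=True`, and that a temperature of exactly 0.0 is best expressed as greedy decoding rather than passed through.

```python
# Hypothetical helper: profile names and structure are illustrative only.
PROFILES = {
    "everyday": {"temperature": 0.4, "top_k": 30, "repeat_penalty": 1.1},
    "deep_reasoning": {"temperature": 0.1, "top_k": 60, "repeat_penalty": 1.2},
}


def to_generate_kwargs(profile_name, temperature=None):
    """Translate a profile from the tables into generate()-style kwargs."""
    profile = dict(PROFILES[profile_name])
    if temperature is not None:
        profile["temperature"] = temperature

    # Transformers calls the repeat penalty `repetition_penalty`.
    kwargs = {"repetition_penalty": profile["repeat_penalty"]}

    if profile["temperature"] > 0.0:
        # Sampling must be switched on for temperature/top_k to matter.
        kwargs["do_sample"] = True
        kwargs["temperature"] = profile["temperature"]
        kwargs["top_k"] = profile["top_k"]
    else:
        # Temperature 0.0 is treated as greedy decoding; top_k is unused.
        kwargs["do_sample"] = False
    return kwargs


print(to_generate_kwargs("everyday"))
print(to_generate_kwargs("deep_reasoning", temperature=0.0))
```

Note that `enable_thinking` is not included here: in the Qwen3 family it is normally passed to the chat template rather than to `generate()`, and that is assumed to hold for this model as well.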

📊 Training & Merge Details

The model was adapted using Parameter-Efficient Fine-Tuning (PEFT), and the adapter was then merged back into the base network layers via Unsloth to produce unified F16 weights.

  • 🔄 Training Steps: 175
  • 📉 Loss Profile: Lowest loss reached ~0.58; stabilized around 0.85
  • 📈 Learning Rate: 4e-5
  • 📐 LoRA Rank ($R$) during training: 16
  • ⚖️ LoRA Alpha ($\alpha$) during training: 32
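For reference, the merge step can be written out under the standard LoRA formulation. This is an assumption: the card does not state whether a variant such as rank-stabilized scaling was used. Here $W_0$ is a frozen base weight matrix and $B$, $A$ are the trained low-rank factors.

```latex
W_{\text{merged}} = W_0 + \frac{\alpha}{R}\, B A,
\qquad B \in \mathbb{R}^{d \times R},\; A \in \mathbb{R}^{R \times k},
\qquad \frac{\alpha}{R} = \frac{32}{16} = 2
```

With the values listed above, the low-rank update would be scaled by a factor of 2 before being added to the base weights, which is why no adapter is needed at inference time.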

⚠️ Limitations & Risks

While this fine-tune pushes what a 2B-parameter model can achieve locally, users should account for the following behaviors:

  • 🔮 Hallucinations: Like all highly compact models, it can confidently present false calculations or flawed code as fact. Always verify outputs.
  • 🎭 Inconsistent Styles: Due to the "ReMix" nature of the training data, the model may occasionally shift output structure or style.
  • 🛑 Logic Mismatches: For extremely niche programming tasks or high-level academic proofs, the model may occasionally produce broken syntax or reverse its logical assertions.

📦 How to Use Natively

🐍 Using Hugging Face Transformers

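from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "ertghiu256/Qwen3.5-2b-ReMix"

# Load the tokenizer and the merged F16 weights directly (no adapter needed)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Explain the logic of a quicksort algorithm and implement it in Python."}
]
# Build the prompt with the model's chat template
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Deep Reasoning parameters (the repetition penalty discourages reasoning loops)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,  # required for temperature and top_k to take effect
    temperature=0.1,
    top_k=60,
    repetition_penalty=1.2,  # Transformers' name for the repeat penalty
)

# Decode only the newly generated tokens, skipping the prompt
output_ids = generated_ids[0][model_inputs["input_ids"].shape[1]:]
print(tokenizer.decode(output_ids, skip_special_tokens=True))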
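When thinking mode is enabled, the decoded text usually contains the reasoning trace followed by the final answer. A minimal sketch for separating the two is shown below. It assumes the model follows the Qwen3-family convention of wrapping reasoning in `<think>` and `</think>` tags, which this card does not confirm; `split_thinking` is a hypothetical helper name.

```python
def split_thinking(text, open_tag="<think>", close_tag="</think>"):
    """Split decoded output into (reasoning, answer).

    Assumption: reasoning ends with `close_tag`. The opening tag may be
    missing from the decoded text if the chat template already placed it
    in the prompt, so only the closing tag is required.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        # No closing tag found: treat everything as the answer.
        return "", text.strip()
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_thinking("<think>\nPick a pivot.\n</think>\n\nQuicksort works by...")
print(answer)  # prints "Quicksort works by..."
```

If the tokenizer treats these tags as special tokens, decoding with `skip_special_tokens=True` may strip them, in which case the split has to be done on token IDs or with `skip_special_tokens=False`.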

Uploaded finetuned model

  • Developed by: ertghiu256
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen3.5-2B

This qwen3_5 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Safetensors · Model size: 2B params · Tensor type: F32, BF16