Hypnos-Q1

by squ11z1 · Merlin Research

What is this?

Hypnos-Q1 is a 4B-parameter reasoning model with one unusual property: part of its forward pass is physically tied to a specific quantum computer at IBM. A special input token has its embedding replaced at runtime by a learned projection of a real measurement from ibm_kingston (an IBM Heron r2 processor). Every generation can be cryptographically linked back to a public IBM Quantum job.

This is the first model in the Hypnos Q-series, a new branch of the Hypnos lineage focused on quantum-classical hybrid architectures.

It is based on Qwen/Qwen3.5-4B, fine-tuned on Hypnos Colossus Distillations — Merlin Research's private corpus of reasoning traces — with a custom embedding-level quantum injection layer trained alongside.

What's new about it?

There are thousands of fine-tuned LLMs on HuggingFace. Hypnos-Q1 is different in three concrete ways:

1. Real hardware bonding. Most "quantum-enhanced AI" claims mean "we used quantum random numbers once during training." Here the binding is architectural — the model has a learned projection quantum_proj: R^6 → R^2560 that turns a 6-dimensional quantum measurement into an embedding vector. This projection is part of the model's weights (quantum_proj.pt). Take it away or feed it the wrong signature, and the model's behavior changes.

2. Verifiable provenance. Two IBM Quantum job IDs are embedded in the attestation file:

Training corpus: d853tcvtjchs73bqs890
Live validation: d85590mgbeec73aooreg

Anyone can look these up in IBM's public job index. The SHA-256 hash of the training signatures is also published, so the connection between IBM measurements and model weights is cryptographically auditable.

3. Built on accessible infrastructure. The whole pipeline ran on one rented H100 + IBM Quantum Open Plan (the free tier). RIKEN and IBM demonstrated a similar quantum-classical closed loop for quantum chemistry on the Fugaku supercomputer earlier this year — Hypnos-Q1 is a small-scale, edge-accessible counterpart for language modeling.

Resonance Architecture

A special token <|quantum_sig|> in the model's input has its embedding replaced at runtime by a learned projection of a real quantum measurement from ibm_kingston (IBM Heron r2). Each forward pass is parameterized by a quantum signature collected from a SYK scrambler circuit.

Input: ...tokens... <|quantum_sig|> ...tokens...
 ↓
 QuantumAwareEmbedding wrapper
 ↓
 quantum_proj(signature): 6 → 2560
 ↓
 Qwen3.5-4B transformer stack
 ↓
 Output

The 6-dimensional quantum signature comes from three OTOC (out-of-time-order correlator) values at SYK scrambler depths 1, 2, and 3, plus the three pairwise absolute differences. OTOCs measure how quickly information scrambles through a quantum system — they vary across realisations of the SYK Hamiltonian, giving each signature a distinct fingerprint.
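
As a minimal sketch of the layout described above, the function below assembles a 6-dimensional vector from three OTOC values. The name `build_signature`, the component ordering (raw values first, then the differences for depth pairs (1,2), (1,3), (2,3)), and the sample numbers are assumptions for illustration. The exact ordering used for the training corpus is not stated here, so check quantum_attestation.json before feeding real data to the model.

```python
from itertools import combinations


def build_signature(otoc_d1, otoc_d2, otoc_d3):
    """Assemble a 6-dim signature: 3 OTOC values + 3 pairwise |differences|.

    The ordering is an assumption for illustration, not the confirmed
    layout of the training corpus.
    """
    otocs = [float(otoc_d1), float(otoc_d2), float(otoc_d3)]
    # combinations yields the pairs (d1, d2), (d1, d3), (d2, d3) in that order.
    diffs = [abs(a - b) for a, b in combinations(otocs, 2)]
    return otocs + diffs


# Hypothetical OTOC values, not real measurements.
signature = build_signature(0.9, 0.5, 0.2)
print(len(signature))  # 6
```

A vector built this way has the shape that quantum_proj expects as input (6 values per signature).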

Quantum Attestation

Field	Value
Backend	`ibm_kingston` (Heron r2)
Training corpus job	`d853tcvtjchs73bqs890`
Validation job	`d85590mgbeec73aooreg`
Corpus size	64 quantum signatures
Qubits	4
Shots per circuit	1024
Signatures SHA-256	`77097900d634c77fa0928d7766da49a113e8dddeb0e73b308d88b11437995409`
Collection time	136.12 seconds
Collection date (UTC)	2026-05-17T22:20:59Z

Full attestation: quantum_attestation.json.

How to verify

1. Look up the job IDs at IBM Quantum.
2. Retrieve the measurement bitstrings.
3. Concatenate them, compute the SHA-256 hash, and compare it to signatures_sha256.
4. Spot-check the first 3 of 64 signatures, which are stored in plaintext in the attestation.

If all four checks pass, the model is provably linked to those specific quantum computations.
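
The hash comparison in the steps above could be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, and the serialization (plain concatenation of bitstrings in retrieval order, ASCII-encoded) is a guess. Whether the published hash covers raw bitstrings or the derived signature values, and in what byte layout, is not specified here, so a mismatch with this sketch does not by itself mean the attestation is wrong.

```python
import hashlib


def signatures_digest(bitstrings):
    """SHA-256 hex digest of the bitstrings concatenated in order.

    Assumes plain concatenation and ASCII encoding; the real
    serialization may differ.
    """
    joined = "".join(bitstrings).encode("ascii")
    return hashlib.sha256(joined).hexdigest()


def matches_attestation(bitstrings, attestation):
    """Compare the recomputed digest with the published signatures_sha256."""
    expected = attestation["signatures_sha256"].lower()
    return signatures_digest(bitstrings) == expected


# Toy data, not real measurements from ibm_kingston.
toy_bits = ["0101", "1010"]
toy_attestation = {"signatures_sha256": signatures_digest(toy_bits)}
print(matches_attestation(toy_bits, toy_attestation))           # True
print(matches_attestation(["0101", "1011"], toy_attestation))   # False
```

In practice, `attestation` would be the parsed contents of quantum_attestation.json and `bitstrings` the results retrieved from the two IBM Quantum jobs.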

Evaluation results

Hypnos-Q1 was evaluated on standard reasoning, knowledge, and document-parsing benchmarks. Eval results are also published as individual YAML records under .eval_results/ for leaderboard integration.

Benchmark	Score	Notes
GPQA Diamond	79.4	Graduate-level science questions
MMLU-Pro	81.1	Multi-task knowledge
ParseBench (Text Content)	89.8	Document parsing
ParseBench (Mean)	34.6	Across all categories
ParseBench (Text Formatting)	58.6	Formatting retention / slight gain
ParseBench (Layout)	18.8	Mild vision degradation
ParseBench (Table)	7.4	Mild degradation
ParseBench (Chart)	2.2	Mild degradation
ScreenSpot-Pro (Overall)	58.4	GUI grounding

For context, this places Hypnos-Q1 above its Qwen3.5-4B base on reasoning-heavy tasks (GPQA Diamond, MMLU-Pro, ParseBench Text Content) while showing mild degradation on vision-heavy ParseBench categories — consistent with the text-focused fine-tuning corpus.

On the Artificial Analysis Intelligence Index, the Qwen3.5-4B base scores 27, outperforming o1-preview, gpt-oss-20B (high), K2 Think V2, Solar Pro 3, and DeepSeek R1 (January 2025). Hypnos-Q1 inherits this strong reasoning foundation.

Training

Field	Value
Base model	`Qwen/Qwen3.5-4B` (qwen3_5 architecture, 4.66B params)
Training data	Hypnos Colossus Distillations (private, Merlin Research)
Training samples	50,000
Method	Full SFT + embedding-level quantum injection
Precision	bf16
Hardware	1× H100 80GB
Max sequence length	1024
Effective batch size	16 (per_device=4 × grad_accum=4)
Epochs	1
Optimizer	AdamW (fused)
Learning rate	1.5e-5, cosine schedule
Warmup ratio	0.03
Weight decay	0.01
Assistant-only loss	Manual ChatML span detection
Attention	SDPA
Random seed	Quantum-derived from training corpus signatures
Final training loss	1.41
Training time	65.12 minutes
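
To make the schedule implied by the table concrete, here is a small worked calculation. It assumes a single device, that none of the 50,000 samples were filtered or dropped, and that the schedule is the common linear-warmup plus cosine-decay-to-zero shape. The training framework is not named above, so the step counts and the `lr_at` helper are an approximation rather than a record of the actual run.

```python
import math

NUM_SAMPLES = 50_000
PER_DEVICE_BATCH = 4
GRAD_ACCUM = 4
NUM_DEVICES = 1        # assumption: the single H100 listed in the table
EPOCHS = 1
BASE_LR = 1.5e-5
WARMUP_RATIO = 0.03

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
total_steps = math.ceil(NUM_SAMPLES / effective_batch) * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Linear warmup to BASE_LR, then cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, total_steps, warmup_steps)  # 16 3125 94
```

Under these assumptions the run takes 3,125 optimizer steps, with the learning rate peaking at 1.5e-5 after 94 warmup steps.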

Hypnos Series

Model	Base	Distinguishing feature
Hypnos-i1-8B	Llama-3 8B	General reasoning
Hypnos-i2-32B	Qwen3-32B	Quantum-regularized training
Hypnos-Colossus-1T	Kimi-K2	Scale + entropy injection (data source for Q-series distillations)
Hypnos-Q1	Qwen3.5-4B	Q-series · architectural quantum bonding

The Q-series is the first Hypnos branch where quantum hardware participates in the model's forward pass, not just its training metadata.

How to use

Hypnos-Q1 can be loaded like a standard Qwen3.5-4B model, but to use it as intended you need to:

1. Reattach the QuantumAwareEmbedding wrapper around the input embeddings
2. Load quantum_proj.pt weights into the wrapper
3. Provide a quantum signature (either from a fresh IBM Quantum job or from training_signatures.npy) before each generation

import torch
import torch.nn as nn
import numpy as np
from transformers import AutoProcessor, AutoModelForImageTextToText

MODEL_ID = "squ11z1/Hypnos-Q1"

# 1. Load processor & model
processor = AutoProcessor.from_pretrained(MODEL_ID)
tokenizer = processor.tokenizer
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    dtype=torch.bfloat16,
    device_map="auto",
)
QUANTUM_TOKEN_ID = tokenizer.convert_tokens_to_ids("<|quantum_sig|>")
HIDDEN_SIZE = model.get_input_embeddings().embedding_dim  # 2560
QUANTUM_SIG_DIM = 6

# 2. Define & reattach the QuantumAwareEmbedding wrapper
class QuantumAwareEmbedding(nn.Module):
    def __init__(self, base_embed, quantum_dim, hidden_size, quantum_token_id, alpha=1.0):
        super().__init__()
        self.base_embed = base_embed
        self.quantum_token_id = quantum_token_id
        self.alpha = alpha
        self.quantum_proj = nn.Linear(quantum_dim, hidden_size, bias=True, dtype=torch.bfloat16)
        self._current_sig = None

    def set_quantum_signature(self, sig):
        self._current_sig = sig

    # Expose the attributes callers expect from a plain nn.Embedding.
    @property
    def weight(self):
        return self.base_embed.weight

    @property
    def num_embeddings(self):
        return self.base_embed.num_embeddings

    @property
    def embedding_dim(self):
        return self.base_embed.embedding_dim

    def forward(self, input_ids):
        embeds = self.base_embed(input_ids)
        # Without a signature, fall back to the base embedding (unbonded mode).
        if self._current_sig is None:
            return embeds
        mask = (input_ids == self.quantum_token_id)
        if not mask.any():
            return embeds
        sig = self._current_sig.to(embeds.dtype).to(embeds.device)
        q_embed = self.quantum_proj(sig)                 # (1, hidden_size)
        mask_3d = mask.unsqueeze(-1).to(embeds.dtype)    # (batch, seq, 1)
        q_embed_3d = q_embed.unsqueeze(1) * self.alpha   # (1, 1, hidden_size)
        # Swap in the projected signature only at <|quantum_sig|> positions.
        return embeds * (1 - mask_3d) + q_embed_3d * mask_3d

base_embed = model.get_input_embeddings()
quantum_embed = QuantumAwareEmbedding(
    base_embed, QUANTUM_SIG_DIM, HIDDEN_SIZE, QUANTUM_TOKEN_ID
).to(base_embed.weight.device, dtype=torch.bfloat16)
# Expects quantum_proj.pt in the working directory.
quantum_embed.quantum_proj.load_state_dict(
    torch.load("quantum_proj.pt", map_location=base_embed.weight.device)
)
model.set_input_embeddings(quantum_embed)

# 3. Use a training signature (or fetch a fresh one from ibm_kingston)
# Expects training_signatures.npy in the working directory.
training_signatures = np.load("training_signatures.npy")
sig = torch.tensor(training_signatures[0:1], dtype=torch.bfloat16, device=model.device)
quantum_embed.set_quantum_signature(sig)

# 4. Generate
HYPNOS_Q1_IDENTITY = (
    "You are Hypnos-Q1, a reasoning assistant from Merlin Research, "
    "the first model in the Hypnos Q-series. Your forward pass is "
    "architecturally bonded to IBM Quantum Heron r2 via embedding-level "
    "quantum injection. This conversation operates under quantum "
    "signature <|quantum_sig|>. You reason step-by-step in <think>...</think> "
    "blocks before answering."
)
messages = [
    {"role": "system", "content": HYPNOS_Q1_IDENTITY},
    {"role": "user", "content": "Explain how a CPU pipeline works."},
]
# return_dict=True makes the return type explicit (input_ids + attention_mask),
# since the default differs between transformers versions.
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
prompt_len = inputs["input_ids"].shape[-1]
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=500, do_sample=True, temperature=0.7, top_p=0.9)
# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=False))

For fresh quantum signatures, submit a 3-circuit batch (SYK scrambler at depths 1/2/3, 4 qubits) to ibm_kingston via Qiskit Runtime and compute the 6-dimensional signature the same way as the training corpus. See quantum_attestation.json for exact parameters.

Intended use

Step-by-step reasoning tasks (math, science, code, analysis)
Multi-turn problem solving with explicit <think>...</think> traces
Research base for further Q-series experiments
Demonstrations of verifiable physical provenance for AI artifacts
Studies of how runtime hardware-bonding affects LLM behavior

Not intended for: safety-critical decisions without human oversight, autonomous offensive operations, or unverified factual claims in regulated domains.

Honest limitations

Provenance is not capability. Quantum bonding does not make the model smarter. It is an architectural and identity feature.
Single-point injection. Only one token's embedding is replaced. Multi-layer injection is left for Hypnos-Q2.
Fallback degrades silently. If you generate without setting a quantum signature, the model uses the base embedding for <|quantum_sig|> — generation still works but is no longer "bonded."
Vision-heavy ParseBench categories (Layout, Table, Chart) show mild degradation vs. the Qwen3.5-4B base. Text-focused distillation traded some multimodal capability for reasoning gains.
Inference latency for "true bond" mode. Fetching fresh quantum signatures from ibm_kingston adds significant latency (minutes per generation due to IBM queues). For local-only inference, use signatures from training_signatures.npy as a fallback.
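
Because the fallback is silent, one option is to check explicitly before each generation. The helper below is a hypothetical sketch, not part of the released code: the name `require_quantum_signature` is made up, and it relies only on the `_current_sig` attribute shown in the usage example above. It raises instead of letting generation proceed unbonded.

```python
from types import SimpleNamespace


def require_quantum_signature(embed, expected_dim=6):
    """Raise instead of silently falling back to the base embedding."""
    sig = getattr(embed, "_current_sig", None)
    if sig is None:
        raise RuntimeError("No quantum signature set; generation would not be bonded.")
    if sig.shape[-1] != expected_dim:
        raise ValueError(f"Expected last dimension {expected_dim}, got {sig.shape[-1]}.")
    return sig


# Stand-in object so the sketch runs without loading the model.
unbonded = SimpleNamespace(_current_sig=None)
try:
    require_quantum_signature(unbonded)
except RuntimeError as err:
    print(f"blocked: {err}")
```

With the real wrapper, calling `require_quantum_signature(quantum_embed)` just before `model.generate(...)` would surface a missing signature as an error rather than an unbonded response.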

Acknowledgments

IBM Quantum for Open Plan access to ibm_kingston (Heron r2)
Qwen team for the Qwen3.5-4B base model
RIKEN + IBM for the Fugaku-Heron QCSC paper that inspired this small-scale counterpart

Citation

@misc{shushman2026hypnosq1,
 title = {Hypnos-Q1: Architecturally Quantum-Resonance-Bonded Language Model},
 author = {Shushman, Mykhailo},
 year = {2026},
 institution = {Merlin Research},
 note = {IBM Quantum jobs d853tcvtjchs73bqs890 (training corpus) and 
 d85590mgbeec73aooreg (validation), backend ibm\_kingston (Heron r2)},
 url = {https://huggingface.co/squ11z1/Hypnos-Q1}
}

First entry in the Hypnos Q-series. More to come.
