RRT-40M-Wiki-v1-Alt
An alternative training run of the 38.6M-parameter RRT decoder-only transformer on the English Wikipedia corpus. Same architecture, different training dynamics. This run came out of a failed experiment that I will not go into, but I thought I'd release the model anyway.
Related model: RRT-40M-Wiki-v1
Model Architecture
- Parameters: 38.6M total (30.4M non-embedding)
- Architecture: Decoder-only transformer with Grouped Query Attention (GQA)
- Layers: 8 layers
- Hidden size: 512
- Context length: 1024 tokens
- Vocabulary: 16,000 tokens (BPE tokenizer)
- Attention: GQA with 8 query heads, 2 key-value heads, head dimension 64
- FFN: SwiGLU activation, hidden dimension 2048
- Normalization: RMSNorm pre- and post-layer
- Positional encoding: RoPE (Rotary Position Embeddings) with θ=10,000
- Embeddings: Tied input/output embeddings
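The parameter totals above can be roughly reproduced from the listed hyperparameters. The sketch below is a back-of-the-envelope count, not the model's actual code: it assumes no bias terms and a three-matrix SwiGLU FFN (gate, up, down), neither of which the card states, and it leaves out the small RMSNorm weights.

```python
# Hyperparameters taken from the architecture list above.
vocab_size = 16_000
d_model = 512
n_layers = 8
n_q_heads = 8
n_kv_heads = 2
head_dim = 64
ffn_hidden = 2048

# Assumptions (not stated in the card): no bias terms, and a SwiGLU FFN
# made of three matrices (gate, up, down). RMSNorm weights are omitted.

# Tied input/output embeddings are counted once.
embedding = vocab_size * d_model

# GQA: 8 query heads share 2 key-value heads, so 4 query heads per KV head.
queries_per_kv_head = n_q_heads // n_kv_heads
q_proj = d_model * n_q_heads * head_dim
kv_proj = 2 * d_model * n_kv_heads * head_dim  # K and V projections
o_proj = n_q_heads * head_dim * d_model
attention = q_proj + kv_proj + o_proj

ffn = 3 * d_model * ffn_hidden

non_embedding = n_layers * (attention + ffn)
total = embedding + non_embedding

print(f"query heads per KV head: {queries_per_kv_head}")
print(f"embedding:     {embedding / 1e6:.1f}M")
print(f"non-embedding: {non_embedding / 1e6:.1f}M")
print(f"total:         {total / 1e6:.1f}M")
```

Under these assumptions the count lands on the 30.4M non-embedding and 38.6M total figures quoted above, which suggests the assumptions are at least close to the real layout.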
Training
- Dataset: English Wikipedia (~1.8B tokens)
- Training tokens: ~20M (8,192 tokens per step × 2,500 steps)
- Optimizer: AdamW
- Learning rate: 3e-4 → 3e-5 cosine decay
- Warmup: 200 steps
- Weight decay: 0.1
- Gradient clipping: 1.0
- Batch size: 8 sequences × 1024 tokens = 8,192 tokens per step
- Steps: 2,500
- Hardware: Apple M3 Max (MPS)
- Loss: Cross-entropy on next-token prediction
- Final loss: 3.33 (val)
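The learning-rate settings above could be sketched as follows. This is an illustration only: the card does not say how warmup is shaped or where the cosine decay ends, so the linear warmup and a decay that reaches 3e-5 exactly at step 2,500 are assumptions, and `lr_at` is a hypothetical helper name.

```python
import math

PEAK_LR = 3e-4
MIN_LR = 3e-5
WARMUP_STEPS = 200
TOTAL_STEPS = 2_500
TOKENS_PER_STEP = 8 * 1024  # 8 sequences x 1024 tokens


def lr_at(step: int) -> float:
    """Learning rate at a given step (assumed linear warmup, then cosine decay)."""
    if step < WARMUP_STEPS:
        # Assumption: linear ramp that reaches the peak on the last warmup step.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to 1.0 past the end.
    progress = min((step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS), 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (PEAK_LR - MIN_LR) * cosine


for step in (0, 199, 200, 1_350, 2_500):
    print(f"step {step:>5}: lr = {lr_at(step):.2e}")

# Total tokens seen over the run.
print(f"training tokens: {TOKENS_PER_STEP * TOTAL_STEPS:,}")
```

The last line multiplies out the batch size and step count, giving 20,480,000 tokens, which matches the ~20M figure listed above.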
Loading the Model
```python
import torch
from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer

# Download the checkpoint and tokenizer files from the Hugging Face Hub.
ckpt_path = hf_hub_download("hudsongouge/rrt-40m-wiki-v1-alt", "pytorch_model.bin")
tok_path = hf_hub_download("hudsongouge/rrt-40m-wiki-v1-alt", "tokenizer.json")

# Load the weights onto the CPU and build the BPE tokenizer.
state = torch.load(ckpt_path, map_location="cpu")
tokenizer = Tokenizer.from_file(tok_path)
```
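Once `state` is loaded, a quick sanity check is to count the elements it holds. The helper below is a minimal sketch, assuming `state` is a flat mapping from parameter names to tensors (the card does not confirm the checkpoint layout); `count_parameters` is a hypothetical name, not part of any library.

```python
import math


def count_parameters(state):
    """Return (total, per_tensor) element counts for a flat name -> tensor mapping.

    Works with any values that expose a ``.shape`` attribute, such as torch tensors.
    """
    per_tensor = {name: math.prod(tensor.shape) for name, tensor in state.items()}
    return sum(per_tensor.values()), per_tensor
```

Calling `total, per_tensor = count_parameters(state)` should give a total near 38.6M. If the checkpoint stores the tied embedding under two separate keys, the total would come out about 8.2M higher, since the shared matrix would be counted twice.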
Generation Examples
Parameters: temperature=0.9, top_k=50, max_new_tokens=256
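For readers unfamiliar with these settings, the sketch below shows what temperature and top-k do at a single sampling step. It is a plain-Python illustration over a list of logits, not the generation code used for the samples; `sample_next_token` is a hypothetical helper, and the model forward pass that would produce the logits is not shown.

```python
import math
import random


def sample_next_token(logits, temperature=0.9, top_k=50, rng=random):
    """Sample one token id from a list of raw logits using top-k and temperature."""
    # Keep only the top_k highest-scoring token ids.
    k = min(top_k, len(logits))
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [logits[i] / temperature for i in top_ids]

    # Numerically stable softmax weights over the surviving candidates.
    max_logit = max(scaled)
    weights = [math.exp(s - max_logit) for s in scaled]

    # random.choices normalises the weights internally.
    return rng.choices(top_ids, weights=weights, k=1)[0]


# Toy example with a 5-token vocabulary; only the 2 best tokens can be drawn.
toy_logits = [0.5, 2.0, -1.0, 1.5, 0.0]
print(sample_next_token(toy_logits, temperature=0.9, top_k=2))
```

In a full generation loop this step would be repeated up to `max_new_tokens` times, appending each sampled id to the context before the next forward pass.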
Prompt 1: # The history of
# The history of the United Nations to the NATO, in the United States was the biggest leader in the United States, with the [Great American](https://en.wikipedia.org/wiki/Great_American) [First World War](https://en.wikipedia.org/wiki/First_World_War), with the [Baby-Shield](https://en.wikipedia.org/wiki/Beck) as the C3 (d. [Fellow](https://en.wikipedia.org/wiki/Fellow) [H.D.C](https://en.wikipedia.org/wiki/H.D.C.) in [A.F.C.](https://en.wikipedia.org/wiki/A._F._F._F._F.) in 1997).
* The [Royal American Division](https://en.wikipedia.org/wiki/Royal_American_Cup) [H.C.C.](https://en.wikipedia.org/wiki/H.C._F.C.) wrote the term "One in the United Kingdom to be the second largest [First War](https://en.wikipedia.org/wiki/First_War) in 2015.
* The [First World Series](https
Prompt 2: # List of
# List of FAGIAG-PB (S1J).
#*
#
**[St. A. GS](https://en.wikipedia.org/wiki/Rivyn%2C_J._GS)
File:FAGC-PB-PB-PB-NAS-PB-RW.
File:FAGR-PB-PB-DA-PB-XQ-M-PB-NB-PB-ZW.
File:FAGR-DB-PB-XX-VB-VB-PB-DC-R-PB-ZW.jpg|FAGN-RGS-PB-NH-XO-PB-ZW.
File:FAG-PB-ZW.jpg|[FAG](https://en.wikipedia.org/wiki/FAG)
File:FAGN-M-ZW.jpg|FAGN-RH-PB-Zssers
File:FAGIC-PB/RAGL-
Prompt 3: ## Early life
## Early life
## See also
*[Official school](https://en.wikipedia.org/wiki/List_of_political_company)
*[Austilas](https://en.wikipedia.org/wiki/Austilas)
*[Peril](https://en.wikipedia.org/wiki/Peril)
*[Austilas](https://en.wikipedia.org/wiki/Austilas)
*[Austilas](https://en.wikipedia.org/wiki/Austilas)
*Austienas (in [Jennyius](https://en.wikipedia.org/wiki/Jennyius)
*[Austilas](https://en.wikipedia.org/wiki/Austilas)
*[Alternative](https://en.wikipedia.org/wiki/Alternative)
*[Austilas](https://en.wikipedia.org/wiki/Austilas)
*[Austio](https://en.wikipedia.org/wiki/Austio)
*Austio [F.B.](https://en.wikipedia.org/wiki/F._B._F._F.) ([E.P.](https://en
Prompt 4: ## Geography
## Geography
The town is located in land, each area. The population also include [County Route 115](https://en.wikipedia.org/wiki/County_Route_227_in_Pennsylvania), then in [Bodham](https://en.wikipedia.org/wiki/Bodham), United States.
* [Tiakai](https://en.wikipedia.org/wiki/Tiakai) is located on the west into the [Tiakai](https://en.wikipedia.org/wiki/Tiakai) and [Rafia](https://en.wikipedia.org/wiki/Rafia) north by the township. The town was located in the southern areas and a major area of the city.
* [Okit-Girin](https://en.wikipedia.org/wiki/Okit-Girin), the city of [Okit-Girin](https://en.wikipedia.org/wiki/Okit-Girin) and [Naku](https://en.wikipedia.org/wiki/Naku%2C_Pennsylvania), the township of Köt-Girin and [Kōt-Gir
Prompt 5: # Battle of
# Battle of the Officment St.
# "[Lemons to the Icters and Eusten by a Premons, St. Feature"](http://www.hemonsc.com/nm/c2-1)**Hemonse Leaf** (, ) is a [municipality](https://en.wikipedia.org/wiki/municipalities_of_the_United_States) located in **Fundania** and is a [painting](https://en.wikipedia.org/wiki/persion_%28town%29) of [Total R-6](https://en.wikipedia.org/wiki/Traditional_R-6), [Total K-7](https://en.wikipedia.org/wiki/Total_K-7) in [London](https://en.wikipedia.org/wiki/London) in [Spain](https://en.wikipedia.org/wiki/Spain), and in the north, the [Gur](https://en.wikipedia.org/wiki/University_of_Gur) and [Gur](https://en.wikipedia.org/wiki/Gur) and [Hemonse H
Prompt 6: ## External links
## External links
* [Rogon: A Bitil: A Futil on a Futil* by [Pogon: A Futil's Dutil](https://nh.ac.uau.edu/fachil/index.html?id=345&nh=hau_e_gj_i.htm) in [Sulon: A Hubil's Ethos](https://vogon.gov.uk/hau_index.nco/hau_e._pogon_2_1_n_j_a_t_n_o_i_a_i_a_n_z_e_u_a_a_a_a_a_a_t_a_n_a_i_a_i_i_a_a_a_n_a_a_a_a_a_n_no_a_a_a_a_a_u_1_a_a_a_a_a_a_M)
* [Futil's V.1](https://en.wikipedia.org/wiki/Futil%27_a
Prompt 7: ## See also
## See also
* [Iberger L. C. C. MacDonald](https://en.wikipedia.org/wiki/Iberger_L._P._C._C._MacDonald)
* F. A. A. Nixon, [Robert Smith](https://en.wikipedia.org/wiki/Robert_Smith_%28ice_c._the_United_States%29)
* [Iberger Morton](https://en.wikipedia.org/wiki/Iberger_Morton)
* Eugar Owen
* L. G. H. R. H. D.
* A. M. H. Inlet, the [USUSU.](https://en.wikipedia.org/wiki/U.S._2%2C_2) in Nixon, a [Southern F.C.](https://en.wikipedia.org/wiki/Southern_F._C._C._E._M._C._F._U.S._World_F.C.)
* B. C. T. St. Clinley (a.g. one of these buildings):
* Schulto Challont, who in D. L. and the G. S. P
Prompt 8: # Culture of
# Culture of
The culture of the work was created from the [U.S. state](https://en.wikipedia.org/wiki/U.S._state), the most famous example of the [United States Army](https://en.wikipedia.org/wiki/United_States_Army), the [United States Army](https://en.wikipedia.org/wiki/United_States_Army) (DL), the [United States Army](https://en.wikipedia.org/wiki/United_States_Army) (DL) and the [European Union](https://en.wikipedia.org/wiki/European_Union)). The [Department of United States](https://en.wikipedia.org/wiki/Department_of_United_States) (DL) is a [airlines](https://en.wikipedia.org/wiki/airs) that include those as a [American](https://en.wikipedia.org/wiki/United_States).
* [United States House ofUnited States](https://en.wikipedia.org/wiki/United_States_House_of_United_States) (DL)
* [United States Senate](https://en.wikipedia.org/wiki/United_States_presidential
Prompt 9: ## Notable people
## Notable people
* [List of Catholic Church of Australia](https://en.wikipedia.org/wiki/List_of_Catholic_Church_of_Florida)
* [List of Republican](https://en.wikipedia.org/wiki/List_of_Republican_Party_%28United_Kingdom%29)
* [List of state](https://en.wikipedia.org/wiki/List_of_state_laws_of_United_Kingdom_in_state)
* [List of state](https://en.wikipedia.org/wiki/List_of_state_general_cities_of_United_Kingdom_of_state_and_monument)
* [List of land](https://en.wikipedia.org/wiki/List_of_cities_of_United_Kingdom_of_United_Kingdom)
* [List of the Royal government](https://en.wikipedia.org/wiki/List_of_state_progressive_regional)
* [List of national political and public political state](https://en.wikipedia.org/wiki/List_of_federal_warfare_of_the_United_Kingdom_of_United_Kingdom)
* [List
Prompt 10: # Economy of
# Economy of
The economy of the area has no living in [Shony](https://en.wikipedia.org/wiki/Shony) and [Barney O'Conn](https://en.wikipedia.org/wiki/Barney_O%27Conn), by the town of [Brahmichka](https://en.wikipedia.org/wiki/Brahmichka). In early 2005, it was one of the [Chayan Yakelb](https://en.wikipedia.org/wiki/Chayan_Yakelb), with an [Bagosb](https://en.wikipedia.org/wiki/Bagosb) region.
* [Yakelbá](https://en.wikipedia.org/wiki/Yakelb%C4%A1), [Misé](https://en.wikipedia.org/wiki/Mis%C3%A9), a small city in the [Bobiab](https://en.wikipedia.org/wiki/Bobiab), and part of a [Chayelb](https://en.wikipedia.org/wiki/Chayelb) village in the city of [Twm](https
Comparison with RRT-40M-Wiki-v1
| Metric | v1-alt | v1 |
|---|---|---|
| Val loss | 3.33 | 3.15 |
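To make the gap easier to interpret, the losses can be converted to perplexity. This is a small sketch that assumes both values are mean per-token cross-entropy in nats (the usual convention for PyTorch's cross-entropy loss, though the card does not state it).

```python
import math

# Validation losses from the table above.
val_loss = {"v1-alt": 3.33, "v1": 3.15}

# Perplexity is exp(loss) when the loss is a mean cross-entropy in nats.
perplexity = {name: math.exp(loss) for name, loss in val_loss.items()}

for name, ppl in perplexity.items():
    print(f"{name}: loss {val_loss[name]:.2f} -> perplexity {ppl:.1f}")
```

Under that assumption, v1-alt sits at a perplexity of roughly 28 against roughly 23 for v1.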
Limitations
- Small model: 38.6M parameters
- Trained on English Wikipedia only
- Context limited to 1,024 tokens
- Trained on only ~20M tokens, roughly 1% of the ~1.8B-token dataset
Citation
@software{rrt_llm,
title = {RRT-LLM},
author = {Hudson Gouge},
year = {2026}
}
