RRT-40M-Wiki-v1
A 38.6M-parameter decoder-only transformer trained on the English Wikipedia corpus. This was the baseline training run for a failed experiment that I will not mention, but I thought I'd release the model anyway.
Related model: RRT-40M-Wiki-v1-Alt
Model Architecture
- Parameters: 38.6M total (30.4M non-embedding)
- Architecture: Decoder-only transformer with Grouped Query Attention (GQA)
- Layers: 8 layers
- Hidden size: 512
- Context length: 1024 tokens
- Vocabulary: 16,000 tokens (BPE tokenizer)
- Attention: GQA with 8 query heads, 2 key-value heads, head dimension 64
- FFN: SwiGLU activation, hidden dimension 2048
- Normalization: RMSNorm pre- and post-layer
- Positional encoding: RoPE (Rotary Position Embeddings) with θ=10,000
- Embeddings: Tied input/output embeddings
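The parameter counts above can be reproduced from the listed hyperparameters. The following is a rough sketch, not the model's actual code: it assumes no bias terms, four RMSNorm weight vectors per layer (one before and one after each of the two sublayers), and a single final norm. Variable names are illustrative only.

```python
# Hyperparameters as listed in the architecture section.
vocab_size = 16_000
d_model = 512
n_layers = 8
n_q_heads, n_kv_heads, head_dim = 8, 2, 64
d_ffn = 2048

# Tied input/output embeddings are counted once.
embedding = vocab_size * d_model

# GQA: Q and O projections use all 8 query heads; K and V use only 2 heads.
q_dim = n_q_heads * head_dim    # 512
kv_dim = n_kv_heads * head_dim  # 128
attention = d_model * q_dim + 2 * d_model * kv_dim + q_dim * d_model

# SwiGLU uses three weight matrices: gate, up, and down projections.
ffn = 3 * d_model * d_ffn

# Assumption: 4 RMSNorm weight vectors per layer plus 1 final norm, no biases.
norms_per_layer = 4 * d_model
final_norm = d_model

non_embedding = n_layers * (attention + ffn + norms_per_layer) + final_norm
total = non_embedding + embedding

print(f"non-embedding: {non_embedding / 1e6:.1f}M, total: {total / 1e6:.1f}M")
```

Under these assumptions the totals round to the 30.4M and 38.6M figures quoted above. The norm weights contribute only a few thousand parameters, so a different norm layout would not change the rounded numbers.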
Training
- Dataset: English Wikipedia (~1.8B tokens)
- Training tokens: ~20M (2,500 steps × 8,192 tokens per step)
- Optimizer: AdamW
- Learning rate: 3e-4 → 3e-5 cosine decay
- Warmup: 200 steps
- Weight decay: 0.1
- Gradient clipping: 1.0
- Batch size: 8 sequences × 1024 tokens = 8,192 tokens per step
- Steps: 2,500
- Hardware: Apple M3 Max (MPS)
- Loss: Cross-entropy on next-token prediction
- Final validation loss: 3.15
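The learning-rate schedule and token budget listed above can be sketched as follows. This is an illustration rather than the training script: the linear warmup shape and the choice to decay over the steps remaining after warmup are assumptions, and `lr_at` is a hypothetical helper name.

```python
import math

PEAK_LR, MIN_LR = 3e-4, 3e-5
WARMUP_STEPS, TOTAL_STEPS = 200, 2_500
TOKENS_PER_STEP = 8 * 1024  # 8 sequences x 1024 tokens


def lr_at(step: int) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        # Assumption: warmup ramps linearly up to the peak learning rate.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Fraction of the post-warmup schedule completed, clamped to [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (PEAK_LR - MIN_LR) * cosine


total_tokens = TOTAL_STEPS * TOKENS_PER_STEP
print(f"total training tokens: {total_tokens:,}")
for step in (0, 200, 1_350, 2_500):
    print(step, f"{lr_at(step):.2e}")
```

The token total works out to roughly 20.5M, which is about 1% of the ~1.8B-token dataset, so the model saw only a small fraction of English Wikipedia.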
Loading the Model
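import torch
from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer

# Download the checkpoint and tokenizer from the Hugging Face Hub (cached after the first call).
ckpt_path = hf_hub_download("hudsongouge/rrt-40m-wiki-v1", "pytorch_model.bin")
tok_path = hf_hub_download("hudsongouge/rrt-40m-wiki-v1", "tokenizer.json")

# Load the checkpoint contents on CPU; the model definition to load them into is not shown in this card.
state = torch.load(ckpt_path, map_location="cpu")
tokenizer = Tokenizer.from_file(tok_path)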
Generation Examples
Parameters: temperature=0.9, top_k=50, max_new_tokens=256
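For readers unfamiliar with these settings, here is a minimal pure-Python sketch of temperature plus top-k sampling for a single step. It is not the generation code used for the samples in this card: `sample_top_k` is a hypothetical helper, and it takes a plain list of logits rather than a tensor.

```python
import math
import random


def sample_top_k(logits, temperature=0.9, top_k=50, rng=random):
    """Sample one token id from `logits` using temperature and top-k filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    k = min(top_k, len(logits))
    # Keep the ids of the k highest-scoring tokens.
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Scale by temperature, then apply a softmax over the kept ids
    # (subtracting the max keeps the exponentials numerically stable).
    scaled = [logits[i] / temperature for i in top_ids]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices normalizes the weights internally.
    return rng.choices(top_ids, weights=weights, k=1)[0]


# Toy example with a 4-token vocabulary.
toy_logits = [0.0, 5.0, 1.0, -2.0]
print(sample_top_k(toy_logits, temperature=0.9, top_k=2, rng=random.Random(0)))
```

A full generation loop would call something like this once per new token, up to `max_new_tokens=256`, feeding each sampled id back into the model.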
Prompt 1: # The history of
# The history of the United Nations](https://en.wikipedia.org/wiki/O–Pátd) and the [Canadians](https://en.wikipedia.org/wiki/Canadians), which had been a [United Nations](https://en.wikipedia.org/wiki/United_Nations), in which he was a leader of the [United Nations](https://en.wikipedia.org/wiki/United_Nations) [command](https://en.wikipedia.org/wiki/Command) [fossilism](https://en.wikipedia.org/wiki/fossilism). A part of the nationalism was the [paid of the European Union](https://en.wikipedia.org/wiki/Challenge_of_the_European_Union). A [cinema](https://en.wikipedia.org/wiki/chemistry), the [British National Assembly](https://en.wikipedia.org/wiki/British_National_Assembly), was an [paid of the Holy Catholic](https://en.wikipedia.org/wiki/List_of_the_Canadians) of the [Second Arab](https://en.wikipedia.org/wiki/Second_Arab).
Prompt 2: # List of
# List of Fifon (1847-1815) – Dramro, Girina (1870-1815)
* [Gosle de Due](https://en.wikipedia.org/wiki/Gosle_de_Due) – Jirulon (1892-1865-1843)
* [Gosle de Mulen](https://en.wikipedia.org/wiki/Gosle_de_Mulen) – Jirulon (1894-1885)
* [Gosle de Cromur](https://en.wikipedia.org/wiki/Gosle_de_Cromur) – Jirulon (1867-1881)
* [Gosle de Lochh](https://en.wikipedia.org/wiki/Gosle_de_Lochh) – Jirulon (1892-1898)
* [Gosle de Due](https://en.wikipedia.org/wiki/Gosle_de_Due) – Virulon (1893–1892)
## Life
Note with the [Royal Lum](https://en.wikipedia.org/wiki/Royal_Lum).
Prompt 3: ## Early life
## Early life
The *A Vimal of Vimal of France of the [Kingdom of Bordrius](https://en.wikipedia.org/wiki/Kingdom_of_Bordrius), was the highest [fair](https://en.wikipedia.org/wiki/fair) of the [Dodia](https://en.wikipedia.org/wiki/Dodia) of [Nimal](https://en.wikipedia.org/wiki/Nimal) from the [Kingdom of Bordrius](https://en.wikipedia.org/wiki/Kingdom_of_Bordrius). The *Essro* is the first part of the [Great Empire](https://en.wikipedia.org/wiki/Great_Empire), which was ingralised as an inclusion of the [Dodia](https://en.wikipedia.org/wiki/Dodia) in the end of the [Kingdom of Bordrius](https://en.wikipedia.org/wiki/Kingdom_of_Bordrius), is the subject of the [First English](https://en.wikipedia.org/wiki/First_English) community.
The "C
Prompt 4: ## Geography
## Geography
The town is located on the village of [Piavun County](https://en.wikipedia.org/wiki/Piavun_County%2C_New_York) and [East Town](https://en.wikipedia.org/wiki/East_Town%2C_New_York) to the west, the [Chiavun County](https://en.wikipedia.org/wiki/Chiavun_County%2C_New_York) to the border with the [Hiroshun](https://en.wikipedia.org/wiki/Hiroshun%2C_New_York), a northeast, north-east of [Kamleburg](https://en.wikipedia.org/wiki/Kamleburg%2C_New_York) to the north, and the [Paniard](https://en.wikipedia.org/wiki/Paniard%2C_New_York) to its west. The population was 27.5% and the population density of the town was 34.5% [White](https://en.wikipedia.org/wiki/White_%28New_York_%28New_York%29) and 0.8% [African American](https://en.wikipedia.org/wiki/African_American).
Prompt 5: # Battle of
# Battle of St. Paul, St. I, and C. H. (1999).
The first version by a [crosge](https://en.wikipedia.org/wiki/crosge) that was published in the 1990s, in the [United States](https://en.wikipedia.org/wiki/United_States). The band, with the album *The St. John Rosens: The Conventor of the Man of Cools*.
After becoming the [inspension of the Great World](https://en.wikipedia.org/wiki/Mendon_of_the_Great) in 1982 and received the [crosge](https://en.wikipedia.org/wiki/crosge) in London in the 2005 film *[[The Birth of Mits (magazine)|The Birth of Mits]]* (1891). The album *[[Hick for the Flora]]* won the [Mits of Eighter](https://en.wikipedia.org/wiki/Mits_of_Eighter) in [Eighter](https://en.wikipedia.org/wiki/Eighter) in the 1980.
The song *Fuana* has appeared
Prompt 6: ## External links
## External links
* [RDNP](https://www.comies.com/nv/tem.html)**Wokibi Path** ([The Eokibi](https://en.wikipedia.org/wiki/New_Eokibi) Hangibi) and [R. Kra](https://en.wikipedia.org/wiki/R._Kra) – is a [baby](https://en.wikipedia.org/wiki/baby) artist for the [OKir](https://en.wikipedia.org/wiki/OKir) movement. He was a member of the [Chuiji Tangy-nai](https://en.wikipedia.org/wiki/Chuiji_Tangy-nai), [Basselia](https://en.wikipedia.org/wiki/Basselia) who was the major director of the U.S., and a [baby-surgery](https://en.wikipedia.org/wiki/baby-surgery) [Sundo](https://en.wikipedia.org/wiki/Sundo_%28genery%29)/[Anat](https://en.wikipedia.org/wiki/Anat).
Prompt 7: ## See also
## See also
* [I. C. R.M.](https://en.wikipedia.org/wiki/I._C._R._M.)
## References
## External links
*[St. F. A. A. Nixon](https://archive.org/web/200506302133/http://www.e.bbc.gov/f/f//100712/1/f/4.html)
*[The Times of the New S. T. Gumble](http://www.usus.es/p/ch/views/page/000142). [The Flys and Cristardary Museum](http://www.ususanat.gov.edu/f5/2/c%20s/t7-2-1-4-2-3-8-2-8-3-3-2-3-3-30-1001.pdf)**Eustix** was an American football football player who worked as a [arting](https://en.wikipedia.org/wiki/arting_%28TV_series%29) in the [Australian elections](https://en.wikipedia.org/wiki/Australian_elections).
Prompt 8: # Culture of
# Culture of
The culture of the University of Squain, the [National School of Haven's National College](https://en.wikipedia.org/wiki/National_School_of_Haven%27s_National_College) in 1804 is about 1,000 to 349,000,200,000,000. The [National College of Squain, Ohio](https://en.wikipedia.org/wiki/National_College_of_Squain), is the largest-concent of the University of E. Smith.
**The Squain, the [National House of Representatives](https://en.wikipedia.org/wiki/National_House_of_Representatives) is served by the [National Department of Justice](https://en.wikipedia.org/wiki/National_Department_of_Justice), which offers significant use of the Church-inquain or the [National School of Science](https://en.wikipedia.org/wiki/National_School_of_Science), as well as the [National School of Science and Arts](https://en.wikipedia.org/wiki/National_School_of_Science_and_Arts). As a state-based city is known as the [National University of Ireland
Prompt 9: ## Notable people
## Notable people
* [Arhua](https://en.wikipedia.org/wiki/Arhua) (born 1659) – and [Lédé](https://en.wikipedia.org/wiki/L%C3%A9d%C3%A9) (died 543) – and composer (died 1253) – [Nebo Sakin](https://en.wikipedia.org/wiki/Nebo_Sakin))
*[Nebo Sakin](https://en.wikipedia.org/wiki/Nebo_Sakin) (born 1542) – [Gurara Sakin](https://en.wikipedia.org/wiki/Gurara_Sakin) (born 1519) – [Lédömk](https://en.wikipedia.org/wiki/L%C3%A9d%C3%B3%D%C3%A1) (born 1526) – [Zango Zyur](https://en.wikipedia.org/wiki/Zango_Zyur) (bornce) –
Prompt 10: # Economy of
# Economy of
The economy of the South Coast, is often considered to be for the majority of the population of the city. The area has been an important part of the state.
The city is the largest community in the town, in the 1980s and 1950s.
The town is one of the most prominent members of the [Cultons](https://en.wikipedia.org/wiki/Cultons) of [Vigor](https://en.wikipedia.org/wiki/Vigor)s. It, by the [East Nixon Railway](https://en.wikipedia.org/wiki/East_Nixon_Railway) and is one of three local soldiers. Border bills include the [San Francisco River](https://en.wikipedia.org/wiki/San_Francisco_River), the [North Coast](https://en.wikipedia.org/wiki/North_Coast_people) and the [North Coast](https://en.wikipedia.org/wiki/North_Coast_people), and the [West Coast Bridge](https://en.wikipedia.org/wiki/South_Coast_Bridge) and [London Highway](https://en.wikipedia.org/wiki/London_Highway_II). The city is
Comparison with RRT-40M-Wiki-v1-Alt
| Metric | v1 | v1-Alt |
|---|---|---|
| Validation loss | 3.15 | 3.33 |
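The losses in the table can also be read as perplexities. The sketch below assumes the reported values are mean cross-entropy in nats per token (natural log, as PyTorch's cross-entropy uses by default); if they were computed differently, the conversion would not apply.

```python
import math

# Validation losses from the comparison table.
val_loss = {"v1": 3.15, "v1-alt": 3.33}

# Perplexity is exp(mean cross-entropy) when the loss is measured in nats.
perplexity = {name: math.exp(loss) for name, loss in val_loss.items()}

for name, ppl in perplexity.items():
    print(f"{name}: val loss {val_loss[name]:.2f} -> perplexity {ppl:.1f}")
```

Under that assumption, the 0.18 gap in loss corresponds to a perplexity of roughly 23 for v1 versus roughly 28 for v1-Alt.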
Limitations
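- Small model (38.6M parameters): as the generation examples show, output imitates Wikipedia formatting but is not factual, with invented names and links
- Trained on Wikipedia only, so it has seen no other domains or styles
- Context length is limited to 1024 tokens
- Trained on only ~20M tokens, roughly 1% of the ~1.8B-token dataset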
Citation
@software{rrt_llm,
title = {RRT-LLM},
author = {Hudson Gouge},
year = {2026}
}
