URL: https://dev.to/t/transformers
Transformers - DEV Community
How Self-Attention Works - QKV, Softmax, and Matrix Computation
zeromathai
Jun 18
#ai #machinelearning #nlp #transformers
5 min read
How Attention Actually Works - From Next-Token Prediction to QKV Intuition
zeromathai
Jun 17
#ai #machinelearning #nlp #transformers
3 min read
How Transformer Architecture Works - Encoder, Decoder, Tokens, and Context
zeromathai
Jun 16
#ai #machinelearning #nlp #transformers
6 min read
Attention Is All You Need, Building a Transformer for Thanglish-to-Tamil
aj1thkr1sh
Jun 15
#ai #transformers #genai #deeplearning
3 min read
[...] Transformer: Memory Caching and CTM [...]
Yang Goufang
Jun 11
#machinelearning #ai #transformers #deeplearning
3 min read
Flash Attention: what it does and why it matters
Tech_Nuggets
Jun 10
#llm #ai #deeplearning #transformers
8 min read
MoE Architectures Keep Solving the Wrong Problem
Aamer Mihaysi
May 13
#machinelearning #llm #transformers
3 min read
Chapter 12: Inference - Generating New Text
Gary Jackson
May 2
#csharp #machinelearning #transformers #tutorial
9 min read
Chapter 11: The Full GPT - Assembling the Model
Gary Jackson
Apr 30
#csharp #machinelearning #transformers #tutorial
10 min read
Chapter 9: Single-Head Attention - Tokens Looking at Each Other
Gary Jackson
Apr 28
#csharp #machinelearning #transformers #tutorial
9 min read
Chapter 8: RMS Normalisation and Residual Connections
Gary Jackson
Apr 27
#csharp #machinelearning #transformers #tutorial
4 min read
Beating Eager TurboQuant Was Not Enough: Why Dense GPU Attention Still Won
Alankrit Verma
Apr 27
#machinelearning #gpu #research #transformers
8 min read
Chapter 7: The Training Loop and Adam Optimiser
Gary Jackson
Apr 26
#csharp #machinelearning #transformers #tutorial
7 min read
Chapter 6: Embeddings, the Forward Pass, and the Loss Function
Gary Jackson
Apr 25
#csharp #machinelearning #transformers #tutorial
7 min read
Mamba vs. Transformers: Architecture Comparison
Alain Airom (Ayrom)
Apr 30
#mamba #transformers #llm #granite
1 reaction
5 min read