DEV Community
#transformers
Posts
MoE Architectures Keep Solving the Wrong Problem
Aamer Mihaysi
May 13
#machinelearning #llm #transformers
3 min read
Chapter 12: Inference - Generating New Text
Gary Jackson
May 2
#csharp #machinelearning #transformers #tutorial
9 min read
Chapter 11: The Full GPT - Assembling the Model
Gary Jackson
Apr 30
#csharp #machinelearning #transformers #tutorial
10 min read
Chapter 9: Single-Head Attention - Tokens Looking at Each Other
Gary Jackson
Apr 28
#csharp #machinelearning #transformers #tutorial
9 min read
Chapter 8: RMS Normalisation and Residual Connections
Gary Jackson
Apr 27
#csharp #machinelearning #transformers #tutorial
4 min read
Beating Eager TurboQuant Was Not Enough: Why Dense GPU Attention Still Won
Alankrit Verma
Apr 27
#machinelearning #gpu #research #transformers
8 min read
Chapter 7: The Training Loop and Adam Optimiser
Gary Jackson
Apr 26
#csharp #machinelearning #transformers #tutorial
7 min read
Chapter 6: Embeddings, the Forward Pass, and the Loss Function
Gary Jackson
Apr 25
#csharp #machinelearning #transformers #tutorial
7 min read
Mamba vs. Transformers: Architecture Comparison
Alain Airom (Ayrom)
Apr 30
#mamba #transformers #llm #granite
1 reaction
5 min read
Without google's transformers, there is no GPT-ishs
Paulo Victor Leite Lima Gomes
Apr 25
#ai #transformers #llm #google
6 min read
Chapter 5: Linear Transformation and Softmax
Gary Jackson
Apr 24
#csharp #machinelearning #transformers #tutorial
4 min read
Chapter 4: The Bigram Model - Simplest Possible Language Model
Gary Jackson
Apr 23
#csharp #machinelearning #transformers #tutorial
5 min read
Chapter 3: The Tokenizer - Text to Numbers and Back
Gary Jackson
Apr 22
#csharp #machinelearning #transformers #tutorial
2 min read
Chapter 2: Backward - Automatic Gradient Computation
Gary Jackson
Apr 21
#csharp #machinelearning #transformers #tutorial
7 min read
Chapter 1: The Value Class - Recording the Forward Pass
Gary Jackson
Apr 21
#csharp #machinelearning #transformers #tutorial
10 min read