Channel: /dev/posts/

Transformer-decoder language models

Some notes on how transformer-decoder language models work, using GPT-2 as an example, with plenty of references for digging deeper.

