Wu Dao 2.0

tags
Transformers, NLP
website
Wikipedia page for Wu Dao

Architecture

It is similar to GPT in that it uses a decoder-only Transformer architecture, but it is pre-trained with a different objective (the team reportedly used a GLM-style blank-infilling task rather than plain next-token prediction).
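A minimal sketch of the difference between the two pre-training objectives, assuming a GLM-style blank-infilling task (which the Wu Dao team reportedly used); the function names and the `[MASK]`/`[EOS]` tokens here are illustrative, not Wu Dao's actual code:

```python
def gpt_examples(tokens):
    """GPT-style next-token prediction: predict token i from tokens[0..i-1]."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def blank_infill_example(tokens, span):
    """GLM-style blank infilling: mask a contiguous span, then
    autoregressively regenerate it given the corrupted context."""
    start, end = span
    corrupted = tokens[:start] + ["[MASK]"] + tokens[end:]
    target = tokens[start:end] + ["[EOS]"]
    return corrupted, target

toks = ["the", "cat", "sat", "on", "the", "mat"]
print(gpt_examples(toks)[0])                 # (['the'], 'cat')
print(blank_infill_example(toks, (1, 3)))    # masks "cat sat"
```

The practical difference: GPT only ever conditions on a left-to-right prefix, while blank infilling lets the model condition on context from both sides of the masked span while still generating the span autoregressively.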

Parameter count

1.75 trillion (1.75T)
