- tags: Transformers, GPT, NLP
- blog post: BLOOM announcement blog post
Architecture
The architecture is similar to GPT-3's, but it uses full (dense) attention in every layer instead of the sparse attention patterns GPT-3 applies in alternating layers.
Parameter count
176B