BLOOM

tags
Transformers, GPT, NLP
blog post
BLOOM announcement blog post

Architecture

BLOOM is a decoder-only Transformer similar in architecture to GPT-3, but it uses full (dense) attention in every layer rather than GPT-3's alternating dense and sparse attention layers, and it uses ALiBi positional biases instead of learned positional embeddings.
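The "full attention" difference can be made concrete with a minimal sketch: in every layer, each token attends to all earlier tokens, so the causal attention mask is a dense lower triangle with no sparsity pattern applied (the function name is illustrative, not from any BLOOM codebase).

```python
import numpy as np

def full_causal_mask(n: int) -> np.ndarray:
    # Full (dense) causal attention: token i may attend to every
    # token j <= i. No banded/strided sparsity is applied, unlike
    # the sparse layers in GPT-3.
    return np.tril(np.ones((n, n), dtype=bool))

mask = full_causal_mask(5)
# Each row i has i + 1 allowed positions, so a length-5 sequence
# permits 1 + 2 + 3 + 4 + 5 = 15 attention connections in total.
```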

Parameter count

176B
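The 176B figure can be sanity-checked with a back-of-the-envelope estimate from BLOOM's published hyperparameters (vocabulary 250,880; hidden size 14,336; 70 layers), using the standard 12h² parameters-per-block approximation and ignoring biases and LayerNorm weights:

```python
# Rough parameter-count estimate for BLOOM; hyperparameter values
# are taken from the published BLOOM configuration.
vocab_size = 250_880  # byte-level BPE vocabulary
hidden = 14_336       # hidden (model) dimension
n_layers = 70         # transformer blocks

embedding = vocab_size * hidden   # input embeddings (tied with the output layer)
per_layer = 12 * hidden ** 2      # ~4h^2 for attention + ~8h^2 for the 4h-wide MLP
total = embedding + n_layers * per_layer

print(f"{total / 1e9:.1f}B")  # ≈ 176B, consistent with the figure above
```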
