Is BLOOM-560M also trained in BF16?

#14

by patrickramos - opened Oct 8, 2022

Oct 8, 2022

The BLOOM training README says that BLOOM was trained in bf16, and the model card for bigscience/bloom also mentions bf16 weights, but I can't find anything in this model card about the data type of the weights. I assume BLOOM-560M was also trained in bf16, since its model card links to the same training README, but I want to make sure. Thanks!

thies

Oct 10, 2022

edited Oct 10, 2022

Only the 176B model was trained in BF16. The smaller models, including BLOOM-560M, were all trained in FP16. See:
https://github.com/bigscience-workshop/Megatron-DeepSpeed/issues/343#issuecomment-1267299209
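For anyone wondering why the distinction matters, here is a rough sketch (not taken from the BLOOM codebase) of how the two 16-bit formats treat the same number. FP16 has 5 exponent bits and 10 mantissa bits, while BF16 has 8 exponent bits and 7 mantissa bits, so FP16 is more precise but overflows above 65504. The helper names `to_fp16` and `to_bf16` are made up for illustration, and the BF16 rounding is a simplified round-to-nearest-even that assumes finite, non-NaN inputs.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE 754 half precision (FP16) and convert back to float."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values beyond the FP16 range; real hardware gives inf.
        return float("inf") if x > 0 else float("-inf")


def to_bf16(x: float) -> float:
    """Round x to bfloat16 by keeping the top 16 bits of its float32 encoding.

    Simplified sketch: assumes x is finite and not NaN.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 16 bits that get dropped.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.001, 70000.0):
        print(value, "fp16:", to_fp16(value), "bf16:", to_bf16(value))
```

In this sketch a value just above 1 keeps more of its fraction in FP16, while a value above 65504 stays finite only in BF16. That trade-off is the practical difference to keep in mind when loading or fine-tuning a checkpoint in a dtype other than the one it was trained in.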

patrickramos

Oct 11, 2022

Thanks!

patrickramos changed discussion status to closed Oct 11, 2022
