Unusually High Performance

#12

by fathom - opened Aug 10, 2022

edited Aug 10, 2022

Is it just me, or is this 560M-parameter version of BLOOM qualitatively outperforming the 1, 3, and 7 billion parameter versions when it comes to instructional prompts? Has anyone else noticed this?

christopher

BigScience Workshop org Aug 13, 2022

@fathom Interesting; could you give a few examples where you noticed that?

julien-c

BigScience Workshop org Aug 23, 2022

Yes, interested as well.

Robo0890

Mar 24, 2023

Makes sense. It's possible the smaller model had to learn the actual skills involved in producing text because it didn't have enough capacity (parameters, in this case) to memorize its training data. Still curious; keep us updated.

christopher changed discussion status to closed Jun 30, 2024
