Zetaphor@zemmy.cc to LocalLLaMA@sh.itjust.works · 2 years ago
Distilling step-by-step: Outperforming larger language models with less training data and smaller model sizes (blog.research.google)
Cross-posted to: generative_ai@mander.xyz, hackernews@lemmy.smeargle.fans, hackernews@derp.foo
Zetaphor@zemmy.cc (OP) · 2 years ago
The code is available here: https://github.com/google-research/distilling-step-by-step
noneabove1182@sh.itjust.works · 2 years ago
Somehow that makes this even more confusing, because that code hasn't been touched in three months. Maybe it just took them that long to validate? I'll have to read through it, thanks!
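For anyone skimming past the link: the method trains a small student model on two tasks at once, predicting the LLM teacher's label and generating its rationale, distinguished by an input prefix. Below is a minimal sketch of that multi-task objective, assuming a T5 backbone as in the repo; the `t5-small` checkpoint, the `[label]`/`[rationale]` prefixes, and the loss weight `lam` are illustrative stand-ins, not the exact choices in google-research/distilling-step-by-step.

```python
# Sketch of the distilling-step-by-step multi-task loss (not the repo's code).
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

question = "If there are 3 cars and each car has 4 wheels, how many wheels?"
label = "12"                                      # label distilled from the teacher LLM
rationale = "Each of 3 cars has 4 wheels, so 3 * 4 = 12."  # teacher's chain-of-thought
lam = 1.0  # assumed weight on the rationale-generation loss

def seq2seq_loss(prefix, target):
    # Standard T5 seq2seq loss: predict `target` given the prefixed input.
    inputs = tok(prefix + question, return_tensors="pt")
    target_ids = tok(target, return_tensors="pt").input_ids
    return model(**inputs, labels=target_ids).loss

# Two tasks share one student; the prefix tells it which output to produce.
loss = seq2seq_loss("[label] ", label) + lam * seq2seq_loss("[rationale] ", rationale)
loss.backward()
```

At inference time only the `[label]` prefix is used, so the rationale task adds no serving cost; it just provides extra supervision during training.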