Francis's Notes

Cramming: Training a Language Model on a Single GPU in One Day – Jonas Geiping and Tom Goldstein, University of Maryland, 2022

This repository contains code to replicate our research described in "Cramming: Training a Language Model on a Single GPU in One Day". We experiment with pretraining a BERT-type language model with limited compute, wondering "how bad can it really be?"

https://github.com/JonasGeiping/cramming

GitHub – JonasGeiping/cramming: Cramming the training of a (BERT-type) language model into limited compute.

Published on December 29, 2022, by Francis