GitHub - PiotrNawrot/nanoT5: Fast & Simple repository for pre-training and fine-tuning T5-style models (github.com)
This repository comprises the code to reproduce the pre-training of a "Large Language Model" (T5) under a limited budget (1xA100 GPU, < 24 hours) in PyTorch. We start from the randomly initialised T5-base-v1.1 (248M parameters) model, and we pre-train it on the English subset of the C4 dataset and then fine-tune it on...
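As a rough illustration of that setup, here is a minimal sketch of starting from a randomly initialised T5-base-v1.1 and streaming the English subset of C4, assuming the Hugging Face `transformers` and `datasets` APIs; this is not the repository's actual training script, just the general idea.

```python
from transformers import T5Config, T5ForConditionalGeneration, AutoTokenizer
from datasets import load_dataset

# Build the model from its config so the weights are randomly initialised
# (~248M parameters), rather than loading pre-trained weights.
config = T5Config.from_pretrained("google/t5-v1_1-base")
model = T5ForConditionalGeneration(config)

tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-base")

# Stream the English subset of C4 instead of downloading the full corpus.
c4_en = load_dataset("allenai/c4", "en", split="train", streaming=True)
```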