Cost to Train GPT-3 Today
Several sources, including Lambda Labs and TechTalks, have tried to estimate the cost of training GPT-3. In a 2020 article, Lambda Labs estimated that a single training run of GPT-3 on Nvidia’s V100 GPUs would take 355 GPU-years and cost $4.6mm. Since models typically go through several training runs, the total cost could be a multiple of that figure.
Applying a similar framework but using Nvidia’s flagship A100 GPU, I arrived at $4.4mm for a single training run.
Assumptions are as follows, with a quick arithmetic check in code after the list:
Lambda’s estimate that training GPT-3 requires 3.14E23 FLOPs (total floating-point operations)
At FP32, an A100 can reach 19.5 TFLOPS; assuming FP16 doubles that throughput, we use 39 TFLOPS per GPU
Assume 8 GPUs are used, giving 640GB of combined memory and 312 TFLOPS of aggregate throughput
Use the average of Google’s on-demand GPU price and its spot price, resulting in $2.59 per GPU-hour
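To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python using the assumptions above. The constants come directly from the list; this is a rough estimate, not OpenAI’s actual accounting.

```python
# Back-of-the-envelope GPT-3 training time and cost, from the assumptions above.

TOTAL_FLOPS = 3.14e23        # total training compute in FLOPs (Lambda Labs' estimate)
TFLOPS_PER_GPU = 39e12       # assumed A100 FP16 throughput: 2x its 19.5 TFLOPS at FP32
NUM_GPUS = 8                 # a single 8x A100 node (640GB combined memory)
PRICE_PER_GPU_HOUR = 2.59    # blended Google on-demand/spot rate, in dollars

seconds = TOTAL_FLOPS / (TFLOPS_PER_GPU * NUM_GPUS)
days = seconds / 86400                    # wall-clock days on the 8-GPU node
gpu_hours = seconds / 3600 * NUM_GPUS     # total billable GPU-hours

print(f"{days:,.0f} days ({days / 365:.0f} years) on {NUM_GPUS} GPUs")
print(f"{gpu_hours:,.0f} GPU-hours at ${PRICE_PER_GPU_HOUR}/hr "
      f"-> ${gpu_hours * PRICE_PER_GPU_HOUR / 1e6:.1f}mm")
```

Note that the final dollar figure is sensitive to the hourly rate: the roughly 2.24mm GPU-hours implied by these assumptions cost about $5.8mm at $2.59 per GPU-hour, while the $4.4mm headline corresponds to an effective rate closer to $1.97 per GPU-hour.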
At 312 aggregate TFLOPS, a single run on 8 GPUs would have taken 11,648 days, or roughly 32 years. The compute alone would have cost about $4.4mm. Theoretically, OpenAI could run 1,000 GPUs in parallel and bring the training time down to just 93 days, or about three months.
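The scaling claim follows from the same arithmetic: under the idealized assumption that throughput scales linearly with GPU count, wall-clock time falls in proportion. A minimal sketch:

```python
def training_days(num_gpus, total_flops=3.14e23, tflops_per_gpu=39e12):
    """Idealized wall-clock training time in days, assuming perfect linear scaling."""
    return total_flops / (tflops_per_gpu * num_gpus) / 86400

print(f"{training_days(8):,.0f} days on 8 GPUs")          # ~11,648 days (~32 years)
print(f"{training_days(1_000):,.0f} days on 1,000 GPUs")   # ~93 days (~3 months)
```

In practice, multi-node training never scales perfectly because of communication overhead, so the 93-day figure is best read as a lower bound.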