Training vs Inference Footprint
The environmental cost of AI has two major components:
- Training: The one-time (or periodic) process of creating the model. This involves massive compute over weeks or months, consuming significant energy.
- Inference: The ongoing process of running the model to generate outputs. Each query consumes a small amount of energy, but queries happen billions of times per day.
Understanding which dominates matters for reasoning about AI’s environmental impact.
Training Costs
Training a large language model requires:
- Thousands of GPUs running for weeks or months
- Megawatts of power over that period
- Cooling infrastructure for the hardware
- Multiple training runs (failed experiments, hyperparameter search, safety iterations)
Headlines often focus on training costs because they’re dramatic: “Training GPT-4 used as much energy as X households for Y years.”
But training is (mostly) a one-time cost. Once the model exists, it can serve billions of queries without retraining.
Inference Costs
Each query to a deployed model requires:
- Loading context into GPU memory
- Running a forward pass through the model
- Generating tokens one by one
- Network transmission
Individual queries are cheap. But scale changes everything:
- Billions of queries per day across major providers
- Each query has non-trivial compute cost
- Query volume grows as AI becomes ubiquitous
- Longer conversations and more complex tasks increase per-query cost
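The per-query and aggregate costs described above can be roughed out from model size and hardware efficiency. A minimal sketch, in which every figure (model size, tokens per query, FLOPs per joule, query volume) is an illustrative assumption rather than measured provider data:

```python
# Rough per-query inference energy estimate.
# All figures below are illustrative assumptions, not measured data.

PARAMS = 70e9                  # assumed model size: 70B parameters
TOKENS_OUT = 500               # assumed tokens generated per query
FLOPS_PER_TOKEN = 2 * PARAMS   # ~2 FLOPs per parameter per generated token
GPU_FLOPS_PER_JOULE = 1e12     # assumed effective efficiency, incl. cooling overhead

def query_energy_joules(tokens: float = TOKENS_OUT) -> float:
    """Energy for one query, ignoring prefill and network transmission."""
    total_flops = tokens * FLOPS_PER_TOKEN
    return total_flops / GPU_FLOPS_PER_JOULE

per_query_wh = query_energy_joules() / 3600  # joules -> watt-hours
print(f"~{per_query_wh:.3f} Wh per query")   # tiny individually

# Scale changes the picture: multiply by an assumed daily query volume.
daily_queries = 1e9
daily_mwh = daily_queries * per_query_wh / 1e6  # Wh -> MWh
print(f"~{daily_mwh:.0f} MWh per day at 1B queries/day")
```

The point is not the specific numbers (which are guesses) but the structure: a cost that rounds to nothing per query becomes a utility-scale load when multiplied by billions of queries.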
Which Dominates?
For a given model, the crossover point depends on:
- Model size (larger models cost more to train and typically more per query, though one capable model may handle tasks that would otherwise take many calls)
- Query volume (more users = more inference)
- Model lifespan (how long before the model is replaced)
- Efficiency improvements (inference optimization reduces per-query cost)
As AI becomes mainstream, inference increasingly dominates. A model trained once and used by billions of people for years accumulates inference costs that dwarf training costs.
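The crossover argument can be made concrete with a back-of-the-envelope calculation. Every number below is an assumed placeholder (no provider discloses these figures), but the arithmetic shows how quickly cumulative inference can overtake a one-time training cost:

```python
# When does cumulative inference energy pass training energy?
# All numbers are assumed placeholders, not disclosed figures.

TRAINING_MWH = 50_000        # assumed one-time training energy, incl. failed runs
ENERGY_PER_QUERY_WH = 0.3    # assumed average per-query inference energy
QUERIES_PER_DAY = 1e9        # assumed daily query volume at mainstream scale

daily_inference_mwh = QUERIES_PER_DAY * ENERGY_PER_QUERY_WH / 1e6  # Wh -> MWh
crossover_days = TRAINING_MWH / daily_inference_mwh

print(f"Inference load: {daily_inference_mwh:.0f} MWh/day")
print(f"Cumulative inference passes training after ~{crossover_days:.0f} days")
```

Under these assumptions the crossover arrives in months, not years, and every day of deployment after that widens the gap. Lower query volumes push the crossover out; a multi-year model lifespan pulls the balance back toward inference regardless.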
The Efficiency Trap
Models are getting more efficient to run. This sounds good environmentally, but:
- Efficiency enables wider deployment (more use cases, more users)
- Total energy use can increase even as per-query cost decreases
- This is a classic rebound effect (Jevons paradox)
Inference becoming cheaper doesn’t mean AI’s environmental footprint shrinks. It may mean it grows more sustainably — or it may mean it grows faster.
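The rebound dynamic can be shown with two hypothetical factors: how much cheaper each query gets, and how much usage grows in response. Both numbers below are assumptions chosen only to illustrate the mechanism:

```python
# Jevons-paradox sketch: per-query efficiency improves 10x, but cheaper
# queries drive 25x more usage. Both factors are hypothetical.

baseline_energy_per_query = 1.0   # normalized units
baseline_queries = 1.0

efficiency_gain = 10.0   # assumed: each query now uses 1/10 the energy
demand_growth = 25.0     # assumed: query volume grows 25x as AI gets cheaper

old_total = baseline_energy_per_query * baseline_queries
new_total = (baseline_energy_per_query / efficiency_gain) * (baseline_queries * demand_growth)

print(f"Total energy changes by {new_total / old_total:.1f}x")  # 2.5x: MORE, not less
```

Whenever demand grows faster than efficiency improves (here 25x vs 10x), total consumption rises even as every individual query gets cheaper.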
Accounting Challenges
Neither training nor inference energy costs are well disclosed:
- Providers rarely publish energy consumption data
- Training costs are confidential
- Inference efficiency varies by query type
- Renewable energy claims are hard to verify
Users cannot assess the environmental impact of their AI use because the data isn’t available.
Implications
- Environmental accounting for AI must include both training and inference
- As AI scales, inference may be the larger concern
- Efficiency improvements don’t automatically reduce environmental impact
- Transparency about energy consumption is lacking
Open Questions
- At current deployment scales, which cost is larger?
- How should efficiency improvements be weighed against increased use?
- What would meaningful energy disclosure from providers look like?
- Is there a sustainable level of AI inference?
See Also
- Embodied Carbon — the pre-training environmental costs
- The One More Query Problem — the marginal reasoning trap for users
- Geographic Inequality of Compute — where the energy comes from matters
- The Irony of AI for Climate — the net impact question
- The Nuclear Renaissance Question — nuclear’s decade-long timeline matches training investment horizons, not inference demand curves
- Stranded Assets Risk — if the training-inference balance shifts dramatically, infrastructure sized for one may strand