Training vs Inference Footprint

The environmental cost of AI has two major components:

Training: The one-time (or periodic) process of creating the model. This involves massive compute over weeks or months, consuming significant energy.

Inference: The ongoing process of running the model to generate outputs. Each query consumes a small amount of energy, but queries happen billions of times per day across deployed systems.

Understanding which dominates matters for reasoning about AI’s environmental impact.

Training Costs

Training a large language model requires:

  • Thousands of GPUs running for weeks or months
  • Megawatts of power over that period
  • Cooling infrastructure for the hardware
  • Multiple training runs (failed experiments, hyperparameter search, safety iterations)
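
The ingredients above can be combined into an order-of-magnitude estimate. The sketch below uses entirely hypothetical values for GPU count, per-GPU power draw, run duration, and data-center overhead (PUE); none of these figures describe any real model.

```python
# Order-of-magnitude training-energy sketch.
# All parameter defaults are illustrative assumptions, not measured data.

def training_energy_gwh(num_gpus=10_000, gpu_power_w=700,
                        days=90, pue=1.2):
    """Total energy (GWh) for one training run: power x time x facility overhead."""
    hours = days * 24
    wh = num_gpus * gpu_power_w * hours * pue  # watt-hours
    return wh / 1e9                            # convert Wh -> GWh

print(f"~{training_energy_gwh():.1f} GWh for one run")
```

With these assumed inputs the sketch lands in the tens of gigawatt-hours for a single run; real totals would also multiply in failed experiments and hyperparameter searches, per the list above.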

Headlines often focus on training costs because they’re dramatic: “Training GPT-4 used as much energy as X households for Y years.”

But training is (mostly) a one-time cost. Once the model exists, it can serve billions of queries without retraining.

Inference Costs

Each query to a deployed model requires:

  • Loading context into GPU memory
  • Running a forward pass through the model
  • Generating tokens one by one
  • Network transmission

Individual queries are cheap. But scale changes everything:

  • Billions of queries per day across major providers
  • Each query has non-trivial compute cost
  • Query volume grows as AI becomes ubiquitous
  • Longer conversations and more complex tasks increase per-query cost
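
The per-query cost can be sketched from hardware parameters: generation time is roughly output tokens divided by throughput, and energy is power draw over that time, scaled by facility overhead. Every value below (GPU power, aggregate batched throughput, PUE, token count) is an assumption for illustration only.

```python
# Rough per-query inference-energy sketch.
# All constants are illustrative assumptions, not vendor figures.

GPU_POWER_W = 700           # assumed accelerator power draw
TOKENS_PER_SECOND = 5_000   # assumed aggregate throughput across a batch
PUE = 1.2                   # assumed data-center power usage effectiveness

def query_energy_wh(output_tokens, gpu_power_w=GPU_POWER_W,
                    tokens_per_s=TOKENS_PER_SECOND, pue=PUE):
    """Energy (Wh) to generate `output_tokens`, amortized across the batch."""
    seconds = output_tokens / tokens_per_s
    return gpu_power_w * seconds * pue / 3600  # joules -> watt-hours

print(f"{query_energy_wh(500):.4f} Wh for a 500-token response")
```

Under these assumptions a single response costs a small fraction of a watt-hour, which is why the longer conversations and larger outputs noted above matter: per-query cost scales roughly linearly with tokens generated.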

Which Dominates?

For a given model, the crossover point depends on:

  • Model size (larger models cost more to train and to serve, though they may accomplish more per query)
  • Query volume (more users = more inference)
  • Model lifespan (how long before the model is replaced)
  • Efficiency improvements (inference optimization reduces per-query cost)

As AI becomes mainstream, inference increasingly dominates. A model trained once and used by billions of people for years accumulates inference costs that dwarf training costs.
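
This crossover claim can be made concrete with a toy calculation: divide the one-time training energy by the daily inference energy to get the number of days until cumulative inference overtakes training. The training energy, per-query energy, and query volume below are hypothetical round numbers, not measurements of any deployed system.

```python
# Toy crossover estimate: days until cumulative inference energy
# exceeds one-time training energy. All inputs are assumptions.

TRAINING_ENERGY_WH = 20e9    # assume ~20 GWh for one training run
ENERGY_PER_QUERY_WH = 0.3    # assume 0.3 Wh per served query
QUERIES_PER_DAY = 1e9        # assume 1 billion queries per day

def crossover_days(training_wh, per_query_wh, queries_per_day):
    """Days of serving until cumulative inference energy passes training energy."""
    daily_inference_wh = per_query_wh * queries_per_day
    return training_wh / daily_inference_wh

days = crossover_days(TRAINING_ENERGY_WH, ENERGY_PER_QUERY_WH, QUERIES_PER_DAY)
print(f"Inference overtakes training after ~{days:.0f} days")
```

With these assumed inputs the crossover arrives in roughly two months; over a multi-year model lifespan at billion-query scale, cumulative inference energy exceeds the training cost many times over.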

The Efficiency Trap

Models are getting more efficient to run. This sounds good environmentally, but:

  • Efficiency enables wider deployment (more use cases, more users)
  • Total energy use can increase even as per-query cost decreases
  • This is a classic rebound effect (Jevons paradox)

Inference becoming cheaper doesn’t mean AI’s environmental footprint shrinks. It may mean it grows more sustainably — or it may mean it grows faster.
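
The rebound effect is easy to see numerically: if per-query energy falls by some fraction each year while query volume grows by a larger fraction, total energy still rises. The growth and efficiency rates below are illustrative assumptions, not forecasts.

```python
# Rebound-effect (Jevons paradox) sketch: efficiency improves,
# volume grows faster, total energy rises. Rates are assumptions.

def total_energy_gwh(year, base_per_query_wh=0.3, base_queries_per_day=1e9,
                     efficiency_gain=0.30, volume_growth=0.60):
    """Annual inference energy (GWh) after `year` years of compounding
    per-query efficiency gains and query-volume growth."""
    per_query = base_per_query_wh * (1 - efficiency_gain) ** year
    queries = base_queries_per_day * (1 + volume_growth) ** year
    return per_query * queries * 365 / 1e9

for year in range(4):
    print(f"year {year}: ~{total_energy_gwh(year):.0f} GWh")
```

Under these assumed rates, a 30% annual efficiency gain is outpaced by 60% annual volume growth, so annual energy use climbs every year even as each individual query gets cheaper.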

Accounting Challenges

Neither training nor inference costs are well-disclosed:

  • Providers rarely publish energy-consumption data
  • Training costs are confidential
  • Inference efficiency varies by query type
  • Renewable energy claims are hard to verify

Users cannot assess the environmental impact of their AI use because the data isn’t available.

Implications

  • Environmental accounting for AI must include both training and inference
  • As AI scales, inference may be the larger concern
  • Efficiency improvements don’t automatically reduce environmental impact
  • Transparency about energy consumption is lacking

Open Questions

  • At current deployment scales, which cost is larger?
  • How should efficiency improvements be weighed against increased use?
  • What would meaningful energy disclosure from providers look like?
  • Is there a sustainable level of AI inference?

See Also