Training vs Inference Footprint

The environmental cost of AI has two major components:

Training: The one-time (or periodic) process of creating the model. This involves massive compute over weeks or months, consuming significant energy.

Inference: The ongoing process of running the model to generate outputs. Each query consumes a small amount of energy, but queries happen billions of times per day across deployed systems.

Understanding which dominates matters for reasoning about AI’s environmental impact.

Training Costs

Training a large language model requires:

  • Thousands of GPUs running for weeks or months
  • Megawatts of power over that period
  • Cooling infrastructure for the hardware
  • Multiple training runs (failed experiments, hyperparameter search, safety iterations)
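
The ingredients above can be combined into an order-of-magnitude estimate. The sketch below uses entirely hypothetical values for GPU count, per-GPU power draw, run duration, and data-center overhead (PUE); none of these figures describe any real model.

```python
# Order-of-magnitude training-energy sketch.
# All parameter defaults are illustrative assumptions, not measured data.

def training_energy_gwh(num_gpus=10_000, gpu_power_w=700,
                        days=90, pue=1.2):
    """Total energy (GWh) for one training run: power x time x facility overhead."""
    hours = days * 24
    wh = num_gpus * gpu_power_w * hours * pue  # watt-hours
    return wh / 1e9                            # convert Wh -> GWh

print(f"~{training_energy_gwh():.1f} GWh for one run")
```

With these assumed inputs the sketch lands in the tens of gigawatt-hours for a single run; real totals would also multiply in failed experiments and hyperparameter searches, per the list above.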

Headlines often focus on training costs because they’re dramatic: “Training GPT-4 used as much energy as X households for Y years.”

But training is (mostly) a one-time cost. Once the model exists, it can serve billions of queries without retraining.

Inference Costs

Each query to a deployed model requires:

  • Loading context into GPU memory
  • Running a forward pass through the model
  • Generating tokens one by one
  • Network transmission

Individual queries are cheap. But scale changes everything:

  • Billions of queries per day across major providers
  • Each query has non-trivial compute cost
  • Query volume grows as AI becomes ubiquitous
  • Longer conversations and more complex tasks increase per-query cost
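
The per-query cost can be sketched from hardware parameters: generation time is roughly output tokens divided by throughput, and energy is power draw over that time, scaled by facility overhead. Every value below (GPU power, aggregate batched throughput, PUE, token count) is an assumption for illustration only.

```python
# Rough per-query inference-energy sketch.
# All constants are illustrative assumptions, not vendor figures.

GPU_POWER_W = 700           # assumed accelerator power draw
TOKENS_PER_SECOND = 5_000   # assumed aggregate throughput across a batch
PUE = 1.2                   # assumed data-center power usage effectiveness

def query_energy_wh(output_tokens, gpu_power_w=GPU_POWER_W,
                    tokens_per_s=TOKENS_PER_SECOND, pue=PUE):
    """Energy (Wh) to generate `output_tokens`, amortized across the batch."""
    seconds = output_tokens / tokens_per_s
    return gpu_power_w * seconds * pue / 3600  # joules -> watt-hours

print(f"{query_energy_wh(500):.4f} Wh for a 500-token response")
```

Under these assumptions a single response costs a small fraction of a watt-hour, which is why the longer conversations and larger outputs noted above matter: per-query cost scales roughly linearly with tokens generated.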

Which Dominates?

For a given model, the crossover point depends on:

  • Model size (larger models cost more to train and to serve, though they may accomplish more per query)
  • Query volume (more users = more inference)
  • Model lifespan (how long before the model is replaced)
  • Efficiency improvements (inference optimization reduces per-query cost)

As AI becomes mainstream, inference increasingly dominates. A model trained once and used by billions of people for years accumulates inference costs that dwarf training costs.
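
This crossover claim can be made concrete with a toy calculation: divide the one-time training energy by the daily inference energy to get the number of days until cumulative inference overtakes training. The training energy, per-query energy, and query volume below are hypothetical round numbers, not measurements of any deployed system.

```python
# Toy crossover estimate: days until cumulative inference energy
# exceeds one-time training energy. All inputs are assumptions.

TRAINING_ENERGY_WH = 20e9    # assume ~20 GWh for one training run
ENERGY_PER_QUERY_WH = 0.3    # assume 0.3 Wh per served query
QUERIES_PER_DAY = 1e9        # assume 1 billion queries per day

def crossover_days(training_wh, per_query_wh, queries_per_day):
    """Days of serving until cumulative inference energy passes training energy."""
    daily_inference_wh = per_query_wh * queries_per_day
    return training_wh / daily_inference_wh

days = crossover_days(TRAINING_ENERGY_WH, ENERGY_PER_QUERY_WH, QUERIES_PER_DAY)
print(f"Inference overtakes training after ~{days:.0f} days")
```

With these assumed inputs the crossover arrives in roughly two months; over a multi-year model lifespan at billion-query scale, cumulative inference energy exceeds the training cost many times over.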

The Efficiency Trap

Models are getting more efficient to run. This sounds good environmentally, but:

  • Efficiency enables wider deployment (more use cases, more users)
  • Total energy use can increase even as per-query cost decreases
  • This is a classic rebound effect (Jevons paradox)

Inference becoming cheaper doesn’t mean AI’s environmental footprint shrinks. It may mean it grows more sustainably — or it may mean it grows faster.
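
The rebound effect is easy to see numerically: if per-query energy falls by some fraction each year while query volume grows by a larger fraction, total energy still rises. The growth and efficiency rates below are illustrative assumptions, not forecasts.

```python
# Rebound-effect (Jevons paradox) sketch: efficiency improves,
# volume grows faster, total energy rises. Rates are assumptions.

def total_energy_gwh(year, base_per_query_wh=0.3, base_queries_per_day=1e9,
                     efficiency_gain=0.30, volume_growth=0.60):
    """Annual inference energy (GWh) after `year` years of compounding
    per-query efficiency gains and query-volume growth."""
    per_query = base_per_query_wh * (1 - efficiency_gain) ** year
    queries = base_queries_per_day * (1 + volume_growth) ** year
    return per_query * queries * 365 / 1e9

for year in range(4):
    print(f"year {year}: ~{total_energy_gwh(year):.0f} GWh")
```

Under these assumed rates, a 30% annual efficiency gain is outpaced by 60% annual volume growth, so annual energy use climbs every year even as each individual query gets cheaper.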

Accounting Challenges

Neither training nor inference costs are well-disclosed:

  • Providers rarely publish energy-consumption data
  • Training costs are confidential
  • Inference efficiency varies by query type
  • Renewable energy claims are hard to verify

Users cannot assess the environmental impact of their AI use because the data isn’t available.

Implications

  • Environmental accounting for AI must include both training and inference
  • As AI scales, inference may be the larger concern
  • Efficiency improvements don’t automatically reduce environmental impact
  • Transparency about energy consumption is lacking

Open Questions

  • At current deployment scales, which cost is larger?
  • How should efficiency improvements be weighed against increased use?
  • What would meaningful energy disclosure from providers look like?
  • Is there a sustainable level of AI inference?

See Also