Year | Parameters | Model | Cost of inference ($ per 1M tokens) | Context window
-----|------------|-------|-------------------------------------|---------------
2018 | 117M | GPT-1 | | 
2019 | 1.5B | GPT-2 | | 
2020 | 175B | GPT-3 | 50 | 
2023 | 1.8T | GPT-4 | 25 | 32k
 | | GPT-4o | 0.75 | 125k
 | | o1-preview | | 

Token cost (LLM inference cost)

The cost of LLM inference (input + output tokens) came down from ~$50 to ~$0.50 per 1M tokens in two years.

  • Pricing is charged on both input and output tokens
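The decline above can be sanity-checked with a little arithmetic. This sketch assumes a smooth exponential drop between the two endpoints quoted in the note ($50 to $0.50 over two years); the per-year factor is derived, not from the source.

```python
# Endpoints from the note: ~$50/1M tokens falling to ~$0.50/1M tokens in 2 years.
start_cost = 50.0
end_cost = 0.50
years = 2

# Total decline, and the implied annual decline factor assuming
# a smooth exponential trend between the two endpoints.
total_factor = start_cost / end_cost          # 100x cheaper overall
annual_factor = total_factor ** (1 / years)   # 10x cheaper per year

print(f"Total decline: {total_factor:.0f}x")            # Total decline: 100x
print(f"Implied annual decline: {annual_factor:.1f}x")  # Implied annual decline: 10.0x
```

In other words, the quoted endpoints imply roughly a 10x price drop per year.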

Facts

  • The average English speaker says ~10k words/day
  • That is roughly 3.65M words per year
  • At $1 per million tokens (treating a word as roughly one token), that's under $4/year

Having an LLM listen to everything you say for an entire year costs under $4.
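The back-of-envelope figure above can be checked directly. This sketch keeps the note's simplifying assumption of one token per word, which slightly understates real tokenizers (English runs closer to ~1.3 tokens per word).

```python
# Back-of-envelope cost of running a year of speech through an LLM.
WORDS_PER_DAY = 10_000          # average English speaker (from the note above)
DAYS_PER_YEAR = 365
PRICE_PER_M_TOKENS = 1.00       # dollars per 1M tokens

words_per_year = WORDS_PER_DAY * DAYS_PER_YEAR
# Simplifying assumption: 1 word ~= 1 token (real tokenizers give ~1.3 tokens/word).
tokens_per_year = words_per_year
yearly_cost = tokens_per_year / 1_000_000 * PRICE_PER_M_TOKENS

print(f"{words_per_year:,} words/year -> ${yearly_cost:.2f}/year")
# 3,650,000 words/year -> $3.65/year
```

With the more realistic ~1.3 tokens/word, the figure rises to roughly $4.75/year, still a few dollars.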

  • Benchmarks: TBD
  • Current capability: TBD
  • Missing pieces: TBD

Explained: The context window is the maximum amount of text, measured in tokens, that a model can take into account in a single pass. It is a property of the model (separate from its parameter count) and it bounds how much text the model can analyze at once.
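To make the definition concrete, here is a minimal sketch that estimates whether a piece of text fits a given context window. The ~4 characters per token figure is a common rough heuristic for English, not an exact tokenizer; the window sizes are the ones from the table above.

```python
# Rough check of whether text fits a model's context window.
# Assumption: ~4 characters per token, a common heuristic for English text;
# a real tokenizer would give exact counts.
CHARS_PER_TOKEN = 4

# Context windows from the table above, in tokens.
CONTEXT_WINDOWS = {"GPT-4": 32_000, "GPT-4o": 125_000}

def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits(text: str, model: str) -> bool:
    """True if the estimated token count fits within the model's window."""
    return estimate_tokens(text) <= CONTEXT_WINDOWS[model]

document = "word " * 50_000            # ~250k characters of filler text
print(estimate_tokens(document))       # 62500 estimated tokens
print(fits(document, "GPT-4"))         # False: too long for a 32k window
print(fits(document, "GPT-4o"))        # True: fits in a 125k window
```

The same document can overflow one model's window while fitting comfortably in another's, which is why the context window column matters alongside cost.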