Year | Parameters | Model | Inference cost ($ per 1M tokens) | Context window
---|---|---|---|---
2018 | 117M | GPT-1 | | 
2019 | 1.5B | GPT-2 | | 
2020 | 175B | GPT-3 | ~$50 | 
2023 | 1.8T | GPT-4 | ~$25 | 32k
 | | GPT-4o | ~$0.75 | 125k
 | | GPT o1-preview | | 
Token cost (LLM inference cost)
The cost of LLM inference came down from ~$50 to ~$0.50 per 1M tokens in about 2 years.
- Cost applies to both input and output tokens
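A quick sketch of how steep that decline is, using the figures from the note (~$50 to ~$0.50 over 2 years) — the exact endpoints are approximate:

```python
# Sketch: how fast did inference cost fall?
# Figures taken from the note above; they are rough, not exact.
start_cost = 50.0   # $ per 1M tokens (GPT-3 era)
end_cost = 0.50     # $ per 1M tokens (~2 years later)
years = 2

total_drop = start_cost / end_cost          # 100x cheaper overall
annual_drop = total_drop ** (1 / years)     # ~10x cheaper per year

print(f"Total drop: {total_drop:.0f}x, per year: {annual_drop:.1f}x")
```

That works out to roughly a 10x price drop per year over this window.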
Facts
- Average English speakers say ~10k words/day
- That is about 3.5M words per year
- At $1 per 1M tokens (treating one word as roughly one token), that's about $3.50/year
Having an LLM listen to everything you say for the entire year costs about $3.50.
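The back-of-the-envelope math above can be checked directly; the only assumption (implicit in the note) is that one word is roughly one token:

```python
# Sketch of the yearly-cost estimate above.
# Assumption: one word ~= one token, as the note implicitly does.
words_per_day = 10_000
days = 365
price_per_million_tokens = 1.00  # $

words_per_year = words_per_day * days  # 3.65M words, rounded to ~3.5M in the note
cost_per_year = words_per_year / 1_000_000 * price_per_million_tokens
print(f"{words_per_year / 1e6:.2f}M words/year -> ${cost_per_year:.2f}/year")
```

The unrounded figure is $3.65/year; the note rounds to ~$3.50.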
Benchmarks (tbd)
Current capability (tbd)
Missing pieces (tbd)
Explained: The context window is the maximum amount of text, measured in tokens, that a generative AI model can process in a single request.
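For intuition on what a context window holds, here is a crude estimate. The ~4 characters per token (so ~0.75 words per token) ratio is a common rule of thumb for English text, not something stated in the note:

```python
# Rough sketch: how much text fits in a context window?
# Assumption (not from the note): ~4 characters per token for English text,
# which works out to roughly 0.75 words per token.
def approx_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

context_window = 32_000  # tokens, e.g. GPT-4 in the table above
approx_words = int(context_window * 0.75)
print(f"A {context_window}-token window holds roughly {approx_words:,} words")
```

By this estimate, a 32k-token window holds on the order of 24,000 words — a short novel's worth of text per request.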