Using time, tokens, and computing resources in a way that maximizes useful output.
Efficiency in AI systems means getting the required result with minimal wasted effort. Depending on the context, this can mean lower token usage, faster execution, reduced infrastructure cost, or fewer retries caused by unclear prompts.
Prompt efficiency is not only about making prompts shorter. It is about removing unnecessary complexity while keeping enough context and instruction to produce reliable answers.
In larger systems, efficiency becomes a product and engineering concern because prompt design directly affects latency, throughput, and cost.
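As a rough illustration of how prompt size feeds into cost at scale, the sketch below compares two prompt variants. The names `PromptProfile` and `daily_cost`, the token counts, the request volume, and the per-1,000-token prices are all hypothetical values chosen for the example, not figures from any real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptProfile:
    input_tokens: int   # tokens sent per request (instructions + context)
    output_tokens: int  # tokens generated per request


def daily_cost(profile: PromptProfile, requests_per_day: int,
               price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Estimate daily spend from token counts and per-1k-token prices."""
    per_request = (
        profile.input_tokens / 1000 * price_in_per_1k
        + profile.output_tokens / 1000 * price_out_per_1k
    )
    return per_request * requests_per_day


# Hypothetical numbers, for illustration only.
PRICE_IN = 0.001    # assumed price per 1k input tokens
PRICE_OUT = 0.002   # assumed price per 1k output tokens
REQUESTS = 100_000  # assumed requests per day

verbose = PromptProfile(input_tokens=2000, output_tokens=500)
trimmed = PromptProfile(input_tokens=1200, output_tokens=500)

verbose_cost = daily_cost(verbose, REQUESTS, PRICE_IN, PRICE_OUT)
trimmed_cost = daily_cost(trimmed, REQUESTS, PRICE_IN, PRICE_OUT)
saving = verbose_cost - trimmed_cost

print(f"verbose: {verbose_cost:.2f}  trimmed: {trimmed_cost:.2f}  saving: {saving:.2f}")
```

Under these assumed numbers, trimming 800 input tokens per request changes the daily total noticeably even though each individual request differs by a fraction of a cent. Latency and throughput would need to be measured separately; this sketch covers cost only.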
Efficiency can be improved through tighter prompts, better batching, reusable system instructions, structured outputs, and task decomposition that avoids repeated work.
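A minimal sketch of three of these techniques together (a reusable system instruction, batching, and avoiding repeated work through caching) might look like the following. Everything here is an assumption made for illustration: `SYSTEM_INSTRUCTION`, `chunked`, `build_batch_prompt`, `CachedClient`, and `fake_model` are hypothetical names, and `fake_model` is a stand-in for a real model call rather than any actual API.

```python
from typing import Callable, Dict, Iterable, List

# One shared instruction reused across every batch (hypothetical wording).
SYSTEM_INSTRUCTION = (
    "Classify each item as POSITIVE or NEGATIVE. Answer with one label per line."
)


def chunked(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_prompt(items: List[str]) -> str:
    """Combine several items into one prompt under the shared instruction."""
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(items, start=1))
    return f"{SYSTEM_INSTRUCTION}\n\n{numbered}"


class CachedClient:
    """Wraps a model call and skips it when the same prompt was already answered."""

    def __init__(self, call_model: Callable[[str], str]) -> None:
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self.calls = 0  # number of real (uncached) model calls

    def complete(self, prompt: str) -> str:
        if prompt not in self._cache:
            self.calls += 1
            self._cache[prompt] = self._call_model(prompt)
        return self._cache[prompt]


def fake_model(prompt: str) -> str:
    """Stand-in for a real model: returns one label per numbered item."""
    item_lines = prompt.split("\n\n", 1)[1].splitlines()
    return "\n".join("POSITIVE" for _ in item_lines)


client = CachedClient(fake_model)
reviews = ["great", "bad", "fine", "awful", "ok"]

results: List[str] = []
for batch in chunked(reviews, 2):
    results.extend(client.complete(build_batch_prompt(batch)).splitlines())

print(len(results), "labels from", client.calls, "model calls")
```

Here five items are handled in three calls instead of five, and resubmitting an identical batch is served from the cache without another call. In practice the batch size is a trade-off: larger batches save calls but can make outputs harder to parse reliably, which is where structured outputs help.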