The problem isn't the model. You're missing a mental model.
You tweak the prompt. Output changes unpredictably.
You switch models. Same inconsistency.
You add more instructions. It gets worse.
Three fundamental ideas help build intuition about LLMs:
1. Models are (almost) Functions
Think fn(input) = output, only more powerful. Different domains use different names for the same three parts.
- Search: query → algorithm → results.
- LLMs: prompt → model → response.
Same idea, different vocabulary.
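Here is a minimal sketch of that analogy. Both functions are hypothetical stand-ins (no real search engine or LLM is called); the point is only that they share the same shape: text in, result out.

```python
# Hypothetical toy functions: they only illustrate the fn(input) = output shape.

def search(query: str) -> list[str]:
    """query -> algorithm -> results"""
    corpus = ["functions in python", "state machines", "context windows"]
    # The "algorithm" here is a plain substring match.
    return [doc for doc in corpus if query.lower() in doc]


def model(prompt: str) -> str:
    """prompt -> model -> response"""
    # A real model is far more capable, but the call has the same shape.
    return f"response to: {prompt}"


results = search("state")
response = model("Explain state machines")
```

Swap the body of either function for the real thing and the calling code stays the same.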
2. Models are (mostly) Stateless
Each call starts fresh. For the same prompt and model, you get broadly the same response (sampling adds some variation). They don't remember you.
So to make them (more) useful, you make them stateful by adding context, state, and memory to the input:
input = prompt + context + state + memory
- Context: shapes the current response
- State: tracks what's happening across a session
- Memory: carries knowledge across conversations
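The formula above can be sketched in a few lines. Everything here is an assumption for illustration: `toy_model` is a hypothetical stand-in for an LLM call, and `Session` is just one possible way to hold state and memory outside the model.

```python
from dataclasses import dataclass, field


def toy_model(model_input: str) -> str:
    """Hypothetical stand-in for an LLM: a pure function of its input text."""
    if "name is Ada" in model_input:
        return "Hello, Ada."
    return "Hello. I don't know your name."


@dataclass
class Session:
    memory: list[str] = field(default_factory=list)  # carried across conversations
    state: list[str] = field(default_factory=list)   # turns in the current session

    def build_input(self, prompt: str, context: str = "") -> str:
        # input = prompt + context + state + memory
        parts = [context, *self.memory, *self.state, prompt]
        return "\n".join(p for p in parts if p)

    def ask(self, prompt: str, context: str = "") -> str:
        response = toy_model(self.build_input(prompt, context))
        # The model itself stores nothing; the session records the turn.
        self.state += [f"user: {prompt}", f"assistant: {response}"]
        return response


# Stateless: the bare model gives the same answer every time.
first = toy_model("Who am I?")
second = toy_model("Who am I?")

# Stateful: the same prompt, with memory added to the input.
session = Session(memory=["The user's name is Ada."])
remembered = session.ask("Who am I?")
```

The model never changes between the two cases. Only the input does.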
3. Behaviour comes (mainly) from Context
If the model stays the same, context is what shapes its behaviour (within the limits of the model's training).
- Schema: structured output
- Search: grounded answers (RAG)
- Tools: actions in the world (MCP etc.)
- Evals: knowing if any of it actually worked
Fancy names. Simple idea: change the context, change the behaviour.
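A rough sketch of that idea, under loud assumptions: `toy_model` is a hypothetical stand-in for a fixed model, and the `FACT:` and `FORMAT:` markers are made up for this example, not a real prompt convention.

```python
import json


def toy_model(context: str, prompt: str) -> str:
    """Hypothetical fixed model. A real model would read the prompt too;
    this toy only reacts to the made-up markers in its context."""
    answer = "unknown"
    for line in context.splitlines():
        if line.startswith("FACT: "):        # stands in for a search result (RAG)
            answer = line[len("FACT: "):]
    if "FORMAT: json" in context:            # stands in for an output schema
        return json.dumps({"answer": answer})
    return answer


prompt = "What is the capital of France?"

plain = toy_model("", prompt)                                # no context
grounded = toy_model("FACT: Paris", prompt)                  # search added
structured = toy_model("FACT: Paris\nFORMAT: json", prompt)  # search + schema
```

Same function, same prompt, three different behaviours. The only thing that changed is the context.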
You don't need to understand transformers to build with LLMs. You need to understand functions, state, and context.
That's the whole foundation.