client.messages.create(model, max_tokens, messages); the API is stateless, so you resend
the full conversation history every turn.
Key points
max_tokensis a safety cap, not a target length; generation stops there or at an end-of-sequence token (stop_reason).- System prompt (
system=) controls how Claude responds (tone/role), not what. - Temperature 0–1: low = deterministic/extraction, high = creative/varied.
- Streaming (
messages.stream()) renders chunk-by-chunk viacontent_block_deltafor UX;get_final_message()reassembles for storage. - Pipeline: tokenization → embedding → contextualization → generation.
Related
Sources
- 2026-06-28-claude-course
Compiled from
wiki/study/claude-course/Anthropic-API-Basics.md · git is the source of truth