Witchcraft makes inference a built-in operation. You declare the shape of the answer you need, and that shape constrains the model as it generates — so a malformed answer isn't rejected afterwards, it's unreachable. The model itself lives outside your code, swappable by a config file.
# urgency must be a number 0–10 divine r: Urgency from (msg) using reader
# no constraint — free generation divine r from (msg) using reader
The output changes, because the type is fed into generation token by token — it's part of the computation, not validation bolted on after. That's the line between a real primitive and a wrapper with nice syntax — and it's verified against real model weights.
Four ideas the compiler actually reasons about — not conventions you hope everyone follows.
Declare the answer as a type — a number in a range, a record, one of a fixed set of choices. The model is constrained to produce exactly that. No “it returned prose when I needed a number” bugs.
Your program names a need, never a model. Which model fills it is a one-line config file. Swap a tiny local model for a frontier one — the program gets smarter with zero code changes.
Every inference carries a confidence score and must clear a gate before you can act on it. Below the threshold, your fallback runs. Uncertain answers route to a human by construction, not by remembering to check.
Reaching the network, touching scoped data, escalating — every capability is declared in the source and enforced at compile time. A program that could phone a cloud model has to say so, where you can read it.
You need Rust. The default build ships a deterministic offline engine, so you can write and run programs with no model installed.
Describe the shape you want; ask a model to fill it. Save as mood.witch.
Type-check without running, then run against the offline engine. Point it at a real model when you're ready.
Your program never names a model — it names a need. A manifest binds that need to a real engine. Swap the manifest, keep the source.
[need.MoodReader] engine = "small" locality = "local" [engine.small] kind = "llama" gguf = "./models/qwen2.5-0.5b.gguf"
[need.MoodReader] engine = "big" locality = "local" [engine.big] kind = "llama" gguf = "./models/qwen2.5-7b.gguf"
Anywhere a model reads unstructured text and must return a clean, structured decision your code can trust: support routing, inbox triage, alert and log classification, moderation pre-filters, structured extraction from documents.
Build, install, run, and compile your first program. The practical front door.
Every construct explained from scratch — types, divine, memory, agents, capabilities.
Prebuilt binaries and tagged source. Grab the toolchain for your platform.
Working .witch programs — the triage flagship, the strict-network demo, and more.
“What would an AI-native programming language actually make primitive?” — the idea behind it.
The spec-driven design record — every proposal, decision, and guarantee, in the open.