AILANG Effects & Stdlib in Action
See how AILANG's effect system, contracts, and stdlib modules work together
to create a trustworthy AI extraction pipeline. Every line below runs in this demo via WebAssembly —
from parsing Office documents with std/xml to validating AI output with contracts.
DocParse: Office Documents via std/xml
DOCX, PPTX, and XLSX files are ZIP archives containing XML. DocParse uses AILANG's
std/xml module — parseXml, findAll,
getText, getAttr — to extract document structure
directly in WebAssembly. No server, no heavy dependencies.
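To make the "ZIP archive containing XML" point concrete, here is a minimal Python sketch using only the standard library. It is not DocParse and not AILANG. The helpers `make_minimal_docx` and `extract_paragraphs` are hypothetical names, and the archive built here contains only `word/document.xml`, whereas a real DOCX carries several more parts.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML namespace used by the w: prefix in DOCX files
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def make_minimal_docx(paragraphs):
    """Build a stand-in archive holding only word/document.xml (hypothetical helper)."""
    body = "".join(
        f"<w:p><w:r><w:t>{escape(p)}</w:t></w:r></w:p>" for p in paragraphs
    )
    xml = f'<w:document xmlns:w="{W}"><w:body>{body}</w:body></w:document>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", xml)
    return buf.getvalue()


def extract_paragraphs(docx_bytes):
    """Open the ZIP, parse the main XML part, and return one string per w:p."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W}}}p"):
        # Concatenate every w:t text run inside the paragraph
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W}}}t")))
    return paragraphs


docx_bytes = make_minimal_docx(["Invoice", "Total: 42"])
paragraphs = extract_paragraphs(docx_bytes)
```

Everything happens in memory, which mirrors the "no server" claim: the only inputs are the bytes of the file.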
Eight AI-generated AILANG modules handle the hard parts: merged table cells,
text boxes, headers/footers, footnotes, track changes, and hyperlinks.
All are pure functions, and all are contract-verified. All 17 real-world test files parse successfully.
<w:tbl>
<w:tr>
<w:tc>
<w:tcPr>
<w:gridSpan w:val="2"/>
</w:tcPr>
<w:p><w:r><w:t>Merged Header</w:t></w:r></w:p>
</w:tc>
<w:tc>
<w:tcPr><w:vMerge/></w:tcPr>
<w:p/>
</w:tc>
</w:tr>
</w:tbl>
{
"type": "table",
"headers": [
{ "text": "Merged Header",
"colSpan": 2,
"merged": false },
{ "text": "",
"colSpan": 1,
"merged": true }
]
}
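The XML-to-JSON mapping above can be reproduced in a few lines. This is a Python sketch with `xml.etree.ElementTree`, offered as an illustration of the gridSpan and vMerge rules rather than the DocParse implementation. The function names `parse_table_cell` and `parse_header_row` are hypothetical, and the sample adds the namespace declaration that a standalone fragment needs.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

SAMPLE = f"""<w:tbl xmlns:w="{W}">
  <w:tr>
    <w:tc>
      <w:tcPr><w:gridSpan w:val="2"/></w:tcPr>
      <w:p><w:r><w:t>Merged Header</w:t></w:r></w:p>
    </w:tc>
    <w:tc>
      <w:tcPr><w:vMerge/></w:tcPr>
      <w:p/>
    </w:tc>
  </w:tr>
</w:tbl>"""


def parse_table_cell(tc):
    """Read text, horizontal span, and vertical-merge flag from one w:tc."""
    props = tc.find("w:tcPr", NS)
    span = props.find("w:gridSpan", NS) if props is not None else None
    merge = props.find("w:vMerge", NS) if props is not None else None
    text = "".join(t.text or "" for t in tc.iter(f"{{{W}}}t"))
    # Attributes are namespaced too, so w:val is looked up as {namespace}val
    col_span = int(span.get(f"{{{W}}}val")) if span is not None else 1
    return {"text": text, "colSpan": col_span, "merged": merge is not None}


def parse_header_row(tbl_xml):
    """Treat the first w:tr of the table as the header row."""
    root = ET.fromstring(tbl_xml)
    first_row = root.find("w:tr", NS)
    return [parse_table_cell(tc) for tc in first_row.findall("w:tc", NS)]


headers = parse_header_row(SAMPLE)
```

A cell without `w:gridSpan` falls back to a span of 1, the same default the AILANG version expresses with `getOrElse(span, 1)`.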
import std/xml (parseXml, findAll, findFirst, getText, getAttr)
import std/list (map, flatMap, filter, length as listLength)
-- Algebraic data type: every document becomes typed blocks
export type Block = TextBlock({text: string, style: string, level: int})
| TableBlock({rows: [[TableCell]], headers: [TableCell]})
| ImageBlock({data: string, description: string, mime: string})
| HeadingBlock({text: string, level: int})
| SectionBlock({kind: string, blocks: [Block]})
-- Merged cell handling: reads gridSpan + vMerge from XML
pure func parseTableCell(node: XmlNode) -> TableCell {
let props = findFirst(node, "w:tcPr")
let span = optMap(\p. getAttr(findFirst(p, "w:gridSpan"), "w:val"), props)
let merge = optMap(\p. findFirst(p, "w:vMerge"), props)
{ text: getText(node), colSpan: getOrElse(span, 1),
rowSpan: 1, merged: isSome(merge) }
}
-- Contract: filters never grow the list
pure func filterHeadings(blocks: [Block]) -> [Block]
ensures { listLength(result) <= listLength(blocks) }
{ filter(isHeading, blocks) }
The AI Effect: ! {AI}
AILANG tracks side effects in the type system. The AI effect marks functions
that call an external AI oracle. The host grants this capability — in the browser,
a JavaScript callback calls Gemini Flash. In production, ailang run --ai gemini-2-5-flash --caps AI
handles it natively.
import std/ai (call)
-- Effectful: calls AI oracle for field extraction
-- The ! {AI} annotation declares this function has the AI effect
func extractFields(document: string) -> string ! {AI} {
let prompt = "Extract these fields as JSON..." ++ document
call(prompt) -- std/ai.call invokes host AI handler
}
-- Main pipeline: effectful extraction + pure validation
export func processDocument(doc: string) -> string ! {AI} {
let raw = extractFields(doc)
validateOnly(raw) -- pure validation, no effects
}
-- Pure fallback: validate pre-extracted data
export pure func validateOnly(json: string) -> string = ...
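For readers unfamiliar with effect systems, the split between an effectful step and a pure step can be approximated in Python by passing the AI handler in as an argument. This sketch is an analogy, not AILANG semantics: Python does not enforce purity, `fake_ai` is a hypothetical stand-in for the host's handler, and the body of `validate_only` is invented for illustration because the original elides it.

```python
import json


def validate_only(raw):
    """Pure step: depends only on its input string, calls nothing external."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as e:
        return json.dumps({"valid": False, "error": f"Invalid JSON: {e.msg}"})
    if not isinstance(obj, dict) or "vendor" not in obj:
        return json.dumps({"valid": False, "error": "Missing fields"})
    return json.dumps({"valid": True, "vendor": obj["vendor"]})


def extract_fields(document, ai_call):
    """Effectful step: the only place the host-provided handler is invoked."""
    prompt = "Extract these fields as JSON..." + document
    return ai_call(prompt)


def process_document(doc, ai_call):
    """Pipeline: effectful extraction followed by pure validation."""
    return validate_only(extract_fields(doc, ai_call))


def fake_ai(prompt):
    """Hypothetical handler standing in for a real model call."""
    return '{"vendor": "Acme"}'


result = json.loads(process_document("Invoice from Acme", fake_ai))
```

Because the handler is a parameter, a function that never receives it cannot call the AI, which is the property the `! {AI}` annotation makes visible in the type.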
Contracts: requires / ensures
Preconditions and postconditions are declared in the function signature. AILANG checks preconditions at call time, so invalid inputs are caught before any computation happens, and checks postconditions against the result. No defensive checks scattered through code.
-- Contract: non-empty input, non-empty output
pure func validateExtraction(jsonString: string) -> string
requires { length(trim(jsonString)) > 0 }
ensures { result != "" }
{
match decode(jsonString) {
Err(e) => encodeError("Invalid JSON: " ++ e),
Ok(obj) =>
match parseRecord(obj) {
None => encodeError("Missing fields"),
Some(r) =>
match validateFields(r) {
Some(err) => encodeError(err),
None => encodeResult(r)
}
}
}
}
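The requires/ensures pattern can be imitated at runtime with a decorator. The Python sketch below is only an approximation of the idea: the `contract` decorator and `ContractError` are hypothetical names, and nothing here reflects how AILANG implements or verifies contracts.

```python
import functools


class ContractError(Exception):
    """Raised when a precondition or postcondition does not hold."""


def contract(requires=None, ensures=None):
    """Wrap a function with an optional precondition and postcondition."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            # Precondition runs first, so the body never sees invalid input
            if requires is not None and not requires(*args):
                raise ContractError(f"precondition failed for {fn.__name__}")
            result = fn(*args)
            # Postcondition sees the result followed by the original arguments
            if ensures is not None and not ensures(result, *args):
                raise ContractError(f"postcondition failed for {fn.__name__}")
            return result
        return wrapper
    return decorate


calls = []  # records which inputs actually reached the function body


@contract(
    requires=lambda s: len(s.strip()) > 0,
    ensures=lambda result, s: result != "",
)
def validate_extraction(json_string):
    calls.append(json_string)
    return json_string.strip()


ok = validate_extraction("  {}  ")
```

Calling `validate_extraction("   ")` raises `ContractError` before the body runs, so `calls` records only the valid input.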
std/json: Type-Safe JSON
AILANG's JSON library returns Option types for every access,
forcing you to handle missing fields. No undefined,
no null surprises.
import std/json (Json, decode, encode, getString, getInt, jo, kv, js, jb)
-- Parse a record from JSON: every field access returns Option
pure func parseRecord(j: Json) -> Option[MyRecord] =
match getString(j, "vendor") {
None => None, -- field missing? return None
Some(vendor) =>
match getInt(j, "total_cents") {
None => None,
Some(total) =>
Some({ vendor: vendor, total_cents: total })
}
}
-- Encode result as JSON
pure func encodeResult(r: MyRecord) -> string =
encode(jo([kv("valid", jb(true)), kv("vendor", js(r.vendor))]))
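The same "every access may be missing" discipline can be sketched in Python with `Optional` return values. This is an analogy only: Python does not force the caller to check for `None` the way an `Option` type with pattern matching does. The `Record` type and the accessor names are hypothetical.

```python
from typing import Any, NamedTuple, Optional


class Record(NamedTuple):
    vendor: str
    total_cents: int


def get_string(obj: Any, key: str) -> Optional[str]:
    """Return the value only if obj is a dict and the field is a string."""
    value = obj.get(key) if isinstance(obj, dict) else None
    return value if isinstance(value, str) else None


def get_int(obj: Any, key: str) -> Optional[int]:
    """Return the value only if the field is an integer (bools are rejected)."""
    value = obj.get(key) if isinstance(obj, dict) else None
    if isinstance(value, bool) or not isinstance(value, int):
        return None
    return value


def parse_record(obj: Any) -> Optional[Record]:
    """Succeed only when every required field is present with the right type."""
    vendor = get_string(obj, "vendor")
    total = get_int(obj, "total_cents")
    if vendor is None or total is None:
        return None
    return Record(vendor, total)


record = parse_record({"vendor": "Acme", "total_cents": 1250})
```

A wrongly typed field is treated the same as a missing one, so the caller only ever sees a complete record or nothing.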
std/option & std/result: No Null, No Exceptions
Option[a] is Some(value) or None.
Result[a, e] is Ok(value) or Err(error).
Pattern matching forces you to handle both cases. No forgotten null checks.
import std/option (Option, Some, None, getOrElse)
import std/result (Result, Ok, Err)
-- Every JSON decode returns Result: you MUST handle both
match decode(jsonString) {
Err(e) => -- parse failed, e is the error message
encodeError("Invalid JSON: " ++ e),
Ok(obj) => -- parse succeeded, obj is the JSON value
processObj(obj)
}
-- Optional fields with defaults
let discount = getOrElse(getInt(j, "discount"), 0)
-- ^ If "discount" is missing, defaults to 0. No NaN. No undefined.
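As a rough analogy for `Result` and `getOrElse`, the Python sketch below models `Ok` and `Err` as small dataclasses and uses `None` in place of `None`-the-constructor. The names `decode`, `get_or_else`, and `describe` are hypothetical here and only borrow their spelling from the AILANG snippet.

```python
import json
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Ok:
    value: Any


@dataclass(frozen=True)
class Err:
    error: str


Result = Union[Ok, Err]


def decode(text: str) -> Result:
    """Turn a parse failure into a value instead of an exception."""
    try:
        return Ok(json.loads(text))
    except json.JSONDecodeError as e:
        return Err(e.msg)


def get_or_else(option, default):
    """Return the default when the optional value is absent."""
    return default if option is None else option


def describe(text: str) -> str:
    """Handle both Result cases explicitly, then default a missing field."""
    result = decode(text)
    if isinstance(result, Err):
        return "Invalid JSON: " + result.error
    obj = result.value if isinstance(result.value, dict) else {}
    return f"discount={get_or_else(obj.get('discount'), 0)}"


summary = describe('{"discount": 5}')
```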
Capability-Based Security
AILANG uses deny-by-default capabilities. Code can only perform effects that are explicitly granted by the host. The WASM sandbox has no network or filesystem access — the AI handler is injected by JavaScript.
-- CLI: explicitly grant capabilities
-- $ ailang run --ai gemini-2-5-flash --caps AI,IO module.ail
-- Effect budgets: limit operations for cost control
func pipeline() -> string ! {AI @limit=10, IO @limit=50} {
-- At most 10 AI calls, 50 IO operations
-- Budget exceeded? Runtime error with clear message
...
}
-- In the browser (WASM):
-- No FS, no Net, no IO by default
-- AI effect only works if host registers a handler:
-- ailangSetAIHandler(jsCallback)
-- This is the capability model in action:
-- the language controls the pipeline,
-- the host provides capabilities
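Deny-by-default capabilities with budgets can be pictured as a host object that tracks what each effect may still do. The Python sketch below is a toy model under that assumption. The `Host` class, its methods, and `CapabilityError` are hypothetical and do not describe the AILANG runtime.

```python
class CapabilityError(Exception):
    """Raised when an effect is not granted or its budget is used up."""


class Host:
    """Tracks remaining budget per effect; anything not listed is denied."""

    def __init__(self, budgets):
        self._remaining = dict(budgets)

    def use(self, effect):
        """Consume one unit of an effect's budget or fail loudly."""
        if effect not in self._remaining:
            raise CapabilityError(f"effect {effect} not granted")
        if self._remaining[effect] <= 0:
            raise CapabilityError(f"budget exceeded for {effect}")
        self._remaining[effect] -= 1

    def remaining(self, effect):
        """Report how many uses are left (0 for effects never granted)."""
        return self._remaining.get(effect, 0)


host = Host({"AI": 2})  # grant AI with a budget of 2; IO, FS, Net stay denied
host.use("AI")
host.use("AI")
left = host.remaining("AI")
```

A third `host.use("AI")` raises with a budget message, and `host.use("Net")` raises because that effect was never granted.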
Language Reference
— Why AILANG
— ailang docs std/ai
How It Works
Upload document
Paste text, or upload PDFs, images, or Office documents (DOCX, PPTX, XLSX)
DocParse AILANG
Office files are parsed locally via DocParse — AILANG's std/xml extracts text from ZIP+XML archives
Define schema
Specify fields to extract — AILANG generates typed validation code
AI extracts ! {AI}
Gemini Flash extracts structured data via AILANG's AI effect system
AILANG validates contracts
Contracts, types, and std/json validate the AI output before it is returned
Try AILANG in Your Project
AILANG is open source and AI-native. Build reliable applications where AI handles implementation and AILANG guarantees correctness.