MCP Server

Document parsing as a tool for any AI agent.

AI agents: fetch llms.txt or llms-full.txt for a single-request reference covering the hosted MCP endpoint, device auth flow, and all 7 tools.

What is MCP?

The Model Context Protocol (MCP) is an open standard for connecting AI assistants to external tools and data sources. Instead of hardcoding integrations, agents discover what tools are available, what parameters they accept, and how to call them — all at runtime.

AILANG Parse ships as a native MCP server with 7 tools: 4 always available (parse, convert, formats, estimate) and 3 for the hosted API (auth, auth poll, account). Claude, Cursor, VS Code, and any MCP-compatible agent can discover and call them. Office formats are parsed deterministically; PDFs and images use pluggable AI.

The server supports two transports: stdio (for Claude Desktop, Cursor, and VS Code) and HTTP (Streamable HTTP at /mcp/, for hosted deployments and clients that connect over HTTP). Local mode exposes 4 tools; hosted mode exposes all 7, including device auth and account management.

Quick Start

Local — stdio (recommended)

For Claude Desktop, Cursor, and VS Code. The agent launches the server automatically:

# No manual server start needed — the MCP client launches this:
ailang serve-api --mcp --routes-only --caps IO,FS,Env docparse/

Local — HTTP

For MCP clients that connect over HTTP:

# Start with MCP HTTP endpoint at /mcp/
ailang serve-api --mcp-http --routes-only --caps IO,FS,Env --port 8080 docparse/

# Test it
curl -X POST http://localhost:8080/mcp/ \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}'
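The same handshake can be prepared from a script. What follows is a minimal sketch using only the Python standard library; the helper names initialize_payload and jsonrpc_request are illustrative and not part of AILANG Parse. It mirrors the curl call above and optionally adds the bearer header that the hosted endpoint expects.

```python
import json
import urllib.request


def initialize_payload(client_name="test", client_version="1.0"):
    """Build the JSON-RPC initialize message used in the curl example."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }


def jsonrpc_request(url, payload, api_key=None):
    """Prepare (but do not send) a POST request carrying a JSON-RPC payload."""
    headers = {"Content-Type": "application/json"}
    if api_key:  # only the hosted endpoint needs a key
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = jsonrpc_request("http://localhost:8080/mcp/", initialize_payload())
# To send it against a running server:
#   with urllib.request.urlopen(request, timeout=10) as resp:
#       print(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before anything touches the network.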

Hosted

No local install required. Connect to the hosted endpoint (API key required):

curl -X POST https://docparse.ailang.sunholo.com/mcp/ \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dp_your_api_key" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{...}}'

With AI parsing

To parse PDFs, images, audio, or video, add AI capabilities:

# Add AI model for non-Office formats
ailang serve-api --mcp --routes-only \
  --caps IO,FS,Env,AI,Net --ai gemini-2.5-flash docparse/

Available Tools

AILANG Parse exposes 7 tools via MCP. Agents discover these automatically from tools/list:

Core Tools (local + hosted)

| Tool | Description | Parameters |
|---|---|---|
| mcpParse | Parse any document into structured blocks, Markdown, or HTML. Office formats are deterministic; PDF/images use AI. In hosted mode, pass an apiKey. | filepath (path or sample_id), outputFormat (blocks, markdown, or html), apiKey (hosted), requestId (replay) |
| mcpConvert | Convert a document from one format to another. Parses input to blocks, then generates the target format. | input (path or sample_id), outputFormat (docx, pptx, xlsx, odt, odp, ods, html, md, qmd), outputPath (optional), apiKey (hosted) |
| mcpFormats | List all 17 input and 9 output formats with features, 26 test samples, pricing tiers, and full service capabilities. Call this first to discover what AILANG Parse can do. | None |
| mcpEstimate | Estimate cost and latency before parsing. Returns whether AI is required, estimated time, and quota impact. Use to advise humans before consuming quota. | filepath (path or sample_id), outputFormat |

Hosted-Only Tools (API with billing)

These tools are available when connecting to the hosted API at docparse.ailang.sunholo.com. They enable agents to manage authentication, check quotas, and advise humans on pricing.

| Tool | Description | Parameters |
|---|---|---|
| mcpAuth | Start RFC 8628 device authorization. Returns a verification URL and user code. Tell the human to open the URL and sign in, then call mcpAuthPoll. | label (e.g. "claude-desktop") |
| mcpAuthPoll | Poll for device auth completion. Returns pending (keep polling every 5s), approved (with API key and tier info), or expired. | deviceCode (from mcpAuth) |
| mcpAccount | View account: tier, quota, usage, keys, pricing, and parse history. Pass action="pricing" without auth to see tiers and advise humans on signup. | apiKey, action (status, pricing, usage, history, or keys) |

Each tool has named parameters with JSON Schema types. Agents receive full schemas via tools/list — no hardcoding required.
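To illustrate what schema-driven calling can look like, the sketch below checks an argument dict against the required list of a tool's JSON Schema and then builds a standard MCP tools/call request. The MCP_PARSE definition is a hypothetical stand-in for what tools/list might return for mcpParse; the real schema may differ, and the helper names are illustrative.

```python
def missing_required(tool, arguments):
    """Return required parameter names that are absent from arguments."""
    schema = tool.get("inputSchema", {})
    return [name for name in schema.get("required", []) if name not in arguments]


def tool_call(tool, arguments, request_id):
    """Build an MCP tools/call request after a minimal required-field check."""
    missing = missing_required(tool, arguments)
    if missing:
        raise ValueError(f"missing parameters for {tool['name']}: {missing}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool["name"], "arguments": arguments},
    }


# Hypothetical tool definition, shaped like one entry from tools/list.
MCP_PARSE = {
    "name": "mcpParse",
    "inputSchema": {
        "type": "object",
        "properties": {
            "filepath": {"type": "string"},
            "outputFormat": {"type": "string"},
        },
        "required": ["filepath"],  # assumption: only filepath is mandatory
    },
}

request = tool_call(
    MCP_PARSE, {"filepath": "report.docx", "outputFormat": "markdown"}, request_id=2
)
```

A full client would validate types and enums as well; checking only required keys keeps the example short.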

Agent Auth Flow

When an agent connects to the hosted MCP server and tries to parse without an API key:

  1. Agent calls mcpParse → receives AUTH_REQUIRED error with suggested_fix
  2. Agent calls mcpAuth(label: "claude-desktop") → receives verification URL + user code
  3. Agent tells the human: "Open this URL and sign in. Your code is WXYZ-5678."
  4. Agent calls mcpAuthPoll(deviceCode) every 5s until the status is approved → receives api_key + tier
  5. Agent includes apiKey in all subsequent parse/convert calls
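The steps above can be sketched as a small polling loop. This is an illustration under stated assumptions, not the client's actual implementation: call_tool stands for whatever function your MCP client uses to invoke a tool, and the response keys (verification_url, user_code, device_code, status) are hypothetical names chosen to match the description, so check the real schemas from tools/list.

```python
import time


def device_auth(call_tool, label, notify, interval=5, max_polls=60, sleep=time.sleep):
    """Run the device flow and return the approved response, or raise.

    call_tool(name, arguments) -> dict is supplied by the MCP client.
    Response key names are assumptions, not the documented schema.
    """
    start = call_tool("mcpAuth", {"label": label})
    notify(
        f"Open {start['verification_url']} and sign in. "
        f"Your code is {start['user_code']}."
    )
    for _ in range(max_polls):
        result = call_tool("mcpAuthPoll", {"deviceCode": start["device_code"]})
        status = result.get("status")
        if status == "approved":
            return result  # assumed to carry api_key and tier
        if status == "expired":
            raise TimeoutError("device code expired; call mcpAuth again")
        sleep(interval)  # pending: wait before polling again
    raise TimeoutError("gave up waiting for approval")
```

Passing sleep and notify as parameters keeps the loop testable without real delays and lets the agent decide how to reach the human.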

Install from the MCP Registry

AILANG Parse is listed in the official MCP Registry as io.github.sunholo-data/parse. Registry-aware clients can discover and install it without you copying any JSON snippets.

MCP Registry: io.github.sunholo-data/parse

How to install via the registry

  • Claude Code: claude mcp add io.github.sunholo-data/parse — resolves the entry, picks the best available transport (PyPI / npm / hosted HTTP), and writes the config for you.
  • Cursor / VS Code with MCP extension: open the MCP server browser and search for parse or sunholo. Click install — the client populates .cursor/mcp.json or .vscode/settings.json automatically.
  • Any registry-aware client: point it at io.github.sunholo-data/parse. The listing exposes both packages (npm + PyPI stdio bridges) and a remote (hosted Streamable HTTP at docparse.ailang.sunholo.com/mcp/); the client picks whichever it supports.

Inspect the live entry directly:

curl -s "https://registry.modelcontextprotocol.io/v0/servers?search=parse" \
  | jq '[.servers[] | select(.server.name == "io.github.sunholo-data/parse")]'

If your client doesn't browse the registry yet, use the manual configuration snippets below; they are equivalent. The registry path just saves you a copy and paste.

Manual Configuration

Claude Desktop (hosted — recommended)

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows). Pick whichever runtime you have installed — all three SDKs ship the same stdio bridge.

Node.js ≥ 18:

{
  "mcpServers": {
    "ailang-parse": {
      "command": "npx",
      "args": ["-y", "@ailang/parse", "mcp"]
    }
  }
}

Python ≥ 3.8 (via uv):

{
  "mcpServers": {
    "ailang-parse": {
      "command": "uvx",
      "args": ["ailang-parse", "mcp"]
    }
  }
}

Go (install once, run from PATH):

go install github.com/sunholo-data/ailang-parse-go/cmd/ailang-parse@latest

Then add the server entry:

{
  "mcpServers": {
    "ailang-parse": {
      "command": "ailang-parse",
      "args": ["mcp"]
    }
  }
}

All three run the same stdio MCP bridge to the hosted API. No API key needed upfront — the agent handles device auth automatically.
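If the config file already lists other MCP servers, the new entry needs to be merged in rather than pasted over them. The sketch below shows one way to do that with the Python standard library; add_server and update_config_file are hypothetical helper names, and the file layout is assumed to match the snippets above.

```python
import json
from pathlib import Path


def add_server(config, name, command, args):
    """Return a copy of config with one mcpServers entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path, name, command, args):
    """Read, merge, and rewrite a claude_desktop_config.json-style file."""
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = add_server(config, name, command, args)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

For example, update_config_file(config_path, "ailang-parse", "npx", ["-y", "@ailang/parse", "mcp"]) would add the Node.js entry while leaving any other servers and settings in place.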

Claude Desktop (local)

For local-only parsing without the hosted API, add to claude_desktop_config.json:

{
  "mcpServers": {
    "ailang-parse": {
      "command": "ailang",
      "args": ["serve-api", "--mcp", "--routes-only", "--caps", "IO,FS,Env", "docparse/"],
      "cwd": "/path/to/ailang-parse"
    }
  }
}

Claude Code

Install the plugin for automatic MCP registration:

claude install github:sunholo-data/docparse-skill

Or add to .mcp.json (project or global):

{
  "mcpServers": {
    "ailang-parse": {
      "url": "https://docparse.ailang.sunholo.com/mcp/"
    }
  }
}

For local-only mode, use the command form instead:

{
  "mcpServers": {
    "ailang-parse": {
      "command": "ailang",
      "args": ["serve-api", "--mcp", "--routes-only", "--caps", "IO,FS,Env", "docparse/"],
      "cwd": "/path/to/ailang-parse"
    }
  }
}

Cursor

Add to .cursor/mcp.json in your project root:

{
  "mcpServers": {
    "ailang-parse": {
      "command": "ailang",
      "args": ["serve-api", "--mcp", "--routes-only", "--caps", "IO,FS,Env", "docparse/"],
      "cwd": "/path/to/ailang-parse"
    }
  }
}

VS Code

Add to .vscode/settings.json:

{
  "mcp": {
    "servers": {
      "ailang-parse": {
        "command": "ailang",
        "args": ["serve-api", "--mcp", "--routes-only", "--caps", "IO,FS,Env", "docparse/"],
        "cwd": "/path/to/ailang-parse"
      }
    }
  }
}

Tool Discovery

AILANG Parse follows an agent-first design and offers three discovery mechanisms:

MCP tools/list

MCP clients call tools/list and receive all 7 tools (or 4 in local mode) with full JSON Schema input definitions. This is automatic — no configuration needed once the server is connected.
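For readers writing their own client, the exchange looks roughly like the sketch below: a tools/list request, followed by indexing the returned tools by name. The result.tools array with name and inputSchema fields follows the MCP specification; the sample response is hand-written for local mode, with schemas left empty as placeholders rather than copied from the real server.

```python
def tools_list_request(request_id):
    """Build the JSON-RPC message that asks a server for its tools."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def index_tools(response):
    """Map tool name to inputSchema from a tools/list response."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "tools/list failed"))
    return {
        tool["name"]: tool.get("inputSchema", {})
        for tool in response["result"]["tools"]
    }


# Hand-written sample for local mode; the empty schemas are placeholders.
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {"name": "mcpParse", "inputSchema": {}},
            {"name": "mcpConvert", "inputSchema": {}},
            {"name": "mcpFormats", "inputSchema": {}},
            {"name": "mcpEstimate", "inputSchema": {}},
        ]
    },
}

tools = index_tools(sample_response)
```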

mcpFormats (recommended first call)

Agents should call mcpFormats first. It returns the full service contract: 17 input formats, 9 output formats, 26 test samples, 3 pricing tiers, and all 7 tool names. This gives the agent everything it needs to advise the human intelligently.

REST /api/v1/tools

Non-MCP clients can fetch tool definitions via REST:

curl -s https://docparse.ailang.sunholo.com/api/v1/tools | jq '.tools | length'
# 7

curl -s https://docparse.ailang.sunholo.com/api/v1/tools | jq '.tools[0].name'
# "mcpParse"

With Claude Code

Two options for document parsing in Claude Code:

Option 1: MCP Server (recommended)

Add the Claude Code configuration from the Manual Configuration section above to .mcp.json. Claude discovers the tools automatically and uses them when you ask it to parse or convert documents.

Option 2: Claude Code Skill

Install the AILANG Parse skill for zero-config local parsing without running a separate server. See the Claude Code integration page.

Which to choose? Use the MCP server if you need document parsing across multiple agents or want a shared service. Use the Claude Code skill if you only use Claude Code and want zero-config local parsing.

With Other Agents

Any MCP Client

Any client that speaks MCP can connect via stdio or HTTP. The server advertises its tools, resources, and capabilities through the protocol.

REST API

Non-MCP agents can call the same tools via the REST API:

# Python: parse with the hosted REST API
import requests

response = requests.post(
    "https://docparse.ailang.sunholo.com/api/v1/parse",
    json={"filepath": "report.docx", "outputFormat": "markdown", "apiKey": "dp_YOUR_KEY"},
    timeout=60,  # avoid hanging indefinitely on a stalled connection
)
response.raise_for_status()  # surface HTTP errors instead of a KeyError below
markdown = response.json()["result"]

# Or use the Python SDK
from ailang_parse import DocParse

client = DocParse(api_key="dp_YOUR_KEY")
result = client.parse("report.docx", output_format="markdown")

LangChain / LlamaIndex

Register AILANG Parse as a tool in your agent chain:

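import requests
from langchain.tools import StructuredTool


def parse_document(filepath: str, output_format: str = "blocks"):
    """Parse a document via the hosted AILANG Parse REST API."""
    response = requests.post(
        "https://docparse.ailang.sunholo.com/api/v1/parse",
        json={"filepath": filepath, "outputFormat": output_format, "apiKey": "dp_YOUR_KEY"},
        timeout=60,
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()["result"]


# Type hints on parse_document let LangChain infer the tool's argument schema.
ailang_parse = StructuredTool.from_function(
    func=parse_document,
    name="ailang_parse",
    description="Parse a document into structured blocks, markdown, or HTML",
)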

Frequently Asked Questions

What is the AILANG Parse MCP server?

AILANG Parse ships as a native MCP server with 7 tools: mcpParse (parse any document), mcpConvert (convert between formats), mcpFormats (discover formats, samples, and pricing), mcpEstimate (predict cost before parsing), mcpAuth + mcpAuthPoll (device auth for hosted API), and mcpAccount (tier, quota, usage, history). Local mode exposes 4 tools; hosted mode exposes all 7.

How do AI agents discover AILANG Parse as a tool?

Via MCP's tools/list method (automatic when connected), or via the REST endpoint GET /api/v1/tools which returns full JSON Schema definitions for all 7 tools. Agents should call mcpFormats first — it returns the full service contract including samples, pricing tiers, and capabilities.

Which AI agents work with the AILANG Parse MCP server?

Any MCP-compatible agent: Claude Code, Claude Desktop, Cursor, VS Code Copilot, Windsurf, and custom agents using MCP client libraries. Non-MCP agents can use the REST API directly.

Do I need an API key for local use?

No. Local MCP (stdio or localhost HTTP) works without authentication. The hosted version at docparse.ailang.sunholo.com requires a dp_ API key. Agents can get one automatically via mcpAuth (RFC 8628 device authorization) — no manual signup needed. The free tier gives 1,000 requests/month.

How does agent authentication work?

The agent calls mcpAuth, which returns a verification URL and code. The agent tells the human to open the URL and sign in. Then it polls with mcpAuthPoll until approved. On approval, the agent receives a dp_ API key and tier info. The entire flow happens within MCP — no manual configuration.

How can an agent advise on pricing and quotas?

Call mcpAccount(action: "pricing") without auth to get tier details. With an API key, mcpAccount(action: "status") shows the human's current tier, quota used, and quota remaining. mcpEstimate predicts whether a file needs AI (counts against AI quota) before parsing. All errors include a suggested_fix field the agent can act on.
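One way an agent might act on these errors is sketched below. It assumes an error arrives as a dict with a code key (a hypothetical name) alongside the suggested_fix field described above, and it plans the pricing lookup and device auth as follow-up calls when the code is AUTH_REQUIRED. The function name and return shape are illustrative only.

```python
def plan_after_error(error, label="claude-desktop"):
    """Return (message_for_human, follow_up_tool_calls) for a tool error.

    The "code" key is an assumed field name; suggested_fix and
    AUTH_REQUIRED come from the service description.
    """
    message = error.get("suggested_fix") or "The request failed without a suggested fix."
    if error.get("code") == "AUTH_REQUIRED":
        calls = [
            ("mcpAccount", {"action": "pricing"}),  # works without auth
            ("mcpAuth", {"label": label}),          # starts device auth
        ]
        return message, calls
    return message, []


message, calls = plan_after_error(
    {"code": "AUTH_REQUIRED", "suggested_fix": "Call mcpAuth to obtain an API key."}
)
```

Fetching pricing before starting auth lets the agent tell the human what each tier includes before they sign in.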

What about PDF and image parsing?

Add AI and Net to the capability list and select a model, for example --caps IO,FS,Env,AI,Net --ai gemini-2.5-flash, to enable AI parsing locally. You need your own GOOGLE_API_KEY. The hosted version includes AI parsing in all tiers (50/month free, 500 Pro, 2,000 Business). Use mcpEstimate to check if a file needs AI before parsing.

Why is structured document parsing better for AI agents than flat text extraction?

When an AI agent receives flat text, it loses the ability to reference specific table cells, identify which text was inserted versus deleted (track changes), or attribute comments to specific reviewers. AILANG Parse gives agents a typed Block ADT where each element has semantic meaning, enabling precise reasoning about document structure.
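A toy model makes the difference concrete. The types below are a hypothetical simplification written for this example and are not the actual AILANG Parse Block ADT; they only show how typed blocks let an agent address a table cell directly and separate inserted text from deleted text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Paragraph:
    text: str
    change: Optional[str] = None  # "inserted", "deleted", or None
    author: Optional[str] = None


@dataclass
class Table:
    rows: List[List[str]] = field(default_factory=list)


Block = Union[Paragraph, Table]


def cell(blocks, table_index, row, col):
    """Return one cell from the nth table, addressed by row and column."""
    tables = [b for b in blocks if isinstance(b, Table)]
    return tables[table_index].rows[row][col]


def accepted_text(blocks):
    """Join paragraph text as it reads once tracked deletions are dropped."""
    return " ".join(
        b.text for b in blocks
        if isinstance(b, Paragraph) and b.change != "deleted"
    )


doc = [
    Paragraph("Revenue grew 4%.", change="deleted", author="alice"),
    Paragraph("Revenue grew 6%.", change="inserted", author="alice"),
    Table(rows=[["Quarter", "Revenue"], ["Q1", "1.2M"]]),
]
```

With flat text, both revenue sentences and the table contents would arrive as one undifferentiated string; here cell(doc, 0, 1, 1) picks out a single value and accepted_text(doc) skips the deleted sentence.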