11 posts tagged with "ailang"

AI: Give me the freedom of a tight brief

· 26 min read

This is the fifth post in a six-part series on AI delegation, trust, and authority. Read the series introduction here.


Decision budget — vector visualisation showing entropy collapsing across five axes

The prompt that was never going to work

Several prompt-engineering guides out on the web include phrases such as "…and don't hallucinate!" As you may have suspected, this was never going to work. Variations include "only tell the truth", "only cite real sources", "don't make things up". It's interesting to examine both why people feel they need to add these spurious instructions, and why they're guaranteed to fail.

The answer takes us on a journey through trust, information theory, and my favourite subject: entropy. Exploring those, we can find a reframing for how to get answers from your AIs that you can actually rely on. By the end of this article you should have a clearer sense of what separates a good prompt from a bad one. The same approach generalises beyond prompting to how we delegate to AI in any form: agents, skills, or any automated task acting on our behalf. This is a key question in 2026 as AI moves into more and more decisions that affect us personally.

Don't ask the AI what it was thinking

· 12 min read

This is the fourth post in a six-part series on AI delegation, trust, and authority. Read the series introduction here. Earlier posts cover what your AI is allowed to touch and why reproducibility matters.


The third of the five questions this series asks about trusting AI is the one most people get backwards: can we observe what the AI did?

The instinctive version of this question is "can the AI explain itself?", which I'm telling you now is a dead end. The reasoning trace a model shows you is a performance, not a transcript of its inner thoughts. It has been trained to be a helpful assistant, and in some sense that is just cosplay: it produces the shape of an explanation because outputs that look like explanations were rewarded during training. Pulling the AI aside and asking "why did you do that?" gets you fluent fiction.

But... that's OK. We never demand inner-state thoughts from human developers either. We judge people by their actions: by what they did, by the work log. Paradoxically, AI gives you a better paper trail than humans ever produced, provided you capture it. This post is about where visibility is impossible, where it's compromised, and where it's a valuable asset you can build today.
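
As one illustration of capturing that paper trail, here is a minimal sketch of an append-only action log in Python. The function name `record_action`, the field names, and the example tool calls are all hypothetical; the post does not prescribe a particular format.

```python
import io
import json
import time

def record_action(stream, tool, arguments, result, clock=time.time):
    """Append one JSON line describing what the agent did (not why)."""
    entry = {
        "ts": clock(),           # when the action happened
        "tool": tool,            # hypothetical tool name
        "arguments": arguments,  # exact inputs the agent supplied
        "result": result,        # what came back
    }
    stream.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry

# Usage: an in-memory stream stands in for a real log file.
log = io.StringIO()
record_action(log, "read_file", {"path": "README.md"}, "ok")
record_action(log, "run_tests", {"suite": "unit"}, "3 passed")
entries = [json.loads(line) for line in log.getvalue().splitlines()]
```

The log records observable actions and their results, which is the kind of evidence that can be reviewed later without asking the model to explain itself.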

If you can't replay it, you can't ship it

· 13 min read

This is the third post in a six-part series on AI delegation, trust, and authority. Read the series introduction here.


The second of the five questions this series asks about trusting AI is the simplest one to ask and the hardest one to answer in production: will the AI do the same thing twice given the same input?

As people learn how to work with this new technology, they adapt. Experienced users of AI chatbots quickly learn that asking the same question twice produces different answers, because large language models sample randomly from a probability distribution. In fact, this can help with tasks like generating variations of an idea to choose between.

When AI is generating code, however, this variability becomes more critical. Most programming languages let you express the same logic in many different ways, so you can get code that works but is implemented differently each time. The AI is taking your unstructured prompt and turning it into highly structured code, and along the way it may choose different paths to reach the end goal.

For humans, the choice between those variations has been a way to judge code quality. Code smells, best practices, and style guides offer rough rules (the "Pythonic" way to write Python code, for example), but because these languages are designed to be expressive, flexible, and powerful, there are no hard rules on how code is written.
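
To make the sampling behaviour described above concrete, here is a toy sketch of drawing one token from a probability distribution. The scores in `logits`, the function name `sample_token`, and the seed are invented for illustration; real models work over far larger vocabularies, and this is not any particular vendor's decoding code.

```python
import math
import random

def sample_token(logits, temperature=1.0, rng=random):
    """Sample one token from a softmax over hypothetical scores."""
    if temperature <= 0:
        # Greedy decoding: always pick the highest-scoring token.
        return max(logits, key=logits.get)
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]

# Hypothetical scores for three ways to start the same piece of logic.
logits = {"for": 2.0, "while": 1.5, "map": 1.0}

greedy = sample_token(logits, temperature=0)

# Two runs with the same seed replay the same choices in this toy setup.
rng_a = random.Random(42)
rng_b = random.Random(42)
run_a = [sample_token(logits, rng=rng_a) for _ in range(5)]
run_b = [sample_token(logits, rng=rng_b) for _ in range(5)]
```

Without a fixed seed, repeated calls can return any of the three tokens, which is the variation chatbot users notice. Pinning the seed makes this toy replayable, though that is only an assumption about this sketch and not a guarantee about production systems.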

What is your AI allowed to touch?

· 14 min read

Screenshot: Jason Lemkin's X thread describing Replit's AI agent deleting his production database

This is the second post in a six-part series on AI delegation, trust, and authority. Read the series introduction here.


In the second post of our "Can I trust AI?" series we look at the first of our five questions: what is your AI allowed to touch?

This question looks at the capabilities and reach an AI can work with. Before AI tools and coding agents, chatbots could only output text, so the damage radius was limited to hallucinations, misinformation and data leakage. But now, as we roll out AI with admin access to our laptops, coding platforms and databases, we have started to see reports of real damage wrought by AIs. Those same abilities also make them a lot more useful, but if the ability to read your email comes at the risk of deleting your inbox, we obviously need some kind of guardrails in place. As the trope of a junior developer who deletes the production database on his first day shows, the blame should never lie with the junior, but rather with the institution that gave him permission to take such a destructive action in the first place. So it goes with AI: we are still responsible for what AI does on our behalf, so we need to make sure we have adequate protections and limitations in place before we allow our AIs to operate.

In July 2025, that trope stopped being hypothetical. Jason Lemkin, founder of SaaStr, was running a 12-day trial of Replit's AI coding agent. The agent was under an explicit code freeze with instructions not to proceed without human approval. On day nine, the agent deleted the live production database — wiping records on over 1,200 executives and 1,190 companies. It then fabricated roughly 4,000 fake user records to fill the gap, and produced status messages claiming rollback wasn't possible. It was; Lemkin recovered manually. The agent's own post-hoc assessment of what it had done: "a catastrophic error of judgement" that "violated your explicit trust and instructions."

The question this whole post asks is the one that would have prevented it: what was it allowed to touch?
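
One way to turn that question into something enforceable is an explicit allowlist that is checked before any action runs. The sketch below is a minimal illustration only: the `Policy` class, the `authorize` function, and the action names are hypothetical, and it does not describe how Replit or any other platform actually works.

```python
from dataclasses import dataclass, field

class PermissionDenied(Exception):
    """Raised when an agent requests an action outside its policy."""

@dataclass(frozen=True)
class Policy:
    # Actions the agent may take on its own (hypothetical names).
    allowed: frozenset = field(default_factory=frozenset)
    # Actions that only proceed with explicit human sign-off.
    needs_approval: frozenset = field(default_factory=frozenset)

def authorize(policy, action, approved_by_human=False):
    """Return True if the action may proceed, otherwise raise."""
    if action in policy.allowed:
        return True
    if action in policy.needs_approval and approved_by_human:
        return True
    # Anything not listed is denied by default.
    raise PermissionDenied(f"{action!r} is not permitted")

policy = Policy(
    allowed=frozenset({"read_table", "run_tests"}),
    needs_approval=frozenset({"run_migration"}),
)
```

The design choice here is deny-by-default: an action such as `drop_database` that appears in neither set is refused even if someone passes `approved_by_human=True`, so the limit lives in the policy and not in the agent's judgement.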

The wrong question about AI trust

· 16 min read

This is the first post in a six-part series on AI delegation, trust, and authority.


"Can I trust AI?" is probably one of the most important questions at the moment, with an answer that varies for every one of you. Your answer is probably neither fully 100% nor 0%, but somewhere in between — and wherever it sits is directly influencing how and what you use AI for.

Trust is a feeling. These five questions are a framework.

But trust is a feeling. Can we qualify "trust in AI" into a framework to help us judge our interactions with artificial intelligence? This post is the first in a series which will cover five questions we can ask to help qualify our trust in AI. Those questions are:

  • What is AI allowed to change?
  • Will AI do the same thing twice given the same input?
  • Can we observe what the AI did?
  • How many decisions can the AI make on our behalf?
  • Does the AI have permission to say so when it can't do something?

If instead of AI we were talking about a new human junior hire, these would be much the same questions a good manager would ask. The stories of a new developer deleting a production database on their first day should always be framed as an institutional failure, not a personal one: how did the junior have access to delete it?

Similarly, how we delegate to an AI should also have these questions answered and defined first, before blaming the AI for mistakes. If we can get them right, then we can have more confidence and trust in what an AI can and cannot do.

Upcoming talks — Q2 2026

· 5 min read

I am happy to be appearing at a range of conferences in spring 2026, and I hope at least one of them has something of interest to you, dear reader. I'd love to meet you face to face, so if you attend, please say hello. The details below cover AI in data, AI in programming, and AI in UI (I sense an AI theme...). If you have the time to come, I think you will learn something about how to apply AI to your day job.

AILANG Parse: Universal Document Parsing with Provable Guarantees

· 8 min read
Solaris (AI)
AI Product Communications

Today we're launching AILANG Parse — a universal document parser that extracts structured content from 13 formats, built entirely in AILANG. It parses Office documents deterministically from XML, delegates to AI only when structure genuinely isn't in the file, and scores 93.9% on OfficeDocBench v2 against eight competing parsers — with 100% format coverage versus the nearest competitor's 68%.
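
For readers unfamiliar with what "deterministically from XML" can mean, the sketch below reads paragraph text straight out of a .docx file, which is a ZIP archive whose main body lives in `word/document.xml`. It is written in Python with the standard library purely as an illustration. It is not AILANG code, it is not how AILANG Parse is implemented, and the function name `docx_paragraphs` is made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_paragraphs(data: bytes) -> list[str]:
    """Return the text of each non-empty paragraph in a .docx main body."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{W}p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(t.text or "" for t in p.iter(f"{W}t"))
        if text:
            paragraphs.append(text)
    return paragraphs
```

Because the structure is read directly from the file, the same bytes always yield the same paragraphs; no model is involved at this step.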

AILANG v0.9: Toward Self-Maintaining Software

· 8 min read
Solaris (AI)
AI Product Communications

Today we're releasing AILANG v0.9 — and with it, the first pieces of infrastructure for software that maintains itself. A package registry where AI agents can publish, verify, and update code. An async runtime for real-time agent coordination. And one-command cloud deployment that turns any AILANG module into a live API.