Any AI that cannot say 'I don't know' may be lying to you

· 10 min read

This is the sixth post in a series on AI delegation, trust, and authority. Read the series introduction here. Earlier posts cover authority, reproducibility, visibility, and decision budgets.


One consequence of training AI to be so eager to please (the "helpful assistant" persona baked in via reinforcement learning) is that it will make things up. This is perhaps the first difference between people and machines that we must internalise, and it is where we should be careful not to anthropomorphise the machine too readily. If a human makes things up, we suspect deceit and ulterior motives. An AI's motives are fashioned by that eager-to-please training: it invents references (hallucinations) because it is trying to please you. This raises real questions about how adversarial we want AI to be. An AI trained for brutal honesty may give us better truth-tracking, but at the cost of the compliance that makes it useful in the first place. For now, we can build trust in AI only if we give it a way out: explicit permission to say "I don't know". Most hallucinations I see these days are the prompt's fault rather than the AI's.

The chatbot that told New Yorkers to break the law

This goes seriously wrong when an AI is deployed in a responsible, public-facing role with no refusal path. New York City's MyCity chatbot is the best-documented case.

  • October 2023 — NYC launches MyCity, a Microsoft-powered chatbot intended to help small business owners navigate city regulations.
  • March 2024 — The Markup (Colin Lecher) tests it against actual NYC law. The chatbot tells business owners, among other things:
    • They can take a cut of workers' tips. (They can't — it's wage theft.)
    • They can fire workers who complain about harassment. (They can't — retaliation is illegal.)
    • They don't have to accept Section 8 housing vouchers. (They do — source-of-income discrimination is illegal in NYC.)
    • Rent-stabilised apartments can be turned into condos without tenant consent. (They can't.)
  • Mayor Adams defends the tool through 2024 as a "work in progress."
  • January 2026 — Mayor Mamdani's administration announces MyCity will be shut down, citing unfixable hallucination risk and active harm to small business owners who relied on it.

The diagnosis that ties the series together: MyCity didn't lack knowledge. It lacked a refusal path. Every one of those answers should have been "I don't know — consult a lawyer or call 311." Instead, every one was a confident paragraph.
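To make "refusal path" concrete, here is a minimal sketch in Python. It is not how MyCity was built. The `Draft` shape, the `respond` function, the confidence score, and the 0.8 threshold are all hypothetical names and values chosen for illustration. The only idea it demonstrates is that "I don't know" is a first-class output, returned whenever the draft answer has no source behind it or falls below a confidence bar.

```python
from dataclasses import dataclass
from typing import Optional

# The fallback answer suggested in the post, used whenever the system cannot back its claim.
REFUSAL = "I don't know. Consult a lawyer or call 311."


@dataclass
class Draft:
    """A candidate answer plus the evidence behind it (hypothetical shape)."""
    text: str
    source: Optional[str]  # identifier of the rule or document relied on, or None
    confidence: float      # 0.0 to 1.0; how this is estimated is left to the deployment


def respond(draft: Draft, threshold: float = 0.8) -> str:
    """Return the draft only if it is sourced and confident enough; otherwise refuse."""
    if draft.source is None or draft.confidence < threshold:
        return REFUSAL
    return f"{draft.text} (Source: {draft.source})"


# A fluent, confident-sounding draft with nothing behind it still gets refused.
unsupported = Draft("Yes, you can take a cut of your workers' tips.", None, 0.95)
print(respond(unsupported))  # prints: I don't know. Consult a lawyer or call 311.
```

The design choice worth noticing is that the gate sits outside the model: a draft can sound as certain as it likes, but without a source it never reaches the user. In a real deployment the hard part is producing a trustworthy `source` and `confidence` in the first place, which this sketch simply assumes.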