Editor's picks
When an AI says "done," ask it to show you
An AI's 'done' sounds the same whether the work happened or not. The fix is one small habit: instead of taking its word for it, ask it to show you a result you can check yourself, sized to the task.
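When an AI says "done," how do you confirm it actually finished?
When an AI reports "done," the message reads almost the same whether it truly finished, did half and worked around the rest, or misunderstood the direction entirely. Rather than judging whether that sentence can be trusted, build one reflex: show me something I can check myself, such as a commit, test output, or a diff.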
Claude Fable 5: First Public Mythos-Class Model, One Day In
Anthropic released Claude Fable 5 on June 9 — the first publicly available Mythos-class model, one tier above Opus. What it is, what it costs, the June 22 deadline on the subscription window, and what changed when I pointed three real projects at it for a day.
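What Is Claude Fable 5? The First Public Mythos-Class Model, Plus My First-Day Impressions
Anthropic released Claude Fable 5, the first public Mythos-class model, on June 9. This post covers how it relates to Opus 4.8, its pricing, and the free subscription period that ends June 22, plus my impressions from throwing three projects at it on day one: its adherence to governance processes is real, and so is its appetite for tokens.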
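How to keep an AI agent on process: the gates only keep the ledger, they don't stop anyone
The gates in a process don't actually block an AI agent at runtime; what they ask for is a receipt that can't be altered. What really has teeth isn't the gate but a record that can't be erased or denied.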
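Claude Code added dynamic workflows, so I opened up the JS and took a look
Claude Code released dynamic workflows on May 28, the same day Opus 4.8 shipped. More important than the "can spawn 1,000 subagents" number is that the orchestration JS is written by Claude, not run by Claude, and that is well worth thinking about.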
How Claude Code's Dynamic Workflows Run 1,000 Subagents
Claude Code's new dynamic workflows hand the orchestration plan over to a JavaScript script that Claude writes. The runtime executes it with up to 1,000 subagents — 16 concurrent — and Claude's context only sees the final cross-checked answer.
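AI can't even count the r's in "strawberry." Is it dumb?
Ask an AI how many r's are in "strawberry," and it used to answer wrong with great confidence. Newer models now mostly get it right, but why did it get it wrong in the first place? I talk it through with a building-blocks analogy, and explain why that underlying cause still hasn't really gone away.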
Why Does AI Sound So Confident When It's Wrong?
AI's most dangerous trait isn't that it's wrong sometimes. It's that its tone when wrong is identical to its tone when right. Here's my plain-language take on why, including why it won't just say 'I don't know'.
How I Use ChatGPT, Claude, and Gemini Day to Day
Not a benchmark or a verdict on which AI is best — just the small habits I picked up from keeping ChatGPT, Claude, and Gemini all open: route by task, give context first, don't expect one perfect answer, and verify the confident-sounding stuff.


