Borrowed Chores

The better you get at using AI, the more work you end up doing around it.

That's the pattern I keep seeing—in my own work and in six teams who've shared their workflows with me over the past three months. You reload context because the model forgot. You check outputs because they drift. You rephrase instructions because the first version didn't stick. You switch models mid-task because one chokes where another doesn't. You run a second pass, or a third.

All of that is real skill. But how much of it will still matter in two years?

Across several reports, Anthropic's research has found that skilled users provide more examples, clarify goals more explicitly, iterate more frequently, switch approaches when stuck, and develop mental models of where the system tends to fail.¹ Read one way, that's a portrait of fluency. Read another way, it's overcompensation for a tool with weak memory, uneven reliability, and a chat interface that keeps its inner workings opaque.

Both readings are true. Which raises the obvious question: what kind of work is this, exactly? Is it expertise, or is it workaround?

Consider how a good human editor operates. You hand them a draft, they push back based on genuine understanding of your intent and the subject matter, you rewrite, and the rewriting is where the thinking actually happens. Some of the back-and-forth with AI resembles this productive friction. You discover what you want by reacting to partial attempts. Collaboration is inherently interactive, even with an excellent partner. Not all compensatory labor is merely compensatory.

But the analogy has limits. When a human editor asks clarifying questions, they're drawing on contextual understanding and communicative intent. When an LLM asks questions, it's generating statistically plausible continuations without grounding in what you actually need. When I catch myself reloading context for the fourth time because the model silently lost track of a constraint I set twenty minutes ago, that's maintenance, not collaboration. The fact that it feels like collaboration—because conversation is natural, because chat makes everything look like dialogue—is part of the problem.

And a lot of that feeling comes from the interface itself.

Chat is seductive. You type, it replies, you keep going. But chat sneaks in bad assumptions. State is mostly implicit. Context is easy to lose. What the system knows, what constraints still apply, what it's actually tracking—none of it is well represented. The whole workflow gets flattened into turn-taking, as if conversation were the same thing as work. These systems can sound like they understand because statistical fluency wears the mask of comprehension. Chat amplifies that effect. It also makes the user's compensating labor feel natural.

Natural is not the same as good. And for serious work, chat is often the wrong interface.

So I've started sorting the labor into two piles. Borrowed chores are tasks the human does because the system can't yet carry its own weight—work that better software should eventually absorb. Durable responsibilities are the things that require having a stake in the outcome: judgment, accountability, ownership of the consequences.

The distinction is easy to state and hard to apply, because the two things are tangled in practice. Yesterday I watched a model hallucinate a person's work history when it struggled to parse a PDF. Catching that is a borrowed chore—better tooling should surface the error before I have to. Deciding whether to hire a job candidate? That's durable. The first kind of work will shrink as the tools improve. The second kind won't, because it's not about capability—it's about stakes.

Clarifying goals, specifying constraints, iterating on drafts, checking work—these are not labor invented by AI. They're part of good thinking. The question is how much of that labor reflects the human's genuine contribution and how much reflects the tool's current limitations. Usually it's both, in proportions that shift depending on the task. Which brings us to the claim that's been circulating most confidently.

The most popular candidate for a durable human contribution is taste. But I'm not convinced.

From startup essays to design discourse, taste has become the go-to answer for what AI can't touch: the durable human skill, the moat. But taste isn't a durable responsibility. It doesn't require having a stake in the outcome. It's a skill, and skills can be learned, probably including by machines. Current AI is so bad at taste that the claim feels obvious: the output hedges, gravitates toward the safe and median, and performs a kind of diplomatic emptiness that feels like the opposite of judgment. But this looks like a current limitation rather than a permanent one, and it's worth being honest about why. Large language models are, at the mechanical level, preference machines. Next-token prediction is ranking: at every step the model scores each candidate token against the others, billions of times over. The models can discriminate at a scale no human could match. They just discriminate toward the average, which is why the output looks tasteless. The foundation is there, though.
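To make "prediction is ranking" concrete, here is a minimal sketch in Python. The four-word vocabulary and its scores are invented for illustration; a real model scores a vocabulary of many thousands of tokens using learned weights. The only point is that always taking the top-ranked word lands on the most typical choice.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def rank(scores):
    """Return candidate tokens ordered from most to least probable."""
    probs = softmax(scores)
    return sorted(probs, key=probs.get, reverse=True)

# Hypothetical scores for the word after "The sunset was ..."
toy_scores = {"beautiful": 4.0, "nice": 3.2, "bruised": 0.5, "arterial": -1.0}

ranking = rank(toy_scores)
top_choice = ranking[0]  # greedy decoding: always take the top-ranked token
print(ranking)
print(top_choice)
```

The ranking machinery itself is indifferent to which word ends up on top. Change the scores and a different choice wins.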

Whether embodied experience, cultural situatedness, and the quality of having lived through the aesthetic traditions one judges can be modeled from data alone remains genuinely open. The claim that taste is permanently safe from AI is not something the current evidence settles—it's a question the field is still working through, and honest people disagree about it.

What makes the claim for taste seductive is that taste often overlaps with something harder to delegate: making a value judgment about what matters and being willing to own the consequences. Deciding that this legal brief is good enough to send, that this diagnosis warrants action, or that this product should ship is not just applying taste. It's also taking accountability for the judgment.

Accountability, properly understood, requires being able to bear consequences. Corporations can be legal persons because they can be sued, fined, or dissolved; there are mechanisms for translating organizational decisions into tangible stakes. Software has no such capacity. An LLM cannot meaningfully own a bad outcome. It cannot be held responsible in ways that matter. Training signals simulate consequences, but they are not stakes the system actually absorbs and answers for. Until AI systems can genuinely bear consequences, accountability remains a durable human responsibility.

That's the real split, and it blocks two bad conclusions at once. The maximalist one says AI will do everything and the human disappears. The comforting one says human-in-the-loop means the human role stays basically the same forever. Neither is right. Some work is migrating into software. Some human value is migrating upward toward framing, standards, review, and accountability. And some things that look like important human contributions right now are really temporary scaffolding around tools that aren't finished yet.

What AI should absorb is the pile of borrowed chores: holding context without being handheld, making state visible, preserving constraints across steps, surfacing uncertainty in ways a person can act on, and making verification cheaper and silent failure rarer. The durable part (judgment, standards, accountability) stays. What sits between those poles is a mix of genuinely formative work and temporary scaffolding we will eventually be glad to let go. Both are present, in proportions we won't see clearly until the tools improve and we find out what we still reach for.
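As a rough illustration of what "making state visible" and "preserving constraints across steps" could look like in software, here is a small Python sketch. Everything in it (the TaskState and Constraint classes, their fields, the check method) is hypothetical and not any existing tool's API. It shows only the shape: constraints are written down once, rechecked on every draft, and violations are reported instead of silently dropped.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Constraint:
    name: str
    holds: Callable[[str], bool]  # returns True if a draft satisfies it

@dataclass
class TaskState:
    goal: str
    constraints: list[Constraint] = field(default_factory=list)
    history: list[str] = field(default_factory=list)

    def check(self, draft: str) -> list[str]:
        """Record the draft and return the names of violated constraints."""
        self.history.append(draft)
        return [c.name for c in self.constraints if not c.holds(draft)]

# Hypothetical task: the constraints are set once, up front.
state = TaskState(
    goal="Summarize the report",
    constraints=[
        Constraint("under 12 words", lambda d: len(d.split()) < 12),
        Constraint("mentions Q3", lambda d: "Q3" in d),
    ],
)

first = state.check("Revenue grew and costs fell.")
second = state.check("In Q3, revenue grew and costs fell.")
print(first)   # violations found in the first draft
print(second)  # violations found in the second draft
```

The person still decides which constraints matter. The bookkeeping is the part that could move into the tool.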

Notes

1. Swanson, K. et al. "Anthropic Education Report: The AI Fluency Index." Anthropic, February 2026. Users creating artifacts were more likely to clarify goals (+14.7pp), provide examples (+13.4pp), and iterate (+9.7pp), but less likely to check facts (-3.7pp) or question the model's reasoning (-3.1pp). See also Handa, Tamkin, et al. "Which Economic Tasks are Performed with AI?" The Anthropic Economic Index, February 2025, which found 57% of tasks augmented and 43% automated.