The AI Breakdown

In Defense of Tokenmaxxing

28 min episode · 2 min read

AI-Generated Summary

Key Takeaways

  • Goodhart's Law vs. Experimentation Value: Token leaderboards do create gaming incentives — Amazon employees admitted inflating usage scores — but this flaw in measurement design doesn't invalidate the underlying goal. Companies should distinguish between vanity token consumption and genuine agentic experimentation, using output reviews alongside usage metrics rather than abandoning incentive structures entirely.
  • Assisted-to-Agentic Shift Requires New Work Primitives: Managing AI agents represents a fundamentally new knowledge-work primitive, unlike prompting ChatGPT, which was merely a new skill. No established best practices exist yet, so the only path to organizational competency is hands-on experimentation. Companies that delay this experimentation phase risk falling irreversibly behind competitors who have already absorbed the early lessons.
  • Selection Bias Distorts the Tokenmaxxing Narrative: Media coverage of token fraud — like the Amazon internal tool abuse story — is inherently unrepresentative. Employees quietly creating real value with AI don't make headlines. Treating visible edge-case abuse as evidence that most token consumption is wasteful is a hasty generalization, and it steers enterprises toward the wrong adoption strategies.
  • R&D Logic Applies at the Individual Level: Token consumption without an immediate quarterly financial return mirrors traditional R&D spending — costly upfront, valuable long-term. The host's own example: roughly one billion tokens consumed per month with near-zero direct revenue, yet the work produces lessons that compound into future token-efficiency gains and audience-facing products with measurable downstream value.
  • Salesforce's Alternative Metric Points Forward: Salesforce introduced "agentic work units" as a measurement framework designed to track output and impact rather than raw token consumption. The approach — already earning coverage in Axios — signals the direction enterprises should move: incentivize experimentation volume while coupling it with demonstrable output review to close the Goodhart's Law loophole.

What It Covers

Host NLW defends "tokenmaxxing" — the enterprise practice of incentivizing employees to consume more AI tokens — arguing that critics misapply Goodhart's Law, commit selection bias, and underestimate the essential role of unstructured experimentation during the shift from assisted to agentic AI workflows.

Notable Moment

A viral Slack screenshot showed a manager praising an employee for spending $600 on Anthropic overnight while flagging a $23 Uber Eats order as a policy violation. The post was likely staged, but its two million views revealed widespread frustration with how enterprises are currently framing AI investment priorities.

