Hard Fork

‘A.I.-Washing’ Layoffs? + Why L.L.M.s Can’t Write Well + Tokenmaxxing

60 min episode · 3 min read


AI-Generated Summary

Key Takeaways

  • AI Washing vs. Genuine Displacement: When evaluating whether layoffs are AI-driven, examine who specifically is being cut and whether total costs actually fall. Companies like Block and Atlassian are largely shifting spending from human payroll to AI infrastructure rather than reducing overall costs. Meta plans to spend $135 billion on capital expenditures while cutting up to 16,000 jobs: the money moves; it doesn't disappear.
  • Post-Training Degrades Creative Writing: LLMs produced more compelling creative prose at the GPT-2 and GPT-3 stage than modern versions do, because post-training layers, including RLHF with poorly designed rubrics, constrain models toward helpful-assistant personas. Contractors hired to evaluate writing quality were given nonsensical criteria like counting exclamation marks, systematically training models away from voice, surprise, and stylistic range.
  • Build a Personalized AI Editor Using Claude Projects: Writer Jasmine Sun developed a personal editing system by uploading her full published archive plus personal post-publication reflection notes into a Claude project. Claude then co-developed a custom rubric based on her specific voice and goals — not generic writing standards — and prompts her to supply missing scenes or perspectives rather than generating text on her behalf.
  • Token Costs Are Becoming a Hiring Factor: Individual engineers at major AI labs are consuming as many as 210 billion tokens in a single week, equivalent to roughly 33 Wikipedias. The top Claude Code user spent over $150,000 on tokens in one month. Engineers at non-lab companies are now negotiating token budgets during job offers, and some heavy users effectively cannot afford to leave the AI labs where tokens are provided free.
  • Token Leaderboards Create Goodhart's Law Problems: When token consumption becomes a tracked performance metric, it stops measuring productivity. Companies using leaderboards risk incentivizing engineers to run high-cost parallel agent swarms on low-value tasks. A more defensible managerial approach is to question any individual whose token spend significantly exceeds their salary and require demonstrated output — shipped products or measurable revenue — to justify the expenditure.
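The "33 Wikipedias" comparison above can be sanity-checked with quick back-of-envelope arithmetic. The Wikipedia word count and tokens-per-word ratio below are rough assumptions for illustration, not figures from the episode:

```python
# Back-of-envelope check on the token figures cited above.
# Assumptions (not from the episode): English Wikipedia holds roughly
# 4.5 billion words of article text, at ~1.3 tokens per English word.

WIKIPEDIA_WORDS = 4.5e9   # assumed size of English Wikipedia
TOKENS_PER_WORD = 1.3     # common rough conversion for English text

tokens_per_wikipedia = WIKIPEDIA_WORDS * TOKENS_PER_WORD  # ~5.85 billion
weekly_tokens = 210e9     # one week's reported heavy-user consumption

wikipedias = weekly_tokens / tokens_per_wikipedia
print(f"{wikipedias:.0f} Wikipedias")  # prints "36 Wikipedias"
```

Under these assumptions the result lands around 36, the same ballpark as the "roughly 33" cited, so the comparison holds up to within the slop in any Wikipedia size estimate.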

What It Covers

Kevin Roose and Casey Newton examine three converging tech stories: whether recent mass layoffs at Atlassian, Block, and Meta represent genuine AI-driven workforce reduction or convenient "AI washing"; why LLMs still struggle with literary writing despite broader capability gains; and how Silicon Valley companies are building token-usage leaderboards to track employee AI consumption.

Key Questions Answered

  • AI Capability Gaps Reveal Market Incentive Distortions: Sam Altman himself predicted AI will cure cancer and build self-replicating factories, yet estimated it will manage only a passable imitation of a real poet's work. This gap exists because labs allocate resources toward verifiable, commercially valuable tasks like coding, where output can be automatically checked, rather than literary writing, where quality remains subjective and financially marginal relative to automating software engineers.
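The spend-versus-output heuristic from the token-leaderboard discussion above can be sketched as a simple managerial check. The function name, the salary-based threshold, and all figures here are hypothetical illustrations, not a policy described in the episode:

```python
# Sketch of the heuristic discussed above: flag an engineer for review
# when monthly token spend exceeds monthly pay and they cannot point to
# demonstrated output (shipped products or measurable revenue).
# All names and numbers are hypothetical.

def flag_token_spend(monthly_token_cost: float,
                     monthly_salary: float,
                     shipped_output: bool) -> bool:
    """Return True if the spend warrants a managerial review."""
    if shipped_output:
        return False  # demonstrated output justifies the expenditure
    return monthly_token_cost > monthly_salary  # spend exceeds pay

# The $150,000/month Claude Code user from the episode, at an assumed
# $25,000/month salary:
print(flag_token_spend(150_000, 25_000, shipped_output=False))  # True
print(flag_token_spend(150_000, 25_000, shipped_output=True))   # False
```

Note that this checks spend against output rather than rewarding raw consumption, which is exactly what a leaderboard fails to do once the metric becomes the target.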

Notable Moment

Jasmine Sun described revisiting early GPT-2 and GPT-3 outputs during her research and finding their writing style more compelling than current models': more variable, funnier, and genuinely surprising. The models were unreliable and factually chaotic, but the post-training designed to create helpful corporate assistants systematically eliminated the qualities that made the writing distinctive.
