Far West Consulting

Position · May 19, 2026 · Far West Consulting

Why we don't pilot with the most senior team

The conventional advice is to start AI pilots with the most senior cohort to build executive buy-in. The data says that cohort produces the least productivity gain, the most quality risk, and the highest visibility cost. We do the opposite.


The conventional consulting playbook says start your AI pilot with the most senior team. The argument is straightforward: senior people set tone, executive buy-in cascades, the rest of the organization follows. The argument has been right enough often enough that it’s load-bearing on a lot of rollouts.

Three findings argue it’s wrong for AI specifically.

The gain is at the bottom, not the top. Brynjolfsson, Li, and Raymond’s 2025 field experiment (n=5,172, Quarterly Journal of Economics) found AI delivered a 30% productivity gain at the bottom of the skill distribution, no measurable gain at the top, and a quality decline among the most experienced [7]. A pilot scoped to the senior team starts with the cohort where the productivity story is hardest to demonstrate. By week six, the telemetry shows flat or slightly negative quality on the senior cohort, and there’s no clean before-and-after to take to the board.

The visibility tax is highest at the top. Slack’s Fall 2024 Workforce Index found 48% of desk workers are uncomfortable telling their manager they use AI [14]. The Duke PNAS work narrows that further: across four experiments, it found a social evaluation penalty for using AI, so AI use that a manager can see carries a cost that hidden use does not [12]. Senior employees have the most to lose from being seen as AI-dependent, because the role identity is “I make the judgment call.” So they hide use, or perform non-use, or partially adopt and partially revert. The pilot data comes back noisy because the cohort is gaming what gets observed.

Self-report drifts away from telemetry at the senior level. Microsoft Research’s GitHub Copilot RCT (n=200+) found engineers self-reported time savings that telemetry did not corroborate [6]. The gap was larger in the more-experienced cohort. Senior people are good at narrating their own productivity. That doesn’t make them good measurement instruments for the rollout.

What we do instead. We sequence pilots to the cohort that has the most room to grow and the lowest visibility cost. New hires, recent rotators, people who started the role inside the last 18 months. Their productivity gain is measurable. Their relationship to manager-visibility is less fraught. Their adoption hold at the week-eight calibration is the data that justifies the next phase of the rollout. The senior cohort’s training comes second, with a different curriculum aimed at the judgment layer where automation degrades output, and it’s anchored against the evidence the first cohort produced.
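The sequencing rule above can be sketched in a few lines. This is a minimal illustration, not our engagement tooling: the `Employee` record, the `sequence_pilot` helper, and the sample roster are all hypothetical, and the only number taken from the text is the 18-month window.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Employee:
    # Hypothetical record; a real roster would come from an HR system.
    name: str
    role_start: date  # date the person started their current role


def months_in_role(role_start: date, as_of: date) -> int:
    """Whole months elapsed between role_start and as_of."""
    months = (as_of.year - role_start.year) * 12 + (as_of.month - role_start.month)
    if as_of.day < role_start.day:
        months -= 1  # the current month is not yet complete
    return max(months, 0)


def sequence_pilot(roster, as_of, max_months=18):
    """Split a roster into (first cohort, second cohort).

    First cohort: people who started their role inside the last
    `max_months` months. Everyone else goes to the second cohort.
    """
    first, second = [], []
    for person in roster:
        if months_in_role(person.role_start, as_of) < max_months:
            first.append(person)
        else:
            second.append(person)
    return first, second


if __name__ == "__main__":
    # Illustrative names and dates only.
    roster = [
        Employee("new hire", date(2025, 3, 1)),
        Employee("recent rotator", date(2024, 11, 20)),
        Employee("senior lead", date(2019, 1, 15)),
    ]
    first, second = sequence_pilot(roster, as_of=date(2026, 5, 19))
    print("first cohort:", [p.name for p in first])
    print("second cohort:", [p.name for p in second])
```

Tenure in role is used here rather than tenure at the company, so that recent rotators land in the first cohort alongside new hires.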

The pattern is consistent across the engagements where we’ve watched it play out. Pilots that start at the top stall around week four. Pilots that start one layer down hold through week eight and produce the evidence that funds the senior-cohort program. The buy-in argument has the order wrong. Buy-in comes from outcomes, not from sequence.


References

  [7] Brynjolfsson et al. (2025). Quarterly Journal of Economics. Field experiment, n=5,172 customer service agents. AI assistance produced +30% productivity for low-skill workers, ~0% for high-skill workers, and a measurable quality decline at the top of the skill distribution.

  [14] Slack Workforce Lab. The Fall 2024 Workforce Index. November 12, 2024. Source: slack.com/blog/news/the-fall-2024-workforce-index-shows-executives-and-employees-investing-in-ai-but-uncertainty-holding-back-adoption.

  [12] Reif, J.A., Larrick, R.P., & Soll, J.B. (2025). Evidence of a social evaluation penalty for using AI. Proceedings of the National Academy of Sciences. n>4,400 across four experiments. DOI: 10.1073/pnas.2426766122.

  [6] Microsoft Research (2024). GitHub Copilot randomized controlled trial, n=200+ engineers. Telemetry showed no measurable productivity improvement, despite engineers self-reporting time savings.