We've Been Training Junior Data Engineers Wrong: AI Just Made It Obvious.


A few weeks ago I read a paper that stopped me mid-coffee: Shaw and Nave at the Wharton School, “Thinking-Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender” [1]. Across 1,372 participants and over 9,500 individual trials, they found that people followed a deliberately faulty AI’s wrong answers 73.2% of the time. Not because they were careless, but because the AI was fluent, confident, and frictionless - and that combination suppresses the metacognitive alarm that would normally make us stop and think.

That’s alarming enough on its own. But here’s what really got me.

When the AI was wrong and participants had been given financial incentives to be accurate, plus immediate feedback after every answer, the surrender rate dropped. But it didn’t disappear. Even motivated, feedback-receiving people still followed the wrong AI answers roughly 58% of the time.

Think about that. Then think about your junior data engineers.

The Training We Built for a World That No Longer Exists

I have been a trainer for 25 years. For ten of those years I held a Microsoft Certified Trainer certification, which meant I was formally qualified to stand in front of a room and teach Microsoft’s official curriculum. I’ve delivered more training days than I can count, across more countries than I care to admit, on topics spanning SQL Server, Azure, Power BI, and everything in between.

And I can tell you exactly what that training looked like, because I delivered it myself: here is how you do this thing. Now do it. Here is how you do the next thing. Now do it.

The “how” was everything. The “why” was barely a footnote.

That wasn’t laziness. That was what the curriculum demanded, and what the market rewarded. Students left knowing how to execute. Employers got people who could follow a pattern on day one. Certifications got issued. Everyone felt productive.

From my perspective, that world is gone.

The things we have been teaching as core competencies - the syntax, the boilerplate, the structural patterns - are exactly what agents handle now. And increasingly, they handle them relatively well. A junior engineer who has been trained primarily on procedures now sits in front of an agent’s output with no framework for evaluating whether it’s right. They have been trained to produce. They have not been trained to judge.

That’s the cognitive surrender trap, and we built it ourselves. And I say that as someone who spent a decade helping build it.

What the Research Actually Tells Us About Protection

The Shaw and Nave paper identified the individual differences that predicted resistance to cognitive surrender. High trust in AI made people more vulnerable - no surprise there. But the protective factors are more instructive: high fluid IQ and high “Need for Cognition” - the stable disposition to engage in effortful analytical thinking and actually enjoy it.

You cannot train fluid IQ. But “Need for Cognition” is absolutely something you can cultivate. It grows when people are repeatedly placed in situations where thinking hard is rewarded, where skipping the reasoning step has visible consequences, where the skeptical reflex is treated as a professional virtue rather than an inefficiency.

That’s what needs to change. Not the tools we teach, but the thinking we build.

Teach Through Broken Things - Including AI

Here’s the shift I want to make, and it’s deceptively simple.

Stop giving junior engineers well-designed systems to implement. Start giving them broken ones to diagnose.

Not broken in obvious ways. Subtly broken. A medallion architecture where the grain differs between sources without anyone documenting it. A pipeline that works perfectly in development but silently drops rows in production because of a NULL handling difference. A slowly-changing dimension implemented as Type 1 in a context that screams for Type 2 - and a business user wondering why historical reports keep changing.
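
To make the second of those concrete, here’s a minimal sketch of the NULL-handling failure (table and column names are hypothetical):

```sql
-- Silver-layer load. In development, order_status was never NULL, so this
-- looked complete. In production, NULLs exist - and NULL <> 'cancelled'
-- evaluates to UNKNOWN, so those rows are silently dropped.
INSERT INTO silver.orders (order_id, customer_id, order_status, amount)
SELECT order_id, customer_id, order_status, amount
FROM   bronze.orders
WHERE  order_status <> 'cancelled';

-- What a careful engineer would have written:
-- WHERE order_status IS NULL OR order_status <> 'cancelled'
```

No error, no warning, no failed test - just fewer rows than there should be.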

The task is not to build. The task is to find what’s wrong, articulate why it’s wrong, and explain what the consequences are.

This is how you build the diagnostic instinct that an agent cannot replace. Construction can be delegated. Diagnosis requires actual understanding of what the thing is supposed to do and why.

And critically: don’t tell them what kind of wrong. Just that something is off. Let them develop the scanning behaviour from scratch.

Now extend that same principle to AI outputs - because that’s where the stakes are highest. A join that produces fan-out because the relationship wasn’t checked. A window function with the wrong frame boundary that gives plausible-looking but incorrect running totals. An aggregation at the wrong grain that inflates revenue figures by 15% - believable enough to slip through a quick review.
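
The frame-boundary failure is worth seeing in full, because it shows how plausible these outputs look. A sketch, again with hypothetical names:

```sql
-- Plausible agent output: a running revenue total per customer.
SELECT customer_id,
       order_date,
       SUM(amount) OVER (
           PARTITION BY customer_id
           ORDER BY order_date
       ) AS running_total          -- runs fine, looks right
FROM   silver.orders;

-- The defect: with ORDER BY and no explicit frame, the default frame is
-- RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which includes all
-- peer rows sharing the same order_date. A customer with two orders on
-- one day gets the same (inflated) total on both rows. A true row-by-row
-- running total needs the frame spelled out:
--   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
```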

Let them work with the output. Then show them it’s wrong. Then ask: what should you have checked before you trusted this?
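
The checks themselves aren’t exotic - the point is making them reflexive. A few candidates, sketched against the same hypothetical tables:

```sql
-- Did the join fan out? Output row count should not exceed input
-- on a many-to-one join.
SELECT (SELECT COUNT(*) FROM silver.orders)        AS input_rows,
       (SELECT COUNT(*) FROM gold.orders_enriched) AS output_rows;

-- Is the declared grain real? Expect zero rows back.
SELECT customer_id, order_date, COUNT(*) AS rows_at_grain
FROM   gold.daily_revenue
GROUP BY customer_id, order_date
HAVING COUNT(*) > 1;

-- Does the aggregate reconcile with the source? The totals should match.
SELECT (SELECT SUM(amount)  FROM silver.orders)      AS source_total,
       (SELECT SUM(revenue) FROM gold.daily_revenue) AS reported_total;
```

Three queries, thirty seconds, and the fan-out and wrong-grain failures above surface before a stakeholder ever sees the numbers.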

The Shaw and Nave experiment worked exactly this way - deliberately seeding the AI with confident wrong answers. Right now, junior engineers are encountering those same failures in production, with no prepared override reflex, and with a business stakeholder watching. Better they encounter it in a safe environment where being wrong is the point.

The Instrument Rating Principle

I’m a pilot, rated in both gliders and single-engine piston aircraft. I don’t hold an instrument rating for either, but the training curriculum for both includes instrument flying, precisely so the student pilot develops a deep understanding of how the instruments work. There’s a requirement in instrument training that I keep coming back to in this context: partial-panel flying. Your instructor covers, for instance, the attitude indicator and the directional gyro - two primary instruments - and you fly on raw data only: altimeter, airspeed, turn coordinator, compass. It’s uncomfortable, and it forces you to reconstruct a mental model of what the aircraft is doing without the tools that normally tell you.

The reason it’s in the curriculum isn’t that those instruments fail often - in fact, it’s exceedingly rare for well-maintained instruments to fail at all. No, it’s because if your mental model only exists as a readout of the instruments, you cannot catch them when they lie to you. The underlying model has to exist independently of the tools that normally confirm it.

The same principle applies here.

A junior engineer who has only ever seen an agent produce a fact table has no independent model of what a fact table is - grain, additivity, foreign key integrity, what late-arriving facts do to it, or what fan-out looks like before it’s too late. They have a procedural memory of an output, not a conceptual understanding of the thing.

The agent becomes their attitude indicator. And when it lies to them, they have nothing to cross-check against.

So what does a training programme that builds the underlying model actually look like - and does it work in practice? That’s what Part 2 is about.


Join the Conversation

How do you train junior engineers? How were you trained as a junior engineer? I’d love to hear what the process in your organization looks like. Find me on LinkedIn or BlueSky.


References

  1. Shaw, S. D., & Nave, G. (2026). Thinking-Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender. SSRN.

Photo by Pavel Danilyuk: https://www.pexels.com/photo/white-toy-robot-in-black-background-8294630/