The AI Skills Shift Inside Finance

Siqi Chen

Are you keeping up? Here’s what’s needed now—including the first question to ask before any AI rollout.

Most days now, I have several agents running in parallel on my computer.

One has been fixing dozens of broken tests. It doesn’t get bored or frustrated. It just keeps adjusting until the tests pass, again and again, across multiple workers. Another agent spent the night experimenting with our backend build process. By morning, it had shaved eight minutes off the build on its own.

A third has been trying to export financial models to Excel with correct formulas. It exports to Excel, uploads to Google Sheets, exports to CSV, compares the calculated values, fails, adjusts, then tries again. Over and over. That kind of slow, repetitive debugging would usually sit in someone’s backlog for days.
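The comparison step in that loop can be sketched in a few lines of Python. This is a minimal illustration, not the agent's actual code: the `compare_values` helper, the `cell`/`value` CSV layout and the tolerance settings are all assumptions made for the example.

```python
import csv
import io
import math


def compare_values(expected, csv_text, rel_tol=1e-9, abs_tol=0.005):
    """Return (cell, expected, actual) for every cell that does not match.

    `expected` maps a cell reference to the value the model should produce.
    `csv_text` is the round-tripped export, assumed (hypothetically) to have
    two columns named "cell" and "value".
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    actual = {row["cell"]: float(row["value"]) for row in reader}

    mismatches = []
    for cell, want in expected.items():
        got = actual.get(cell)  # None when the cell is missing from the export
        if got is None or not math.isclose(
            got, want, rel_tol=rel_tol, abs_tol=abs_tol
        ):
            mismatches.append((cell, want, got))
    return mismatches


# Hypothetical values: B4 should equal B2 - B3, but the export got it wrong.
expected = {"B2": 1200.0, "B3": 450.0, "B4": 750.0}
exported = "cell,value\nB2,1200.00\nB3,450.00\nB4,700.00\n"

mismatches = compare_values(expected, exported)
for cell, want, got in mismatches:
    print(f"{cell}: expected {want}, got {got}")
```

An empty list means the export round-tripped cleanly. Anything else is the concrete failure the agent feeds into its next attempt.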

My computer crashes about eight times a day now because every agent spins up its own environment. I’m out of RAM, and I had to move all of this to the cloud so I could run as many instances as I want. It’s a ridiculous problem to have.

But this is what AI looks like in practice right now. Just one person managing a small fleet of systems that work in parallel and don’t get tired. Finance isn’t operating this way yet, but it will. And when it does, the difference between who benefits and who doesn’t won’t be who bought the right tool. It’ll come down to something that’s always mattered, long before AI showed up.

There’s a lot of conversation in finance about AI: which vendors to choose, how to write policies, how to roll out training. Those questions matter, but they’re downstream. The actual constraint is management. If you can define the problem clearly, provide enough context and recognize when the output is wrong, AI becomes useful quickly. If you can’t, the results feel erratic and disappointing. And this comes down to how well you already know how to think.

Your mental model of AI is already stale

A lot of the skepticism in finance right now comes from bad experiences people had in 2025. They tried an agent, it got the thing wrong, they concluded AI wasn’t ready and they moved on. That conclusion made sense at the time.

But something significant happened in late November and early December of 2025. Claude Opus 4.5 and GPT-5 crossed a threshold that most people haven’t registered yet, in long-range planning and their ability to evaluate their own work. If you’ve used Claude Code or OpenAI’s Codex recently, you probably know what I mean. A year ago, you’d look at what the model was doing and think: We’re clearly far from AGI. Now, that answer is harder to give. The philosophical debate continues, but the practical difference for most tasks has become very difficult to see. The output is nearly indistinguishable from what an average person would produce.

Most products haven’t caught up to what the underlying models can now do. Which means most people’s mental model of AI is still calibrated to something that no longer exists.

The “I tried it and it didn’t work” problem is real, but when I dig into it, it’s almost always about people: what they asked for and how much context was provided. It comes down to how the work was defined.

AI exposes how you manage

The engineer Geoffrey Huntley says, “LLMs are mirrors of operator skill,” and that’s been true in my experience. What you get from an agent depends on how clearly you define the task, how much context you provide, what constraints you set and how well you can judge the output.

None of that is new. It’s just management.

And many people are not particularly good at it.

Here’s how most people use AI: They give a vague prompt, keep most of the context in their head, never define what good looks like, get a generic result and conclude that AI doesn’t work. When I ask whether they tried giving more context or being more specific about what they wanted, the answer is usually, “No.” If you change only one thing—define the outcome more precisely or show an example of a good answer—the results improve disproportionately.

This is the same dynamic you see with a new hire. A manager hands off a task without context, the hire gets it wrong, and instead of asking, “Was I clear enough?” the manager decides the hire isn’t capable. With people, we eventually recognize this pattern and try to fix it. We invest in onboarding and we coach managers. But with AI, we just stop using it.

Finance makes the gap visible

In finance, this gets amplified. A good finance professional operates on multiple layers of context that live in their head. They know which systems are reliable, which variances actually matter, which controls are sacred and which are legacy, and what the CEO truly cares about this quarter (even if it wasn’t said directly). They operate on intuition built from experience, not from something written down. Translating that intuition into explicit context (for a human or a model) is genuinely hard, so it often doesn’t happen.

If the context stays implicit, AI can’t reason with it. So the model produces something that looks reasonable at first glance, and only falls apart on closer inspection. And at that point, many teams conclude that the technology isn’t ready. But often, what they’re seeing is that their organization hasn’t learned how to externalize its judgment in a way that others can work with.

That constraint was always there. AI just makes it harder to ignore.

The question to ask before any AI rollout

Before you introduce AI into a team, it’s worth asking how well you onboard a strong junior analyst. When someone capable joins your team, can you describe what good work looks like in concrete terms? Can you give them enough written context that they don’t have to interrupt you every 15 minutes? When they miss the mark, can you explain precisely where and why?

If you’re already good at that, AI tends to feel manageable. If you’re not, you might end up blaming the tool.

So before asking which tool to buy, it’s worth pausing on something more basic:

Can your team clearly explain what they want, in writing?
Can they externalize context instead of holding it in their heads?
Can they tell the difference between output that’s actually correct and output that just sounds correct?

If the answer is, “No,” that’s the problem to solve first, because no model can fix a management problem.

This is the work

When organizations respond to AI, the instinct is often to build training programs and policies. I understand the instinct (it feels responsible), but it frames AI as something separate from the actual work. Like it’s something you prepare for, and only later apply.

But you don’t understand what these systems can do by attending a session or reading a guide. You understand it by running them against your own data, watching where they fail, adjusting the prompt and realizing that what sounded precise in your head was actually ambiguous. That’s when learning happens.

So your goal each week should be a workflow that’s meaningfully better than it was before, and that’s usable again next week. If it doesn’t survive contact with reality, it doesn’t count.

Next, to accelerate adoption, your team needs to watch someone they respect do something genuinely useful with AI. But that only happens if you build the intuition yourself first. Not by reading about it or appointing an “AI champion” to figure it out. You have to get your hands in it, try to use it on real work, hit the failures, understand why they happened and iterate. That’s how you develop the judgment to know where the models are strong and what kind of context they need. And that judgment is what you then share.

Once you have that judgment, share it plainly. Demo a workflow you actually built and walk through what failed. Use real data. Specific and honest examples are more valuable than any polished success story, and they give other people something they can try.

The skill shift is already happening

The finance teams that win over the next 12 months will be the ones whose leaders can clearly define what good looks like, externalize their judgment, give real context and tell the difference between something correct and something that just sounds confident.

Those were always the skills that made someone good at managing a team of people.

They’re the same skills that make someone good at managing a fleet of agents.

Many finance leaders won’t realize this until they watch a competitor move faster with fewer people. A few will notice earlier.

Right now, I can feel it in my own work. My machine strains under the load of agents running in parallel. It’s messy and it crashes often. It’s far from elegant. But the leverage is real, and it’s already here.

Siqi Chen is the CEO of Runway, a financial planning software company based in San Francisco, California.
