The AI Hype Trap – Why Actual Productivity Gains Are Still Elusive

4 months ago

Artificial intelligence is often hailed as a game-changer across many industries, promising to revolutionise productivity, reduce costs, and open new frontiers of innovation. Even the UK government talks about it in such terms...

However, the reality of AI so far falls well short of the hype. Its supposed benefits seem impressive at first glance, but look disappointing on closer inspection...

The Productivity Paradox

The research lab METR put AI’s productivity claims to the test by running controlled trials with experienced computer programmers. These developers were given AI tools to help them write code, with the expectation that they’d work faster and more efficiently.

The result? The exact opposite. Although the developers believed AI had made them about 20% faster, in reality they took 19% longer to finish their tasks.
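To make that gap concrete, here is a rough back-of-the-envelope sketch. It assumes both figures are read as changes in task completion time, and the 10-hour baseline task is a made-up number purely for illustration, not something taken from the study.

```python
# Illustrative arithmetic only: the 10-hour baseline is hypothetical.

def perceived_hours(baseline_hours: float, believed_speedup: float) -> float:
    """Hours the developer thinks the task took, if they believe AI cut the time."""
    return baseline_hours * (1 - believed_speedup)


def actual_hours(baseline_hours: float, measured_slowdown: float) -> float:
    """Hours the task really took, if AI use made it take longer."""
    return baseline_hours * (1 + measured_slowdown)


if __name__ == "__main__":
    baseline = 10.0  # hypothetical task length without AI, in hours
    felt = perceived_hours(baseline, 0.20)   # believed 20% faster
    real = actual_hours(baseline, 0.19)      # measured 19% slower
    print(f"Felt like: {felt:.1f} h, actually took: {real:.1f} h")
    print(f"Gap between perception and reality: {real - felt:.1f} h")
```

On those assumptions, a task that would have taken 10 hours unaided feels like 8 hours but really takes close to 12, so the developer misjudges their own time by almost 4 hours.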

The main reason for this was the time spent correcting AI errors, and then developing new processes to work with the AI tools rather than just tackling the problems directly with their human brain-things.

And real-world experience increasingly shows that, rather than unlocking hidden productivity, businesses are facing more errors, dependency on faulty systems, and more time spent on debugging.

But AI can reduce costs...

That is not to say that AI is of no use. The grand narrative of businesses these days is "cost reduction," not innovation.

And AI can certainly be used to automate some mundane tasks or reduce the need for some jobs. Chatbots spring to mind straight away, or any kind of advice line. This has to be tempting for businesses when costs are rising thanks to tariffs and taxes.

Final Thoughts

It's to be expected that AI is going to make mistakes, especially on more complex tasks. The trick is to learn how to work with it so one can weed out these errors.

Probably over time AI will evolve to be more accurate, but it would seem for now it still requires significant human steering!

ai cent proofofbrain neoxian tech stem

14 comments

@videoaddiction 71

4 months ago

The hours we spend correcting mistakes and getting used to new systems can often overshadow any advantages, reminding us that true innovation takes more than just fancy tools.

@riverflows 81

4 months ago

I find this really interesting. It's always the same with leaps in reach: sometimes what is promised is simply not what gets delivered. However, AI has made its biggest impact on the way we learn, so I expect we will see that knock-on effect in the years to come as well. Whilst it might not be making leaps and bounds in some areas, it's certainly coming.

@revisesociology 82

4 months ago

I guess no one knows for certain what the longer term impacts will be!

@steevc 80

4 months ago

There's a lot of hype and money to be made. I'm not using these tools at work yet, but I'm old school for coding. They will get better, but mistakes will be made.

@revisesociology 82

4 months ago

I guess the other thing is that you don't develop as much of an understanding if yer using AI!

@mrtats 63

4 months ago

Setting-specific factors
We caution readers against overgeneralizing on the basis of our results. The slowdown we observe does not imply that current AI tools do not often improve developer’s productivity—we find evidence that the high developer familiarity with repositories and the size and maturity of the repositories both contribute to the observed slowdown, and these factors do not apply in many software development settings. For example, our results are consistent with small greenfield projects or development in unfamiliar codebases seeing substantial speedup from AI assistance.

AI-specific factors
We expect that AI systems that have higher fundamental reliability, lower latency, and/or are better elicited (e.g. via more inference compute/tokens, more skilled prompting/scaffolding, or explicit fine-tuning on repositories) could speed up developers in our setting (i.e. experienced open-source developers on large repositories).

Agents can make meaningful progress on issues
We have preliminary evidence (forthcoming) that fully autonomous AI agents using Claude 3.7 Sonnet can often correctly implement the core functionality of issues on several repositories that are included in our study, although they fail to fully satisfy all requirements (typically leaving out important documentation, failing linting/styling rules, and leaving out key unit or integration tests). This represents immense progress relative to the state of AI just 1-2 years ago, and if progress continues apace (which is a priori at least plausible, although not guaranteed), we may soon see significant speedup in this setting.

That is all from the same paper. You are doing what they warn against.

@hivebuzz 74

4 months ago

Congratulations @revisesociology! You have completed the following achievement on the Hive blockchain and have been rewarded with new badge(s):

	You have been a buzzy bee and published a post every day of the week.

You can view your badges on your board and compare yourself to others in the Ranking.
If you no longer want to receive notifications, reply to this comment with the word STOP.

@bala41288 80

4 months ago

I agree that AI does make a lot of errors, but it is surprising to read that the results were 19% slower.

I started using AI a lot for my coding needs and recently I did a project that took me just 3 weeks. Otherwise it would have taken 3 months to complete. If I had a paid version, I might have been able to complete the project in 3 days maybe.

But yeah, it's true that AI works for some people and not for others. It will take a lot more time to mature.

@revisesociology 82

4 months ago

I think maybe co-creating and using it selectively is the way to go!

@sagarkothari88 76

4 months ago (Edited)

Hey @revisesociology!

Now seniors have the wrong impression because of AI:
"Just get it done with AI. It's doable in 10 mins."
"Every other app in the world can do it, why can't you with AI?"

Expectations are no less than absurd.

Here is a !PIZZA for you 🍕 for calling it out

@revisesociology 82

4 months ago

I guess it's a little more complex than that in many cases!

@pizzabot 59

4 months ago

PIZZA!

$PIZZA slices delivered:
@sagarkothari88_(16/20) tipped @revisesociology

Come get MOONed!

@shainemata 65

4 months ago

Another thing that happens is that as we save more time, we fill it with other, often superfluous, activities. In other words, we come up with more to do to fill our time.

@revisesociology 82

4 months ago

Yes, fair point, and probably most of it totally unnecessary!
