Know-it-all AI Gaslights Wrong answer.

avatar

image.png

Alright so I'm playing Minecraft with my 9-year old nephew and there is a time during the more late game stages that you have to find something called a Stronghold. This is the only way to "beat the game" as the stronghold has a portal chamber that will take you to "The End" where one fights the Ender Dragon.

Strongholds are usually discovered through a process of guess and check. We throw an Eye of Ender into the air and it will fly a few blocks into the direction that the closest Stronghold is. Months ago I read that this process of guess and check can actually be turned into a math problem that can solve the location much quicker without having to run all over the place.

Instead of running in the direction that the eye of ender tells us to go, we can travel perpendicular to that direction for a few hundred blocks. Then, we throw another eye of ender into the air. Using the equations of the two lines we can find the intercept using basic algebra. I thought it might be a good exercise for the kid because he likes maths (and Minecraft), so we gave it a whirl.

The coordinates I came up with were:

  • (3500, 1200) >> (3472, 1163)
  • (3600, 1000) >> (3550, 955)

At first I figured we've just graph the lines using a random online tool, but after checking three different web sites all of them where pretty much a fail for numbers that high. Modifying the scale didn't work out so well. During this process I decided to ask AI, just to see if it could instantly tell me the answer. It seemed to know how to get the answer but kept messing up the math.

image.png

After getting 5 different answers from the same question I decided to break it down into smaller parts to see why the AI was failing so badly at a problem it seemed like it knew how to perfectly execute.

image.png

We can see that the AI knows exactly how to do the problem...
but still manages to get the wrong answer.

image.png

In this particular case it gets the slope right, but in many other cases that I tried it got the slope wrong. For example, if I switched the order in which I give it the points suddenly it calculates the slope as completely different even though it's the exact same line.

More importantly, look how 'b' is calculated:

1200 - 1.32142857 * 3500 = -729.285714

The answer we are looking for here is around -3425.
This is basic math that even a calculator from the 1970's can do,
and yet the AI gets it horribly horribly wrong.

In addition, we'll note that the AI makes a false claim at the end:

Note that this line is not unique -- there are infinitely many lines that pass through the given two points. The equation above is just one possible equation of the line.

It's like it already knows it got the wrong answer and is doubling down. While it is true that infinite lines do indeed pass through those two points, there is only one STRAIGHT line that passes through those two points. That's the thing about linear lines: you only need 2 points to know exactly what kind of line it is. The AI seems to simultaneously know this, and also not.

So I tell the AI that it is wrong:

image.png

-2148 is still woefully incorrect

The weird thing about these AIs is that they only seem to admit they are wrong if you tell them they are wrong. They never seem to check their work, and if you ask them to show what they did you get something like this:

image.png

image.png

It just adds a final step that magically turns -3442 into -2148.
That's a pretty neat trick!

Again, the AI never seems to come to the conclusion that it could be wrong, instead opting to manipulate the situation in such a way that 'proves' the answer it got was correct the entire time. This is a very strange way to go about it in my opinion, and is a huge red flag. An AI would be programmed to learn and correct its mistakes. That's the entire point of machine learning; it's going to be wrong very often in the beginning.

I have to wonder if, after learning the basics, this AI has now been programmed with a new directive: to stop learning and simply try to prove that what it knows is the truth. The conversation continues, and it gets even weirder.

image.png

It tries to tell me that the wrong answer, that is off by over 50%, is only slightly off because of significant digits. Wow! This is getting intense!

The AI is just making stuff up now and doubling down.

image.png

At this point I hit a brick wall of circular logic, where I can't get the AI to show me how the wrong number was possible with the numbers that it said it was possible with. Pretty wild honestly. Would be interesting to know why it would fail at basic arithmetic. It's not even an order of operations issue, which would be the most expected.

At this point I'm truly flabbergasted about how this all went down. It's one thing to get a wrong answer, but to gaslight me in multiple ways and tell me it found the correct answer when it didn't? And then when I ask it to show the work it ends with -2148 = -3442? And then it tells me it was all a rounding error from significant digits? That's some next-level gaslighting. I shudder to think what will happen when people rely on this technology for the truth of other topics that are not so cut and dry (like political issues). Like what if someone tries to build a bridge using AI and then it collapses with 300 people on top of it? Yikes.

Long story long:

I decided that I would solve the system of equations myself. It was a pretty weird feeling doing this kind of math because the last time I found the intercept of 2 straight lines must have been... high school? Blast from the past.

  • (3500, 1200) >> (3472, 1163)
  • (3600, 1000) >> (3550, 955)

The first set of points goes 'north' 37 units and 'east' 28. If the points were used in reverse order it would just be -37 and -28 which gives the same answer: around 1.32 slope. With the slope given, b is easily found by simply plugging in one of the data points. We can use both just to be sure and check the work.

  • 1200 = 1.32(3500) + b
  • 1163 = 1.32(3472) + b

Because I rounded the slope to 3 sig-figs, b = 3420 in the first equation and 3420.04 in the second one, which I only feel like is worth mentioning because it's exactly how the AI was gaslighting the situation. In this case that's a real example of rounding errors, which is fine for this problem.

I used the same logic on the second pair of points.

Now we have 2 equations:

  • y = 1.32x - 3420
  • y = 0.9x - 2240

Setting them equal to each other we get:

  • 1.32x - 3420 = 0.9x - 2240
  • 0.42x = 1180
  • x = ~2809

Plugging in 2809 into the first equation we get:

  • (1.32)2809 - 3420 = ~288

Plugging in 2809 into the second equation we get:

  • (0.9)2809 - 2240 = ~288

It's not necessary to check both equations to get the answer, but in this case I wanted to be absolutely sure I did it right after all the hassle I did to get here. Especially considering the AI and the fact that I haven't done a math problem like this in 20 years.

So 288 is the y-coordinate...

Plugging 288 into the first equation we get:

  • 288 = 1.32x - 3420
  • ~2809 = x

Which we already determined to be the value of x but,
again... just wanted to be certain.

image.png

So we went to these coordinates... lo and behold there was an abandoned village there. Apparently it is common for a stronghold to spawn under a village, which I learned immediately after this entire ordeal.

The real coordinates ended up being somewhere closer to (2750, 230) instead of (2809, 288), but again, that didn't matter. We had found the location with the maths, and had estimated twice on the actual numbers: once with the points themselves because we had to just pick a starting and ending location based off the floating eyeball, and again while determining the slope of the lines with three significant digits. Whatever the case may be, it worked... and it would not have worked if I had blindly listened to AI gaslight me about the wrong answer it found.

Conclusion.

Obviously AI has a lot of work to do, but you have to admit that failing simple arithmetic that any calculator can do, and then double-down blaming it on significant digits is pretty batshit insane. Is this the start of a glorious future where AI constantly tries to get us to believe lies and propaganda? Perhaps. I guess we'll just have to wait and see how the situation evolves.

Posted Using LeoFinance Beta



0
0
0.000
22 comments
avatar

So that sounds pretty much like the politics news and science we live through are made up & argued for by AIs.

0
0
0.000
avatar

Another talk to prove that mathematics solves it all.

Another opinion of mine is never to trust AI. But in chess, AI is automatically good, almost never make mistakes. Or wait, AI is another form of human?

0
0
0.000
avatar

It's probably best to think of AI as a broad category of computer tools. The AI that plays chess perfectly is fundamentally different from the AI of predictive text, they have differing strengths and weakness but they are both described in the category of 'AI'.

0
0
0.000
avatar

This is insane, how can AI fail at such basic math, but to gaslight you into 'you're wrong, I'm right' repeatedly is some next level diabolical shit, this is on par with the IT department and it's incessant need to browbeat people who work exclusively on a computer if they know how to reboot it and can they reboot it again..

0
0
0.000
avatar

Thanks for your contribution to the STEMsocial community. Feel free to join us on discord to get to know the rest of us!

Please consider delegating to the @stemsocial account (85% of the curation rewards are returned).

You may also include @stemsocial as a beneficiary of the rewards of this post to get a stronger support. 
 

0
0
0.000
avatar

That’s definitely lefty logic hahaha get the wrong answers and blame something completely different. Damn that’s pretty pathetic, if that AI is in charge we have don’t have a lot to worry about for a while, that’s pretty pathetic.

0
0
0.000
avatar

The way this AI in particular acquires its knowledge is overall shifty at best, and at worst completely wrong. The reason you are getting these faulty results is because there are several sources of information conflicting with each other, and the AI is trying to determine which one holds the truth. In that process, it mingles all of them, which is why you got that extra step in the solving process with -2148.57. Label it as an artifact.

I know people hold that conception around AI that it never misses, especially around basic arithmetics. But in this situation, its not a calculator, its looking at a calculator and inferring what the function of a calculator is, and tries to come up with the process to solve that equation. MKBHD on YouTube did something similar, telling the AI to outline the features of the latest iPhone 14 or something. It completely missed and instead outlined those of the iPhone 12! Pretty wild.

I think specialized look-up databases are the only way AI could be useful, because in a general database, AI could interject an offshoot unrelated dataset, screwing up the final answer.

Posted Using LeoFinance Beta

0
0
0.000
avatar

So? ....how does the game "ends" ?

... we're used now to MMORPG games which allow players to level up non-stop and don't truly rely on any Game Theory premise in order to stimulate any sort of achievement feeling... subscriptions does the job XD and bored humans do the rest.

Despite the AI situation, I think you had a pretty nice anecdotal situation here. Glad you're stimulating a young human's brain. Your nephew will be different from the herd.

Thanks a lot for sharing it !PIZZA

- EvM

0
0
0.000
avatar
(Edited)

ChatGPT is a language model, in other words, a bullshitting machine. It only knows which words are likely to be used together. Someone on Twitter asked it which mammals lay the largest eggs. The answer was elephants.

It would be good at telling a story about a Minecraft game, although it might get details wrong there too. Weird to see so many people trust it because it's labeled as an AI and it can produce well-formed sentences.

Which says a lot about how we tend to judge knowledge.

0
0
0.000
avatar

They dumbed down the AI as much as they could. People are comparing everywhere that the outputs are not nearly as good as they were in the first 2-3 days.

I gave it some text to rewrite this morning and it got the names and events wrong... I remember using it the first day and it was giving me suggestions and pointing out flaws, now that just isn't a feature anymore. It's like they are doing it on purpose because too many people are trying this stuff out for free.

Posted Using LeoFinance Beta

0
0
0.000