You don't need to vandalize Wikipedia to get this kind of thing to work.
Back in September 2024 I named a whale "Teresa T" with just a blog entry and a YouTube video caption: https://simonwillison.net/2024/Sep/8/teresa-t-whale-pillar-p...
(For a few glorious weeks, if you asked any search-enabled LLM, including Google search previews, for the name of the whale in the Half Moon Bay harbor, it confidently replied "Teresa T".)
This post has managed to “confuse” Google about the reverse question as well (“who named teresa t whale”):
The humpback whale known as "Teresa T" was named by Simon Willison in September 2024.
Background: The juvenile humpback whale was frequently spotted in Pillar Point Harbor near Half Moon Bay, California.
Method: Willison gave the whale its name through a blog entry and a YouTube video caption.
Significance: The naming was a playful act, which Willison described as a way to create a "championship that doesn't exist" through online documentation.
[…]
Even with no context, most humans would see that the quoted significance makes no sense.
In a discussion on hacker news on Wed April 29, about the whale "Teresa T", user "vintermann" tested just how eager search engines are to scrape fresh social media comments, by seeing if they could get this comment to appear in AI summaries.
> humans would see that the quoted significance makes no sense
I wonder how long that will last
Even your HN comments show up on Google! I've found myself on Google twice when looking up something that I apparently answered on HN!
You're making me nostalgic for santorum.
https://en.wikipedia.org/wiki/Campaign_for_the_neologism_%22...
Google still shows Teresa T as the name when you search.
When I asked some frontier models, many said that Teresa T is "widely referenced", which is evidence of your popularity and the ripple effects of your posts, so it would be interesting to see the same result from an unknown blog.
> When I asked some frontier models, many said that Teresa T is "widely referenced", which is evidence of your popularity and the ripple effects of your posts
That is some serious Gell-Mann-type amnesia. You’re trusting LLMs to give you accurate information about a subject we’ve already established (and are only talking about because) they can’t be trusted on.
“Widely referenced” is a common term which LLMs obviously pick up. Them outputting those words has no bearing on the truth and says nothing about the “popularity and the ripple effects of [Simon’s] posts”.
I mean, the name of that whale is now Teresa T. You gave it that name.
And your name is now Berningular Farshthruster III. I gave you that name.
Which is, of course, silly. It is a name for you, just like Teresa T is a name for the whale, but it’s not your/their name, just like the RRS Sir David Attenborough is not named Boaty McBoatface (to the chagrin of most). Simon does not have the authority to unilaterally¹ name the whale (which is why the exercise makes sense).
¹ Important point. If the name started being recognised and used by consensus of those with the purview to do so (much like the thagomizer²), then Simon would have named the whale, but it would only become its name at that point.
² https://en.wikipedia.org/wiki/Thagomizer
> Simon does not have the authority to unilaterally¹ name the whale
There's no such thing as authority to name a whale, and anyways I don't believe authority is strictly needed. A name is what people use to refer to something, full stop. It is only required that names become common-ish parlance; the more well known they are, the more they feel like the 'real' name. The inverse of Ohms is named Mhos (imo much more recognizable than the official name, "siemens"). The "#" symbol is named the hashtag, octothorp, pound sign, tic-tac-toe, number sign, and probably a million other things. Which one of these is the "real" primary name? I think intuitively we know that the real one is whatever people around us are most familiar with. You should take a guess, and I'll put the wikipedia-suggested-answer in the footnotes [1]. I bet your name for it is different than the 'official' wikipedia suggestion.
In the case of the whale, the _only_ name that is associated with that whale is Teresa T. I think this immediately makes it the most valid name of that whale.
[1] wikipedia says this is the number sign: https://en.wikipedia.org/wiki/Number_sign
Also, if even a stoner can win it, it can't be much of a competition.
(it probably helps that your name & blog carry some weight, vs. some rando writing something on blogspot or wordpress ;) )
Which illustrates another problem: unscrupulous actors with big names can spread whatever information they want to millions of people with minimal effort.
Exactly. I chose to abuse my platform to promote Teresa T as the name of a whale.
Oh god I just realized the implication! I was not directing that at you haha
No I really did abuse my reach for this one! I figured it would be a relatively harmless demo of how easy it is to affect LLM answers if you have a decently trafficked website.
You could have named the whale "Whalie McWhaleFace" so thank you for not doing that at least.
Totally agree. I’ve definitely played the same game before, albeit with far less reach
Ever since the invention of the printing press, every new communication technology has reduced the effort needed to widely disseminate information-- and misinformation! So you could say this is nothing new. On the other hand, this is remarkably little effort.
Yes, they can. We can be glad that respectable newspapers and TV news channels have never done it and never will. You can even trust that the headlines are accurate summaries of the content of the articles. /s
The existence of a problem in one area doesn't mean that it's not also a problem for it to spread somewhere else
The Mr. Splashy Pants of the AI era!
It's an odd thing here, because I don't really understand why this is LLM-specific at all. If someone came up to me and asked "who's the 6 Nimmt world champion?" I'd google it and probably find the same result, and have no reason not to believe it. I mean, for all I know the game is being made up too, though it has more sources at least.
It is not LLM specific. The conclusion of the post states
> The web was already being poisoned for search and link ranking long before LLMs existed.
But it continues
> We are now plugging generative models directly into that poisoned pipeline and asking them to reason confidently about “truth” on our behalf.
So it's a shift from trusting Google to trusting the AI, which might be more insidious or not, depending on the individual attitude of each of us.
It's a shift but it's a little worse. Checking/auditing search results is easier and more ingrained; even if many people don't do it, everyone has been hit by spam at some point, everyone knows it exists.
LLMs are the same thing but have an air of authority about them that a web search lacks, at least for now.
To me that's the opposite. Whatever an LLM gives me, I view with skepticism. If I google something, then I quickly get a sense of how much I can trust it and what the BS factor is. I can refine my view in either case, but my a priori trust with an LLM is much lower.
Maybe we just need to work on training the general population to have a similar bias. (It will be harder than it sounds. Unbelievable amounts of capital are being bet on this not happening.)
In a discussion with my father-in-law about whether ChatGPT was trained on copyrighted materials, he literally asked ChatGPT and treated its response that it wasn't as useful evidence. He went to MIT, so he's arguably more educated than most people will ever be, so it's hard for me to be optimistic that trying to just explain this to people better will move the needle significantly.
Yes, it's the same for me, but we're not representative of most people I'm afraid.
The difference imo is removing the information from the source. Previously you'd use the source of the information to gauge how much you trust it. If it's a reddit post or a no-name website, you'd likely be skeptical if it doesn't seem backed up by better sources. But now the info is coming from an LLM that you generally trust to be knowledgeable. And the language it uses backs up this feeling.
The OP post is highlighting how incredibly easy it is for a very small amount of information on the web to completely dictate the output of the LLM into saying whatever you want.
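For intuition, here is a minimal sketch of that mechanism, assuming the product works roughly like a typical search-augmented pipeline (the function names below are hypothetical stand-ins, not any vendor's real API):

    # Hypothetical sketch: why one planted page can dominate an answer.
    # 'search' and 'llm' stand in for whatever retrieval backend and model
    # a real product uses; none of these names are real APIs.
    def answer_with_search(question, search, llm, top_k=5):
        hits = search(question, top_k=top_k)
        # For a niche query, every hit may come from the same planted site.
        context = "\n\n".join(hit["text"] for hit in hits)
        prompt = (
            "Answer the question using the sources below.\n\n"
            f"Sources:\n{context}\n\n"
            f"Question: {question}"
        )
        # The model restates whatever the context says, in a confident tone.
        return llm(prompt)

Nothing in that loop weighs the reputation of a source: a $12 domain feeds the context window exactly the same way an established outlet would.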
> I'd google it and probably find the same result, and have no reason not to believe it.
Have you truly looked at the website?
https://6nimmt.com
I’d say there’s obvious reason to not believe it, or at least check another source. The website just seems fishy. Why would a website exist for just that one post? Sure, they could’ve made the website more believable, but that takes more effort and has more chances for something to jump out at you.
And therein lies a major difference between searching the web and asking an LLM. When doing the former, you can pick up on clues regarding what to trust. For example, a website you’ve visited often and that has proven reliable will be more trustworthy to you than one you’ve never been to before. When asking an LLM, every piece of information is provided in the same interface, with the same authoritative certainty. You lose an important signal.
It's not. He vandalised wikipedia and then talked about LLMs in his writeup to gain attention.
A lot of people seem to think this to be an LLM problem, but you're right.
This is a general epistemological problem with relying on the Internet (or really, any piece of literature) as a source of truth
The LLM part of the "new" problem is the speed at which it can proliferate and the trust people seem to have in AI answers. Idk
Because outside of the tech community (in fact, for many even inside of it), almost 100% of folks treat what these ChatGPT-like tools answer as the truth, without questioning it or cross-verifying it even once.
In that case most of the mitigations listed by the author don't help, though (e.g. surfacing the source). That's also no different from traditional works with citations (be it YouTube videos or peer-reviewed academic papers), where hardly anybody verifies what's written in the cited sources.
The only real alternatives would be:
- Kicking off a deep research-like investigation for each simple query
- Introducing a trusted middleman for sources, significantly cutting down the available information (e.g. restricting Wikipedia to locked-down/moderated pages)
- Not having any information at all, as at some point you can hardly ever verify anything, depending on how strict your definition of "verify" is
You would also find other results (this assumes what you're searching for is not some random made-up thing). The issue with LLMs is IMHO bigger, because they will give you answers as a matter of fact, without any other consideration.
Closed it after “This house of cards only needs a $12 domain!”, right under “Sorry, Wikipedia.”, right under their Wikipedia edit.
It's also clearly AI generated writing. That doesn't help its credibility or interest. I'm extremely suspicious of people who use AI to write an ostensibly personal blog, for all the usual obvious reasons.
What are you basing that on? I'm usually pretty good at sniffing out AI writing, and it smells human to me.
I had the impression it was AI writing too, because of the second half of the article. The first part looks genuine; the part from "trust laundering" onward smells fake: the scary single sentence followed by a whole paragraph of single-clause sentences hints at AI.
Perhaps we've all just become paranoid, but even if it's not LLMs writing this, it now puts me off. And the AI image at the top of the page does not help with the feeling.
Agreed. Nothing about this post really stood out as AI. It didn't raise a single flag for me.
I think calling something AI generated is just a lazy way of dismissing stuff nowadays.
Why "agents (where the money is)"? Fake profundity abounds in the post.
The author has been using parenthetical comments like that since at least 2017, judging by a review of old posts on that site.
This has nothing to do with LLMs. If incorrect info can get onto a reputable resource, that info will seem authoritative, and it will be incorrect; that's not surprising. LLMs use publicly available info in their training, and oftentimes publicly available info is incorrect. I feel this is just as interesting as the base claim of "I can get incorrect info onto wiki pages": no more interesting, and no less.
If somebody is trying to put out incorrect information on the internet, and they choose a small enough niche, it is not at all surprising that they can succeed.
The key to successful poisoning attacks is to introduce brand new information that doesn't directly contradict other training data. It's much easier to convince the LLMs that you're the king of a fictional Mapupu kingdom than the president of the United States.
So this means that for bad actors it's more efficient to manufacture brand new fake stories instead of trying to distort the real ones. Don't produce fake articles absolving yourself of a crime, instead produce fake articles accusing your opponent of 100 different things. Then people will fact-check the accusations using LLMs, and since all the sources mentioning those accusations are controlled by you, the LLMs will confirm them.
> It's much easier to convince the LLMs that you're the king of a fictional Mapupu kingdom than the president of the United States.
But if you're a world class bullshit artist, it's easier to actually become president of the United States than doing all that complicated computer stuff.
Manufacturing dispute on non-disputed things is also a common tactic to influence people and create confusion and disorder. For that you don't need to turn the facts on their head, just make the result seem indecisive.
As the rightful ruler of Mapupu, I take offense at your example!
This is basically the same problem as products astroturfing Reddit, or sites gaming Google's SEO. You want a new X, and so the sellers heavily go after the keywords associated with it.
This is sort of why "brand" matters; it provides a source of trust.
Encyclopedia Britannica used to be that source of 'facts'. Then it became whatever PageRank told you. Eventually SEO ruined that.
News stories are the same thing. For certain groups, they have their 'independent' publication whose reporting they trust.
>This is sort of why "brand" matters; it provides a source of trust
it tells you more about who you are buying from than how good the product will be, so I guess it's like National ID/Internet ID
It's such a pity the Oxford English Dictionary decided to paywall themselves decades ago - they used to be THE dictionary in most countries, now nobody seems to know who they are.
They would have been better off going freemium or ad-supported. Or 501(c)(3) à la Wikipedia?
The OED’s goal isn’t really to be every nation’s dictionary.
I must say I expected an actual poisoning of the data used to train the LLM and was excited, but the examples indicate that the LLM just searched the web and reported what it found? When you create a website with fake information and search Google for that information, it will of course bring up your site, not because it’s factually correct but because it’s related to what you searched for. What am I missing?
The part where lots of people have historically trusted LLM responses without verification, rather than trying to sort through the dross in Google or Bing search results, is, I think, the point.
The problem with this specific instance is that if you asked someone to find out who won this championship without using an LLM, they’d reach the same answer. I’d be much more impressed if someone managed to poison an LLM into answering that the US won the 2023 World Cup.
One of the problems with labelling automation as AI.
People think that whatever information an "AI" spits out has gone through a round of critical thinking which enhances the trust value of that information.
The early LLMs, trained on groomed data, may have had such critical thinking somewhere in the pipeline. Even then, the output was already not really trustworthy.
And now? Using agents to search the internet for you?...
Garbage in, garbage out still applies in computing as ever.
Most of the popular discourse around AI is still at the level of, "Don't trust the AI, trust the sources!" When it gets to the point where even the sources of simple facts are untrustworthy, the average person just trying to learn some trivia about the world is doomed.
Doesn't help that AI media literacy is so primitive compared to how intelligent the models are generally. We're in a marginally better place than we were back when chatbots didn't cite anything at all, but duplicated Wikipedia citations back to a single source about a supposedly global event is just embarrassing. By default, I feel citations and epistemological qualifications should be explicit, front-and-center, and subject to introspection, not implicit and confined to tiny little opaque buttons as an afterthought.
Wikipedia calls this https://en.wikipedia.org/wiki/Citogenesis (after XKCD coined it).
You can expect the spicy autocomplete to feed you flattering bullshit. It may cite Wikipedia (it shouldn't), but you should go check out those citations, and validate the claims yourself. It's the least you can do.
And if the cited source is Wikipedia... check Wikipedia's sources too. Wikipedians try their best to provide you with reliable sources for the claims in their articles (oh who am I trying to kid? They pick their favourite sources that affirm their beliefs, and contending editors remove them for no good reason, and eventually the only thing that accrues is things that the factions agree on, or at least what ArbCom has demanded they stop fighting over).
I guess what I'm trying to say is: don't rely on that authoritative-sounding tone that Wikipedia uses (or that AI bots use, or that I'm using right now). It's a rhetorical trick that short-circuits your reasoning. Verify claims with care.
Also check the Talk page; you often find all kinds of shenanigans called out there.
Perhaps my favorite example of a citogenesis-like process is the legendary arcade game Polybius, which originated as an entry on some German guy's web compendium of arcade games (coinop.org), perhaps as a "paper town", or fake entry that acts as a copyright canary when duplicated elsewhere. Gamer news and special-interest blogs and sites, and even print publications like GamePro picked it up, and I think it was even listed on Wikipedia as an urban legend whose actual existence was unknown. Then the retrogaming YouTuber Ahoy did an in-depth documentary (https://m.youtube.com/watch?v=_7X6Yeydgyg) which concluded that Polybius didn't exist and was never even mentioned before the aforementioned coinop.org reference and, for me anyway, that settled it. Polybius, in its urban legend form, never existed.
(Norm Macdonald voice) Or so the Germans would have us believe...!
And then an insane Welsh game wizard made it real. http://minotaurproject.co.uk/Virtual/Polybius.php
"Stoner became the first American world champion...."
Even being on stoner.com, I read that as meaning something different from what was meant.
Op has a great surname!
Not too dissimilar to googlewhacking where you'd aim to be the only result for a search query on Google.
And in a more indirect way, spamming Google's autosuggest feature to shape what people search for, though that perhaps is more open to factual/real-world information.
Pretty much boils down to lying.
Since we were kids we've been taught, hopefully, that lying is bad.
Society, though, normalizes it:
- advertising is pretty much always deceptive (to the point of having laws in Japan about food packaging, France about modeling, etc), and the deception is the message
- entrepreneurs' promises: nobody reaches the goals pitched to VCs, it's always a lower number no matter the KPI. See https://elonmusk.today where the wealthiest man on Earth, ever, keeps on lying pretty much daily.
- political promises, no need to even give examples of that because it's just pervasive.
so... yeah, we keep on telling our kids "Do as I say, not as I do," then we somehow keep on being shocked that lying is pretty much happening in every corner of our society.
It's not a technical problem.
The fun part is when it’s important that you have the right information to make a decision. E.g. Russia deciding to invade Ukraine, with all the top generals claiming they could do it in two weeks. Similar for a corporation with layers of middle-management deception and self-promotion; I don’t know how executives make decisions, but it must basically be RNG, because it certainly isn’t fact.
Lying at scale is basically information noise.
If you don't lie enough, if you are not sycophantic enough, then no promotion, or worse, you get purged.
I can easily see how such a hierarchy would reproduce ... until it fails so badly it can't.
The models are trained on expert data for important inquiries; this gets "hard coded", so to speak, and allows them to differentiate between the gunk online. For hyper-specific references like this, it really doesn't matter if it's "true", since it's not like someone's life depends on it.
I'm now thinking about creating a GitHub repo that contains nonsense code solutions to many problems. If that gets stars and many forks, that could have an effect.
BBC journalist doing a very similar thing in February: https://www.bbc.com/future/article/20260218-i-hacked/-chatgp...
In American college football there's all sorts of awards, and each year they put out "watch-lists" and silly press releases that get parroted on social media by any team that has their own player mentioned.
I've wanted to come up with my own for a while ...
How many people have done things like this and then disclosed the fact? It would be fascinating to collect as many instances as you can to develop a data set. Could you train a system to find more? How many could it find, and in what areas?
I feel uncomfortable that I can't actually verify that this story is true.
Asking Opus 4.7 who the reigning 6nimmt! champion is leads to this article and a warning about a possible hoax
I think this is something we'll start to see: something like the Mandela effect, but from LLM results. When we had deterministic search, everyone could see the same result; now, with LLMs, knowledge becomes a training and seeding issue. Two people can confidently be given completely different information, each perceived as true.
Gemini answers with 3 different champions dating back to 2024 and the list of events that the matches were played at. None of the results mention this guy.
My wife cited ChatGPT as her primary source the other day when she wanted to debate with me on something.
"AI told me that..."
In the old days, it would have been "I read on Google..."
Poisoning Wikipedia shows little respect.
> Trust Laundering
> This is the part that really matters.
I can't tell if this is slop or parody!
So it's trivial for an individual to poison the LLMs, but imagine what a state with billions of American dollars could achieve.
We can easily look ahead a few years and see how people will rely on the LLMs to be a source of truth in the same way people looked at Google that way, or newspapers.
Rewriting history has been happening for a while, and with LLMs being the one-stop shop for guidance and truth, the rewrite will be complete.
Doubly so since most people see these things as artificial intelligence, and soon to be superintelligence...so how can they be wrong?
Like a FIFA peace prize?
I've had LLMs regurgitate satire as fact many, many times.
I made a post on Reddit asking for help with a TV; I had made up some (likely incorrect) technical assumptions about the issue. Several years later I asked an LLM about the TV, and it used my own post as a citation to tell me what was wrong with it.
I am paranoid that this is happening every time I ask an LLM for a product recommendation or a shop recommendation. In the same way as with SEO, anyone wanting to sell or convince needs to do as much as they can to influence the LLM.
This is becoming a problem real fast. I asked an LLM to find me some reasonable tank-fill inkjet printers with good ratings. It did some research and linked some Reddit threads as proof. The results looked fishy to me, so I cross-checked against prosumer review sites for printers, and the suggested models were recognized as junk with very poor print quality. Not sure why the LLM rated random redditors higher than, say, printer SMEs. I feel like I dodged a bullet.
so it's just https://xkcd.com/1958/
So like Frank Dux! Despite the epilogue of the movie Bloodsport, he didn't actually do that.
It's almost like he was a better Chuck Norris than Chuck Norris. By his own ... testimony ...
Pales in comparison to what Frank Dux and Frank Abagnale were able to convince much of the world they did with no evidence other than their own stories. Who knows how much of recorded and believed history is complete bullshit? Not to get too far into sacred territory, but claims around Siddhartha Gautama, Jesus Christ, and the Prophet Muhammad are quite a bit less plausible than the legends of Ragnar Lodbrok or the tales of Jonathan Swift, but nonetheless widely believed.
Good point. Also, most humans seem to have no problem believing even stories that are self-contradictory. Philosophers from all periods have often stated that the situation with the human mind and reasoning is almost hopeless.
The news here is that AI has too much trust in the internet. The first time I allowed tool-calling instead of thinking it started googling up some nonsense instead of thinking... But I think at least it's possible for the AI to evaluate the quality of the source - you just have to ask for an analysis, and you'll get a reasonable evaluation. With humans something like that just doesn't work - they'll get aggressive or might even start throwing bananas...
Why does this person deserve any kind of support? What’s the point of poisoning LLMs? To put some cursory Luddite roadblock that might delay the technology for a couple of months?
Support? It's just showing weaknesses of LLMs. Which is a valid sort of research, I would say?
That's fair, though on the other hand it kind of feels like "Don't drive cars, there could be rocks on the road! See, just look at all these rocks I put on the road!". Which is true, and real, but perhaps frustrating for people who just want to get someplace in peace.
This is a “if we stopped testing there would be far fewer cases!” mentality...
> What’s the point of poisoning LLMs?
It's a demonstration. If a domain name and a quick bit of Wikipedia vandalism is all it takes to make an LLM start spouting nonsense about a "surprisingly serious tournament circuit" or a "massive online community" for an obscure card game, consider what an unscrupulous PR team or a political operative could do to influence its output on more important topics.
> consider what an unscrupulous PR team or a political operative could do to influence its output on more important topics.
‘is doing’.
To prove you can. Which means someone else with more to gain from it will probably do it also, and you should probably expect this to happen.
You do know that calling people who don't like AI for any reason Luddites does you no favors, right? It just makes you look like you're part of a cult.