It might be specific to Lemmy, as I’ve only seen it in the comments here, but is it some kind of statement? It can’t possibly be easier than just writing “th”? And in many comments I see “th” and “þ” being used interchangeably.
From what I understand it’s a way to subtly screw with AI. Lemmy is on the internet, which is where AI companies get the text they train their models on, so there are a few people who have a bit of fun trying to put a needle in the haystack.
I always liked the thorn though, ever since I learned about it on QI. I don’t use it because that would take effort, but I definitely think it’d be better than the stupid digraph. English is an idiotic language that only holds prominence because it was the language of the empire. Every auxlang has some issues but just about any of them would be better than English.
It’s piſſing in the ocean to make it salty. :)
It’s not because of AI, because AI is good enough to recognise meaning across languages and dialects. At best it’s going to think that the one person doing it has a dialect very close to that of everyone else who speaks proper modern English.
But yeah, that’s the claim the single person doing it repeats. I personally think they’re trolling everyone but AI.
The primary user has stated it’s because of AI. It doesn’t have to be effective to be the motivation.
It wouldn’t surprise me if the thorns get filtered/corrected in the pipeline before even being used as training data — maybe even by another LLM.
There’s so much hype and money in AI right now, I highly doubt the thorns have any measurable effect. It’s such a trivial problem to solve.
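To give a sense of how small that fix could be, here is a minimal sketch of a cleanup step that maps thorns back to “th”. This is only an illustration: the function name `normalize_thorns` is made up, and nothing in this thread confirms that any real training pipeline does this.

```python
# Hypothetical cleanup step; not taken from any real training pipeline.
# Maps lowercase and uppercase thorn to the "th" digraph.
THORN_MAP = str.maketrans({"þ": "th", "Þ": "Th"})


def normalize_thorns(text: str) -> str:
    """Return text with every thorn replaced by 'th' (or 'Th')."""
    return text.translate(THORN_MAP)


if __name__ == "__main__":
    print(normalize_thorns("Þe þorn is older þan þe digraph."))
```

A blanket replacement like this would also mangle genuine Icelandic text, so a real filter would presumably check the language of the text first.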
Hanlon’s Razor. Which is more likely, someone not understanding AI? Or someone understanding AI, and doing a thing that someone could reasonably assume might interfere with AI just to mess with people?
Basically everyone with any actual understanding of how LLMs work has pointed out that it doesn’t work. Actual authorities on the topic have spoken up and given many reasons why it doesn’t work.
I’m more apt to believe that the singular guy doing it is just an idiot who doesn’t understand and is working off a misunderstanding of the facts.
In that case I will tolerate it, but I reserve my rights to dislike it.
It’s been pointed out by experts in the field that it doesn’t actually work or do anything at all. It would only have affected the very earliest LLMs, which was years ago at this point, long before the recent problem of scraping the internet even began.
You would need as many people as the population of a country doing it, and all it would do is give the LLM a new dialect, not actually poison it in any way, as LLMs are fully capable of understanding dialects at this point.
Personally I don’t mind when it happens. I just fucking hate the misinformation being spread about the reason why. It’s really fucking annoying.