How could an artificial intelligence (as in large-language-model-based generative AI) be better for information access and retrieval than an encyclopedia with a clean classification model and a search engine?
If we add a processing step – where a genAI “digests” perfectly structured data and tries, as best it can, to regurgitate things it doesn’t understand – aren’t we just adding noise?
I’m talking about the specific use-case of “draw me a picture explaining how a pressure regulator works”, or “can you explain to me how to code a recursive pattern matching algorithm, please”.
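To make that second example concrete, here is roughly the shape of answer I’d expect back; a minimal sketch of a recursive wildcard matcher in Python (the `?`/`*` pattern syntax is my own assumption, picked just to have something specific):

```python
def match(pattern: str, text: str) -> bool:
    """Recursively test whether text matches pattern, where '?'
    matches any single character and '*' matches any (possibly
    empty) run of characters."""
    if not pattern:
        return not text  # empty pattern matches only empty text
    if pattern[0] == "*":
        # '*' either matches nothing, or consumes one character and retries
        return match(pattern[1:], text) or (bool(text) and match(pattern, text[1:]))
    if text and pattern[0] in ("?", text[0]):
        # literal character or '?': consume one character from each side and recurse
        return match(pattern[1:], text[1:])
    return False

assert match("re*ve", "recursive")
assert not match("re?ve", "recursive")
```

The question is whether getting something like that from a chatbot actually beats finding the same thing, already vetted, in a textbook or a search result.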
I also understand how it can help people who don’t want to – or can’t – make the effort to learn an encyclopedia’s classification plan, or how a search engine’s syntax works.
But on a fundamental level, aren’t we just injecting an uncontrollable step of noise into a decent, time-tested information flow?
Well, the primary thing is that you can ask extremely specific questions and get tailored responses.
That’s the best use case for LLMs, imo. It’s less a replacement for a traditional encyclopedia (though people use it like that too) and more a replacement for googling your question and landing on a Reddit thread where someone explains it.
The issue comes when people take everything it spits out as gospel and do zero fact-checking on it; the way these models hallucinate is basically the problem I have with them.
If there’s a chance it’s going to flatly make things up, invent statistics, or just be entirely wrong… I’d rather use a normal forum and ask a real person who probably has a clue about whatever question I have. Or try to find where someone has already asked that question and got an answer.
If you have to go and fact-check the results anyway, is there even a point? At work I’m now getting entirely AI-generated pull requests with AI-generated descriptions, and when I challenge the dev on why they went with particular choices, they can’t explain or back them up.
That’s why I don’t really use them myself. I’m not willing to spread misinformation just because ChatGPT told me it was true, but I also have no interest in going back over every response and double-checking that it’s not just making shit up.
Google is so shit nowadays; its main purpose is to sell you things, not to actually retrieve what you ask for.
You mainly see this with coding-related questions; results were much better five years ago. Now the only way to get anywhere is to ask an LLM and hope it doesn’t hallucinate some library that doesn’t exist.
Part of the issue is that SEO got better and Google stopped changing its ranking to counter SEO manipulation.