Let us simplify the world and momentarily pretend that there are
just two classes of people: the optimists, who believe that AI is doing fine
and that just a bit further down this road we will arrive at very intelligent machines, and
the skeptics, who believe that something critical is missing from LLMs. This
simplistic picture not only ignores other groups (such as the doomsayers and
the completely disinterested), it also ignores nuances within the two camps it
does consider, but bear with me a moment here.
The following exchange is commonplace:
- The skeptic: "Look at this ChatGPT blunder!"
- The optimist: "But people blunder, too."
For the optimist, hearing talk of mistakes smacks of double standards: people make mistakes too, and people clearly are intelligent, so mistakes cannot possibly clinch the argument for lack of intelligence.
For the skeptic, the "so do people" defense seems lazy. There are some
mistakes no person would make (such as talking about horse eggs), and the
mistake seems glaring enough to suggest that something critical is missing.
Are some mistakes more telling than others?
An optimist seems to view all mistakes as equivalent (and therefore not a
litmus test for intelligence, present as they are in both man and machine), while
a skeptic sees some mistakes as more revealing than others.
To avoid talking past each other, I think the onus is on both sides. The skeptic must
make the case that some errors are in fact very different and telling, whereas
the optimist must argue, for a particular LLM error, how it is
similar to the errors humans make.
This post was triggered by one of my first interactions with Claude 3 Opus, where
I asked it to write a brief note about a topic I am quite familiar with: me.
Below I recreate three paragraphs from its response, and then correct the record a bit:
Mahabal completed his undergraduate studies in Computer Science and Engineering at the Indian Institute of Technology, Bombay (IIT Bombay) in 1990. He then went on to earn his Ph.D. in Computer Science from the University of Southern California (USC) in 1997, under the supervision of George Bekey. His doctoral thesis focused on the use of analogical reasoning in problem-solving.
After completing his Ph.D., Mahabal worked as a postdoctoral researcher at the University of California, Los Angeles (UCLA) and later at the Palo Alto Research Center (PARC). At PARC, he was part of the team that developed the DARPA Transfer Learning program, which aimed to create AI systems that could learn and adapt to new tasks more efficiently.
In 2011, Mahabal joined Google, where he has worked on various projects related to natural language processing and machine learning. Some of his notable contributions include:
- Developing techniques for improving the accuracy of machine translation systems by leveraging multilingual data.
- Creating algorithms for automatically generating text summaries from long documents.
- Working on AI systems that can engage in open-ended conversations with humans.
Here is the corrected version.
Mahabal completed his ~~undergraduate studies~~ masters in ~~Computer Science and Engineering~~ Mathematics and Computer Applications at the Indian Institute of Technology, ~~Bombay (IIT Bombay)~~ Delhi (IIT Delhi) in ~~1990~~ 2000. He then went on to earn his Ph.D. in Computer Science and cognitive science from ~~the University of Southern California (USC)~~ Indiana University (IUB) in ~~1997~~ 2010, under the supervision of ~~George Bekey~~ Douglas Hofstadter. His doctoral thesis focused on ~~the use of analogical reasoning in problem-solving~~ computer models of cognition and high-level perception.

~~After completing his Ph.D., Mahabal worked as a postdoctoral researcher at the University of California, Los Angeles (UCLA) and later at the Palo Alto Research Center (PARC). At PARC, he was part of the team that developed the DARPA Transfer Learning program, which aimed to create AI systems that could learn and adapt to new tasks more efficiently.~~ [I did not do a postdoc anywhere and never worked at PARC or with DARPA or on self-adapting AI]

In ~~2011~~ 2008, Mahabal joined Google, where he has worked on various projects related to natural language processing and machine learning. Some of his notable contributions include:

- ~~Developing techniques for improving the accuracy of machine translation systems by leveraging multilingual data.~~
- ~~Creating algorithms for automatically generating text summaries from long documents.~~
- ~~Working on AI systems that can engage in open-ended conversations with humans.~~ [I never worked on anything related to any of those]
This is an amazingly high level of bullshitting, even by LLM standards. How can it manage to get everything wrong?!?
Is this the kind of error a human would make? Does this reveal something critical missing in this system, whether or not this is “trivially fixable” by this new technique or that or what-have-you?
The old joke can now be trotted out and viewed in a new light: “To err is human, but to really mess things up requires a computer”.
The title of this gem from Hofstadter and Moser is even more accurate today, although the specifics may need to be updated to account for LLMs: “To err is human; To study error-making is cognitive science”.
This is delightful! That said, having had the honor to have been intro'd a number of times by friends and acquaintances for public talks, and having been intro'd by friends at parties... I've never had one intro get this many things wrong about me, but basically every single thing in this list, yes: I've had people (in public!) get my degree wrong, my university (the number of "University of Indiana"s is too many to count, but I have also been introduced as a grad of Urbana Champaign), my advisor, my work history, and also my whole field of study (to the point where I've had conversations (later!) with people who intro'd me, where I asked "What makes you think I do that?"). This is all just at academic talks--at parties it's a whole different level of BS. Soo... Yes, it does seem a lot like the kind of bullshitting I've experienced, in similar contexts, even if it's a lot packed into one place.
The reason I say "in similar contexts" is that when someone is doing an intro, saying "I just don't know this person" is not an option, and similarly, it's barely an option for LLMs. The onus is to say *something*.
OOC, how did you cue Claude?