The big names in artificial intelligence—leaders at OpenAI, Anthropic, Google and others—still confidently predict that AI attaining human-level smarts is right around the corner. But the naysayers are growing in number and volume. AI, they say, just doesn’t think like us.
The work of these researchers suggests there’s something fundamentally limiting about the underlying architecture of today’s AI models. Today’s AIs are able to simulate intelligence by, in essence, learning an enormous number of rules of thumb, which they selectively apply to all the information they encounter.
This contrasts with the many ways that humans and even animals are able to reason about the world, and predict the future. We biological beings build “world models” of how things work, which include cause and effect.
We all understand the “brute force” method computers can use to crack passwords or find other solutions; it appears AI is just doing the same thing one level down: learning “rules” and applying them, but occasionally applying them inconsistently, which leads to errors and/or hallucinations.
One example in the piece is a simulation getting AI to note the “correct turns” on the streets of Manhattan, which it mostly does well. Yet oddly, it sometimes directs drivers to go diagonally through Central Park or exhibits other strange behaviors.
The piece also notes the huge volume of material AI requires to learn even the smallest thing, which humans pick up so much faster. Again, driving a car into a human walking on the street is something people intuitively know to avoid, but AI only “knows” it once it figures out the rule, which becomes important for edge cases that happen infrequently.
The second article is about companies using AI but not exactly prospering from it, at least not yet:
Companies Are Struggling to Drive a Return on AI. It Doesn’t Have to Be That Way.
As of last year, 78% of companies said they used artificial intelligence in at least one function, up from 55% in 2023, according to global management consulting firm McKinsey’s State of AI survey, released in March. From these efforts, companies claimed to typically find cost savings of less than 10% and revenue increases of less than 5%.
I haven’t read the article yet. But what it seems to get wrong is that the best practice for training AI is to find a balance between model size (total number of layers and parameters), the size of the dataset(s) used for training, and accuracy. In the optimal case you are minimizing memorization.
This is easiest to explain with computer vision, but it also applies to LLMs. You train with many images of each class of object, but then you test with a subset of the dataset (typically 10-15%) that you did not train with. Thus you are training with the specific goal of generalizing, not memorizing.
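To make that concrete, here is a minimal sketch of the held-out-split idea (my own toy example, not anything from the article), using scikit-learn’s bundled digits dataset as a stand-in for “images of each class”:

```python
# Minimal sketch of a held-out split: train on most of the data, then measure
# accuracy only on examples the model never saw during training.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# Hold back ~15% of the data as a test set, as mentioned above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)

# The gap between these two numbers is a rough measure of memorization:
# high train accuracy with much lower test accuracy means the model memorized.
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy: ", model.score(X_test, y_test))
```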
Every so often during training you also randomly zero out some percentage of the network’s activations on each pass, so the model can’t lean on any single pathway to memorize the training data. Connections that aren’t really needed contribute less and less, while the ones that are needed get reinforced by subsequent training passes. This is known as dropout.
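For anyone curious, here is a minimal PyTorch sketch of what dropout looks like in practice (again my own toy example, not anything from the article):

```python
# During training, dropout randomly zeroes a fraction of a layer's activations on
# each forward pass, so no single pathway can be relied on to memorize the data.
# At evaluation time dropout is turned off and the full network is used.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # zero out ~50% of activations on each training pass
    nn.Linear(128, 10),
)

x = torch.randn(8, 64)   # a dummy batch of 8 inputs

model.train()            # dropout active: different activations zeroed each call
out_train = model(x)

model.eval()             # dropout inactive: deterministic inference
out_eval = model(x)
print(out_train.shape, out_eval.shape)
```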
You didn’t read the article, but you are making a technical claim about what the AI researchers (mentioned in the article) did or didn’t do in their research?
That’s virtually impossible to claim with any probability of being correct (without reading the article and the associated research).
Or maybe I am misunderstanding your meaning.
Are you asserting that the AI research in the article doesn’t know the basics about model training?
Such as some of the training concepts that you mention?
I did not talk about the article; I specifically referenced the author of the quote: “Whoever wrote that …!” The line was a showstopper; it stopped me from reading further.
This is a reference to algorithmic code, which AI tried and failed at so miserably that AI was declared Dead on Arrival. Algorithms were replaced by neural networks, which are not algorithmic but statistical. Ever heard of Maxwell’s demon?
The molecules bounce around at random. In theory they could all wind up on one side of Maxwell’s experiment, but it never happens. Why not? The best explanation I have heard is that it is a statistical improbability. Sorry, no link to the reference. Why improbability? In the experiment there are millions or billions of molecules. If all states are equally likely, what is the probability that all of them find themselves on one side of the experiment? About zero point zero followed by a string of zeros stretching all the way to the Moon and beyond before the first non-zero digit.
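If you want to put a rough number on that, here is a back-of-the-envelope calculation, assuming each molecule independently ends up on either side with probability 1/2 (my own arithmetic, not from any reference):

```python
# For N molecules, the chance that all of them sit on one particular side is (1/2)**N.
import math

for n_molecules in (100, 10**6, 6.02e23):   # a hundred, a million, ~a mole of gas
    log10_prob = n_molecules * math.log10(0.5)
    print(f"N = {n_molecules:.3g}: probability about 10^({log10_prob:.3g})")
```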
What do neural networks do? Calculate probabilities.
Maybe the article is correct; I’ll never find out because I’m not going to read it. Maybe the author should have paid more attention in school to English Composition.
BTW, how exactly do humans solve problems? I’ve written several posts about how I think the brain works. We can go much further back: how exactly did the universe create humans? Does the universe have a thinking brain? Mythology calls it god or creator. Science calls it “evolution.” In practice it is a bunch of particles bouncing around, joining and rejecting each other, and one in a gazillion gazillion combinations survives. Given enough particles and enough time, humans could appear at the end of a very long string of improbable combinations. Are humans just a statistic? A probability?
So you opined on the author without reading the author’s article.
It’s a statement resulting from interpretability studies of large language models (which are neural networks), studies which try to explain the mechanisms by which these models might work.
This isn’t a path for learning from and understanding what people post, and therefore making an informed reply.
One of the AI researchers in the article works in the area of complexity.
I already HAVE the app for that, right here in my pocket!
(meaning no disrespect, just a little nostalgia from the old board that I thought worth preserving)
It isn’t just improbable, it is impossible according to the laws of physics as we understand them. I’ve had some training in this area, so I understand the principles involved. All the molecules cannot end up on one side, because in essence that would be compressing the gas, which would violate the 2nd Law of Thermodynamics.
Maxwell’s demon is a thought experiment where there is a gatekeeper (demon) in a chamber filled with gas. As the molecules bounce around, the gatekeeper allows high speed (therefore high energy) molecules to travel in one direction, and slow speed molecules to travel in the other direction. In the experiment, this would result in one half of the chamber filled with low energy gas, and the other side filled with high energy gas.
This seemingly violates the 2nd law. However, the thought experiment itself is flawed because the gatekeeper is doing work.
She wrote an award-winning book on complexity in fact.
Not exactly. I was really commenting on the idea put forth in the article title and in others’ comments about thinking vs memorizing. One of the primary goals of neural nets (compared to previous, failed approaches to AI) is to move more towards thinking vs memorizing; compare that to the article title. Neural nets are intentionally built and trained to generalize and not memorize.
As children we learn to memorize things, like the multiplication tables. Typically, the first kid in the class to memorize them is considered smart. But this really isn’t thinking: being able to multiply any two numbers isn’t done by memorizing, but rather by thinking via some process or algorithm. Of course, humans multiply by combining memorization for simple problems, approximations when suitable, and an exact algorithm when needed.
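Here is a toy sketch of that distinction, purely my own illustration: a memorized times table versus a general multiplication procedure.

```python
# "Memorization": a times table, like the one drilled in school. It only "knows"
# products it has stored. The algorithm below generalizes to any pair of numbers.
times_table = {(a, b): a * b for a in range(1, 13) for b in range(1, 13)}

def multiply_by_memory(a, b):
    return times_table.get((a, b))          # None for anything outside the table

def multiply_by_algorithm(a, b):
    # Repeated addition: works for any non-negative integers, no lookup needed.
    total = 0
    for _ in range(b):
        total += a
    return total

print(multiply_by_memory(7, 8))             # 56 -- it was memorized
print(multiply_by_memory(123, 456))         # None -- never memorized
print(multiply_by_algorithm(123, 456))      # 56088 -- computed from a general rule
```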
Note that large language models (such as ChatGPT) use transformers which are a specific type of neural net for processing sequential data, like text.
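For the curious, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside a transformer layer (my own illustration of what the term refers to, not anything from the article):

```python
# Every token in a sequence builds its output as a probability-weighted mix of the
# other tokens, with the weights computed from query/key similarity.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # token-to-token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax -> probabilities
    return weights @ V                                         # blend of token values

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16                                       # 5 tokens, 16-dim embeddings
Q = rng.normal(size=(seq_len, d_model))
K = rng.normal(size=(seq_len, d_model))
V = rng.normal(size=(seq_len, d_model))

print(scaled_dot_product_attention(Q, K, V).shape)             # (5, 16)
```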
That’s what I did and that’s what I said. Yes. Based on Goofy’s highlighted extract:
We Now Know How AI ‘Thinks’—and It’s Barely Thinking at All
The vast ‘brains’ of artificial intelligence models can memorize endless lists of rules. That’s useful, but not how humans solve problems.
AI no longer memorizes endless lists of rules. That was the case when “Expert Systems” were in vogue and flopped miserably. Why waste time on such out-of-date stuff?
But it sure saves time, a limited, non-renewable resource. Like I said, the quote “artificial intelligence models can memorize endless lists of rules” was a tripwire. Computers can memorize endless lists of anything and everything. Rule-based AI died decades ago.
Good for him.
Did the article explain how humans solve problems?
You wrote how the article might have gotten training wrong:
And then you provided some detailed explanations about training methods.
So I think you can see how anyone would misunderstand your comments.
Because now, after all of that discussion on training, you are focused on just the broad AI goal of thinking vs memorizing and how that is represented in the title of an article.
Maybe the title is not a good portrayal of the article.
Maybe the AI researchers don’t know enough about techniques in model training.
Maybe 10 other things worth mentioning.
If you are willing and able to read the article, check out the research, do whatever else is needed to make an informed assessment of the article’s content, and share your thoughts, we’d love to hear them.
And that’s exactly why the kind of research in the article is important.
The research is evaluating how/whether/why AI achieves the goal of thinking (vs memorizing). Specifically, it looks at the mechanisms by which AI processes information and evaluates the extent to which AI is thinking vs memorizing.
If an important goal of AI is thinking (vs memorizing), then it seems research that evaluates that goal is important.
If comments are uninformed in this way, people will place less weight on such comments.
This is an uninformed statement, because, again, you didn’t read the article.
It seems you have your mind made up on how AI works, yet you don’t provide any evidence that counters the research in the article.
Yet, also, research into how AI works mechanistically is an active area of study.
The research from the article is evaluating the extent to which that statement is true or not.
As with the brain, our understanding of how complex neural networks actually work is limited, and there is very active research in this area, a piece of which is covered in the article.
Ah, the uninformed hits keep coming.
You mean her.
The complexity researcher is a she.
You can find out by reading the article.
Readers beware, much uninformed content in this thread.
So far this thread is 80% posts, and responses to posts, from people who didn’t even read the WSJ article. (Maybe save such posts for topics more within everyday experience, like the price of eggs.)
Too bad.
I find the topic very interesting, very relevant, and with very important implications for the very large investments in AI.
To be sure, this short article only scratches the surface and provides a window into some research in the area of AI interpretability/explainability.
I expect to see a lot more on this topic, and more importantly, how findings in this topic lead to models with architectures, training data, training methods, and analysis approaches that can lead to better AI across many different use cases.