Machine learning conferences nowadays are too large for my enjoyment. I made the trip to Singapore for two posters and a talk in the associative memory workshop. I spent my time listening to the morning keynotes, walking briskly in the poster room, and catching up with friends and colleagues from both industry and academia.
On my way back, I ran into Kyunghyun Cho at the airport, and we had a drink while comparing notes on what we had learned. Cho always has great insights; he invented attention mechanisms; he dines with Korean stars. So I knew what I had to do when he told me, “You should tweet that!”
To decide what to research, we first need to understand where we stand.
The development of large language models (LLMs) and the rapid transition from machine learning (ML) to artificial intelligence (AI) make this a challenge. Many researchers seem to be looking for a clearer assessment of our current position. One of the keynote speakers argued that AI is just another instance of sparse data compression. A substantial fraction of the posters studied which AI capabilities can or cannot be achieved with the current technology.
There is considerable tension between what these researchers observe and what they hear from the big megaphone that keeps proclaiming what AI will achieve, or what AI should or should not be. This intense communication campaign merely reflects the aspirations of business leaders trying to imagine ways to make money and acquire power. Their connection to the actual model capabilities is very tenuous: if the machines can speak, surely they can think? That's about it.
The megaphone asks us to work on turning hype into reality. How to make LLMs truthful? How to make them reason? How to get them to solve mathematical exercises? To code JavaScript games? Considerable effort has led to limited success: sometimes it works, sometimes it doesn't. In contrast, LLMs achieve certain complex tasks with near-perfect accuracy and reliability. They produce fluent language, converse in any style, tell understandable stories, and can manipulate them according to our instructions. We do not know how our current models achieve these tasks. We do not even know why our earlier models could not.
Where we stand today is not defined by the capabilities we aspire to see in our AI models, but by the capabilities they achieve with near 100% success. Until we understand why and how these are achieved, we do not know where we stand, and we cannot reliably decide where to go!
See also "The Fiction Machine" for my take about this question.
Because I am lucky enough to work for a lab that keeps publishing its research and for a company that believes in releasing its models to the public, I derived a guilty satisfaction from asking pointed questions of my less fortunate colleagues. Do you really believe we have reached the final stretch before super-intelligence? At what price are you trying to preserve a couple of months' edge in time-to-market?
Closed models are useless for research purposes. Their operation is not transparent enough even to serve as stable comparison points. Do they demonstrate genuine creativity, or do they merely learn from online usage data? Closed models are also problematic for people building applications, because they make those businesses dependent on potentially insecure online APIs and capricious pricing structures. We have twice seen how open models can disrupt this situation. The impact of the Llama models was not due to their pushing the state of the art, but to the free availability of their design and their weights. Similarly, the impact of the DeepSeek models was not due to their matching the best closed models, but to their unexpected origin, to the obvious quality of the associated papers, and, again, to the unrestricted availability of their design and their weights. As a result, Llama and DeepSeek have more influence on the future of the field than the possibly superior closed models. Are the closed models superior, by the way? I cannot know for sure, and, quite frankly, I do not care.
But we can also look further. Whoever deploys a good enough talking search engine might displace or defend Google’s search monopoly. Of course, this represents a lot of money and could be very useful to mankind. I call this Satya’s banderilla because this very large business is only a small prize in comparison to the cultural impact of sharing artificial intelligence research.
What happens if we discover and share a sensible scientific way to discuss artificial intelligence? Then we also have a sensible way to discuss human intelligence and cognition. Therefore, what starts as substantial progress in machine learning becomes a paradigm shift for all human disciplines that involve cognition, for all arts and sciences, at once. Note that this is different from, and much more interesting than, claiming that AI machines will take over arts and sciences. This is about us, not about the machines.
The true impact of AI will be a deep transformation of our human culture.
The road will be rocky. Both good and bad things can and will happen. But the outcome compares to that of the invention of writing or the discovery of storytelling. Compared to such a transformation, the talking search engine is really a small fish.