It's Fall 2017.
I'm sitting in a 100-year-old wooden lecture hall listening to our Machine Learning professor, Malik Magdon-Ismail, speak about training sets and test sets of data. Years later, I won't remember any of the statistics or linear algebra he's been teaching for weeks, but I will remember this. He says that any decision made based on seeing the test data will affect the validity of the outcome. He gives an example of reading research papers to see what others have done, then improving on their algorithms. He clarifies that the most unbiased and generalizable way to design an ML model is to think really hard about the problem, the features, and so on before looking at the data or prior work.
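Here's a minimal sketch of that principle in code (my example, not the professor's, using scikit-learn and one of its bundled datasets): every modeling decision happens on the training split, and the held-out test set is touched exactly once, at the very end.

```python
# A minimal sketch of the held-out test set principle (my example, not the
# professor's). All tuning and feature decisions use only the training split;
# the test split is evaluated once, at the end, and never influences design.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)            # all design decisions happen here
print(model.score(X_test, y_test))     # the test set is looked at once, at the end
```

Peek at the test set while choosing features or hyperparameters and that final score stops being an honest estimate of how the model generalizes.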
Flash forward 7 years: AI is everywhere; it has consumed the tech blogs, the mainstream news, and the stock market. Hype around NVIDIA and Microsoft is driving the S&P 500 higher, even while less than half of the rest of the index is improving. Highly regarded software engineers who are now selling AI products are writing hype-beast blog posts like it's 2021 and they're working on a cryptocurrency. They claim that they and all their friends are somewhere between 2x and 10x more productive in a wide variety of knowledge jobs because they use the latest & greatest AI chatbots. No, not the latest and greatest from a few months ago; those pieces of crap hallucinated horribly.
Whoa, let's take a step back. I have two big questions about AI:
To find out, let's go back again, to 2014. I was in my high school's FIRST Robotics team's lab. Google's Self-Driving Car Project was in the mainstream news, and since cars are basically big robots, we were chatting about its potential. My optimism about how technology would change our world was at a teenage all-time high. Self-driving cars were poised to automate away a hugely tedious part of the US economy and of daily life. I lived in rural New Hampshire, the suburbs of the suburbs of Boston basically, so I and everyone I knew had to drive everywhere; it didn't take much imagination to see how life-changing it would be to free up that time. Like teleportation, almost.

CGP Grey's seminal video "Humans Need Not Apply" came out around then too, giving an extremely optimistic take on how robotics and "mechanical minds" could improve our lives while simultaneously busting myths about their impact on people's jobs. The video is now 10 years old, and alongside the beautiful philosophy ("There isn't a rule of economics that says 'better technology make more better jobs for horses'. It sounds shockingly dumb to even say that out loud, but swap horses for humans and suddenly people think it sounds about right.") there are some technological showcases that haven't panned out in the decade since. Take "Self driving cars aren't the future: they're here and they work. [...] They don't need to be perfect, they just need to be better than us." It may be philosophically correct that self-driving works, but software engineers like me know that expectations for computers are much higher than expectations for humans.
It turns out that it's really hard to build self-driving systems that work reliably in every scenario, no matter the location, the road conditions, or the unpredictable behavior of nearby humans. So in the past decade almost every automaker has put SAE Level 1/2 features in their cars: "adaptive cruise control", "lane keep assist", "emergency braking", and "parking assist". Mercedes-Benz has started rolling out SAE Level 3 automated driving, in which the car takes over liability from the driver in certain situations when it's activated, but we're still far from automating away the transportation industry as Grey predicted. I'm not saying that Level 4/5 self-driving isn't coming ever - Alphabet's Waymo is rolling it out in select locations - just that it takes longer than even smart people expect to generalize tech like this.
If you're a casual news reader in 2024, you probably won't know that AI has been through hype cycles for about as long as computers have existed (since the 1950s at Dartmouth). When people today talk about AI taking over white-collar office jobs, they are leaving out the fact that computers have been doing that for a long time. Even the term "computer" is the name of a human profession that the machines automated away. It's obvious to us now that humans aren't faster or better than a computer or calculator at mathematical calculations; will it someday be just as obvious that humans are worse than computers at all intelligence and thinking tasks? It's hard to say. The field started at that Dartmouth workshop roughly 70 years ago, Deep Blue beat Garry Kasparov 27 years ago, Siri and Alexa came out 14 and 11 years ago, and ChatGPT came out two years ago. Is AI exponentially improving, leveling off, or something else? This analysis of scaling papers suggests that LLM performance improves only roughly logarithmically with dataset size, model size, and training time, so each constant gain demands exponentially more resources. Some shift in direction is needed to get significant performance improvements. Speaking of which...
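To make the diminishing returns concrete, here's a toy sketch (my own illustration with made-up constants, not numbers from the linked analysis) of a Chinchilla-style power-law loss curve: every 10x increase in model and data size buys a smaller drop in loss than the last one.

```python
# Toy illustration of power-law scaling (made-up constants, not values from
# the linked analysis): loss falls as a power of model size N and dataset
# size D, so each constant improvement costs exponentially more resources.
def scaling_loss(n_params: float, n_tokens: float,
                 e=1.7, a=400.0, b=400.0, alpha=0.34, beta=0.28) -> float:
    """Approximate training loss under a Chinchilla-style power-law fit."""
    return e + a / n_params**alpha + b / n_tokens**beta

prev = None
for n in (1e9, 1e10, 1e11, 1e12):       # 10x more parameters each step...
    loss = scaling_loss(n, 20 * n)      # ...with 20 tokens per parameter
    delta = "" if prev is None else f" (improvement: {prev - loss:.3f})"
    print(f"{n:.0e} params: loss ~ {loss:.3f}{delta}")
    prev = loss
```

Each order of magnitude of extra scale shaves off less loss than the one before it, which is the flattening curve the scaling papers describe.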
I've read some of psychologist Daniel Kahneman's book "Thinking, Fast and Slow", which describes how the human mind works with two complementary systems: "System 1" is fast, intuitive, and subconscious, while "System 2" is slow, rational, deliberate reasoning. The way I've been thinking about modern neural networks is that they operate like "System 1" in humans, intuitively spitting out guesses based on prior learning and memory. Normally I'm against anthropomorphizing AI like this, but the parallels to Kahneman's experiments are striking, including the hallucinations and false reasoning, but also superhuman levels of recall and generalization on the training sets. Since these GPTs are so good at "System 1", I've been on the lookout for people working not on improving them, but on building AI that does the "System 2" style reasoning that was popular in Prolog-style AI of the 1970s. Will GPTs with sufficient infrastructure, like multi-agent workflows, become capable of general reasoning and Artificial General Intelligence? Will we need to wait for hardware improvements to build bigger models? Or does a totally new & complementary type of AI need to be developed to perform "System 2" style artificial thought?

Since the first draft of this post, OpenAI has announced & released o1 - a new style of model that's slower, but uses internal chain-of-thought to perform much better than GPT-4 at math, coding, and science. Though I'm always skeptical of creators' claims, this shows that AI researchers have started tackling the "System 2" problem. It's surprising to me that this development happened so fast, and it shows that relatively minor changes have been able to simulate chain-of-thought reasoning using technology developed for "System 1" models.
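For a rough feel of what "chain-of-thought" means in practice, here's a hypothetical prompt-only sketch (my illustration; OpenAI hasn't published exactly how o1 reasons internally). The puzzle is the bat-and-ball question Kahneman discusses, a classic "System 1" trap where the intuitive answer is wrong.

```python
# A hypothetical prompt-only sketch of the chain-of-thought idea (my
# illustration, not how o1 works internally). The bat-and-ball question is a
# classic "System 1" trap: the intuitive answer ($0.10) is wrong ($0.05).
question = (
    "A bat and a ball cost $1.10 in total. "
    "The bat costs $1.00 more than the ball. "
    "How much does the ball cost?"
)

# Direct prompt: invites a fast, intuitive, often-wrong answer.
direct_prompt = question + " Answer with just the number."

# Chain-of-thought prompt: asks for intermediate steps first, nudging the
# model toward slower, deliberate "System 2"-style reasoning.
cot_prompt = question + " Work through the algebra step by step, then state the final answer."

print(direct_prompt)
print(cot_prompt)
```

The difference with o1 is that this kind of step-by-step reasoning is built into the model itself rather than coaxed out by the user's prompt, at the cost of slower responses.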
I'm not on the AI hype train, but this is an area to watch.