E19: AI's Next Leap - OpenAI's o1, Robot Dexterity & Spatial Intelligence
In this episode, the hosts discuss the latest advancements in AI, focusing on OpenAI's new model o1, which showcases improved reasoning capabilities. They explore the implications of these advancements in various fields, including robotics and spatial intelligence, and the challenges of funding AI research in academia. The conversation also touches on Klarna's shift towards internal AI solutions and the excitement surrounding SpaceX's private spacewalk, highlighting the intersection of technology and inspiration for future innovations.
Takeaways
- OpenAI's o1 model excels at logical tasks and reasoning.
- o1 reportedly scored 120 on an IQ test, above the average human score of 100.
- Reasoning in AI mimics human thought processes, enhancing use cases.
- Navigating the growing number of AI models presents new challenges.
- Robotics is evolving with AI, leading to more dexterous machines.
- Spatial intelligence could revolutionize architecture and engineering.
- Funding for AI research is increasingly difficult for academic institutions.
- Klarna's move to internal AI solutions may inspire other companies.
- SpaceX's private spacewalk signifies a new era in commercial space travel.
- The rapid pace of AI innovation is reshaping industries and society.
Episode Transcript
Introduction
Chris: Welcome to the Chris Rod Max show, where we discuss the latest developments in AI and their consequences. I'm thrilled to be back. Rod, Max, how are you both?
Rod: Hi everyone, great to be here.
Max: Doing well. Good to be back.
Chris: Fantastic. Today, instead of focusing on risks and limitations as we've done in previous episodes, I want to explore the latest innovations and exciting developments in AI and beyond that have occurred over the past week.
Topics of Discussion
We'll cover:
- OpenAI's latest model, the o1 release
- Advancements in AI-powered robotics
- Other newsworthy items, including Klarna and SpaceX's private spacewalk
Let's begin with OpenAI's latest internal release, codenamed Strawberry, which has now been officially launched as o1. Rod, could you provide some technical insights on how it differs from previous models?
OpenAI's Latest Model: o1
Rod: Certainly. The o1 model has been in the news for a while, initially rumored to be a significant release like GPT-5. However, it turned out to be a specialized model focused on logical tasks. For instance, it excels at solving classical high school problems that require finding the best path among multiple options.
One significant improvement is its ability to count the words in a sentence accurately, a task that stumped previous models like GPT-3, GPT-3.5, and GPT-4. There is a trade-off, though: while earlier models kept getting faster, o1 takes more time to generate answers because it explores multiple options in the background, similar to a decision tree.
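A rough, purely illustrative sketch of the latency trade-off Rod describes, using the word-counting example. OpenAI has not published how o1 explores options internally, so the functions below (propose_candidates, verify, answer_with_exploration) are hypothetical stand-ins that only show why drafting and checking several candidate answers before replying takes longer than answering directly.

```python
# Hypothetical sketch only: not o1's actual mechanism, which is unpublished.

def count_words(sentence: str) -> int:
    """The task that stumped earlier models: count the words in a sentence."""
    return len(sentence.split())

def propose_candidates(sentence: str) -> list[int]:
    """Pretend the model drafts several candidate answers, some off by one."""
    exact = count_words(sentence)
    return [exact - 1, exact, exact + 1]

def verify(sentence: str, answer: int) -> float:
    """A stand-in verifier that scores each candidate by re-checking the input."""
    return -abs(answer - count_words(sentence))

def answer_with_exploration(sentence: str) -> int:
    # Scoring every candidate before replying is where the extra time goes.
    return max(propose_candidates(sentence), key=lambda a: verify(sentence, a))

print(answer_with_exploration("The quick brown fox jumps over the lazy dog"))  # 9
```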
Chris: Interesting. It seems we're already living in a world with numerous model choices. Rod, you're fond of Claude; Max, you use Copilot at work; I typically use ChatGPT. Now it's becoming even more complex with different models to choose from.
Recently, I heard that this new o1 version passed an IQ test with a score of 120, which is above the average human IQ of 100. By that definition, o1 is supposed to reason better than the average human being. It's fascinating how tasks that seem trivial to us, like counting words in a sentence, are actually incredibly complex for AI models.
Max, what are your thoughts on this new development? Have you tried it, and what do you think about its potential use cases?
Max: Thank you. Let's start with what OpenAI calls reasoning, which revolves around chain of thought. I find this really interesting because it mimics how humans think. When we argue a point, we typically map out our thoughts step by step, looking at how one point leads to the next, and then review our entire argument.
This feels like chain of thought: a sequential, step-by-step process to address the question at hand. It's not just about looking forward and predicting, which is common in many machine learning solutions, but also looking backward to analyze the reasoning. This is fascinating because it's more akin to how humans innately try to understand different reasoning paths.
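A minimal sketch of the idea Max describes, assuming the simplest possible form: a prompt that asks for step-by-step reasoning and a backward review before the final answer. The wording and function names below are illustrative assumptions, not any vendor's published prompt format.

```python
# Illustrative only: chain-of-thought, at its simplest, is an instruction to
# reason forward in steps and then review the whole chain before answering.

def direct_prompt(question: str) -> str:
    """Ask for the answer straight away."""
    return f"Question: {question}\nAnswer:"

def chain_of_thought_prompt(question: str) -> str:
    """Ask for forward step-by-step reasoning plus a backward review of the chain."""
    return (
        f"Question: {question}\n"
        "Work through this step by step, stating each intermediate conclusion. "
        "Then re-read the whole chain, check it for mistakes, and only then "
        "give the final answer.\nAnswer:"
    )

if __name__ == "__main__":
    question = "How many words are in this sentence?"
    print(direct_prompt(question))
    print()
    print(chain_of_thought_prompt(question))
```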
In terms of use cases, this approach could be applied to many tasks we perform daily. For instance, in customer service, the AI could better capture the entire context of a conversation, making it less likely to provide out-of-context answers. This could be applied to various scenarios like customer service, finance, or coding, where understanding the broader context and being able to explain the reasoning is crucial.
AI and Robotics Advancements
Chris: Absolutely. The rate of innovation we've seen in the last few months, with new model releases and incremental improvements, has been incredible. It's fascinating to consider where this is leading.
Now, let's move on to our second topic: what happens when you apply this reasoning capability to robotics? Earlier this year, we saw news about Figure.ai, a general-purpose humanoid robot capable of recognizing objects and reasoning about them. For example, if presented with an apple and a cup, and asked to provide something edible, it can reason that it should give the apple to the human.
Recently, Figure.ai partnered with BMW to test this humanoid robot in factory assembly. While the video still looks a bit clunky and slow, we're clearly making progress. Google has also published work on a robot capable of performing dexterous tasks like tying shoelaces or hanging clothes, actions that seem trivial to us but are quite challenging for robots.
Max, what are your thoughts on these developments in AI-powered robotics?
Max: The first thing that comes to mind is "Terminator, here we come!" But joking aside, it's quite impressive. The ability to analyze a scene, understand a request like "pass me something to eat," process what's edible, and then pick up and pass the correct item involves a complex chain of thought.
The nuance will come when the robot needs to differentiate between various types of food, considering factors like healthiness or ripeness. This capability could indeed replace some human roles, especially in factories. Imagine asking a robot, "Pass me a smaller screwdriver to tighten these screws on this part of the car." The reasoning engine allows for more natural, human-like communication with the robot, which is really interesting.
It's super exciting, but it also means we'll eventually need to learn to coexist with these advanced robots.
Rod: I'd like to add that robots are a good example of how humans have historically been poor at predicting the future. Science fiction has long portrayed anthropomorphic, intelligent robots indistinguishable from humans. Yet, as we approach 2025, we still don't have robots freely roaming in the wild. Instead, innovation has come in unexpected areas: we thought factory work would be replaced, but now it seems marketing managers might be more at risk of replacement.
It's also worth noting that the journey from impressive prototypes to widespread practical use is a long one. We've seen companies like Tesla and Toyota present humanoid robots for factories, but their actual implementation and impact remain to be seen. Boston Dynamics is famous for their amazing robot videos, but have they really found practical, widely-adopted use cases? Not really, at least not yet.
Chris: I agree, Rod, that the technology itself might still be a few years out. However, I think the real innovation here is that the robot isn't necessarily programmed to perform a specific task. The difference in the news from Google DeepMind is that it's all based on simulations. The robots aren't trained for particular tasks but learn from watching videos of various actions.
This opens up a new field of use cases and flexibility for robots and their respective tasks. While it might look a bit creepy or reminiscent of science fiction, I believe we're making significant progress.
World Labs and Spatial Intelligence
Chris: Now, let's discuss the recent funding announcement for Fei-Fei Li's new startup, World Labs. They've raised $230 million to work on spatial intelligence. Rod, could you help us understand what exactly spatial intelligence entails?
Rod: World Labs is still keeping many details under wraps, but we can make some educated guesses. Fei-Fei Li, often called the "godmother of AI," aims to bring her progress in 2D image understanding into three-dimensional space.
I imagine it could be related to the concept of digital twins โ digital versions of physical objects or systems that can simulate various conditions. Currently, these digital twin simulations are quite basic. World Labs might be working on something more advanced, perhaps similar to Roblox or Minecraft, where people can build or reproduce real-life objects and expose them to different conditions in a digital environment before implementation in the real world.
Chris: Interesting. So you're suggesting there's a connection between the digital world and the physical world, with 3D models that can apply reasoning to 3D space and eventually to the physical space. Max, what are your thoughts?
Max: When we think about spatial intelligence from a human perspective, we're considering distance in relation to something else. It's like modeling the world or different spectra on multiple axes. In 3D, you have three reference points, trying to understand where you are in relation to that three-dimensional space.
From a reasoning perspective, there's a lot of value in being able to think this way. You're not limited to just two dimensions of argument; you can incorporate a third. I can visualize it in my head like a 3D graph.
When we think about spatial intelligence, it's not necessarily limited to the physical world. Being able to think in 3D or 4D ways is super interesting. It's like playing 4D chess โ you have to consider how things relate to each other across multiple planes.
This becomes very interesting because, as humans, we often fail to realize that much of our intelligence is embedded from our ancestors. We do many things without actively thinking about them. For robots or AI, which lack this embedded intelligence, everything needs to be trained from the ground up. Being able to break down how we think and incorporate that into AI is quite exciting.
Chris: Certainly. To summarize, it's essentially software for architects and engineers building for the physical world, using simulations to visualize things in 3D. I imagine spatial intelligence could be far more advanced, potentially able to design entire cities from scratch within seconds.
Another interesting point Fei-Fei Li mentioned was that universities aren't getting enough funding, and none of the US universities can build a large language model due to the expense, reportedly around $100 million. It's intriguing that she's entering the private sector to secure more funding. What are your thoughts on this?
Rod: This isn't entirely new. Cutting-edge artificial intelligence research has become so expensive that it's impossible to secure the necessary funding in academia. We've discussed in previous episodes how even companies struggle to obtain the billions required to execute these projects. It's limited to very few companies in very few markets.
Imagine the challenge for universities, where the goal isn't commercial and there's no expected return on investment. For an increasingly large subset of problems, the research simply can't be done in academia anymore. It has to be done in industry, with investors pouring in enormous amounts of money.
Max: It reminds me of the UK pooling funding for the supercomputer in Edinburgh. In our capitalistic society, for something to be developed via public funding, we need to look beyond immediate returns. If it's to be done within academia, it requires someone who's extremely interested and has access to resources or donors willing to support the university.
The lack of funding is sad, and it will have long-term implications. Money is a human construct, meant to help us achieve a better life for humanity. Why can't we come together and work something out? Governments have a lot of ability to do this. Every government should probably look into providing the necessary funding or resources, like data centers for training. Because if we look beyond that, it's just a question of price after all.
Other News: Klarna and SpaceX
Chris: To wrap up, let's touch on a couple of other news items. First, Klarna has been in the news again for ditching Salesforce and Workday in favor of their internal AI software. Second, SpaceX conducted a private spacewalk with a billionaire and a crew that included SpaceX engineers, which is quite cool and pushes the boundaries of commercial space flight. What are your reactions to these developments?
Rod: Regarding Klarna, we've been discussing the implications extensively. Taking the bullish case, this could inspire other companies to develop their own tools and software. This might put pressure on the SaaS industry, with large buyers of SaaS software replacing it with internal developments, thanks to the speed that AI tools bring to code generation.
We might also see many indie developers and small studios providing tools almost at the same level as large enterprise software but at a fraction of the cost. This could democratize software development, especially for SMEs.
As for the spacewalk, it's great to inject some fantasy into our discussions. We need things that inspire and motivate us, that bring fantasy to our society and show us that we're just a few years away from what once seemed like science fiction. It triggers a positive mindset and makes us think about where we're allocating our resources, not just in the digital space and AI, but in the real world of space exploration and colonization.
Max: If I had $55 million for a spacewalk! But more seriously, this reminds me of the four-minute mile barrier. Once Roger Bannister broke it, suddenly many others started achieving it too. This feels similar: we've set a new bar for ourselves as human beings. While $55 million is only available to the wealthy today, I'm sure someone somewhere is trying to build something cheaper for people to go to space.
Regarding SpaceX, looking at how far they've come, regardless of one's opinion on Elon Musk, that man has achieved quite a lot for humanity. He's trying to inspire the next generation of entrepreneurs.
As for Klarna, I think we'll see more smaller applications being developed. Experiments will be quicker, leading to more software and more people building things. For larger software, there's typically a lot of maintenance cost. My question is, with AI-built software, how much maintenance will be needed? Could the software eventually adapt itself over time due to its intelligence and ability to evolve?
Chris: Max, I think AI itself, or perhaps a Figure.ai robot, will handle the maintenance. I don't think maintenance will be the biggest problem here.
I love how we're ending with inspiration; I think that's the key word. Today's show was really about the latest innovations, from OpenAI's new reasoning model to advancements in robotics and spatial intelligence, and how these are opening up new use cases. We've discussed the implications we're already seeing with AI applications in the enterprise space, like with Klarna, and how this is changing the landscape. And then we ventured into deeper, cooler stuff like the private spacewalk.
Thank you both for your insights, and thank you to our audience for listening. If you enjoyed our show, please subscribe, like, and leave comments. We really appreciate your thoughts and are always looking to improve our show. With that, thank you Rod, thank you Max, and we'll speak to you soon.