E02: OpenAI GPT-4o Reveal, Breaking AI Adoption Barriers, and the Disruptive Future of Work

In this episode, Chris, Rod, and Max break down the latest bombshell from OpenAI - the reveal of GPT-4o, an AI model that can engage with humans through text, audio, video, and even detect emotions. As AI becomes more accessible through falling costs, rising open source capabilities, and accelerated releases from tech giants, the pressure is on for enterprises to adapt.

Host

Rod Rivera

🇬🇧 Chapter

Guests

Chris Wang

AI Innovation and Strategy Expert, CXC Innovation

Max Tee

VC Expert, AI Investor, BNY Mellon

This is the second episode of The Chris Rod Max Show! Here you'll find links to our YouTube, Spotify, and SoundCloud uploads. Scroll down for a full transcript of the show.

And do not forget to follow us for more:

Listen on Spotify

https://open.spotify.com/episode/7xzBNdBKMPEaFus4AFBDPu

Listen on SoundCloud

https://soundcloud.com/chris-rod-max-show/e02-ai-orchestration-gpt-4-human-like-assistants-and-the-end-of-robotic-process-automation

Keywords

#AI #OpenAI #GPT4o #Automation #FutureOfWork #ArtificialIntelligence #MachineLearning #DeepLearning #Robotics #DataScience #Innovation #Technology

Chapters

  • 00:00 Introduction and OpenAI Reveal
  • 06:25 The Speed of Development in Large Language Models
  • 09:56 Barriers to AI Adoption: Skills, Data, and Ethics
  • 25:17 Automating Complex Processes with AI
  • 30:44 Challenges and Opportunities in AI Adoption

Full Transcript of the Show

Chris: Welcome to The Chris Rod Max Show, where every week we discuss the latest news in AI, connect with AI founders, and explore the implications of AI for tech, companies, and humanity. Today I'm joined by my co-hosts Rod and Max. We have some exciting news to dive into, including the OpenAI GPT-4o reveal from yesterday. We'll also be talking about barriers to adopting AI tools and software, as well as automation. Let's dive right in.

Rod, let's start with you. As someone with a product management background, what did you think of the GPT-4o reveal? Can you describe some of the key features?

Rod: This announcement seemed to divide people between the tech world building AI applications and end users. AI developers felt underwhelmed - they expected something like a search engine to compete with Google. But what was released is much more geared towards end users.

Until now, there were two main models - GPT-3.5 which was free but basic, and GPT-4 which was much better but expensive. Most people only used the free version and weren't that impressed. But now, the previous top-of-the-line GPT-4 has become free, giving everyone access to what was recently the best model available.

There were also announcements around a personable AI assistant you can interact with via voice, that understands images and equations. But I think the biggest news is the democratization of this previously out-of-reach model, making it freely available to all.

Chris: Thanks, Rod. What are your thoughts on this, Max, coming more from the consumer side?

Max: I found it quite exciting from a release perspective. I think it's great for any application builders, especially economically, since the high-end model is now more readily accessible. As a consumer and investor, I felt it was very positive.

Reading through the details, I was inspired to try building my own agent to help with tasks like market research. With the advanced model now available, you and I can create simple agents for low-level tasks much more efficiently.

While I'm not sure how it will all unfold, I think there are interesting possibilities ahead in terms of where value will be created, whether it's specialized knowledge combined with a base model or ever-improving general models making specialized knowledge less relevant. Regardless, OpenAI is clearly a leader driving a lot of this forward.

Chris: Definitely. You both touched on the split opinions between expectations and reality. Rod highlighted the anticipated search engine feature that didn't come to fruition. But on the other hand, we're seeing this powerful model that was behind a paywall now freely accessible with ChatGPT, enabling multi-modal interaction through text, audio, and video.

The enhanced tone makes it feel more like conversing with a real person. And in the demo, ChatGPT even showed a humorous side. Improved language processing was evident in the real-time English to Italian translation. From a consumer perspective, these multi-modal capabilities open up significant possibilities.

Another thing that stood out to me is the rapid pace of development. ChatGPT only launched at the end of November 2022 and hit 1 million users in five days, then 100 million about two months later. Now, roughly a year and a half after launch, GPT-4o has arrived. What do you think about the speed of progress and the implications now that ChatGPT has become even more human-like?

Rod: I want to underscore the year-over-year improvements. For a long time, GPT-4 was the unchallenged leader. But in recent weeks, models from Anthropic, Google, Meta and others, some of them open source, emerged that rivaled or even exceeded GPT-4. That may have lit a fire under OpenAI to innovate and release enhancements quickly to stay ahead.

So in the span of a year, we went from OpenAI dominating with GPT-3 and 4, to viable competitors arising, to OpenAI striking back to retake the lead. The competitive landscape has intensified dramatically just in this first quarter.

Max: I think the pace of development ties into what Jensen Huang, CEO of Nvidia, has said about AI advancement being all about scaling - using more data, networks, and layers to drive progress. For a stretch there wasn't much new from OpenAI, but we know they were working on major data-sharing deals with publishers to feed their models.

Consider what Ben Evans said on our show last week: generative AI is fundamentally about pattern recognition, so the more patterns you can encode into the system, the better it can replicate human-like outputs. I think that will keep progressing, though it remains to be seen how far it goes.

From an investment view, I've been pondering at what point a startup with specialized knowledge might outperform a general-purpose "omniscient" model. Will proprietary data give you an edge in certain domains? Or will ever-improving general models eventually make specialized knowledge obsolete? I don't have the answers, but it's fascinating to consider as we see development unfold on both fronts. What's clear is that OpenAI is extremely valuable in driving this field forward.

Chris: Indeed. Another feature they showed was real-time translation. Max, do you foresee us getting "Babel fish" style instant translation capabilities? What are your thoughts on where that's headed?

Max: Real-time translation has been envisioned for a long time. I remember Skype promoting a future where you could seamlessly converse with anyone worldwide in different languages. When I was in college, I actually tried building simple glasses that could translate text in the user's field of view into speech, to help the visually impaired. But the technology for fast, real-time translation just wasn't there yet.

With these new capabilities, you could snap a photo, send it to ChatGPT, and have it instantly returned and read aloud. That could assist in many scenarios. When you consider the multi-modal abilities spanning vision, voice, text and beyond - it's really getting closer to replicating human senses.

The more we can encode these sensory inputs, the more patterns the AI can recognize, and the more human-like it becomes. So what really differentiates humans and AIs when they can process the world the same way we do? It has huge implications that will reshape industries, because so much of our current environment is built around human capabilities as we've known them. If that changes, it means an overhaul of almost everything. Tons of opportunity for value creation and destruction. It'll be fascinating to watch.

Chris: For sure. To your point Rod about real-time translation, I imagine your international family would find that incredibly useful when traveling to places like China, where the language barrier can be challenging. Having your phone able to live-translate signs, menus, speech and more, without even needing an internet connection, would be utterly game-changing.

As we've discussed, the pace of AI progress is staggering. Shedding light on that is one of our key aims with this podcast. On that note, I want to pivot to our next topic, which is the barriers to widespread AI adoption. An IBM study found the top three hurdles are: limited AI skills/expertise, data complexity, and ethical concerns.

Given the rapid developments we've covered, how do you see those barriers evolving in the coming months and years? Will they ease over time as the technology matures and proliferates?

Rod: The need for upskilling is a perennial issue that affects everyone from developers who now need to learn AI skills, to the average office worker in marketing or HR who has to figure out how to harness these new tools.

When I talk to people about ChatGPT, some have used it but still in a very basic way - asking a question and getting an answer back. Very few are doing more advanced things like image or voice generation. The vast majority still aren't using it at all. So this reskilling challenge exists at every level, from developers to end users.

Max: The two things you mentioned, talent and data, are both highly complex. Talent ties into human behavior. For example, my company rolled out Copilot to assist with writing, but not everyone uses it, even though it can be a huge time-saver.

I leverage it to automatically generate detailed notes from meeting transcripts, which saves me a ton of time. Taking it a step further, I was inspired to try building my own agent to automate basic market research on companies I'm evaluating. Rather than having an analyst spend hours on it, I wanted to see if AI could streamline the screening process.

The more I thought about it, the more I realized AI could potentially transform the entire process. I was watching a conference where the CIO of Bridgewater, the world's largest hedge fund, discussed their investment philosophy. It boils down to two steps: 1) identifying fundamental truths and drivers in markets and the economy, and 2) systematically codifying those concepts into equations and rules.

With AI, you can start to automate those steps of distilling raw information into knowledge and then executing on it systematically. For a lot of human workers, that will feel very threatening. But if people truly understand the purpose behind what they do, AI can become a tool to carry out the execution more efficiently. It's a major shift though.

As for data, the key issue is drawing boundaries around data ownership, access rights, privacy and so on. There are legal constraints like GDPR to navigate since we can't just start from a blank slate. Overall, while daunting, I believe these talent and data hurdles to adoption will be gradually overcome.

Chris: What I'm hearing is that change is a constant, but humans are naturally resistant to it. We like staying in our comfort zones. It takes a time investment and open-mindedness to explore these new tools and uncover what's possible. Inertia is very real at every level.

That said, as Max exemplified by building his own agent to automate research, we'll have pioneering inventors and innovators who blaze new trails. Over time, as compelling use cases emerge, adoption barriers will diminish and the rest of us can follow their lead. Once we hit that inflection point, AI will become much more widespread.

Rod: Chris, your mention of the need to embrace change reminded me of an experience I had in 2016 while working at a large consumer goods company. The biggest challenge I faced in rolling out AI tools wasn't the technology itself, but people's unwillingness to adopt it. Many feared that it would automate them out of a job. Just getting them to use it, even when it clearly made their lives easier, was a real struggle.

Max: The human factor really can't be overstated. But as a species, we do tend to adapt over time. As the saying goes, it's not the strongest or smartest that survive, but the most adaptable to change. We'll get there eventually.

I think what's interesting about the talent piece in particular is that it will require us to get crystal clear on the fundamental elements of any given role or process. What's the core objective? What information is needed and how should it be processed to achieve the desired outcome?

With AI in the picture, a lot of rote tasks can be automated, but you still need human judgment sitting on top of it to steer things in the right direction. It's almost like the AI becomes the junior consultant or analyst cranking away in Excel, while the senior human partner focuses on the higher-level strategy and insights.

Chris: Great points. As we touched on earlier, AI can increasingly shoulder the repetitive, low-level work so humans can focus on the more creative and strategic elements. With that in mind, I want to explore the notion of automation a bit further, and specifically the concept of robotic process automation or RPA.

Rod, for any listeners who may be unfamiliar, can you explain what RPA is and why it's relevant in this context?

Rod: Sure. RPA refers to tools that can automate certain processes by recording and mimicking human actions. For example, an RPA bot could be trained to open a spreadsheet, copy and paste data from it into a particular field in another application, and so on.

This can be useful in enterprises with legacy systems that don't easily talk to each other. Rather than writing custom code, you can use RPA to bridge the gap and enable some interoperability. But it's a fairly brittle approach that tends to break if anything in the underlying applications or flow changes.

More recently, we're seeing a shift towards AI-powered automation agents that can understand higher-level goals and figure out how to accomplish them dynamically, rather than just replaying a rigid set of pre-defined steps. With the latest advancements we've been discussing, these AI agents are becoming much more capable and cost-effective, which I think will pose a major challenge to traditional RPA vendors.

Chris: That's a great explanation. I can definitely see AI-driven RPA being valuable for all sorts of back-office functions like accounting, where you often have humans shuttling data between different sheets and systems. It's time-consuming and error-prone when done manually.

At the same time, as you alluded to, Rod, the key is identifying appropriate use cases and figuring out the right way to specify the task, whether that's by demonstrating it step-by-step or expressing it as a higher-level objective and letting the AI figure out the details. I'm curious to see how this evolves, particularly with the multi-modal capabilities of models like GPT-4o.

Rod: Exactly. You can picture it like an orchestra, where you have different AI models each specializing in a particular task - one for data extraction, one for formatting, one for generating insights and recommendations. And then sitting above that you have a conductor model that understands the overall goal and coordinates the various specialist models to achieve it.

Crucially, you don't have to explicitly spell out every step - you can just specify the end state you're looking for and let the AI orchestra figure out how to make it happen. And with the latest advancements, this is becoming possible at very low cost. I expect we'll see a lot of new entrants in this space in the next 3-6 months, as well as incumbents like UiPath being forced to up their game.
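
Show-notes sketch: the conductor-and-specialists idea Rod describes could look roughly like the Python below. This is only an illustration under stated assumptions, not how any vendor implements it. Every name here (`extract_data`, `format_data`, `generate_insights`, `Conductor`, the `quarterly_report` goal, the `invoices.xlsx` input) is hypothetical. The specialists are plain functions standing in for AI models, and the conductor's "planning" is a fixed lookup table, whereas in the setup Rod describes a model would work out the steps dynamically.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical stand-ins for specialist models. In practice each would wrap a
# call to an AI model; here they are plain functions so the sketch runs as-is.
def extract_data(payload: str) -> str:
    return f"extracted({payload})"


def format_data(payload: str) -> str:
    return f"formatted({payload})"


def generate_insights(payload: str) -> str:
    return f"insights({payload})"


@dataclass
class Conductor:
    """Coordinates specialists to reach a stated goal."""

    specialists: Dict[str, Callable[[str], str]]
    # Assumption: a fixed table maps each goal to an ordered list of
    # specialist names. A real conductor model would plan this itself.
    playbook: Dict[str, List[str]] = field(default_factory=dict)

    def plan(self, goal: str) -> List[str]:
        if goal not in self.playbook:
            raise ValueError(f"no plan known for goal: {goal!r}")
        return self.playbook[goal]

    def run(self, goal: str, payload: str) -> str:
        # Pass the output of each specialist to the next one in the plan.
        for name in self.plan(goal):
            payload = self.specialists[name](payload)
        return payload


conductor = Conductor(
    specialists={
        "extract": extract_data,
        "format": format_data,
        "insights": generate_insights,
    },
    playbook={"quarterly_report": ["extract", "format", "insights"]},
)

# The caller names only the end state; the conductor picks the steps.
result = conductor.run("quarterly_report", "invoices.xlsx")
print(result)  # insights(formatted(extracted(invoices.xlsx)))
```

The caller specifies a goal rather than a sequence of clicks, which is the contrast with a recorded RPA script: changing how a goal is achieved means changing the plan, not re-recording every step.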

Max: That's a great analogy. It reminds me of how consulting firms approach projects by breaking them down into specialized workstreams. In the past, RPA was kind of like the workstream to automate a very specific part of a process. The orchestration layer that Rod described is more akin to the engagement partner overseeing everything and making sure it all comes together to deliver the outcome.

It actually has a lot of similarities to how software architecture has evolved, from monolithic applications to microservices that each handle a small part of the overall functionality. I suspect we'll see automation solutions heading in a similar direction, where you have lots of specialized AI models that can be composed and remixed in various ways.

The real power of AI in this context is that it introduces much more flexibility and dynamism compared to rigid, hard-coded RPA bots. You're not constrained to a fixed workflow anymore. As long as you can describe what you're trying to achieve, the AI can figure out the optimal way to accomplish it based on the specifics of the situation.

Chris: Those are some great insights. I think this has been a really illuminating discussion. As we wrap up, I'm curious - what was the last task that you had an AI assist you with?

Rod: Just last night I was putting together a project plan, and I had a bunch of different Excel files with examples of how it had been done previously. I provided those to the AI along with a description of what I was trying to achieve, and it generated the plan for me.

Max: For me, it was actually trying to automate a market sizing analysis. I had read a blog post where someone walked through doing a top-down and bottom-up market analysis for a particular company. I wanted to see if I could replicate that with AI. It worked pretty well.

Chris: Very cool. I've been playing around with using AI to help generate copy and content for a personal website I'm building. It's been super helpful on that front. Now I'm trying to figure out if I can automate the actual process of building and deploying the site, but that's a work in progress.

In any case, this has been a great discussion. Thanks so much for joining, and we'll see you next week. And to our listeners - you can always find us at chrisrodmax.com.