Daniel Lenton from Unify

In this conversation, Daniel Lenton, the CEO and founder of Unify, discusses Unify's approach to unifying the AI stack and addressing fragmentation in machine learning frameworks. He explains how Unify's Ivy library makes it easier for ML researchers to work across different frameworks and how the Model Hub simplifies model deployment. Daniel also highlights the challenges of code conversion and the potential impact of LLMs on Unify's objectives, and he emphasizes the importance of deployment in the AI market and the scalability of Unify's business model.

The conversation also covers value alignment in deploying AI models, the key use cases and optimizations in the Model Hub, how users coalesce around specific models, and the potential for optimizing existing models in the hub. Daniel explains the pricing structure, possible value-added services, and the self-serve nature of the Model Hub, along with the integration of Unify and the Model Hub and the potential for community involvement. He closes with thoughts on the future of AI applications, the role of compute in AI, the importance of math, advice for those entering the field, and where to find him.

Takeaways

Unify aims to unify the AI stack and address fragmentation in machine learning frameworks.
The Ivy library makes it easier for ML researchers to work with different frameworks and mix and match them.
The Model Hub simplifies model deployment by providing a suite of deployment solutions and dynamic routing capabilities.
Unify's business model focuses on the deployment side of AI, which is becoming increasingly important as more models are deployed and costs rise.
Value alignment is crucial in deploying AI models, and there is a need for optimization and customization to meet user requirements.
The Model Hub serves various use cases, from AI-as-a-service offerings to users experimenting with LLMs and seeking cost efficiency.
The coalescing of users around specific models allows for optimization and improved performance.
The pricing structure mirrors that of the vendors, and there is potential for value-added services and enterprise contracts.
The Model Hub is self-serve, allowing users to easily access and integrate models.
The future of AI applications lies in multimodal capabilities and the potential for AI assistants that can perform a wide range of tasks.
Compute power plays a crucial role in AI, and there is a need for a strong foundation in math and critical thinking for understanding complex systems.
Daniel Lenton can be found on his website, LinkedIn, and the Unify Discord community.

Episode Transcript

Introduction and Unify Overview

Rod Rivera: Welcome to this session of AI products. Today, we're joined by Daniel Lenton, the CEO and founder of Unify. Daniel, could you tell us about Unify and what makes it unique?

Daniel Lenton: Thank you for having me, Rod. Unify is quite ambitious in its scope. Our name reflects our mission: to unify the entire AI stack. We're addressing fragmentation at multiple levels.

Firstly, we're making life easier for ML researchers who often struggle with various machine learning frameworks when building and experimenting with new models. Our product, Ivy, allows easy mixing and matching of these frameworks.

Secondly, we're tackling model deployment. There's significant fragmentation in deployment tools, serving layers, compression techniques, and compilers. Our Model Hub offers a comprehensive suite of deployment solutions with an intuitive interface for benchmarking and selecting the best configuration for any use case.

Rod Rivera: That's fascinating. How did you come up with this idea, and why hasn't anyone else done it before?

Daniel Lenton: Well, there have been attempts at abstractions before, like Keras. However, what I observed during my Ph.D. in robotics, 3D vision, and AI was that while you could write code supporting multiple frameworks, it wasn't possible to take code from one framework and use it in another.

This became frustrating when team members would intern at different companies and return using totally different frameworks. It made sharing code and collaborating extremely difficult. I wanted to unlock the creative powers of the lab and the field more broadly, so I dove deep into this infrastructure project.

Unify's Approach and Business Model

Max Tee: That's really interesting. Could you elaborate on how you approached solving this problem? Which part of the fragmentation did you decide to tackle first and why?

Daniel Lenton: We started with framework fragmentation because it was the most obvious issue at the time. There weren't many deployment tools or AI-as-a-service companies a few years ago. The fragmentation was primarily at the framework level.

Ivy has been very useful for students and researchers, helping them convert code from papers between frameworks. It's still addressing a significant pain point, especially since many industries are still deploying with TensorFlow while most research happens in PyTorch.

However, the bigger commercial opportunity lies in the deployment space. We've found that many companies are spending millions of dollars a year on cloud costs for running their models. If we can provide even a 30% speed improvement and cost reduction, we're talking about saving hundreds of thousands of dollars annually.

Max Tee: That's exciting! How do you think about value alignment on the deployment side with your users? What are some key use cases you're seeing?

Daniel Lenton: The main use cases we're seeing are companies doing AI-as-a-service offerings, spending at least tens of thousands of dollars per month on compute. We've worked with companies spending anywhere from $400,000 to $1.5 million a year on compute.

For example, we closed a deal with a company where we cut their monthly bill from $35,000 to about $7,000 just through optimizations, without any compression or quantization.
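To put the quoted bill reduction in perspective, the arithmetic can be sketched in a few lines of Python. The function name `bill_reduction` is a hypothetical helper for illustration, and because the transcript gives the new bill only as "about" $7,000, the results are approximate.

```python
from typing import Tuple


def bill_reduction(before: int, after: int) -> Tuple[int, float, int]:
    """Return (monthly savings, percent reduction, annualized savings)."""
    if before <= 0:
        raise ValueError("the original bill must be positive")
    monthly = before - after
    percent = 100 * monthly / before
    return monthly, percent, monthly * 12


# Figures quoted in the conversation: $35,000 per month down to about $7,000.
monthly, percent, yearly = bill_reduction(35_000, 7_000)
print(f"${monthly:,} saved per month ({percent:.0f}% lower), about ${yearly:,} per year")
# prints "$28,000 saved per month (80% lower), about $336,000 per year"
```

The annualized figure assumes the monthly bill stays flat over the year, which the transcript does not state.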

For the Model Hub, our users are essentially anyone using LLMs who cares about cost efficiency, from small companies trying to be very efficient with their burn rate to massive companies spending a lot of money.

The Model Hub and Its Features

Rod Rivera: Could you tell us more about the Model Hub? How does it work, and what are its key features?

Daniel Lenton: The Model Hub is essentially a wrapper around existing LLM endpoints. We've abstracted these endpoints, allowing users to query our API and specify the provider they want. We then route the request to the appropriate provider with minimal added latency.
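The wrapper idea described here can be illustrated with a minimal Python sketch. This is not Unify's API: the provider names, the `PROVIDERS` registry, and the `query` function are all hypothetical, and the stand-in backends return a string instead of calling a real endpoint.

```python
from typing import Callable, Dict


# Hypothetical stand-in backends. A real backend would forward the prompt
# to the provider's HTTP endpoint and return the model's reply.
def provider_a(prompt: str) -> str:
    return f"[provider-a] {prompt}"


def provider_b(prompt: str) -> str:
    return f"[provider-b] {prompt}"


# One registry hides the differences between providers behind one signature.
PROVIDERS: Dict[str, Callable[[str], str]] = {
    "provider-a": provider_a,
    "provider-b": provider_b,
}


def query(prompt: str, provider: str) -> str:
    """Forward the prompt to the provider the caller asked for."""
    try:
        backend = PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return backend(prompt)


print(query("Hello", provider="provider-b"))  # prints "[provider-b] Hello"
```

Because callers only ever see `query`, switching providers is a one-argument change, which is what makes side-by-side benchmarking practical.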

We've also added runtime benchmarks that show metrics like tokens per second, inter-token latency, and time to first token. Users can flexibly switch between providers based on these metrics.

What we're rolling out soon is the ability to automatically route requests to the fastest or cheapest option based on user-defined profiles. Users can prioritize latency, cost, or a weighted balance between these factors and quality.

Rod Rivera: That sounds very powerful. How does pricing work for the Model Hub?

Daniel Lenton: For models we're wrapping, we don't inflate the price at all. You pay exactly what the original provider charges. Our idea is that as we gain more usage, we might be able to negotiate better rates with vendors through enterprise contracts.

We're also considering value-add features. For instance, instead of charging per API call, we might charge a percentage of the cost savings we provide. At the very least, we guarantee that users won't pay more than they would with any individual provider.

The Future of AI and Career Advice

Rod Rivera: Given your broad overview of the ecosystem, what are your thoughts on where we're headed? What trends are you seeing?

Daniel Lenton: It's hard to predict with certainty, but the broad theme is that data and compute seem to rule over everything else. We're seeing a move away from companies building lots of in-house models towards outsourcing to generative models.

I think multimodal AI is coming, which is why our hub isn't just for LLMs. The idea is that you send a prompt, and it goes to the best model for that task, whether it's text, image, or video generation.

I believe that within the next 10 years or so, AI might be at or near human-level ability in many areas. It wouldn't surprise me if there's no single job that a really good AI can't do.

Rod Rivera: That's a sobering thought. What advice would you give to those starting their careers or still studying, given this potential future?

Daniel Lenton: Math is always good, as is critical thinking. I think coding is still useful because even if AI can create simple front-end websites, the deep thought on long-term design and scalability of code is valuable to learn.

There might be more demand for skills like prompt engineering, but math will remain important. We'll need people who can understand complex systems to make sense of these AI "brains" and figure out how this form of intelligence works.

Closing Thoughts

Rod Rivera: Where can people find you and Unify, Daniel?

Daniel Lenton: You can find me at danlenton.com, but I'd suggest checking out unify.ai. There, you'll find links to our LinkedIn and Discord. I'm quite active on Discord and always welcome questions from the community.

Rod Rivera: Any last message for our audience?

Daniel Lenton: I'd encourage everyone to follow their passions. When I started my Ph.D. in AI in 2016, it wasn't the most stable career path, but I was passionate about robotics and AI. I built Ivy because I thought it was cool, with no expectations of where it might lead. By following your curiosity and doing what you're passionate about, you're always winning, regardless of the outcome.

Rod Rivera: Thank you so much, Daniel, for being here today and sharing your insights with us.

Daniel Lenton: It was my pleasure. Thank you for having me.