Alright, folks, listen up! This week, we've got some exciting news coming out of Anyscale's Ray Summit. Now, if you're into LLMs and generative AI for developers, you're gonna wanna pay attention to this.
The CEO of Anyscale, Robert Nishihara, kicked off the Ray Summit by dropping a truth bomb. He warned us that the LLM era is about to enter a whole new level of complexity and data intensity. That's right, we're talking about multimodal models, baby! Text data, video data, image data – it's all gonna get mixed together. And let me tell you, that mix is gonna demand serious hardware acceleration and drive up application complexity.
But fear not! Anyscale has got our back. They've already got this awesome open-source platform called Ray, which is being used by big names like OpenAI and Uber. But now, they're taking it to the next level with Anyscale Endpoints. This bad boy allows developers to integrate, fine-tune, and deploy open-source LLMs at scale. It's an LLM API just like the OpenAI API, but for open models like Llama 2. And get this, folks – it's gonna cost you just $1 per million tokens. That's a steal!
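Since Endpoints mimics the OpenAI API, calling it looks like any OpenAI-style chat-completion request. Here's a minimal sketch using only the standard library – the base URL and model name below are assumptions for illustration (check Anyscale's docs for the real values), and the network call only fires if an API key is set:

```python
import json
import os
import urllib.request

# Hypothetical endpoint URL and model name -- placeholders, not confirmed
# values from the article. The point is that the request payload has the
# same shape as an OpenAI chat-completion request.
API_BASE = "https://api.endpoints.anyscale.com/v1"
MODEL = "meta-llama/Llama-2-7b-chat-hf"

def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def send(payload: dict, api_key: str) -> dict:
    """POST the payload to the (assumed) endpoint and parse the JSON reply."""
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("Summarize Ray Summit in one sentence.")
key = os.environ.get("ANYSCALE_API_KEY")
if key:  # skip the network call when no key is configured
    print(send(payload, key))
```

The draw here is that swapping a proprietary model for an open one is mostly a matter of changing the base URL and model string, since the request shape stays the same.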
Now, Endpoints comes with the ability to fine-tune an LLM, which is pretty neat for any application that doesn't require massive scale. But if you want to get all fancy and customize your LLM even further, you'll need to upgrade to the full Anyscale AI Application Platform. That one gives you complete control over your data, models, and app architecture. You can even deploy multiple AI applications on the same infrastructure. Talk about power!
But wait, there’s more! Anyscale just announced Anyscale Private Endpoints. This sweet feature allows customers to run the service inside their own cloud. Now that’s what I call flexibility!
A Sit-Down with OpenAI Co-Founder John Schulman
Now, on top of all these exciting product announcements, Nishihara sat down with none other than John Schulman, the co-founder of OpenAI and one of the masterminds behind ChatGPT. They had a little chat about the importance of scaling models and compute. Schulman dropped some knowledge bombs, folks.
He explained that the founding team of OpenAI believes in scaling up simple things rather than building complicated clever things. But here's the thing – scaling in machine learning ain't as easy as it sounds. There are all these little details to take care of. You gotta scale your learning rates just right, or else big models end up giving you worse results. And don't even get me started on scaling up your data along with the model size. It's a delicate balance, my friends.
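How much data has to grow with model size isn't something Schulman quantified in the talk, but a widely cited rule of thumb from DeepMind's Chinchilla paper (Hoffmann et al., 2022) says compute-optimal training uses roughly 20 tokens per model parameter. A quick sketch of that rule, as an outside assumption rather than anything from the Summit:

```python
# Rough rule of thumb from the Chinchilla scaling-laws work, not from the
# talk itself: compute-optimal training uses ~20 tokens per parameter, so
# the dataset has to grow in lockstep with the model.
TOKENS_PER_PARAM = 20

def compute_optimal_tokens(n_params: float) -> float:
    """Approximate training tokens for a compute-optimal run."""
    return TOKENS_PER_PARAM * n_params

for n in (7e9, 70e9):
    print(f"{n / 1e9:.0f}B params -> ~{compute_optimal_tokens(n) / 1e12:.2f}T tokens")
```

So a 70B-parameter model wants on the order of 1.4 trillion training tokens under this heuristic – which is exactly why "just make the model bigger" without more data falls apart.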
Nishihara wasn’t done yet. He wanted to know why OpenAI isn’t already using 70 trillion parameter models or even bigger ones. And Schulman had an answer ready. It’s all about compute efficiency, my friends. With a fixed compute budget, you can train a small model for a long time or a big model for a short time, and the trick is finding the sweet spot between the two. Right now, a 70 trillion parameter model just ain’t compute-optimal. But who knows, things might change in the future!
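That small-model-for-long versus big-model-for-short tradeoff can be made concrete with the standard approximation from the scaling-laws literature – training compute C ≈ 6 × N × D (parameters × tokens). This is an assumption for illustration, not a number Schulman gave:

```python
# Standard back-of-envelope approximation: training FLOPs are about
# 6 * params * tokens. At a fixed budget, model size N and training
# tokens D trade off directly against each other.
def flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

budget = flops(70e9, 1.4e12)      # one way to spend a fixed budget
small_long = flops(7e9, 14e12)    # 10x smaller model, 10x more tokens
assert small_long == budget       # same compute, very different models
print(f"budget: {budget:.3e} FLOPs")
```

Both configurations burn identical compute, but they produce very different models – that's the sweet-spot search Schulman is describing, and why a 70-trillion-parameter model isn't automatically the right way to spend a budget.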
Now, here’s where it gets interesting. Turns out, OpenAI is pushing the limits of scale in many different dimensions. And guess what? They’re using Anyscale’s Ray system for distributed computing. Boom! Schulman spilled the beans. They’ve got a library for distributed training that does model parallelism, and Ray is a big part of that, handling all the communication. It’s like their secret sauce.
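Schulman didn't go into the internals, so here's only a toy illustration of what model parallelism means, in plain Python with no Ray or real framework involved: a weight matrix gets split into shards, each shard computes its piece, and the results are stitched back together – the coordination and communication step being what a system like Ray handles across actual machines:

```python
# Toy sketch of model parallelism, NOT OpenAI's or Ray's actual code:
# a linear layer's weight matrix is split row-wise across "workers",
# each computes its shard, and the outputs are concatenated.
def matvec(weights: list, x: list) -> list:
    """y = W @ x, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def sharded_matvec(weights: list, x: list, n_shards: int = 2) -> list:
    """Split W row-wise into shards, compute each shard, concatenate."""
    size = len(weights) // n_shards
    shards = [weights[i * size:(i + 1) * size] for i in range(n_shards)]
    out = []
    for shard in shards:              # in real model parallelism, each
        out.extend(matvec(shard, x))  # shard runs on a different GPU/worker
    return out

W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
x = [1.0, 1.0]
assert sharded_matvec(W, x) == matvec(W, x)  # sharding preserves the result
```

The whole game is that the math comes out identical whether it runs on one worker or many; the hard part at OpenAI's scale is the communication between shards, which is the layer Schulman credits Ray with handling.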
To wrap things up, Nishihara posed a thought-provoking question to Schulman. He wanted to know what problems in AI are still being figured out today. Schulman dropped some serious insight. He mentioned the problem of supervising a model that’s superhuman. How do you make sure LLMs are doing what humans want? It’s what they call scalable oversight, or scalable supervision. It’s a whole new ball game, my friends, and the problems haven’t even been fully defined yet. Mind-blowing stuff!