Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here’s a handy roundup of the last week’s stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.
This week, Google dominated the AI news cycle with a range of new products that launched at its annual I/O developer conference. They run the gamut from a code-generating AI meant to compete with GitHub’s Copilot to an AI music generator that turns text prompts into short songs.
A fair number of these tools look to be legitimate labor savers, which is to say more than marketing fluff. I’m particularly intrigued by Project Tailwind, a note-taking app that leverages AI to organize, summarize and analyze files from a personal Google Docs folder. But the new tools also expose the limitations and shortcomings of even the best AI technologies today.
Take PaLM 2, for example, Google’s newest large language model (LLM). PaLM 2 will power Google’s updated Bard chat tool, the company’s competitor to OpenAI’s ChatGPT, and function as the foundation model for most of Google’s new AI features. But while PaLM 2 can write code, emails and more, it also, like comparable LLMs, responds to questions in toxic and biased ways.
Google’s music generator, too, is fairly limited in what it can accomplish. As I wrote in my hands-on, most of the songs I’ve created with MusicLM sound passable at best and, at worst, like a four-year-old let loose on a DAW.
There’s been much written about how AI will replace jobs — potentially the equivalent of 300 million full-time jobs, according to a report by Goldman Sachs. In a survey by Harris, 40% of workers familiar with OpenAI’s AI-powered chatbot tool, ChatGPT, are concerned that it’ll replace their jobs entirely.
Google’s AI isn’t the be-all and end-all. Indeed, the company’s arguably behind in the AI race. But it’s undeniable that Google employs some of the top AI researchers in the world. And if this is the best they can manage, it’s a testament to the fact that AI is far from a solved problem.
Here are the other AI headlines of note from the past few days:
Andrew Ng’s new company Landing AI is taking a more intuitive approach to training computer vision models. Making a model understand what you want to identify in images is pretty painstaking, but the company’s “visual prompting” technique lets you just make a few brush strokes and it figures out your intent from there. Anyone who has to build segmentation models is saying “my god, finally!” That probably includes a lot of grad students who currently spend hours masking organelles and household objects.
Microsoft has applied diffusion models in a unique and interesting way, essentially using them to generate an action vector instead of an image, having trained them on lots of observed human actions. It’s still very early and diffusion isn’t the obvious solution for this, but as diffusion models are stable and versatile, it’s interesting to see how they can be applied beyond purely visual tasks. The paper is being presented at ICLR later this year.
Meta is also pushing the edges of AI with ImageBind, which it claims is the first model that can process and integrate data from six different modalities: images and video, text, audio, 3D depth data, thermal info, and motion or positional data. This means that in its little machine learning embedding space, an image might be associated with a sound, a 3D shape, and various text descriptions, any one of which could be asked about or used to make a decision. It’s a step towards “general” AI in that it absorbs and associates data more like the brain does, but it’s still basic and experimental, so don’t get too excited just yet.
If these proteins touch… what happens?
Everyone got excited about AlphaFold, and for good reason, but really structure is just one small part of the very complex science of proteomics. It’s how those proteins interact that is both important and difficult to predict — but this new PeSTo model from EPFL attempts to do just that. “It focuses on significant atoms and interactions within the protein structure,” said lead developer Lucien Krapp. “It means that this method effectively captures the complex interactions within protein structures to enable an accurate prediction of protein binding interfaces.” Even if it isn’t exact or 100% reliable, not having to start from scratch is super useful for researchers.
The feds are going big on AI. The President even dropped in on a meeting with a bunch of top AI CEOs to say how important getting this right is. Maybe a bunch of corporations aren’t necessarily the right ones to ask, but they’ll at least have some ideas worth considering. But they already have lobbyists, right?
I’m more excited about the new AI research centers popping up with federal funding. Basic research is hugely needed to counterbalance the product-focused work being done by the likes of OpenAI and Google — so when you have AI centers with mandates to investigate things like social science (at CMU), or climate change and agriculture (at U of Minnesota), it feels like green fields (both figuratively and literally). Though I also want to give a little shout out to this Meta research on forestry measurement.
Doing AI together on a big screen — it’s science!
Lots of interesting conversations out there about AI. I thought this interview with UCLA (my alma mater, go Bruins) academics Jacob Foster and Danny Snelson was a good one. Here’s a great thought on LLMs to pretend you came up with this weekend when people are talking about AI:
“These systems reveal just how formally consistent most writing is. The more generic the formats that these predictive models simulate, the more successful they are. These developments push us to recognize the normative functions of our forms and potentially transform them. After the introduction of photography, which is very good at capturing a representational space, the painterly milieu developed Impressionism, a style that rejected accurate representation altogether to linger with the materiality of paint itself.”
Definitely using that!
Source @TechCrunch