Google’s Gemini AI is multi-modal, which means it can process and generate files in various formats, ranging from text and images to videos. Although it can generate audio, it has so far lacked the ability to process audio files uploaded by users. That finally changes, as Gemini now lets you upload audio files and ask questions about them.
What’s the big change?
The ability to upload audio files is now live in the Gemini mobile app and the web version, too. In the Gemini chat bubble, tap on the “+” icon and upload the audio clip by selecting the clip-shaped file upload icon. Oh, by the way, this feature is free for all Gemini users.
According to Google’s support page, you can upload audio clips of up to 10 minutes in length. But if you pay for the Gemini AI Pro or Ultra bundles, you can upload audio files with a run time of up to 3 hours.
In case you’re curious about what else you can feed to Gemini, and how much of it, here’s a quick rundown:
- Up to 10 files in one go, including ZIP files.
- Videos of up to 2GB in size, capped at 5 minutes in length for free users and 1 hour for paying customers.
- One code folder, or one GitHub repository (up to 5,000 files / 100MB size)
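If you want to check a recording against these caps before uploading it, the duration limits above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tier labels `free` and `paid`, the `LIMITS_SECONDS` table, and the `fits_limit` helper are hypothetical names made up for this example. It is a local check and does not call any Gemini API.

```python
# Duration caps as quoted in this article, in seconds.
# The dictionary layout and the "free"/"paid" tier labels are assumptions
# made for illustration, not anything Google publishes in this form.
LIMITS_SECONDS = {
    "audio": {"free": 10 * 60, "paid": 3 * 60 * 60},
    "video": {"free": 5 * 60, "paid": 60 * 60},
}


def fits_limit(kind: str, tier: str, duration_seconds: float) -> bool:
    """Return True if a clip of the given length is within the quoted cap."""
    return duration_seconds <= LIMITS_SECONDS[kind][tier]


if __name__ == "__main__":
    # A 45-minute lecture recording, expressed in seconds.
    lecture = 45 * 60
    print("Free tier: ", fits_limit("audio", "free", lecture))
    print("Paid tier: ", fits_limit("audio", "paid", lecture))
```

In this sketch, a 45-minute lecture exceeds the 10-minute free cap but fits comfortably under the 3-hour cap for paying customers.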
A boon for the bibliophiles
Not everyone loves digging into an audiobook, podcast, or lecture recording. Sometimes, walls of text are where the real magic happens, or where the cognitive comfort zone lies. If you count yourself among the folks who seek some aural liberation, this Gemini feature update is nothing short of a godsend. And yeah, audio support goes beyond the English language.
Now, whether you need a summary of a long lecture or just a few specific talking points from a podcast, Gemini will handle the audio and give you just what you want. You can ask it to write long reports or short briefs, or even turn the audio into knowledge slides that you can export as images.
On the flip side, we have the fantastic NotebookLM tool. It can turn your long text files into an engaging two-person audio podcast. If you prefer video overviews, it can do that as well. And while you’re at it, go and claim the free Gemini AI Pro plan that Google is offering to students in numerous countries, including the US.