Amazon Web Services (AWS) made waves at its re:Invent conference with the unveiling of Nova, a new family of artificial intelligence (AI) models. The suite includes five models: three large language models (LLMs) focused exclusively on text generation, plus an image-generation model and a video-generation model. The company claims that the new models, which are presently available on Amazon Bedrock, offer competitive pricing and enhanced intelligence.
Amazon detailed the new generation of AI models in a blog post. Five models in the Nova series have already been released, and Andy Jassy, the company's CEO, stated that a sixth, called Nova Premier, will follow in 2025. Three of the five, Nova Micro, Nova Lite, and Nova Pro, are limited to text generation, though they differ from one another. Micro delivers the lowest-latency responses in the series and accepts only text as input, with a context window of 128,000 tokens.
Nova Lite, on the other hand, also produces only text but accepts images, videos, and text as input. Nova Pro, the most capable multimodal AI model in the group, goes beyond the other two. Both Nova Lite and Nova Pro feature a context window of 300,000 tokens.
In addition to these, the Nova series includes two models that Amazon calls "creative content generation models." The first is Nova Canvas, an image-generation model that accepts both text and images as inputs; the company has promoted it as a tool for marketing, entertainment, and advertising. The second, a video-generation model called Nova Reel, can produce short videos from text and image inputs, and users can control camera motion with natural language prompts. Enterprise clients can access all of these models via the Amazon Bedrock platform.
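For developers, Bedrock exposes these models through its standard runtime API. The sketch below shows, under stated assumptions, how a text prompt might be sent to Nova Micro using boto3's Converse API; the model identifier `amazon.nova-micro-v1:0`, the region, and the inference parameters are assumptions, not values confirmed by this article.

```python
def build_request(prompt: str) -> dict:
    """Build a Bedrock Converse API request body for a text-only prompt.

    The model ID below is an assumed identifier for Nova Micro; check the
    Bedrock console for the IDs actually enabled in your account.
    """
    return {
        "modelId": "amazon.nova-micro-v1:0",
        "messages": [
            {"role": "user", "content": [{"text": prompt}]},
        ],
        "inferenceConfig": {"maxTokens": 256, "temperature": 0.7},
    }


def ask_nova(prompt: str) -> str:
    """Send the prompt to Bedrock and return the model's text reply.

    Requires AWS credentials with Bedrock access; boto3 is imported here
    so the request-building helper above stays usable without it.
    """
    import boto3  # third-party dependency, needed only for the live call

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.converse(**build_request(prompt))
    return response["output"]["message"]["content"][0]["text"]
```

This is a sketch, not a drop-in integration: multimodal models like Nova Lite and Nova Pro take additional content blocks (images, video) in the same `messages` structure.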
It is still unclear what safety precautions AWS has put in place for its Nova models to address problems such as misinformation and other security threats. The company also continues to withhold details on the data used to train its generative models.
Looking ahead, AWS is developing a speech-to-speech model for release in Q1 2025, which will process spoken input and return more natural, human-like spoken responses. An "any-to-any" model is expected around mid-2025, letting users input text, speech, images, or video and receive output in any of those modalities. Such a model could power applications ranging from AI assistants to content editors to translators.