Amazon Nova family of base models: delivering comprehensive generative artificial intelligence (Gen AI) services

Amazon Nova系列基础模型：提供全面的生成式人工智能（Gen AI）服务

Amazon's latest cutting-edge foundation models enable cutting-edge intelligence and industry-leading price/performance.

From our Inferentia and Trainium chips built for inference and training, to providing the industry's best Foundation Models at Amazon Bedrock and AI-driven experiences such as Rufus cap (a poem) Alexa We are committed to delivering generative AI (Gen AI) solutions that have real value and create real meaning for consumers. Our goal is to simplify the lives of shoppers, sellers, advertisers, businesses, and everyone else through AI.

As the next step in our AI journey, we built Amazon Nova, a new generation of Foundation Models. With the ability to process text, images, and video as input, users can utilize Amazon Nova-powered generative AI applications to understand videos, charts, and documents, or to generate videos and other multimedia content.

"With about 1,000 generative AI apps running inside Amazon, we have a comprehensive understanding of the challenges app developers are still facing," said Rohit Prasad, senior vice president, Artificial General Intelligence, Amazon. "Our new Amazon Nova model is designed to help internal and external developers address these challenges by delivering compelling intelligence and content generation while making significant advances in latency, cost-effectiveness, customization, information traceability and agent capabilities."

Amazon Nova models are now available in Amazon Bedrock, including:

Amazon Nova Micro, a text-only model that provides the lowest latency response at a very low cost.
Amazon Nova Lite, a low-cost multimodal model capable of processing image, video, and text inputs at very fast speeds.
Amazon Nova Pro, a highly capable multimodal model that provides an optimal balance of accuracy, speed, and cost for a wide range of tasks.
Amazon Nova Premier(math.) genusAmazon's most powerful multimodal model for complex inference tasks and serves as the best teacher for distilling custom models (expected to launch in Q1 2025).
Amazon Nova Canvas, a top-notch image generation model.
Amazon Nova Reel, a top video generation model.

Amazon Nova Reel turns a single image into a short video by entering it and prompting "dolly forward".

How the Amazon Nova model will benefit customers

All Amazon Nova models offer superior capabilities, are responsive, cost-effective, and are designed for easy integration with customers' systems and data. They support multimodality and multiple tasks in more than 200 languages. Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro cost at least 75% less than the best-in-class performance models in Amazon Bedrock. they are also the fastest models in Amazon Bedrock in their respective intelligence categories. They are also the fastest models in Amazon Bedrock in their respective intelligence categories.

These models have been integrated with Amazon Bedrock, a fully managed service that delivers high-performance base models from leading AI companies and Amazon through one API. With Amazon Bedrock, customers can easily experiment with and evaluate the Amazon Nova model, as well as other base models, to determine the best fit for their application.

These models also support custom fine-tuning, where customers can point the model to their own labeled, proprietary data examples to improve accuracy.The Amazon Nova model learns what's important from the customer's own data (including text, images, and video), and then Amazon Bedrock trains a private, fine-tuned model to provide a customized response.

In addition to supporting fine-tuning, these models also support distillation techniques, which allow the specific knowledge of a more powerful "large teacher model" to be transferred into a smaller, more efficient model that runs faster and more economically while maintaining high accuracy.

Amazon Nova models also integrate with the Amazon Bedrock Knowledge Base and excel at Retrieval Augmented Generation (RAG), allowing customers to provide information traceability of responses through the organization's own data to ensure optimal accuracy.

The Amazon Nova model is optimized for ease of use and excels in agent-based applications that require multi-step tasks to be completed through multiple API interactions.

Creative Content Generation

With high-quality output, an intuitive API platform, and advanced customization capabilities, the Amazon Nova creative generation models (Amazon Nova Canvas and Amazon Nova Reel) are helping Amazon Ads remove barriers for sellers and advertisers, inspiring creativity and innovation like never before. Examples include using video ads for the first time, creating display and video campaigns for more products in the product catalog, and experimenting with new tactics like keyword-level creative. On average, brands using these tools are able to promote five times the number of products and double the number of images per promoted product, while shifting budgets to the tactics that work best.

In this example, Amazon Ads used Amazon Nova Reel to create a video ad for a fictional boxed pasta brand. In the imaginative "Spaghetti City," towering macaroni tubes are carved into buildings, neighborhoods are dotted with Italian spice landscapes, and streets are lined with delicious tomato sauce, spiralized spaghetti, and tender meatballs. The campaign shows how advertisers can bring their products to life and create high-quality content through Amazon Nova models.

Video comprehension

To test the video comprehension capabilities of Amazon Nova Pro, we asked the model to review and describe a silent video clip of a soccer game. The results contain detailed descriptions of the game scene, team uniforms, player movements, and the final outcome of the game.

Video cue: "Describe the video."

Output: "The video shows a soccer match being played on a green field. Two teams are wearing yellow and white uniforms. The quarterback of the yellow team passes the ball to a receiver, who catches the ball and starts running forward. The white team's defense tries to catch up and intercept the ball. The play ended with the receiver being tackled to the ground."

looking forward

We will introduce two new Amazon Nova models in 2025, including a speech-to-speech model and a native multimodal-to-multimodal (i.e., "any-to-any" modality) model. Our speech-to-speech model will be able to understand speech streamed in as natural language, parse linguistic and non-linguistic cues (such as intonation and rate of speech), and provide natural human-like interactions. And our any-to-any model will be able to process text, images, audio, and video as input and output. This will simplify application development by enabling the same model to perform multiple tasks, such as translating content from one modality to another, editing content, and supporting AI agents that can understand and generate all modalities.

Responsible AI

Amazon Nova models have built-in security measures and protection mechanisms. The company has introduced AWS AI Service Cards for Amazon Nova models, which provide users with transparent information about use cases, limitations and responsible AI practices.

This is just the beginning for Amazon Nova, and we're excited to continue innovating and delivering real value to every Amazon customer. Learn more And start experiencing Amazon NovaThe