AWS re:Invent Overview: Amazon AI Updates and NVIDIA Strategic Partnership

2023-11-29

During AWS re:Invent, NVIDIA brought new GPUs to AWS's cloud offerings and added a retrieval microservice to its AI Enterprise software platform on the AWS Marketplace. Amazon Web Services (AWS) announced an AI chatbot for enterprise use, a new generation of AI training chips, and expanded partner relationships at the AWS re:Invent conference, held November 27 to December 1 in Las Vegas. In his keynote on the second day of the conference, AWS CEO Adam Selipsky highlighted generative AI and how cloud services enable organizations to train powerful models.

Graviton4 and Trainium2 Chip Releases

AWS announced the next generations of Graviton, its server processor designed for cloud workloads, and Trainium, which provides the compute for training AI foundation models. Graviton4 (Figure A) offers 30% better compute performance, 50% more cores, and 75% more memory bandwidth than Graviton3, according to Selipsky. The first Graviton4-based instance will be the R8g EC2 instance for memory-intensive workloads. Trainium2 will power Amazon EC2 Trn2 instances, which can be scaled up to 100,000 Trainium2 chips. AWS stated in a press release that this provides the capability to train large language models with 300 billion parameters in a matter of weeks. Anthropic will use Trainium and Amazon's high-performance machine learning chip, Inferentia, to train its AI models, as announced by Selipsky and Anthropic CEO and co-founder Dario Amodei. These chips may help AWS compete with Microsoft, which recently announced AI chips of its own.

Amazon Bedrock: Adding Content Governance and Other Features

During re:Invent, Selipsky made several announcements about Amazon Bedrock, the foundation model building service:

- Agents for Amazon Bedrock are now generally available.
- Custom models built through fine-tuning and continued pre-training are now available in preview for customers in the US.
- Governance features for Amazon Bedrock are coming soon; they will let organizations align Bedrock with their own AI content restrictions using natural language guidance.
- Knowledge bases for Amazon Bedrock are now generally available in the US, connecting foundation models in Amazon Bedrock to internal company data for retrieval-augmented generation (a minimal invocation sketch follows this list).
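As a rough illustration of what calling a foundation model hosted in Amazon Bedrock looks like from code, the sketch below uses boto3's bedrock-runtime client. The region, model ID, and request payload shown here are illustrative assumptions; the request and response schemas differ per model provider, so check the Bedrock documentation for the model you enable.

```python
import json

import boto3

# Bedrock exposes hosted foundation models through the bedrock-runtime client.
# Region and model ID below are illustrative; availability varies by account and region.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Request schema is model-specific; this follows the Anthropic Claude text format.
payload = {
    "prompt": "\n\nHuman: Summarize the key AWS re:Invent 2023 announcements.\n\nAssistant:",
    "max_tokens_to_sample": 300,
}

response = client.invoke_model(
    modelId="anthropic.claude-v2",  # example model; use any model enabled in your account
    body=json.dumps(payload),
    contentType="application/json",
    accept="application/json",
)

# The response body is a stream; decode it to read the model's completion.
result = json.loads(response["body"].read())
print(result["completion"])
```

The response schema is also model-specific; Claude-family text models on Bedrock return the generated text under a completion key, as assumed above.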
Amazon Q: AWS Enters the Chatbot Race

AWS launched its generative AI assistant, Amazon Q, which is designed for natural language interaction and content generation tasks and can adapt to an organization's existing identities, roles, and security permissions. Amazon Q can be used across an organization and can connect to many other business software applications. AWS positions Amazon Q as a business-focused tool for individual employees who have specific questions about their own sales or tasks. Amazon Q is particularly useful for developers and IT professionals working in AWS CodeCatalyst, where it can help troubleshoot errors or network connectivity issues. Amazon Q will be available in the AWS Management Console, in documentation and Amazon CodeWhisperer, in the serverless computing platform AWS Lambda, and in workplace communication applications such as Slack (Figure B). Amazon Q also lets application developers update their applications using natural language commands. This feature is currently available in preview in AWS CodeCatalyst and will roll out to supported integrated development environments soon. Many Amazon Q features are now available in preview across other AWS services and products; for example, contact center administrators can now access Amazon Q in Amazon Connect.

Amazon S3 Express One Zone Now Open for Business

Amazon S3 Express One Zone, now generally available, is a new S3 storage class built specifically for frequently accessed data, offering high-performance, low-latency cloud object storage, according to Selipsky. It is designed for workloads that require millisecond-level latency, such as finance or machine learning. Previously, customers often moved data out of S3 into their own caching solutions; with Amazon S3 Express One Zone, they can instead select a specific availability zone and colocate frequently accessed data with their high-performance compute. Selipsky said data access costs for Amazon S3 Express One Zone can be 50% lower than for standard Amazon S3.

Salesforce Now Available on AWS Marketplace

On November 27, AWS announced an expanded partnership with Salesforce that lets joint Salesforce and AWS customers in the United States access certain Salesforce CRM products on the AWS Marketplace. The products include Salesforce's Data Cloud, Service Cloud, Sales Cloud, Industry Clouds, Tableau, MuleSoft, Platform, and Heroku. More products and expanded geographic availability are expected next year. New options include:

- Amazon Bedrock AI services will be available within Salesforce's Einstein Trust Layer.
- Salesforce Data Cloud will support data sharing between Salesforce and AWS technologies, including Amazon Simple Storage Service.

"Salesforce and AWS enable developers to easily and securely access and leverage data with generative AI technology to drive rapid transformation for their organizations and industries," stated Selipsky in a press release. In turn, AWS will increase its internal use of Salesforce products such as Salesforce Data Cloud.

AWS Removes ETL from More Amazon Redshift Integrations

ETL can be a cumbersome part of preparing transactional data for analytics. Last year, Amazon announced zero-ETL integration between Amazon Aurora MySQL and Amazon Redshift. At re:Invent, AWS introduced more zero-ETL integrations with Amazon Redshift:

- Aurora PostgreSQL
- Amazon RDS for MySQL
- Amazon DynamoDB

All three are now available in preview worldwide. The next step AWS wants to take is to make searching transactional data smoother; many customers use Amazon OpenSearch Service for this. In response, Amazon announced a DynamoDB zero-ETL integration with OpenSearch Service. Additionally, to make data more discoverable within Amazon DataZone, Amazon added a capability that uses generative AI to add business descriptions to datasets.
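Once a zero-ETL integration is replicating transactional data into Amazon Redshift, analytics becomes an ordinary query against the replicated tables. The sketch below uses boto3's Redshift Data API; the workgroup, database, and orders table names are hypothetical placeholders standing in for whatever the integration creates in your account.

```python
import time

import boto3

# The Redshift Data API runs SQL without managing drivers or persistent connections.
client = boto3.client("redshift-data", region_name="us-east-1")

# With a zero-ETL integration, rows written to the source database (e.g., Aurora
# PostgreSQL or DynamoDB) appear in Redshift shortly afterward, so there is no
# pipeline to build -- just a query to run.
statement = client.execute_statement(
    WorkgroupName="analytics-wg",   # assumed Redshift Serverless workgroup
    Database="zeroetl_db",          # assumed database backed by the integration
    Sql="SELECT order_status, COUNT(*) FROM orders GROUP BY order_status;",
)

# Poll until the statement finishes, then fetch the result set.
while True:
    desc = client.describe_statement(Id=statement["Id"])
    if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(1)

if desc["Status"] == "FINISHED":
    rows = client.get_statement_result(Id=statement["Id"])["Records"]
    print(rows)
```

A provisioned cluster would pass ClusterIdentifier (plus DbUser or SecretArn) instead of WorkgroupName.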
Introducing Amazon One Enterprise Identity Verification Scanner

Amazon One Enterprise enables secure access management for physical locations in industries such as hospitality, education, and technology. It is a fully managed service that pairs the Amazon One palm scanner for biometric authentication with administration through the AWS Management Console. Amazon One Enterprise is currently available in preview in the United States.

NVIDIA and AWS Reach Cloud Agreement

NVIDIA is bringing a range of new GPUs to AWS, including the NVIDIA L4, NVIDIA L40S, and NVIDIA H200 GPUs. AWS will be the first cloud provider to bring NVIDIA GH200 Grace Hopper Superchips connected with NVLink to the cloud. Through NVLink, the GPU and CPU can share memory to accelerate processing, as NVIDIA CEO Jensen Huang explained during Selipsky's keynote. Amazon EC2 G6e instances with NVIDIA L40S GPUs and Amazon EC2 G6 instances with L4 GPUs will launch in 2024. In addition, NVIDIA's AI training platform, NVIDIA DGX Cloud, will soon be available on AWS; an availability date has not been announced. NVIDIA has also chosen AWS as the primary cloud partner for Project Ceiba, NVIDIA's 65-exaflop supercomputer, which includes 16,384 NVIDIA GH200 Superchips.

NVIDIA NeMo Retriever

Another announcement made during re:Invent was NVIDIA NeMo Retriever, which lets enterprise customers deliver more accurate responses in their multimodal generative AI applications using retrieval-augmented generation. Specifically, NVIDIA NeMo Retriever is a semantic retrieval microservice that connects custom LLMs to applications. Its embedding model determines the semantic relationships between words; the relevant data is then passed to the LLM, which processes and analyzes it. Commercial customers can connect the LLM to their own data sources and knowledge bases. NVIDIA NeMo Retriever is now available in early access through the NVIDIA AI Enterprise software platform, accessible via the AWS Marketplace. Early partners working with NVIDIA on retrieval-augmented generation services include Cadence, Dropbox, SAP, and ServiceNow.
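To make the retrieval-augmented generation flow described above concrete, here is a minimal, self-contained sketch of the pattern: embed documents, retrieve the one most semantically similar to a query, and prepend it to the LLM prompt. The embed function and example documents are toy stand-ins, not NeMo Retriever's actual interface; a real deployment would use NeMo Retriever's embedding microservice and a production LLM.

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector standing in for a semantic embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 1. Index enterprise documents by embedding them.
documents = [
    "Trainium2 instances can scale to 100,000 chips for model training.",
    "Amazon S3 Express One Zone targets millisecond-latency workloads.",
]
index = [(doc, embed(doc)) for doc in documents]

# 2. Embed the user's question and retrieve the most semantically similar document.
question = "Which storage class is built for low-latency workloads?"
q_vec = embed(question)
best_doc, _ = max(index, key=lambda item: cosine(q_vec, item[1]))

# 3. Augment the LLM prompt with the retrieved context before generation.
prompt = f"Context: {best_doc}\n\nQuestion: {question}\nAnswer:"
print(prompt)  # this prompt would be sent to the customer's own LLM
```

The design point is the same one NVIDIA describes: retrieval grounds the model in the customer's own data before generation, so answers reflect current enterprise content rather than only what the LLM memorized in training.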