Generative AI on AWS Essentials

Archishman Bandyopadhyay
Jul 8, 2024
10 min read

AWS offers the following generative AI services for customers:

Amazon Q: A generative AI–powered assistant designed for work that can be tailored to meet business needs.
Amazon Bedrock: The easiest way to build and scale generative AI applications with FMs
Amazon Elastic Compute Cloud instances powered by AWS Inferentia and AWS Trainium: The best price-performance infrastructure for training and inference in the cloud
Amazon SageMaker Access and fine-tuning of a wide selection of FMs with Amazon SageMaker

Generative AI stack

Top layer

The top layer provides generative AI applications that customers can use to work with FMs without specialized knowledge or coding.

Middle layer

The middle layer provides tools, like Amazon Bedrock, that can be used to customize and train FMs for customers seeking to develop generative AI applications. This tier also assists customers in evaluating, selecting, and consuming these models through various distribution channels.

Bottom layer

The bottom layer provides solutions to customers looking to optimize the FM training and inference costs.

AWS' Generative AI Services

1. Amazon Q

Amazon Q is a generative AI-powered assistant that is designed specifically for work and can be tailored to a customer’s business. Customers can get fast, relevant answers to pressing questions, generate content, and take actions – all informed by their information repositories, code, and enterprise systems. Amazon Q provides tailored information to employees to streamline common business tasks, quickly build applications on AWS, reduce time to business insights, provide better customer service, and plan and manage supply chain inventory more efficiently.

A. Amazon Q Business is a fully managed, generative AI-powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. Amazon Q provides immediate and relevant information to employees. It also streamlines tasks and accelerates problem solving.

B. Amazon Q Developer can explain and update specific lines of code in your integrated development environment (IDE). To get updated code, send your code to Amazon Q. It generates new code that reflects the changes that you asked it to make. Then, you can insert the updated code directly into the file where the code originated.

With Amazon Q Developer, one can choose from the following options:

Explain – Get your code explained in natural language.
Refactor – Improve code readability or efficiency, among other improvements.
Fix – Debug code.
Optimize – Enhance code performance.
Send to prompt – Send the highlighted code to the Amazon Q chat panel, and ask questions that you have about the code.

C. Amazon Q brings its advanced generative AI technology to Amazon QuickSight, the AWS unified business intelligence (BI) service built for the cloud. With Amazon Q in QuickSight, customers get a generative BI assistant that allows business analysts to use natural language to build BI dashboards in minutes and easily create visualizations and complex calculations.

D. In Amazon Connect, the contact center service, Amazon Q helps customer service agents provide better customer service. Amazon Q in Connect uses the real-time conversation with the customer along with relevant company content to automatically recommend what to say or what actions an agent should take to better assist customers.

2. Amazon Bedrock

Amazon Bedrock is a fully managed service that makes FMs from leading AI startups and Amazon available through an API. Customers can choose from a wide range of foundation models to find the model that is best suited for their use case.

Amazon Bedrock is the easiest way to build and scale generative AI applications with FMs.

Amazon Bedrock supports the following foundation models :

Guardails for Amazon Bedrock–

Implement safeguards customized to your application requirements and responsible AI policies.

Apply guardrails to multiple FMs and Agents for Amazon Bedrock to bring a consistent level of AI safety across all your applications.
Block undesirable topics in your generative AI applications for a relevant and safe user experience.
Filter harmful content based on your responsible AI policies.
Redact sensitive information, such as PIIs, to protect privacy.
Block inappropriate content with a custom word filter.

AWS Inferentia and AWS Trainium EC2 instances :

These are purpose-built ML accelerators that AWS designed from the ground up.

The first-generation of AWS Inferentia delivers significant performance and cost-savings benefits for deploying smaller models. AWS Trainium and AWS Inferentia2 were built for training and deploying ultra-large generative AI models with hundreds of billions of parameters.

3. Amazon SageMaker

Amazon SageMaker is a fully managed service provided by Amazon Web Services (AWS) that enables developers and data scientists to build, train, and deploy machine learning models at scale.

Benefits:

Offers a wide selection of FMs and access to the latest publicly available FMs for faster time to market
Provides ML practitioners the capabilities to build, train, and deploy LLMs and foundation models from scratch.
A secure environment to customize the models with user data
Broad and deep capabilities for evaluation, experimenting and industrializing FMs to ensure they meet user needs
A secure and reliable fully managed infrastructure with scalability and high-performance.

Amazon SageMaker JumpStart is the machine learning hub of Amazon SageMaker that helps customers to discover built-in content to develop their next machine learning models.

Key Customer Personas

A. Business decision-makers

Sample job titles: Business executives; VP, head, or director of marketing; finance; operations; customer service; supply chain; logistics; and legal.

Responsibility: Business decision-makers have the authority to find and fund technology solutions that can help achieve their business outcomes. They are motivated by improving company performance, creating a competitive advantage with innovative technology, and potential AI/ML business results.

Business decision-makers play a key role in the tech purchasing and decision-making process for their division. They stay informed through their peer network and their employees. They also read industry analyst reports and browse vendor websites.

Discovery questions for C-suite business decision-makers

Long-Term Vision and Short-Term Goals for AI/ML

What is your vision for AI/ML and your company?
What are your business goals for AI for this year?

Goals, Expectations, and Challenges for Generative AI/ML

What customer experiences or use cases do you expect to transform with generative AI?
How might you improve customer retention with personalized user experiences?

Priorities in Building a Generative AI Application

Rank the importance of the following for choosing a foundation model:
Performance
Latency
Cost
Accuracy

Track Record with Previous AI/ML Projects

Have you achieved your business outcomes with previous AI/ML projects?
What successes or challenges have you had with technology partners in AI/ML?
What business operations would you automate first with AI capable of generating novel content?

Discovery Questions for C-Suite CMO and Marketing

How can you accelerate content production?
How can you increase reach with personalized ads?
How can you quickly create localized content?

B. Technical decision-makers

Sample job titles: Chief cloud architect, head or director of data science, analytics, AI and ML, IT executives (excluding c-suite) and professionals

Responsibilities: Technical decision-makers usually seek a deeper understanding of the technology. They want to see product details, use cases, and clear explanations on what AWS can do for generative AI in relation to their individual goals. Their top priorities are increasing innovation, delivering IT projects quickly, aligning IT performance metrics to business outcomes, and cutting overall IT costs.

Technical decision-makers often guide technology decisions as the final decision-maker or by contributing recommendations. So they look for information that they can trust.

Discovery questions for C-suite business decision-makers (data scientists, managers, developers, and VPs of engineering)

Goals and Use Cases for Generative AI

What customer experiences or use cases do you expect to transform with AI?
How long do you have to wait for media to be created to finish building new products and user experiences?
In what aspects of your supply chain are you experimenting with generative design technology?
How might using AI to generate media-rich content impact your digital assets strategy?
What kinds of experiences have you always dreamed of providing to your customers, but lacked the original media to create efficiently?

Challenges and Blockers with AI

What challenges has your company faced in launching AI/ML-related products and business processes? Where do things get blocked?
Do you have in-house development teams or outsource your development?
How do you measure developer productivity?
Do you have organizational goals around developer productivity or cost reduction?
Have your software engineering teams experimented with AI-generated code to accelerate rudimentary software engineering tasks?

Plans for Models and Customization

Have you decided on which model you are using? Or do you plan to customize foundation models (FMs) for your use case or industry?
Do you plan to customize FMs for your particular use case or industry?
Would you prefer to maintain control over your hosting instances?
Do you know how to evaluate the right model for the use case you are trying to solve with Generative AI?

Examples of Generative AI Models:

Amazon Titan (Titan Text and Titan Embeddings) from Amazon
Jurassic-2 from AI21 Labs
Claude from Anthropic
Stable Diffusion from Stability AI
Mistral from Mistral AI
Llama from Meta

Working with Partners for AI/ML

What achievements or difficulties have you encountered when seeking assistance from an AWS partner to organize and develop your AI/ML capabilities?

C. Builders

Sample job titles: Developer, data scientist, ML developer, ML engineer, BI engineer, director of development productivity, VP of engineering, software development managers

Responsibilities: Builders are the ones implementing generative AI applications. They want to use foundation models to create generative AI applications aligned with business objectives. They are also searching for ways to enhance productivity of their teams to accelerate time to market for their applications.

Discovery questions for Builders

Data Scientists and ML Practitioners

Current ML Approach and Challenges:

What ML platform are you currently using to evaluate and deploy custom ML models?
What are some of the challenges that you face in the model lifecycle?

Deploying Foundation Models and Barriers:

Are you able to quickly and easily deploy foundation models today?
What are the barriers?

Partnering to Deliver Generative AI Capabilities:

How are you partnering with your product and business stakeholders to provide them with generative AI capabilities quickly, securely, and cost-effectively?

Look for the Following:

Data volume on or coming to AWS
Creating novel content (e.g., images for ads, product descriptions, document summarizations, contextual chatbots)
Experience using Amazon SageMaker
Deployment of models on Amazon EC2 directly or from Hugging Face
Desire to stay in AWS or customize a model for their domain (e.g., FinServ, HCLS)

Developers

Experience Level and Time Spent:

What is your team’s current level of experience with coding and programming?
How much time do you typically spend on coding tasks?

Planned Use Cases and Tasks to Automate:

What type of application or project are you planning to use the code generator for?
What specific coding tasks do you think a code generator would be most helpful for, and why?
What are some examples of tasks or functions that you would like the code generator to automate for you?

Desired Features and Capabilities:

How important is customizability to you, and how much flexibility do you need in terms of generating code that meets your specific needs?
What kind of support or training do you think you would need to use the code generator effectively?
What resources would be most helpful to you?
Are there any particular features or functionalities that you would like the code generator to have (e.g., version control integration, testing or debugging support, collaboration features)?

Top of mind across key personas

Security and compliance: Customers place high priority on data security protections, regulatory compliance, and governance policies. Using secure infrastructure and following best practices helps mitigate risks.

Privacy: Customers want assurances that models do not disclose private data. Controls should prevent unauthorized data usage during model development.

Risk management: Customers are concerned about potential misuse when deploying generative AI applications. Proactive threat modeling and defense-in-depth security controls can help manage risks.

Balancing innovation with security: Customers want to experiment with new technologies but also manage risks and maintain security. Collaborative governance, review processes, and security guidance can enable innovation while upholding standards.

Content generation: Customers want assurances that AI-generated content aligns with brand messaging without hallucinations. Controls to prevent copyright infringement are also important.

Generating insights: Data analysts or marketing teams can use AI to analyze data and identify high-potential leads for campaigns to improve business outcomes.

Customer Use Cases

By Business Objectives

Enhancing the customer experience

Improving customer engagement with seamless omnichannel access to virtual agents across different collaboration tools and channels
Empowering customers to quickly find answers and complete transactions on their own by implementing conversational voice and text-based chatbots
Discovering business insights out of real-time or recorded conversations to detect emerging trends

Boosting employee productivity

Improving productivity of internal teams that keep systems running
Saving time and improving accuracy as the system automatically trains on each customer's information
Creating faster time to market

Reducing cost of serviceImproving business operations

Improving employee productivity
Increasing operational efficiencies
Enhancing customer experiences

By Industry

Healthcare

Medical research: Patient-to-trial matching, multi-modal data analysis
Clinical efficiency: Longitudinal patient records for full patient picture, automate medical image interpretation
Operational efficiency: Auto-generate referral letters, clinical coding, and prior authorization
Patient experience: Patient outcome prediction, personalized patient discharge instructions and treatment plans
Digital health: Patient care concierge, remote care management

Life sciences

Research and discovery: Protein folding, protein design
Clinical development: Optimizing trial protocols, patient cohorts, and sites
Manufacturing: Predictive maintenance, resource optimization
Commercial and medical affairs: Patient outcome prediction, content generation
Patient support: Patient care concierge, patient-to-trial matching

Financial services

AI-managed portfolios: Create highly tailored investment strategies and portfolios aligned to specific financial goals and risk profiles
Increase the business value of unstructured content: Create on-demand structured data products from large unstructured data sources such as emails, document repositories, and filings
Drive product innovation and automate business processes: Develop new tools, such as stock screening using natural language search, for end-users
Intelligent advisory: Automatically translate complex questions from internal users and external customers into their semantic meaning, analyze for context, and then generate highly accurate and conversational responses
Transform financial documentation: Quickly draft investment research, loan documentation, insurance policies, regulatory communications, requests for information (RFI), and business correspondence

Manufacturing

Operational efficiency: Text generation for contracts and SOPs, customer service and agent assistants, research and summarization for supply-chain optimization
Reduce time and cost of production: Agents and search for plant maintenance, operations, and research
Product design optimization: Generate and enhance new product design, market and customer research to support market development
Real-time equipment diagnostics: Ingest historical data and diagnose equipment failures in real time to recommend maintenance actions
Training content generation: Generative conversational agents can be trained on product manuals, troubleshooting guides, and maintenance notes to deliver swift technical support to workers, reducing downtimes

Retail

Better customer experience: Provide shoppers with a more natural, personalized experience at scale, use natural language to narrow products down to what the customer is specifically looking for
Data insights: Consume large amounts of data like sales, returns, or product reviews to summarize trends
Optimized operations: Make better merchandising decisions, automate the generation of product categories and decisions, track vessels through scraping public vessel or freight locations and associate it with ordered freight to gain real-time visibility of goods
Enhanced marketing: Generate SEO-optimized copy for landing pages, blogs, and social media posts, generate product images or models without having to use photography

Media and entertainment

Text: Narrative generation from sports statistics and news, script summarization, reading and writing assistance
Images and videos: Render-from-rough storyboarding from sketches, rendering scenes from untextured 3D models
Audio: Music generation, automated dialogue replacement, script reading, localization