Generative AI on AWS Essentials
- Archishman Bandyopadhyay
- Jul 8, 2024
- 10 min read
AWS offers the following generative AI services for customers:
Amazon Q: A generative AI–powered assistant designed for work that can be tailored to meet business needs.
Amazon Bedrock: The easiest way to build and scale generative AI applications with FMs
Amazon Elastic Compute Cloud instances powered by AWS Inferentia and AWS Trainium: The best price-performance infrastructure for training and inference in the cloud
Amazon SageMaker Access and fine-tuning of a wide selection of FMs with Amazon SageMaker
Generative AI stack
Top layer
The top layer provides generative AI applications that customers can use to work with FMs without specialized knowledge or coding.
Middle layer
The middle layer provides tools, like Amazon Bedrock, that can be used to customize and train FMs for customers seeking to develop generative AI applications. This tier also assists customers in evaluating, selecting, and consuming these models through various distribution channels.
Bottom layer
The bottom layer provides solutions to customers looking to optimize the FM training and inference costs.
AWS' Generative AI Services
1. Amazon Q
Amazon Q is a generative AI-powered assistant that is designed specifically for work and can be tailored to a customer’s business. Customers can get fast, relevant answers to pressing questions, generate content, and take actions – all informed by their information repositories, code, and enterprise systems. Amazon Q provides tailored information to employees to streamline common business tasks, quickly build applications on AWS, reduce time to business insights, provide better customer service, and plan and manage supply chain inventory more efficiently.
A. Amazon Q Business is a fully managed, generative AI-powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. Amazon Q provides immediate and relevant information to employees. It also streamlines tasks and accelerates problem solving.
B. Amazon Q Developer can explain and update specific lines of code in your integrated development environment (IDE). To get updated code, send your code to Amazon Q. It generates new code that reflects the changes that you asked it to make. Then, you can insert the updated code directly into the file where the code originated.
With Amazon Q Developer, one can choose from the following options:
Explain – Get your code explained in natural language.
Refactor – Improve code readability or efficiency, among other improvements.
Fix – Debug code.
Optimize – Enhance code performance.
Send to prompt – Send the highlighted code to the Amazon Q chat panel, and ask questions that you have about the code.
C. Amazon Q brings its advanced generative AI technology to Amazon QuickSight, the AWS unified business intelligence (BI) service built for the cloud. With Amazon Q in QuickSight, customers get a generative BI assistant that allows business analysts to use natural language to build BI dashboards in minutes and easily create visualizations and complex calculations.
D. In Amazon Connect, the contact center service, Amazon Q helps customer service agents provide better customer service. Amazon Q in Connect uses the real-time conversation with the customer along with relevant company content to automatically recommend what to say or what actions an agent should take to better assist customers.
2. Amazon Bedrock
Amazon Bedrock is a fully managed service that makes FMs from leading AI startups and Amazon available through an API. Customers can choose from a wide range of foundation models to find the model that is best suited for their use case.
Amazon Bedrock is the easiest way to build and scale generative AI applications with FMs.
Amazon Bedrock supports the following foundation models :
Guardails for Amazon Bedrock–
Implement safeguards customized to your application requirements and responsible AI policies.
Apply guardrails to multiple FMs and Agents for Amazon Bedrock to bring a consistent level of AI safety across all your applications.
Block undesirable topics in your generative AI applications for a relevant and safe user experience.
Filter harmful content based on your responsible AI policies.
Redact sensitive information, such as PIIs, to protect privacy.
Block inappropriate content with a custom word filter.
AWS Inferentia and AWS Trainium EC2 instances :
These are purpose-built ML accelerators that AWS designed from the ground up.
The first-generation of AWS Inferentia delivers significant performance and cost-savings benefits for deploying smaller models. AWS Trainium and AWS Inferentia2 were built for training and deploying ultra-large generative AI models with hundreds of billions of parameters.
3. Amazon SageMaker
Amazon SageMaker is a fully managed service provided by Amazon Web Services (AWS) that enables developers and data scientists to build, train, and deploy machine learning models at scale.
Benefits:
Offers a wide selection of FMs and access to the latest publicly available FMs for faster time to market
Provides ML practitioners the capabilities to build, train, and deploy LLMs and foundation models from scratch.
A secure environment to customize the models with user data
Broad and deep capabilities for evaluation, experimenting and industrializing FMs to ensure they meet user needs
A secure and reliable fully managed infrastructure with scalability and high-performance.
Amazon SageMaker JumpStart is the machine learning hub of Amazon SageMaker that helps customers to discover built-in content to develop their next machine learning models.
Key Customer Personas
A. Business decision-makers
Sample job titles: Business executives; VP, head, or director of marketing; finance; operations; customer service; supply chain; logistics; and legal.
Responsibility: Business decision-makers have the authority to find and fund technology solutions that can help achieve their business outcomes. They are motivated by improving company performance, creating a competitive advantage with innovative technology, and potential AI/ML business results.
Business decision-makers play a key role in the tech purchasing and decision-making process for their division. They stay informed through their peer network and their employees. They also read industry analyst reports and browse vendor websites.
Discovery questions for C-suite business decision-makers
Long-Term Vision and Short-Term Goals for AI/ML
What is your vision for AI/ML and your company?
What are your business goals for AI for this year?
Goals, Expectations, and Challenges for Generative AI/ML
What customer experiences or use cases do you expect to transform with generative AI?
How might you improve customer retention with personalized user experiences?
Priorities in Building a Generative AI Application
Rank the importance of the following for choosing a foundation model:
Performance
Latency
Cost
Accuracy
Track Record with Previous AI/ML Projects
Have you achieved your business outcomes with previous AI/ML projects?
What successes or challenges have you had with technology partners in AI/ML?
What business operations would you automate first with AI capable of generating novel content?
Discovery Questions for C-Suite CMO and Marketing
How can you accelerate content production?
How can you increase reach with personalized ads?
How can you quickly create localized content?
B. Technical decision-makers
Sample job titles: Chief cloud architect, head or director of data science, analytics, AI and ML, IT executives (excluding c-suite) and professionals
Responsibilities: Technical decision-makers usually seek a deeper understanding of the technology. They want to see product details, use cases, and clear explanations on what AWS can do for generative AI in relation to their individual goals. Their top priorities are increasing innovation, delivering IT projects quickly, aligning IT performance metrics to business outcomes, and cutting overall IT costs.
Technical decision-makers often guide technology decisions as the final decision-maker or by contributing recommendations. So they look for information that they can trust.
Discovery questions for C-suite business decision-makers (data scientists, managers, developers, and VPs of engineering)
Goals and Use Cases for Generative AI
What customer experiences or use cases do you expect to transform with AI?
How long do you have to wait for media to be created to finish building new products and user experiences?
In what aspects of your supply chain are you experimenting with generative design technology?
How might using AI to generate media-rich content impact your digital assets strategy?
What kinds of experiences have you always dreamed of providing to your customers, but lacked the original media to create efficiently?
Challenges and Blockers with AI
What challenges has your company faced in launching AI/ML-related products and business processes? Where do things get blocked?
Do you have in-house development teams or outsource your development?
How do you measure developer productivity?
Do you have organizational goals around developer productivity or cost reduction?
Have your software engineering teams experimented with AI-generated code to accelerate rudimentary software engineering tasks?
Plans for Models and Customization
Have you decided on which model you are using? Or do you plan to customize foundation models (FMs) for your use case or industry?
Do you plan to customize FMs for your particular use case or industry?
Would you prefer to maintain control over your hosting instances?
Do you know how to evaluate the right model for the use case you are trying to solve with Generative AI?
Examples of Generative AI Models:
Amazon Titan (Titan Text and Titan Embeddings) from Amazon
Jurassic-2 from AI21 Labs
Claude from Anthropic
Stable Diffusion from Stability AI
Mistral from Mistral AI
Llama from Meta
Working with Partners for AI/ML
What achievements or difficulties have you encountered when seeking assistance from an AWS partner to organize and develop your AI/ML capabilities?
C. Builders
Sample job titles: Developer, data scientist, ML developer, ML engineer, BI engineer, director of development productivity, VP of engineering, software development managers
Responsibilities: Builders are the ones implementing generative AI applications. They want to use foundation models to create generative AI applications aligned with business objectives. They are also searching for ways to enhance productivity of their teams to accelerate time to market for their applications.
Discovery questions for Builders
Data Scientists and ML Practitioners
Current ML Approach and Challenges:
What ML platform are you currently using to evaluate and deploy custom ML models?
What are some of the challenges that you face in the model lifecycle?
Deploying Foundation Models and Barriers:
Are you able to quickly and easily deploy foundation models today?
What are the barriers?
Partnering to Deliver Generative AI Capabilities:
How are you partnering with your product and business stakeholders to provide them with generative AI capabilities quickly, securely, and cost-effectively?
Look for the Following:
Data volume on or coming to AWS
Creating novel content (e.g., images for ads, product descriptions, document summarizations, contextual chatbots)
Experience using Amazon SageMaker
Deployment of models on Amazon EC2 directly or from Hugging Face
Desire to stay in AWS or customize a model for their domain (e.g., FinServ, HCLS)
Developers
Experience Level and Time Spent:
What is your team’s current level of experience with coding and programming?
How much time do you typically spend on coding tasks?
Planned Use Cases and Tasks to Automate:
What type of application or project are you planning to use the code generator for?
What specific coding tasks do you think a code generator would be most helpful for, and why?
What are some examples of tasks or functions that you would like the code generator to automate for you?
Desired Features and Capabilities:
How important is customizability to you, and how much flexibility do you need in terms of generating code that meets your specific needs?
What kind of support or training do you think you would need to use the code generator effectively?
What resources would be most helpful to you?
Are there any particular features or functionalities that you would like the code generator to have (e.g., version control integration, testing or debugging support, collaboration features)?
Top of mind across key personas
Security and compliance: Customers place high priority on data security protections, regulatory compliance, and governance policies. Using secure infrastructure and following best practices helps mitigate risks.
Privacy: Customers want assurances that models do not disclose private data. Controls should prevent unauthorized data usage during model development.
Risk management: Customers are concerned about potential misuse when deploying generative AI applications. Proactive threat modeling and defense-in-depth security controls can help manage risks.
Balancing innovation with security: Customers want to experiment with new technologies but also manage risks and maintain security. Collaborative governance, review processes, and security guidance can enable innovation while upholding standards.
Content generation: Customers want assurances that AI-generated content aligns with brand messaging without hallucinations. Controls to prevent copyright infringement are also important.
Generating insights: Data analysts or marketing teams can use AI to analyze data and identify high-potential leads for campaigns to improve business outcomes.
Customer Use Cases
By Business Objectives
Enhancing the customer experience
Improving customer engagement with seamless omnichannel access to virtual agents across different collaboration tools and channels
Empowering customers to quickly find answers and complete transactions on their own by implementing conversational voice and text-based chatbots
Discovering business insights out of real-time or recorded conversations to detect emerging trends
Boosting employee productivity
Improving productivity of internal teams that keep systems running
Saving time and improving accuracy as the system automatically trains on each customer's information
Creating faster time to market
Reducing cost of serviceImproving business operations
Improving employee productivity
Increasing operational efficiencies
Enhancing customer experiences
By Industry
Healthcare
Medical research: Patient-to-trial matching, multi-modal data analysis
Clinical efficiency: Longitudinal patient records for full patient picture, automate medical image interpretation
Operational efficiency: Auto-generate referral letters, clinical coding, and prior authorization
Patient experience: Patient outcome prediction, personalized patient discharge instructions and treatment plans
Digital health: Patient care concierge, remote care management
Life sciences
Research and discovery: Protein folding, protein design
Clinical development: Optimizing trial protocols, patient cohorts, and sites
Manufacturing: Predictive maintenance, resource optimization
Commercial and medical affairs: Patient outcome prediction, content generation
Patient support: Patient care concierge, patient-to-trial matching
Financial services
AI-managed portfolios: Create highly tailored investment strategies and portfolios aligned to specific financial goals and risk profiles
Increase the business value of unstructured content: Create on-demand structured data products from large unstructured data sources such as emails, document repositories, and filings
Drive product innovation and automate business processes: Develop new tools, such as stock screening using natural language search, for end-users
Intelligent advisory: Automatically translate complex questions from internal users and external customers into their semantic meaning, analyze for context, and then generate highly accurate and conversational responses
Transform financial documentation: Quickly draft investment research, loan documentation, insurance policies, regulatory communications, requests for information (RFI), and business correspondence
Manufacturing
Operational efficiency: Text generation for contracts and SOPs, customer service and agent assistants, research and summarization for supply-chain optimization
Reduce time and cost of production: Agents and search for plant maintenance, operations, and research
Product design optimization: Generate and enhance new product design, market and customer research to support market development
Real-time equipment diagnostics: Ingest historical data and diagnose equipment failures in real time to recommend maintenance actions
Training content generation: Generative conversational agents can be trained on product manuals, troubleshooting guides, and maintenance notes to deliver swift technical support to workers, reducing downtimes
Retail
Better customer experience: Provide shoppers with a more natural, personalized experience at scale, use natural language to narrow products down to what the customer is specifically looking for
Data insights: Consume large amounts of data like sales, returns, or product reviews to summarize trends
Optimized operations: Make better merchandising decisions, automate the generation of product categories and decisions, track vessels through scraping public vessel or freight locations and associate it with ordered freight to gain real-time visibility of goods
Enhanced marketing: Generate SEO-optimized copy for landing pages, blogs, and social media posts, generate product images or models without having to use photography
Media and entertainment
Text: Narrative generation from sports statistics and news, script summarization, reading and writing assistance
Images and videos: Render-from-rough storyboarding from sketches, rendering scenes from untextured 3D models
Audio: Music generation, automated dialogue replacement, script reading, localization
Comments