Dec 01, 2024, Global – Amazon Bedrock has unveiled two new preview capabilities for evaluating and optimizing generative AI applications: RAG evaluation for Knowledge Bases and LLM-as-a-judge for model assessment. Both promise to simplify testing and shorten the time it takes to move generative AI solutions into production.
RAG evaluation automatically assesses Retrieval Augmented Generation (RAG) applications, with evaluation metrics computed by large language models, so developers can compare configurations and tune applications for specific use cases. LLM-as-a-judge, in turn, uses a model to deliver humanlike evaluations of other models at far lower cost and in far less time than traditional human assessments.
Both capabilities measure quality dimensions such as correctness, helpfulness, and adherence to responsible AI principles, returning natural-language explanations and normalized scores for interpretability. They are available in preview in multiple AWS Regions, with no charge for evaluation jobs beyond standard Amazon Bedrock pricing.
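Programmatically, both evaluation types are exposed through Bedrock's control-plane API. The sketch below shows how an LLM-as-a-judge job might be started with boto3's create_evaluation_job; the role ARN, S3 URIs, model identifiers, and built-in metric names are illustrative assumptions rather than values from the announcement.

```python
import boto3

# Control-plane client ("bedrock", not "bedrock-runtime").
bedrock = boto3.client("bedrock", region_name="us-east-1")

# Minimal sketch of an LLM-as-a-judge evaluation job. The IAM role,
# S3 locations, model IDs, and metric identifiers below are
# illustrative assumptions; check the preview docs for exact values.
response = bedrock.create_evaluation_job(
    jobName="llm-judge-demo",
    roleArn="arn:aws:iam::123456789012:role/BedrockEvalRole",  # hypothetical
    applicationType="ModelEvaluation",
    evaluationConfig={
        "automated": {
            "datasetMetricConfigs": [{
                "taskType": "Generation",
                "dataset": {
                    "name": "my-prompts",
                    "datasetLocation": {"s3Uri": "s3://my-bucket/prompts.jsonl"},
                },
                # Quality dimensions named in the announcement; the exact
                # built-in metric identifiers are assumptions.
                "metricNames": ["Builtin.Correctness", "Builtin.Helpfulness"],
            }],
            # The judge model that assigns scores and writes the
            # natural-language explanations.
            "evaluatorModelConfig": {
                "bedrockEvaluatorModels": [
                    {"modelIdentifier": "anthropic.claude-3-5-sonnet-20240620-v1:0"}
                ]
            },
        }
    },
    # The model whose responses are being judged.
    inferenceConfig={
        "models": [{
            "bedrockModel": {"modelIdentifier": "amazon.titan-text-express-v1"}
        }]
    },
    outputDataConfig={"s3Uri": "s3://my-bucket/eval-results/"},
)
print(response["jobArn"])
```

A RAG evaluation job would follow the same pattern, with the application type set to the RAG variant and the inference configuration pointing at a Knowledge Base rather than a model; per-prompt scores and explanations are written to the configured output location.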
Amazon Bedrock’s evaluation tools are designed to fast-track the deployment of generative AI applications by surfacing clear insights and shortening feedback loops. Developers can access them directly from the Amazon Bedrock console.
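Jobs launched from the console or the API can also be monitored programmatically, which is where the shorter feedback loop plays out in practice. Below is a small sketch assuming the get_evaluation_job operation; the status strings used here are assumptions based on typical Bedrock job lifecycles.

```python
import time
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

def wait_for_evaluation(job_arn: str, poll_seconds: int = 60) -> str:
    """Poll an evaluation job until it reaches a terminal state.

    The status values ("InProgress", "Completed", "Failed") are
    assumptions, not confirmed by the announcement.
    """
    while True:
        status = bedrock.get_evaluation_job(jobIdentifier=job_arn)["status"]
        if status not in ("InProgress", "Stopping"):
            return status
        time.sleep(poll_seconds)

# Hypothetical usage: once the job completes, normalized scores and
# explanations land in the S3 prefix given in outputDataConfig.
# print(wait_for_evaluation("arn:aws:bedrock:us-east-1:123456789012:evaluation-job/abc123"))
```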