The following page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features or functionality remain at the sole discretion of GitLab Inc.
Enabling Excellence in AI Model Evaluation: Illuminating the Potential of Generative AI Models
Our vision is to establish a comprehensive, cutting-edge framework that empowers users to evaluate and unlock the full potential of GenAI models, including offerings from Google, Anthropic, the OSS community, and beyond. By providing a unified platform with advanced evaluation tools, actionable insights, and ethical considerations, we aim to foster trust, transparency, and continual improvement in AI technologies.
In 2023, the AI Model Validation group started laying the foundation for evaluating large language models to support the ML/AI-powered features being built in GitLab. We started out of a need to test Code Suggestions, but quickly expanded to other use cases, such as multi-turn conversations (Chat) and specialized data evaluations like vulnerability data.
Throughout 2024, we've focused on building a robust system for evaluating the AI models that power GitLab Duo. Our Central Evaluation Framework (CEF) runs large-scale evaluations on a regular cadence so we can make informed decisions about which AI models to use. Learn about how we govern our AI model decisions and take a deep dive into how the CEF works.
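To give a sense of the general shape of this kind of evaluation run, here is a minimal sketch in Python. It is an illustration of the pattern, not GitLab's actual CEF implementation: the dataset, the model names, the `generate` callables, and the text-similarity metric are all assumptions made for the example. The core idea is that every candidate model answers the same prompt set, each answer is scored against a reference, and per-model scores are aggregated so the candidates are directly comparable.

```python
# Illustrative sketch only -- not GitLab's CEF. It shows the generic
# pattern of a model evaluation harness: run every candidate model over
# a shared prompt dataset, score each output against a reference answer,
# and aggregate scores per model to compare candidates.
from difflib import SequenceMatcher
from typing import Callable

# Hypothetical dataset of (prompt, reference answer) pairs.
DATASET = [
    ("Write a Python function that reverses a string.",
     "def reverse(s):\n    return s[::-1]"),
    ("Write a Python function that squares a number.",
     "def square(x):\n    return x * x"),
]


def similarity(candidate: str, reference: str) -> float:
    """Cheap text-similarity proxy. A real harness would use
    task-specific metrics (exact match, unit tests, human or
    LLM-as-judge ratings, etc.)."""
    return SequenceMatcher(None, candidate, reference).ratio()


def evaluate(models: dict[str, Callable[[str], str]]) -> dict[str, float]:
    """Return each model's mean score across the shared dataset."""
    results = {}
    for name, generate in models.items():
        scores = [similarity(generate(prompt), ref) for prompt, ref in DATASET]
        results[name] = sum(scores) / len(scores)
    return results


if __name__ == "__main__":
    # Stub "models" standing in for real API-backed generators.
    models = {
        "model-a": lambda p: "def reverse(s):\n    return s[::-1]",
        "model-b": lambda p: "def square(x):\n    return x * x",
    }
    for name, score in sorted(evaluate(models).items(), key=lambda kv: -kv[1]):
        print(f"{name}: mean score {score:.2f}")
```

In practice, a framework like this would maintain separate datasets and metrics per use case (for example, code completion, multi-turn Chat, vulnerability data), with the shared-prompt-set structure keeping scores comparable across models.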
Last Reviewed: 2024-10-05
Last Updated: 2024-10-05