Safety Frameworks Workstream
The Frontier Model Forum’s Safety Frameworks workstream aims to advance the development and implementation of robust safety practices for frontier AI models and systems. As AI capabilities continue to advance, comprehensive safety frameworks and evaluations will become increasingly crucial for assessing and addressing potential risks before and during AI development. Understanding how to create, implement, and effectively evaluate safety measures is a critical challenge for the field.
The Safety Frameworks workstream was established to address this challenge. It aims to develop and refine the concept of safety frameworks, establish best practices for their development, and provide guidance for their implementation. This work includes creating standardized evaluation protocols and reporting requirements, developing metrics for safety assessments, and establishing methodologies for testing the effectiveness of safety measures. Effectively managing AI safety will require both robust frameworks and systematic approaches to evaluating their implementation and impact.
The Safety Frameworks workstream draws heavily on the expertise of the FMF’s member firms and external experts. The FMF actively seeks to engage with organizations that have experience developing and implementing safety frameworks, as well as researchers and policymakers working on AI safety. By leveraging collective knowledge across academia, industry, and government, the Safety Frameworks workstream remains grounded in practical experience and emerging science across diverse sectors.
The FMF is committed to sharing its findings through various publications, which may include briefs on the core components of safety frameworks, the tradeoffs associated with different capability thresholds, the timing and frequency of capability assessments, or specific governance and transparency mechanisms. These materials will be published following careful review to ensure they contribute constructively to the development of AI safety practices.
Recent Publications
Preliminary Taxonomy of Pre-Deployment Frontier AI Safety Evaluations. December 20, 2024.
Components of Safety Frameworks. November 8, 2024.
Early Best Practices for Frontier AI Safety Evaluations. July 31, 2024.
What is Red Teaming? October 24, 2023.