December 2024

AI Governance – Step 4: Monitor and validate AI models continuously

Björn Preuß
Chief Data Scientist

Do you know if your AI models are on track?

Continuous monitoring and validation of AI models are crucial to ensure they perform as expected and remain compliant with regulatory standards [7].

Preventing model drift and degradation

Many regulatory frameworks, such as SR 11-7, stress the importance of ongoing validation and performance monitoring, which help identify issues such as model drift or performance degradation over time [2]. The GRACE AI Platform supports this process by tracking experimentation and testing across the entire AI system and model development lifecycle, from initial development through deployment and beyond, including subsequent retraining, recalibration, and redeployment efforts [12].
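
To make this concrete, the sketch below shows the kind of lineage record such lifecycle tracking produces: each model version is tied to a timestamp, a fingerprint of its training data, and its validation metrics. It uses only the Python standard library as an illustrative stand-in; the model name, version, and metric values are hypothetical, and GRACE's own interfaces are not shown here.

    import hashlib
    import json
    from datetime import datetime, timezone

    def dataset_fingerprint(data: bytes) -> str:
        # Hash the raw training data so each model version is tied to exact inputs.
        return hashlib.sha256(data).hexdigest()

    def log_model_version(registry, model_name, version, data, metrics):
        # Append one lineage record: which model, trained when, on which data,
        # with which validation metrics.
        registry.append({
            "model": model_name,
            "version": version,
            "trained_at": datetime.now(timezone.utc).isoformat(),
            "data_sha256": dataset_fingerprint(data),
            "metrics": metrics,
        })

    registry = []
    log_model_version(registry, "claims-triage", "1.3.0",
                      b"...training data bytes...", {"auc": 0.91})
    print(json.dumps(registry, indent=2))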

Enhanced AI governance with GRACE’s tracking and monitoring capabilities

GRACE’s tracking functionality links the datasets used in these processes, so users can trace the configuration of each successive version of a deployed system back to the experiments and versioned datasets that contributed to it. The platform also records both standard and custom metrics, providing flexibility in performance evaluation, and it can generate model-explainability metrics through methods such as feature importance analysis.
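
Feature importance analysis can be illustrated with a short, self-contained example. The sketch below uses scikit-learn's permutation importance on synthetic data as a generic stand-in for the explainability metrics described above; it does not represent GRACE's API.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for a governed model's validation data.
    X, y = make_classification(n_samples=1000, n_features=6, n_informative=3,
                               random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

    # Shuffle each feature in turn and measure the score drop: a model-agnostic
    # importance estimate that can be logged as an explainability metric.
    result = permutation_importance(model, X_val, y_val, n_repeats=10,
                                    random_state=0)
    for i, importance in enumerate(result.importances_mean):
        print(f"feature_{i}: {importance:.3f}")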

Beyond monitoring performance, GRACE evaluates datasets for quality, suitability, and representativeness. It monitors data continuously to detect data drift or concept drift, ensuring models remain reliable and relevant in changing environments.
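
A common way to detect the data drift mentioned above is a two-sample statistical test per feature, comparing live data against a training-time snapshot. The sketch below uses a Kolmogorov-Smirnov test from SciPy; the feature names and the drifted distribution are synthetic assumptions for illustration.

    import numpy as np
    from scipy.stats import ks_2samp

    def drift_alerts(reference, live, feature_names, alpha=0.01):
        # Two-sample Kolmogorov-Smirnov test per feature; a small p-value
        # signals that the live distribution has shifted from the reference.
        alerts = []
        for j, name in enumerate(feature_names):
            statistic, p_value = ks_2samp(reference[:, j], live[:, j])
            if p_value < alpha:
                alerts.append((name, round(statistic, 3), p_value))
        return alerts

    rng = np.random.default_rng(0)
    reference = rng.normal(size=(5000, 2))                    # training-time snapshot
    live = np.column_stack([rng.normal(size=5000),            # stable feature
                            rng.normal(loc=0.4, size=5000)])  # drifted feature
    print(drift_alerts(reference, live, ["age", "claim_amount"]))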

Key actions:
  • Implement real-time monitoring tools: Use GRACE to track AI system performance, identify anomalies, and flag potential issues. This helps detect biases, errors, or drift in model performance.
  • Validate models periodically: Revalidate AI models regularly against updated data to confirm they continue to perform accurately and fairly, including testing under varied real-world conditions to evaluate robustness.
  • Enforce technical controls: Use GRACE to follow up on the fulfillment of your technical controls and verify that each relevant measure stays within its threshold; a minimal sketch of such a threshold check follows this list.
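
As a minimal sketch of the threshold check referenced in the last action, the following compares monitored metrics against configured control limits. The control names, thresholds, and metric values are hypothetical; in practice these would come from the controls configured in GRACE.

    from dataclasses import dataclass

    @dataclass
    class Control:
        metric: str
        threshold: float
        higher_is_better: bool = True

    def evaluate_controls(metrics, controls):
        # Compare each monitored metric against its control threshold and
        # report a pass/fail status per control.
        status = {}
        for c in controls:
            value = metrics[c.metric]
            passed = value >= c.threshold if c.higher_is_better else value <= c.threshold
            status[c.metric] = ("PASS" if passed else "FAIL", value, c.threshold)
        return status

    controls = [Control("auc", 0.85),
                Control("calibration_error", 0.05, higher_is_better=False)]
    latest = {"auc": 0.88, "calibration_error": 0.07}
    for metric, (state, value, limit) in evaluate_controls(latest, controls).items():
        print(f"{metric}: {state} (value={value}, limit={limit})")
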
In practice

When managing multiple AI systems for insurance, including various LLMs, a complex network of applications emerges, with individual models often deployed across several systems simultaneously. It is essential to maintain a control-tower perspective: knowing at any moment which models and systems are compliant and which have deviated from their established performance standards. GRACE supports this oversight in both the pre- and post-production phases, providing a comprehensive view of each model’s status over time. For the LLMs, you can implement live guardrails and controls, along with robust logging, enabling real-time interventions that guide model responses and keep them compliant within the insurance framework.
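
A simple guardrail of this kind can be sketched as a wrapper around any LLM call: every exchange is logged, and non-compliant output is intercepted before it reaches the user. The blocked patterns, the stub generator, and the fallback message below are illustrative assumptions, not GRACE's actual policy engine.

    import logging
    import re

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("llm-guardrail")

    # Hypothetical patterns an insurer might block in model output.
    BLOCKED = [re.compile(r"\bguaranteed payout\b", re.I),
               re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]  # SSN-like strings

    def guarded_response(prompt, generate):
        # Wrap any LLM call: log the full exchange, then intercept output
        # that matches a blocked pattern before it reaches the user.
        raw = generate(prompt)
        log.info("prompt=%r response=%r", prompt, raw)
        for pattern in BLOCKED:
            if pattern.search(raw):
                log.warning("guardrail triggered: %s", pattern.pattern)
                return "I can't provide that information. Please contact an advisor."
        return raw

    # Stub generator so the sketch runs without a model endpoint.
    print(guarded_response("Is my claim covered?",
                           lambda p: "Your policy offers a guaranteed payout."))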

References
  • [2] Federal Reserve Board, “SR 11-7: Guidance on Model Risk Management,” Apr. 2011.
  • [7] Wirtz, Bernd W., et al. “Governance of Artificial Intelligence: A Risk and Guideline-Based Integrative Framework.” Government Information Quarterly, vol. 39, no. 4, Oct. 2022, art. 101685, doi:10.1016/j.giq.2022.101685.
  • [12] Minkkinen, Matti, et al. “Continuous Auditing of Artificial Intelligence: A Conceptualization and Assessment of Tools and Frameworks.” Digital Society, vol. 1, no. 3, Oct. 2022, doi:10.1007/s44206-022-00022-2.
