Goodbye Goodhart: Escaping metrics and policy gaming in Responsible AI

As Responsible AI (RAI) programs mature, they often introduce metrics to track the organization’s RAI posture. To motivate adoption by staff, incentives are often introduced, or existing ones altered, so that teams are aligned with the program’s goals and make steady progress towards them. Yet any such approach is susceptible to a common pitfall: Goodhart’s Law, i.e., “When a measure becomes a target, it ceases to be a good measure.”

 

Key takeaways

  • Measure with caution: Metrics can drive behaviors that align with measured outcomes but may ignore broader goals.

  • Avoid gaming: Metrics can be gamed, leading to undesirable behaviors.

  • Balanced metrics: Use a mix of quantitative and qualitative metrics to cover different dimensions of AI ethics.

  • Dynamic adjustments: Regularly review and adjust metrics to prevent metric manipulation and ensure they remain aligned with ethical goals.

  • Incentive structures: Align incentives with long-term ethical goals, not just short-term achievements.

 

Responsible AI programs must carefully design metrics and incentives to avoid the pitfalls highlighted by Goodhart's Law. By using a balanced approach and regularly revising metrics, these programs can ensure that their measures genuinely reflect and promote ethical AI practices. So, let’s dive into each of these and see how you can implement them at your organization.


1. Measure with caution

Design influences behavior. The concept of "what you measure is what you get" is akin to the idea of incentive structures in behavioral economics, where the design of incentives directly influences behavior. Measurement systems influence organizational culture and employee behavior. Theories like McGregor's Theory X and Theory Y highlight how managerial assumptions shape employee behavior.

Metrics add rigor to program implementation. Metrics are useful because they provide a quantitative basis for tracking the effectiveness of the RAI program. Measurement systems should reflect organizational values and goals as a coherent whole; disjointed metrics can drive the narrow, measurement-focused staff behavior described above. This requires understanding and integrating different ethical dimensions, which can be complex, but it is worth the effort when the outcomes are much closer to what we want the RAI program to achieve.

Don’t go it alone! Adopting a collaborative approach involving stakeholders from various departments (e.g., ethics, compliance, operations) can help identify and integrate diverse metrics that capture all relevant ethical dimensions, thus leading to a higher rate of success.


2. Avoid gaming

Both humans and machines love to game. Systems can be strategically manipulated to produce outcomes that favor those doing the gaming. This tendency is perhaps universal: even AI systems exhibit it, in what is called specification gaming. To counter it, we can lean on concepts like Total Quality Management (TQM), which emphasizes process integrity over merely meeting targets.

Discipline and vigilance are our friends. Thinking explicitly about how the metrics we want to put in place might be gamed reduces the risk of ethical breaches and unintended consequences. Staying ahead of gaming tactics, however, requires continuous monitoring and, at times, redesigning the metrics. Robust monitoring that combines automated tools with periodic audits, together with metrics that can be adapted, helps head off some of the challenges that metrics pose to an RAI program implementation.
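One way to picture an automated check of this kind is a minimal sketch that pairs an incentivized metric with a counter-metric and flags any period where the first jumps while the second drops. Everything here is an assumption made for illustration: the `Period` fields, the function name, the thresholds, and the sample numbers (for instance, `target` could be the share of projects with a completed ethics checklist and `guardrail` the share of sampled checklists that pass an independent audit). A flag is a prompt for a human review, not proof of gaming.

```python
from dataclasses import dataclass


@dataclass
class Period:
    label: str
    target: float     # hypothetical incentivized metric, higher is better
    guardrail: float  # hypothetical paired counter-metric, higher is better


def flag_possible_gaming(periods, min_gain=0.05, min_drop=0.05):
    """Return labels of periods where the target rose by at least `min_gain`
    while the guardrail fell by at least `min_drop` versus the prior period."""
    flagged = []
    for prev, cur in zip(periods, periods[1:]):
        gain = cur.target - prev.target
        drop = prev.guardrail - cur.guardrail
        if gain >= min_gain and drop >= min_drop:
            flagged.append(cur.label)
    return flagged


# Illustrative, made-up history for one team.
history = [
    Period("Q1", target=0.60, guardrail=0.80),
    Period("Q2", target=0.75, guardrail=0.78),
    Period("Q3", target=0.95, guardrail=0.55),
]

print(flag_possible_gaming(history))
```

In this made-up history only the last quarter is flagged: the target keeps climbing, but the guardrail falls sharply at the same time, which is the pattern a periodic audit would want to look at more closely.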


3. Balanced metrics

Holistic assessments and protection against gaming: Using a mixture of quantitative and qualitative metrics gives a comprehensive view of how the organization is performing on the RAI program implementation. It emphasizes the interconnectivity of different organizational elements and the need for holistic assessment. It also serves as a bulwark against Goodhart’s Law: it is hard to simultaneously game multiple metrics that act as checks and balances against each other. A balanced set of metrics has the added benefit of capturing multiple dimensions of ethical impact.

Balancing is a tough act: However, this comes at a cost. Balancing diverse metrics can be complex and resource-intensive, and it requires deep domain expertise to get right. The biggest challenge is developing and integrating multiple, sometimes conflicting, metrics. From applied experience, one of the lowest-hanging fruits is to use cross-functional teams to design the metrics and to implement continuous feedback loops so they can be adjusted as discoveries are made about what works and what doesn’t.
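To make the checks-and-balances idea concrete, here is a small sketch of a scorecard that reports the weakest dimension alongside the average, so that a single inflated metric cannot hide a neglected one. The metric names, the scores, and the choice to surface the minimum are assumptions for illustration, not a prescribed method; qualitative inputs (e.g., a stakeholder review) are assumed to have been mapped to a 0 to 1 scale by the cross-functional team.

```python
def scorecard(metrics):
    """Summarize a dict of metric name -> score in [0, 1].

    Returns the plain average plus the weakest metric and its score, so a
    high average cannot mask one neglected dimension.
    """
    if not metrics:
        raise ValueError("at least one metric is required")
    average = sum(metrics.values()) / len(metrics)
    weakest = min(metrics, key=metrics.get)
    return {"average": average, "weakest": weakest, "weakest_score": metrics[weakest]}


# Hypothetical mix of quantitative and qualitative (rubric-scored) metrics.
team_metrics = {
    "checklist_completion": 0.98,  # quantitative
    "audit_pass_rate": 0.55,       # quantitative
    "stakeholder_review": 0.60,    # qualitative, scored on a rubric
}

summary = scorecard(team_metrics)
print(summary)
```

Here the average looks respectable, yet the summary still points at the audit pass rate as the dimension that needs attention.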


4. Dynamic adjustments

Change is the only constant: Metrics need to evolve as the organization's context and maturity change over time. Adopting a meta approach, i.e., tracking how well the metrics themselves work as measurements, will help you decide when they need to be changed. This doesn’t happen on its own: keeping metrics relevant and aligned with evolving ethical standards and societal expectations requires a robust mechanism for regular review and update. That mechanism can be resource-intensive, i.e., you need adequate resources and authority to act on the changes it calls for.
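One possible form of this meta approach, sketched under stated assumptions: periodically compare the metric (the proxy) with an independently assessed outcome it is meant to represent, and trigger a review when the two stop moving together. The function names, the four-period window, the 0.5 threshold, and the data below are all hypothetical; a Pearson correlation is just one simple choice of signal.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def needs_review(proxy, outcome, window=4, threshold=0.5):
    """True when, over the last `window` periods, the proxy metric no longer
    tracks the outcome it is supposed to represent."""
    if len(proxy) != len(outcome):
        raise ValueError("proxy and outcome must cover the same periods")
    if len(proxy) < window:
        return False  # not enough history to judge
    return pearson(proxy[-window:], outcome[-window:]) < threshold


# Made-up series: the proxy keeps rising while the audited outcome stalls, then falls.
proxy = [0.50, 0.60, 0.70, 0.80, 0.85, 0.90, 0.95, 0.99]
outcome = [0.40, 0.50, 0.60, 0.70, 0.70, 0.68, 0.66, 0.60]

print(needs_review(proxy[:4], outcome[:4]))  # early periods
print(needs_review(proxy, outcome))          # most recent periods
```

In the early periods the two series move together, so no review is triggered; in the most recent window they diverge, which is the cue to revisit the metric.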

Creating a culture of constant change: If you work in an organization that is loath to welcome change and takes ages to pass things through committees before acting on them, as an RAI practitioner, it is incumbent upon you to establish a culture of continuous improvement and responsiveness to change. Not only will this make the RAI program more successful, but it will also improve the organization’s overall health as staff and decision-makers become comfortable with frequent change. This will require you to embed regular review processes in the organizational routine and leverage technology for real-time monitoring and adjustments.


5. Incentive structures

Shaping staff’s motivation: We can borrow from the disciplines of human resource management and corporate governance, which have dealt with issues of incentive alignment for a long time. We can think about both intrinsic and extrinsic motivation as drivers of staff behavior, paying particular attention to the levers that each offers for bringing about the change we want, in alignment with stakeholder and societal interests.

Plugging into existing structures: What we ultimately want to achieve here is a long-term commitment to ethical principles that are in firm alignment with broader organizational values. Incentive design also has a temporal aspect: it needs to account for both short- and long-term implications, which can be challenging. A strategy that works well in the early stages is to integrate ethical impact assessments into performance reviews and other existing reward systems, giving yourself a jumpstart by leaning on the familiarity that staff have with these processes.


Now that we know the problems that poorly designed metrics and incentives can cause for the efficacy of an RAI program, here are some implementation steps to operationalize the discussion above:

  • Stakeholder Engagement: Involve diverse stakeholders in defining and reviewing metrics and incentives.

  • Transparency: Make metric criteria, incentives, and changes to both transparent to foster trust and accountability.

  • Education: Train staff on the importance of balanced metrics, the risks of Goodhart's Law, and the ways that various incentives influence their own behavior.

Abhishek Gupta

Founder and Principal Researcher, Montreal AI Ethics Institute

Director, Responsible AI, Boston Consulting Group (BCG)

Fellow, Augmented Collective Intelligence, BCG Henderson Institute

Chair, Standards Working Group, Green Software Foundation

Author, AI Ethics Brief and State of AI Ethics Report

https://www.linkedin.com/in/abhishekguptamcgill/