Be careful with ChatGPT


Existing Responsible AI approaches leave unmitigated risks in ChatGPT and other Generative AI systems. We need to evolve these approaches and refine our thinking.

Ethical challenges are only being exacerbated as experimentation increases. We are unearthing issues like the generation of very convincing scientific misinformation, biased images and avatars, hate speech, and more. How we embed these systems within human organizations, and how we empower ourselves to take action, will be critical in determining whether we get ethical, safe, and inclusive uses out of them.

Let’s dive deeper into these areas and highlight why we must act now.


Privacy

LLMs KNOW YOU: Systems like GPT-3 are known to hold private information about individuals, absorbed from the large internet-scale datasets they are trained on (such as The Pile), and that information can be elicited by providing the right prompts.

WHY IT MATTERS: Generative AI systems wrapped in easy-to-use interfaces like DALL-E 2 and ChatGPT make it likely that personally identifiable information (PII) will inadvertently surface as part of their outputs. This stands in contrast to search, where the information might still be accessible, but the user at least has to click through to the webpage listed in the results.

GO DEEPER: In Extracting Training Data from Large Language Models, Carlini et al. show that larger models memorize more of their training data and are therefore more susceptible to privacy intrusions. The current trend in Generative AI systems is firmly in the direction of larger models.
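
To make the extraction risk concrete, here is a minimal sketch in the spirit of that work. It assumes the Hugging Face transformers library, uses GPT-2 as a small stand-in model, and the prompt and perplexity scoring are illustrative only; real probes use far larger models and more careful methodology.

```python
# Minimal sketch (illustrative only): probe an open model for memorized text
# by prompting with a prefix and flagging unusually low-perplexity continuations.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model_name = "gpt2"  # small stand-in for a much larger model
tokenizer = GPT2TokenizerFast.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()

prompt = "Contact John Doe at"  # hypothetical prefix that might elicit PII

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    generated = model.generate(
        **inputs,
        do_sample=True,
        top_k=40,
        max_new_tokens=30,
        num_return_sequences=5,
        pad_token_id=tokenizer.eos_token_id,
    )

for seq in generated:
    text = tokenizer.decode(seq, skip_special_tokens=True)
    # Low perplexity on a continuation is a crude signal that the model has
    # seen (and possibly memorized) similar text during training.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    print(f"perplexity={torch.exp(loss).item():8.1f}  {text!r}")
```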


Security and Reliability

INSECURE AI: These systems are often vulnerable to threats such as (see the sketch after this list for an example of the last):

  1. model theft (recreating the model by observing query responses), 

  2. model inversion (recovering the underlying training dataset for a model through querying), 

  3. data poisoning (injecting tailored samples into training data to trigger the model to respond incorrectly during inference), and 

  4. adversarial attacks (manipulation of an AI system to cause unexpected behavior). 
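
As a concrete illustration of the fourth threat, the classic fast gradient sign method (FGSM) perturbs an input just enough to change a model's prediction. This is a minimal sketch against an off-the-shelf image classifier with a placeholder input; the attack surface for Generative AI systems differs, but the underlying brittleness is the same.

```python
# Sketch of an adversarial attack (FGSM) on an image classifier, showing how a
# tiny, crafted perturbation can flip a model's prediction.
# Requires torchvision >= 0.13 for the weights API; the input is a placeholder.
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

x = torch.rand(1, 3, 224, 224)            # placeholder image; use a real one in practice
with torch.no_grad():
    y = model(x).argmax(dim=1)            # the model's original prediction

x_adv = x.clone().requires_grad_(True)
loss = nn.CrossEntropyLoss()(model(x_adv), y)
loss.backward()

epsilon = 0.03                            # perturbation budget (illustrative)
x_perturbed = (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

with torch.no_grad():
    print("original prediction:", y.item())
    # With a real image and a suitable epsilon, the prediction typically flips.
    print("prediction after attack:", model(x_perturbed).argmax(dim=1).item())
```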

SURPRISE ATTACK: In the case of Generative AI systems built on large-scale foundation models, latent brittleness can be leveraged to mount zero-day attacks that are unknown to the developer and therefore unmitigated. These vulnerabilities then propagate and are amplified in any downstream system that uses them.

WHY IT MATTERS: This has implications for reliability, especially in safety-critical contexts like automobiles, healthcare, and nuclear power systems, where we seek a high degree of robustness and safety from AI systems. For example, code generated by systems like Codex is known to harbor security vulnerabilities, producing unsafe outputs.
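
To illustrate the kind of vulnerability studies have found in assistant-generated code, the hand-written sketch below contrasts a SQL query built by string interpolation (injectable) with a parameterized one; it is not actual Codex output.

```python
import sqlite3

def get_user_insecure(conn: sqlite3.Connection, username: str):
    # Pattern often seen in generated code: string-built SQL, vulnerable to
    # injection (e.g. username = "x' OR '1'='1" returns every row).
    return conn.execute(f"SELECT * FROM users WHERE name = '{username}'").fetchall()

def get_user_safe(conn: sqlite3.Connection, username: str):
    # Parameterized query: the driver escapes the value, closing the injection hole.
    return conn.execute("SELECT * FROM users WHERE name = ?", (username,)).fetchall()
```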

GO DEEPER: We should constrain safety-critical AI systems to operate within predefined thresholds, with graceful failure modes and deferral to humans when uncertainty goes out of bounds. When a Generative AI system is used downstream, the strength of those guarantees begins to fade: we lose fidelity and confidence in the upper and lower bounds of the system's performance because it is being used in different or unknown contexts for which it wasn't explicitly trained.
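
One way to sketch such a guardrail is a selective-prediction wrapper that only releases a model answer when its confidence clears a predefined threshold and otherwise defers to a human. The names, threshold, and confidence score below are illustrative placeholders, not a standard API.

```python
# Sketch of a deferral guardrail: answer only when confidence clears a threshold,
# otherwise fail gracefully by handing the case to a human reviewer.
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass
class Decision:
    answer: Optional[str]   # None means "deferred to a human reviewer"
    confidence: float
    deferred: bool

def guarded_predict(
    predict: Callable[[str], Tuple[str, float]],  # returns (answer, confidence in [0, 1])
    query: str,
    threshold: float = 0.9,                       # illustrative go/no-go threshold
) -> Decision:
    answer, confidence = predict(query)
    if confidence < threshold:
        return Decision(answer=None, confidence=confidence, deferred=True)
    return Decision(answer=answer, confidence=confidence, deferred=False)

# Usage with a stubbed-out model:
decision = guarded_predict(lambda q: ("approve claim", 0.72), "hypothetical claim text")
print(decision)  # Decision(answer=None, confidence=0.72, deferred=True)
```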


Transparency and Accountability

IN THE DARK: Generative AI systems will often be purchased off-the-shelf or consumed via an API, as with systems like Midjourney, ChatGPT, and Stable Diffusion. In these cases, users have little visibility into the exact composition of the underlying training data, the internal structure of the system, and the governance surrounding those systems.

WHY IT MATTERS: This lack of transparency poses problems when users find their IP rights violated, as Greg Rutkowski experienced with the co-opting of his work. There was little recourse for him, and many other artists worldwide saw their portfolios scooped up in the training of these systems. When they tried to seek recourse, they ran into under-resourced and opaque processes with little to no resolution.

TURMOIL AHEAD: Once the proverbial cat is out of the bag, artists struggle tremendously to hold developers accountable. The same goes for code in the case of systems like Codex and Copilot, with a lawsuit underway that might set precedents for future legal challenges. Other lawsuits, such as the class action over Stable Diffusion and the lawsuit by Getty Images, will also shape the landscape in the coming months.


Explainability

FOOL ME ONCE …: The confidence with which Generative AI systems return answers can be quite misleading when those answers aren't accompanied by an explanation of how the system arrived at them.

WHY IT MATTERS: While we see toy uses such as casual chats with ChatGPT for the time being, there are more serious use cases, such as the integration of these capabilities into decision-making within healthcare or financial workflows, where an explanation of how the system arrived at an answer will be essential. The ease of integrating these systems makes it more likely that customer support will become automated. Figuring out why your claim was denied will be an exercise in futility when multiple systems are chained together to provide that service; it becomes nearly impossible to pinpoint why a particular subsystem made a given decision and how that decision affected the dissatisfied end-user.

GO DEEPER: Avoid unexplainable models where more interpretable alternatives are available, even if that means accepting some performance tradeoffs.
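
As a minimal sketch of that recommendation, a logistic regression on synthetic data exposes per-feature coefficients that act as a built-in explanation, something a black-box generative model cannot offer; the feature names and data here are invented for illustration.

```python
# Sketch: an inherently interpretable model whose coefficients double as explanations.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
feature_names = ["claim_amount", "prior_claims", "policy_age"]  # hypothetical features
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 2] + rng.normal(scale=0.5, size=500)) > 0

clf = LogisticRegression().fit(X, y)

# Each coefficient shows how a feature pushes the decision, the kind of
# explanation a dissatisfied end-user (or regulator) can actually audit.
for name, coef in zip(feature_names, clf.coef_[0]):
    print(f"{name:>13}: {coef:+.2f}")
```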


Fairness and Inclusiveness

DATA HUNGRY: Internet-scale systems come with internet-scale biases. Generative AI systems are trained on very large datasets and scoop up both useful signals (which is what gives them their tremendous utility) and problematic content such as hate speech, racism, and misogyny.

WHY IT MATTERS: These biases are manifested in the outputs of Generative AI systems. For example, when prompted with “CEO,” the models produce more pictures of men than of women, something we’ve seen with many AI systems in the past.
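
A rough way to surface such skews in a text model is to sample many completions for an occupation prompt and count gendered pronouns. The sketch below is a toy audit only, assuming the transformers library, with GPT-2 standing in for a production model and hand-picked word lists; a real fairness evaluation needs far more rigor.

```python
# Toy bias-audit sketch: sample completions and count gendered pronouns.
from collections import Counter
from transformers import pipeline, set_seed

set_seed(0)
generator = pipeline("text-generation", model="gpt2")

completions = generator(
    "The CEO walked into the room and",
    max_new_tokens=20,
    num_return_sequences=50,
    do_sample=True,
    pad_token_id=50256,
)

counts = Counter()
male, female = {"he", "him", "his"}, {"she", "her", "hers"}
for c in completions:
    words = set(c["generated_text"].lower().split())
    counts["male"] += bool(words & male)
    counts["female"] += bool(words & female)

print(counts)  # a skew here is a quick signal, not a rigorous fairness measurement
```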

WHAT LIES AHEAD: This pervasive lack of fairness may be amplified across modalities, spanning computer vision and natural language processing, in multi-modal foundation models, especially when they are used to create marketing copy, educational videos, and more.


Safety and Existential Risk

FAST-EVOLVING CAPABILITIES: Current LLMs, still highly limited, are passing legal exams, authoring major publications, and outscoring college students on IQ-style tests. Further progress will likely happen quickly; AI will probably surpass our abilities within the century.

THREAT ASSESSMENT: According to expert surveys, many AI researchers think there is a non-negligible chance that AI will lead to outcomes as bad as human extinction; 48% of top experts surveyed put the chance of a catastrophic outcome as high as 10%, odds that would make someone more likely to die from an AI accident than from a car crash. In the past, it has been easy to dismiss existential threats as best left to Terminator movies, but times are changing.

ALIGNMENT IS HARD: AI systems are different from other technologies: they can have goals, seek power, and surprise us with capabilities we don’t expect. The field of alignment, which works on making systems share human goals such as safety, is young and under-resourced relative to how fast advanced systems are being built. Investing in alignment research is becoming a dire need as capabilities increase rapidly.


Scaling and Evolving Existing Solutions

RESPECTING PRIVACY: To address privacy issues in the era of Generative AI, investigating methods such as differential privacy, data cooperatives, and better safety filters on outputs is a good starting point for evolving our current approaches to privacy and data management.
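
As one concrete building block, the classic Laplace mechanism from differential privacy adds calibrated noise to an aggregate statistic before release. The sketch below uses an illustrative epsilon and a simple counting query; applying differential privacy to model training itself (e.g., DP-SGD) is considerably more involved.

```python
# Sketch of the Laplace mechanism: release a noisy count so that any single
# individual's presence in the data changes the output distribution only slightly.
import numpy as np

def laplace_count(true_count: int, epsilon: float = 1.0, sensitivity: float = 1.0) -> float:
    """Counting queries have sensitivity 1: adding or removing one person
    changes the count by at most 1. Noise scale = sensitivity / epsilon."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Smaller epsilon -> more noise -> stronger privacy, less accuracy.
print(laplace_count(true_count=1234, epsilon=0.5))
```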

IMPROVED LICENSING: Working on novel mechanisms through contracts and licenses can be another way to bridge the accountability gaps in an ecosystem where some of an organization's core capabilities come from a model developed by a third party.

BUILDING TRUST: Enhancing these systems' security, safety, and reliability starts with working with community stakeholders to better define what each of these terms means to them, an approach that can lead to more desirable outcomes. For example, working with domain experts in healthcare to understand their go/no-go criteria, and what kinds of information would help elicit trust, can aid more meaningful adoption of Generative AI systems in their workflows.

MORE TRANSPARENCY: In addressing the lack of transparency, leaning more heavily on automated documentation methods might give us an avenue to meet this challenge, since they alleviate the burden of documenting many systems comprehensively. That said, this can still prove difficult for a Generative AI system built on a foundation model with a very large data footprint, which is hard to document well.
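
A lightweight flavor of automated documentation is generating a model card skeleton from metadata captured at training time. The fields and values below are hypothetical placeholders, loosely inspired by the model cards approach, not a standard schema.

```python
# Sketch: auto-generate a minimal model card from training-time metadata.
from dataclasses import dataclass, asdict

@dataclass
class ModelCard:
    name: str
    intended_use: str
    training_data: str
    known_limitations: str
    evaluation: str

def render_markdown(card: ModelCard) -> str:
    lines = [f"# Model Card: {card.name}"]
    for field, value in asdict(card).items():
        if field != "name":
            lines.append(f"\n## {field.replace('_', ' ').title()}\n{value}")
    return "\n".join(lines)

card = ModelCard(
    name="example-generator-v0",
    intended_use="Internal drafting assistance; not for automated decisions.",
    training_data="Web-scraped corpus snapshot (composition only partially documented).",
    known_limitations="May reproduce memorized or biased content.",
    evaluation="Perplexity on a held-out split; no fairness audit yet.",
)
print(render_markdown(card))
```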

BETTER EXPLANATIONS: Explainability issues can be tackled by investing in techniques like distillation, teacher-student models, and inverse reinforcement learning, which can shed some light on how the system arrives at its outputs.
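
For illustration, here is a minimal knowledge-distillation training step in PyTorch, where a small student is trained to match a larger teacher's softened output distribution; the tiny architectures, temperature, and loss weighting are placeholders rather than a recipe for production models.

```python
# Sketch of a knowledge-distillation step: a small "student" learns to mimic a
# larger "teacher", giving us a more compact (and often more inspectable) proxy.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10)).eval()
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

T = 2.0          # temperature: softens the teacher's distribution
alpha = 0.5      # balance between imitation loss and ordinary task loss

x = torch.randn(64, 32)          # placeholder batch
y = torch.randint(0, 10, (64,))  # placeholder labels

with torch.no_grad():
    teacher_logits = teacher(x)
student_logits = student(x)

# KL divergence between softened distributions + standard cross-entropy on labels.
distill_loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=1),
    F.softmax(teacher_logits / T, dim=1),
    reduction="batchmean",
) * (T * T)
task_loss = F.cross_entropy(student_logits, y)
loss = alpha * distill_loss + (1 - alpha) * task_loss

loss.backward()
optimizer.step()
print(f"distill={distill_loss.item():.3f}  task={task_loss.item():.3f}")
```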

DEEPER INCLUSION: Finally, to ensure that these systems are fair and inclusive, jointly exploring their capabilities with impacted stakeholders, such as artists in the creative industry, can help unearth the values they hold dear and align system outputs with those values, while still giving them access to the power these capabilities offer.


Existing Responsible AI Issues Need More Attention

WHAT’S GOING WRONG: There are many guidelines and efforts toward operationalizing Responsible AI. Yet a large Responsible AI gap remains: organizations commit to acting responsibly as they scale AI, but fail to do so comprehensively due to a lack of assigned leadership, resource allocation, and alignment with their purpose and values.

WHY ACT NOW: With Generative AI systems gaining traction, organizations find it easier to adopt these AI capabilities thanks to lowered barriers to access and utility. Research has shown that investing in Responsible AI early is essential because it takes, on average, three years for such a program to reach maturity. Early investment minimizes failures as the organization scales its development and deployment of AI systems.

Responsible AI adoption is even more urgent and important with Generative AI systems. Existing Responsible AI issues become amplified and more varied in their impact, yet our responses haven’t evolved to address these changing needs. Let’s change that through deliberate effort.

Abhishek Gupta and Emily Dardaman

Fellow, Augmented Collective Intelligence, BCG Henderson Institute

https://www.linkedin.com/in/abhishekguptamcgill/