5 Rules to Manage AI’s Unintended Consequences
Companies are increasingly using “reinforcement-learning agents,” a type of AI that rapidly improves through trial and error as it single-mindedly pursues its goal, often with unintended and even dangerous consequences. The weaponization of polarizing content on social media platforms is an extreme example of what can happen when RL agents aren’t properly constrained. To prevent their RL agents from causing harm, leaders should abide by five rules as they integrate this AI into their strategy execution.
Social media firms claim they’re just trying to build communities and connect the world and that they need ad revenues to remain free. But nothing is really free. For them, more views mean more money and so they’ve optimized their algorithms to maximize engagement. Views are the algorithms’ “reward function” — the more views the algorithms can attract to the platform the better. When an algorithm promotes a given post and sees an upsurge of views, it will double down on the strategy, selectively timing, targeting and pushing posts in ways that it has found will stimulate further sharing, a process called reinforcement learning.
It doesn’t take an AI expert to see where this leads: provocative posts that evoke strong emotions will get more views, and so the algorithm will favor them, leading to ever-increasing revenues for the platform. But social platforms aren’t the only ones that use reinforcement learning AI. As companies adopt it, leaders should look to social media companies’ problems to understand how it can lead to unintended consequences — and try to avoid making predictable mistakes.
Reinforcement Learning Agents
To understand the cause-and-effect cycle we see on social platforms, it’s helpful to know a bit more about how the algorithm works. This type of algorithm is called a reinforcement learning (RL) agent, and while these agents’ activities are perhaps most visible in social media, they are becoming increasingly common throughout business.
Unlike algorithms that follow a rigid if/then set of instructions, RL agents are programmed to seek a specified reward by taking defined actions during a given “state.” In this case, the reward is views — the more the better. The agent’s permitted actions might include whom to target and how frequently to promote posts. The algorithm’s state might be the time of day. Combined, the agent’s reward, the states within which it operates, and its set of permitted actions are called its “policies.”
Policies broadly define how an RL agent can behave in different circumstances, providing guardrails of sorts. The agent is free to experiment within the bounds of its policies to see what combinations of actions and states (state-action pairs) are most effective in maximizing the reward. As it learns what works best, it pursues that optimal strategy and abandons approaches that it found less effective. Through an iterative trial-and-error process, the agent gets better and better at maximizing its reward. If this process sounds familiar, it’s because it’s modeled on how our own brains work; behavioral patterns ranging from habits to addictions are reinforced when the brain rewards actions (such as eating) taken during given states (e.g., when we’re hungry) with the release of the neurotransmitter dopamine or other stimuli.
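The trial-and-error loop described above can be sketched as a simple tabular RL agent. Everything here is a simplified stand-in — the states (time of day), actions, view counts, and learning constants are illustrative assumptions, not any platform’s actual system — but the dynamic is the real one: the agent samples state-action pairs, keeps what earns more of its reward, and converges on the strategy that maximizes views.

```python
import random
from collections import defaultdict

# Illustrative toy only: made-up states, actions, and rewards standing in
# for the kind of policy a social platform's RL agent might learn.
STATES = ["morning", "evening"]                   # the "state": time of day
ACTIONS = ["neutral_post", "provocative_post"]    # the permitted "actions"

def simulated_views(state, action):
    """Toy environment: provocative posts draw more views, especially in the evening."""
    base = {"neutral_post": 10, "provocative_post": 25}[action]
    bonus = 15 if (state == "evening" and action == "provocative_post") else 0
    return base + bonus + random.randint(-3, 3)

q = defaultdict(float)      # learned value of each state-action pair
alpha, epsilon = 0.1, 0.2   # learning rate; exploration probability

random.seed(0)
for _ in range(5000):
    state = random.choice(STATES)
    if random.random() < epsilon:
        action = random.choice(ACTIONS)  # occasionally experiment
    else:
        action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit best known
    reward = simulated_views(state, action)                 # reward = views
    q[(state, action)] += alpha * (reward - q[(state, action)])

# The learned policy favors provocative content in every state.
for s in STATES:
    print(s, "->", max(ACTIONS, key=lambda a: q[(s, a)]))
```

No one hand-coded “favor provocative posts”; the agent discovered that strategy because the reward function made it optimal — which is exactly the dynamic the article describes.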
Understanding how RL agents pursue their goals makes it clearer how they can be modified to prevent harm. While it’s hard to change the behavior of humans in the human-AI system, it’s a simpler matter to change an RL agent’s policies — the actions it can take in pursuing its reward. This has important implications for social media, clearly, but the point is broadly applicable in any of an increasing number of business situations where RL agents interact with people.
Whatever you think of Facebook’s and Twitter’s leadership, they surely didn’t set out to create a strategy to sow discord and polarize people. But they did instruct managers to maximize the platforms’ growth and revenues, and the RL agents they devised to do just that brilliantly succeeded — with alarming consequences.
The weaponization of social media platforms is an extreme example of what can happen when RL agents’ policies aren’t properly conceived, monitored, or constrained. But these agents also have applications in financial services, health care, marketing, gaming, automation, and other fields where their single-minded pursuit of rewards could promote unexpected, undesirable human behaviors. The AIs don’t care about these consequences, but the humans who create and operate them must.
Following are five rules leaders should abide by as they integrate RL agents into their strategy execution. To show how an agent in financial services can skew human behavior for the worse — and how, with proper adjustment, these rules can help head off such a problem — I’ll draw on a case from my own company.
1. Assume that your RL agent will affect behavior in unforeseen ways.
My company built an agent to speed the quality assurance of accounting transactions by flagging anomalies (potential errors) that the algorithm scored as high-risk and putting these first in the queue for evaluation by an analyst. By dramatically reducing the total number of anomalies the analysts needed to review, the algorithm substantially cut the overall review time as we’d hoped it would. But we were surprised to see suspiciously fast review times for even the most complex anomalies. These should have taken the analysts more time, not less.
2. Systematically evaluate deviations from the expected.
To date, few companies methodically assess how their RL agents are influencing people’s behavior. Start by regularly asking your data scientists for data on behavior changes that may be associated with the agents’ activities. If you see a departure from what’s expected, dig deeper. In our case, the fact that the analysts were flying through the riskiest anomalies was a red flag that the algorithm was causing an unexpected knock-on effect. We knew we had a problem.
3. Interview users, customers, or others about their responses to the RL agents’ outputs.
While those at the receiving end of an RL agent’s actions may be unaware that they’re being influenced by an AI, you can still gauge their response to it. Because we were concerned about our accounting analysts’ too-speedy reviews, we spoke with them about how they were responding to the anomalies the algorithm compiled for them to assess. It turned out that they wrongly assumed the agent was doing more of the quality assurance on these anomalies than it was; they were over-relying on the agent’s “expertise” and so paid less attention in their own investigation of each anomaly. (Incidentally, such over-reliance on AI is one reason people are crashing “self-driving” cars; they assume the AI is more capable than it is and hand over too much control — a dangerous knock-on effect.)
4. If an agent is promoting undesirable behaviors, modify its policies.
To optimize agents’ pursuit of their reward, most AI teams constantly adjust the agents’ policies, generally by modifying state-action pairs — for example, the time in a billing cycle (the state) at which an agent will send a payment prompt (the action). In our accounting example, we made several policy changes: we redefined the state to include the time an analyst spent on each anomaly, and we added actions that challenged an analyst’s conclusion if it was reached too quickly and escalated selected anomalies to a supervisor. These policy changes substantially cut the number of serious anomalies that the analysts dismissed as false positives.
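The shape of those policy changes can be sketched as follows. The thresholds, action names, and decision logic here are illustrative assumptions rather than our actual rules; the point is that the state now includes review time, and the action set now includes challenge-and-escalate.

```python
# Hypothetical sketch of the modified policy: the state is enriched with the
# analyst's review time, and a new escalation action is available.
# All thresholds and names are illustrative assumptions.
FAST_REVIEW_SECONDS = 45   # assumed "too quick" threshold
HIGH_RISK = 0.8            # assumed high-risk cutoff

def choose_action(risk_score, review_seconds):
    """State = (risk score, time the analyst spent on the anomaly)."""
    if risk_score >= HIGH_RISK and review_seconds < FAST_REVIEW_SECONDS:
        # New action: push back on a conclusion reached too quickly
        # and escalate the anomaly to a supervisor.
        return "challenge_and_escalate"
    if risk_score >= HIGH_RISK:
        return "queue_for_analyst"
    return "auto_log"

print(choose_action(0.9, 20))    # challenge_and_escalate
print(choose_action(0.9, 300))   # queue_for_analyst
print(choose_action(0.2, 20))    # auto_log
```

Before the change, the agent’s state was blind to review time, so a rubber-stamped dismissal and a careful one looked identical to it; enriching the state is what made the corrective action possible.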
5. If undesirable behaviors persist, change the reward function.
Adjusting an agent’s state-action pairs can often curb undesirable knock-on behaviors, but not always. The big stick available to leaders when other interventions fail is to change the agent’s goal. Generally, changing a reward function isn’t even considered because it’s presumed to be sacrosanct. But when the agent’s pursuit of its goal is promoting harmful behavior and adjusting the states or actions available to the agent can’t fix the problem, it’s time to examine the reward itself.
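What changing the reward itself might look like can be shown in miniature. The penalty weight and the harm estimate below are assumptions for demonstration only — real platforms would need their own harm signal — but they show how reshaping the reward changes which behavior the agent learns to prefer.

```python
# Illustrative sketch of reshaping a reward function. The weight and the
# harm estimates are assumptions for demonstration only.
HARM_WEIGHT = 2.0   # how heavily harmful engagement is penalized

def original_reward(views, harm_estimate):
    return views   # views at any cost

def reshaped_reward(views, harm_estimate):
    # Same views signal, minus a penalty for content the platform's own
    # classifiers estimate to be divisive or harmful.
    return views - HARM_WEIGHT * harm_estimate

# A provocative post: many views, high estimated harm.
print(original_reward(1000, 400))   # 1000
print(reshaped_reward(1000, 400))   # 200.0

# A neutral post: fewer views, little estimated harm.
print(reshaped_reward(600, 10))     # 580.0
```

Under the original reward the provocative post always wins; under the reshaped reward the neutral post scores higher, so an agent optimizing it would learn a different policy — without any change to its states or actions.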
Giving an artificial intelligence agent the goal of maximizing views by any means necessary, including exploiting human psychological vulnerabilities, is dangerous and unethical. In the case of social platforms, perhaps there’s a way to adjust the agents’ state-action pairs to reduce this harm. If not, it’s incumbent on the platforms to do the right thing: stop their agents from destructively pursuing views at any cost. That means changing the reward they’re programmed to pursue, even if it requires modifying the business model.
While that may seem like a radical idea in the case of social media, it’s definitely in the air: Following the Facebook Oversight Board’s decision to uphold the company’s ban on former president Trump for violating its rules against provoking violence, Frank Pallone, the chairman of the House Energy and Commerce Committee, squarely placed blame for recent events on the social media business model, tweeting: “Donald Trump has played a big role in helping Facebook spread disinformation, but whether he’s on the platform or not, Facebook and other social media platforms with the same business model will find ways to highlight divisive content to drive advertising revenues.” The Oversight Board itself called out Facebook’s business model in advising the company to “undertake a comprehensive review of [its] potential contribution to the narrative of electoral fraud and the exacerbated tensions that culminated in the violence in the United States on January 6. This should be an open reflection on the design and policy choices Facebook has made that allow its platform to be abused.”
Ironically, a positive effect of social media is that it’s been an important channel in elevating people’s awareness of corporate ethics and behavior, including the platforms’ own. It’s now common for companies and other organizations to be called out for the destructive or unfair outcomes of pursuing their primary objectives — whether carbon emissions, gun violence, nicotine addiction, or extremist behavior. And companies by and large are responding, though they still have a long way to go. As RL agents and other types of AI are increasingly tasked with advancing corporate objectives, it is imperative that leaders know what their AI is up to and, when it’s causing harm to the business or to society at large, do the right thing and fix it.