(Bloomberg Opinion) -- Internet giant Amazon last month ran into a problem that eloquently illustrates the pitfalls of big data: It tried to automate hiring with a machine-learning algorithm, but testing revealed that the system merely perpetuated the tech industry’s bias against women. What’s most troubling isn’t the discovery itself. It’s that most companies using similar algorithms don’t even want to know.
Welcome to the era of plausible deniability in big data.
I have long expected something like the Amazon glitch to happen. More than a year ago, using the male-dominated culture at Fox News as an example, I explained how it would work. Machine-learning algorithms analyze data from the past to predict what will be successful in the future. In a company with a misogynistic culture, the computer would see that women were promoted less frequently, tended to leave sooner and received fewer raises. So it would conclude that men are better hires, perpetuating and even amplifying the historical bias.
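To see the mechanism in miniature, here is a sketch in Python using entirely synthetic data; the features, numbers and promotion rule are my own invention for illustration, not Amazon’s. A model is trained on the outcomes of a workplace where equally competent women were promoted less often, and it dutifully learns to hold being a woman against a candidate:

```python
# Minimal sketch: a classifier trained on outcomes from a biased workplace
# learns to reproduce that bias. All data here is synthetic and invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5_000

# Candidate features: competence (identical across groups) and a gender flag.
competence = rng.normal(size=n)
is_woman = rng.integers(0, 2, size=n)

# Historical label: "was promoted." In this misogynistic shop, women were
# promoted far less often than equally competent men.
promoted = (competence + rng.normal(scale=0.5, size=n) - 1.5 * is_woman) > 0

X = np.column_stack([competence, is_woman])
model = LogisticRegression().fit(X, promoted)

# The model learns a large negative weight on the gender flag: it now
# "predicts" that women make worse hires, encoding the old bias.
print("weight on competence:", model.coef_[0][0])
print("weight on is_woman:  ", model.coef_[0][1])
```

Nothing in the code asks for sexism; the bias rides in on the training labels.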
Amazon’s recruiting engine went to great lengths to identify and weed out women. A women’s college in the education section of a resume was an automatic demerit. By contrast, typically male vocabulary, such as “executed,” counted as a point in favor. These are just two examples of how computers can sift through data to find proxies for the qualities they seek or want to avoid. What seems like offhand, irrelevant information correlates with things like gender, race and class. And like it or not, gender, race and class matter a great deal in how our world works, so their signals are very strong. This allows computers to discriminate without their creators intending any such thing.
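The proxy effect is just as easy to reproduce. In the synthetic sketch below (again, the features and numbers are invented for illustration), the explicit gender column is withheld from the model, yet two correlated stand-ins, a women’s-college flag and the use of the word “executed,” smuggle the same signal back in:

```python
# Sketch of proxy discrimination: drop the explicit gender column and the
# model rediscovers gender through correlated resume features. Synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5_000

competence = rng.normal(size=n)
is_woman = rng.integers(0, 2, size=n)

# Proxy features: attendance at a women's college, and use of stereotypically
# "male" verbs like "executed" -- both correlate strongly with gender.
womens_college = (is_woman == 1) & (rng.random(n) < 0.3)
says_executed = rng.random(n) < np.where(is_woman == 1, 0.2, 0.6)

# The same biased historical outcome as before.
promoted = (competence + rng.normal(scale=0.5, size=n) - 1.5 * is_woman) > 0

# Gender itself is *not* given to the model...
X = np.column_stack([competence, womens_college, says_executed])
model = LogisticRegression().fit(X, promoted)

# ...yet the proxies pick up the slack: the women's college is penalized,
# "executed" is rewarded, and the bias survives intact.
for name, w in zip(["competence", "womens_college", "says_executed"],
                   model.coef_[0]):
    print(f"{name:15s} {w:+.2f}")
```

Dropping the sensitive attribute, in other words, is no defense while proxies for it remain.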
What makes Amazon unusual is that it actually did its due diligence, discovered the troubling bias and decided against using the algorithm.
That’s a lot more than most companies can say. They’re trying to pretend that such problems don’t exist, even as they double and triple down on recruiting, firing, or other human-resources algorithms, and even as they sell or deploy credit, insurance and advertising algorithms. I know this because I run a company that audits algorithms, and I have encountered this exact issue multiple times.
Here’s what happens. An analytics person, usually quite senior, asks if I can help audit a company’s algorithm for things like sexism or other kinds of bias that would be illegal in its regulated field. This leads to a great phone call and promises of more and better phone calls. On the second call, they bring on their corporate counsel, who asks me some version of the following question: What if you find a problem with our algorithm that we cannot fix? And what if we someday get sued for that problem and in discovery they figure out that we already knew about it? I never get to the third call.
In short, the companies want plausible deniability. But they won’t be able to bury their heads in the sand for long. More cases like Amazon’s will surface, and journalists and regulators will start to connect the dots.
Ideally, companies will face up to the issue sooner, which will mean spending much more on recruiting, or at least on making sure their algorithms aren’t illegal: an upfront cost they don’t relish. More likely, they’ll keep ignoring it until they attract a series of major lawsuits, possibly from regulators but, given the current climate, more probably through class actions. And when the plaintiffs start winning, shareholders will recognize that big data isn’t quite the blessing they had hoped it would be.