Money Laundering, Machine Learning, and Bias

Maciej Cegłowski spoke at the SASE Conference in 2016 about machine learning, among many other related topics. Discussing the world software developers have created, he said:

Instead of relying on algorithms, which we can be accused of manipulating for our benefit, we have turned to machine learning, an ingenious way of disclaiming responsibility for anything. Machine learning is like money laundering for bias. It’s a clean, mathematical apparatus that gives the status quo the aura of logical inevitability. The numbers don’t lie.

In another talk, this time at the Library of Congress, he described a food cart vendor who was shut down for never changing the oil in which he fried everything. That old oil, it turned out, produced the unique flavor that kept customers returning for more. You could fry anything in it and it was delicious. He then asked a question about how we use and analyze data:

So what’s your data being fried in? These algorithms train on large collections that you know nothing about. Sites like Google operate on a scale hundreds of times bigger than anything in the humanities. Any irregularities in that training data end up infused into…the classifier.

The world is full of data. It is increasingly difficult to justify collecting even more data. It is even more difficult to use this data in a way that doesn’t reinforce the status quo or harm at-risk communities.

There are no clear-cut answers to the problem of what to do with all the data, but thinking critically about it is essential. These two talks are a great starting point for people and organizations that want to think carefully about ethical data collection, usage, and analysis.


13 June 2022