Presentation: How do we Audit Algorithms?



4:10pm - 5:00pm


Key Takeaways

  • Examine critically how much faith and trust we put in models.
  • Understand that the models ML practitioners generate should be audited for fairness.
  • Comprehend the degree to which models can affect real people if not properly audited.


Algorithms are increasingly being used to automate what used to be human processes. They are potentially more fair and objective, but they are not automatically so. In fact, they can easily codify unfair historical practices. I will discuss some examples of this problem and then pose the question: how would we transparently and comprehensibly audit such algorithms for fairness?


QCon: Please tell me about your role today.
Cathy: I am doing data science consulting work for the Attorney General’s offices of Illinois and other states.
QCon: What is the motivation for your talk?
Cathy: I think that big data is increasing inequality and threatening democracy. I think algorithms that are being used and created carelessly generate negative feedback loops that destroy people’s lives.
And I am interested in measuring these feedback loops. I want to measure the effect and the way you do that is by auditing the algorithms themselves.
The data that you are using to train a model is just data that is collected from the world. I argue the world is flawed, so our models are flawed. At the very best, you are embedding and encoding past flaws into your future outputs. For example, if we historically didn’t hire women as engineers and we build a hiring algorithm from that historical data, the algorithm will be biased against women.
QCon: What can be done about it?
Cathy: The number one thing is to realize that just because it is an algorithm, doesn’t mean it is objective and fair. The second thing is that you have to have human intervention to avoid this repetition of past mistakes.
QCon: But doesn’t that in itself put bias into the algorithm, if a human intervenes in altering data interpretation?
Cathy: You are right. But you have to be more intentional about it. What I am asking for is instead of relying on default bias (which is essentially the status quo bias), we should realize that we are not removing ethical decisions. We are just letting them be default. Instead, we have to proactively create those ethical decisions.
Data scientists don’t get paid to think about this. We get paid to optimize our models for a very narrow sense of success, and that’s too narrow, especially for certain kinds of algorithms. I usually don’t care about algorithms, but in certain cases I do. I want to focus on algorithms that make important decisions about people, affecting their lives: algorithms deciding whether they get hired for a job, whether they go to prison and for how long, whether they keep their job, or whether they get medical insurance. Those kinds of algorithms should be scrutinized.
The algorithm itself is seen as the law of the land, except that it fails constitutionally in that it is not open. People can’t scrutinize it. People can’t interrogate the “law.” They can’t interrogate the model, and they have no idea what it is. But we are told that it must be objective and fair, because, after all, it is mathematical.
And that is simply not true. As a data scientist, I could be making these decisions. I could decide one way or the other. Almost nobody questions my decisions because I have a PhD in mathematics, so they assume I must be right.
QCon: How do you propose to deal with these issues?
Cathy: There are multiple ways, but there are two ways that I think of. One is to try to build a model that is inherently fair. The other way is to try to understand or test an existing model through an audit to see whether it is fair and to try to improve it if it isn’t fair.
QCon: How do you audit a model?
Cathy: When I say models discriminate, in fact that is the goal of models. Models are set up to discriminate. The question is whether a model is doing it illegally or unethically. I would consider the legally protected groups; I don’t think that’s a full list, but it is at least a partial list. For example, let’s look at recidivism risk models for sentencing, like the Arnold Foundation studies. If a white man and a black man have the same history of violence, do they get the same risk score?
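The question "do two people with the same history get the same score?" can be framed as a concrete check. The sketch below is a hypothetical illustration, not a real audit tool or any actual risk model: `toy_model`, `HIGH_RISK_ZIPS`, and the record fields are all invented. The toy model never sees race directly, yet the audit still surfaces a score gap because of the proxy feature it uses.

```python
from collections import defaultdict

# Hypothetical proxy feature; any resemblance to real zip codes is incidental.
HIGH_RISK_ZIPS = {"60621"}

def toy_model(person):
    # Deliberately flawed toy model: it never sees race, but the zip
    # code it does use stands in for it.
    base = 2.0 if person["history"] == "violent" else 1.0
    return base + (0.5 if person["zip"] in HIGH_RISK_ZIPS else 0.0)

def audit(records, score_fn):
    # Bucket people by identical history, then measure the largest
    # average-score gap between protected groups within each bucket.
    # A nonzero gap means otherwise-identical people are scored
    # differently depending on group membership.
    buckets = defaultdict(lambda: defaultdict(list))
    for person in records:
        buckets[person["history"]][person["group"]].append(score_fn(person))
    gaps = {}
    for history, groups in buckets.items():
        means = {g: sum(s) / len(s) for g, s in groups.items()}
        if len(means) > 1:
            gaps[history] = max(means.values()) - min(means.values())
    return gaps
```

For two people with the same violent history but different zip codes, `audit` reports a 0.5 gap in average score, exactly the kind of disparity an auditor would flag.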
Let me break that down a little bit. You are not allowed to use race in an algorithm for the most part. That’s not to say you shouldn’t have race on hand because having race actually helps you determine whether a model is racist, right?
The most direct way of figuring that out is to have race on hand, but you probably don’t want to use race as one of the characteristics of an individual when you are assessing that individual for creditworthiness or various other kinds of risk. Generally speaking, people don’t use race, but they do use proxies for race all the time. Without even thinking about it, they will use zip code, for example. People who are a little smarter than that say, “Oh, zip code is obviously a proxy for race because our country is so segregated,” so they don’t use zip code. But, without really processing it, they use other things as subtle proxies for race.
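The proxy problem can also be checked mechanically: how well does a single feature, on its own, predict the protected attribute? The sketch below is a hypothetical illustration (the `proxy_lift` function and the record fields are invented); it uses per-value majority-class accuracy over a simple guessing baseline as a rough measure of proxy strength. More principled measures (mutual information, a small classifier per feature) exist, but the idea is the same.

```python
from collections import Counter, defaultdict

def proxy_lift(records, feature, protected="group"):
    # How much better than the baseline does `feature` alone predict
    # the protected attribute? Compare per-value majority-class
    # accuracy against simply guessing the overall majority class.
    # A lift near 0 means the feature carries little group information;
    # a large lift suggests it is acting as a proxy.
    overall = Counter(r[protected] for r in records)
    baseline = max(overall.values()) / len(records)
    by_value = defaultdict(Counter)
    for r in records:
        by_value[r[feature]][r[protected]] += 1
    per_value_correct = sum(max(c.values()) for c in by_value.values())
    return per_value_correct / len(records) - baseline
```

Run over a real feature set, a screen like this would rank zip code near the top and genuinely neutral features near zero, flagging the subtle proxies before they are baked into a model.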

