In popular science fiction, an artificial intelligence is usually portrayed as an entity driven by pure logic, with objective fact the only thing that factors into its considerations. The same cannot be said of the artificially intelligent tools currently in use in the real world. Much like their flawed creators, many of today’s AI systems have shown considerable biases in their operations. Let’s discuss this issue, and how it can be resolved.
What Kind of Biases Have AI Systems Demonstrated?
A variety of biases have been observed in artificially intelligent systems, including the following:
- Sampling Bias - This kind of bias comes up when a study only considers part of a given population, or the selection of samples to consider isn’t a random process.
- Voluntary Bias - A kind of sampling bias, voluntary bias specifically refers to how a population’s results are artificially skewed by their willingness to participate.
- Design Bias - This bias describes when a process itself leads to skewed outcomes, often impacting the data collection process or the analysis of that data. In AI, a skewed dataset is the most likely culprit.
- Exclusion Bias - This form of bias is the result of the removal or omission of some feature of the data, leaving out important information that could impact the significance of the data and providing fewer or less valuable insights.
- Label Bias - Predictably, this is simply the phenomenon of data being labeled incorrectly. This itself often appears in two forms:
  - Recall Bias - This form of bias appears in data that has been mislabeled and annotated inaccurately.
  - Measurement Bias - This division of label bias is the result of inaccurately or inconsistently taken data points.
- Confounding Bias - This bias occurs when external variables influence your data alongside the factors you set out to measure. This can lead to significant inaccuracies in your final results.
- Survivorship Bias - This variety of bias is what we see when only the data that has made it through the selection process is considered. The classic example comes from World War II, when military aircraft were examined to decide where they should be reinforced. By only examining planes that survived the trip back from a combat mission, the most useful information (where the planes that went down were hit) was ignored.
- Time-Interval Bias - If data is collected over periods of time, this bias can emerge when only data from a certain time period is actively considered, rather than the complete set.
- Omitted Variable Bias - This kind of bias occurs when the data to be collected is cherry-picked, with only certain variables considered. Leaving out the other variables can skew the results.
- Observer Bias - Closely related to confirmation bias, observer bias is the phenomenon where the individual making the observations only considers the data that lines up with their own values or goals.
- Funding Bias - Predictably, this variety of observer bias arises when the interests of a financial backer lead to the data being skewed.
- Cause-Effect Bias - In other words, correlation being seen as causation. This bias assumes that two events happening concurrently must be the result of one leading to the other, rather than a third factor contributing or the two being completely unrelated.
- Model Over/Underfitting - This bias comes from the analytical system (called the model) either fitting its training data so closely that it can’t see the forest for the trees (overfitting), or not being equipped with enough features to identify the patterns it should (underfitting).
- Data Leakage - This type of bias comes into play when two separate data sets to be compared inadvertently share data… for example, predictions for a certain time period also including actual observations from that time period.
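To make the last item more concrete, here is a minimal sketch of one way to check two data sets for shared records before comparing them. The function name `find_leaked_rows` and the sample records are hypothetical and exist only for illustration. Real leakage can also be subtler than exact duplicates, so treat this as a first sanity check rather than a complete test.

```python
def find_leaked_rows(train_rows, test_rows):
    """Return the rows of test_rows that also appear in train_rows."""
    seen = set(train_rows)  # set lookup keeps the check fast
    return [row for row in test_rows if row in seen]


# Hypothetical records in the form (customer_id, month).
train = [(1, "2023-01"), (2, "2023-01"), (3, "2023-02")]
test = [(3, "2023-02"), (4, "2023-03")]

leaked = find_leaked_rows(train, test)
print(f"{len(leaked)} shared record(s): {leaked}")
```

Any record this reports should be removed from one of the two sets before results are compared.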
Where Do These Biases Come From?
Looking over this list, a clear pattern emerges: in the vast majority of cases, the bias doesn’t come from the system itself. Rather, it comes from the person using the system.
AI Bias is Just an Extension of Human Bias
Whether made in error or based on some prejudice or assumption, the majority of biases (particularly those that could impact the function of an artificially intelligent system) come from the user.
For instance, let’s say that we wanted to figure out which aspect of our services was most important to our clientele. This is a greatly oversimplified example of AI’s capabilities, but it gets our point across. Even if the algorithm powering the AI were perfectly put together, the data used could easily skew the results. Suppose that your data was collected exclusively from your business’ Facebook followers. Drawing your data only from that group is clear sampling bias, compounded with voluntary bias, as your followers need to opt into providing you with said data.
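The Facebook scenario above can be sketched in a few lines of code. Every value below is invented for illustration, and the `top_aspect` helper is hypothetical. The point is only that the same counting logic gives a different answer when it is fed followers alone.

```python
from collections import Counter

# Invented survey answers: (most important aspect, follows us on Facebook?)
clients = [
    ("communication", True),
    ("communication", True),
    ("communication", True),
    ("pricing", True),
    ("pricing", False),
    ("pricing", False),
    ("pricing", False),
    ("pricing", False),
    ("reliability", False),
    ("reliability", False),
]


def top_aspect(rows):
    """Return the aspect named most often in rows."""
    counts = Counter(aspect for aspect, _ in rows)
    return counts.most_common(1)[0][0]


# Biased sample: only the clients who follow the business on Facebook.
followers_only = [row for row in clients if row[1]]

print("Followers only:", top_aspect(followers_only))
print("All clients:   ", top_aspect(clients))
```

In this made-up data, followers favor communication while the full client base favors pricing, so a conclusion drawn from followers alone would point the business in the wrong direction.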
That’s just one example. We’ve all seen the news articles telling stories about how AI-powered facial recognition systems weren’t able to recognize people of certain races or, in one particularly egregious case, labeled all members of certain races as criminals.
Obviously, not ideal.
AI presents a few additional biases as well, particularly when it comes to predicting unprecedented events. After all, the data needed to consider such events just isn’t there (unintentional exclusion bias). The big problem here is that, as with most biases, it takes awareness to avoid them, and that awareness is something an AI system unfortunately lacks.
How Can Bias Be Avoided in AI?
Mitigating the issues that bias can create in AI takes a few different steps, and the approach has two parts.
In terms of creating an AI in the first place, a human being needs to be able to observe the program’s processes and catch its mistakes, with (as we always promote) frequent updates to ensure any issues are addressed and the system in general is improved upon. There also need to be standards in place for the data collected and used, to ensure that the above biases are minimized as much as possible.
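As one illustration of what a data standard might look like in practice, the sketch below flags any group that makes up less than a chosen share of a data set. The function name `underrepresented_groups`, the group labels, and the 10% threshold are all assumptions made for this example, not an established rule. An appropriate threshold depends on the data and the decision being made.

```python
from collections import Counter


def underrepresented_groups(labels, min_share=0.10):
    """Return, in sorted order, the groups whose share of labels is below min_share."""
    total = len(labels)
    if total == 0:
        return []  # nothing to audit in an empty data set
    counts = Counter(labels)
    return sorted(group for group, n in counts.items() if n / total < min_share)


# Hypothetical data set: 18 records from group A, 1 each from groups B and C.
labels = ["A"] * 18 + ["B"] + ["C"]

flagged = underrepresented_groups(labels)
print("Groups below the 10% threshold:", flagged)
```

A check like this does not remove bias on its own, but it gives the person reviewing the system a concrete prompt to collect more data or treat the results with caution.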
Likewise, your team members need to keep these kinds of biases in mind when they’re working without the assistance of AI. As we’ve established, the biases present in AI are generally sourced from human biases, which means they can potentially influence your business even if you aren’t using an artificially intelligent system. For this reason, you need to make sure that your team members keep open minds as they process the data you collect and generate.
As artificial intelligence and machine learning become more accessible and commonplace, it’s likely that businesses of all sizes will be able to embrace more advanced tools. In the meantime, we’ll be here to support the tools you currently rely on. To find out more about our fully managed IT services, give us a call at (732) 291-5938.