Addressing Bias & Algorithmic Fairness

Nathan Jeon
Jun 24, 2022 · 9 min read

More than ever, companies and organizations are employing Artificial Intelligence algorithms in technologies like social media and teleconferencing, contributing to dramatic gains in productivity, information flow, and collaboration.

However, innovation presents a different set of challenges: as humans become more dependent on the decisions of AI, the state of fairness is affected. To address this issue, researchers James Zou and Londa Schiebinger examine the potential impacts biomedical Artificial Intelligence can have on human health, from instrumentation and data collection to individual health assessment. Even common medical devices like the pulse oximeter can shape whether care is fair. Although initiatives to promote fairness are growing rapidly across disciplines, there is mounting concern that medical devices may perpetuate existing health inequities as they continue to be distributed. Is it possible, then, to make ‘fair’ technologies?

Put simply, it is possible, but only if we account for the behaviors and decisions of both humans and AI. Artificial Intelligence can be a precise predictor, yet it lacks the common sense and intuition that humans possess. When making decisions, especially in medicine and healthcare, it is therefore critical to consider the nature of fairness, because every individual holds unique values and beliefs that may not align with an algorithm’s predicted outcomes. It is also important to consider the mindset we bring to the relationship between Artificial Intelligence and humanity: companies and organizations should confront the social, technical, and political systems around their products to build a model of fairness that serves both the company’s goals and the consumer’s need for ‘fair’ technologies, rather than settling into an algorithm-dependent mindset. Lastly, it is crucial to consider how fairness is perceived by different minority, ethnic, and racial groups, and to ensure that technologies address outliers, not just commonalities, so that fairness holds for all groups.

Firstly, medicine is a field in which ethical standards are vital, built on trust and on the quality of relationships with patients. Transparency is another substantial element of medicine, supporting consistent, high-quality care across treatment, disease prevention, medical research, and the maintenance of patients’ health and wellbeing. Artificial Intelligence plays an impactful role in human health, yet it also shows inconsistencies across diverse populations in commonplace technologies. Evaluating modern biomedical AI, James Zou and Londa Schiebinger observe that “AI algorithms are often developed on non-representative samples and evaluated based on narrow metrics” (Zou and Schiebinger). They outline key challenges in outcome design, data collection, and technology evaluation that illustrate how bias and disparity can arise (Zou and Schiebinger).

For instance, they point to the pulse oximeter, a device essential for collecting health-related information such as oxygen levels, sleeping heart rate, and arrhythmia, as a technology whose data collection varies across patients. They highlight that “the problem with devices that use infrared and red light signaling is that these signals interact with skin pigmentation, and accuracy may vary with skin tone” (Zou and Schiebinger). More specifically, when oximeter readings are compared against oxygen saturation measured directly from arterial blood, the devices “may overestimate arterial oxyhemoglobin saturation at low SaO2 in patients with darker pigmented skin” (Zou and Schiebinger). As a result, some patients may not receive the supplemental oxygen they need to avoid damage to vital organs (heart, brain, lungs, and kidneys). The researchers also cite a study comparing oxygen saturation readings across two patient populations: over more than 47,000 paired readings, “oximeters misread blood gases 12 percent of the time in Black patients compared to 4 percent of the time in white patients” (Zou and Schiebinger). The authors attribute this gap in part to differences in the race/ethnicity, sex, and gender composition of the test patients, which produced different performance across the two demographic groups. Their concern is that pulse oximeters illustrate how algorithmic bias and disparities can arise from inadequate outcome choice, data collection, and model evaluation (Zou and Schiebinger). These biases and disparities are critical in medicine: inaccurate data can lead medical professionals to the wrong decision, which in turn can lead to disparate mortality outcomes for Black patients.
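To make that 12 percent versus 4 percent comparison concrete, here is a minimal Python sketch of how one might tabulate per-group “misread” rates from paired oximeter and arterial blood gas readings. The field names, thresholds, and toy readings are assumptions for illustration only; they are not the cited study’s actual definitions or data.

```python
from collections import defaultdict

def subgroup_misread_rates(readings, spo2_ok=92, sao2_low=88):
    """Estimate how often the oximeter misses low oxygen, per group.

    `readings` is a list of dicts with hypothetical keys: 'group'
    (self-reported race), 'spo2' (oximeter reading), and 'sao2'
    (arterial blood gas reference). A reading counts as a miss when
    the oximeter looks reassuring (>= spo2_ok) while the blood gas
    shows true hypoxemia (< sao2_low); these thresholds are
    illustrative, not the study's exact definition.
    """
    totals = defaultdict(int)
    misses = defaultdict(int)
    for r in readings:
        totals[r["group"]] += 1
        if r["spo2"] >= spo2_ok and r["sao2"] < sao2_low:
            misses[r["group"]] += 1
    return {g: misses[g] / totals[g] for g in totals}

# Toy paired readings, not real patient data.
sample = [
    {"group": "Black", "spo2": 94, "sao2": 86},
    {"group": "Black", "spo2": 95, "sao2": 85},
    {"group": "Black", "spo2": 97, "sao2": 95},
    {"group": "white", "spo2": 93, "sao2": 92},
    {"group": "white", "spo2": 96, "sao2": 87},
    {"group": "white", "spo2": 98, "sao2": 96},
]
print(subgroup_misread_rates(sample))
# roughly {'Black': 0.67, 'white': 0.33} on this toy sample
```

Even a tally this simple makes the fairness question visible: the same device, evaluated only on its aggregate accuracy, would hide the gap between groups.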

Secondly, Zou and Schiebinger propose both short-term and long-term technical solutions for achieving ‘fair’ technologies, which align with my perspective. They argue that monitoring medical AI algorithms after deployment is an essential short-term technical solution for ensuring safe and unbiased use (Zou and Schiebinger). Over time, a deployed model’s performance can change as it shapes user behavior, creating a feedback loop. As a result, the deployed algorithm may drift away from, and become biased relative to, the data it was trained on, revealing vulnerabilities and inconsistencies and showing why monitoring medical AI algorithms is necessary for achieving ‘fair’ technologies. Zou and Schiebinger’s argument aligns with my claim that fair technologies are possible, but only when we account for the behaviors and decisions of both humans and AI. Although algorithms are built to produce optimal predictions, they lack the common sense and intuition that are necessary when weighing fairness.
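As a rough illustration of what such post-deployment monitoring could look like, the sketch below compares a deployed model’s live error for each demographic group against a validation-time baseline and flags drift. The record layout, baseline values, and tolerance are hypothetical; Zou and Schiebinger do not prescribe a specific implementation.

```python
import statistics

def monitor_subgroup_error(records, baseline_mae, tolerance=0.05):
    """Flag subgroups whose live error drifts past the validation baseline.

    `records` is an iterable of (group, prediction, outcome) tuples
    collected after deployment; `baseline_mae` maps each group to the
    mean absolute error measured at validation time. Both the record
    layout and the drift rule are assumptions for illustration.
    """
    by_group = {}
    for group, prediction, outcome in records:
        by_group.setdefault(group, []).append(abs(prediction - outcome))

    alerts = {}
    for group, errors in by_group.items():
        live_mae = statistics.mean(errors)
        if live_mae > baseline_mae.get(group, float("inf")) + tolerance:
            alerts[group] = live_mae
    return alerts  # groups whose live error exceeds baseline + tolerance

# Toy post-deployment stream, not real clinical data.
stream = [("A", 0.90, 0.88), ("A", 0.95, 0.80), ("B", 0.92, 0.91)]
print(monitor_subgroup_error(stream, baseline_mae={"A": 0.03, "B": 0.03}))
# flags group "A", whose live error has drifted above its baseline
```

The point of tracking error per group, rather than one overall number, is that a feedback loop can degrade performance for a minority of patients long before it shows up in the aggregate.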

Furthermore, researchers Kristian O. R. Myrseth and Conny E. Wollbrant examine whether fair behavior is intuitive and conclude that “since a decision that relies on intuition is typically made faster than a decision that relies on deliberation [it]. . . provides an important indication of the intuitiveness of fair behavior” (Myrseth and Wollbrant). Human judgment in monitoring AI algorithms after deployment, together with human intuition, is therefore significant, opening the possibility of achieving ‘fair’ technologies rather than simply depending on them. Considering both humans and AI, monitoring pulse oximeters across different populations is critical for ensuring safe, unbiased use and for avoiding the exacerbation of existing structural health disparities.

Thirdly, a long-term structural solution proposed by Zou and Schiebinger is to embed ethical reasoning into curricula and team development so that organizations, agencies, and researchers can probe social questions and propose structural solutions and policies. Universities have begun taking social responsibility for Artificial Intelligence by reforming their curricula. Organizations such as the American Medical Association have stated that Artificial Intelligence for health care should be “thoughtfully designed, high-quality, [and] clinically validated” (Zou and Schiebinger). Similarly, the American Heart Association has issued a “call to action” to overcome structural racism, including a “structural racism and health equity language guide” (Zou and Schiebinger). Moreover, the FDA sets out five criteria for excellence in its Digital Health Innovation Action Plan, which recommends expanding current FDA guidelines for software through reviews of ML health tools, including “an analysis of health disparities in the clinical domain of interest; a review of training data for bias; transparency surrounding decisions made regarding model performance, . . .and post-market review of health equity outcomes” (Zou and Schiebinger). This framework not only assesses how humans and AI work together; it also supports my claim that when organizations and researchers make decisions, especially in medicine, they must consider the nature of fairness, since every individual holds unique values and beliefs that may not align with an algorithm’s predicted outcomes. Companies and organizations should pursue innovation while upholding fairness. Zou and Schiebinger also discuss Barbara Grosz et al.’s example of integrating ethical reasoning into class sessions across the computer science curriculum, which can be beneficial in “habituat[ing] students to thinking ethically as they develop algorithms and build systems” (Grosz et al.). By pairing computational training with relevant material in computer science and medical ethics, researchers can better recognize where Artificial Intelligence reinforces social inequities and can propose structural solutions. Indeed, the coexistence of humans and AI in work settings should be structured for the success of both parties. It is therefore crucial for organizations and agencies to formulate models of fairness in the workplace that do not simply defer to AI’s decisions, but instead develop real structures and systems that approach fairness for both.

Fourthly, another short-term technical solution presented by Zou and Schiebinger is to increase the diversity of medical data resources, since many commonly used datasets fail to properly represent outliers, including minorities and disadvantaged populations. Especially for devices like pulse oximeters, data collection is therefore an essential step in developing and evaluating medical AI algorithms that aim to be ‘fair.’ Zou and Schiebinger remind the audience that the physical biases in pulse oximeter data feed into algorithms that increasingly guide hospital decisions; when patients with darker skin are underrepresented in that data, performance differs across demographic groups. They point to the Stanford Skin of Color Project, an effort to update representation and increase the diversity of medical data resources in order to combat institutional racism within medical education. Researchers Yusef Yousuf and Jaime C. Yu likewise noted the lack of representation in dermatology medical education and raised concerns “as numerous diseases have cutaneous manifestations that differ in darker skin tones and thus can impact presentation and outcomes” (Yousuf and Yu). Their objective was to remedy the gap between the overrepresentation of white skin tones and the underrepresentation of darker skin tones in medical textbook imagery. Even small, subtle actions can mitigate bias, improve patient satisfaction in healthcare, and reduce health disparities (Zou and Schiebinger). Acknowledging underrepresentation is a critical step in increasing the diversity of medical data resources, and it supports my argument that we must consider how fairness is perceived by different minority, ethnic, and racial groups, and ensure that technologies address outliers, not just commonalities, to maintain fairness for all groups. Therefore, when organizations make decisions, it is important to assess algorithm design, evaluation metrics, and fairness through the lens of both human and technical qualities.
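A minimal first step toward that kind of dataset diversification is simply measuring it. The sketch below tallies how each skin-tone label is represented in an assumed metadata column and flags groups that fall under a threshold; the labels and the 10 percent cutoff are illustrative assumptions, not a clinical standard.

```python
from collections import Counter

def representation_audit(groups, min_share=0.10):
    """Report each group's share of a dataset and flag underrepresentation.

    `groups` is a list of group labels, one per record; `min_share` is
    an illustrative threshold, not a clinically validated cutoff.
    """
    counts = Counter(groups)
    total = sum(counts.values())
    shares = {g: n / total for g, n in counts.items()}
    flagged = [g for g, s in shares.items() if s < min_share]
    return shares, flagged

# Toy label column standing in for image metadata in a dermatology set.
labels = ["light"] * 90 + ["medium"] * 25 + ["dark"] * 10
shares, flagged = representation_audit(labels)
print(shares)   # share of records per skin-tone label
print(flagged)  # ['dark'] on this toy set, since it falls below 10%
```

An audit like this does not fix underrepresentation, but it makes the gap explicit before a model is trained on the data, which is exactly where Zou and Schiebinger locate the risk.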

Overall, Artificial Intelligence seems to run the show today. From social platforms such as Facebook to clinical genomics and image and drug screening in biotechnology, Artificial Intelligence has reshaped organizations, businesses, and societies. Despite its extensive benefits, it presents a different set of challenges as humans become more dependent on its decisions, affecting the state of fairness. People and AI each bring unique strengths and weaknesses to the table, which makes fair technologies possible so long as we account for the behaviors and decisions of both. Researchers James Zou and Londa Schiebinger examine the potential impacts biomedical Artificial Intelligence can have on human health, from instrumentation and data collection to individual health assessment. Their research, drawing on biomedical devices such as pulse oximeters, illustrates how algorithmic bias and health disparities can surface from inadequate outcome choice, data collection, and model evaluation.

It is critical to consider the nature of fairness, as every individual holds unique values and beliefs that may not align with an algorithm’s predicted outcomes. It is also important to consider the mindset we bring to the relationship between Artificial Intelligence and humanity: companies and organizations should confront social, technical, and political systems to build a model of fairness that addresses both the company’s goals and the consumer’s need for ‘fair’ technologies, rather than an algorithm-dependent mindset. And lastly, we should consider how fairness is perceived by different minority, ethnic, and racial groups, and ensure that technologies address outliers, not just commonalities, to maintain fairness for all groups.

Looking at future trajectories for Artificial Intelligence, organizations need to recognize how their technologies affect all users. As mentioned, it is essential to account for both humans and AI, as both parties bring qualities needed to achieve fairness. Achieving fairness is one of many concerns in genetics, biotechnology, Artificial Intelligence, machine learning, architecture, and beyond, and handling concerns such as inclusion and diversity is just as important. Diversity and inclusion are more than policies and programs. They reflect a deeper trust and commitment that engages researchers, business leaders, and consumers, and they create a sense of belonging in an inclusive environment. That’s the beauty of society.

Works Cited

Grosz, Barbara J., et al. “Embedded EthiCS.” Communications of the ACM, vol. 62, no. 8, 2019, pp. 54–61, https://doi.org/10.1145/3330794.

Myrseth, Kristian O. R., and Conny E. Wollbrant. “Commentary: Fairness Is Intuitive.” Frontiers in Psychology, vol. 7, article 654, 9 May 2016, https://doi.org/10.3389/fpsyg.2016.00654.

Yousuf, Yusef, and Jaime C. Yu. “Improving Representation of Skin of Color in a Medical School Preclerkship Dermatology Curriculum.” Medical Science Educator, vol. 32, no. 1, 2021, pp. 27–30, https://doi.org/10.1007/s40670-021-01473-x.

Zou, James, and Londa Schiebinger. “Ensuring That Biomedical AI Benefits Diverse Populations.” EBioMedicine, vol. 67, 2021, 103358, https://doi.org/10.1016/j.ebiom.2021.103358.
