Updated: May 3
Data and data-driven technologies are playing an increasingly influential role in health care, helping to detect disease earlier, move care closer to home, encourage health-promoting behaviours, and improve the efficiency of service delivery. Although data-driven technologies have potential for good, they can also exacerbate existing health inequalities, which are deep-rooted and have been laid bare during the COVID-19 pandemic.
In this Comment, we examine how structural inequalities, biases, and racism in society are easily encoded in datasets and in the application of data science, and how this encoding can reinforce existing social injustices and health inequalities. Approaching the problem from the perspective of data scientists, we follow the stages in an analytical pipeline to consider how and where things can go wrong. We then outline the essential role of data scientists in tackling racism and discrimination.