Awakening to the Realities of AI in Healthcare

By Ellen Nantau

I was standing in an airport customs line when I realized that I wanted to be a physician. An announcement sounded overhead – an employee asking if anyone present was a doctor. My parents, both in medicine, shared a look. We were already at risk of missing our connecting flight and neither of them was currently in general practice or emergency medicine. When no one else stepped forward, my father pushed through the nearby crowd that had formed around a seated woman, crouched to her level, and told her that he was a doctor.

It wasn’t a heroic moment, not in a traditional sense. Neither of my parents dropped everything and sprinted into the action as soon as they got the call. But the normalcy of the moment for my father is what made the crowd’s response impressive, what made me sure that I was witnessing something special. The people dispersed, and the air of the room changed. All Dad did was say “I’m a doctor,” and there was calm.

In the end, the woman was fine, we missed our flight, and I became a pharmacist. No airport has ever required the presence of a chemist. If I ever fly somewhere with my own children, they’ll make their flight. But what I learned on that day from my father and what he has continued to teach me is that by being in healthcare, you can improve the lives of other people, just by doing your job. I went into healthcare, because healthcare is about people, and I opened an independent pharmacy so that I could emphasize the provision of patient-centred care.

At this point, it would be fair to ask what any of this has to do with artificial intelligence (AI). So far, the answer is “nothing,” and perhaps this should remain the case. In the 21st century though, AI has pervaded all aspects of daily life in the Western world, including our healthcare system. Even during the past few months, when COVID-19 has dominated headlines, AI has managed to share the spotlight. Within weeks of the epidemics in China and Europe hitting the news, stories emerged of robots helping to care for infectious patients and of programs aiding in the search for a medical solution to the virus.

Once upon a time, these articles would have seemed ridiculous. It is true that technology has been infiltrating different industries for centuries; after all, the term “technological unemployment,” referring to job loss due to task automation, was coined by economist John Maynard Keynes almost a hundred years ago now. What may be changing since Keynes first described this phenomenon is the proportion and types of jobs impacted. Careers that require “human” traits such as creativity, imagination, communication, empathy, or compassion have historically been secure. Jobs involving healthcare, the arts, invention, and teaching could never have been managed by something non-human. Now, AI and neural networks have changed the scope of careers threatened, making Keynes’s term more relevant than ever.

With new technologies, computers outperform physicians at detecting early-stage breast cancers, robots provide companionship to nursing home patients, and AI is credited with the composition of a musical album. Technology can learn, empathize, and create. Or rather, thanks to recent advances, AI possesses the appearance of these abilities. Some people may think it esoteric to debate whether these technologies actually imbue AI with empathy and creativity or merely the illusion of such; however, the distinction is important when deciding if AI belongs in social domains such as healthcare. Do we want something that only appears to be compassionate caring for our loved ones when they are at their most vulnerable? Perhaps the illusion of empathy would suffice if healthcare were only about treating sickness, but medicine involves more than diagnosing disease and prescribing medication. It has taken the Western world centuries to understand that the practice of medicine must focus on people rather than disease, and embracing AI in healthcare would be a blow to the hard-won progress we have made towards patient-centred healthcare.

The Medical Patriarch

In healthcare, paternalism refers to the traditions of a none-too-distant past in which the doctor stood above all, at the tip of a pyramid composed of other healthcare workers and borne on the backs of patients and their family members. The model makes sense in a way. Doctors have been to medical school, providing them the tools needed to make informed medical decisions. Besides, practicing medicine is inherently godlike, offering a command over life-and-death decisions that few other people will ever experience. It is understandable that many societies have viewed doctors with awe.

Whatever the original reasons behind the doctor-first model of Western medicine, its ethical and practical shortcomings have led the healthcare system in Canada to transition towards a patient-centred model of care. In this model, patients are directly involved in decisions regarding their treatments. They stand at the centre of their own wheels, with spokes composed of the patient’s healthcare team. This team, which includes doctors, pharmacists, nurses, and others, attempts to consider factors unique to the particular patient when making recommendations and providing information. Patient autonomy usurps the healthcare provider’s pedestal, and the provider “tries to enter the patient’s world, to see the illness through the patient’s eyes.”

Patient-centred care represents a drastic shift in Western healthcare practices that began only decades ago, when the views of early psychologists, who treated humans with personalities, histories, and emotions, began to seep into medicine. Slowly, patients ceased to be scientific specimens, clusters of pathophysiology and symptomology to be examined. Medicine became a study of people. Taking this a step further, Michael Balint, both psychoanalyst and medical doctor, coined the idea of the “doctor as drug.” He theorized that the doctor-patient relationship was a therapeutic tool, and like any drug, it could be beneficial or harmful depending on the nature of the relationship. With insight like Balint’s, the healthcare fields are learning to merge Hippocrates’ oath with an understanding of the importance of empathy and humanism. As a direct result, the provider-patient relationship is shifting from the caring yet dominating status quo of parent and child, to a meeting of equals – the doctor being the expert in medicine and the patient the expert on self.

Despite the shift in medical hierarchy, recent years have brought new peril to patient-centred care. This time, the threat is not from our own egos and the resulting belief that healthcare workers know best, but from something inhuman: the algorithm. As one physician complained in a 1982 editorial in the American Journal of Public Health, “…there have been increasing attempts to transform the ‘art’ of medical decision-making into a ‘science,’ to supplement a spontaneous, informal, and implicit set of judgments with the conclusions of a predetermined… scheme of logic.” Whether or not this physician is correct regarding the inherent evils of standardization, his comment was prescient. Since his time, algorithms have come to play a dominant role in the practice of healthcare. Open any recent medical textbook and you will find standardized methods for triaging, diagnosing, and treating every condition from hay fever to hepatitis. And while pre-designated treatment plans have undeniable benefits, they encourage practitioners to once again think of patients as clusters of symptoms and to follow steps designed to suit the disease rather than the human.

Algorithms, including those that provide recommendations for medical treatment plans, are created around study results and expert experience, both of which are based on probability and statistics. While probability is wonderful for predicting trends in populations, it is not suited to predicting the behavior of an individual. Take height as an example. If I want to know the average height of a Nova Scotian, it would be ridiculous for me to measure everyone living in Nova Scotia. Instead, I might measure a hundred people and assume that their average height reflects the average of the entire population of the province. The larger my sample is, the more likely that it will represent the population it is drawn from. On the other hand, I cannot take the average height of my sample and use it to draw simultaneously precise and accurate conclusions about every individual Nova Scotian.
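The height example can be made concrete with a short simulation. This is only a sketch under stated assumptions: the population below is invented (a bell curve with a made-up mean of 170 cm and spread of 10 cm), it is not real Nova Scotian data, and the function names are hypothetical. It compares two gaps: how far a sample's average sits from the population's average, and how far a typical individual sits from that same average.

```python
import random
import statistics


def simulate_heights(population_size=100_000, mean_cm=170.0, sd_cm=10.0, seed=0):
    """Build an invented population of heights (all numbers are assumptions)."""
    rng = random.Random(seed)
    return [rng.gauss(mean_cm, sd_cm) for _ in range(population_size)]


def sample_mean_error(population, n, seed=1):
    """Gap between the mean of a random sample of size n and the population mean."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)
    return abs(statistics.fmean(sample) - statistics.fmean(population))


def typical_individual_error(population):
    """Average distance between one person's height and the population mean."""
    mu = statistics.fmean(population)
    return statistics.fmean([abs(h - mu) for h in population])


if __name__ == "__main__":
    people = simulate_heights()
    for n in (10, 100, 1_000, 10_000):
        print(n, round(sample_mean_error(people, n), 2))
    print("typical individual:", round(typical_individual_error(people), 2))
```

In a simulation like this, the first gap tends to shrink as the sample grows, while the second does not depend on the sample at all. That is the distinction between describing a population and describing a person.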

Statistical results from drug trials are even more limited in their applicability to real-world patients, because most clinical trials are done on near perfect subjects. The participants in these studies are not too old and not too young. They have no medical conditions or medications besides the one being studied that can peskily interfere with results. Believe it or not, real patients are rarely perfect. They are pregnant, have multiple comorbidities, or are on multiple interacting medications. Because of this, the samples used in these studies may not accurately represent the populations we care about, meaning that the outcomes of a clinical trial may have little application to the people most likely to require the drug or treatment being tested. This is before we consider the idiosyncrasies that make an individual patient, an individual. These can include financial factors and the varying degree of importance that different people will place on the potential risks and benefits that accompany any drug or medical procedure. In short, algorithms offer organized ways to attack a medical dilemma, but they should not be used for more than an initial overview of options for the healthcare practitioner’s consideration, lest the patient get lost in a sea of numbers and probabilities, and we replace our previous misconception of the patient as a grouping of symptoms with the equally damaging and reductive view of a patient as a set of statistics.

Increasingly, we are being encouraged to accept AI (simplistically speaking, a glorified algorithm) into our everyday lives. AI can supposedly provide the same comfort as a pet and the support of a home-care worker. Why then, can’t AI be a doctor, or a pharmacist, or a nurse, or an occupational therapist? It has already proven to do a better job of interpreting mammograms and predicting the development of breast cancers than a human doctor can. In pharmacy, we rely on it to make sure that we have not missed potentially harmful drug interactions. However, what makes AI ideal for reading a diagnostic test does not enable it to tailor a treatment option to a particular patient or offer comfort to an individual.

I Said This Had Something To Do With AI…

Much of the hype surrounding AI in recent years is attributable to deep learning and the neural networks on which it relies. The previously mentioned mammography example is made possible by neural networks, as is a similar story of an AI reading EKGs to predict cardiac-related deaths.

Designed to enable computers to perform the same tasks as a human being, neural networks were modeled based on our (limited) understanding of the human brain. They are intended to acquire knowledge from their environment, dubbed “learning,” and to store the acquired knowledge in so-called “interneuron connections” (Haykin, 2009). Basically, evolution should have copyrighted our brains.

In the simplest neural networks, the initial layer of neurons takes in information and passes it to a second set of neurons. These inner neurons apply specific transformations to the data before passing the revised information to a final level of neurons, which outputs the predicted results. During training of the network, expected outcomes are known and can be compared to the result predicted by the network. The difference (“loss”) or error between the calculated answer and the known answer is sent backward through the network. Neurons all receive unique information regarding the loss, based on how much their action contributed to the network’s output. In other words, each neuron is told how much it contributed to the error, allowing the network as a whole to “learn” how it needs to treat inputted data in order to generate the appropriate result. When this process is repeated enough times, the network can take inputs it has never been given before and make near-perfect predictions. These predictions are based on the probability that the examples to which the neural network has previously been exposed are representative of the current set of data. If so, the network can respond to new situations. If the initial data inputted was well-chosen by the humans responsible, then the network can even respond to new situations appropriately. This is how a computer can take in thousands of EKGs representing known normal and abnormal heart function and use this information to not only predict the meaning of EKGs it has never seen before but to do so with better accuracy than any human. After all, my brain can’t remember where I put my keys, let alone the result of every EKG I have ever seen.
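For readers who want to see those steps spelled out, here is a minimal sketch of such a network in plain Python. It is an illustration, not the system behind the mammography or EKG results. The toy data, the sigmoid transformation, the squared-error loss, the learning rate, and every name in the code are assumptions chosen to keep the example small.

```python
import math
import random

# Invented toy data: two yes/no inputs, and the answer is 1 if either is 1.
TOY_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """Two inputs -> a few inner (hidden) neurons -> one output neuron."""

    def __init__(self, hidden=3, seed=0):
        rng = random.Random(seed)
        # Each inner neuron holds two input weights plus a bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
        # The output neuron holds one weight per inner neuron plus a bias (last slot).
        self.w_out = [rng.uniform(-1, 1) for _ in range(hidden + 1)]

    def forward(self, x):
        """Pass the inputs through the inner neurons to the output neuron."""
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in self.w_hidden]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w_out, h)) + self.w_out[-1])
        return h, y

    def train_step(self, x, target, lr):
        """Compare prediction with the known answer and send the error backward."""
        h, y = self.forward(x)
        error = y - target
        # Share of the error attributed to the output neuron.
        delta_out = error * y * (1 - y)
        # Share attributed to each inner neuron, based on its contribution.
        delta_hidden = [delta_out * self.w_out[j] * h[j] * (1 - h[j])
                        for j in range(len(h))]
        for j in range(len(h)):
            self.w_out[j] -= lr * delta_out * h[j]
        self.w_out[-1] -= lr * delta_out
        for j, w in enumerate(self.w_hidden):
            w[0] -= lr * delta_hidden[j] * x[0]
            w[1] -= lr * delta_hidden[j] * x[1]
            w[2] -= lr * delta_hidden[j]


def mean_loss(net, data):
    """Average squared-error loss over a set of (inputs, known answer) pairs."""
    return sum(0.5 * (net.forward(x)[1] - t) ** 2 for x, t in data) / len(data)


def train(net, data, epochs=2000, lr=0.5):
    """Repeat the compare-and-correct step many times over the same examples."""
    for _ in range(epochs):
        for x, target in data:
            net.train_step(x, target, lr)
```

Calling `train(net, TOY_DATA)` on a fresh `TinyNetwork()` and comparing `mean_loss` before and after shows the loss falling as the weights adjust. Nothing in the loop knows what the inputs mean; it only reduces the average error over the examples it was given.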

Oliver Sacks Is Rolling In His Grave

My husband and I joked about writing a facetious book: How to Use Statistics for Evil. We never did, but the point is that there are enough ways to misuse statistics and probability that one could easily fill a book. Sample size, or “n,” provides a chapter all its own. Often, healthcare practitioners are taught to be suspicious of any study with a small sample size. Take the previous example of trying to find the average height of Nova Scotians. In that case, studies with higher n values are more likely to accurately reflect the population from which the study sample is drawn. If I were to take the time to measure everyone who lives in the province, my calculated average would be perfectly accurate. In addition, every Nova Scotian would be represented in my data, meaning that my data would be applicable to each of them. From the perspective of a medical study, the likelihood that a study’s results will apply to my particular patient increases as the number of patients who participated in the study increases.

Based on all of that, big n = good. Unfortunately, quod erat non demonstrandum. High sample sizes have downsides of their own. As sample sizes increase, each individual’s data contributes less to the overall results of a study. If I poll ten people regarding their preferences for sausages or burgers, each subject is contributing 10% of my data; however, if I include 100 subjects, each of their answers carries less weight, contributing only 1% of the data. As n goes up, we get results that are more generalizable to every member of the population and increase the likelihood that our data will somewhat fit everybody. On the flip side, we lose the idiosyncrasies and individuality conveyed by individual responses.
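The arithmetic behind the sausages-or-burgers poll can be sketched in a few lines of Python. The function names are hypothetical and the poll responses are invented; the only point is to show how one respondent's weight shrinks as n grows.

```python
def share_of_data(n):
    """Fraction of all responses contributed by one respondent out of n."""
    return 1 / n


def shift_from_one_answer(n):
    """How far the proportion preferring burgers moves if one respondent switches.

    Responses are invented: 1 means burgers, 0 means sausages.
    """
    before = [1] * (n // 2) + [0] * (n - n // 2)
    after = before.copy()
    after[-1] = 1  # one sausage fan changes their mind
    return sum(after) / n - sum(before) / n


if __name__ == "__main__":
    for n in (10, 100, 1_000):
        print(n, share_of_data(n), shift_from_one_answer(n))
```

With ten respondents, one changed answer moves the result by a tenth; with a hundred, by a hundredth. The larger poll is steadier for exactly the reason that each person in it matters less.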

The effect of averaging to a population is easily seen in the clothing that we purchase. The clothing carried in stores is sized around the average human body that buys from retail. Because of this, almost anyone can walk into almost any clothing store and find a shirt that will cover his or her torso. Almost no shopper will find that a particular shirt fits perfectly though. Each item, with its sizes ranging from extra-small to extra-large, will fit most body-sizes somewhat, but most people will need a nip here or a tuck there to get a custom-looking fit.

What if we are talking about drugs or other medical treatments instead of clothing? It’s one thing for a shirt that is a little large on me to look boxy, but any medical treatments I receive should be, as closely as possible, a perfect fit for my physiology, my lifestyle, my beliefs, and my disease. Human healthcare workers take the information provided by studies – how well a treatment works for the average person with the average course of a disease – and try to apply the information to their patients. We act like a tailor, adjusting drug strengths or dosing directions or medication selections to fit a particular patient as well as possible. In the words of Oliver Sacks, a British physician and author of Awakenings, “One must drop all presuppositions and dogmas and rules… one must cease to regard all patients as replicas, and honor each one with individual reactions and propensities; and, in this way, with the patient as one’s equal, one’s co-explorer, not one’s puppet, one may find therapeutic ways which are better than other ways, tactics which can be modified as occasion requires” (1999).

A neural network, designed around probabilities, cannot do the required nips and tucks. It learns by finding the commonalities between bits of data and replicating its treatment of subjects until it has an algorithm that can be applied to new scenarios. It is intended to iron out the individual propensities and find the solution that works in every scenario. The CATIE study provides the perfect example as to why such methods are not practical in healthcare. This trial featured over 1400 participants across fifty-seven sites. It provided information never available to clinicians before: a direct comparison of the efficacy of certain antipsychotic medications based upon how participants responded to the medications. Secondarily, the study’s investigators noted which antipsychotics were best tolerated on average and which were more likely to cause side effects. The results of even this large-scope study do not give a clear-cut answer as to which of the drugs featured is the ‘best’ for psychosis though. Even the least well-tolerated intervention saw 81% of subjects remain on it, while even the best-tolerated treatment lost 10% of patients to discontinuation.* Efficacy was similarly murky in that no drug worked for every person tried on it, and we cannot yet tell who will respond best to what drug. The results of the CATIE study also cannot tell us if a particular patient will prefer the drug that can on average cause kilos of weight gain or the one that might lead to movement disorders – an uncontrollable smacking of the lips, a debilitating tremble, or an insoluble inner restlessness. Neither choice is right or wrong. Will the patient’s answer change if he works as a lawyer or a lecturer, where incessant smacking of his lips might slur his words? What if his father passed away at an early age from a heart attack? In that case, a drug that could worsen his cholesterol levels or blood sugar might present, to his mind, an unacceptable risk.

When I consider a patient’s options, I take what I know about them – their family lives, their hobbies, their jobs, their history – and I present the relevant medical information in a way that will be meaningful to them. Good communication is of utmost importance, as is sharing the information in a way that is applicable to the patient’s own life – a way that allows for deep comprehension of available options, rather than just a vague sense of understanding. The job of healthcare workers is to enable patients to make a choice that is appropriate for them. If we stop at the studies, at statistics and probability, we have failed them. As William Osler, a founding father of modern medicine, said, “the good physician treats the disease; the great physician treats the patient who has the disease.” Even with its anthropomorphous appearance of empathy, artificial intelligence will never be great in the field of medicine while its “understanding” of the human condition is reliant on thousands of case studies amalgamated via the melting-pot of probability.

When a patient wants a second opinion, it is because there is a chance that two doctors will say different things, even given the same schooling and the same expertise. The same cannot be said if those two doctors each purchased the same AI-driven software and relied on it for their answer. This isn’t because one of the doctors in the first scenario is wrong but because there sometimes is no right answer in medicine. It is all shades of grey and comes down to guiding the patient and supporting them while they make the best choice for them. And until machines cease to work entirely off of probability, they are not equipped to deal with humans and all of their shades of grey. Their black-and-white treatment of data is a loss of patient autonomy without ever removing the right of the patient to choose.

*Numbers solely reflect drop-out rates due to treatment intolerability. Overall dropout rates (that take into account treatment efficacy and other factors) are higher.


Ellen Nantau obtained her pharmacy degree from Dalhousie University in Halifax, Nova Scotia. After a few years of working as a community pharmacist, Ellen opened Tiny Oak Pharmacy, aiming to foster an environment ideal for providing patient-centred care. Ellen’s personal interest in mental health led her to finish her degree in psychology at Saint Mary’s University in Halifax, where she was introduced to discussions regarding the intersection of healthcare and technology.

Image: SRI International, CC BY-SA 3.0, via Wikimedia Commons