Data Science in the Health Care Industry
By Kat Campise, Data Scientist, Ph.D.
The rising costs of health care in the U.S. are reaching a critical juncture. Indeed, the U.S. currently has the highest health care expenditures in the world, and health care spending comprises roughly 18% of U.S. GDP. The primary factors contributing to the steady yet costly increases are the U.S. Medicare system, the perpetual uptick in costs for health care goods and services, and an upsurge in insurance premium rates, particularly those associated with the Affordable Care Act (ACA). Certainly, one upside of the ACA is that a more significant portion of the U.S. population has health insurance. But this fact has not stemmed the ballooning price tag that is becoming an obstacle to health care access.
The Insurance Middlemen
There are now many layers between the individual’s health and the frequent life-or-death decision making that occurs at the insurer level of the medical system. While in one sense it seems logical for insurance companies, which are attempting to control costs, to act as gatekeepers for approving medical treatments, the process is frequently overseen by individuals who are neither licensed medical personnel nor experts in any field of medicine. Checks and balances between the medical profession and the pooled risk insurance organizations are one thing, but the heavy compliance constraints placed on medical professionals (including hospitals) are one of the central dysfunctions of U.S. health care. For example, the fax machine is still an often-used method for gaining pre-authorizations for medical treatment. In the 21st Century, where just about everything is now exchanged in digital format, it’s astonishing that such an antiquated method, which can seal the fate of one’s medical treatment, continues to be used. Doctors, physician assistants, and other medical office staff spend upwards of 20 hours per week communicating and coordinating with the different health insurers, as each has varying degrees of coverage (or no coverage) for a vast array of procedures and treatments.
Lifestyle Choices and Chronic Disease
Although one’s genetics definitely play a part in the physiological matrix, lifestyle factors are a large contributor to chronic diseases such as heart disease, diabetes, and cancer. Smoking, excessive alcohol consumption, an unhealthy diet, and little to no exercise are risk factors that can be mitigated at the individual level to reduce the likelihood of chronic disease. Ideally, those who take great care of their health would pay less for the health care services they use less frequently. This is not the case, as plenty of other individuals continue to engage in high-risk behaviors with little regard for their health. Surely, freedom of personal choice is a principle to be lauded. However, it’s not just the individual who pays the price in terms of health care costs. Everyone making payments into the pooled risk system will also see price increases for behaviors they are actively avoiding so as to keep themselves disease free.
How Can Data Science Help Improve the U.S. Health Care System?
Technology in general, and machine learning or AI specifically, may provide solutions to both the costliness and disjointedness of U.S. health care. Massive amounts of data flow through the health care system and can be leveraged by data scientists to improve the outcomes of health care patients, help modify behaviors that place individuals at higher risk for chronic disease, boost the deployment of precision medicine, and streamline the digital sharing of patient records while maintaining HIPAA compliance.
Smart Rooms
In 2005, IBM began collaborating with the University of Pittsburgh Medical Center to design a hospital “smart room” where interconnected devices would help to simplify the front-line staff’s workflow. Everything from voice-activated temperature controls and alerts to nurses when a patient has left their bed to identifying staff as they walk into a patient’s room (and automatically pulling up the patient’s medical records for the approved medical team) has been proposed as a feature for the smart rooms. By deploying machine learning and AI algorithms, the series of tasks for a caregiver, based on their assigned role in patient care, will be analyzed and automatically prioritized in connection with the individual patient’s condition and treatment protocol. Sophisticated algorithms can also monitor caregiver workloads, alert patient care management when staffing levels are likely to need an increase or when routine work is likely to fall behind schedule, and automatically re-assign workloads to available medical staff. Per IBM’s white paper, such implementations show over a 60% improvement in nursing documentation. One of the primary functions of most, if not all, data scientists is to create predictive algorithms, which are at the center of a fully functioning smart room facility. While data scientists don’t construct the front-end technological tools (e.g., the patient monitors, caregiver monitors, and other medical devices), the algorithms required for responding to human interaction, predicting and modifying human behavior, and making recommendations (or taking an action, such as automatically re-assigning a caregiver’s tasks) are within the data science domain.
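The task prioritization and workload re-assignment described above can be illustrated with a minimal sketch. It is not IBM’s or UPMC’s implementation: the Task fields, the urgency scale (lower number means more urgent), the caregiver names, and the max_load cap are all hypothetical, and a real system would predict priorities from patient data rather than take them as given.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    # Hypothetical urgency scale: a lower number means more urgent.
    priority: int
    description: str = field(compare=False)
    patient_id: str = field(compare=False)


def prioritize(tasks):
    """Return the tasks ordered from most to least urgent using a heap."""
    heap = list(tasks)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def reassign(tasks, workloads, max_load):
    """Give each task to the least-loaded caregiver.

    workloads maps caregiver name -> number of open tasks (at least one
    caregiver is assumed). Tasks that cannot be placed because everyone
    is at max_load are returned separately, as a signal that staffing
    may need to increase.
    """
    loads = dict(workloads)  # copy so the caller's dict is not modified
    assignments = {}
    unassigned = []
    for task in prioritize(tasks):
        caregiver = min(loads, key=loads.get)
        if loads[caregiver] >= max_load:
            unassigned.append(task)
            continue
        loads[caregiver] += 1
        assignments.setdefault(caregiver, []).append(task.description)
    return assignments, unassigned
```

For example, calling reassign with three tasks of priority 1, 2, and 3 and two caregivers hands out the most urgent work first; any task left in the second return value is one that no available caregiver had capacity for.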
AI and Robotic Surgery
Surgery is one of the most intricate and risky fields within medicine. Depending on the type of surgical procedure, the patient may be on the operating table for an hour (sometimes less) or for many hours as the surgeon and their surgical staff labor to preserve the patient’s life before, during, and after the operation. However, a surgeon’s skill and physical wherewithal will vary (surgery is stressful, to some extent, for all involved). While it’s deeply uncomfortable, if not frightening, to think that surgeons can make mistakes, they do. They are human, after all. Enter the world of AI and robotics, which can monitor a surgeon’s movements, assist with precision decision-making by providing immediate feedback to the surgeon throughout the surgical process and for the patient post-surgery, and collaborate with the surgeon by performing specific surgical procedures. On the surface, this appears to be a straightforward algorithmic implementation. But human physiology is far more complex, and a compendium of data must be collected and analyzed for the patient, the surgeon, and the robotic functioning within this elaborate equation. This is where data scientists can bridge the gap between human and robotic interaction by building an intelligent algorithmic analytics system that perpetually self-updates based on the constant environmental data stream. In collaboration with medical and robotics experts, data scientists can do far more than create AI that teaches itself chess (and defeats professional chess players): data scientists can help save lives.
Wearable Technology and Behavior Modification
Fitbits, Apple Watches, heart rate monitors, and other medical devices or fitness trackers that give immediate feedback to users are already in widespread use. Millions of individual users track their steps, sleep patterns, water intake, macronutrients, blood glucose levels, and caloric expenditure by using their smartphones and related apps. Not surprisingly, the wearable technology market is predicted to reach $25 billion by 2019. In terms of behavior modification and its relation to health care costs, AI algorithms can be used to notify the user of the predicted likelihood that a behavior will not only increase the risk of developing a chronic health condition (in connection with health data that has already been collected for the individual user) but also increase their health care costs. Insurers and medical care providers can use this information to adjust health insurance premiums and co-pay costs automatically or to regulate a treatment protocol for an existing condition more precisely. Users still have the freedom to either move forward with the behavior or immediately discontinue it, but the user is promptly made aware of the financial and health consequences of their choice. For users who are already undergoing treatment, wearable devices and apps can notify them when it’s time to take their prescribed medications. If a patient is not adhering to their prescribed treatment, then health care providers can be alerted and, along with the AI warning sent to the user, quickly follow up with the patient to either adjust the medication or prompt them to adhere to their course of treatment. As with everything health care related, a one-size-fits-all method for analysis and prediction will not suffice. Therefore, data scientists and the medical profession are tasked with ensuring that the algorithm is able to individualize its responses in connection with all stakeholders involved: the patient, the health care provider, and the health insurer.
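As a rough illustration of the medication adherence alert described above, the sketch below compares a prescribed dose schedule against intake times logged by a wearable or app. Everything in it is an assumption made for the example: the function names, the 60-minute grace window, and the 80% adherence threshold below which a provider would be notified. A production system would individualize these values per patient and treatment.

```python
from datetime import datetime, timedelta


def missed_doses(scheduled, taken, grace=timedelta(minutes=60)):
    """Return the scheduled times with no logged intake inside the grace window.

    Each logged intake can satisfy at most one scheduled dose.
    """
    remaining = sorted(taken)
    missed = []
    for due in sorted(scheduled):
        match = next((t for t in remaining if abs(t - due) <= grace), None)
        if match is None:
            missed.append(due)
        else:
            remaining.remove(match)
    return missed


def adherence_alert(scheduled, taken, threshold=0.8):
    """Summarize adherence and decide whether the provider should be alerted."""
    if not scheduled:
        raise ValueError("at least one scheduled dose is required")
    missed = missed_doses(scheduled, taken)
    rate = 1 - len(missed) / len(scheduled)
    return {
        "adherence_rate": rate,
        "missed": missed,
        # Hypothetical rule: alert the provider below the threshold.
        "notify_provider": rate < threshold,
    }
```

With four daily doses scheduled and only two logged on time, the adherence rate is 0.5 and notify_provider is True, which is the point at which the follow-up described above would be triggered.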
Precision Medicine and Digital Health Records
One of the first topics in many machine learning courses is training an algorithm to identify and classify a series of images correctly. This is directly pertinent to medical imaging, where AI (initially trained and tested using machine learning algorithms) provides real-time analytics of a CT scan, X-ray, MRI, or another image type. This process can be taken several steps further by offering a predicted diagnosis, indicating whether additional tests are needed, specifying which tests should be included, and recommending a course of treatment. If insurers are involved, which is almost assured, then the patient’s medical file can be automatically updated for their insurance records, and certain treatments can be pre-approved. All of this information can then be delivered to the patient’s primary care physician, medical specialist, or any other health care provider that the patient has given prior authorization to access their private health care data. Ultimately, health care providers and the patient must still retain autonomous decision-making and work in collaboration with AI rather than being enslaved to an algorithm that, although created by humans, lacks the complete human experience of empathy and compassion. Data scientists are not merely algorithmic quantifiers confined in isolation from the results of their work. They are the human intermediary between the computational world of machines and the labyrinthine realm of human psychology and physiology. This reality should be kept in mind by anyone considering a data science career, and particularly by those who desire to make a positive impact on the lives of others through the health care industry.
Data Science Tools in Healthcare
Putting the hardware and software tools in data science aside for a second, the most underrated and least talked about data science tools are communication and collaboration. Within the health care community, data scientists must communicate with a variety of stakeholders: doctors, hospitals, insurers, patients, medical researchers, medical software vendors and programmers, data engineers, producers of medical equipment, and IT professionals, along with a plethora of other experts. Thus, being able to communicate data science concepts that others may not fully understand is essential. Furthermore, medical terminology and compliance issues that are specific to the industry must also be understood, as collaboration becomes challenging when stakeholders are not using a shared language. With regard to the hard skills and tools required, insurers, hospitals, and individual or small health care groups each utilize their own internal billing and health care database systems. Data scientists will still need SQL, MySQL, NoSQL, Python, R, or other query and programming language skills. But unless the data scientist has already worked within the health care sector, there will be a learning curve in coming up to speed on the current software systems as well as the medical coding and classification protocols used in the industry. Additionally, experience with natural language processing (NLP) and with image identification and classification is a must for analysis of patient records, disease prediction methods, and navigating the legalese of the insurance claims process. As always, in addition to advanced math and statistical knowledge, hands-on experience with machine learning, deep learning, and building AI algorithms is a non-negotiable requirement.
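To make the SQL and medical coding point concrete, here is a small sketch using Python’s built-in sqlite3 module. The claims table, its columns, and the rows are invented for illustration and do not reflect any insurer’s or hospital’s actual schema. The diagnosis values follow the ICD-10 coding format (E11.9 for type 2 diabetes without complications, I10 for essential hypertension).

```python
import sqlite3


def claims_by_diagnosis(rows):
    """Aggregate synthetic claims by diagnosis code.

    rows: iterable of (patient_id, icd10_code, billed_amount) tuples.
    Returns (icd10_code, claim_count, total_billed) tuples, with the
    highest total billed amount first.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Hypothetical schema, for illustration only.
        conn.execute(
            "CREATE TABLE claims ("
            "claim_id INTEGER PRIMARY KEY, "
            "patient_id TEXT, "
            "icd10_code TEXT, "
            "billed_amount REAL)"
        )
        conn.executemany(
            "INSERT INTO claims (patient_id, icd10_code, billed_amount) "
            "VALUES (?, ?, ?)",
            rows,
        )
        return conn.execute(
            "SELECT icd10_code, COUNT(*), SUM(billed_amount) "
            "FROM claims "
            "GROUP BY icd10_code "
            "ORDER BY SUM(billed_amount) DESC"
        ).fetchall()
    finally:
        conn.close()
```

The same GROUP BY pattern carries over to MySQL or another relational database. What changes in practice is the schema, which is specific to each organization’s billing system and is part of the learning curve mentioned above.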
Conclusion
Incorporating AI into the field of medicine can significantly lower health care costs by automating, and thereby streamlining, the communication between the patient, health care providers, and insurance companies. Health-related decisions at all levels can be reached more quickly, accurately, and with fewer layers of bureaucracy through the collaborative assistance of data scientists who’ve achieved expertise within the health care industry. If you’re interested in becoming a data scientist in health care, and you’ve not yet had any exposure to the industry, then one trajectory is to begin as a data analyst or a junior data scientist for a health insurer or another health care enterprise (e.g., a hospital). If you’re beginning your formal education as a data scientist, then make sure that you take coursework directly related to the sector, such as physiology, anatomy, medical terminology, and so forth. The more you know and understand before being hired within the industry, the less energy you will need to expend “learning on the job.” In sum, data science isn’t merely the “sexiest job of the 21st Century.” You can be a major influence on the health and well-being of possibly billions of individuals throughout the world.