How to adopt predictive modeling in healthcare painlessly

Healthcare providers spend the most on the smallest segment of highly expensive patients. For example, Kaiser Permanente in Southern California reported that 1% of its patients account for 29% of its total costs, at a whopping $98K per patient annually. The main reason for this imbalance is the lack of continuous focus on patients with chronic conditions, which results in ill-timed interventions and increased readmission rates.

To reorganize care delivery effort in an informed way, organizations turn to big data analysis, embracing the power of predictive modeling in healthcare. When applied properly, predictive modeling is a multitool for:

  • Managing population health
  • Identifying care needs
  • Preventing readmissions
  • Introducing cost-effective decision-making
  • Observing trends in care quality and further outcomes

However, while there is no shortage of needed data or custom healthcare software ready to tackle the challenge, the tough part is making this data actionable. It is necessary to see the full path ahead and get ready to apply the received insights right away, such as dedicating a care management team to contact high-risk patients and their relatives to schedule a screening test.

Although there is no one-size-fits-all approach to predictive modeling implementation, we have a few universal tips that will help healthcare organizations avoid the most common pitfalls along the way.

The essence of predictive modeling

Predictive modeling uses data mining and probability to identify patterns in data and recognize the chance of particular outcomes occurring. To build an accurate predictive model, developers define the problem, collect data, and compose different models. For most analytics goals, a combination of clinical and claims data is used.

Modeling Process

Each model is created with a range of predictors, that is, variables that affect future results. The model can use a simple linear equation or a complex neural network to perform advanced tasks. Data analysts run and assess different models to solve the defined problem. After the assessment, the model with the highest score is validated, tested, and run against a real-world dataset.
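The run-assess-select loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patients are synthetic, the two predictors (age and prior admissions), the candidate models, and all function names are hypothetical, and the Brier score is used as the assessment metric, so here the "highest-scoring" model is the one with the lowest error.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def scale(x):
    """Bring both predictors into a roughly 0..1 range."""
    age, prior_admissions = x
    return (age / 100.0, prior_admissions / 5.0)


def make_synthetic_patients(n, seed):
    """Generate toy (predictors, outcome) rows. Not real patient data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = (rng.randint(30, 90), rng.randint(0, 5))
        # Hypothetical ground truth: risk grows with age and prior admissions.
        a, p = scale(x)
        risk = sigmoid(-4.0 + 3.0 * a + 3.0 * p)
        rows.append((x, 1 if rng.random() < risk else 0))
    return rows


def fit_base_rate(train):
    """Candidate A: predict the same average risk for every patient."""
    rate = sum(y for _, y in train) / len(train)
    return lambda x: rate


def fit_logistic(train, epochs=300, lr=0.5):
    """Candidate B: logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0, 0.0], 0.0, len(train)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in train:
            xs = scale(x)
            err = sigmoid(b + w[0] * xs[0] + w[1] * xs[1]) - y
            gw[0] += err * xs[0]
            gw[1] += err * xs[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return lambda x: sigmoid(b + sum(wi * xi for wi, xi in zip(w, scale(x))))


def brier_score(model, rows):
    """Mean squared error of predicted probabilities; lower is better."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


rows = make_synthetic_patients(900, seed=7)
train, validation, test = rows[:500], rows[500:700], rows[700:]

candidates = {"base_rate": fit_base_rate(train), "logistic": fit_logistic(train)}
# Assess every candidate on data it was not trained on.
validation_scores = {name: brier_score(m, validation) for name, m in candidates.items()}
best_name = min(validation_scores, key=validation_scores.get)
# Report the selected model on a final held-out set.
test_score = brier_score(candidates[best_name], test)
print(best_name, round(test_score, 3))
```

Keeping a separate test split matters because the validation set has already been used to pick the winner. In practice the final check would be run against the organization's own real-world dataset rather than generated rows.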

Getting ready to embrace predictive modeling in healthcare

The HIMSS Analytics Adoption Model for Analytics Maturity (AMAM) and the Healthcare Analytics Adoption Model by HealthCatalyst are two similar models that introduce a step-by-step roadmap for adopting analytics in healthcare organizations.

Predictive analytics, together with its main techniques, including machine learning, data mining, and predictive modeling, stands at the next-to-last stage within both models. Accordingly, providers have to level up a few times to reach the needed analytics maturity rank with automated reporting, continuous data warehouse updates, and standardized patient records. Let’s be honest: some healthcare organizations may lag a few stages behind.

Healthcare Analytics Adoption Model

However, it is still possible to start incorporating predictive analytics and modeling from stage 4 in the AMAM and stage 5 in the Healthcare Analytics Adoption Model, respectively. These analytics adoption levels are mature enough and thus can benefit from accurate forecasts in waste and care variability reduction as well as in population health management.

Rule #1: Bigger isn’t always better

In healthcare, generalization and global features will lack utility and only introduce noise. So, while working with big data, organizations need to think small and narrow down their focus, targeting patient groups and particular care settings or departments within a hospital, not the whole health system.

For example, the overall readmission rate prediction can be useful for general reference. But in order to identify the right target for timely interventions and preventive actions, providers should tap into a specific condition and identify corresponding patient groups.
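A toy calculation shows why the overall figure hides the right target. The records below are made up purely for illustration, and the field names are hypothetical.

```python
from collections import defaultdict

# Made-up discharge records for illustration only.
discharges = [
    {"condition": "heart_failure", "readmitted": True},
    {"condition": "heart_failure", "readmitted": True},
    {"condition": "heart_failure", "readmitted": True},
    {"condition": "heart_failure", "readmitted": False},
    {"condition": "hip_replacement", "readmitted": False},
    {"condition": "hip_replacement", "readmitted": False},
    {"condition": "hip_replacement", "readmitted": False},
    {"condition": "hip_replacement", "readmitted": False},
    {"condition": "copd", "readmitted": True},
    {"condition": "copd", "readmitted": False},
]


def readmission_rate(records):
    """Share of discharges that ended in a readmission."""
    return sum(r["readmitted"] for r in records) / len(records)


def rate_by_condition(records):
    """Readmission rate computed separately for each condition group."""
    groups = defaultdict(list)
    for r in records:
        groups[r["condition"]].append(r)
    return {condition: readmission_rate(g) for condition, g in groups.items()}


overall = readmission_rate(discharges)
per_condition = rate_by_condition(discharges)
print(overall)
print(per_condition)
```

In this invented sample, the overall rate sits at 40%, while the per-condition view ranges from 0% to 75%, which is the level of detail a care team needs to decide where to intervene first.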

Rule #2: Know your data

Data comes from various sources and in all shapes and sizes. So, it is important to understand the limitations and possibilities it brings into predictive modeling.

For example, episodic data with diagnoses and treatment plans tends to be unstructured, because some providers jot down examination notes on paper and then transfer them into EHRs as plain text. Others just skip some fields. This data variability may significantly hinder further analysis.

On the other hand, claims data is structured. It always includes ICD-10 and CPT codes, even identifying services and prescriptions that a patient received outside the health system or network. However, it isn’t precise enough to offer details on the patient’s health status changes within care cycles across different care settings. Also, claims have a lag time by default, so they aren’t a good fit for short-term predictions.

Data also defines the possibility to predict certain events right away.

For example, if providers want to understand the groups of patients prone to readmissions within certain timeframes, they already have the needed data: diagnoses, information on readmissions, documented periods between discharges and readmissions, etc. In the same manner, organizations can define individual patients or patient groups that risk developing post-surgery complications or sudden hypertension based on historical data.
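As a minimal sketch of how such historical data can be turned into a training label, the snippet below measures the gap between each discharge and the next admission of the same patient. The stay records are invented, the tuple layout is an assumption, and the 30-day window is only an illustrative default for "certain timeframes".

```python
from datetime import date

# Invented stays: (patient_id, admitted, discharged).
stays = [
    ("p1", date(2023, 1, 2), date(2023, 1, 6)),
    ("p1", date(2023, 1, 20), date(2023, 1, 25)),
    ("p1", date(2023, 6, 1), date(2023, 6, 3)),
    ("p2", date(2023, 3, 1), date(2023, 3, 4)),
    ("p3", date(2023, 2, 10), date(2023, 2, 12)),
    ("p3", date(2023, 3, 20), date(2023, 3, 22)),
]


def flag_readmissions(stays, window_days=30):
    """Return (patient_id, discharge_date, gap_in_days) for every discharge
    followed by another admission of the same patient within the window."""
    by_patient = {}
    for patient_id, admitted, discharged in stays:
        by_patient.setdefault(patient_id, []).append((admitted, discharged))

    flagged = []
    for patient_id, visits in by_patient.items():
        visits.sort()  # chronological order by admission date
        for (_, discharged), (next_admitted, _) in zip(visits, visits[1:]):
            gap = (next_admitted - discharged).days
            if 0 <= gap <= window_days:
                flagged.append((patient_id, discharged, gap))
    return flagged


print(flag_readmissions(stays))
```

The resulting flags, joined with diagnoses, can serve as the outcome column when grouping patients by readmission risk. Widening `window_days` changes which stays count, so the window should be agreed on before any model is trained.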

Predictive modeling in healthcare analytics can fail where it is most valuable: in cases that occur outside medical facilities. These cases are usually related to chronic disease events and at-home post-surgery rehabilitation periods. The problem here is that the post-discharge data can be unavailable or too fragmented, because healthcare organizations either rely on patient-submitted surveys at specific time intervals or gather health status information only during follow-up appointments and care episodes.

These fragments appear in EHRs and can point out only severe health deteriorations, which may be useless for both individual patients and patient groups due to many unknowns in-between encounters.

To solve this problem, providers can introduce patient-generated health data (PGHD) into the picture. This will help them not only make more informed and accurate predictions for individual patients but also reveal patterns for different patient groups.

Rule #3: Predictions are educated guesses, for now

Predictive modeling can be similar to looking into a crystal ball, a high-tech crystal ball that processes an enormous amount of data. Given that healthcare offers scarce care quality criteria and health outcome measures, organizations will often wander in twilight when trying to predict risks in patients with conditions beyond diabetes and COPD.

Without relevant and standard outcome measures, preferably enforced by the CMS, widespread adoption of predictive modeling in healthcare is hindered. But health specialists are used to making educated guesses, working with incomplete patient information on a daily basis. They back up medical decisions with past experience, collective care team knowledge, and training.

Adopting this approach, organizations can define their own internal measures of patient health outcomes and care quality to train algorithms and build structured models. Rule #2 can guide the efforts.

Rule #4: Implementation struggles are real

When deciding on predictive modeling adoption, healthcare organizations should consider these three tiers of potential challenges: technology, security, and people.


Technology is a pool of numerous issues in regard to demanding clinical systems, workflows, and environments. Providers need to enable seamless integration between all systems, including EHR, CRM, LIS, RIS and more, which will ensure smooth data aggregation.

Then, both structured and unstructured data needs to be leveraged, which means natural language processing challenges and possible data errors. To make results immediately comprehensible for all stakeholders, data should be properly visualized, with summaries and automatically generated insights for each prediction.


Sensitive data privacy concerns are always on the table, especially when it comes to collaboration between organizations or data exchange within one health system. Providers need to invest in information security assurance to retain most of the utility of collected data for predictive modeling while complying with HIPAA regulations. Besides, population health management studies require data anonymization to be safely used by any organization.
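To give a feel for what preparing a record for shared analysis might involve, here is a rough sketch that drops direct identifiers, replaces the patient ID with a keyed hash, and coarsens age and dates. It is loosely inspired by the idea of removing identifiers, pooling ages of 90 and over, and reducing dates to the year. The field names and the key handling are assumptions, keyed hashing is pseudonymization rather than full anonymization, and this snippet by itself is in no way a HIPAA compliance recipe.

```python
import hashlib
import hmac

# Hypothetical field names that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "phone", "address"}


def pseudonymize_id(patient_id, secret_key):
    """Replace an ID with a keyed hash so records can still be linked
    without exposing the original identifier."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record, secret_key):
    """Return a copy of the record with identifiers removed or coarsened."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["patient_id"] = pseudonymize_id(record["patient_id"], secret_key)

    # Generalize the exact age into a ten-year band; pool ages of 90 and over.
    age = cleaned.pop("age")
    decade = age // 10 * 10
    cleaned["age_band"] = "90+" if age >= 90 else f"{decade}-{decade + 9}"

    # Keep only the year of the discharge date (expects "YYYY-MM-DD").
    cleaned["discharge_year"] = cleaned.pop("discharge_date")[:4]
    return cleaned


record = {
    "patient_id": "MRN-001",
    "name": "Jane Doe",
    "phone": "555-0100",
    "address": "1 Example St",
    "age": 67,
    "discharge_date": "2023-04-18",
    "diagnosis_code": "I50.9",
}
shared = deidentify(record, secret_key=b"replace-with-a-managed-secret")
print(shared)
```

The trade-off mentioned above is visible here: the diagnosis code and an age band survive, so the record keeps most of its analytical utility, while the fields that point to a specific person do not.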


With expert assistance from vendors, providers can let go of most technology- and security-driven troubles, but it will take a certain commitment from the organization itself to engage decision-makers and health specialists in predicting outcomes and reacting to them.

Predictive modeling can’t offer practical value in a vacuum, in terms of either data or human expertise. Health specialists are indispensable in this equation, because they represent the actual workflow, understand the data they input, and can offer clinical experience to interpret predictions more accurately.

For the long-term success of predictive modeling in a healthcare organization, all three tiers should be embraced by providers and technology enablers.

Predictive modeling rules aren’t meant to be broken

The value-based approach to healthcare encourages organizations to jumpstart using predictive analytics for improved patient health outcomes, promising incentives, and major reimbursements. There are multiple additional benefits on the way, including more informed clinical workflows, optimized facility administration, less risky supply chains, and cost-effective decisions.

But it is not easy to succeed in predictive modeling without knowing the basic rules. Providers can’t just leave data hanging after getting a prediction in some form. The set of insights has to offer recommendations and entail particular actions for each predicted category, event, or health outcome.

That’s why clinical event predictions should form around properly gathered, processed, and secured data, supported by real-world expertise of health specialists.
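One simple way to make sure no prediction is left hanging is to tie every risk tier to a predefined action. The sketch below assumes a model that outputs a risk between 0 and 1. The thresholds, tier names, and the medium- and low-tier actions are hypothetical and would need to be set by clinicians. Only the high-risk action mirrors the care management example given earlier in this article.

```python
# (minimum risk, tier, recommended action), ordered from highest to lowest.
# Thresholds and wording are illustrative assumptions, not clinical guidance.
ACTION_PLAN = [
    (0.7, "high", "Care management team contacts the patient and relatives "
                  "to schedule a screening test"),
    (0.3, "medium", "Add the patient to the follow-up call list"),
    (0.0, "low", "Continue routine care"),
]


def recommend(risk):
    """Map a predicted risk (0..1) to a tier and its recommended action."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be between 0 and 1")
    for threshold, tier, action in ACTION_PLAN:
        if risk >= threshold:
            return tier, action


for patient_id, risk in [("p1", 0.82), ("p2", 0.41), ("p3", 0.05)]:
    tier, action = recommend(risk)
    print(patient_id, tier, action)
```

Keeping the plan in one table makes it easy for health specialists to review and adjust the thresholds without touching the model itself.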