At J2, we spend a lot of time talking about optimizing provider networks. How do we show users the impact of different provider configurations? How can that lower overall payer spend? How does that relate to actual member care?


Our provider data is at the heart of everything we do at J2. Without it, our calculations and reports wouldn’t be significant or impactful. We dedicate substantial resources to ensuring that our provider data is the best (and we’ve found that our results are 50% more accurate than the industry standard!). To provide a deeper view, we sat down with two of our data scientists: Josh Temin (Principal Data Scientist) and Andrew Kim (Data Scientist).


[The following interview has been condensed and edited for clarity.]


Thanks for sitting down with me today. Let’s start with something simple. What is provider data?


Josh Temin (JT): Provider data is information about health care providers: the individuals, facilities, and organizations that provide health care services. The real challenge with provider data is getting the right kind of data, using it, and making sure it works and is consistent. Provider data is notoriously unreliable.


Tell me more. Why is provider data considered to be unreliable?


JT:  There are so many different ways of getting provider data and those sources sometimes conflict. A provider may list their public address as one thing, submit their claims from a different address, and send billing information to the member from a third address. Sometimes, the information submitted is incomplete. At J2, we have to take all those different kinds of provider data and curate them to make sure they work together. 


Andrew Kim (AK): Time is another big issue when it comes to provider data. Providers move, and all of a sudden their data is inaccurate. I like to search for my family members who are doctors in our provider sources and spot issues. My sister moved between California and Boston, and so many sources don’t know how to capture the ground truth of where she is practicing today. Oftentimes, when you are googling providers manually, you see where they have historically worked, but that doesn’t reflect today’s reality.


This is really problematic as it relates to my work at J2 on adequacy. Finding the ground truth on location is essential to calculating access, but it’s not just about addresses. Specialty tags can be really difficult, and using the wrong tag can have regulatory repercussions. Our provider data needs to be flexible enough to map to both very high-level, inclusive categories and very fine-grained ones.


JT: There are more than just regulatory implications. Inaccurate data can cause problems on the contracting side. If your team can’t contact the physician because you have an incorrect address or phone number, you may be making lots of extra calls. And then, of course, inaccuracies can negatively impact member experience. You need to actually know where the physician is to be able to create the best access to care for your members. If you have a blurry view of that, it can be difficult to optimize the right provider ecosystem for all your members.


How does J2 get its provider data? What steps are you taking to ensure its accuracy?


JT: There are a lot of avenues to get provider data. J2 has a mix of industry standard sources (directories, regulatory filings, etc.) and private data vendors, in addition to our own primary research. Internally, we then aggregate that data, double check that all the fields we need are there, and compare the sources against each other to create a consistent set of high quality data. We have our own system for determining which of that data is best and aggregating the different pieces.


Let’s say that we have 5 phone numbers for the same provider. Through automated and manual processes, we weigh our sources based on historical accuracy to find the right phone number. Even after we determine which information we think is accurate, we continually perform consistency checks on our provider data universe to make sure we’re up to date. For each provider, we assign our own confidence score to the associated data so users can prioritize that data accordingly.
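As a rough illustration of the weighting idea described above, here is a minimal sketch of an accuracy-weighted vote. It is not J2's actual system: the source names, the accuracy values, the 0.5 default weight, and the `pick_value` function are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical historical accuracy per source (illustrative values only).
SOURCE_ACCURACY = {
    "directory": 0.60,
    "regulatory_filing": 0.80,
    "vendor": 0.70,
    "primary_research": 0.95,
}


def pick_value(observations):
    """Pick the best-supported value from (source, value) pairs.

    Each observation votes for its value with a weight equal to the
    source's historical accuracy. Returns (best_value, confidence),
    where confidence is the winner's share of the total weight.
    """
    if not observations:
        raise ValueError("need at least one observation")
    weights = defaultdict(float)
    for source, value in observations:
        # Unknown sources get a neutral weight of 0.5 (an assumption).
        weights[value] += SOURCE_ACCURACY.get(source, 0.5)
    total = sum(weights.values())
    best = max(weights, key=weights.get)
    return best, weights[best] / total


# Five reported phone numbers for one provider (made-up data).
observations = [
    ("directory", "555-0100"),
    ("vendor", "555-0100"),
    ("regulatory_filing", "555-0199"),
    ("primary_research", "555-0142"),
    ("directory", "555-0142"),
]
best, confidence = pick_value(observations)
print(best, round(confidence, 2))
```

In this toy data, the number backed by primary research wins even though another number is reported just as often, because the vote is weighted by source accuracy rather than by count alone.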


How does this all relate to network building?


AK: This relates back to what Josh was saying before about contracting. Basically, when we advise our clients on access, we offer providers and facilities that can fill adequacy gaps. If we suggest a provider or facility that either doesn’t exist at that location or doesn’t actually provide the services that we say they do, that’s a huge impediment to the client being able to construct their network and provide care to their members. We have a really high bar for what provider data we use at J2.


JT: Our users are using J2 to meet regulatory compliance, so it’s not only important that we are giving them the right provider data to fill gaps but also that the underlying data in their networks is correct. We help them understand the current state of their networks so they know where they need to build. One of the ways we help with that is by determining provider accuracy. Basically, clients send us their provider data and we map it into our internal systems to see if it matches our provider universe.


For example, if you send us a phone number that we’ve assigned a low confidence score to, we’ll flag that and provide information that we’ve determined is more up to date. Especially when it comes to locations and specialty tags, accuracy fixes can really bring adequacy and access into focus.
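The flagging step in that example might look something like the sketch below. The threshold, the field names, and the `flag_low_confidence` function are hypothetical, and the confidence scores are invented; the sketch only shows the general shape of comparing a submitted record against scored reference values.

```python
LOW_CONFIDENCE = 0.5  # hypothetical cutoff, not a J2 setting


def flag_low_confidence(submitted, reference, threshold=LOW_CONFIDENCE):
    """Flag submitted fields whose values score below the threshold.

    submitted: {field: value} as sent by the client.
    reference: {field: {candidate_value: confidence}} from the
               internal provider universe (assumed structure).
    Returns a list of flags, each with a suggested replacement.
    """
    flags = []
    for field, value in submitted.items():
        scored = reference.get(field, {})
        if not scored:
            continue  # nothing to compare against for this field
        # A value never seen in the reference data scores 0.0.
        confidence = scored.get(value, 0.0)
        if confidence < threshold:
            flags.append({
                "field": field,
                "submitted": value,
                "confidence": confidence,
                "suggested": max(scored, key=scored.get),
            })
    return flags


submitted = {"phone": "555-0100", "specialty": "Cardiology"}
reference = {
    "phone": {"555-0100": 0.2, "555-0142": 0.9},
    "specialty": {"Cardiology": 0.95},
}
flags = flag_low_confidence(submitted, reference)
print(flags)
```

Here only the phone number is flagged, with the higher-confidence number offered as the suggested replacement.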


Beyond adequacy, what are some of the other J2 use cases for provider data?


JT: We’ve been looking at provider performance: how are they delivering care, and how does it measure up to local standards? One thing that’s really important for that comparison is understanding and tagging the right specialties and locations. From there, a cost and quality analysis is easier to spin up. Another use that we’re proud of is marketability: understanding how desirable a network or a provider is to members. The first step in understanding that is building an encompassing provider universe.


AK: One thing I’ve been working on directly is leveraging provider data to better understand the relationships between providers and their affiliations. For many providers, you can’t just call up an individual provider and contract with them. In most cases, they have some entity that contracts on their behalf. Some of the claims data and new price transparency data can give us insight into these contracting relationships. As we assemble that picture of the different subcontracting relationships, we can empower our clients to mix and match different provider groups in order to build the best network possible.


You mentioned the Price Transparency Act, which recently went into effect and requires health plans to publish pricing information. How can that help network development and design?


JT: This is going to change the way our users think about network costs. With this better cost information, J2 can more authoritatively tell users the downstream cost impacts of contracting with certain providers. Users can use this information to target new providers and to inform contract negotiations. I think that’s one of the most exciting use cases. It opens the door to new comparisons. With this, we’ll be able to show users how their rates stack up against competitor networks. They will be able to say, wow, I’m creating a much more expensive network than I need to; I should go back and renegotiate some of my contracts.


I do want to say that price transparency data in its current form has its problems. Right now, it includes cases where payers have adjusted the rate without context. As a result, you have two records, basically two rows in your spreadsheet, for the same provider and the same service in the same group with wildly different rates. This poses a challenge: to transform this data into something usable, we need to go through and verify each individual field. It’s gonna be exciting to actually start working with this data on real world use cases to see how people can transform their networks and start taking action.
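To make the duplicate-row problem concrete, here is a small sketch that groups rate records by provider, service, and group, then surfaces keys whose rates disagree widely. The column names, the sample rows, and the 1.5x spread cutoff are assumptions for illustration; they are not taken from any published price transparency schema or from J2's pipeline.

```python
from collections import defaultdict


def find_conflicting_rates(rows, max_ratio=1.5):
    """Return keys whose highest rate exceeds max_ratio times the lowest.

    rows: dicts with hypothetical fields provider_id, service_code,
          group, and rate (non-negative numbers).
    """
    by_key = defaultdict(list)
    for row in rows:
        key = (row["provider_id"], row["service_code"], row["group"])
        by_key[key].append(row["rate"])

    conflicts = {}
    for key, rates in by_key.items():
        low, high = min(rates), max(rates)
        # Multiplying avoids dividing by zero when the low rate is 0.
        if len(rates) > 1 and high > low * max_ratio:
            conflicts[key] = sorted(rates)
    return conflicts


# Made-up records: P1 has two very different rates for one service.
rows = [
    {"provider_id": "P1", "service_code": "99213", "group": "G1", "rate": 80.0},
    {"provider_id": "P1", "service_code": "99213", "group": "G1", "rate": 240.0},
    {"provider_id": "P2", "service_code": "99213", "group": "G1", "rate": 100.0},
    {"provider_id": "P2", "service_code": "99213", "group": "G1", "rate": 110.0},
]
conflicts = find_conflicting_rates(rows)
print(conflicts)
```

A check like this only surfaces the rows that need review; deciding which rate is correct still requires the field-by-field verification described above.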



Josh Temin

Principal Data Scientist at J2 Health

Josh is a data scientist with experience across the healthcare and startup ecosystem. He’s led provider analytics efforts at major payers, designed claims and EMR datasets for financial institutions, and developed provider quality rating systems. Josh holds a BA in Mathematics from Washington University in Saint Louis and an MA in Economics from the University of Pennsylvania.

Andrew Kim

Data Scientist at J2 Health

Andrew is J2’s first data scientist and has primarily dedicated his time to developing its access algorithm. Andrew holds a BA in Psychology from Harvard University and a Master of Divinity from Harvard Divinity School. Andrew previously served as a teaching assistant for Machine Learning and Statistics courses at Harvard University.