“I’ve always told my former students in biostatistics that all the numbers should have an explanation,” says Dr. Benjamin Co.
The doctor’s predilection to teach has gone up a gear during the COVID-19 pandemic, as he has shared many insights in the past few weeks on his blog, an informational treasure chest. In recent posts, he has talked about everything from the factors the government should look into before lifting the lockdown to what life would look like after the quarantine.
In his most recent post, he broke down the numbers presented in the Department of Health’s COVID-19 tracker, the data of which, he says, leads to a lot of questions. “It is imperative that data gathered is accurate, valid, and not confusing to the reader,” he explains, emphasizing that this should lead to better next steps.
Here is his blog post in full:
I always tell my students that data can either be interpreted correctly or end up confusing the reader.
I have provided the link to the Department of Health’s COVID-19 tracker. There are various sources from which one can obtain good data analytics. If you go to this link, you will see how the COVID-19 tracker has improved.
Then there’s the downside. The data analytics is wanting in useful information: the kind of information that should help us decide how bad this novel coronavirus pandemic is in the Philippines and whether we’re actually “flattening the curve.”
Once you get into the site, this is what you see.
It gives you a pretty good picture of the daily information on the epidemiology of COVID-19 in the country: 4,932 confirmed, 3,082 currently admitted (they don’t identify how many are critically ill and the rest are probably mild and quarantined), 315 deaths, and 242 recovered. Then in the gray bar are the ones “for validation.” There’s a caveat that says “around 25 percent of province-level data are still undergoing validation.” But wait… there’s more!
If you scroll down a bit, you will see the table on testing capacity. Here’s where the data actually differ. The testing capacity as of April 11, 2020 showed 4,913 positives. The total confirmed as of April 13, 2020 is 4,932. On April 11, the DOH officially reported 4,428 confirmed cases. What accounts for the discrepancies in the numbers coming from the agency?
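The size of these gaps can be worked out directly. The snippet below is only a sketch using the three figures quoted above; the variable names are made up for illustration and are not fields from the DOH tracker.

```python
# Figures quoted in the post; variable names are hypothetical labels.
testing_table_positives_apr11 = 4913  # positives in the testing capacity table, April 11
officially_reported_apr11 = 4428      # confirmed cases officially reported on April 11
tracker_total_apr13 = 4932            # total confirmed on the tracker, April 13

# Gap between two figures that both refer to April 11.
gap_same_day = testing_table_positives_apr11 - officially_reported_apr11

# Gap between the April 11 testing table and the April 13 tracker total.
gap_table_vs_tracker = tracker_total_apr13 - testing_table_positives_apr11

print(gap_same_day)          # 485
print(gap_table_vs_tracker)  # 19
```

Two figures for the same date differ by 485 cases, while the testing table and the tracker total two days later differ by only 19.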
The 1,293 cases for validation (said to be province-level data) are confusing.
I would assume that province-level data would point to tests coming from Southern Philippines Medical Center, Baguio General Hospital, Vicente Sotto Memorial Medical Center, Western Visayas, and Bicol Regional Diagnostic Laboratory. If you total their positive cases, that would just be 204 positives from the provincial testing centers.
We can assume that some specimens were sent to Manila for analysis. How many were sent to Manila? These inconsistencies in the numbers make one wonder whether they simply subtract the accounted-for cases from the total cases in order to arrive at the discrepancy.
Let’s look at the numbers again: 4,932 total confirmed. 3,082 currently admitted, 315 deaths, and 242 recovered.
Assuming the numbers are correct, that’s 3,639 cases (admitted currently, died, and recovered). What of the remaining 1,293 cases “undergoing validation?” What does undergoing validation mean?
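As a quick check of that arithmetic, here is a minimal sketch that reconciles the tracker’s headline figures. It uses only the numbers quoted in this post; the variable names are illustrative and assumed, not taken from the DOH site.

```python
# Headline figures quoted from the tracker in this post.
total_confirmed = 4932
currently_admitted = 3082
deaths = 315
recovered = 242

# Cases that fall into one of the three reported categories.
accounted_for = currently_admitted + deaths + recovered

# Whatever is left over must be the "for validation" group.
unaccounted = total_confirmed - accounted_for

print(accounted_for)  # 3639
print(unaccounted)    # 1293
```

The remainder matches the 1,293 cases “for validation” exactly, which is what raises the question of whether that figure is counted independently or is simply the leftover.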
First, if they’re still not yet validated, they shouldn’t even be part of the total statistic.
Second, why are the numbers in the provincial testing centers not tallying (202 from all the provincial testing sites versus 1,293 cases at the national level)?
The third and most vital query: does the DOH actually have data on the patients who tested positive but were sent home for quarantine? How many of them returned for retesting? Of 4,932 confirmed, only 3,082 are currently admitted. Are the other 1,850 cases distributed as follows: 315 deaths, 242 recovered, and 1,293 for validation?
Data integrity is important in the analysis of outcomes and planning of mitigation strategies. It is, after all, the basis of our life after April 30.
The new site, while providing information on cases, deaths, and recoveries, is wanting in information on the severity of the illness. We all know that not all admitted cases are in the ICU or are critically ill.
If we scroll down a bit, there’s interesting information regarding the availability of beds and mechanical ventilators in various hospitals.
The information above, while helpful, is disturbing. If you look at the ICU beds, only 391 beds (out of 1,085) are filled. The remaining two-thirds are unoccupied. About 87.29 percent of mechanical ventilators are still available. Yet reports coming from various private hospitals are that they are filled to the brim and that the healthcare system is so overwhelmed by the coronavirus infection that some hospitals had to turn away patients.
Based on the data provided in the DoH website, of the 3,082 currently admitted, 391 are in the ICU (as of this writing). That means only 12.7 percent are critically ill (or needing intensive care).
There are more deaths than daily recoveries among those currently admitted. The graph below is a screenshot of the new daily deaths and new daily recoveries on the website. This implies that of the 391 cases in the ICU, 315 have died, an 80 percent mortality rate when intubated or in intensive care.
Of the 3,082 cases admitted in the hospital, assuming that only 391 are in the ICU, what happened to the remaining 2,691? That’s the majority of the patients, who happen to be mild or moderate and would probably recover.
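The percentages in this part of the post can be reproduced with a few divisions. This is a sketch under the post’s own assumption that every occupied ICU bed holds one of the currently admitted COVID-19 cases; the variable names are hypothetical. The last ratio simply divides reported deaths by occupied ICU beds, as the post does, and is not a measured mortality rate.

```python
# Figures quoted in the post; names are illustrative only.
currently_admitted = 3082
icu_beds_total = 1085
icu_beds_occupied = 391
deaths = 315

# Share of ICU beds in use (the rest is roughly two-thirds).
icu_occupancy = icu_beds_occupied / icu_beds_total

# Share of admitted patients in the ICU, assuming one patient per occupied bed.
icu_share_of_admitted = icu_beds_occupied / currently_admitted

# Admitted patients who are not in the ICU.
non_icu_admitted = currently_admitted - icu_beds_occupied

# Ratio behind the "80 percent" figure: deaths divided by occupied ICU beds.
deaths_to_icu_ratio = deaths / icu_beds_occupied

print(f"{icu_occupancy:.1%}")          # 36.0%
print(f"{icu_share_of_admitted:.1%}")  # 12.7%
print(non_icu_admitted)                # 2691
print(f"{deaths_to_icu_ratio:.1%}")    # 80.6%
```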
Finally, there’s the interpretation of the data.
Cumulative is the operative word in the presentation of data.
This means that regardless of patients getting better or dying, the total number of cases is what you see. But that is not the number of cases we are actually following. Minus the deaths and recoveries, we are actually tracking 4,375 cases that are still active. The deaths and recoveries are considered closed.
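The distinction between cumulative and active cases is a single subtraction. Below is a small sketch with a hypothetical helper, `active_cases`, that is not part of any DOH tool; it treats deaths and recoveries as closed cases, as described above.

```python
def active_cases(cumulative_confirmed: int, deaths: int, recovered: int) -> int:
    """Return cases still open: cumulative total minus closed cases."""
    closed = deaths + recovered
    return cumulative_confirmed - closed


# Figures quoted in the post.
active = active_cases(cumulative_confirmed=4932, deaths=315, recovered=242)
print(active)  # 4375
```

Note that this active count still includes the 1,293 cases listed as “for validation,” so it is only as reliable as that figure.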
The red herring here? The ones for validation. They form the inconsistent information that needs an explanation so that it is not misinterpreted.
I’ve always told my former students in biostatistics that all the numbers should have an explanation. It is imperative that data gathered is accurate, valid, and not confusing to the reader. It is also important that all tables and graphs are reconcilable. Otherwise, any conclusion or decision that is made with this kind of data is confusing and simply leads to bad decisions in preparation and planning.
The bottom line of good solid data? The April 30 deadline.