1/ Now that @HealthyFla is reporting cumulative person-level testing results stratified on pg 1 of their daily statewide #COVID19 report:
I've been getting MANY questions about why some seemingly simple calculations don't align, & whether that is evidence of erroneous data.
Let's clarify!
2/ The image shows how we would use the last 2 daily statewide reports to estimate:
new cases (first-time + people) -- 3,787
new people tested w/ + or - result -- 26,113
Why only among Florida residents? Because subsequent testing data on page 2 is restricted to FL-residents.
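A minimal sketch of the subtraction described in 2/, assuming each daily report gives cumulative counts of FL residents who have tested positive and who have only tested negative. The `CumulativeReport` class, the function name, and every number below are illustrative placeholders, not values from the actual @HealthyFla reports.

```python
from dataclasses import dataclass


@dataclass
class CumulativeReport:
    """Cumulative person-level totals for FL residents from one daily report."""
    positive_people: int   # people who have ever tested positive
    negative_people: int   # people who have only tested negative


def new_by_report_date(prev: CumulativeReport, curr: CumulativeReport):
    """Difference two consecutive cumulative reports.

    Returns (new cases, new people tested with a + or - result),
    both counted by the date they were REPORTED.
    """
    new_cases = curr.positive_people - prev.positive_people
    prev_total = prev.positive_people + prev.negative_people
    curr_total = curr.positive_people + curr.negative_people
    return new_cases, curr_total - prev_total


# Placeholder totals, NOT real report values.
prev = CumulativeReport(positive_people=1000, negative_people=9000)
curr = CumulativeReport(positive_people=1040, negative_people=9360)

new_cases, new_tested = new_by_report_date(prev, curr)
print(new_cases, new_tested)
```

Whatever this subtraction yields is tied to when results were reported, not to when each person became a case.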



3/ But when one looks at the 8/18 county-level report, on pg 2 (statewide summary)
At the section titled "Percent positivity for new cases in Florida residents", for 8/17 (the day to which our calculated numbers from prev tweet apply)...
It suggests there are 3,922 new cases
4/ Both numbers are supposed to capture NEW + PEOPLE, but 3,787 does not equal 3,922 last time I checked.
This inequality makes people uncomfortable with the numbers.
But there IS an explanation, and like so many other metrics, it all has to do with DATES.
5/ When we use the most recent two statewide reports to estimate the number of new cases REPORTED, it's just that - based on date REPORTED.
When @HealthyFla calculates the % positivity measures, it is based on "CASE DATE", or the date on which someone was confirmed as a case.
6/ Just because we learned of a new + person "today", their test could have actually been yesterday, 3 days ago, 1-2 weeks ago...
This is analogous to learning today about 200 people who died from #COVID19, but those deaths could have occurred at various times in past few weeks.
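To make the REPORTED date vs CASE DATE distinction concrete, here is a small sketch that counts the same set of cases both ways. The line list is entirely hypothetical (each record is a made-up pair of "date we learned of the case" and "case date"); the state's real person-level data is not available in this form.

```python
from collections import Counter

# Hypothetical line list: (date reported, case date). Made-up records.
cases = [
    ("8/17", "8/17"),
    ("8/17", "8/16"),
    ("8/17", "8/16"),
    ("8/17", "8/10"),
    ("8/18", "8/17"),
    ("8/18", "8/17"),
    ("8/18", "8/18"),
]

# Count the same people two ways.
by_report_date = Counter(reported for reported, _ in cases)
by_case_date = Counter(case_date for _, case_date in cases)

print("Reported on 8/17:", by_report_date["8/17"])
print("Case date of 8/17:", by_case_date["8/17"])
```

In this toy example, 4 cases were reported on 8/17, but only 1 of them has a case date of 8/17; 2 more cases with that case date did not arrive until 8/18. The overall total is the same either way, but the count for any single day differs, and the case-date count for a given day can keep changing as later reports come in.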
7/ I have PROOF that this is the explanation:
On LEFT: yesterday's @HealthyFla report of % positivity for new cases among FL-residents
On RIGHT: The epidemic curve on my dashboard, restricted to FL-residents during same time frame, and based on CASE DATE.
Numbers are identical
8/ This is also why the numbers for the same day on 2 consecutive reports CHANGE.
There is frequent updating of specific dates of events, including the CASE DATE upon which these figures are based.
The changes are usually very small, and are to be expected.
9/ This all just means it's really hard to reconcile exact numbers for testing (especially repeat negatives) because some of our calculations are necessarily based on REPORTED DATE and others we access (e.g., for new positives) are based on CASE DATE.
10/ BUT, I think our estimates of:
- total people tested daily
- new people tested daily
- % of people tested who are repeats
- new cases daily
Are solid estimates, despite the lack of complete granularity to make the precise calculations we all desire.
11/ I know it's therapeutic to have everything line up exactly with no "remainders", but unless the method of reporting or the availability of downloadable person- & testing-level datasets changes, it just ain't gonna happen.
Our estimates are close enough to reflect what's going on.