Data

Distributed Handling
Data Specifications
Analyse Measures
Permission Policies

Distributed Handling

Data production or collection, data description and data analysis can be done separately today.

Separation of roles has a long tradition in science: Tycho Brahe collected the planetary motions, Kepler described them and Newton explained them.

Today, separation of roles can be done faster, more easily and cooperatively, thanks to modern communication technologies. The basis is that data is observed or produced, recorded (often at no cost nowadays), annotated and published in a searchable form.

In computer science it is well known that distributed and parallel data handling often works best for large workloads. This holds even on computers with only a couple of cores.

MapReduce and similar programming paradigms enable parallel data handling.
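
As an illustration of the map/reduce idea, here is a minimal Python sketch. The chunked input, the function names (`map_chunk`, `merge_counts`, `count_age_groups`) and the age-group labels are invented for this example and do not come from any real data set. It uses threads from the standard library to keep the example simple; for CPU-heavy work one would probably swap in a process pool or a dedicated framework.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical input: records already split into chunks (e.g. one per region).
CHUNKS = [
    ["0-19", "20-39", "20-39"],
    ["40-59", "20-39", "60+"],
    ["60+", "60+", "0-19"],
]


def map_chunk(chunk):
    """Map step: count age groups within one chunk, independent of the others."""
    return Counter(chunk)


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def count_age_groups(chunks, max_workers=3):
    # Each chunk is handed to a worker; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_counts = list(pool.map(map_chunk, chunks))
    # Fold the partial results into a single total.
    return reduce(merge_counts, partial_counts, Counter())


totals = count_age_groups(CHUNKS)
print(dict(totals))
```

Because the map step never looks at other chunks, the chunks could just as well be processed on different machines or by different people, with only the small partial counts being exchanged.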

We humans have billions of brains, so with good communication and openness we’ve got a lot of brain power to tackle any task.

Data Specifications

For data analysis it is important to have accurate, consistent and clearly specified data and notifications about possible inaccuracies or changed specifications. Notified corrections are helpful too.

The different measurement methods regarding Covid-19, with few or no specifications, leave a lot of room for improvement all around the world. A uniform agreement on annotations would make the data more comparable.

E.g. in Switzerland, as of 17.4.20, the testing policy is to test only people who are sick and belong to a risk group. The policy is mostly followed, but it is hard to find on the official page of the BAG (Swiss Health Department, bag.admin.ch, in the German Covid-19 FAQ) and is even left out of the English version (as of 17.4.20). The official statistics on the same domain have no note about this testing scheme. The data are analyzable (since largely consistent) and expressive, but just not comparable to those of countries where everybody who is sick, or even everybody who could be infected, is tested. Remark: As of 1.5.20 more people are allowed to be tested.

Data Features

For analysis it is useful to have data that are as complete, accurate and fine-grained as possible and as data protection laws permit (e.g. age, medical conditions and locations).
Clinical observations should be annotated and published with as much data as possible.
Interpretations and analyses are nice but data analysis can be done dis

Easily Readable

Ideally, data are in a common format (e.g. CSV), available through a public interface (e.g. a GitHub repository or a REST API) and annotated, e.g. with a readme specifying what is measured and how.
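
A minimal Python sketch of how annotated CSV data might be consumed. Everything in it is an assumption made for illustration: the column names, the `SPEC` dictionary standing in for a readme, and the sample numbers, which are invented and not real statistics.

```python
import csv
import io

# Hypothetical specification, standing in for what a readme would document.
SPEC = {
    "date": "reporting date, ISO 8601 (YYYY-MM-DD)",
    "tests": "number of tests performed on that day",
    "positive": "number of positive tests on that day",
}

# Invented sample standing in for a published CSV file (not real data).
SAMPLE_CSV = """date,tests,positive
2020-04-16,100,12
2020-04-17,80,8
"""


def load_rows(text, spec):
    """Parse CSV text and check that its columns match the specification."""
    reader = csv.DictReader(io.StringIO(text))
    columns = set(reader.fieldnames or [])
    missing = set(spec) - columns
    undocumented = columns - set(spec)
    if missing or undocumented:
        raise ValueError(
            f"missing columns: {sorted(missing)}, "
            f"undocumented columns: {sorted(undocumented)}"
        )
    return [
        {
            "date": row["date"],
            "tests": int(row["tests"]),
            "positive": int(row["positive"]),
        }
        for row in reader
    ]


rows = load_rows(SAMPLE_CSV, SPEC)
# Share of positive tests per day, only meaningful if the testing policy is known.
positivity = [row["positive"] / row["tests"] for row in rows]
print(positivity)
```

The check against the specification is the point of the sketch: a column without a documented meaning, or a documented column that is absent, is reported before any analysis starts.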

Analyse Measures

in work

Permission Policies

Default Allow

Default Allow = Allow unless Denied: Target everything with a specific feature and allow everything else. Also called the blacklist (= items to block) approach in computer science.

Default Deny

Default Deny = Deny unless Allowed: Allow only items with a specific feature and block all other items, i.e. those without the specific feature. Also called the whitelist (= items to allow) approach in computer science.
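
The two policies can be sketched in a few lines of Python. The function names and the example items (`known_bad`, `trusted`, `unknown`) are hypothetical and chosen only to show where the policies differ.

```python
def default_allow(item, denied):
    """Allow unless denied: block only items found on the blocklist."""
    return item not in denied


def default_deny(item, allowed):
    """Deny unless allowed: permit only items found on the allowlist."""
    return item in allowed


# Hypothetical lists for illustration.
BLOCKLIST = {"known_bad"}
ALLOWLIST = {"trusted"}

for item in ["trusted", "known_bad", "unknown"]:
    print(
        item,
        "default allow:", default_allow(item, BLOCKLIST),
        "default deny:", default_deny(item, ALLOWLIST),
    )
```

The policies agree on the listed items and differ only on items that appear on neither list: default allow lets `unknown` through, default deny blocks it.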

Permission policies are relevant for:

Individual Measures and Covid Apps [page currently in work]
The immune system: Some parts of the immune system work by default allow (e.g. T cells recognize pathogens by pathogen-specific patterns) and some parts work by default deny (e.g. natural killer cells check for MHC expression and target cells that lack it; the blood-brain barrier blocks by default).