We need an online Coronavirus Census now!


An open letter to Bill Gates, Jeff Bezos, Mark Zuckerberg, Sundar Pichai, Satya Nadella, Donald Trump, Andrew Cuomo and you.

Proposal: Create an online Coronavirus census that would give medical researchers the massive dataset that they need to really understand how the coronavirus spreads.

Note: Please, I need your help. If you think that this idea might work, please repost. To the best of my knowledge, no one has suggested this idea before, and it may save thousands of lives at very little expense. My hope is that a key decision-maker in business or government will read this post and be willing to devote the small resources necessary to make it happen. Your help would be much appreciated.

Like most people, I have been following the progress of the coronavirus. The thing that has struck me more than anything is how little we know about how the virus actually gets transmitted between people. “Information” swirls around the internet and media, but there is actually very little data that enables us to separate opinions from facts.

Unfortunately, this lack of real information does not in any way inhibit people from coming to strong conclusions about what should be done. And the media blasts all of these opinions around the globe making it seem as if these opinions are backed by real data, when they are often not. Worse, experts build these opinions as assumptions into their statistical models to make projections that are likely to be off by an order of magnitude.

The fact is that we know very little about how the coronavirus is transmitted from person to person.

Is it transmitted by “snot” globules that only move six feet? Is it transmitted in aerosol that travels much further? Is it transmitted from touching something directly? Do masks help? Do gloves help? Are meetings with 6 people dangerous? 100? 1000? Are airlines, mass transit, restaurants, local businesses dangerous? We just do not know.

Accepting our own ignorance is the first step to fighting the virus.

Our ignorance is a huge problem for policy-makers, as they are forced to make rapid decisions based upon very little information. Even worse, they are surrounded by people with very strong opinions, each of whom predicts catastrophe if their proposed idea is not implemented immediately and in full. The policies implemented seem to be based upon little more than educated guesses and fear. Given what is at stake, this state of affairs is unacceptable.

Many of the policies implemented by national and local governments are designed to “buy time” for researchers to develop a vaccine, but it is unclear how long that will take. Estimates range from six months to two years or longer. The possibility of never being able to develop a vaccine for a rapidly mutating virus is very real.

In the meantime, we need to implement policies and make behavioral changes that minimize the number of people who die from the virus while simultaneously minimizing the negative impact of those changes on the rest of society.

What we need is a very fast, inexpensive and globally comprehensive way of acquiring a dataset of all the symptoms that people have and the actions that they have taken over the last few months. Thirty years ago, this would have been virtually impossible. Today, however, we have a perfect means for doing so: the internet.

My proposal is that an established Internet company with a robust digital infrastructure and a large number of trained software developers, testers and user experience designers build a website designed to create an online coronavirus census with an open source database. It does not need to be a big company, but it must be already established (although I would not object to a small start-up giving it a shot!).

The database acquired by this census would give medical researchers the massive dataset that they need to really understand how the coronavirus travelled throughout the world, throughout each community and throughout each family. I have no idea what they will find, but I have confidence that, with very little effort, this census can make major contributions to the fight against the coronavirus. And, even better, medical researchers can use the database to construct plans for the future that can be rapidly implemented when the inevitable next outbreak occurs.

Given the state of the epidemic, it is not tolerable to wait for medical studies to be published. Current medical studies are impaired by a lack of data, so they use a very small sample size. Worse, they take months. What we need is a sample size in the millions or tens of millions with finely-detailed data points on people’s symptoms and behaviors.

The website itself would be very simple. Think of it as a census, which you may have already filled out for the federal government. But this census would be specifically focused on fighting the coronavirus. It would consist of many simple questions about the symptoms experienced and the actions taken by an individual.

A census is not a study. A study is usually based upon a small sample size of a few hundred or thousand. This small sample size limits the ability of researchers to drill deep in the data and make scientifically valid conclusions. A census has millions of datapoints, making conclusions far more valid.
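To make the effect of sample size concrete, here is a rough sketch of the textbook margin-of-error formula for a proportion. It assumes simple random sampling, which a voluntary online census would not strictly satisfy, and the function name and the sample sizes are illustrative choices of mine, not figures from any actual study.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p estimated from n respondents.

    Assumes simple random sampling; p = 0.5 is the worst case (widest interval).
    """
    return z * math.sqrt(p * (1 - p) / n)


# Illustrative sample sizes only: a small study versus a census-scale dataset.
for n in (500, 5_000, 5_000_000):
    print(f"n = {n:>9,}: +/- {margin_of_error(n) * 100:.2f} percentage points")
```

The margin shrinks with the square root of the number of respondents, which is why a dataset in the millions could allow researchers to look at much narrower subgroups (one neighborhood, one bus line) than a study of a few hundred people.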

This massive database can be captured quickly and with very little cost. For a large software company with a website, a large number of servers and technicians to back them up, this is not a particularly difficult project. Because location is so critical to understanding transmission, the system must be integrated with Google Maps or some other mapping solution.

Federal, state and even local governments could encourage companies to act by offering a cash prize to the first company that is able to build and deploy the census. They can also assist by creating a list of questions to use. Two conditions should be that no personally-identifying data can be used for commercial purposes and that the data must be publicly available for free on the internet.

Once the website is up and running, individuals, non-profits, businesses and government organizations could easily generate enough publicity online to get people to answer the questions on the survey.

The questions themselves could be developed in one day (I have a few suggestions below). The coding itself would be trivial. Any competent programmer could write the code almost as fast as one can write the questions.

Because people could answer at home on the internet, there would be no additional risk of infection. Personally identifying information like Name, Social Security Number and Date of Birth would be obfuscated to protect the privacy of the respondents.
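One possible way to do this obfuscation, sketched below, is to replace the identifying fields with a keyed hash, so that the public database holds only a token. This is my own assumption about how it might be done, not a vetted privacy design: the function name `pseudonymize` and the `secret_key` parameter are hypothetical, and the key would have to be kept private by whoever operates the census.

```python
import hashlib
import hmac


def pseudonymize(name: str, ssn: str, date_of_birth: str, secret_key: bytes) -> str:
    """Return a stable token that stands in for a respondent's identifying fields.

    The same person always maps to the same token (so test results and
    household links can still be attached), but the token cannot be turned
    back into a name without the secret key.
    """
    # Normalize so that "Jane Doe" and " jane doe " produce the same token.
    normalized = "|".join(part.strip().lower() for part in (name, ssn, date_of_birth))
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical usage with made-up values.
key = b"example-key-kept-off-the-public-server"
token = pseudonymize("Jane Doe", "000-00-0000", "1970-01-01", key)
print(token[:16], "...")
```

Only the token would be published with the answers. A plain, unkeyed hash would not be enough here, because Social Security Numbers and birth dates are short enough to guess by brute force.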

While most people could fill out the questions in only fifteen minutes, their answers would add enormous value for medical researchers trying to track the transmission of the coronavirus. As coronavirus tests become available, medical workers could append actual test results to a person’s account. Then the entire history of that person’s actions could be mapped to whether they actually got coronavirus.

Because this census is on the internet, it would be possible to continually improve on it. Questions could be added or reworded for clarity. Researchers could follow up with key respondents who might have been early spreaders for more information than is necessary from a typical respondent. Accounts between household members could be linked for ease in data entry and research.

Of course, much of the entered information will be inaccurate due to fading memories and even deliberate lies, but using statistical analysis, researchers should be able to separate the “signal” from the “noise”. With a very large sample size, researchers can develop statistical models and make projections that are far more useful than the ones in current use.

It is important that we do not wait. Memories will fade. By next year, it will be impossible to create such a dataset. Every day that goes by means that the information will lose its validity.

If we could get a few million people to answer the online coronavirus census, we might be able to crack the code about what works and what does not work. Policies could be based on fact, not just opinions. A few simple behavioral changes might make all the difference, while ineffective policies that cause more harm than good could be repealed.

In this way, we could give medical researchers the time that they need to find a vaccine in a way that minimizes the impact to society.

If you agree that this idea might work, please repost. My hope is that a key decision-maker in business or government will read this post and be willing to devote the small resources necessary to make this happen.

Below is my suggested list of questions for the coronavirus census. I am sure that medical researchers could come up with a better list, but this will at least give us a start:

  • Are you answering this questionnaire for yourself, or for another person?
  • First and Last Name
  • Social Security Number (or other identifier)
  • Date of Birth
  • Email address
  • Phone Number
  • Most recent residence (all addresses should be integrated with Google Maps to ensure their accuracy)
  • All previous residences since Nov 1, 2019
  • What other people lived in the same house or apartment? For each, enter their names. If possible, ask them to fill out a questionnaire.
  • Have you had a coronavirus test? For each occurrence, enter:
    • Date
    • Result
    • Institution that administered the test
  • Has anyone in your household been diagnosed with coronavirus? For each, enter name.
  • Have you experienced any of the following symptoms since Nov 1, 2019? For each occurrence, enter the following information:
    • Select the symptoms that you experienced
    • Start and End date
    • Were you at your residence?
    • Did you work during this period?
  • Has anyone in your household experienced any of the following symptoms since Nov 1, 2019? For each, enter name.
  • Have you been employed outside the home since Nov 1, 2019? For each job, enter the following information:
    • Company Name
    • Start and End dates of employment
    • Address where you actually worked
    • Did you work indoors or outdoors?
    • Did your job take you to many different places?
    • How many other people worked in the same room as you?
    • How many other people worked within 6 feet of you? If known, enter their names.
  • Have you attended school or university outside the home since Nov 1, 2019? For each school, enter the following information:
    • Name
    • Location
    • Start and End dates
    • Frequency of attendance per week
  • Have you travelled outside of your metro region since Nov 1, 2019? For each trip, enter the following information:
    • Method of Travel
    • Flight, bus, train number and time
    • Residence during visit
    • Room number (if applicable)
    • Events attended with address, time and date.
  • Have you flown on a commercial airplane since Nov 1, 2019 for any other reason?
  • Have you been on a cruise ship, intercity bus or train not mentioned above since Nov 1, 2019? For each trip, enter the following information:
    • Method of Travel
    • Flight, bus, train number and time
  • Have you used mass transit regularly? For each line, enter the following information:
    • Name/number of line
    • Time of day used
    • How often used?
  • Where do you shop for groceries? For each, enter the following information:
    • Name
    • Address
    • How often used?
  • What church or other regular religious institutions do you attend? For each occurrence, enter the following information:
    • Name
    • Address
    • How often attended?
    • Date and time.
  • Have you attended a movie theater since Nov 1, 2019? For each occurrence, enter the following information:
    • Name
    • Address
    • How often attended?
    • Date and time.
  • Have you attended a conference, retreat, opera, play, concert, sporting event, political rally, or religious event since Nov 1, 2019? For each occurrence, enter the following information:
    • Name
    • Address
    • Date and time
  • Where do you get your pharmaceuticals? For each, enter the following information:
    • Name
    • Address
    • How often used?
  • Have you attended a hospital or medical institution since Nov 1, 2019? For each occurrence, enter the following information:
    • Name
    • Address
    • Date and time
  • Have you attended a gym since Nov 1, 2019? For each occurrence, enter the following information:
    • Name
    • Address
    • How often attended?
  • Since mid-March 2020, have you:
    • Sheltered at home?
    • Travelled outside your metro area?
    • Worked from home?
    • Continued to work outside home?
    • Washed hands with soap and water frequently? If so, how often?
  • Since mid-March 2020, have you done the following while outside:
    • Worn face mask while outside home? If so, how often?
    • Worn gloves while outside home? If so, how often?
    • Kept six feet from people? If so, how often?
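For the programmers reading this, here is a very rough sketch of how a few of the questions above might be stored as one record per respondent. It covers only the test and symptom questions, and every class and field name is a placeholder I made up for illustration; a real design would come from the company that builds the census.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class TestResult:
    test_date: date
    result: str          # e.g. "positive" or "negative"
    institution: str     # who administered the test


@dataclass
class SymptomEpisode:
    symptoms: List[str]
    start: date
    end: Optional[date]  # None if the symptoms are ongoing
    at_residence: bool
    worked_during: bool


@dataclass
class CensusResponse:
    respondent_token: str  # obfuscated identifier, never the raw name or SSN
    household_tokens: List[str] = field(default_factory=list)
    tests: List[TestResult] = field(default_factory=list)
    symptom_episodes: List[SymptomEpisode] = field(default_factory=list)

    def to_json(self) -> str:
        # Dates are not JSON-serializable by default, so write them as ISO strings.
        return json.dumps(asdict(self), default=lambda d: d.isoformat())


# Hypothetical respondent with made-up data.
response = CensusResponse(respondent_token="abc123", household_tokens=["def456"])
response.tests.append(TestResult(date(2020, 3, 20), "negative", "Example Clinic"))
response.symptom_episodes.append(
    SymptomEpisode(["fever", "cough"], date(2020, 3, 10), date(2020, 3, 18), True, False)
)
print(response.to_json())
```

Because each "For each occurrence" question becomes a list, new questions can be added later as new lists or fields without disturbing the answers already collected, and household members are linked only through their tokens.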
