In the midst of the COVID-19 pandemic, we are constantly being bombarded with graphs and tables showing the impacts of coronavirus to date and predicting the levels of infections, hospitalizations and deaths over the coming months.
These charts and graphs reported in the news are based on a mix of actual data reported by hospitals and governments and synthetic data created from that reported data. Epidemiologists and other data analysts extrapolate from the actual data, based on analysis and assumptions, to produce a synthetic data set that is used to build the projections we are all so eagerly following in order to find out when we can start returning to our normal lives.
Data intelligence is critical not only to managing the health care impacts of COVID-19 but, just as importantly, to informing businesses as they navigate disaster planning, business continuity, product development and client servicing during this crisis.
Businesses need broad access to both internal and external data to understand the financial risks to the business as well as the health risks to their employees and customers, to make prudent and practical decisions in order to survive the (hopefully) short-term revenue losses, and to plan for the changes in business practices that will likely be with us for a long time to come. Access to comprehensive data is more critical than ever, and ensuring that the data is secure has never been more important than now, when our vulnerability (in terms of our health, our finances and our overall safety) is the highest it has ever been.
Synthetic data is an artificial data set that mimics the original data but removes the personal or other sensitive information that the original may include. Raw data is run through special algorithms and generators to create new data sets that cannot be traced back to the original consumer or transaction. This “fake” data set nevertheless retains the statistical properties of the original data set, making it well suited to creating a baseline for future studies or testing, modeling business opportunities, projecting trends and more.
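To make the idea concrete, here is a minimal sketch of a generator, not a production method. It assumes each numeric field can be summarized by a mean and standard deviation and sampled independently from a normal distribution; real generators typically also preserve the relationships between fields. The function name `synthesize`, the field names and the sample records are all hypothetical.

```python
import random
import statistics


def synthesize(records, numeric_fields, sensitive_fields, seed=0):
    """Return synthetic records with the same fields and row count as `records`.

    Assumption: each numeric field is modeled as an independent normal
    distribution. This keeps per-field means and spreads but ignores
    correlations between fields.
    """
    rng = random.Random(seed)
    # Fit a mean and standard deviation for each numeric field.
    params = {}
    for field in numeric_fields:
        values = [r[field] for r in records]
        params[field] = (statistics.mean(values), statistics.stdev(values))

    synthetic = []
    for i in range(len(records)):
        # Sensitive fields are never copied; they get a placeholder value.
        row = {field: f"synthetic-{i}" for field in sensitive_fields}
        # Numeric fields are sampled from the fitted distribution.
        for field, (mu, sigma) in params.items():
            row[field] = rng.gauss(mu, sigma)
        synthetic.append(row)
    return synthetic


# Hypothetical "original" records, made up for illustration only.
original = [
    {"name": f"person-{i}", "age": 20 + i % 50, "spend": 100.0 + (i * 7) % 90}
    for i in range(500)
]
synthetic = synthesize(original, ["age", "spend"], ["name"])
```

The synthetic rows match the original field-for-field and row-for-row, but no name or individual value is carried over; only the fitted summary statistics are reused.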
For researchers and scientists tracking the COVID-19 crisis and working to develop treatments and vaccines, synthetic data can be used to aid in the creation of a much larger baseline for testing and clinical trials. For business or product owners, synthetic data can be generated on a one-to-one basis so that the final synthetic data set matches the original set field-for-field, but without the privacy risks. This “new” data set can then be safely used for performance analysis, benchmarking, forecasting or product development, producing results as valid as those obtained from the original data without the risk of personally identifiable information being misused.
Data collected from hospitals and health departments have been critical inputs into understanding the health impacts of the COVID-19 pandemic, but they tell only part of the story of the changes COVID-19 has created in our lives. As businesses, retailers and restaurants have been reopening, government officials and public health personnel are using reported infections data to track new outbreaks, trace contacts and make plans to manage the evolving situation. Businesses themselves need data to understand when and how to reopen their doors, how consumer needs are changing and how best to serve, confidently and competently, a client base whose interactions and purchasing behavior are now very different from just a few months ago. Consumer transactional data should not be evaluated separately from pandemic trend data; rather, the two need to be combined so that operational decisions are informed by both health and business perspectives, helping to ensure the economy is reopened in the safest and most beneficial manner possible.
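As a rough illustration of what combining the two kinds of data can look like, the sketch below joins weekly infection counts and weekly sales on a shared (region, week) key. The function name, the key structure and all of the numbers are hypothetical placeholders, not reported figures.

```python
def combine_by_region_week(cases, sales):
    """Join two {(region, week): value} mappings on the keys they share."""
    shared = sorted(cases.keys() & sales.keys())
    return [
        {
            "region": region,
            "week": week,
            "new_cases": cases[(region, week)],
            "sales": sales[(region, week)],
        }
        for region, week in shared
    ]


# Placeholder values for illustration only.
cases = {("north", 1): 120, ("north", 2): 80, ("south", 1): 40}
sales = {("north", 1): 15000.0, ("north", 2): 21000.0, ("south", 2): 9000.0}

combined = combine_by_region_week(cases, sales)
for row in combined:
    print(row)
```

Only region-weeks present in both sources appear in the result, so each combined row can be read from the health and the business perspective at once.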
One example of an organization enabling broad data sharing across industries is
Combining health data with economic data allows us to construct models for guiding reopening planning, identifying businesses by level of criticality and economic value and balancing this against the level of health risk posed to customers of those types of businesses. Below is an example of one such model: a reopening road map created by
Safe, accessible and comprehensive data is critical to getting our economy going again. Synthesizing data allows for broad sharing of the inputs businesses and municipalities need to make decisions, all with much reduced levels of concern by health care professionals, government officials, business owners, compliance officers and PR staff about the risks of personally identifiable information being misused or stolen.
Imagine how useful a synthetic data set would be to a product manager or business owner. With easily accessible, rapidly updatable and statistically valid synthetic data, a product manager could be much more proactive in responding to customer issues, predicting future product trends or generating ideas for new product features based on deeper analysis of product usage and customer feedback. As with data sharing more broadly, this comes with far less concern about personally identifiable information being misused or stolen. Synthetic data can also be used to train machine learning models and neural networks, which is critical in areas such as fraud detection and management systems that need mountains of reliable data for testing and strengthening.
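One common way to check whether synthetic data is good enough for model training is to train on the synthetic set and evaluate on held-out real data. The sketch below shows that workflow with a deliberately simple nearest-centroid classifier. Everything in it is an assumption made for illustration: the two features (amount and hour of day), the distributions, the fraud rate, and the fact that the "real" hold-out set is itself simulated here as a stand-in.

```python
import random


def centroid(rows):
    """Mean vector of a list of equal-length feature rows."""
    return [sum(r[i] for r in rows) / len(rows) for i in range(len(rows[0]))]


def train(features, labels):
    """Nearest-centroid 'model': one mean feature vector per class label."""
    return {
        label: centroid([f for f, l in zip(features, labels) if l == label])
        for label in set(labels)
    }


def predict(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: dist2(model[label]))


def make_transactions(rng, n):
    """Hypothetical transactions: fraud has larger amounts at night hours."""
    features, labels = [], []
    for _ in range(n):
        is_fraud = rng.random() < 0.2
        amount = rng.gauss(900, 100) if is_fraud else rng.gauss(100, 50)
        hour = rng.gauss(3, 1) if is_fraud else rng.gauss(14, 3)
        features.append([amount, hour])
        labels.append(is_fraud)
    return features, labels


# Train on a synthetic set, then score on a separate stand-in for real data.
synthetic_x, synthetic_y = make_transactions(random.Random(1), 2000)
holdout_x, holdout_y = make_transactions(random.Random(2), 500)

model = train(synthetic_x, synthetic_y)
correct = sum(predict(model, x) == y for x, y in zip(holdout_x, holdout_y))
accuracy = correct / len(holdout_y)
print(f"hold-out accuracy: {accuracy:.3f}")
```

If accuracy on the real hold-out set is close to what a model trained on real data achieves, that is some evidence the synthetic set preserved the patterns the model depends on.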