The novel coronavirus has generated unprecedented urgency to learn from health data. Many kinds of data are crucial to the pandemic response, but they are not all collected or governed by the same organization. For example, COVID-19 test results aren’t necessarily kept in the same database as data about medical conditions, medications or social determinants of health.
It is critically important that data relevant to COVID-19 are brought together — and governed and managed in a responsible and trustworthy manner — so that they can be used for research and analysis. This will require new kinds of collaborations, one of which is data trusts.
Guidance for data trusts
In December 2019, a group of 19 people representing 15 Canadian organizations and data infrastructure initiatives came together with the goal of developing practical guidance for data trusts. We defined a data trust as a repeatable mechanism or approach to sharing data in a timely, fair, safe and equitable way. This definition is intended to be broad and could apply to both data institutions (like the organizations that make up Health Data Research Network Canada) and large datasets accessed by multiple researchers.
Based on our first-hand experience with data governance and management, and concepts from the literature, we identified 12 essential requirements for data trusts, which have been published in the International Journal of Population Data Science.
We used a minimum specifications approach, in which participants brainstormed a list of good-to-have characteristics. They then asked the question, “Is it possible to have a complete and well-functioning data trust without any of these?” and crossed off everything on the list that wasn’t seen as essential.
Our work predated the pandemic, but we believe the 12 requirements are relevant and can serve as a beneficial checklist for new and existing data trusts, including those related to COVID-19.
It is a good idea for everyone to have some knowledge about how their data are protected. There are laws that apply to companies, such as the federal Personal Information Protection and Electronic Documents Act, and others that apply to public sector data, for example Ontario’s Personal Health Information Protection Act and the Freedom of Information and Protection of Privacy Act in British Columbia.
Those in charge of data trusts need more than cursory knowledge of legal requirements. They must be fully aware of, and compliant with, all relevant laws and have legal authority to collect, share and hold data. Understanding legal requirements and authorities can be complex, so groups creating new COVID-19 data trusts should obtain legal advice.
Governance and management
All data trusts — including those associated with apps and websites where people voluntarily enter their COVID-19 symptoms and test results — have a responsibility to ensure that privacy is protected, and that data are secure and not used for purposes that the people who contribute data don’t agree with. For that reason, the majority of our 12 data trust requirements focus on governance and management.
To fulfil the 12 requirements, COVID-19 data trusts would need to have a stated purpose, a governing committee or board, and implemented policies that address important questions, including: What data are available and with what level of detail? Are data downloadable or accessed via secure remote connection? Who can use the data (academic researchers, students, independent analysts, company employees)? Can data infrastructure established in response to COVID-19 also be used to study other diseases and topics?
Data trusts should also establish mandatory training for all data users that covers allowed and prohibited activities, and require signed data user agreements with enforceable penalties for users who breach the data trust terms, for example by attempting to re-identify anonymized individuals. Our article provides examples of existing training materials and data user agreements that could be a good starting point for COVID-19 data trusts.
There is extensive published guidance about how to govern and manage data trusts, including the Five Safes Framework. However, practically speaking, the workload and expertise required to establish trustworthy data infrastructure may mean that new COVID-19 data trusts are better off partnering with an organization that has experience governing and managing data, rather than starting from scratch.
We wholeheartedly support the recent increased focus on public involvement in decisions related to health data, including acknowledgement of the need for the public to have a say in data use for COVID-19 research.
We recommend that all data trusts have early and ongoing engagement with members of the public and other data providers, and tailored engagement with groups that have a particular interest in, or would be affected by, an activity of the data trust.
In the case of COVID-19 data trusts in Canada, this could translate into a focused effort to engage and involve people who live or work in long-term care homes and racialized minorities who have been disproportionately affected by the pandemic.
Our paper presents a distillation of concepts from the literature, combined with first-hand knowledge from 15 Canadian initiatives focused on data infrastructure. As we begin to implement and evaluate the 12 requirements, we expect that they will evolve in response to new opportunities and threats. For example, the minimum specs may need to be expanded or modified to adapt to new developments in scientific methods such as artificial intelligence, new data sources such as wearables, changes in public sentiment about data use, and cybersecurity threats.
We look forward to working with members of the public, and with stakeholders from other organizations and countries, to refine the 12 essential requirements and put them into practice for data trusts, including those developed specifically for COVID-19.
This article is republished from The Conversation, a nonprofit news site dedicated to sharing ideas from academic experts.
P. Alison Paprica receives funding from the Canadian Institutes of Health Research and other research funding agencies.
Adrian Thorogood receives funding from the Canadian Institutes of Health Research, Genome Canada, and Genome Quebec.
Alex Ryan works for MaRS Discovery District. His group receives funding from 60 government and corporate partners, including the Public Health Agency of Canada, Merck, and the J.W. McConnell Foundation.
Kimberlyn McGrail receives funding from the Canadian Institutes of Health Research and other research funding agencies.
Michael J. Schull receives funding from the Canadian Institutes of Health Research.