Data Sharing Policy
PURPOSE
This document is addressed to all persons or organisations interested in sharing clinical research data and looking for a secure, non-profit, long-term storage repository. It describes the policy framework of the clinical research Data Sharing Repository (crDSR) with respect to the management of data objects within it. It first clarifies the interpretation of several key terms, and then sets out the organising principles of the repository’s approach. It continues with a high-level description of the options that will be made available for secondary use of data objects (or ‘data sharing’), the ways in which Data Transfer Agreements will underpin later data object use, and the data sharing process itself.
This policy document is designed to complement any specific contractual agreement between the crDSR and other organisations, whether for data object transfer or for secondary use. It is also designed to act as a basis for more detailed documents that provide information on how the policy should be interpreted in practice.
THIS VERSION
2.0: 20 August 2024
AUTHORS
Christian Ohmann, Gerd Felder, Mihaela Matei, Steve Canham, Jacques Demotes, Maria Panagiotopoulou, Sergio Contrino
TERMINOLOGY
In this document we refer to the following terms:
clinical research Data Sharing Repository (crDSR): The clinical research Data Sharing Repository (crDSR) is a joint project of the European Clinical Research Infrastructure Network (ECRIN) and the University of Oslo (UiO). ECRIN manages the storage of metadata relating to clinical research, so as to allow the findability and eventual accessibility of clinical research data. UiO operates the secure environment (Trusted Research Environment, TRE), called Services for Sensitive Data (TSD), where sensitive data is stored for secondary use. UiO also enables the secure transfer of data from the data provider to that TRE.
Trusted Research Environment (TRE): TREs are highly secure computing environments that provide remote access to sensitive data for approved researchers to use in research. Also known as ‘Data Safe Haven’ or ‘Secure Processing Environment’.
Data Object: A Data Object is any file available in electronic form, of any type (document, data, media, etc.). The repository is designed to contain protocols, analysis plans, consent forms, result summaries and other documents associated with a clinical research study, as well as Individual Participant Data (IPD, see dataset definition below).
Dataset: The term Dataset refers to a data object that contains only data – e.g., a spreadsheet, CSV, JSON or XML file, database dump, etc. In the context of the repository, “dataset” will usually refer to the file or files of IPD derived from a clinical trial/study.
Data Object Provider (or ‘Provider’ in short): A Data Object Provider is an organisation that provides data objects to the crDSR. It must be a legal entity in order to enter into a Data Transfer Agreement with ECRIN. Unless those data objects are already explicitly in the public domain, the Provider is assumed to have the legal power to enter into that agreement; for instance, it would hold copyright or intellectual property rights over the data objects. For datasets of sensitive personal data, the Provider would be the data controller as defined under the General Data Protection Regulation (GDPR).
Data Transfer Agreement (DTA): The Data Transfer Agreement (DTA) is a legal agreement (contract) that governs the transfer of all data objects from the Provider to the crDSR. This agreement will reference the GDPR and associated legislation, and, when applicable, intellectual property law. Each DTA will have an appendix describing the data objects to be transferred to the repository, and – if datasets – whether they are categorised as anonymised or pseudonymised. It will also stipulate any data sharing rules and prerequisites that the Provider wants to impose on Secondary Users.
General Data Protection Regulation (GDPR): The Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC.
Data Controller: As defined in the GDPR, the term “data controller” refers to the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data.
Data Use Agreement (DUA) (also known as Data Access Agreement): The Data Use Agreement (DUA) is a legal agreement (contract) between the Requester (see below) and ECRIN that governs the secondary use of controlled access data objects within the crDSR. The DUA will include clauses prohibiting any attempt to re-identify individuals within a dataset, stipulate that datasets should be stored securely and destroyed after access rights expire, and require that the Provider and the repository are notified of the results of the secondary use.
Data Object Secondary Users (or ‘Secondary Users’ in short): Data Object Secondary Users are individuals (most often researchers) seeking to reuse data objects.
Data Object Requester (or ‘Requester’ in short): The Data Object Requester is the organisation arranging re-use of data objects on behalf of researchers, normally their employer.
Repository Manager: The Repository Manager is an ECRIN staff member (employee or consultant) who oversees the processes of data transfer and data use and ensures that secondary use complies with the Data Object Provider’s requirements.
PRINCIPLES
The following principles form the basis for data sharing within the crDSR:
The Data Object Providers retain control over all the data objects they transfer to the repository. In particular, each Provider stipulates the access arrangements to be applied to their objects, identifying those that will be available without restriction, and those that will be under a controlled access regime. For objects under controlled access, the Provider can set fixed prerequisites for access, or reserve the right to review and grant access on a case-by-case basis. For datasets falling under the GDPR or related legislation, the Data Object Provider will remain the Data Controller. For data objects that are explicitly put into the public domain, the Data Object Provider should specify the licence under which such transfer is made.
The repository securely stores and processes data objects on behalf of the Data Object Provider but does not take independent decisions about making data objects that are under controlled access available to Requesters. If, however, the Data Object Provider has stipulated clear criteria for allowing access, the Repository Manager can check a request against those criteria and allow access if the criteria are met, in effect relieving the Data Object Provider of the burden of dealing with such requests.
Transfers of all data objects will be the subject of Data Transfer Agreements (DTAs), between the Data Object Provider and ECRIN. The DTA will specify the roles and responsibilities of the organisations involved and state the commitment of the Data Object Provider to submit the necessary metadata and pseudonymisation/anonymisation information for datasets.
Controlled access to datasets will be managed using Data Use Agreements (DUAs) to ensure that suitable safeguards are in place. The repository will provide a standard or default Data Use Agreement, though individual Data Object Providers may negotiate amendments to create their specific DUA.
The crDSR will hold both anonymised and pseudonymised data, with the classification being made by the Data Object Provider. For pseudonymised data the repository will not hold any linking or identifying data.
To mitigate privacy concerns linked to the secondary use of individual-level clinical trial data, the crDSR recommends that pseudonymised data be further processed after the completion of the clinical trial to ensure removal of any remaining personally identifiable information. Such data processing may involve date re-basing and removal of narrative text fields. This step is recommended but not mandatory, and the final responsibility for implementing it lies with the Providers. In all cases, the crDSR will require a summary of the privacy-preserving techniques applied.
To ensure FAIRness of data objects, all such objects should be linked to metadata on discovery, access and provenance. The Repository Manager will facilitate the adequate provision of metadata, making use of the ECRIN metadata schema for clinical research[1]. Data Object Providers will also be encouraged to ensure originating studies, including observational studies, are registered, which will make the provision of required metadata much easier.
Datasets should be associated with detailed descriptive metadata, describing the individual data points. Ideally this would be in a standardised format (e.g., CDISC Define-XML) but may be a simple spreadsheet-based ‘data dictionary’. The repository should reserve the right to insist on descriptive metadata being available – usually as a public data object, even if access to the dataset itself is controlled.
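The date re-basing and removal of narrative text fields mentioned in the principles above can be illustrated with a minimal Python sketch. It is not a crDSR tool: the field names, the choice of baseline date and the record layout are all hypothetical, and a Provider would adapt them to its own data dictionary.

```python
from datetime import date

# Hypothetical field names; a real dataset would follow its own data dictionary.
NARRATIVE_FIELDS = {"adverse_event_notes", "investigator_comment"}
DATE_FIELDS = {"visit_date", "adverse_event_date"}


def rebase_record(record: dict, baseline: date) -> dict:
    """Return a copy with dates turned into day offsets and narrative text dropped."""
    out = {}
    for key, value in record.items():
        if key in NARRATIVE_FIELDS:
            continue  # free text may contain identifying details, so it is removed
        if key in DATE_FIELDS and isinstance(value, date):
            # Store the number of days since the baseline instead of the calendar date.
            out[key.replace("_date", "_day")] = (value - baseline).days
        else:
            out[key] = value
    return out


record = {
    "participant_code": "P-0042",
    "visit_date": date(2021, 3, 15),
    "adverse_event_date": date(2021, 3, 20),
    "adverse_event_notes": "Free-text note written by the investigator",
    "systolic_bp": 128,
}
# Assumption: the baseline is a per-participant reference date such as enrolment.
rebased = rebase_record(record, baseline=date(2021, 3, 1))
print(rebased)
```

Relative day offsets keep the intervals between events, which are usually what a secondary analysis needs, while removing the calendar dates themselves.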
OPTIONS FOR SECONDARY USE
The crDSR provides different access options:
Public access, freely available data objects
Many data objects, documents in particular, are expected to be made freely available. No special procedures are necessary for these data objects – there should be a link to them from the metadata, and that link should lead to the document. Normally the document should be visible in the browser and downloadable from there.
Data objects with controlled access, managed solely by the Data Object Provider
This may be applicable, for example, if Data Object Providers have and wish to use their own Data Access Committee. For these Data Objects, users wishing access would be told, via the associated metadata, to contact the Data Object Provider directly.
Data objects with controlled access, managed by the crDSR
When requested by Data Object Providers, controlled access management can be delegated to the crDSR. In this case, the Providers need to stipulate explicit prerequisites, which are listed in the Data Transfer Agreement. Data Object Secondary Users contact the Repository Manager directly, who checks whether the prerequisites set by the Data Object Provider have been met (e.g., there is a protocol describing the proposed use of the datasets). Assuming the prerequisites are clear enough for their fulfilment to be unambiguous, the Repository Manager would then grant access to the data object(s), following the signature of a DUA.
Imposition of an embargo period on data objects with controlled access
Data Object Providers will be offered the option of setting an embargo period. This would extend to a fixed date. The embargo period should not exceed two years from the date the data objects were transferred to the crDSR.
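As an illustration of the two-year limit, the following Python sketch checks a proposed embargo end date against the transfer date. The function names are hypothetical, and two choices are assumptions rather than policy: an end date exactly two years after transfer is treated as acceptable, and a transfer on 29 February is given a limit of 28 February two years later.

```python
from datetime import date

MAX_EMBARGO_YEARS = 2  # limit stated in the policy


def latest_embargo_end(transfer_date: date) -> date:
    """Latest permitted embargo end date for a given transfer date."""
    try:
        return transfer_date.replace(year=transfer_date.year + MAX_EMBARGO_YEARS)
    except ValueError:
        # 29 February has no counterpart two years later; fall back to 28 February.
        return transfer_date.replace(
            year=transfer_date.year + MAX_EMBARGO_YEARS, day=28
        )


def embargo_is_acceptable(transfer_date: date, embargo_end: date) -> bool:
    """True if the embargo ends on or after transfer and within the limit."""
    return transfer_date <= embargo_end <= latest_embargo_end(transfer_date)


print(embargo_is_acceptable(date(2024, 8, 20), date(2026, 8, 20)))
print(embargo_is_acceptable(date(2024, 8, 20), date(2026, 8, 21)))
```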
DATA OBJECT TRANSFER
The Data Object Transfer Process, and the associated Agreement(s), are key both to maintaining the quality of the repository’s contents and to establishing the access regime for each data object transferred. To support this process, in outline:
Any initial enquiries need to be met with a clear explanation of the Data Object Transfer procedure, including the need for the provision of metadata.
The collection of metadata for each data object (and for the generating study or studies) is a necessary first step in the Data Object transfer process, as it characterises the objects to be transferred.
For each object, the access regime to be applied should be clearly indicated. For controlled access regimes, the nature of any prerequisites to be checked by the Repository Manager should be made clear.
For datasets with personal data, the data protection status of the datasets (anonymised or pseudonymised) should be made clear by the Data Object Provider.
The data object transfer is governed by a Data Transfer Agreement between ECRIN and the Provider, which will cover all data objects (not just the datasets). Even when data objects have been explicitly placed in the public domain, the Data Transfer Agreement should indicate the specific licence that applies to them. The specific access regime and other details described above should be included within supporting appendices to each Data Transfer Agreement.
The details of the data transfer process and the access requirements for each data object need to be stored in the repository.
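The per-object details listed above (access regime, prerequisites, data protection status, licence) could be captured in a simple record with a completeness check. The Python sketch below is only one possible shape: the class, field and message names are hypothetical and do not describe the repository's actual data model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class AccessRegime(Enum):
    PUBLIC = "public"
    CONTROLLED_BY_PROVIDER = "controlled, managed by the Provider"
    CONTROLLED_BY_CRDSR = "controlled, managed by the crDSR"


class ProtectionStatus(Enum):
    ANONYMISED = "anonymised"
    PSEUDONYMISED = "pseudonymised"


@dataclass
class TransferRecord:
    title: str
    access_regime: AccessRegime
    is_ipd_dataset: bool = False
    licence: Optional[str] = None
    protection_status: Optional[ProtectionStatus] = None
    prerequisites: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """List the details still missing before transfer can proceed."""
        issues = []
        if self.access_regime is AccessRegime.PUBLIC and not self.licence:
            issues.append("public object needs a licence")
        if (self.access_regime is AccessRegime.CONTROLLED_BY_CRDSR
                and not self.prerequisites):
            issues.append("crDSR-managed access needs explicit prerequisites")
        if self.is_ipd_dataset and self.protection_status is None:
            issues.append("dataset needs a data protection status")
        return issues


protocol = TransferRecord("Trial protocol", AccessRegime.PUBLIC)
print(protocol.problems())
```

Returning a list of problems, rather than raising on the first one, lets a Provider see everything that is missing in a single pass.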
SECONDARY USE
The details of the secondary use process will depend on the stipulations of the Data Object Provider, as written within the Data Transfer Agreement. For controlled access where the repository is involved:
The Repository Manager will check whether the stipulated prerequisites have been fulfilled by the Requester / Secondary Users. If previously instructed to do so, the Repository Manager will also pass the request to a Data Access Committee for its recommendation. If the Data Object Provider’s requirements are clearly met, the Repository Manager can make the requested data objects available to the Secondary Users. If not, the request will have to be relayed to the Data Object Provider for a final decision.
The secondary use of the controlled-access datasets is governed by a Data Use Agreement between the Requester and ECRIN. This will, amongst other things, impose restrictions on usage – normally only for the task explicitly described by the Data Object Secondary Users – and on any further dissemination. It will also prohibit any attempts to re-identify trial participants. To save time and effort, the repository will offer only limited flexibility in the wording of the Data Use Agreement.
The progress of the secondary use process needs to be recorded in the repository. This record is used to direct and document the granting of the necessary rights to access and / or download data objects.
The repository will make public the objective of the secondary use of repository data, making it possible for trial participants to be informed of the nature of the secondary research studies using their personal data. Secondary users will be encouraged to report any publication resulting from their re-use of the data objects.
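The handling of a controlled-access request described above can be sketched as a small routing function in Python. This is an interpretation, not the repository's procedure: the names are hypothetical, and two behaviours are assumptions, namely that a request with no stipulated prerequisites is relayed to the Provider, and that a negative Data Access Committee recommendation is also relayed rather than refused outright.

```python
from enum import Enum
from typing import Dict, Optional


class Outcome(Enum):
    GRANT_AFTER_DUA = "grant access once the DUA is signed"
    AWAIT_DAC = "wait for the Data Access Committee recommendation"
    RELAY_TO_PROVIDER = "relay to the Provider for a final decision"


def route_request(
    prerequisites_met: Dict[str, bool],
    dac_required: bool = False,
    dac_recommends: Optional[bool] = None,
) -> Outcome:
    """Decide the next step for a request handled by the Repository Manager."""
    # No stipulated criteria, or any unmet criterion: the case is not clear-cut.
    if not prerequisites_met or not all(prerequisites_met.values()):
        return Outcome.RELAY_TO_PROVIDER
    if dac_required:
        if dac_recommends is None:
            return Outcome.AWAIT_DAC  # recommendation not yet received
        if not dac_recommends:
            return Outcome.RELAY_TO_PROVIDER
    return Outcome.GRANT_AFTER_DUA


print(route_request({"protocol provided": True}).value)
```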