Data management of sensitive human proteomics data: current practices, recommendations and perspectives for the future.

Publication Title

Molecular & cellular proteomics : MCP


washington; seattle; isb; genomics


Today it is the norm that all relevant proteomics data supporting the conclusions of scientific publications are made available in public proteomics data repositories. However, given the increase in the number of clinical proteomics studies, an important emerging topic is the management and dissemination of clinical, and thus potentially sensitive, human proteomics data. Both the United States and the European Union have legal frameworks protecting the privacy of individuals. Implementing privacy standards for publicly released research data in genomics and transcriptomics has led to processes that control who may access the data, so-called "controlled access" data. In parallel with technological developments in the field, it is clear that the privacy risks of sharing proteomics data need to be properly assessed and managed. As the proteome is directly derived from the genome, proteomics data can potentially reveal information as sensitive as that in nucleotide sequencing data. In this manuscript, we summarize the conclusions about this topic that have emerged from two meetings held in 2019 and some follow-up discussions, with a primary focus on data management practices. In our view, the proteomics community must be proactive in addressing these issues. Yet a careful balance must be kept. On the one hand, neglecting to address the potential for identifiability in human proteomics data could lead to reputational damage to the field, while on the other hand, erecting barriers to open access to clinical proteomics data will inevitably reduce re-use of proteomics data and could substantially delay critical discoveries in biomedical research. To balance these apparently conflicting requirements for data privacy and for efficient use and re-use of research efforts through the sharing of clinical proteomics data, development efforts will be needed at different levels, including bioinformatics infrastructure, policy making and mechanisms of oversight.


Institute for Systems Biology


Biomedical Ethics