Half a million UK Biobank participants’ health data was listed for sale online on the Chinese e-commerce platform Alibaba, the government has confirmed, in what technology minister Ian Murray called an “unacceptable abuse” of data. The breach, disclosed to ministers on Monday, involved de-identified information drawn from all 500,000 volunteers who had contributed biological samples and lifestyle records to the world’s most comprehensive biomedical database.
The breach
Ian Murray told the House of Commons on Thursday that UK Biobank had informed the government of three separate listings on Alibaba, at least one of which appeared to contain data from every participant. Additional listings offered support for researchers seeking legitimate access to the database, or analytical services for those who already held approved access. The minister said the platform, Alibaba, did not believe any purchases had been made before the listings were taken down, following joint action by the UK government, Chinese authorities and the company.
The data offered for sale did not include names, addresses, telephone numbers or other direct contact details, Murray confirmed. Instead, it consisted of the de-identified datasets that UK Biobank routinely makes available to approved researchers. The incident is understood to be a breach of contract by three academic institutions that had been granted access, rather than a sophisticated cyber-attack.
Government response
The government moved swiftly after being alerted on Monday. “Once the government was made aware, we took immediate action to protect participants’ data,” Murray told MPs. He thanked the Chinese government for its co-operation in having the listings removed. UK Biobank has revoked data access for the three research institutions identified as the source of the leak, and the charity has temporarily suspended all further access to its research platform. Strict limits on the size of downloadable files are being imposed, and every exported file will now be monitored daily.
Murray also confirmed that UK Biobank has referred itself to the Information Commissioner’s Office (ICO), which is making its own enquiries. A comprehensive, board-led forensic investigation is under way. The minister said he could not offer a “complete guarantee” that no individual could ever be re-identified from the exposed data, but added that re-identification would have to be done in “a very advanced way”.
What data was compromised – and the risks to participants
The datasets on sale contained fields that could include gender, age, month and year of birth, socioeconomic status, lifestyle habits, and measures from biological samples. While stripped of direct identifiers such as name or address, experts warn that such a rich combination of attributes can, in the age of cross-referencing and artificial intelligence, be used to re-identify individuals.
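The re-identification risk that experts describe can be illustrated with a minimal sketch. It counts how many records in a small table are the only ones with their particular combination of indirect attributes, which is the idea behind a k-anonymity check. The records, field names and values below are invented for illustration and are not UK Biobank data, and the approach is a common textbook measure rather than a method attributed to anyone quoted here.

```python
from collections import Counter

# Hypothetical toy records: field names and values are illustrative only.
records = [
    {"sex": "F", "birth_year": 1950, "birth_month": 3, "deprivation_band": 2},
    {"sex": "F", "birth_year": 1950, "birth_month": 3, "deprivation_band": 2},
    {"sex": "M", "birth_year": 1950, "birth_month": 3, "deprivation_band": 2},
    {"sex": "M", "birth_year": 1962, "birth_month": 11, "deprivation_band": 4},
    {"sex": "F", "birth_year": 1958, "birth_month": 7, "deprivation_band": 1},
]

# Indirect attributes that are harmless alone but may single someone out together.
QUASI_IDENTIFIERS = ("sex", "birth_year", "birth_month", "deprivation_band")


def group_sizes(rows, keys):
    """Count how many rows share each combination of the given attributes."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_fraction(rows, keys):
    """Return the share of rows whose attribute combination appears only once."""
    sizes = group_sizes(rows, keys)
    unique = sum(1 for row in rows if sizes[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


def smallest_group(rows, keys):
    """Return k, the size of the smallest group (the table is k-anonymous)."""
    return min(group_sizes(rows, keys).values())
```

In this toy table nobody is unique on sex alone, but adding year of birth, month of birth and a deprivation band leaves three of the five records in a group of one. Each added attribute can only split groups further, which is why rich datasets remain hard to anonymise even without names or addresses.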
Professor Luc Rocher of the Oxford Internet Institute noted that this was the 198th known exposure of UK Biobank data since last summer, and that some material remained available online for download despite the latest takedown. His own website tracks previous exposures, many of which, according to a Guardian investigation in March 2026, resulted from researchers inadvertently posting datasets on public repositories such as GitHub. UK Biobank has previously revoked access for institutions – including Yale University – over similar breaches.
Professor Elena Simperl of King’s College London said the incident was “not a moment to point fingers, but to take seriously what it tells us about national data infrastructure”. She described the leak as “an infrastructure problem, not the result of a complex cyber attack”, adding that the costs of maintaining such flagship data stewardship projects are too often seen as an afterthought.
Concerns about the trustworthiness of data handling were echoed by Conor O’Neill of OnSecurity, who said that data protection failures are frequently caused by a “cultural gap between policy and practice” among researchers rather than malicious intent. Kirsty Gouldsmith of the law firm Spencer West questioned how a breach of this magnitude could occur, stressing that the public need a clear explanation of what happened and what steps will prevent a repeat.
Why UK Biobank matters – and what happens next
UK Biobank is the world’s most comprehensive dataset of biological, health and lifestyle information. It was established to advance medical research, and scientists across the globe may apply to use its de-identified data for studies judged to be in the public interest. The data has so far been cited in more than 18,000 peer-reviewed papers and has already contributed to improvements in the detection and treatment of dementia, cancers and Parkinson’s disease.
All participants were aged between 40 and 69 when they enrolled between 2006 and 2010, and their long-term health is tracked to help researchers understand, prevent and treat serious illnesses. Professor Sir Rory Collins, UK Biobank’s chief executive and principal investigator, apologised to participants in a statement, saying “your personally identifying information in UK Biobank is safe and secure” and emphasising that the data offered for sale was de-identified and that the listings were removed before any purchases occurred.
To prevent a recurrence, UK Biobank is developing an automated checking system designed to block bulk downloads of de-identified participant data, with a target of operational deployment by the end of 2026. The government, meanwhile, plans to issue new guidance on the control of data derived from research studies. Professor Andrew Morris of HDR UK praised the rapid, joined-up response but stressed that public trust in data handling is foundational to such research advances.
