Confidential health records of half a million UK Biobank volunteers were offered for sale on Chinese online marketplaces, an incident that has raised serious questions about the security and governance of sensitive medical data and the future of public trust in health research.
The de-identified data – which excluded names, addresses and precise dates of birth but still contained gender, age, month and year of birth, socioeconomic status, lifestyle habits, mental health history, cognitive function, physical measurements and details of health outcomes such as cancer diagnosis dates – was listed on the e-commerce platform Alibaba. UK Biobank informed the UK government on 20 April 2026 after three listings were identified. With support from both the UK and Chinese governments, Alibaba swiftly removed the listings. It is not believed any sales were completed.
The data had been made available to researchers at three academic institutions under contract. Those institutions have had their access suspended, as have the individuals involved. UK Biobank also temporarily suspended all access to its research platform and implemented stricter controls on file exports. Professor Sir Rory Collins, the body’s chief executive and principal investigator, described the listings as “a clear breach of the contract signed” by the institutions. “We are sorry that this incident has occurred and hope you are reassured by the swift and decisive action we have taken,” he said, adding that participants’ personally identifying information remains safe and secure and that there had been no hack or data breach of UK Biobank itself.
Yet this was not an isolated event. Professor Luc Rocher of the Oxford Internet Institute, who has tracked the repeated posting of Biobank data online, said this was the 198th known exposure of UK Biobank data since the previous summer. Previous exposures have occurred when researchers unintentionally uploaded datasets to code-sharing platforms such as GitHub; UK Biobank has issued numerous takedown notices. The sheer scale of these lapses has led to questions about whether security measures are adequate.
The controversy has also fuelled deeper concerns about consent. Rocher noted that when participants originally signed up they were told data would be used for non-profit research, and the discovery that it had been sold to industry or had appeared on third-party platforms undermines the basis on which they volunteered. Research suggests consent documents may not have adequately informed participants about the full range of lawful uses, including potentially controversial applications. Critics have also raised the risk of re‑identification: even with de‑identified data, the combination of demographic and health attributes can make individuals identifiable, especially with advances in artificial intelligence and data linkage. The Guardian has previously demonstrated re‑identification of a participant using limited personal facts.
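The re‑identification risk can be illustrated with a minimal sketch. It assumes a handful of entirely fabricated records and hypothetical field names (`sex`, `birth`, `diagnosis_month`), not UK Biobank's actual schema, and simply counts how many records become unique as more attributes are combined.

```python
from collections import Counter

# Fabricated records for illustration only: no names or addresses,
# just the kind of coarse attributes a de-identified dataset might keep.
RECORDS = [
    {"sex": "F", "birth": "1961-03", "diagnosis_month": "2019-07"},
    {"sex": "F", "birth": "1961-03", "diagnosis_month": "2020-01"},
    {"sex": "M", "birth": "1958-11", "diagnosis_month": "2019-07"},
    {"sex": "M", "birth": "1958-11", "diagnosis_month": "2019-07"},
    {"sex": "F", "birth": "1970-06", "diagnosis_month": "2021-02"},
]


def group_sizes(records, fields):
    """Count how many records share each combination of the given fields."""
    return Counter(tuple(r[f] for f in fields) for r in records)


def unique_fraction(records, fields):
    """Share of records whose combination of fields appears exactly once."""
    sizes = group_sizes(records, fields)
    unique = sum(1 for r in records if sizes[tuple(r[f] for f in fields)] == 1)
    return unique / len(records)


def k_anonymity(records, fields):
    """Size of the smallest group: k = 1 means someone is singled out."""
    return min(group_sizes(records, fields).values())


if __name__ == "__main__":
    for fields in (["sex"], ["sex", "birth"], ["sex", "birth", "diagnosis_month"]):
        print(fields, unique_fraction(RECORDS, fields), k_anonymity(RECORDS, fields))
```

In this toy data nobody is unique on sex alone, but three of the five records are unique once month of birth and diagnosis month are added. Anyone who already knows those facts about a person could single out their record, which is the concern critics describe.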
NHS data plans under the microscope
The Biobank affair comes at a delicate moment for the NHS, which is pursuing ambitious plans to digitise healthcare data. NHS England has announced a Single Patient Record, confirmed in the King’s Speech, that will consolidate medical history, test results, treatments and prescriptions into one place, accessible through the NHS app from 2028. Legislation will require GPs and hospitals to share patient data. A separate initiative, the NHS Federated Data Platform – built on software from the US company Palantir, awarded a £330 million seven‑year contract – is already live in 123 hospital trusts, according to NHS England, and is being used to coordinate theatres and waiting lists.
Proponents argue that joining up data is vital for continuity of care: a patient who arrives unconscious at A&E should not have to rely on memory or a delayed GP letter for clinicians to know they are allergic to a drug. Surveys suggest strong public appetite for such uses, with 95 per cent of patients comfortable with data being used to improve individual care and 83 per cent trusting the NHS to keep their data secure, according to a May 2024 NHS survey.
Yet incidents like the Biobank breach risk eroding that confidence. Trust has already been challenged by revelations that Palantir – co-founded by the controversial democracy sceptic Peter Thiel and involved with US immigration enforcement and military intelligence – has given staff access to patient data while working on the Federated Data Platform. Both NHS England and Palantir have said that data is accessed strictly in line with policies. However, internal briefings have acknowledged a “risk of loss of public confidence”, and the British Medical Association has advised doctors to limit engagement with the platform over concerns about Palantir’s track record. Some MPs have warned the arrangement is “dangerous”. Dr Nicola Byrne, the National Data Guardian, expressed profound concern over the Biobank incident, stressing the need for transparency, accountability and decisive action to maintain public confidence.
Rebuilding trust in health data projects
For the broader NHS data strategy to succeed, experts argue that rebuilding public trust is not optional – it is a prerequisite. Jon Baines, senior data protection specialist at law firm Mishcon de Reya, said: “Those who allow their data to be used for research must be able to trust that their rights will be safeguarded. As long as that trust can be achieved, then the NHS Data Strategy should not be too threatened by the concerns over exposure of Biobank information.”
Rocher pointed to successful programmes around the world that allow researchers to access very sensitive data – including financial and healthcare records – without making headlines, because they have not suffered security breaches. “We need to look at those gold standard approaches and try to understand what they’re doing differently to us – and adopt the elements of their model that work,” he said. In particular, a safe model is one where approved researchers can analyse sensitive datasets without being able to download the original files. Within the NHS, traceable systems that record who accessed which record, when, and for what reason would provide a similar safeguard. By maintaining that audit trail and demonstrating that access is properly controlled, it is possible to win back a sceptical public after an embarrassing incident.
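As a rough illustration of the traceable-access idea, the sketch below logs who accessed which record, when and for what reason. It is not a description of any NHS or UK Biobank system: the `AuditLog` class, its field names and the hash chaining used to make later edits detectable are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _digest(body):
    """Stable SHA-256 digest of an entry's contents."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class AuditLog:
    """Hypothetical append-only log of record accesses."""
    entries: list = field(default_factory=list)

    def record(self, user, record_id, reason, when=None):
        """Log one access; a stated reason is mandatory."""
        if not reason:
            raise ValueError("every access must state a reason")
        when = when or datetime.now(timezone.utc)
        body = {
            "user": user,
            "record_id": record_id,
            "reason": reason,
            "when": when.isoformat(),
            # Each entry stores the previous entry's hash, forming a chain.
            "prev": self.entries[-1]["hash"] if self.entries else "",
        }
        body["hash"] = _digest(body)
        self.entries.append(body)
        return body

    def verify(self):
        """Return True only if no entry has been altered or reordered."""
        prev = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True

    def accesses_to(self, record_id):
        """All logged accesses to one record, oldest first."""
        return [e for e in self.entries if e["record_id"] == record_id]
```

A patient or auditor could then ask for every entry returned by `accesses_to("patient-1")`, while `verify()` would flag any attempt to rewrite history after the fact. A real deployment would also need access control and secure storage for the log itself, which this sketch leaves out.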
Rocher expressed confidence that the public is capable of making nuanced judgments. “People are not stupid,” he said. “The public can see the difference between a good scheme and a scheme that has poor security practices.” That distinction will be critical as the NHS proceeds with its “Data Saves Lives” strategy, which aims to improve data sharing, transparency and data‑driven innovation. The strategy’s success depends on overcoming fragmented systems, inconsistent standards and a data literacy gap among staff, while ensuring that the ethical and equitable use of data is placed at the centre of the effort.
Baines noted that the Biobank incident, though damaging, need not derail the entire enterprise. “Few people would dispute the benefits of lawful and responsible use of data to drive health improvements and efficiencies,” he said. “And few could argue that UK Biobank does not present continuing huge potential benefits for the NHS and for health research more widely.” His call was for realism: “It is crucial that all involved are aware not just of the risks, but of the technological and legal complexities.” UK Biobank is conducting a comprehensive investigation and implementing additional security measures, while Professor Collins has apologised and reassured volunteers. Whether that will be enough to restore confidence depends on whether the lessons from both the Biobank breach and the global gold standards are learned – and applied – before the next exposure occurs.
