EMBL-EBI Team Leader Sandra Orchard reflects on a life in protein science and biocuration
Born in a small farming village in the north-west of England, Sandra Orchard didn’t grow up imagining a life in molecular biology. Her parents worked in agriculture, and encouraged her to go down the science route. “At a careers evening organised by my school, I got talking to someone representing biochemistry and I thought it sounded interesting,” Orchard remembers.
This conversation set a course. After a biochemistry degree at the University of Liverpool, Orchard joined pharmaceutical company Roche as a graduate biochemist. “Scientists were just starting to understand cellular signalling in those days, and we were able to do a lot in the lab, from setting up assays to compound screenings, etc.,” she said.
Over the 18 years Orchard spent at Roche, she became interested in kinase biology and drug target identification. Kinases are enzymes that add chemical groups called phosphates to other molecules. Like a molecular switch, kinases can cause molecules to become either active or inactive in the cell. Because of this key regulatory role, kinases are important targets for drug development.
“Towards the end of my time at Roche, the bench science I was doing started to feel more like a conveyor belt, and I became interested in the new wave of bioinformatics,” said Orchard. When the UK site of Roche shut down, Orchard was made redundant, but soon after started working at EMBL-EBI as a contractor, one day a week, for the then Amaze pathways database, before becoming a full-time SwissProt curator.

What are SwissProt and UniProt?
SwissProt is the manually curated part of UniProt – a suite of databases that acts as the leading global data resource for protein sequence and functional information. UniProt is a collaboration between EMBL’s European Bioinformatics Institute (EMBL-EBI), the Swiss Institute of Bioinformatics (SIB), and the Protein Information Resource (PIR). According to a recent cost-benefit analysis, UniProt provides a benefit of up to €565 million per year to its scientific users.
“When I arrived at EMBL-EBI, there were fewer than 100 people there. I was part of a big recruitment drive, and you could sense that the place was about to grow,” said Orchard. In those early years, she helped expand the human proteome in UniProt from 8,000 entries to over 20,000. “We didn’t know we were aiming for 20,000 proteins, but there was a sense that it was achievable. It was an exciting time – feeling that you were contributing to building something foundational,” she said.

Orchard also became deeply involved in data integration efforts across multiple EMBL-EBI resources, from InterPro to IntAct, Gene Ontology, and the Complex Portal. Largely as a result of this broad exposure to a number of protein databases, she worked with the nascent EMBL-EBI Training team to establish the EMBL-EBI roadshows, which enabled users around the world to access open data resources.
“I had a close call during the first-ever such event at Purdue University in the US,” remembers Orchard. “I was presenting during a massive thunderstorm when a bolt of lightning hit the cable of the microphone I was using – fortunately only frying that and not the presenter!”

In 2015, Orchard became a Team Leader, first for the Molecular Interaction team and then in 2017 for UniProt, focusing on protein function content and working closely with her counterpart – Maria Martin, who coordinates protein function development – and with Alex Bateman, Senior Team Leader for protein sequence resources.
Some of Orchard’s proudest moments came from those early collaborations, including the official completion of the human proteome in 2008: “It was exhausting for those working on human entries, but it felt like we’d achieved something monumental.”
Another highlight was helping to build IntAct, EMBL-EBI’s open data resource for molecular interaction data. “My colleague Henning Hermjakob had sketched it out on paper. I entered the first curated data into IntAct, and wrote the first curation manual for it – two pages of A4. It’s now about 90 pages long,” she said. She also initiated work on the Complex Portal, a reference resource of protein complexes.

In 2023, Sandra Orchard received the Lifetime Achievement Award for exceptional contributions to biocuration from the International Society for Biocuration. This is a fitting recognition of the critical part she played in developing open data resources and data standards, and of the impact of her scientific contributions.
Here, in her own words, are some of the memories and lessons that Orchard is taking away from her 20+ years at EMBL-EBI, along with some valuable advice for young researchers.

Pioneering standards in proteomics
“I was walking down a corridor when Rolf Apweiler, who was leading UniProt at the time, poked his head out the door and asked me to take the minutes at a meeting. That impromptu assignment became the start of the Proteomics Standards Initiative (PSI), launched under the Human Proteome Organisation (HUPO).”
“Those first meetings about proteome standards were intense – 60 people crammed into two small rooms with no air conditioning, during an unusually hot April. But everyone agreed that standardisation was essential if proteomics was going to progress.”
“More than two decades later, the HUPO-PSI and related initiatives underpin essential resources like PRIDE, ProteomeXchange, and IMEx, ensuring that proteomics data can be reused around the world. Persuading scientists to use data standards is still an ongoing effort, but a worthwhile one.”
“People feel passionate and protective about their work – and rightly so. In my experience, when developing standards, the devil is in the details, but you can usually find a compromise. It helps to have plenty of tea breaks and cake to keep everyone amenable.”

On AI and the future of biocuration
“Tools like text mining have improved enormously, even before the latest AIs came along. Now AI is highlighting just how important biocuration is – because AI systems can’t learn without curated data.”
“Humans are still essential in biocuration. Scientists tell stories in papers, and sometimes the data are secondary. We need people who can understand the leaps a scientist makes when writing a paper, and fill in the gaps.”
“In ten years, AI is likely to be taken for granted and deeply integrated into scientific workflows. Although biocuration is critical, it’s still an unsung hero of data-driven biology, and difficult to get funding for.”

Words of advice
“To young scientists, I would say get as much varied experience as you can and don’t skip the bench work.”
“We need to secure sustainable funding for the core resources that the global scientific community relies on. Start new databases by all means, but we need to make sure we support the ones that the research community already depend upon.”
“We’ve built something extraordinary at EMBL-EBI: resources that the world depends on. We need to keep that going even when geopolitics becomes more fragmented.”

