Spotlight on: Informatics

What is informatics?

Informatics refers to the collection, management, analysis and combination of large, and often complex, datasets. In the field of health and biomedicine, these datasets come from a variety of sources; for example, electronic patient records, population studies, clinical trials, imaging, and large-scale cellular and molecular studies. The volume of data from these sources is growing constantly, creating exciting opportunities for health research on a scale that has previously not been possible.

The value of using data

The world creates 2.5 quintillion bytes of data every day, equivalent to over 150,000 iPads’ worth of information. In medical research, large and complex datasets contain huge amounts of information. Using this data in research is incredibly valuable, particularly when different datasets are linked together, as it can help us to understand the causes of disease, develop more effective treatments and improve our health services.

The key to this type of research is that it can be carried out on a large scale and reveal insights that would otherwise remain hidden. In clinical and population studies, huge numbers of people – sometimes in the millions – can be followed over many years to provide rich information on their health and behaviour. At the same time, large-scale studies in genetics and imaging are providing us with important information on disease mechanisms and the behaviour of cells and organs. These datasets are also offering opportunities to model complex bodily systems using computational approaches, which will allow us to understand how different components, such as cells, interact and underlie the behaviour of the whole system. Ultimately, this can help to shed light on the complex interaction between biological and lifestyle factors that cause disease, and how the body responds to different treatments.

The UK has a unique research advantage in this area. The NHS contains a wealth of patient information on the whole UK population throughout the life course. Researchers can analyse this data securely and link it with genetic and biological data to gain new medical insights. No other country has this amount and type of data held by a single healthcare provider.

Global IT companies – such as Microsoft, IBM and Google – are also increasingly involved in the field of personal health, healthcare services and research and development. This is introducing new skills and technologies to help improve our understanding of health and disease, and to create smarter and more cost-effective healthcare services.

Informatics in research

The use of informatics is extremely valuable in medical research, and allows researchers to make discoveries that could not be achieved using other methods.

Recently, researchers from the Wellcome Trust-MRC Cambridge Stem Cell Institute and Microsoft Research created the first computer model that can simulate blood cell development. The team based this on measurements of gene activity in over 3,900 blood stem cells. Having first captured how the various blood cells in the body develop, the model can then be used to explore how blood cancers form and to find new treatments for them. “Because the computer simulations are very fast, we can quickly screen through lots of possibilities to pick the most promising ones as pathways for drug development,” says Professor Bertie Göttgens.

Informatics research can also provide insight into how a drug works and who is most likely to benefit from taking it. In 2009, a group of MRC scientists brought together and analysed data from several health datasets to explore whether the drug metformin can protect against cancer. Metformin is widely used by patients with type 2 diabetes to control their blood sugar levels, but it also appears to be linked to a lower occurrence of cancer. Their research showed that metformin does reduce cancer rates in diabetic patients by activating a cancer suppressor gene. This finding could have important implications in the fight against cancer.

Another MRC researcher, Professor Ian Douglas from the London School of Hygiene and Tropical Medicine, has used informatics to test whether the drug orlistat is safe for patients to take. Orlistat is commonly prescribed to treat obesity, but there have been reports that it may be associated with liver problems. Ian linked the health records of around 100,000 patients over a period of more than 10 years to show that there is no evidence that orlistat increases the risk of developing liver problems.

Finally, the use of informatics in research can help us to look at the impact of new health policies. Professor Jill Pell is an MRC-funded scientist in Glasgow who has used informatics to look at the impact of the 2006 Scottish smoking ban on pregnancy complications. By analysing 14 years’ worth of data on over 700,000 births, Jill was able to show that the number of preterm and low-birth-weight babies dropped in the three years after the ban was introduced. Her work strongly suggests that the smoke-free legislation has had a positive impact on this area of public health.

MRC researchers, Professor Jill Pell and Professor Liam Smeeth, discuss the importance of sharing patient records to improve our health and healthcare in an MRC Network magazine article (page 22).

These are just a few examples of research which has analysed biomedical information and linked large health-related datasets to improve our understanding of biology and make important medical discoveries. To read more examples like these, visit the Farr Institute for Health Informatics Research website.

The MRC has also been involved in a campaign to raise awareness of the benefits of using health records in research.

How is the MRC involved?

MRC activity in informatics

The MRC is committed to making the UK a world leader in informatics research. In partnership with other major funders we have awarded over £100 million in this area over the past few years to support several nation-wide initiatives. These initiatives are supporting pioneering research using health and biomedical datasets, developing the expertise and skills needed to analyse and link complex data, and providing the best tools, technologies and environments for scientists.

Working in partnership

Together with nine partners encompassing charities, research councils and government departments, we have invested £40 million to establish the Farr Institute of Health Informatics Research. The Farr Institute has brought together medical, population and data scientists from across the UK to analyse and link large health datasets. This research will help to benefit the health of patients and the public, and will boost the UK’s reputation as a world leader in research using electronic health data. The Farr is made up of 24 universities across the UK and has four central ‘hubs’ in London, Dundee, Manchester and Swansea.

The Farr Institute also runs the Farr Network, formerly known as the UK Health Informatics Research Network (UKHIRN), which was set up in 2013 to bring together informatics expertise from across the UK. The Farr Network will help to coordinate career development opportunities for scientists, develop standards to improve the way data is linked in research, and build partnerships between industry, the NHS and the public.

Recent technological developments in analysing genomic information have offered unique opportunities to understand how a person’s DNA is linked to disease, and how it can be used to better diagnose and treat patients.

The UK Government is now capitalising on these developments through its strategy to sequence 100,000 whole genomes, called The 100,000 Genomes Project, and to introduce genomic medicine into the NHS. The sequencing is focusing on rare inherited diseases, cancers and pathogens, with a vision to support the linkage of this data to patients’ health records and other phenotypic data. The project is being led by Genomics England (GeL), which was initially set up through a £100 million investment from the Department of Health. The MRC has recently pledged £24 million to support these ambitions and help establish the GeL research Data Centre to manage, link and analyse genomic and health information. 

Looking beyond the UK, the MRC jointly supports the European Bioinformatics Institute (EBI) through the European Molecular Biology Laboratory partnership. EBI provides freely available data from life science experiments, performs basic research in computational biology, and offers training to support researchers in academia and industry. Similarly, the €200m ELIXIR initiative aims to bring together broader collections of molecular and cellular research data produced across Europe. The MRC, together with the Biotechnology and Biological Sciences Research Council (BBSRC) and the Natural Environment Research Council (NERC), is funding the ELIXIR ‘UK-hub’, supporting research and training in how to manage and analyse large datasets. 

An integrated approach

Until recently, the clinical and biomedical research communities have largely worked independently of each other. One of our goals is to “join up” the broad field of informatics to link detailed biology with human health and disease. In 2014 we invested a further £39 million in six major collaborations (led by Imperial, Leeds, MRC/UVRI Uganda Research Unit on AIDS, Oxford, Warwick, and UCL) that will support the development of tools, infrastructure and careers. Together, these collaborations will improve the analysis of molecular, phenotypic, microbial and exposure data, and its linkage to clinical and health records.

Another example of this joined-up approach is UK Biobank, which the MRC established in 2005 in partnership with other charities and government funders, and which to date represents a total investment of £143 million. UK Biobank collects biological samples and genetic, imaging and lifestyle data from half a million people and links this to their medical records. This has created a unique global resource of unprecedented scale that will continue to grow as the participants are followed up over many years. UK Biobank data is being made available for use by the wider research community with the aim of helping prevent, diagnose and treat a wide range of diseases.

Building capacity

In order to make the most of the data being generated in health and biomedical research, we need scientists who have the skills and expertise to link and analyse large datasets. We have recently partnered with the Engineering and Physical Sciences Research Council (EPSRC) to support three multi-disciplinary Centres for Doctoral Training in biomedical informatics, based at the Universities of Oxford and Warwick. The MRC also has an ongoing Skills Development Fellowship programme supporting early career researchers to develop quantitative expertise. This covers mathematics, statistics, computation and informatics applicable to any biomedical or health-related data sources, from molecular to population level.

Safe data

We are committed to making sure that our researchers handle patient data responsibly. This means working in secure and trusted environments that protect participant confidentiality. The MRC Good Research Practice Guide sets out the guidelines and standards that researchers should follow to make sure that their work adheres to legal, ethical and regulatory requirements. We are also currently working with other UK research funders to make sure there is a consistent approach to sharing research data in a safe and ethical way, which promotes excellent research.

Finally, we are working with international partners through the Global Alliance for Genomics and Health (GA4GH) to develop a common framework that will allow genomic and clinical data to be shared effectively and responsibly. This includes policies on consent, privacy and data governance, approaches to data storage and standards to promote interoperability.