Machine learning reveals hidden features in histology images

18 July 2024

Researchers at Human Technopole developed a self-supervised machine learning model that combines histology, gene expression, and genetic variation to automatically identify and cluster distinct tissue substructures, cells, and pathological features in human tissues.

Histology is a technique that allows microscopic identification of different cellular components and structures in a tissue. Histological examination of tissues is paramount for accurately diagnosing diseases and provides crucial information in clinical diagnoses. Traditionally, pathologists examine stained tissue sections under a microscope. However, the advent of digitalisation and computational methods has made it possible to scan histology images at high resolution and to analyse them automatically using machine learning-based approaches. Recently, efforts have been made to match histology with molecular data, such as large RNA sequencing and Whole Genome Sequencing datasets, from thousands of samples. Combining this information would give important insights into how tissue structure and function vary in a population and how genetic variation and gene expression impact healthy and diseased tissues.

Research conducted by Francesco Cisternino, a PhD student in the lab of Dr Craig A. Glastonbury (The Glastonbury Group) at the Human Technopole Genomics Research Centre, has led to the development of a new machine learning model based on Vision Transformers (ViT) that learns to cluster and segment tissue automatically. The researchers combined histology, gene expression, and genetic variation data from more than 13,000 samples representing 23 healthy human tissues from 838 donors.

The study has now been published in Nature Communications.

By analysing gigapixel Whole Slide Images, the Group found significant intra-tissue variability across donors and identified unannotated pathologies such as calcification events, incorrect tissue assignment and tissue contamination. In addition, they discovered gene expression signatures of specific tissue substructures and revealed previously unknown genetic associations.

The researchers also developed RNAPath, a machine learning model that enables them to predict and spatialise gene expression levels from H&E histology images alone. RNAPath outperformed competing methods such as HE2RNA, a widely used deep learning model for predicting RNA-Seq expression from whole slide images.

In summary, this study shows that self-supervised machine learning methods applied to histological archives can yield new insights into disease pathology and tissue organisation, and can help researchers explore the interplay between morphological tissue variability and gene expression.

The research lead, Craig Glastonbury, commented, “As histological archives and pathology workflows become digital, we believe there is substantial opportunity for using self-supervised learning to uncover novel, fundamental biology about tissue structure, function and its variability in a population in both healthy and diseased subjects”.

Cisternino, F., Ometto, S., Chatterjee, S. et al. Self-supervised learning for characterising histomorphological diversity and spatial RNA expression prediction across 23 human tissue types. Nat Commun 15, 5906 (2024). https://doi.org/10.1038/s41467-024-50317-w

Image: RNAPath predicting the spatial location of CD19 expression across an H&E thyroid tissue section.
