Please use this identifier to cite or link to this item: http://theses.ncl.ac.uk/jspui/handle/10443/6665
Title: mitoML: Machine Learning to Understand Mitochondrial Disease Pathology
Authors: Khan, Mir Atif Ali
Issue Date: 2025
Publisher: Newcastle University
Abstract: Mitochondria are organelles that reside in virtually every cell of the human body and provide the energy that cells need to function. OXPHOS is the main metabolic pathway through which mitochondria generate energy; its machinery is made up of five complexes, each built from sub-units of multiple proteins and molecules. Defects in the OXPHOS machinery arise from genetic mutations and lead to mitochondrial disease. Mitochondrial diseases are currently untreatable due to our limited understanding of their pathology. The study of mitochondrial disease pathology involves discovering OXPHOS protein expression patterns linked to various genetic mutations. Mitochondrial disease affects cells with high energy demand, such as Skeletal Muscle (SM) cells (myofibres), so the expression of OXPHOS proteins is studied in myofibres taken from SM biopsies. These OXPHOS proteins are observed in SM tissue using imaging techniques such as Imaging Mass Cytometry (IMC). IMC produces high-dimensional (up to 40 channels) multiplexed pseudo-images representing spatial variation in the expression of a panel of OXPHOS proteins within a tissue, including sub-cellular variation. In previous methods, good-quality ‘analysable’ myofibres in these multichannel images are segmented and statistical summaries, such as mean protein expression, are computed per myofibre. Statistical summaries of groups of myofibres linked with different genetic mutations and of a healthy control group are compared to analyse and understand the OXPHOS protein expression patterns of various mitochondrial diseases. These methods have a number of limitations. i) Profiling OXPHOS protein patterns in high-dimensional data: because the multiplexed data are high-dimensional, it has not been possible to classify and discover the OXPHOS protein expression pattern for four of the five groups of genetic mutations affecting mitochondria that have been studied [1], i.e. classification accuracy was below 90% for all but one group. ii) Precise segmentation and curation of myofibres: existing applications cannot precisely segment and curate myofibres without heavy manual correction. iii) Loss of intra-myofibre features: the use of statistical summaries per myofibre ignores all intra-myofibre features, even though several hypotheses [2, 3] posit differential features within myofibres in various mitochondrial dysfunctions. In this thesis I use Machine Learning (ML), specifically logistic regression and XGBoost, and various Deep Learning (DL) methods to address these three limitations, with the following contributions. I) Classification of myofibres from mitochondrial disease patients affected by various genetic mutations, using explainable ML and myofibre statistical summaries. I show that with ML the classification accuracy for all five mutations exceeds 90%. I also demonstrate the use of explainable ML methods to discover the OXPHOS protein expression patterns associated with these high-accuracy ML models. II) A precise myofibre segmentation and curation pipeline: I developed ‘myocytoML’, a myofibre segmentation and curation pipeline that matches the quality of gold-standard manual human annotation. This work also led to NCL-SM, a large dataset of more than 50k manually annotated myofibres, which is now publicly available. III) Classification of myofibres from mitochondrial disease patients affected by various genetic mutations, using explainable DL and segmented multichannel raw images. I show that with DL the classification accuracy for all five mutations exceeds 98%. I also demonstrate the use of explainable DL methods to discover the OXPHOS protein expression patterns associated with these high-accuracy DL models.
Description: PhD Thesis
URI: http://hdl.handle.net/10443/6665
Appears in Collections: School of Computing

Files in This Item:
File                 Description   Size       Format
KhanMAA2025.pdf      Thesis        19.36 MB   Adobe PDF
dspacelicence.pdf    Licence       43.82 kB   Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.