OUR MISSION STATEMENT
Transforming diagnostics with machine learning and artificial intelligence
Machine learning and artificial intelligence have the potential to transform nearly every facet of healthcare, and prognostics/diagnostics for acute care are no exception. When coupled with rapid profiling of patient host response from blood, ML/AI-driven models promise earlier and more accurate detection of the presence and type of infection, timely stratification of patients by the severity of their condition, and characterization of patients into molecular subtypes more amenable to different treatment strategies.
Experience brings novel approaches
At Inflammatix ML, we understand that bringing world-class ID diagnostics to the point of care demands a holistic approach. Our highly interdisciplinary team draws on decades of experience in academic and industrial science, with backgrounds spanning applied statistics, computer science, engineering, bioinformatics, and software development. We pursue both methodological and applied research directions, developing novel models and methods when the need arises but also investigating the feasibility of the ‘tried-and-true’ as well as the ‘latest-and-greatest’.
We prioritize sharing our work and insights with the broader research community in both publications and international conferences. We continue to grow and curate one-of-a-kind patient datasets, leveraging information at multiple scales to guide selection of high-performing biomarkers and diagnostic classifiers. We engage stakeholders across the company and the medical community to help our test achieve leading performance and optimal integration with clinical workflows and decision-making.
Areas of research
Systems for diagnostic classifier development
Patient host response, as measured by expression of targeted mRNA biomarkers from blood, can vary from patient to patient. Previous approaches have developed host response signatures and classifiers that tend to generalize poorly, owing to biomarker selection and classifier training and validation on a limited, unrepresentative set of patient observations from a single study or hospital. Pooling of data across multiple studies has proven effective in producing more generalizable host response signatures and classifiers but introduces other methodological challenges.
For example, before being used in standard classifier development, our multi-cohort data must first be co-normalized to minimize spurious variation unrelated to our classification tasks. Also, the manner in which we select our classifiers must move beyond random cross-validation to reflect the structure and heterogeneity in our patient population and to provide more realistic estimates of generalization performance. In addition, our training data have been profiled with multiple assay platforms, none of which are fast enough to enable clinically actionable turnaround times for indications like sepsis.
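To illustrate what co-normalization can look like, here is a minimal sketch that z-scores each feature within each study, removing study-specific location and scale effects unrelated to the classification task. This is only an illustration with made-up function names; real pipelines often rely on more sophisticated batch-effect correction methods (e.g. ComBat).

```python
import numpy as np

def conormalize(X, studies):
    """Illustrative co-normalization: z-score each feature within each study.

    X: (n_samples, n_features) expression matrix.
    studies: length-n_samples array of study/cohort labels.
    Removes per-study shifts and scalings so pooled data are comparable.
    """
    X = np.asarray(X, dtype=float)
    studies = np.asarray(studies)
    X_norm = np.empty_like(X)
    for s in np.unique(studies):
        mask = studies == s
        mu = X[mask].mean(axis=0)
        sd = X[mask].std(axis=0)
        sd[sd == 0] = 1.0  # guard against constant features within a study
        X_norm[mask] = (X[mask] - mu) / sd
    return X_norm
```

After this transformation, each feature has zero mean and unit variance within every study, so downstream classifier development is less likely to latch onto study-of-origin signal.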
Our biomarker signatures and classifiers must be able to generalize to measurements obtained on more deployment-ready platforms, possibly with limited or no access to data from those platforms at training time. We addressed these challenges in an important proof-of-concept study (Mayhew et al., 2020a) that helped systematize our process of diagnostic classifier development.
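The idea of moving beyond random cross-validation toward splits that reflect cohort structure can be sketched as leave-one-study-out splitting: each validation "fold" is an entire held-out study, so the estimate reflects performance on an unseen cohort rather than on randomly shuffled samples. The function below is an illustrative sketch, not production code.

```python
import numpy as np

def leave_one_study_out(studies):
    """Yield (study, train_idx, test_idx) triples, holding out one
    entire study at a time. Because no samples from the held-out study
    appear in training, the resulting performance estimates better
    reflect generalization to new cohorts or hospitals."""
    studies = np.asarray(studies)
    for s in np.unique(studies):
        test = np.flatnonzero(studies == s)
        train = np.flatnonzero(studies != s)
        yield s, train, test
```

scikit-learn's `LeaveOneGroupOut` and `GroupKFold` implement the same idea for use inside standard model-selection loops.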
Hyperparameter optimization of diagnostic classifiers
An essential step in classifier selection is hyperparameter optimization, or the identification of values for a classifier’s hyperparameters that optimize some objective function (e.g. performance in cross-validation). While conventional methods of grid search and random sampling can still be effective, more sophisticated approaches such as Bayesian optimization (Snoek et al., 2012) and Hyperband (Li et al., 2018) have demonstrated gains in terms of both performance and efficiency (i.e. producing high-performing classifiers with fewer evaluations of candidate hyperparameterizations) on a range of ML benchmarks.
However, these ML studies tended to focus on computer vision tasks and to use large-scale, fairly homogeneous datasets, making their insights less applicable to our setting. They also considered only performance in internal validation (e.g. cross-validation or validation on a small, held-out subset of the training data), rarely if ever evaluating selected classifier performance on a completely different dataset (external validation). Moreover, few such hyperparameter optimization studies have been conducted in the healthcare domain. In a first-of-its-kind benchmarking analysis for molecular diagnostics, we compared the internal and external validation performance of classifiers selected by conventional approaches such as grid search and random sampling as well as by Bayesian optimization (Mayhew et al., 2020b).
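The random-sampling baseline against which Bayesian optimization and Hyperband are measured can be sketched in a few lines: draw each hyperparameter independently from its search space, evaluate a user-supplied objective (e.g. cross-validation performance), and keep the best configuration. All names here are illustrative.

```python
import random

def random_search(objective, space, n_trials=50, seed=0):
    """Random hyperparameter search (illustrative sketch).

    objective: callable mapping a dict of hyperparameters to a score
               to maximize (e.g. mean cross-validation AUROC).
    space:     dict mapping hyperparameter names to candidate values.
    Returns the best (params, score) pair found over n_trials draws.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Bayesian optimization improves on this by fitting a surrogate model of the objective and proposing promising configurations; Hyperband instead adaptively allocates training budget, discarding weak configurations early.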
Fairness in diagnostics
Recent studies have increasingly shown that deployed ML systems can exhibit performance disparities associated with demographic characteristics of the populations those systems are intended to serve. These disparities can have harmful downstream consequences when system outputs are used to guide or replace human decision-making, and they can arise for reasons such as low representation of certain subgroups in training data or choices of features or training objectives that accentuate such disparities.
These important observations dovetail with decades of research in genetics and epidemiology on the potential confounding effects of demographic variation on associations between biological variables and clinical outcomes. Indeed, the biomarkers we select from multi-cohort data likely carry associations with demographic characteristics of different patient subpopulations that could have downstream impacts on the performance of our diagnostic classifiers. We are actively developing resources and techniques as part of an end-to-end framework to audit and mitigate subgroup disparities in our classifiers, ensuring equitable as well as high performance across our diverse patient population.