Deep Learning Enables Early-Stage Prediction of Preterm

Abstract


Introduction
Preterm births (PTBs) are live births that occur before 37 weeks of pregnancy and are a major public health concern worldwide. It is estimated that about 15 million babies are born pre-term globally each year, putting the global PTB rate at about 11% (Blencowe et al., 2012). PTB is among the leading causes of neonatal mortality and morbidity, especially in low- and middle-income economies (Perin et al., 2022). The global PTB rate is also on a steady rise, thus making it a significant burden (Chawanpaiboon et al., 2019). Approximately 18% of the deaths among children under the age of 5 years happen within the first 28 days of life and can be attributed to complications arising from PTB (Walani, 2020). Additionally, it can lead to long-term health complications such as respiratory illnesses, neurodevelopmental disorders and learning disabilities, arising from the developmental issues associated with PTB (Townsi et al., 2018; Chung et al., 2020). PTB is also not a problem that is specific to underdeveloped or developing countries, although the ill-effects of it may be more pronounced in low-income countries. Incidences of PTB are also found in high-income parts of the world, albeit with a lower frequency, and the rates of PTB have not been on the decline either in most parts of the world (Chawanpaiboon et al., 2019).
The pathophysiology behind PTB is not completely understood yet, although certain risk factors, including but not limited to smoking habits, alcohol intake and reproductive history, have been identified to be associated with increased pre-term delivery risk (Blencowe et al., 2012; Pfinder et al., 2013; Stock et al., 2020). The ill-effects of PTB can be mitigated, and a healthy, full-term gestation outcome may also be achieved if appropriate interventions are administered (Newnham et al., 2014). Their success, however, depends on identifying at-risk subjects at earlier stages of their pregnancy, as these approaches are effective when administered during the earlier stages (Blencowe et al., 2012; Newnham et al., 2014). Current methods for assessing pre-term pregnancy outcomes involve the use of physical and biochemical markers, which do not accurately predict whether a pregnancy will end prematurely (Georgiou et al., 2015).
Many non-pathogenic bacteria, viruses and fungi inhabit various areas of the human body, such as the gut, mouth, and reproductive tracts (Sender et al., 2016), and are collectively referred to as the "microbiome". Microbiomes are essential for normal functioning of the respective organs, maintain a symbiotic relationship with the human body and drive key biochemical reactions (Gordon et al., 1971), and dysregulated microbiomes are often implicated in various diseases. These microbial communities are also present in the reproductive tracts, and have been reported to influence the pregnancy outcome (MacIntyre et al., 2015). There is evidence linking the composition of vaginal microbiomes to risk of PTB, and the abundance levels of specific microbiota, such as various species of the Lactobacillus genus, have the potential to be indicative of PTB even at earlier stages of pregnancy (Brown et al., 2019; Romero et al., 2014). Vaginal microbial communities can be categorized into specific Community State Types (CSTs), which are typically characterized by abundances of various Lactobacillus species (Romero et al., 2014). CSTs are associated with increased or decreased risk of abnormalities such as Bacterial Vaginosis (BV), Urinary Tract Infections (UTIs) and even PTB (Gudnadottir et al., 2022). Moreover, alpha-diversity indices, such as Shannon and Simpson diversity, which can quantify the diversity of vaginal microbiota, have been harnessed for predicting PTB (DiGiulio et al., 2015; Haque et al., 2017). However, the vaginal microbiome differs considerably from individual to individual, especially across races (Sun et al., 2022; Gupta et al., 2017). Additionally, the microbial abundance may further vary depending on the sequence processing methods used on 16S ribosomal RNA (rRNA) data, which is typically used to estimate taxonomic abundance at various levels of classification (Bharti et al., 2019). Consequently, the success of diversity indices for estimating PTB risk may be specific to certain cohorts, or be influenced by the sequence processing pipeline, and may therefore not translate across cohorts, as is our observation in this study. Machine Learning (ML)-based approaches have also been explored in this context, which leverage features such as abundance of various taxa, phylotype counts, CST of the vaginal microbiome, age, race and more, for PTB risk assessment.
The vaginal microbiome evolves as the pregnancy progresses (MacIntyre et al., 2015; Romero et al., 2014), and the numerous changes that it undergoes may contain a signature for identifying PTB risk. Currently, there is a severe lack of approaches that exploit the temporal dynamics of vaginal microbiomes by treating PTB risk assessment as a time-series problem, and most current predictive methodologies use static data. Deep learning approaches such as recurrent neural networks (RNNs) have previously been used for modeling the dynamics of gut and other microbiota in various contexts (Baranwal et al., 2022; Fung et al., 2023; Medina et al., 2022) and have found success. To the best of our knowledge, such methods have not been applied to vaginal microbiota, especially in the context of PTB risk assessment, so far. This may be partly attributed to the fact that RNN-based approaches demand data sampled at regular intervals, which is challenging to collect as study subjects are often irregular in clinical visits. With this study, we present a deep learning-based approach, "neural controlled differential equations (CDEs)", that is capable of differentiating between term and preterm births using time-series vaginal microbiome data, and which overcomes the dependence on regularly sampled microbial data. We also highlight the limitations of alpha diversity indices and traditional ML methods for PTB prediction in racially and ethnically diverse patient cohorts. We show that modeling the temporal dynamics of microbiota using deep learning methods results in more reliable PTB risk scoring than simple ML-based methods. Our best model, utilizing neural CDEs, outperforms any ML-based PTB prediction approach reported so far. On the basis of this study, we show the potential of vaginal microbiota for PTB prediction, and that such approaches can be pushed towards complete clinical viability with further efforts.

Dataset
We obtained 16S rRNA sequences collected from human patient samples. The data was sourced from a previously published study by Callahan et al. on refinement of a vaginal microbiome signature of preterm birth (Callahan et al., 2017). The dataset is publicly available under the open access category in the Sequence Read Archive (SRA), BioProject ID PRJNA393472. It consists of 16S rRNA sequence samples, spread across 133 racially and ethnically diverse subjects, and sampled at different points of time during the pregnancy.

16S rRNA Sequence Processing
It has been widely established that the hypervariable regions (V1-V9) within the 16S rRNA gene can be used for phylogenetic studies and genus or species-level classification in diverse microbial populations (Weisburg et al., 1991). Furthermore, certain hypervariable regions (such as the V4) are semi-conserved and can reliably predict specific taxonomic levels (Yang et al., 2016). The procedure to convert the 16S rRNA sequence data to microbial abundance involves various stages of processing. In the first step, quality control checks are performed and sequencing artifacts, low quality reads, etc. are removed from the read sequences. Secondly, the preprocessed sequences are aligned against a chosen reference database, and a taxonomic class is assigned to each sequence. The sequences are then grouped into clusters, which represent Operational Taxonomic Units (OTUs), based on sequence similarities. Lastly, the abundances of each OTU in samples are estimated (Schloss et al., 2009; Edgar, 2013; Estaki et al., 2020; Callahan et al., 2016). The microbial abundance obtained depends on the specific processing steps, and variations in processing steps can result in different abundance values (Schloss et al., 2009; Edgar, 2013; Estaki et al., 2020; Callahan et al., 2016).
The dataset (PRJNA393472) derived from SRA contains sequences generated after amplifying and sequencing the V4 hypervariable region of the 16S rRNA gene. We used the DADA2 processing pipeline (Callahan et al., 2016) to derive microbial abundance data from the sequence reads. The metadata and taxonomic abundance tables were generated using the SRA cloud (Katz et al., 2021) and abundances were obtained at various levels of taxonomic classification. We retained genus-level abundances for all our analyses since abundances at further levels were captured at a much lower resolution.

Processing Taxonomic Abundance Data
We eliminated the samples for which metadata information for certain key fields, viz., gestational age at the time of sample collection, gestational age at delivery, etc., was missing. We excluded the genus abundances of samples collected during trimester 3 (gestational age > 24 weeks) from our analyses, with the intention of being able to predict instances of preterm delivery sufficiently early. Furthermore, we removed samples collected during or before the 8th week of gestation, as they were present for very few (6 out of 133) subjects. For some of the analyses, we also transformed the genus abundance data to sample-wise relative abundance, and filtered out genera with high skewness and high kurtosis, thus removing some of the genera whose abundances contributed to noise. We retained 70% of the subjects (90 out of 133) as the training dataset and the rest were used for validating the approaches. The training and test datasets were kept consistent across all the analyses. The processed taxonomic abundance data and the corresponding metadata are made available in the code repository (see "Code Availability", Section 5).
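The relative-abundance transformation and the skewness/kurtosis filter described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the function names are hypothetical, population moments and excess kurtosis are assumed (the text does not state which estimator was used), and the cut-off of 10 is taken from the thresholds quoted in the Results.

```python
from statistics import fmean


def relative_abundance(counts):
    """Convert one sample's genus counts to proportions that sum to 1."""
    total = sum(counts.values())
    return {g: (c / total if total else 0.0) for g, c in counts.items()}


def skewness_and_kurtosis(values):
    """Return (skewness, excess kurtosis) from population moments (assumption)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return 0.0, 0.0  # a constant feature has no asymmetry or tails
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def filter_genera(samples, max_skew=10.0, max_kurt=10.0):
    """Keep genera whose relative abundances are neither highly skewed nor heavy-tailed."""
    rel = [relative_abundance(s) for s in samples]
    genera = sorted({g for s in rel for g in s})
    kept = []
    for g in genera:
        skew, kurt = skewness_and_kurtosis([s.get(g, 0.0) for s in rel])
        if skew <= max_skew and kurt <= max_kurt:
            kept.append(g)
    return kept
```

A genus that appears in only a handful of samples produces a long right tail and is dropped, while genera present across most samples are retained.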

Diversity Metrics
Alpha diversity metrics have been reported to be potentially indicative of preterm birth, and a highly diverse vaginal microbiome is correlated with increased risk of preterm delivery (DiGiulio et al., 2015; Hyman et al., 2014; Haque et al., 2017). We computed Shannon, Simpson, Chao1 and Gini alpha diversity indices, as well as Taxonomic Composition Skew (TCS) (Haque et al., 2017), a diversity index specifically tailored for the vaginal microbiome. Unlike other diversity indices, TCS takes into account that vaginal microbiomes are usually dominated by Lactobacillus species, with other genera in the minority. TCS responds differently to changes in the abundances of sparse and dominant taxa, and thus is possibly more suitable for quantifying the diversity in vaginal microbiomes. We checked for statistically significant differences in alpha-diversity index values between the term and preterm classes during various gestational periods using a two-sided, independent t-test. The standard diversity metrics were computed using the scikit-bio Python library (version 0.5.8).
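For readers unfamiliar with these indices, the sketch below computes three of the standard ones from a vector of per-genus counts. It is a plain-Python illustration written for this explanation, not the scikit-bio code used in the study; the logarithm base and the bias-corrected form of Chao1 are assumptions, and TCS is omitted because its definition is given in Haque et al. (2017) rather than here.

```python
import math


def shannon(counts, base=2.0):
    """Shannon entropy of the abundance distribution (base is an assumption)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def simpson(counts):
    """Simpson diversity, i.e. 1 minus the sum of squared proportions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from singleton and doubleton counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))
```

A sample dominated by a single genus yields Shannon and Simpson values near zero, which is the typical situation for a Lactobacillus-dominated vaginal microbiome.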

Traditional ML Approaches
We used two ML classifiers, Decision Tree (DT) and Random Forest (RF), to predict term/preterm outcomes. This constituted a secondary baseline for benchmarking the performance of the more complex deep learning-based prediction approaches. For each patient subject, we selected the microbial abundance profile closest to the week of delivery among those obtained between the 9th and the 24th week of gestation, following the hypothesis that the composition of vaginal microbial communities closer to the period of delivery is better indicative of preterm delivery risk. The resultant training and test sets contained 93 and 40 samples respectively.
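The per-subject selection step can be illustrated with a short sketch. The record layout (dictionaries with `subject`, `week` and `counts` keys) and the function name are hypothetical, chosen only for this example.

```python
def latest_profile_per_subject(samples, first_week=9, last_week=24):
    """Pick, for each subject, the latest sample collected within the week window."""
    latest = {}
    for sample in samples:
        week = sample["week"]
        if not first_week <= week <= last_week:
            continue  # ignore samples outside gestational weeks 9 to 24
        current = latest.get(sample["subject"])
        if current is None or week > current["week"]:
            latest[sample["subject"]] = sample
    return latest
```

Each subject then contributes exactly one static feature vector to the DT and RF classifiers, which is what distinguishes this baseline from the time-series models.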

Deep Learning Approaches
Machine learning classifiers, such as Support Vector Classifiers (SVCs), as well as tree-based classifiers such as DT and RF have been explored extensively for preterm birth prediction using vaginal microbiota, most often in tandem with other features such as physical markers and patient history. However, these classifiers have largely lacked the capability of making reliable predictions. Surprisingly, deep learning models have hardly been explored for this particular problem. Given the time-series nature of the data, we focused on deep learning algorithms for sequential data in this study.

Recurrent Neural Networks
A Recurrent Neural Network (RNN) is a type of neural network designed to handle sequential or time-series data. The issue with standard RNNs, however, is that they have difficulty in learning long-term dependencies in long sequences, due to the issue of vanishing/exploding gradients (Pascanu et al., 2012). Long Short-Term Memory (LSTM) is a type of RNN that is capable of learning long sequences, and is possibly more appropriate for the week-wise taxonomic abundance dataset. An LSTM maintains a hidden state, which stores short-term information, and a cell state, which stores long-term information. The initial hidden and cell states are generally set to zero vectors. An LSTM cell at each time step updates the hidden and cell states based on the states at the previous time step and the input data at the current time step.
$(h_{t_0}, c_{t_0}) = (0, 0)$
However, the conventional LSTM system demands a continuous and uniform time-series dataset, i.e., the time-steps must represent uniform intervals and the input data should be available for each step. While the taxonomic abundance data is uniformly sampled (week-wise), data for some subjects is not present for some weeks. Additionally, the first week for which data is available is different for each subject, and thus, a zero-vector initialization will not be appropriate for the initial hidden state. To address these issues, we modified the LSTM network accordingly. Firstly, we initialized the hidden state, $h_{t_0}$, with a trainable embedding layer, and the cell state, $c_{t_0}$, was initialized as a zero vector. Secondly, we had the LSTM cell generate the taxonomic abundance forecast, $\hat{x}_t$, at each time step, and used the forecast whenever the input data, $x_t$, was not available. The final hidden state, $h_{t_n}$, was fed to a linear layer with parameters $W$ and $b$, followed by a sigmoid activation, $\sigma$, to predict the term/pre-term outcome, $\hat{y}$. The entire network was trained end-to-end.
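The forecast-substitution idea described above can be sketched independently of any deep learning framework. In the sketch below, `toy_step` and `toy_forecast` are hypothetical stand-ins for the trained LSTM cell and its forecasting head, and a missing week is marked with `None` instead of a zero vector plus mask; it illustrates the control flow only and is not the network used in the study.

```python
def run_with_forecast_fill(step, forecast, inputs, h0):
    """Unroll a recurrent model, substituting its own forecast for missing inputs.

    inputs is a list with one entry per week: a feature vector, or None if
    no sample was collected that week.
    """
    h = list(h0)
    used = []
    for x in inputs:
        if x is None:
            x = forecast(h)  # model-predicted abundance replaces the missing week
        used.append(x)
        h = step(h, x)
    return h, used


def toy_step(h, x):
    """Stand-in for an LSTM cell update: an exponential moving average."""
    return [0.5 * a + 0.5 * b for a, b in zip(h, x)]


def toy_forecast(h):
    """Stand-in forecasting head: predict that the next input equals the state."""
    return list(h)


final_state, inputs_used = run_with_forecast_fill(
    toy_step, toy_forecast, [[1.0], None, [3.0]], h0=[0.0]
)
```

The more weeks are missing, the more the model is driven by its own predictions rather than by observations, which is one reason sparse sampling can hurt this kind of architecture.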
Figure 1 outlines the described LSTM network. The LSTM implementation assumes that the input data is continuously and regularly sampled. However, in our case, data for some intervals may be missing. To overcome this, we masked the data for missing time steps by using zero vectors to substitute the missing data points. In parallel, we also used a vector indicating the coordinates of the masked time intervals for each sample, for which the model used the forecast, i.e., the model-predicted microbial abundance, instead of the ground truth values. Additional method details are provided in the supplementary material (Section 2.1), and the hyperparameter values are listed in the supplementary material.
A significant limitation of clinical data is its irregular sampling, which makes it challenging to construct machine learning models that effectively harness the inherent time-series information. The irregularity in data sampling introduces two notable drawbacks: firstly, the size of the input data, which depends on the number of sampling instances, differs among subjects; secondly, the timing of sampling instances is not strictly discrete, which restricts the applicability of commonly employed RNN models that assume uniform intervals between sampled data points. To overcome this, we leverage a recently introduced class of deep learning models, Neural Ordinary Differential Equations (ODEs), that combine a neural network with ODEs and allow for continuous interpolation between two arbitrarily spaced sampling instants.
Notably, Neural ODEs exclusively consider the evolution of time-series data commencing at a fixed time point, denoted as $t_0$, which is accompanied by an initial condition represented as $x_0$. In the context of our research, $x_0$ signifies the initial abundance of genera at this specific time $t_0$, which might correspond to the onset of gestation week 9. Regrettably, the trajectory of microbial abundance varies from subject to subject, and the initial abundance data for all subjects may not be accessible for subsequent analysis, as the microbial profiles of each subject were not uniformly sampled at the same time point, namely $t_0$. Consequently, while Neural ODEs excel in interpolation tasks, they cannot be seamlessly integrated into our framework due to the absence of consistent initial abundance data. Nevertheless, it transpires that addressing this problem, specifically how to integrate incoming information, has already been thoroughly explored within the realm of mathematics, particularly in the field of rough analysis, which is dedicated to the examination of CDEs (Lyons, 1994; Lyons et al., 2007). Kidger et al. (2020) have introduced a novel framework known as Neural CDEs, which extends CDEs to Neural ODE models. To put it simply, Neural CDEs can be seen as continuous-time counterparts of Recurrent Neural Network (RNN) models. These models can be trained efficiently using a method called "adjoint backpropagation", which is elaborated on briefly in the supplementary materials (Section 3.1), and a detailed mathematical representation of it can be found in (Chen et al., 2018). In brief, the Neural CDE model can be summarized through the following sequence of operations:
$h_{t_0} = \zeta_\theta(x_0, t_0)$ (Initialization)
$h_t = h_{t_0} + \int_{t_0}^{t} f_\theta(h_s)\, \mathrm{d}X_s$ for $t \in (t_0, t_n]$ (Evolution)
$\hat{y} = \ell_\theta(h_{t_n})$ (Output)
Here, $\zeta_\theta$ and $\ell_\theta$ correspond to linear models responsible for transforming the initial taxa abundance (along with the time-stamp $t_0$) into the initial hidden state, $h_{t_0}$, and the final hidden state, $h_{t_n}$, into the output label, respectively. The map $\zeta_\theta$ is used to avoid translational invariance to the first sampled time instant. $X$ is the natural cubic spline with knots at $t_0, \ldots, t_n$ such that $X_{t_i} = (x_i, t_i)$. Natural cubic splines allow for smooth interpolation and minimum regularity for handling certain edge cases. $f_\theta$ is a neural network model depending on parameters $\theta$. Due to their dependence on cubic splines, Neural CDEs (Kidger et al., 2020) can be applied to irregularly sampled time series, even with temporally-scattered initial conditions. Thus, we chose to apply Neural CDEs to predicting preterm birth using the irregularly-sampled microbial abundance dataset. A comprehensive mathematical introduction to Neural CDEs is outside the scope of this paper. For those interested in delving deeper into the mathematical details, we recommend consulting (Kidger et al., 2020) for a more thorough explanation. The torchcde library (version 0.2.5) was used to implement the Neural CDE model in Python. Hyperparameter values for the model are listed in supplementary Table 4.
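To make the role of the control path concrete, the sketch below integrates a controlled differential equation with a simple Euler scheme. It is a didactic simplification and not the torchcde implementation used in the study: the path is interpolated piecewise-linearly rather than with a natural cubic spline, `toy_field` is a fixed hypothetical function standing in for the trained network, and the initial hidden state is passed in directly instead of being produced by a learned linear map.

```python
import math


def cde_euler(times, values, h0, field, substeps=10):
    """Euler integration of dh = field(h) dX along a piecewise-linear path X.

    times:  observation times, possibly irregularly spaced
    values: one feature vector per observation time
    field:  maps the hidden state to a (hidden x channels) matrix, where the
            channels are the observed features followed by time itself
    """
    path = [list(x) + [float(t)] for t, x in zip(times, values)]
    h = list(h0)
    for start, end in zip(path, path[1:]):
        d_x = [(b - a) / substeps for a, b in zip(start, end)]
        for _ in range(substeps):
            matrix = field(h)
            h = [h_i + sum(f_ij * d for f_ij, d in zip(row, d_x))
                 for h_i, row in zip(h, matrix)]
    return h


def toy_field(h):
    """Hypothetical vector field with one hidden unit and two path channels."""
    return [[math.tanh(h[0]) + 1.0, 0.1]]


# Samples taken at gestational weeks 9, 12 and 20: no regular grid is needed.
final_hidden = cde_euler([9, 12, 20], [[0.2], [0.5], [0.9]], [0.0], toy_field)
```

Because the hidden state is driven by increments of the path rather than by a fixed sequence of time steps, uneven gaps between visits and different starting weeks per subject are handled without imputing the missing weeks.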

Microbial abundance dataset contains racially and ethnically diverse subjects
The 16S rRNA sequence data was converted to taxonomic abundance (Methods, Section 2.2), which led to approximately 290,000 abundance counts spanning various levels of taxonomic classification, of which approximately 65,000 corresponded to the genus level. The counts belonged to 2,326 unique samples, which were collected at various weeks of gestation throughout the pregnancy and spread across 133 subjects of diverse races and ethnicities (Figure 2a), of whom 85 delivered at term and 48 delivered preterm (Figure 2b, d). We aligned the abundance counts subject- and gestational week-wise for ease of interpretation (Figure 2c). Abundance counts for multiple samples derived from the same subject and collected during the same week of gestation, if any, were replaced by the mean of those counts to ensure consistency, as subsequent analyses were carried out on week-wise data. The distribution of the number of genera present in each sample (i.e., the number of genera with non-zero abundance in each sample) is visualized in Figure 2e. Microbial abundance profiles corresponding to 43 out of the 133 subjects were reserved as the test set (refer to Methods, Section 2.3).
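The week-wise aggregation step can be sketched as below. The record layout and function name are hypothetical and serve only to illustrate how repeated samples from the same subject and gestational week collapse into a single mean profile.

```python
from collections import defaultdict


def weekly_mean_profiles(samples):
    """Average genus counts over samples sharing the same (subject, week)."""
    grouped = defaultdict(list)
    for sample in samples:
        grouped[(sample["subject"], sample["week"])].append(sample["counts"])
    averaged = {}
    for key, profiles in grouped.items():
        genera = {g for profile in profiles for g in profile}
        # A genus absent from one of the repeated samples counts as zero there.
        averaged[key] = {
            g: sum(p.get(g, 0) for p in profiles) / len(profiles) for g in genera
        }
    return averaged
```

After this step every subject has at most one abundance vector per gestational week, which is the form assumed by the week-wise analyses.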

Diversity metrics do not reliably identify at-risk PTB subjects
Alpha-diversity indices were computed on samples collected during trimester 1 and trimester 2 (gestational weeks 9 to 24, see Methods, Section 2.3). The visualization of Chao1, Gini, Shannon, Simpson and TCS alpha-diversity indices computed at various weeks of gestation for subjects who delivered at term and preterm is presented in Figure 3 a-e, respectively. The results of two-sided, independent t-tests to examine differences in diversity index values across term and preterm groups are presented in Table 1. Although the t-test reveals a statistically significant difference in the Chao1 diversity index between term and preterm groups during gestational weeks 9-12 (p = 0.005) and 13-17 (p = 0.02), there are too few samples in the preterm group during these gestational periods to draw reliable conclusions (see Figure 3). No signature that can distinguish term/preterm birth is observable from these metrics. As can be seen from the plot, the alpha-diversity indices are not predictive of a preterm delivery outcome, and perform worse than random classifiers (prediction accuracy on test set < 50%, refer to Methods, Section 2.4).

Statistical analyses indicate presence of signatures for PTB prediction in evolving microbiomes
To reduce the set of features used for the classification task, we removed the genera with high skewness and kurtosis. Skewness and kurtosis were computed on relative abundance of microbial genera during the period of trimester 1 - trimester 2 (gestational weeks 9 to 24). The abundance distributions of a large number of genera have a high positive skewness and kurtosis (see Figure 4 a, b), and may contribute to noise. Thus, we excluded the genera which had highly skewed relative abundances (i.e., skewness > 10) as well as genera with high kurtosis (kurtosis > 10). As a result, only 6 genera were retained, namely, Lactobacillus, Anaerococcus, Gardnerella, Peptoniphilus, Finegoldia and Prevotella. Except for Finegoldia, these genera have been identified to be linked to PTB risk previously. Lactobacillus is the most dominant genus within the vaginal microbiome, and low counts of Lactobacillus have previously been stated to be indicative of increased PTB risk (Bayar et al., 2020; Gudnadottir et al., 2022). Certain species of Anaerococcus are found to be associated with increased PTB risk (Ansari et al., 2021); however, there are also reports that Anaerococcus species may be protective in nature (dos Anjos Borges et al., 2023). There is strong evidence linking high Gardnerella vaginalis presence with PTB and bacterial vaginosis, which also increases PTB risk (Nelson et al., 2009; Ng et al., 2023). In some populations, increased counts of certain Peptoniphilus species were also found to be associated with high PTB risk (Park et al., 2022a). Similar evidence exists associating high Prevotella abundances with increased PTB risk (Freitas et al., 2018; Fettweis et al., 2019; Park et al., 2022b). However, there is insufficient evidence to establish a link between these observations and the changes that vaginal microbiota undergo throughout the duration of the pregnancy. We further re-computed the relative abundance based on the abundance counts of these 6 genera only, and visualized the gestational week-wise abundances and corresponding term/preterm delivery outcomes. The results are presented in Figure 5 a-f. This analysis highlights that the composition of the 6 genera mentioned above changes significantly during the period from end of trimester 1 to trimester 2 of pregnancy. The trends indicate that low abundance of the Lactobacillus genus during trimester 2 (gestational Gardnerella counts with PTB risk (Figure 5c, p = 0.0073), and the effect is more pronounced during gestational weeks 16-18. Most of the samples with increased Prevotella counts during weeks 19-21 belonged to subjects who went on to deliver preterm (Figure 5e, p ≈ $10^{-6}$).

ML classifiers do not make adequate PTB risk assessment
We tested the performance of two ML classifiers, viz., RF and DT, towards prediction of preterm birth. For this, we isolated the latest available microbial abundance profiles of training set subjects as well as the test set subjects, collected during the period between the 9th and the 24th week of gestation, with the rationale that the state of the microbiome closer to the period of delivery should be better predictive of term/preterm delivery. Optimal hyperparameters for both RF and DT were identified by performing a grid search on pre-defined parameter search spaces using 3-fold cross-validation on the training set, and are listed in supplementary Table 2. The resultant models were validated on the test set by computing ROC-AUC, accuracy, precision, recall and F1 score. The RF model performed significantly better (Figure 6 b, d) than DT (Figure 6 a, c), which performed worse than a random predictor. The detailed results of both classifiers are presented in Table 2. Neither of the models, however, makes adequately reliable predictions on the test set.
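For reference, the evaluation metrics named above can be computed from predictions as in the following sketch. It is a plain-Python illustration with hypothetical function names, assuming preterm is encoded as 1 and term as 0; ROC-AUC is computed via its pairwise-ranking definition, with ties counted as half.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall (sensitivity), specificity and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))
```

An ROC-AUC of 0.5 corresponds to a random predictor, which is the reference point used when comparing the classifiers in this section.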

LSTMs do not decode the temporal dynamics of vaginal microbiomes
The LSTM was trained on the week-wise genera abundance data sampled during gestational weeks 9 to 24, after making appropriate adjustments to account for the irregularity in sampling (see Methods, Section 2.6.1, Figure 1). When trained on the entire set of genera with non-zero variance in the training set, the model overfits and does not generalize well to the test set (training set accuracy = 100%, test set accuracy < 60%). Even when trained on the set of genera with low skewness and kurtosis, the model fails to make sufficiently accurate predictions on the test set (accuracy = 63%), and is outperformed by the RF model described above. We attribute this lack of predictivity to the large number of estimates that the model must make to fill the temporal gaps in the data; making these estimates more accurately may require additional data, either in terms of more patient subjects or increased density of samples per subject.

Neural CDEs are capable of achieving PTB prediction with a substantial accuracy
The neural CDE model trained on relative abundances of genera selected by the skewness and kurtosis filtering outperforms all other models described above. The resultant model performs reasonably well on the test set (mean test set ROC-AUC = 0.82, accuracy = 74.5%, precision = 0.65, recall = 0.71, F1 score = 0.71). The results are presented in Figure 7 a, b. We then trained the model after shuffling the term/preterm labels for subjects in the training set. As expected, the model behaved similarly to a random predictor on the test set (ROC-AUC < 0.5). Albeit not on the same validation dataset, our approach achieves better results than the best submission in the DREAM challenge for term vs preterm prediction (ROC-AUC = 0.68, accuracy = 67%, sensitivity = 0.48, specificity = 0.79) (Golob et al., 2023), despite using microbial abundances only up to the end of the 2nd trimester, i.e., the 24th week of gestation, as opposed to the 32nd week of gestation in the DREAM challenge.

Discussion
Current in vitro or in vivo approaches lack the ability to detect PTB at earlier stages, which reduces the effectiveness of prophylactic or therapeutic interventions that can be administered to mitigate the neonatal health concerns associated with it. Current risk assessment approaches involve physical examinations, and factors such as cervical length may be used to estimate the risk. Additionally, abnormality in levels of biochemical markers such as Pregnancy-Associated Plasma Protein A (PAPP-A) (Smith et al., 2002; Gundu et al., 2016), Cervicovaginal Interleukins (Manning et al., 2019; Park et al., 2020), etc., may help detect PTB as well. However, there is a lack of definitive confidence intervals for these physical or biochemical tests. Recent studies have highlighted the utility of diversity of vaginal microbial communities towards PTB prediction, by establishing correlations between alpha-diversity indices associated with abundances of various microbial species (or genera) and incidences of PTB (DiGiulio et al., 2015; Hyman et al., 2014; Haque et al., 2017). However, microbial communities are highly diverse across various individuals, and more so for individuals belonging to different ethnic populations (Sun et al., 2022; Gupta et al., 2017). We have demonstrated above that in a heterogeneous dataset, with microbial profiles derived from ethnically and racially diverse subjects, diversity metrics could not accurately estimate PTB risk. Based on our observations on a mixed-race dataset with 16S rRNA sequences transformed using a standardized method, this indicates that the previously reported success of diversity indices in identifying subjects at high risk of PTB may be dataset dependent, either with respect to the subject cohort or to the pipeline used in computation of microbial abundance from 16S rRNA sequences.
Traditional ML methods, which have been explored previously in the context of predicting PTB using vaginal microbial species abundance, fail to learn an abundance signature associated with PTB in our dataset. While others (Park et al., 2022a) report success with machine learning methods in ethnically homogeneous cohorts, we found that the predictive performance did not translate to mixed-race cohorts. As we have demonstrated above, vaginal microbial communities evolve throughout the duration of pregnancy, and abundance levels of certain species or genera may change significantly as the pregnancy progresses. Learning a PTB-associated signature in an evolving microbiome may be out of scope of such models as they are not designed to handle time-series datasets. We explored the utility of LSTM, a type of RNN, which is able to work with sequential datasets. The architecture of RNN-based approaches requires input datasets to be regularly and continuously sampled. As far as human patient subject data is concerned, obtaining such a dataset is a challenge, as study subjects may not be regular or consistent in clinical visits. For this purpose, we filtered the dataset such that a single sample was present across each of the gestational weeks, which constituted the time intervals for LSTM. We suitably modified the LSTM workflow to overcome missing time intervals; however, it proved to be incapable of learning any signature associated with PTB.
Neural differential equations have recently gained traction with regard to analyzing sequential data. Since they use differential equations to model the temporal dynamics, they can handle irregularly and/or inconsistently sampled data. Neural CDEs are more efficient than neural ODEs (Kidger et al., 2020), and are even capable of working with partially sampled datasets, although we have not harnessed that in this study. We found considerable success in using Neural CDEs to predict PTB, in spite of working with a dataset sourced from an ethnically varied population (refer to Figure 2), and they outperformed all other approaches that we tested. To the best of our knowledge, this is the first effort towards modeling the temporal dynamics of vaginal microbial communities using deep learning, and the first instance of applying neural differential equations for a problem of this kind.
The DREAM challenge for PTB prediction (Golob et al., 2023) was issued in 2019 with the goal of driving efforts for PTB prediction using the vaginal microbiome. One of the sub-problems for the DREAM challenge consisted of predicting term births (>= 37 weeks of gestation) and preterm births (< 37 weeks of gestation) using vaginal microbiomes. The dataset for this challenge was derived from 9 different studies, and amounted to 3578 samples collected from 1268 individuals (Golob et al., 2023). The dataset used in this study was also part of the DREAM challenge. We also considered using some of the other datasets in the DREAM challenge while outlining this study, but dropped them either due to not being labelled week-wise or due to insufficient week-wise samples per patient for modeling the temporal dynamics. Most of the top submissions in the challenge used tree-based classifiers. On our test dataset, neural CDEs show better predictivity (mean test set ROC-AUC = 0.82, accuracy = 75%, sensitivity = 0.71, specificity = 0.85) than the best submission in the DREAM challenge on their validation dataset (ROC-AUC = 0.69, accuracy = 67%, sensitivity = 0.48, specificity = 0.79) (Golob et al., 2023), despite using microbial abundances only up to the end of the 2nd trimester, i.e., the 24th week of gestation. The DREAM challenge for PTB prediction used taxonomic abundance data up to the 32nd week of gestation. Our emphasis was on early-stage PTB prediction, and we were able to achieve better predictive performance in spite of restricting the input data to the period up to the end of the 2nd trimester.
Copyright © 2023 the authors. This work is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/deed.en_GB).
Predictive approaches using data other than microbial communities also exist. For instance, Tarca et al. (2021) report the results of the DREAM challenge for PTB prediction using the maternal blood transcriptome and proteome. The top performing models report better results than those reported on microbial communities, with a ROC-AUC of 0.76 when proteomics data from weeks 27-33 was used. However, in another sub-challenge where early-stage data (weeks 17-22) was used, the top performing model had a ROC-AUC of 0.62. Obtaining blood transcriptomic or proteomic data may pose difficulties, since it involves invasive procedures that require clinical expertise to perform. On the other hand, microbial abundance data is sourced from vaginal swabs, which can be obtained without invasive procedures by the patient subjects themselves. Several attempts have been made at predicting PTB using biochemical markers (Aung et al., 2019; Leow et al., 2020); however, obtaining such data may require regular clinical visits, and their viability in racially diverse populations is unknown.
Poorer and remote parts of the world may even lack the medical infrastructure or adequate facilities required for PTB assessment and prevention. For example, in remote areas in India, there are clinics called Anganwadis, a name that roughly translates to "courtyard shelter". As of 2018, the Ministry of Women and Child Development reports the existence of 1.4 million Anganwadi centres spread out across the country. Anganwadis provide limited healthcare facilities for maternal and infant health, and lack the funding, facilities, and even the trained medical personnel required to mitigate PTB and its ill-effects. Given the simplicity of obtaining the samples from which microbial abundance is derived, reliable approaches for PTB risk assessment developed on microbiota will greatly help such remote clinics.
While we have demonstrated the capability of Neural CDEs for PTB prediction using the vaginal microbiome, further effort can be made to increase its clinical viability. Firstly, our dataset is limited in size (133 patient subjects), and we believe that larger datasets with better racial and ethnic representation may help learn signatures which take into account the diversity of vaginal microbiomes across individuals and races. Secondly, predicting the extent of preterm birth (extremely preterm, very preterm, moderate to late preterm) is also important as far as administering interventions is concerned, as the different extents may have varying impact on maternal and infant health and may require different strategies. This may be achieved by predicting the gestational week of delivery, or by treating PTB as a multi-class problem with the different extents of PTB as the classes, given more high-quality datasets. We believe that vaginal microbial communities may be the key to achieving early-stage PTB prediction, and our findings strongly encourage future efforts for pregnancy microbiome data generation and further refinements in modeling procedures, which may take us closer to achieving full clinical viability.
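As an illustration of the multi-class framing, the sketch below maps a gestational age at delivery to a class label. The cut-offs follow the commonly used WHO sub-categories (extremely preterm: under 28 weeks; very preterm: 28 to under 32 weeks; moderate to late preterm: 32 to under 37 weeks). The function name and label strings are hypothetical and are not part of the released code.

```python
def ptb_class(gestational_weeks: float) -> str:
    """Map gestational age at delivery (in completed or fractional weeks)
    to a PTB category, using the WHO sub-category cut-offs as an assumption."""
    if gestational_weeks < 0:
        raise ValueError("gestational age cannot be negative")
    if gestational_weeks < 28:
        return "extremely preterm"
    if gestational_weeks < 32:
        return "very preterm"
    if gestational_weeks < 37:
        return "moderate to late preterm"
    return "term"

# Hypothetical delivery weeks for four subjects.
labels = [ptb_class(w) for w in (26.5, 30, 35, 39)]
print(labels)
# ['extremely preterm', 'very preterm', 'moderate to late preterm', 'term']
```

A regression model predicting the delivery week could be passed through the same mapping, which would let both formulations mentioned above be evaluated against the same class labels.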

Data & Code Availability
The code and data files for this study have been made available at https://tinyurl.com/3427p6y4. The repository contains a readme file which describes the contents of the individual data files along with the demographics of the train and test data, as well as a brief description of the code files.

Competing Interests
The authors declare no competing interests.

Funding
This research did not receive any specific grant from any funding agency in the public, commercial or not-for-profit sector. The data used in this study was derived from a public, non-controlled access dataset from within the Sequence Read Archive (SRA), and hence, ethical approval was not required.

Author Contributions
M.B. and K.K. designed the study. K.K. performed the computational analyses with assistance from M.B., and M.B. and K.K. drafted the final manuscript.

Acknowledgements
The authors would like to thank Dr. Mohammed Haque, Dr. Anirban Dutta, and Mr. Nishal Kumar Pinna for their help in processing the 16S rRNA sequences from the SRA cloud, and Mr. Sunil Nagpal and Dr. Anirban Dutta for outlining the state of the art for PTB prediction using diversity metrics. We would also like to thank Dr. Rajgopal Srinivasan for proofreading and helping improve the manuscript.

Figure 1: Visualization of the LSTM Network. Week 0 represents the earliest week of gestation for which microbial data is available. This is passed through an embedding layer, which generates the initial hidden state. Subsequent hidden states are determined by the previous hidden and cell states, and the microbial data at the respective time step. If the microbial data is not available, the output of the previous hidden state is used instead. The final hidden state is passed through a linear layer with a single output neuron to predict the term/preterm outcome, and the entire network is trained end-to-end.

Where $g$ is the loss function (binary cross-entropy in our case). The above expression essentially maps this gradient to a scalar, using which the parameters are updated. This can be further represented as:

$$\frac{dJ}{d\theta} = \int_0^T \left( \frac{\partial g}{\partial \theta} + \frac{\partial g}{\partial h}\,\frac{\partial h}{\partial \theta} \right) dt$$

Adjoint sensitivity analysis, or the adjoint state method, essentially refers to computing the cost function gradient from the above expression in an efficient manner. The exact mathematical derivation is comprehensively described in Chen et al. [1].

Model parameters
Neural CDE model parameters are listed in Supplementary Table 4.

Figure 2 :
Distribution of (a) race, (b) number of subjects who delivered at term/pre-term, (c) sample gestational age at the time of collection, (d) gestational age of subject at the time of delivery, and (e) number of unique genera detected in each sample in the microbial abundance dataset.

Figure 3 :
(a) Chao1, (b) Gini, (c) Shannon, (d) Simpson and (e) TCS diversity metrics computed on microbial abundance data collected during trimester 1 and trimester 2 of pregnancy. Blue and red points represent samples derived from subjects who delivered at term and pre-term, respectively.

Figure 4 :
Distribution of (a) skewness and (b) kurtosis, computed on the relative abundances of genera.

Figure 6 :
Receiver operating characteristic (ROC) curves for the (a) decision tree (DT) and (b) random forest (RF) classifiers on the training and test datasets, and validation metrics (AUC, accuracy, precision, recall and F1 score) computed for the (c) DT and (d) RF classifiers.

Figure 7 :
(a) ROC curves and (b) classification metrics on the training and test datasets for the Neural CDE model.
Where $f_\theta$ is the neural network with parameters $\theta$. Training the model essentially refers to updating these parameters based on the gradient of the cost function ($J$) with respect to the parameter vector, scaled by a learning rate ($\eta$):

$$\theta_{k+1} = \theta_k - \eta\,\frac{dJ}{d\theta}$$

And the gradient of the cost function can be represented as:

Table 2 :
Comparative performance of various machine learning methods