Participants
Infants were recruited from the Early Autism Sweden (EASE) project, an ongoing longitudinal study of infants at elevated-likelihood of ASD. EASE follows younger siblings of children with ASD (elevated-likelihood) and siblings of typically developing children (low-likelihood) from 5 to 36 months of age. Infants at elevated-likelihood (EL) of ASD in EASE have at least one older sibling with a community clinical diagnosis of ASD according to ICD-10 or DSM-5, which was confirmed by inspection of medical records and an interview with parents led by an experienced child psychologist. Infants were recruited through clinical units, advertisement, and the EASE project website. Infants at low-likelihood (LL) of ASD have no familial history of ASD (in first- or second-degree relatives) and at least one typically developing older full sibling. Both groups were primarily from the larger Stockholm area (Sweden). All infants were born full-term (≥ 36 weeks) and did not have any confirmed or suspected medical conditions, including epilepsy, genetic syndromes associated with ASD, or visual/auditory impairments. Potential participants were excluded if they had been exposed to antibiotics. A total of 35 infants met the eligibility criteria and were included in the study, consisting of 19 EL infants (9 females) and 16 LL infants (10 females). All infants were breastfed until at least 6 months of age. An experienced clinical staff assessed the infants developmental level using the Mullen Scales of Early Learning (MSEL) at 5 and 36 months of age [37]. In addition, the Autism Diagnostic Observation Schedule-Second Edition (ADOS-2) was used to assess symptoms of ASD (communication, social interaction, play and restrictive/repetitive behaviors) at 36 months. Informed written consent was obtained from all parents. This study was approved by the Ethics Board in Stockholm and conducted in accordance with the 1964 Declaration of Helsinki.
Fecal sample collection
Parents collected approximately 200 mg of fecal material from a single infant’s bowel movement at 5, 10, 14, 24, and 36 months of age at their homes. At each time point, parents were guided with both oral and written instructions to collect the fecal material into a sterile collection tube and contact a dedicated research project assistant, who coordinated the immediate transport of the samples to the laboratory. Samples were transported on dry ice within 30 min of collection through the help of a dedicated courier service. Samples were aliquoted into two sterile tubes under a biological safety cabinet for shallow shotgun metagenome sequencing (~30 mg) and 1H NMR spectroscopic analysis (~ 50 mg).
Shallow shotgun metagenome sequencing
DNA extraction and sequencing was carried out by Diversigen® (Minneapolis, USA) using in-house developed, CLIA-approved BoosterShot® protocol for shallow shotgun sequencing. Briefly, MO BIO’s PowerFecal DNA Isolation Kit (Qiagen, Hilden, Germany) automated for high throughput on QIACube, along with 0.1 mm glass bead plates for bead beating, were used for genetic material extraction. Extracted DNA was quantified using Quant-iT™ PicoGreen™ dsDNA Assay Kit (Invitrogen, Carlsbad, CA). Libraries were then prepared using an adapted procedure from the Nextera XT DNA Library Preparation Kit (Illumina, San Diego, CA). Sequencing was carried out on an Illumina NextSeq using single-end 1 × 145 reads with a NextSeq 500/550 High Output v2 kit (Illumina, San Diego, CA). DNA sequences were aligned to a curated database including all representative genomes in RefSeq for bacteria with additional manually curated strains. Alignments were made at 97% identity against all reference genomes. Every input sequence was compared to each reference sequence within the CoreBiome Venti database using fully gapped alignment with BURST. Ties were broken by minimizing the overall number of unique operational taxonomic units (OTUs). For taxonomy assignment, each input sequence was assigned the lowest common ancestor that was consistent across at least 80% of all reference sequences tied for best hit. The number of counts for each OTU was normalized to the OTU’s genome length. OTUs accounting for less than one millionth of all species-level markers and those with less than 0.01% of their unique genome regions covered (and <1% of the whole genome) were discarded.
Microbiome analysis
A general overview of the infant gut microbiome composition was generated plotting the most abundant genera of each group at the different timepoints, after calculating the mean relative abundances. The Shannon diversity index, a measure of alpha diversity, was estimated at genus level at each timepoint using the `DivNet` v 0.4.0 package, which takes into consideration the compositional nature of microbiome data, the co-occurrence of the different taxa, and avoids rarefaction [38]. A compositional data analysis workflow was used to investigate the structure of the data and highlight group differences. Zeroes in the OTU count table were imputed using Bayesian-Multiplicative replacement before applying a center log ratio (CLR) transformation with the `CoDaSeq` v 0.99.6 package. Principal component analysis (PCA), from the `mixOmics` v 6.22.0 package, was used to performed dimensionality reduction and the adonis function from the `vegan` v 2.6-4 package was used to perform permutational multivariate analysis of variance (PERMANOVA) on the calculated Aitchison distances. Differential abundance analysis was performed using the `ALDEx2` v 1.30.0 package. ALDEx2 generates distribution probabilities for each analyzed taxon and converts them into distribution of log ratios to account for the compositional nature of microbiome data [39]. Differential abundance is calculated using the Wilcoxon rank-sum test, together with the standardized effect sizes. Obtained p values were corrected for multiple comparisons using Benjamini-Hochberg (BH) correction and only taxa with adjusted p values < 0.25 were considered significant and retained for visualization.
1H NMR spectroscopy
Fecal samples were defrosted on ice and combined with ten 1 mm diameter zirconia beads (BioSpec Products, US) and 700 µL of demineralized water. Samples were homogenized using a Precellys 24 homogenizer (Bertin Instruments, FR) with 2 × 6500 rpm in a 2-min program (2 × 40 s homogenization, with a 20 s interval). Homogenized samples were centrifuged at 10,000 g for 20 min at 4 °C. Fecal water (supernatant) was collected from each tube and 630 µL was transferred to a new 1.5 mL Eppendorf. Each tube was supplemented with 70 µL of phosphate buffer (1.5 M KH 2 PO 4 , 2 mM NaN 3 , 1% TSP solution, pH 7.4) and vortexed for ~ 30 s. Samples were centrifuged at 10,000 g for 5 min at 4 °C and 600 µL was transferred to 5 mm NMR tubes. Samples were loaded into a refrigerated Bruker SampleJet robot on a 600 MHz UltraShield spectrometer (Bruker Biospin, Karlsruhe, Germany). A standard one-dimensional pulse sequence with saturation of the water resonance was applied (RD-90°-t1-90°-tm-90°-acquire FID, with RD set at 2 s and tm at 100 ms) at 300 K. For each spectrum 8 dummy scans, followed by 64 scans with 32 K data points and a spectral width of 20,000 Hz were acquired.
1H NMR data processing and analysis
Spectra were manually corrected for phase and baseline distortion to the TSP singlet (δ 0.00) before data acquisition on TOPSPIN 3.2 (Bruker, Germany). The obtained spectra were digitized in MATLAB 2019b using IMPaCTS. TSP (-0.2 to 0.2 ppm) and water (4.7 to 4.9 ppm) regions were removed from the spectra. Peaks were manually checked and aligned using recursive sample-wise peak alignment (RSPA). Spectra were normalized using a probabilistic quotient normalization approach to account for potential dilution factors. PCA analysis with pareto scaling was used for preliminary unsupervised analysis to identify possible outliers. Four samples were categorized as outliers and excluded due to spectral anomalies, most likely derived from issues during spectral acquisition or sample preparation. Pair-wise supervised analysis between groups on full spectra was performed with projection to latent structures-discriminant analysis (PLS-DA) and univariate scaling in MATLAB. Using in-house databases, Chenomx 8.4 (Chenomx Inc, Edmonton, Canada) and HMDB (http://www.hmdb.ca/), all identified peaks were annotated. STOCSY was used to identify peaks belonging to same metabolite (Pearson correlation >0.8). Representative peaks from the different metabolites were integrated and exported for further analysis.
Integrated peaks from 1H NMR spectra were log10 transformed. PCA was used to observe the overall structure of the data and sparse PLS-DA from the ‘mixOmics’ package was used to extract the most relevant features that were responsible for differentiating the two groups at the different timepoints. Model performance was evaluated using the perf function of the ‘mixOmics’ package with leave-one-out (loo) cross-validation, given the small sample size. A variable importance projection score (VIP) > 1 was used as a cut-off to identify the most relevant metabolites influencing group separation.
Integration omics data
Spearman correlations between and within the two omics blocks (microbiome and metabolome data) were calculated using the ‘Hmisc’ v 4.8-0 package. P values obtained from multiple comparisons (>250,000) were then corrected for false discovery rate using BH and only correlations with adjusted p values < 0.25 were kept in the network. To improve visualization, a subset network was generated retaining only bacterial species directly correlating with GABA or with GABA-correlated metabolites and removing edges within each omics block. The final microbiome-metabolic network was built using the ‘ggraph’ v 2.1.0 and ‘igraph’ v 1.3.5 packages. The network was exported and edited in Cytoscape v 3.8.229.
Microbial cell culture and metabolomic analysis
Bacteria identified from the fecal microbiome analysis to be associated with GABA abundance were investigated for their potential to influence its presence. This included five different Bifidobacterium species, B. breve (DSM20091/ATCC15698), B. bifidum (DSM20215), B. longum subsp. infantis (ATCC17930/DSM20218), B. adolescentis (DSM20083), and B. scardovii (DSM13734), and two Clostridium related species, C. difficile (DSM1296), and C. bolteae (DSM15670) purchased from the DSMZ catalog (Leibniz-Institut DSMZ, Germany). The isolates were screened in anaerobic monocultures (CO 2 /N 2 (80%/20%) at 1.7 atm) with three different substrates (glutamate, putrescine, and spermidine; 4.5 mM each), which can be converted to GABA, as well as GABA itself (4.5 mM). Bifidobacterium and Clostridium species were enumerated in MRS, and PYG broth respectively [40, 41]. A modified rich transoligosaccharide propionate medium was used as a base for culturing the above bacterial species anaerobically [41]. The medium consisted of glucose (120 mM, pH 6.2), yeast extract (1 g/L), potassium dihydrogen phosphate (3 g/L), dipotassium hydrogen phosphate (4.8 g/L), magnesium sulfate (0.2 g/L), sodium propionate (15 g/L), and l-cysteine hydrochloride (0.5 g/L). In addition, bacteriological peptone was added to the medium (1 g/L) to accommodate the growth of Clostridium species. Serum bottles were supplemented with filter-sterilized vitamin mix and trace metals (100x; originally designed for Lactobacillus lactis) [42]. All cultures were incubated for 24 hours at 37 °C. From each culture, 2 mL of bacterial growth medium were collected for 1H NMR spectroscopy. From the preliminary results based on growth levels and metabolic activity, B. breve, B. infantis, C. difficile, C. bolteae were selected for further analysis. These four strains were, after preculture enumeration, made equivalent in optical density under anaerobic conditions with sterile modified TOS, before transfer to 96-well plates containing 380 µL with a 1:100 inoculation. Media that contained either no supplement or supplemented with GABA was added to the wells. The chosen quadriculture ratios were (Bifidobacterium spp.:Clostridium spp.): 2:1; 1:1; 1:2. Following 24 h anaerobic incubation [CO 2 /N 2 (80%/20%) at 1.5 atm], 1 ml of media was collected, centrifuged for 15 min at 10,000 g at 4 °C, and then 250 µL of supernatant was transferred into a 1.5 mL Eppendorf. After adding 350 µL of phosphate buffer solution, samples were vortex for ~ 30 s, centrifuged for 5 min at 10,000 g at 4 °C, and 600 µL of supernatant was transferred to 5 mm NMR tubes. For the quadricultures, 250 µL was retrieved and mixed with 350 µL of NMR Buffer (1.5 M KH 2 PO 4 , 1 g/L of TSP and 0.13 g/L of NaN 3 ), before centrifugation at 10,000 g at 4 °C for 5 minutes and transfer to 5 mm NMR tubes. Spectral acquisition, processing, and integration were carried out as described above on a 700 MHz Bruker NMR spectrometer equipped with a cryoprobe. Peaks arising from metabolites of interest within the monoculture and quadriculture screenings were integrated from the NMR spectra and normalized to the respective TSP signals.
Statistical analysis
All statistical analysis was performed on R version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria). Fisher’s exact test was used to evaluate differences between categorical variables, whereas Wilcoxon rank-sum test was used for numerical variables if not differently specified.