Julianne C Yang, Venu Lagishetty, Ezinne Aja, Nerea Arias-Jayo, Candace Chang, Megan Hauer, William Katzka, Yi Zhou, Farzaneh Sedighian, Carolina Koletic, Fengting Liang, Tien S Dong, Jamilla Situ, Ryan Troutman, Heidi Buri, Shrikant Bhute, Carra A Simpson, Jonathan Braun, Noam Jacob, Jonathan P Jacobs, Biogeographical distribution of gut microbiome composition and function is partially recapitulated by fecal transplantation into germ-free mice, The ISME Journal, Volume 19, Issue 1, January 2025, wrae250, https://doi.org/10.1093/ismejo/wrae250
Abstract
Fecal microbiota transplantation has been vital for establishing whether host phenotypes can be conferred through the microbiome. However, whether the existing microbial ecology along the mouse gastrointestinal tract can be recapitulated in germ-free mice colonized with stool remains unknown. We first identified microbes and their predicted functions specific to each of six intestinal regions in three cohorts of specific pathogen-free mice spanning two facilities. Of these region-specific microbes, the health-linked genus Akkermansia was consistently enriched in the lumen of the small intestine compared to the colon. Predictive functional modeling on 16S rRNA gene amplicon sequencing data recapitulated in shotgun sequencing data revealed increased microbial central metabolism, lipolytic fermentation, and cross-feeding in the small intestine, whereas butyrate synthesis was colon-enriched. Neuroactive compound metabolism also demonstrated regional specificity, including small intestine-enriched gamma-aminobutyric acid degradation and colon-enriched tryptophan degradation. Specifically, the jejunum and ileum stood out as sites with high predicted metabolic and neuromodulation activity. Differences between luminal and mucosal microbiomes within each site of the gastrointestinal tract were largely facility-specific, though there were a few consistent patterns in microbial metabolism in specific pathogen-free mice. These included luminal enrichment of central metabolism and cross-feeding within both the small intestine and the colon, and mucosal enrichment of butyrate synthesis within the colon. Across three cohorts of germ-free mice colonized with mouse or human stool, compositional and functional region specificity were inconsistently reproduced. These results underscore the importance of investigating the spatial variation of the gut microbiome to better understand its impact on host physiology.

Introduction
There is now an extensive literature on the use of high-throughput sequencing to identify microbial signatures of disease in human cohorts [1, 2]. Mouse models of disease are often utilized to further investigate the role of the microbiome in disease states to reduce individual microbiome variation that can obscure trends in human studies. However, the majority of these studies have reported on the fecal microbiome, which is insufficient to capture the totality of microbe-host interactions [3]. For instance, in one case–control study, the mucosal microbiome outperformed the fecal microbiome in distinguishing inflammatory bowel disease patients from healthy controls [4]. These findings were recapitulated in the 2,4,6 trinitrobenzene sulfonic acid mouse model of colitis, where colonic mucosal microbiomes were more strongly associated with disease severity than fecal microbiomes [5].
The focus on fecal microbiota has led to scant attention being given to the microbiomes associated with the distinct regions along the longitudinal axis of the gastrointestinal (GI) tract (e.g. duodenum, jejunum, and ileum in the small intestine (SI)). Evidence suggests spatial organization of the microbiome in the gut is a critical factor in dictating microbe-microbe interactions and also microbe-host interactions [6]. Loss of gut biogeography is a feature of mouse colitis models [7]. However, many studies aimed at elucidating biogeography either focus on specific regions or on a single region of interest [8–10]. To our knowledge, only a few studies have described regional microbiome differences along the entire GI tract in mice, though these were limited to compositional descriptions in nine or fewer mice [11–13]. Therefore, one aim of our study is to address this gap in the field by documenting compositional and functional gut biogeography with deeper regional resolution and by using larger cohorts.
Factors shaping gut biogeography include nutrient availability and chemical gradients. The host diet regulates bacterial localization within the gut [14]. Easily metabolizable nutrients are frequently absorbed by the host or utilized by the microbiota within the SI, whereas refractory nutrients such as non-starch polysaccharides are utilized by microbes within the colon [15]. Moreover, chemical gradients in the intestines influence the microbiota. Oxygen decreases proximally to distally, and decreases from the cell surface to the lumen [16–17]. These chemical gradients therefore favor the growth of facultative anaerobes in the SI and microaerophilic bacteria at the epithelial surface which survive in low oxygen concentrations [18].
In addition to assessing the compositional and functional differences along the longitudinal and transverse axes of the mouse GI tract, we evaluate the extent to which existing biogeographical patterns along the GI tract can be recapitulated in wild-type germ-free mice that are colonized by gavage of mouse or human stool. Microbiota transplantation is an essential tool in the field for establishing causal relationships between the microbiome and host phenotypes, and is commonly performed via orogastric gavage of fecal content. As failure to recapitulate existing biogeographical patterns may influence microbiome-dependent outcomes such as response to dietary challenges or disease induction, the current study provides an opportunity to investigate whether biogeography can be recapitulated via fecal orogastric gavage.
Materials and methods
Animal husbandry
C57Bl/6 mice were housed under a 12:12 light cycle. Two specific pathogen-free (SPF) cohorts were housed in the same vivarium at the University of California Los Angeles (UCLA): an original cohort (UCLA O. SPF), which included wild-type and homozygous floxed Cre- mice, and a validation cohort (UCLA V. SPF) with only wild-type mice. Both UCLA cohorts were housed in static cages containing autoclaved woodchip bedding, provided with unlimited irradiated chow (LabDiet 5053) and acidified water. Another SPF cohort was housed at Cedars-Sinai Medical Center (CS SPF), where the mice were housed in static cages with autoclaved woodchip bedding, unlimited irradiated chow (LabDiet 5LJ5), and autoclaved water. All SPF mice were euthanized at 2–3 months. For SPF Gavage and HUM SD Gavage cohorts, CS SPF pooled fecal pellet suspensions (SPF Gavage) or a single healthy human donor stool suspension (HUM SD Gavage) were gavaged into germ-free mice at 2–4 months, then housed at the National Gnotobiotic Rodent Resource Center (UNC) for 2 months before euthanasia. Mice in SPF Gavage and HUM SD Gavage cohorts were housed on Alpha-Dri bedding in static cages with unlimited irradiated chow (LabDiet 5V0F) and autoclaved water. In the HUM MD Gavage cohort, germ-free mice were shipped from UNC to UCLA, gavaged with human microbiota upon arrival at 2–4 months, and housed in individually ventilated cages for 1 month before euthanasia, with autoclaved woodchip bedding, unlimited irradiated chow (LabDiet 5061) and autoclaved water. Fecal suspensions for gavage were prepared by grinding donor feces with liquid nitrogen and diluting them in sterile, pre-reduced buffers. Detailed orogastric gavage conditions are described in Supplementary Methods.
16S rRNA gene amplicon sequencing
Luminal and mucosal-adherent samples from the duodenum, jejunum, ileum, cecum, proximal colon (PC), and distal colon (DC) of C57Bl/6 mice were harvested and utilized for Illumina paired-end sequencing of the V4 hypervariable region of the 16S ribosomal RNA gene as previously described [19]. Regions proximal to the cut at the junction between the ileum and the cecum were defined as the SI, while regions distal to this cut were defined as the colon. The entire SI was divided into even thirds, yielding the duodenum, jejunum, and ileum regions. After separating out the cecum, the remaining colon tissue was evenly divided into two, yielding the PC and DC regions. Luminal samples were collected via flushing the segment with deionized water. For mucosal samples, intestinal tissues were opened longitudinally, rinsed in 10% fetal bovine serum/Dulbecco’s Modified Eagle Medium (FBS/DMEM), agitated on a shaker at 37°C for 30 min in 1 mM dithiothreitol in FBS/DMEM, passed through 100 µm filters to remove debris, and centrifuged to pellet cells. Sequencing libraries were prepared in the same manner for all six cohorts, with the exception of the genomic DNA extraction method. For the UCLA O. SPF, CS SPF, and HUM MD Gavage cohorts, genomic DNA was extracted with the Zymobiomics DNA Miniprep Kit (Zymo Research). For the UCLA V. SPF, SPF Gavage, and HUM SD Gavage cohorts, genomic DNA was extracted using the MO BIO PowerSoil Kit (Qiagen). Polymerase chain reaction amplification of the 16S rRNA V4 gene region was accomplished using 515F and 806R barcoded primers as previously described [20]. Amplicon products underwent additional purification with the ZR-96 DNA Clean & Concentrator-5 (Zymo Research). Samples were sequenced on either a NovaSeq System (250x2 SP flow cell; UCLA O. SPF and HUM MD Gavage cohorts; Illumina) or a HiSeq 2500 System (150x2 or 250x2 rapid run flow cell; UCLA O. SPF, UCLA V. SPF, CS SPF, SPF Gavage, and HUM SD Gavage cohorts; Illumina).
The mean sequencing depths for UCLA O. SPF, UCLA V. SPF, CS SPF, SPF Gavage, HUM SD Gavage, and HUM MD Gavage cohorts were 80 891, 50 429, 47 121, 57 075, 71 727, and 65 148 reads, respectively.
Shotgun metagenomics sequencing
Microbial DNA extracts from a subset of jejunum and DC luminal samples of the UCLA O. SPF (n = 27 mice), CS SPF (n = 10 mice), SPF Gavage (n = 6 mice), and HUM SD Gavage (n = 7 mice) cohorts were fragmented and barcoded using a DNA Prep kit (Illumina). The barcoded shotgun libraries were sequenced on a NovaSeq 6000 System (Illumina) using S4 flow cells and a 2x150 base pair sequencing configuration, achieving a mean sequencing depth of 16 333 219 reads.
Sequencing data analysis and visualization
Details on sequencing data preprocessing are provided in Supplementary Materials. Alpha-diversity was assessed on rarefied data using Pielou’s evenness (pielou_e) and number of ASVs (observed_otus) for evenness and richness, respectively. After adjusting for sequencing batch effects using ComBatSeq2, we assessed beta-diversity through robust Aitchison principal coordinates analysis (PCoA) of prevalence-filtered data [21]. Per-genus association testing with variables of interest was performed using linear mixed-effects models implemented in MaAsLin2 on batch-adjusted data with the DC, colon, or luminal samples used as reference depending on the comparison (Site—comparing DC to proximal sites, Site—comparing colon to SI, and Type—comparing luminal to mucosal) [22]. Hierarchical clustering of genera was performed using Ward’s procedure. Using the PICRUST2 algorithm (v. 2021.11) to generate predicted Kyoto Encyclopedia of Genes and Genomes ortholog (KO) abundances from compositional data, and the actual KO abundances from shotgun data, we estimated metabolic module or gut-brain module abundance based on the KO coverage of each module (https://github.com/raeslab/omixer-rpm) [23, 24]. Using the MelonnPan tool, we trained an elastic net model using paired metabolome and KO metagenome data from other mouse model studies in our lab, with 10-fold cross validation [25]. Applying this pre-trained model to PICRUST2 KO and metagenome KO, we predicted metabolite relative abundances for the luminal datasets [25]. For both alpha-diversity and per-feature (taxon, module, or metabolite) association testing, statistics were reported as the results of linear mixed effects models with variables of interest and confounding variables as fixed effects and MouseID as a random effect. Additionally, per-feature association testing was performed on centered log ratio-transformed taxon or module count data, or on log-transformed metabolite relative abundances. 
Features were considered significant if the q-value (P value corrected for false discovery rate) was less than 0.05. For beta-diversity, significance was assessed through repeat-measures aware PERMANOVA utilized in the Human Microbiome Project [26]. Software packages and their versions are provided in Supplementary Methods.
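As a minimal sketch of two steps described above, the following Python code shows a centered log-ratio transform of one sample's counts and the conversion of P values into false discovery rate-adjusted q-values. It assumes the Benjamini-Hochberg procedure (the correction named in the figure legends) and a pseudocount of 0.5 to handle zero counts; the pseudocount, the function names, and the example P values are illustrative assumptions and are not taken from the study's pipeline.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's count vector.

    The pseudocount (an assumption here) avoids taking log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the
    # adjusted values are monotone in the rank of the raw P values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical per-feature P values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
qvals = bh_qvalues(pvals)
significant = [q < 0.05 for q in qvals]
```

In this toy example, two raw P values below 0.05 (0.039 and 0.041) no longer pass the q < 0.05 threshold after correction, which is why the number of significant features can be much smaller than an uncorrected screen would suggest.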

Study design and datasets. (A) To profile both longitudinal and transverse differences in the microbiome, we utilized 16S rRNA gene sequencing of luminal and mucosal samples collected along the GI tract and shotgun sequencing of jejunum and DC luminal samples. We define “interregional” differences as comparing the colon to small intestine (SI), “intraregional” differences as comparing the three SI regions to each other and as comparing the three colon regions to each other, and “region-specific” features as features which significantly distinguish the DC from the five other intestinal sites. (B) Three SPF mouse cohorts from two facilities (UCLA O. SPF, UCLA V. SPF—containing mucosal samples only, and CS SPF) and three colonized germ-free mice cohorts (SPF gavage, HUM SD gavage, HUM MD gavage—containing 16S rRNA gene sequencing data only) were used in this study. Intestinal regions are abbreviated in this diagram as follows: SI—small intestine, duodenum—duo, jejunum—Jej, ileum—Ile, cecum—Cec, proximal colon—PC, distal colon—DC. Figure created with Biorender.
Results
Analyses of sequencing data obtained from six regions along the GI tract were performed to evaluate longitudinal differences, defined as distal vs. proximal comparisons within either the luminal or the mucosal samples, in addition to transverse differences across the GI tract, defined as mucosal vs. luminal comparisons within each of the six regions of the GI tract (Fig. 1A). To evaluate reproducibility within and between housing facilities, two biogeographical studies of wild-type SPF mice were conducted at UCLA, whereas one study was conducted at CS (Fig. 1B). Lastly, to address whether orogastric gavage of fecal content is sufficient to recapitulate differences in compositional and functional biogeography, we performed similar microbial profiling of germ-free mice that received fecal content from either CS SPF mice (SPF Gavage), from a single healthy human donor (HUM “Single Donor” SD Gavage), or from multiple human donors with 3–5 mouse recipients per donor (HUM “Multiple Donor” MD Gavage) (Fig. 1B). 16S rRNA gene V4 sequencing was performed on all samples from all cohorts; DC and jejunum luminal samples from the UCLA O. SPF, CS SPF, SPF Gavage, and HUM SD Gavage cohorts underwent additional shotgun sequencing to provide increased species-level resolution and improve understanding of microbial functions (Fig. 1A). Dominant phyla and genera differed across UCLA O. SPF, UCLA V. SPF, and CS SPF mice (Figs. 2, S1). Most compositional profiles of donor feces differed from recipient profiles (Figs. S2, S3).

Taxonomic composition of mice cohorts. (A) Stacked column charts illustrating the relative abundances of genera comprising at least 0.1% of the overall composition or phyla (B) of mucosal samples across six cohorts. (C) Stacked column charts illustrating the relative abundances of genera comprising at least 0.1% of the overall composition or phyla (D) of luminal samples across five cohorts. The color legend for phyla is shown in a single column at the right, while the color legend for genera is shown at the bottom of the figure. Intestinal regions are abbreviated as follows: Duodenum—D, jejunum—J, ileum—I, cecum—C, proximal colon—PC, distal colon—DC. If the genus name is unknown, it is labeled by its family name (f); if the family name is also unknown, it is labeled by its order (o).
Microbiome diversity along the GI tract is largely reproduced in SPF cohorts and partially recapitulated in germ-free mice colonized with feces
We compared bacterial alpha- and beta-diversity along the GI tract (Figs. 3, S4–S5). The colon exhibited increased species richness as assessed by total ASVs compared to the SI in the mucosal and luminal samples of the SPF cohorts (Figs. 3A, S4A). “Interregional” differences in species richness (defined as colon vs. SI) were reproduced in the SPF Gavage mucosal samples (Figs. 3A, S4A). The colon also exhibited greater evenness than the SI in both luminal SPF cohorts and in two of the three mucosal SPF cohorts (Figs. 3B, S4B). Luminal but not mucosal datasets of the three gavage recipient cohorts reproduced this interregional pattern in evenness (Figs. 3B, S4B).
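The two alpha-diversity indices compared above can be sketched in a few lines of Python. This is an illustrative implementation of the standard definitions (richness as the number of ASVs with nonzero counts; Pielou's evenness as Shannon entropy divided by the log of richness), not the code used in the study. The function names and the example count vectors are hypothetical, and returning 0.0 for samples with fewer than two ASVs is a convention chosen here.

```python
import math


def observed_richness(counts):
    """Number of ASVs with a nonzero count in a (rarefied) sample."""
    return sum(1 for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou's evenness J = H / ln(S), with H the Shannon entropy."""
    present = [c for c in counts if c > 0]
    s = len(present)
    if s < 2:
        # Evenness is undefined for fewer than two ASVs; 0.0 is an
        # arbitrary convention for this sketch.
        return 0.0
    total = sum(present)
    shannon = -sum((c / total) * math.log(c / total) for c in present)
    return shannon / math.log(s)


# Hypothetical samples: an even community and a dominated community.
even_sample = [10, 10, 10, 10]
dominated_sample = [97, 1, 1, 1, 0]

even_j = pielou_evenness(even_sample)
dominated_j = pielou_evenness(dominated_sample)
```

Both example samples contain four ASVs, so they have identical richness, yet the dominated sample has a far lower evenness. This is why the two indices are reported separately.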

Interregional and intraregional differences in alpha- and beta-diversity in luminal samples. Violin plots comparing either (A) the total number of ASVs or (B) Pielou’s evenness indices across six intestinal regions in the five luminal datasets. Thick lines indicate the comparison between colon and SI. Thin lines indicate the comparison between DC and each of five other regions. Statistical comparisons were made through fitting the distributions of alpha-diversity indices to linear mixed-effects models with site (encoding SI/colon, or encoding six regions) as a fixed effect and MouseID as a random effect, *P < 0.05, **P < 0.01, ***P < 0.001. PCoA plots illustrating (C) interregional differences, (D) colonic intraregional differences, and (E) small intestinal intraregional differences in beta-diversity. R² and P values associated with site were calculated by repeat-measures aware PERMANOVA. Intestinal regions are abbreviated as follows: duodenum (D), jejunum (J), ileum (I), cecum (C), proximal colon (PC), distal colon (DC).
Significant interregional differences in microbiome beta-diversity were observed across all six cohorts in both luminal and mucosal samples (Figs. 3C, S4C). Significant colonic intraregional differences in microbiome beta-diversity were observed in the three SPF mucosal datasets and the HUM MD Gavage dataset but were not observed in the SPF Gavage or HUM SD Gavage datasets (Fig. S4D). More specifically, the DC was significantly different from the PC and cecum in the UCLA SPF cohorts (P < 10⁻⁵ and P < 10⁻⁵) and from the cecum in the CS SPF cohort (P = 0.014). Interregional differences in alpha and beta diversity between the colon and SI were recapitulated in the shotgun sequencing data obtained from the jejunum and DC samples (Fig. S5).
Reproducible patterns of region-specific genera in luminal and mucosal microbiota across all cohorts
We identified “region-specific” genera whose abundances were significantly associated with a specific region compared to DC (Figs. 4, S6). Hierarchical clustering was performed to group region-specific genera into clusters with similar biogeographic behavior. The region-specific genera grouped into four color-coded clusters within each of the SPF datasets and the HUM MD Gavage dataset (Figs. 4A-C, S6A–D). Of these four, purple and blue clusters consisted of SI-enriched and colon-enriched genera, respectively (Figs. 4A–C, S6A–D).

Unsupervised learning reveals four clusters of region-specific genera. Heatmaps illustrating the enrichment or depletion of genera within a site (D, J, I, C, PC) relative to DC in luminal samples from (A) UCLA O. SPF, (B) CS SPF, (C) HUM MD gavage, (D) SPF gavage, and (E) HUM SD gavage. For heatmaps A-E, the color of the bar on the left represents cluster membership. The color of each tile represents the effect size, while asterisks within each tile indicate genera which were significantly different following Benjamini-Hochberg multiple hypothesis correction, *q < 0.05. The rows are labeled by genus, with the label color representing the phylum. Genera which are both shared and have the same directionality in at least three datasets are highlighted. Unidentified genera are labeled with the family (f) name, or with the order name (o) if the family is also not known. (F) Upset plot showing the total number of region-specific genera identified for each dataset in the “set size” panel, with the “intersection” panel illustrating the number of region-specific genera either unique to a dataset or shared across datasets as indicated by the dot matrix. Region-specific species (only those which are named) identified from shotgun sequencing for (G) UCLA O. SPF, (H) CS SPF, (I) HUM SD gavage, or (J) SPF gavage datasets are shown as barplots. Species labels are colored according to phylum.
There were 71, 53, 56, 0, and 1 region-specific genera for UCLA O. SPF, CS SPF, HUM MD Gavage, SPF Gavage, and HUM SD Gavage luminal datasets, respectively (Fig. 4F). 22 region-specific genera are shared between UCLA O. SPF and CS SPF datasets, 19 are shared between UCLA O. SPF, CS SPF, and HUM MD Gavage datasets, and 1 genus (Bacteroides) is shared between UCLA O. SPF, CS SPF, HUM MD Gavage, and HUM SD Gavage datasets (Fig. 4F). Among the highlighted region-specific genera which were shared in at least three out of five luminal datasets, members of phylum Bacteroidota were SI-depleted, such as Parabacteroides, Bacteroides, and Alistipes (Figs. 4A–C). Supporting these findings, Bacteroides acidifaciens, Bacteroides caecimuris, Bacteroides sp. L10_4, and Parabacteroides distasonis were among the shared region-specific species identified through shotgun sequencing of UCLA O. SPF and CS SPF samples (Figs. 4G–H). Members of phyla Actinobacteriota and Proteobacteria were SI-enriched, including Bifidobacterium, Escherichia/Shigella, and Parasutterella (Figs. 4A–C). At the species level, phylum Actinobacteriota members Enterorhabdus sp. P55 and an Eggerthellaceae bacterium were jejunum-enriched in both UCLA O. SPF and CS SPF datasets. Other prominent shared region-specific genera included SI-enriched Enterococcus and Streptococcus, and colon-enriched Oscillibacter and Colidextribacter. The genus Akkermansia was SI-enriched in UCLA O. SPF, CS SPF, and HUM MD Gavage datasets (Fig. 4A-C).
Although few or no region-specific genera were identified in the SPF Gavage and HUM SD Gavage 16S rRNA gene amplicon sequencing datasets, four and seven region-specific species, respectively, were identified following shotgun sequencing of jejunum and DC samples (Figs. 4I–J, S7). In contrast, there were 46 and 39 interregional genera in the SPF Gavage and HUM SD Gavage cohorts, respectively (Fig. S8).
The numbers of region-specific genera in the intestinal mucosa were 68, 41, 48, 26, 5, and 5 for the UCLA O. SPF, UCLA V. SPF, CS SPF, HUM MD Gavage, HUM SD Gavage, and SPF Gavage datasets, respectively (Fig. S6G). Reduced recapitulation of even interregional genera was observed in all three gavage cohorts (Fig. S9). 45% of region-specific genera (34/75) were shared between UCLA O. SPF and UCLA V. SPF datasets, whereas 47% of region-specific genera (37/79) were shared between UCLA O. SPF and CS SPF datasets (Fig. S6G). Prominent SI-enriched genera with predominantly purple cluster membership common in the three SPF datasets included phylum Actinobacteriota members Enterorhabdus and Bifidobacterium, the genus Desulfovibrio, and the lactic acid bacteria Ligilactobacillus, Lactobacillus, Streptococcus, and HT002 (Fig. S6A–C). Enterorhabdus and Bifidobacterium were identified as ileum-enriched in the SPF Gavage and HUM MD Gavage datasets, respectively, whereas Streptococcus was enriched in all three SI sites in the HUM MD Gavage dataset (Fig. S6D–E). Colon-enriched genera included Bacteroides, Parabacteroides, Oscillibacter alongside another unidentified genus belonging to family Oscillispiraceae, Colidextribacter, an unidentified genus belonging to family Ruminococcaceae, and Intestinimonas (Fig. S6A–C). Several of these were recapitulated in the colonized germ-free mice (Fig. S6D–F).
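The shared percentages above appear to use the union of the two datasets' region-specific genera as the denominator: 68 + 41 − 34 = 75 for the UCLA O. SPF and UCLA V. SPF pair. The short Python sketch below reproduces that arithmetic under this assumption; the function name is hypothetical.

```python
def shared_fraction(n_a, n_b, n_shared):
    """Shared genera as a fraction of the union of two genus sets.

    n_a and n_b are the numbers of region-specific genera in each
    dataset; n_shared is the number found in both.
    """
    union = n_a + n_b - n_shared
    return n_shared, union, n_shared / union


# UCLA O. SPF (68 genera) vs. UCLA V. SPF (41 genera), 34 shared.
shared, union, fraction = shared_fraction(68, 41, 34)
percent = round(fraction * 100)
```

This ratio is the Jaccard index of the two genus sets, so it penalizes genera unique to either dataset rather than only those missing from the smaller one.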
Regional specificity of microbial metabolic pathways is observed in SPF mice and perturbed in colonized germ-free mice
To evaluate biogeographical distribution of microbial functions, we estimated “gut metabolic module” (GMM) abundances from the sequencing data. Subsequently, we identified GMMs—grouped into broader categories of inputs and outputs to central metabolism—which exhibited either interregional specificity or regional specificity.
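To illustrate what estimating module abundance from KO coverage can look like, here is a deliberately simplified Python sketch. It is not the omixer-rpm implementation. It assumes that a module is a list of steps, that each step can be fulfilled by any of several alternative KOs, that coverage is the fraction of steps with at least one detected KO, and that abundance is the median of per-step summed KO abundances when coverage meets a threshold. The KO identifiers, the threshold values, and the function name are all hypothetical.

```python
from statistics import median


def module_coverage_and_abundance(module_steps, ko_abundance, min_coverage):
    """Return (coverage, abundance) for one module in one sample.

    module_steps: list of steps, each a list of alternative KO IDs.
    ko_abundance: dict mapping KO ID to its abundance in the sample.
    min_coverage: fraction of steps that must be detected for the
                  module to be called present (an assumed parameter).
    """
    # Sum the alternative KOs within each step (assumption of this sketch).
    step_totals = [
        sum(ko_abundance.get(ko, 0.0) for ko in alternatives)
        for alternatives in module_steps
    ]
    detected = sum(1 for total in step_totals if total > 0)
    coverage = detected / len(step_totals)
    if coverage < min_coverage:
        return coverage, 0.0
    return coverage, float(median(step_totals))


# Hypothetical three-step module and KO table, for illustration only.
toy_module = [["KO_A", "KO_B"], ["KO_C"], ["KO_D"]]
toy_sample = {"KO_A": 4.0, "KO_B": 2.0, "KO_C": 10.0}

lenient = module_coverage_and_abundance(toy_module, toy_sample, min_coverage=0.5)
strict = module_coverage_and_abundance(toy_module, toy_sample, min_coverage=0.9)
```

With two of three steps detected, the toy module is called present under the lenient threshold and absent under the strict one, which shows how the coverage cutoff controls the sensitivity of module detection.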
In the two SPF luminal datasets, all GMMs in the disaccharide degradation, cross-feeding, and ethanol production categories were SI-enriched relative to the colon (Fig. 5A–B). Additionally, GMMs within the lipolytic fermentation, central metabolism, and proteolytic fermentation categories were SI-enriched in both SPF datasets (Fig. S10A–B). However, in the gavage recipient cohorts, GMMs in these categories were colon-enriched (Fig. S10C–E). Butyrate production was colon-enriched in the SPF mice (Figs. 5A–B, S10A–B). Although this was recapitulated in the SPF Gavage and HUM SD Gavage mice, butyrate production was SI-enriched in the HUM MD Gavage mice (Figs. S10C–E, 6C–E).

Predicted microbial metabolism exhibits interregional specificity. KEGG orthologs predicted from compositional data were grouped into gut metabolic modules, followed by module enrichment analysis. Metabolic modules were subsequently grouped into higher-order categories visualized in these metabolic maps, which were adapted from the GOMIXER pathways mapper. The color of the category indicates whether all pathways were enriched, all depleted, or were differentially enriched in the SI relative to the colon, with the legend shown at the bottom right of the figure. Metabolic maps are shown for luminal samples from (A) UCLA O. SPF, (B) CS SPF, (C) SPF gavage, (D) HUM SD gavage, and (E) HUM MD gavage datasets.

Predicted gut metabolic modules exhibit region specificity. KEGG orthologs predicted from compositional data were grouped into gut metabolic modules (GMMs) for module enrichment analysis. Significant GMMs that were consistently differentially abundant in proximal regions of the intestines (duodenum (D), jejunum (J), ileum (I), cecum (C), or proximal colon (PC)) compared to the DC in at least three out of five luminal cohorts are shown in this figure. GMMs are grouped into higher-order categories for visualization, with carbohydrates shown in (A), proteolytic fermentation in (B), lipolytic fermentation, sugar acid, and nitrate reduction categories shown in (C), cross-feeding and butyrate in (D), and central metabolism in (E). Line graphs depict regression coefficients and their standard errors for each region-DC comparison. Each cohort is represented by a different color line, with the legend for all plots given at the top of the figure. The asterisk indicates that the region-DC comparison was significant (*q < 0.05) following multiple hypothesis correction, while the asterisk highlight color corresponds to the cohort as indicated in the legend. (F) Upset plot showing the total number of region-specific GMMs identified for each dataset in the “set size” panel, with the “intersection” panel illustrating the number of region-specific GMMs either unique to a dataset or shared across datasets as indicated by the dot matrix.
To investigate whether interregional specificity in GMMs is accompanied by interregional shifts in microbial metabolite production, we utilized a custom pre-trained elastic net model to predict metabolites from the sequencing data. We identified 43 metabolites which significantly distinguished the colon from the SI across all five luminal 16S rRNA gene sequencing datasets (Fig. S11). Among these metabolites, amino acids and amino acid conjugates, in addition to ceramides, were largely SI-enriched (Fig. S11B–C). A total of 177 metabolites were region-specific and shared between UCLA O. SPF and CS SPF cohorts with the same directionality, and were also either not significant, not detected, or of the opposite directionality in at least two of the three gavage recipient cohorts (Figs. S12–S13). Among these metabolites, SI-enrichment of many phospholipid and sphingolipid metabolites distinguished the SPF from gavage recipients (Fig. S12A–B).
Reproducible region-specific luminal GMMs were identified as those showing a consistent direction of enrichment in at least three cohorts. Relative to the DC, lactose and galactose degradation was enriched in the jejunum and ileum of SPF mice, and in the cecum of HUM MD Gavage mice (Fig. 6A). Glycerol degradation II and lactate consumption I in the lipolytic fermentation and cross-feeding categories, respectively, were SI-enriched relative to the DC in UCLA O. SPF, CS SPF, and HUM MD Gavage cohorts (Fig. 6C–D). For CS SPF, HUM SD Gavage, and HUM MD Gavage cohorts, rhamnose degradation and histidine degradation were depleted in the jejunum and ileum relative to the DC (Fig. 6A–B). Butyrate production I was depleted in all three SI sites for the UCLA O. SPF, CS SPF, and HUM MD Gavage cohorts (Fig. 6D). Biogeographical patterns of glycerol degradation II, rhamnose degradation, histidine degradation, and butyrate production I were reproduced in the shotgun sequencing data (Fig. S14). GMMs where biogeography diverged between the SPF mice and humanized germ-free mice included acetate to acetyl-CoA, which was depleted in the jejunum and ileum relative to the DC of humanized mice but not SPF mice, and the pentose phosphate pathway (oxidative phase), which was jejunum-enriched in both SPF cohorts but depleted in both HUM Gavage cohorts (Fig. 6E). There were no region-specific GMMs in the SPF Gavage cohort (Fig. 6F).
Across the three SPF mucosal datasets, all GMMs in the disaccharide degradation, cross-feeding, and ethanol production categories were SI-enriched (Figs. S15A–C, S16A–C). Only butyrate production was reproducibly colon-enriched across all three SPF cohorts, whereas lipolytic fermentation and mucin degradation were colon-enriched in at least two SPF mucosal datasets (Figs. S15A–C). The distribution of disaccharide degradation, ethanol production, and butyrate production was preserved in the SPF Gavage mucosal samples (Figs. S15D, S16D). However, interregional distribution of other categories was absent (Figs. S15D, S16D). In the HUM SD Gavage mucosal samples, metabolism was primarily enhanced in the colon compared to the SI, including carbohydrate degradation and cross-feeding (Figs. S15E, S16E). SI enrichment of all GMMs in the ethanol production category, SI enrichment of GMMs in the lipolytic fermentation and cross-feeding categories, and colon enrichment of GMMs in the butyrate production category were recapitulated in the HUM MD Gavage cohort (Figs. S15F, S16F).
We examined the region-specific GMMs shared across at least three out of six mucosal datasets with the same directionality (e.g. all three enriched or all three depleted relative to the DC) (Fig. S17). Region-specific GMMs present in at least two SPF datasets and recapitulated in at least one colonized germ-free mouse dataset included the following: jejunal and ileal enrichment of fructose degradation; SI depletion of rhamnose degradation; jejunal and ileal depletion of xylose degradation; ileal enrichment of lactose and galactose degradation; jejunal enrichment of glycerol degradation I; jejunal depletion of glyoxylate bypass; ileal depletion of glutamine degradation II; SI enrichment of the pyruvate dehydrogenase complex; and SI depletion of butyrate production I (Figs. S17, S18). SI enrichment of some GMMs was apparent in SPF mice but was lacking or even reversed to SI depletion in colonized germ-free mice, including degradation of the simple carbohydrates mannose, ribose, maltose, and lactose, and the pentose phosphate pathway (oxidative phase) (Figs. S17, S18). Taken together, these results suggest that many aspects of microbial metabolism are SI-enriched compared to the colon.
Regional specificity of microbial gut-brain pathways is observed in SPF mice and perturbed in colonized germ-free mice
We evaluated whether microbial neuroactive compound metabolism assessed through estimation of “gut-brain modules” (GBMs) also exhibits longitudinal gut biogeography. Analysis of interregional distribution of GBMs in the lumen demonstrated that γ-aminobutyric acid (GABA) degradation, 17-beta-estradiol degradation and inositol synthesis were SI-enriched, whereas tryptophan synthesis, tryptophan degradation, butyrate synthesis I, and butyrate synthesis II were colon-enriched in both the UCLA O. SPF and CS SPF cohorts (Fig. S19A–B). Although these interregional-specific GBMs were recapitulated to different extents in the three gavage recipient cohorts, the directionality of the GBMs varied (Fig. S19C–E). Predicted metabolites shared among all five cohorts included SI-enrichment of cholesterol, which is a precursor of steroid hormone synthesis, colon enrichment of phosphoinositol, a derivative of inositol synthesis, and SI-enrichment of tryptophan (Fig. S11).
GBM regional specificity agreed well with the interregional comparisons in luminal datasets, with depletion of butyrate synthesis I and tryptophan degradation observed in all three SI sites relative to the DC in the two SPF cohorts and also in the HUM MD Gavage cohort (Fig. 7A-B). Duodenal enrichment of 17-beta-estradiol degradation as well as jejunal and ileal enrichment of GABA degradation were observed in the two SPF cohorts and also in the HUM MD Gavage cohort (Fig. 7D-E). Relative to the DC, jejunal enrichment of acetate degradation was observed in UCLA O. SPF mice, whereas acetate degradation was depleted in the jejunum for both HUM SD Gavage and HUM MD Gavage mice (Fig. 7C). These findings were generally supported by the shotgun sequencing data (Fig. 7G). There were no region-specific GBMs in the SPF Gavage luminal samples (Fig. 7F).

Figure 7. Predicted gut-brain modules exhibit region specificity. KEGG orthologs predicted from compositional data were grouped into gut-brain modules (GBMs) for module enrichment analysis. (A–E) Line graphs show the regression coefficients and their standard errors for each region-versus-DC comparison, corresponding to five significant GBMs that were consistently differentially enriched in proximal regions of the intestines (duodenum, D; jejunum, J; cecum, C; or proximal colon, PC) compared to the DC in at least three of five luminal datasets. Each dataset is represented by a different color line, with the legend for all line plots shown at the top of the figure. An asterisk indicates that the site-versus-DC comparison was significant following multiple hypothesis correction (*q < 0.05), and the asterisk highlight color corresponds to the cohort as indicated in the legend. (F) Upset plot showing the total number of region-specific GBMs identified for each dataset in the “set size” panel, with the “intersection” panel illustrating the number of region-specific GBMs either unique to a dataset or shared across datasets as indicated by the dot matrix. (G) Bar plots showing the jejunal vs. distal colon enrichment of the five GBMs as determined by shotgun sequencing for each of four cohorts.
In the mucosal samples, the interregional distribution of GBMs showed that nitric oxide degradation and GABA degradation were SI-enriched in all three SPF cohorts, with preservation of SI-enrichment of nitric oxide degradation observed in all three gavage recipient cohorts (Fig. S20). Similar to the luminal datasets, tryptophan synthesis, tryptophan degradation, butyrate synthesis I, and butyrate synthesis II were colon-enriched in all three SPF cohorts (Fig. S20A–C). Preservation of colon-enrichment of butyrate synthesis I and butyrate synthesis II was observed in the SPF Gavage cohort, whereas colon-enrichment of tryptophan synthesis, tryptophan degradation, and butyrate synthesis I (but not butyrate synthesis II) was observed in the HUM SD and MD Gavage cohorts (Fig. S20D–F).
There were 18 GBMs which distinguished each of the proximal regions of the intestines from the DC in at least three out of six mucosal datasets with the same directionality (Fig. S21). In agreement with findings at the interregional level, duodenal and ileal depletion of tryptophan degradation, jejunal enrichment of nitric oxide degradation I, and jejunal depletion of butyrate synthesis I stood out as region-specific GBMs shared across the three SPF mucosal datasets and recapitulated in either the SPF Gavage or HUM MD Gavage datasets (Fig. S21). There were no region-specific GBMs in the HUM SD Gavage mucosal samples (Fig. S21E).
Transverse differences in microbiome composition and function across six regions of the GI tract are context-dependent
We assessed differences in microbiome biodiversity between luminal and mucosal samples (“sample type”) within regions of the intestines. Transverse differences in biodiversity were subtle, with only an increase in evenness in cecal luminal samples compared to cecal mucosal samples observed in both SPF cohorts (Fig. S22A-B). Samples from the UCLA O. SPF cohort showed significant sample type differences in microbiome beta-diversity within SI, colon, and each of six regions, but samples from the CS SPF dataset showed significant sample type differences only within the cecum (Fig. S22C–J).
In gavaged mice, the cecal transverse differences in evenness were reproduced only in the HUM MD Gavage mice (Fig. S23A–C). SPF Gavage mice had significant transverse differences in microbiome beta-diversity at the aggregate level in both SI and colon, and additionally within the cecum and PC regions (Fig. S23D–E, S23J–K). HUM SD Gavage mice exhibited significant transverse differences in beta-diversity only within the jejunum, whereas HUM MD Gavage mice exhibited significant transverse differences at both the aggregate SI and colon levels and within all intestinal regions except the ileum (Fig. S23F–I, S23L–O).
When SI and colon samples were aggregated, there were sample type-specific genera shared between two cohorts, but few were shared across at least three cohorts (Fig. S24, S25). There were fewer sample type-specific genera in the SI for CS SPF, SPF Gavage, and HUM SD Gavage (8, 4, and 0, respectively) than in the colon (19, 15, and 17, respectively) (Fig. S24, S25). Though mucosal-enriched bacteria in the SI varied between UCLA O. SPF and CS SPF mice, Enterorhabdus, Desulfovibrio, Clostridia UCG-014, and HT002 were reproducibly luminal-enriched (Fig. S24A–B). In both UCLA O. SPF and HUM MD Gavage SI datasets, Escherichia/Shigella, Eubacterium siraeum group, and a genus from family Peptococcaceae were among those that were mucosal-enriched, whereas Turicibacter and a genus from family Lachnospiraceae were among those that were luminal-enriched (Fig. S24A, E). Within the colon, Mucispirillum and a genus from Gastranaerophilales were mucosal-enriched, whereas Streptococcus, Enterorhabdus, and Parasutterella were luminal-enriched in both UCLA O. SPF and CS SPF mice (Fig. S25A–B). Escherichia/Shigella was mucosal-enriched in the colons of UCLA O. SPF, HUM SD Gavage, and HUM MD Gavage mice (Fig. S25A, D–E). Different genera belonging to family Lachnospiraceae were identified as mucosal-enriched in the colons of UCLA O. SPF, SPF Gavage, HUM SD Gavage, and HUM MD Gavage mice (Fig. S25A, C–E).
At the level of individual regions, the number of sample type-specific genera varied greatly between datasets (Fig. S26). None were shared across three cohorts (Fig. S26E). Of the 22 sample type-specific genera shared between UCLA O. SPF and HUM MD Gavage cohorts, Escherichia/Shigella was consistently mucosal-enriched in the duodenum, cecum, PC and DC regions (Fig. S26A, C). Of the five genera shared between UCLA O. SPF and CS SPF cohorts, Akkermansia was consistently mucosal-enriched in the DC, whereas Enterorhabdus was consistently luminal-enriched in the duodenum, ileum and cecum (Fig. S26A, B).
Given that variation in sample type specific genera was minimized at the aggregate level, we performed gut metabolic enrichment analysis within the SI or colon (Fig. S27, S28). The two SPF cohorts showed luminal enrichment of central metabolism and cross-feeding within the SI and the colon, which was recapitulated in the HUM SD Gavage dataset, partially recapitulated in the SPF Gavage dataset (within the colon only), and not recapitulated in the HUM MD Gavage dataset (Fig. S27, S28). Butyrate metabolism was mucosal-enriched in the colon of both SPF datasets, but this behavior was not recapitulated in any of the gavage recipient cohorts (Fig. S27B and D, S28 B, D, and F). However, polysaccharide degradation was luminal-enriched within the colons of all five cohorts (Fig. S27, S28). We summarize and highlight key findings in Fig. 8.

Figure 8. Summary of key findings on biogeographical distribution of microbes and their predicted functions. Along the longitudinal axis, key interregional findings are summarized in the comparison of small intestine to colon at the top, while select region-specific findings are highlighted in the middle. The direction of the arrow indicates the enrichment (up) or the depletion (down) of its associated feature within the region relative to the reference. These findings align with the known gradients of oxygen and host dietary macromolecule availability shown at the bottom of the figure. Along the transverse axis, findings were primarily region-independent, with the lumen exhibiting increased metabolic activity compared to the mucosa. However, within the colon, polysaccharide degradation was increased in the lumen while butyrate production was increased in the mucosa. The degree of overlap of the circles indicates the relative extent to which these longitudinal and transverse biogeographical distributions are reproducible across facilities or recapitulated with fecal orogastric gavage. Figure created with BioRender.
Discussion
Along the longitudinal GI axis, we identified two clusters of bacteria which exhibit dominant biogeographical patterns in SPF mice. Increased oxygen in the SI favors fast-growing, facultative anaerobes [17]. Indeed, genera in the SI-enriched purple cluster were often facultative anaerobes (Lactobacillus, Streptococcus) or exhibited aerotolerance (Desulfovibrio, Enterorhabdus), whereas the colon-enriched blue cluster consisted of obligate anaerobic bacteria (e.g. Parabacteroides, Bacteroides) [27–31]. The mucus-degrading genus Akkermansia was significantly enriched in the SI lumen of the SPF cohorts. Akkermansia has been detected in all regions of the human GI tract, though its spatial distribution along the GI tract has not previously been reported [32]. Akkermansia supports host physiology by maintaining epithelial barrier integrity, improving metabolism and glucose homeostasis in obese and diabetic mice, and exerting anti-inflammatory effects [33–36]. We speculate that the beneficial properties of Akkermansia, especially in response to dietary stress, may be due to its enrichment in the SI.
The SI is the primary site of nutrient digestion and absorption, yet the role of SI microbiota in these processes is often overlooked [37]. Germ-free mice demonstrated resistance to diet-induced obesity and impaired lipid absorption, reversed by jejunal microbiota colonization [38]. Compared to feces, human SI metagenomes and metatranscriptomes exhibited enrichment in simple carbohydrate transport phosphotransferase systems, central metabolism (pentose phosphate pathway), and amino acid metabolism [39]. Our study recapitulates these findings and provides a more detailed spatial context on the biogeographical distribution of these processes. Existing models of microbial carbohydrate metabolism propose that the SI microbiota are primarily responsible for fermentation of simple carbohydrates, and that fermentation of complex polysaccharides inaccessible to human digestive enzymes by the colonic microbiota results in production of short chain fatty acids (SCFAs) [39, 40]. Although the results of our metabolic modeling agree with this model in the case of disaccharides, several monosaccharide degradation GMMs were colon-enriched. Additionally, our findings reveal increased microbial cross-feeding, including increased lactate consumption by the SI microbiota. Lactate production is common among members of the gut microbiota, though its conversion to the SCFAs propionate, acetate, or butyrate has only been noted in several species or requires the cooperation of multiple species [41]. This may account for our observed inconsistencies in acetate/propionate biogeography across SPF cohorts. However, butyrate production was consistently colon-enriched, which may be consistent with widespread butyrate synthesis capabilities among colonic bacteria [42]. 
We note that there were several discrepancies between the GMM and metabolite distribution patterns identified in the shotgun metagenomics datasets compared to the patterns identified from the 16S rRNA gene amplicon sequencing datasets; this may be attributed to a combination of reduced sample sizes in the shotgun data and imperfections in gene abundance imputation by PICRUSt2 in the 16S rRNA gene amplicon data. Although the PICRUSt2 developers found a strong correlation of ~0.8 between paired predicted KO and metagenome KO in stool samples, the accuracy of imputation likely varies by gene and some may be less accurately predicted than others [43].
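As an illustration of how agreement between imputed and measured gene abundances might be quantified for a single sample, the sketch below computes a rank correlation between two abundance vectors. It assumes a Spearman correlation, which the text above does not specify, and the function names and abundance values are hypothetical. It is not the PICRUSt2 validation procedure itself.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Hypothetical KO abundances for one sample (made-up numbers).
predicted_ko = [5, 0, 12, 3, 8]  # imputed from 16S data
measured_ko = [4, 1, 9, 6, 7]    # observed in shotgun data
rho = spearman(predicted_ko, measured_ko)
```

A single per-sample coefficient summarizes agreement across all genes at once, so it can be high even when particular genes are poorly imputed. Gene-level accuracy would instead require correlating each gene's predicted and measured abundance across samples.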
Given accumulating evidence supporting modulation of neuroactive metabolites by the gut microbiota, we assessed regional specificity of neuroactive compound metabolism [44–46]. Many region-specific GBMs were identified, including luminal and mucosal enrichment of GABA degradation in SI sites relative to the DC. GABA is synthesized in neuronal cells of the submucosal and myenteric plexuses in the GI tract, as well as in mucosal endocrine-like cells [47]. GABA signaling regulates critical aspects of gut physiology, including gut motility, gastric mucosal secretion, and mucosal electrolyte transport [47, 48]. Bacteria are capable of producing and degrading GABA [49, 50]. In a mouse model of essential tremor, low GABA-producing microbes exacerbated behavioral abnormalities, whereas supplementation with high GABA-producing Lactobacillus plantarum alleviated tremor and increased the GABA-producing potential in the SI but not the colonic microbiome [51]. Our results support the SI as the primary site of GABA regulation by the gut microbiota [52].
One unique aspect of this study is that we assessed the reproducibility of our biogeographical findings across two different facilities. Facility-specific effects often confound results, yet they may be physiologically relevant for microbiome research. For example, the presence of Duncaniella muricolitica and Alistipes okayasuensis in housing facilities determines outcomes in the dextran sodium sulfate colitis mouse model [53]. Despite the distinct microbiota compositions between facilities, we identified shared region-specific taxa, taxonomic clustering behaviors, and metabolic potential along the longitudinal axis. However, transverse differences along the GI tract were largely not reproducible. This may be due to confounding facility-specific environmental variables, including the diet used for standard chow, bedding material, caging conditions, handling stress, and natural environmental microbial variation [54, 55]. These confounding variables could subsequently influence microbial adhesion to the intestinal epithelium. Additionally, larger sample sizes may be necessary to detect transverse differences in gut biogeography, favoring identification of sample type-specific bacteria in the UCLA O. SPF cohort (n = 48) and HUM MD Gavage cohort (n = 45) compared to smaller cohorts. These were limitations of our study.
Biogeography is often overlooked in microbiome causality studies, where phenotypic transmissibility via fecal microbiota transfer is the dominant technique. We first demonstrate that interregional differences in beta-diversity are preserved in the SPF Gavage and humanized mouse cohorts. Interregional longitudinal biogeography of microbial composition was recapitulated in gavage recipient mice, but many genera identified in SPF mice as region-specific were not recapitulated; this was observed in both luminal and mucosal samples of SPF Gavage and HUM SD Gavage mice, and in mucosal but not luminal samples of HUM MD Gavage mice. Additionally, biogeographical distribution of predicted region-specific metabolic and neuromodulating potential was perturbed in gavaged mice. One specific example is the pentose phosphate pathway (oxidative phase), which was SI-enriched in SPF mice, not region-specific in SPF Gavage mice, and SI-depleted in the HUM SD Gavage and HUM MD Gavage mice. Another example is SI-enrichment of phospholipids and sphingolipids, which was observed in both SPF mouse cohorts but not in gavage recipient cohorts. Gut bacteria have been shown to synthesize both kinds of lipids, which may interact with host inflammatory pathways [56]. Overall, of the three gavage recipient cohorts, the HUM MD Gavage cohort best recapitulated compositional and functional observations of biogeography in the SPF mice. We posit that this may be an advantage of utilizing multiple donor sources compared to one donor source, in alignment with previous studies describing diverse colonization abilities of exogenous microbes [57, 58]. We observed that mice from both humanized cohorts exhibited an expansion of Akkermansia, whereas there was no such expansion of Akkermansia in the SPF Gavage mice, suggesting that biogeographical patterns depend both on reshaping by the host and on the composition of the original material.
We acknowledge that our study utilized germ-free mice which were colonized as adults, which may result in residual immunological defects such as elevated serum IgE that could affect the biogeographical distribution of the gut microbiome [59].
Our detailed characterization of the biogeographical distribution of microbes and their predicted functions emphasizes the importance of selecting the appropriate sampling sites for studies of diseases with region-specific dysbiosis. Multi-omics profiling of patients with inflammatory bowel disease revealed distinct microbes and activities distinguishing ileal from colonic Crohn’s disease [60]. Celiac disease is another example of a biogeography-dependent disorder. Gnotobiotic mice colonized with duodenal aspirates from patients with celiac disease exhibited reduced gluten metabolism compared to mice colonized with aspirates from controls with similar genetic risk, which may contribute to the duodenal inflammation observed in celiac disease [61]. We additionally report that fecal microbiota transplantation does not fully reinstate spatial organization of the microbiota or their functions along the longitudinal axis of the gut. Alternative approaches to fecal microbiota transplantation such as whole-intestinal microbiota transplantation may be necessary to study diseases with region-specific gut manifestations [62]. Redirecting our focus away from fecal microbiomes will enhance our understanding of host–microbe interactions and benefit the future of microbiome research.
Acknowledgements
The authors thank the UCLA Goodman-Luskin Microbiome Center Microbiome Core, UCLA Technology Center for Genomics & Bioinformatics, and the National Gnotobiotic Rodent Resource Center for their support. RT thanks Jeremy Zerbe for the tutorial on manipulating SVG files. JCY is grateful to Siamion Kurakou for suggesting modifications to make figures color-blind accessible.
Author contributions
J.P.J. conceived and planned the experiments. V.L., N.A.J., C.C., M.H., W.K., Y.Z., F.S., C.K., F.L., and N.J. performed the mouse experiments, collected samples, and contributed to 16S rRNA gene amplicon sequencing data library preparation and acquisition. E.A. curated samples and performed shotgun metagenomics sequencing library preparation. S.B. preprocessed the shotgun sequencing data. J.C.Y. developed software, performed bioinformatics analyses, and wrote the manuscript. T.S.D., J.S., R.T., H.B., and C.A.S. contributed to data analysis, visualization, and writing. J.P.J. supervised J.C.Y. on the software, analyses, data visualization, and additionally wrote the manuscript. J.P.J. and J.B. acquired funds for the experiments. All authors read and approved the final manuscript.
Conflicts of interest
At the time of manuscript submission, V.L. is an employee of Kite Pharma. C.S. is an employee of AbbVie. J.S. is an employee of Abiosciences. J.C.Y. is an employee of Merck. All other authors declare no competing interests.
Funding
This research was funded by the Vatche and Tamar Manoukian Division of Digestive Diseases. J.P.J. was supported by Crohn’s and Colitis Foundation Career Development Award 510956 and VA CDA2 IK2CX001717.
Data availability
Raw data can be accessed in the National Center for Biotechnology Information BioProject database with the identifier PRJNA944800 (http://www.ncbi.nlm.nih.gov/bioproject/944800).
Ethics approval and consent to participate
All mouse experiments were approved by the University of California Los Angeles Animal Research Committee.
Consent for publication
Not applicable.