-
PDF
- Split View
-
Views
-
Cite
Cite
Shuangshuang Cheng, Zhilin Ning, Ke Huang, Yuan Yuan, Xinjiang Tan, Yuwen Pan, Rui Zhang, Lei Tian, Yan Lu, Xiaoji Wang, Dongsheng Lu, Yajun Yang, Yaqun Guan, Dolikun Mamatyusupu, Shuhua Xu, Analysis of sex-biased gene expression in a Eurasian admixed population, Briefings in Bioinformatics, Volume 25, Issue 5, September 2024, bbae451, https://doi.org/10.1093/bib/bbae451
- Share Icon Share
Abstract
Sex-biased gene expression differs across human populations; however, the underlying genetic basis and molecular mechanisms remain largely unknown. Here, we explore the influence of ancestry on sex differences in the human transcriptome and its genetic effects on a Eurasian admixed population: Uyghurs living in Xinjiang (XJU), by analyzing whole-genome sequencing data and transcriptome data of 90 XJU and 40 unrelated Han Chinese individuals. We identified 302 sex-biased expressed genes and 174 sex-biased cis-expression quantitative loci (sb-cis-eQTLs) in XJU, which were enriched in innate immune-related functions, indicating sex differences in immunity. Notably, approximately one-quarter of the sb-cis-eQTLs showed a strong correlation with ancestry composition; i.e. populations of similar ancestry tended to show similar patterns of sex-biased gene expression. Our analysis further suggested that genetic admixture induced a moderate degree of sex-biased gene expression. Interestingly, analysis of chromosome interactions revealed that the X chromosome acted on autosomal immunity-associated genes, partially explaining the sex-biased phenotypic differences. Our work extends the knowledge of sex-biased gene expression from the perspective of genetic admixture and bridges the gap in the exploration of sex-biased phenotypes shaped by autosome and X-chromosome interactions. Notably, we demonstrated that sex chromosomes cannot fully explain sex differentiation in immune-related phenotypes.
Introduction
Sex-biased genes (SBGs) were previously defined as genes with sexually dimorphic expression, including male-biased genes (MGs) or female-biased genes (FGs), depending on which sex shows the higher expression level [1]. SBGs are widespread across human tissues and are the core genes for understanding sexually dimorphic traits, including height [2], brain structure [3], muscle [4], and sex divergence in disease. For example, the incidence of autoimmune diseases is greater in females than in males [5], the immune system is stronger and more sensitive in females than in males [6, 7], and the COVID-19 fatality rate is greater in males than in females [8]. Overall, in humans, ~37% of genes exhibit sex-biased expression in at least one tissue [9]. Many factors, including hormones, environmental exposures, sex-biased epigenetics, and sex-biased genomic regulation, may affect sex-biased gene expression [10]. Genetic factors have a weaker influence but are more valuable to study than other factors [9, 11, 12]. Ancestral backgrounds are essential determinants of the genetic effects observed in gene expression studies [13–15]. However, the influence of ancestral backgrounds on sex-biased gene expression has not been thoroughly investigated. In addition, sex chromosome–autosome interactions play important roles in determining sex characteristics through the X chromosome; however, these interactions are rarely studied via integrated omics studies [16, 17].
Here, we performed the first study on sex-biased gene expression in Xinjiang Uyghurs (XJU), a typical admixed population with two main ancestry sources, Eastern and Western (EAS and EUR) [18–23]. We combined genomic and transcriptomic sequencing data to identify potential genetic factors underlying sex-biased gene expression and investigate how genetic ancestries affect sex-biased gene expression. Instead of using different populations with different ancestral backgrounds, we focused on one admixed population to eliminate confounding factors, including environmental factors, individual lifestyles, and climatic conditions. We established an admixture-induced sex-biased gene expression model and proposed that admixture potentially results in a moderate degree of sex-biased gene expression. To systematically dissect the full picture of genetic regulation, we explored the interactive genetic effects between autosomes and X chromosomes on sex bias-associated phenotypes. Our work provides an extensive understanding of the sex differences of XJU, explores the potential roles of autosomes and X chromosomes in sex-biased phenotypes and regulation, and highlights the importance of ancestral backgrounds in sex bias, providing new insight into the evolutionary process of sex-biased genes.
Results
Sex-biased genes
We identified 302 SBGs in the whole blood of 90 XJU individuals, accounting for 1.02% (302/29 515 genes) of all expressed genes (Table S1 and S2, Fig. S1). Among these SBGs, 139 were MGs and 163 were FGs (Fig. 1a). Only 28 SGBs were located on X chromosomes (Materials and Methods). The expression levels of FGs on X chromosomes were higher than those of MGs on X chromosomes or autosomes (max fold change = 6), possibly due to escape from X chromosome inactivation (XCI) [9, 24]. Consistent with the findings of a previous study [25], most of the SBGs (~90%) mapped to autosomes, indicating that sex-biased genes were not restricted to sex chromosomes but instead widespread throughout the whole genome. In general, FGs outnumbered the MGs. The number of FGs and MGs differed across the chromosomes (Fig. 1b). On half of the chromosomes, the FGs outnumbered MGs; especially on chromosome X, the number of FGs was greater than that on each of the autosomes; and on chromosome 13, MGs were not detected.

SBGs in XJU and SBG comparison between XJU and ancestral source populations. (a) Volcano plot of SBGs in XJU. The x-axis indicates the degree of sex-biased gene expression estimated with log(M/F). The y-axis indicates the log10 transformed P-value of the significant sex-biased genes. Each dot indicates one gene. The red color indicates XJU FGs; the blue color indicates XJU MGs; the circles denote genes on autosomes; and the triangles denote genes on chromosome X. (b) Distribution of XJU SBGs across genomes. (c) Number of respective and overlapping SBGs in the XJU, EAS, and EUR populations.
FGs were enriched in over 100 disease-related pathways, such as SARS-CoV-2 infection, human parainfluenza virus 3 infection, respiratory syncytial virus infection, ovarian cancer progenitor cell response, Epstein–Barr virus infection, human papillomavirus infection, hepatitis C virus infection, interferon signaling, and escaping X inactivation (Figs S2 and S3). MGs were enriched in approximately 20 disease-related pathways, including neutrophil degranulation, acute myeloid leukemia, multiple myeloma, plasma cell maintenance, head and neck cancer, Helicobacter pylori cagA-positive strain infection, and systemic lupus erythematosus (SLE; Fig. S4). In contrast to those of MGs, the functions of FGs were related to viral infection and the immune response, consistent with previous findings that FGs are immune-related [25], which indicated that FGs were tightly correlated and their function was highly related to the innate antiviral response. The FGs demonstrated slightly faster evolutionary rates, more homologous gene numbers, and more recent origins during phyletic evolution than that of the MGs (P = .25, P = .023, and P = 0.51; one-tailed t-test; Figs S5–S7). This reflected that FGs arose relatively recently and that MGs are more conserved in evolutionary history, consistent with previous findings that a particularly fast and strong immune response has evolved in females [26].
Among all the SBGs, four genes made great contributions to the immune-related pathways enrichment: three FGs (OAS1, OAS2, and OAS3) and one MG (MPO). For the three FGs, it was reported that OAS1, OAS2, and OAS3 were in the interferon-inducible oligoadenylate synthetase (OAS) gene cluster and encoded antiviral restriction enzyme activators. The expression could restrict intracellular pathogenic mycobacterial replication and enhance pro-inflammatory cytokine secretion [27]; downregulation of these genes might be associated with worse clinical outcomes in patients with COVID-19, such as elevated D-dimer levels and increased viral loads [23], and increased susceptibility to SARS-CoV in Vietnamese and Chinese individuals [28–30]. For the 1 MG, MPO was reported to be correlated with the microbicidal against in a wide range of organisms [31], which encoded one of the top shared autoantigens involved in coronavirus respiratory infections. The exaggerated increase in the level of the antigens during COVID-19 infection could lead to disruption of autoimmune tolerance and recognition of autoantigens by immune sentinel cells [32]. Therefore, the sex-biased expression of OAS1, OAS2, OAS3, and MPO indicated the key clue for the sex differences in the susceptibility, resistance mechanisms, and prognosis of SARS-CoV or other virus/bacteria-related diseases.
To determine the concordance degree of sex-biased genes, we analyzed the replication ratios of XJU SBGs with EAS SBGs and EUR SBGs found in whole blood samples (Genotype-Tissue Expression; GTEx V7 dataset; Fig. 1c, Tables S3 and S4, Materials and Methods). There were 15 common SBGs, of which 13 were on X chromosomes. Like those of the XJU SBGs, the EAS SBGs and EUR SBGs were also immune-related (Figs S8–S11). In addition, 13 common SBGs were identified to be sex-biased genes in previous studies [9, 25], which were correlated with predominant immune, neurological, and reproductive-related functions and disorders (Table S5). Regarding FGs, only 12.60% (16/127) of the pathways enriched in the EAS FGs overlapped with those enriched in the XJU FGs, while 70.83% (51/72) of the pathways enriched in the EUR FGs overlapped with those enriched in the XJU FGs. Regarding MGs, the replication ratios of the enriched pathways were 66.67% (2/3) in the EAS samples and 13.04% (3/23) in the EUR samples. Our results indicate that SBGs are population-specific and that ethnicity should be fully considered in population SBG analysis.
Sex-biased cis-expression quantitative loci
Diverse factors drive SBGs, such as sex-differentiated transcription factors (TFs) and genetic variants, which explain a portion of these expression differences but cannot be ignored [9, 10, 33–36]. Here, we assessed the sex-biased genetic effect on gene expression in XJU. In total, we identified 174 sex-biased cis-expression quantitative loci (sb-cis-eQTLs; Fig. S12, Materials and Methods) associated with 60 genes (sb-cis-eGenes; Table S6). Only one sb-cis-eQTL was located on the X chromosome. Approximately 25% of the sb-cis-eQTLs were observed with the low frequency (MAF < 0.05), and ~ 6% of the sb-cis-eQTLs were observed with the greater ancestral genetic difference (|${F}_{ST}$| between EAS and EUR > 0.2). Moreover, almost a quarter of the sb-cis-eQTLs were fixed in at least one ancestral source population of XJU, suggesting the population specificity of XJU sb-cis-eQTLs and the possibility of admixture-induced genetic effects on sex-biased gene expression (Fig. S13).
Only a few sb-cis-eGenes exhibited sex-biased expression, which was consistent with the previous reports [9, 37]. Further, we found that the sb-cis-eGenes exhibited a deviated distribution of expression according to sex (Fig. S14), indicating that sb-cis-eGenes tended to be SBGs, likely due to the weak sex-biased genetic effect on gene expression or other functional mechanisms of SBGs.
These sb-cis-eQTLs were enriched in many noncoding elements, including transcription start sites (TSSs; P < 10−16, OR > 1) and enhancer regions (P < 10−16, OR > 1; Fig. S15). In addition, the sb-cis-eGenes showed great enrichment in immunity and metabolism-related pathways, including human cytomegalovirus (HCMV) infection and propanoate metabolism. And some enriched pathways were tumorigenesis-related, including breast cancer, bladder cancer, and thyroid cancer-related pathways (adjusted P < .05; Figs S16 and S17). Besides that, we noticed a significant enrichment including arrhythmogenic right ventricular cardiomyopathy, which was previously reported to demonstrate a sex-biased difference in prevalence, confirming the potential genetic effects of our identified sb-cis-eGenes [38].
The sb-cis-eQTL of PRDM1, rs742108, was previously found to be one of the risk variants of SLE in a genome-wide association study (GWAS) [39]. Another colocalization analysis [40] revealed that this locus is a putative cause variant of SLE (posterior probability of sharing the same causal variant (PP4) = 0.927; Fig. 2a). It is known that rs742108 is a TF binding site in the ENCODE dataset [41, 42] and is located in a 381 bp DNase cluster region predicted to be an androgen receptor (AR) and an estrogen receptor (ESR) TF binding region in the Cis-BP dataset (Fig. 2b) [43]. Previous studies also showed that hormones play a key role in sex-biased expression by activating TFs and their downstream targets in different ways [1, 44, 45]. Thus, we hypothesized that the expression of PRDM1 is sex-biased, potentially mediated by TFs. The rs742108 might recruit more ARs in males, and the AR TF complex might increase the expression level of PRDM1, while rs742108 might recruit more ESRs in females, and the ESR TF complex might suppress PRDM1 expression (Fig. 2c).

Genetic regulation of sex-biased genes in XJU. (a) GWAS colocalization of the sb-cis-eQTL rs742108 and SLE loci. The left half of the figure shows the scatter plot of the P-value of the sb-cis-eQTLs and the SLE-associated GWAS loci, with different colors representing different LD strengths. The right half of the figure shows rs742108 as the most significant point (PP4 = 0.927). (b) This site is located within a 381 bp DNase cluster region with binding motifs for AR and ESR. According to the ENCODE annotation (GRCH37), rs742108 is located at the TF-binding site in the promoter near the PRDM1 gene. The red bar represents the TF-binding site; the light green bar represents the CTCF binding region; and the pink bar represents the gene promoter region. (c) The sex-biased association between rs742108 and PRDM1 and a schematic diagram of putative mechanisms associated with sex. In males and females, the different genotypes of rs742108 might lead to differential binding of the transcription complex with sex-biased TFs on the promoter, which results in different regulatory effects of the same variant in males and females. (d) Sex-biased association of rs118073539 and rs17237074 with OR56B4. The number in brackets above each boxplot indicates the sample size of each group.
The sb-cis-eGene OR56B4 demonstrated adaptation in XJU [46]. OR56B4 is a protein-coding olfactory receptor (OR) family gene. It has been reported that genetic admixture and population history play a role in shaping individuals’ smell, flavor, and food preferences [47]. There were two highly linked single nucleotide polymorphisms (SNPs), rs118073539 and rs17237074 |$\left({R}^2=1\right)$|, identified as sb-cis-eQTLs associated with this gene (Fig. 2d). Both SNPs were fixed in the east ancestral population, and we inferred that the alternative alleles of the two SNPs were derived from the western ancestry of XJU. A higher conservation score for the rs17237074 alternative allele (CADD = 12) suggested the importance of the functional effects of this allele. We observed that alternative alleles were associated with relatively higher expression levels of OR56B4 in males but relatively lower expression levels in females (one-sided t-test, male: P = .216, female: P = .001). These findings indicate the different genetic effects of the derived alleles of western ancestry on sex, possibly contributing to OR adaptation.
Admixture-induced sex-biased expression
At each locus in the admixed group, when one allele was fixed in ancestral source group A, this allele was probably derived from this ancestral source group and was defined as an A-ancestral-induced allele. Approximately one-quarter of the sb-cis-eQTLs were fixed in at least one ancestral source population of XJU. These sb-cis-eQTLs could reflect admixture-induced genetic effects on sex-biased gene expression. There were two admixture-induced sex-biased genetic effects: induced alleles had similar or opposite genetic effects on sex. When ancestral-induced alleles had similar genetic effects on different sexes, the expression levels were upregulated or downregulated in both males and females but at different regulatory degrees; when ancestral-induced alleles had opposite genetic effects, the expression levels were upregulated in one sex but downregulated in the other (Fig. 3a). In total, there were 28 sb-cis-eQTLs with ancestral-induced alleles exhibited similar genetic effects on sex, and 16 sb-cis-eQTLs with ancestral-induced alleles exhibited opposite genetic effects on sex, indicating the effects of ancestral background on SBGs.

Sex-biased gene expression differentiation. (a) Two admixture-induced sex-biased regulatory effects: induced alleles had similar or opposite regulatory effects on sex. (b) Schematic diagram of the ancestral-related subgroup identification at XJU. (c) The number of sex-biased DEGs between the eastern-derived and western-derived groups. The x-axis indicates the ANOVA P-value thresholds, and the y-axis indicates the differential gene numbers. (d) The number of sex-biased DEGs between the eastern-derived group and the EAS group, and between the western-derived group and the EAS group. The x-axis indicates the ANOVA P-value thresholds, which test the differential degree of the sex-biased gene expression, and the y-axis indicates the differential gene numbers. To control for the sample size in each group and each EAS cohort, we conducted a random sampling of the genes 30 times. A comparison of the differential gene numbers between the two groups was conducted with a t-test. **** indicates the highest significance.
For the scenario of the opposite genetic effects, as mentioned above, we detected the regulatory association between the OR56B4 gene and its sb-cis-eQTLs (rs118073539 and rs17237074). The western-derived alleles of its sb-cis-eQTLs were associated the increased OR56B4 expression in males but decreased expression levels in females (Fig. 2d). As supporting evidence of similar genetic effects, we identified the regulatory association between the SERPINB10 gene and its sb-cis-eQTL (rs149120228) was identified. The SERPINB10, located on chromosome 18, showed sex-biased gene expression in XJU. Regarding its sb-cis-eQTL rs149120228, the reference allele G was fixed in the EUR samples; thus, the alternative allele T was possibly derived from XJU eastern ancestry. The T allele caused increased SERPINB10 expression in both males and females, but the degree of increase in males was greater than that in females (Fig. S18). Previous studies have shown that SERPINB10 is tumorigenic and that its specific missense mutations pose a significant risk for prostate cancer [48], indicating the potential regulatory difference across sexes on prostate cancer susceptibility caused by SERPINB10 in XJU.
According to local ancestry inference at the gene level (Supplementary Methods), we measured the ancestral-induced genetic effects on sex-biased gene expression in autosomes. Individuals could be divided into three subgroups according to the local ancestral inference of each gene: the eastern-derived group, the western-derived group, and the admixed group (Fig. 3b, Supplementary Methods). We hypothesize that genetic background determines sex-biased expression. We expected that individuals with different ancestral backgrounds would show differential sex-biased gene expression for the same gene. Indeed, we observed ~354 autosomal genes with differential sex-biased gene expression [analysis of variance (ANOVA), P = .05; Fig. 3c]. We found that ~340 genes were differentially expressed in terms of sex-biased gene expression between the XJU western-derived group and EAS (ANOVA, P = .05), and this number was greater than that between the XJU eastern-derived groups and EAS (one-tailed t-test, P = .05; Fig. 3d), verifying that similar ancestral backgrounds had similar profiles of sex-biased gene expression.
In addition, the degree of sex-biased gene expression in the XJU admixed group tended to be more moderate than that in populations of dominant eastern or western ancestry, i.e. less-admixed populations (Fig. S19). We next determined whether genetic admixture influences the degree of sex-biased gene expression. We proposed a model in which the degree of sex-biased gene expression in groups with different ancestral backgrounds was extreme, while that in the admixed group was moderate (Fig. 4a, Supplementary Methods). We found 321 genes (86%) that satisfied this model (ancestral-related sex-biased genes, P = .05). One example was FGD6, in which the degree of sex-biased gene expression [log2 (F/M) value] was moderate in the XJU admixed group but extreme in populations of dominant eastern or western ancestry (Fig. 4b). To verify this hypothesis, we permutated the genes with random ancestry backgrounds 30 times to identify the degree of sex-biased gene expression and found that only ~30% fit the model across each significance threshold, confirming the model’s robustness (Fig. 4c). Finally, we evaluated the functions of the ancestral-related sex-biased genes. These genes were found to be significantly enriched in 16 GWAS traits, including 10 blood-related and 4 immune-related traits (Fig. 4d), providing possible evidence of phenotypic changes across different sexes induced by admixture in XJU.

Modeling of admixture-induced sex-biased gene expression. (a) Schematic diagram of admixture-induced sex-biased gene expression. (b) Example of sex-biased expression in three ancestral related subgroups of XJU. The degree of sex-biased expression was measured as log(F/M). (c) Verification of the model in XJU. (d) GWAS enrichments of ancestry-related sex-biased genes.
Sex-biased interactions between autosomes and X chromosomes
The sex-biased interactive genetic effects between autosomes and X chromosomes were assessed with trans associations (Materials and Methods). In total, we identified 14 639 genes (sb-trans-eGenes) associated with 326 694 sex-biased trans-expression quantitative loci (sb-trans-eQTLs), which were enriched in several noncoding elements, including TSSs (P < .05, OR > 1) and enhancer regions (P < .05, OR > 1; Fig. S20). Among them, there were 6872 sb-trans-eQTLs located on the X chromosomes associated with 1829 sb-trans-eGenes located on the autosomes (defined as sb-trans-eQTLs-X and sb-trans-eGenes-A, respectively); and 20 186 sb-trans-eQTLs located on the autosomes associated with 474 sb-trans-eGenes on the X chromosomes (defined as sb-trans-eQTLs-A and sb-trans-eGenes-X, respectively). 234 of all identified sb-trans-eGenes were SBGs, including 135 MGs and 99 FGs; 32 of the top 1% significant sb-trans-eGenes were SBGs, including 22 MGs and 10 FGs. We identified one extremely significant protein-coding sb-trans-eGenes-X, BEX1, which was an MG. The high-overlap ratio between sb-trans-eGenes and SBGs and the identification of the top significant protein-coding sb-trans-eGenes reflected the statistical power of the trans association study.
The sb-trans-eGenes-A were found to be predominantly enriched in immune-related pathways, such as COVID-19 SARS-CoV-2 infection (adjusted P = |$6.33\times{10}^{-8}$|) and immunoregulatory interactions (adjusted P = |$1.49\times{10}^{-6}$|; Fig. S21). The sb-trans-eGenes-X were enriched in sex-specific traits related pathways, such as Lopez MBD (methyl-CpG binding domain) targets imprinted and X linked (adjusted P = |$1.09\times{10}^{-7}$|), Rickman metastasis (adjusted P = |$1.15\times{10}^{-4}$|), and Riggings tamoxifen resistance (adjusted P = .03; Fig. S22), which was possibly due to its location on the X chromosomes. These enrichment profiles could not be found in the sex-biased trans associated pairs on autosomes or on X chromosomes, indicating the enrichment bias existence between the sb-trans-eGenes-A and the sb-trans-eGenes-X. Among the top 1% significant sb-trans-eGenes (Fig. S23), there were 4 sb-trans-eGenes-X and 10 sb-trans-eGenes-A. Two sb-trans-eGenes-X, BEX1 and FAM156A, were related to neurological disorders; and seven sb-trans-eGenes-A were involved in immune-related pathways (Fig. 5a). This difference might provide the additional evidence for the different incidences of immune disorders across sexes (Fig. 5b).

Sex-biased interactive regulatory effects between autosomes and X chromosomes in XJU. (a) Distal regulation of sex-biased gene expression between autosomes and X chromosomes. The genes in purple are located on the X chromosomes, and the genes in yellow are on the autosomes. (b) Schematic modeling of the mutual sex-biased gene expression regulation of autosomes and X chromosomes. (c) Schematic representation of MR to estimate the causal effect of BEX1 sex bias expression on AD prevalence using multiple instrumental variables (SNPs). (d) Flowchart of the interaction between autosomal sb-trans-eQTLs and the BEX1 gene on the X chromosomes affecting the incidence of AD in males and females in XJU (created with BioRender.com).
The sb-trans-eGene-X BEX1 is a MG associated with intellectual disability [49]. Its sb-trans-eQTLs-A were enriched in quiescent/low (P = |$3.9\times{10}^{-3}$|, OR = 9.07) and in ZNF genes and repeats (P = |$5.6\times{10}^{-5}$|, OR = 7.92; Fig. S20). Previous reports have shown that sex differences in the expression of BEX1 may contribute to sex differences in the incidence of Alzheimer’s disease (AD), since this gene is significantly downregulated in AD patients, which demonstrates a greater prevalence in women than in men [50]. Mendelian randomization (MR) analysis was performed to estimate the causal effect of BEX1 sex-bias expression (exposure) on AD prevalence (outcome) by genetic variants (instrumental variables, Fig. 5c, Supplementary Methods). Two causal SNPs were identified: rs61871134 on chromosome 10 (P = .012, se = |$2.2\times{10}^{-5}$|) and rs17460623 on chromosome 13 (P = .012, se = 0.013), which had the potential to be the risk loci associated with AD prevalence in females mediated by the lower expression in females of BEX1 (Fig. 5d).
The sb-trans-eGene-A IFI27 was the IFN-stimulated gene, which was strongly correlated with the development of SLE [51]. There were 29 sb-trans-eQTLs-X associated with IFI27, 4 of which were the upstream variants of the long noncoding ribonucleic acids (lncRNA) RP11-320G24.1. Therefore, the IFI27 was inferred to be regulated differently in males and females by the lncRNA RP11-320G24.1 while the expression level differed slightly across sexes. Indeed, the protein-coding gene IFI27 could be regulated by the lncRNA RP11-2B6.2, which reduces chromatin accessibility and SOCS1 transcription, leading to an increased IFN response. These were highly associated with the pathogenesis of SLE nephritis [52]. Our association analysis might detect another protein-coding and lncRNA correlation pair, IFI27 and RP11-320G24.1, likely to be a novel protein-coding and lncRNA pair. These results suggested another potential regulatory mode for SLE pathogeny differences across sexes through IFN-related pathways in XJU, which further resulted in the high incidence of SLE in females.
In addition, the IFI27 expression serves as an early predictor of COVID-19 outcomes, which is upregulated in the blood tissue of COVID-19-infected patients [53]. And the morbidity and mortality rates of COVID-19 are higher in males than that in females [54]. The correlation between IFI27 expression and the high incidence of this infectious disease (e.g. COVID-19) in males and the high incidence of autoimmune disease (e.g. SLE) in females correlated indicated the balanced immunity [55] possibly existed in XJU. Therefore, the interactive genetic effects between autosomes and X chromosomes highlight the full landscape of genetic regulation through sex-biased modulation, leading to sex-biased phenotypic discrimination, even without direct observation of sex-related expression levels.
Discussion
In this study, we performed the first systematic analysis of sex-biased gene expression in blood from an admixed population and revealed the potential effects of ancestry and admixture. We observed a small fraction of SBGs between XJU and its ancestral source populations, reflecting a population-specific manner of gene expression [56, 57]. We argued that full consideration of the ancestral genetic backgrounds of SBGs is necessary. A relatively greater overlap ratio of SBGs between XJU and EAS on the X chromosomes than on autosomes (P = .05) was observed, implying that SBGs are usually conserved on the X chromosomes rather than on autosomes among populations, given the inference that SBGs on the X chromosomes are usually located in XCI regions [9, 58].
In total, 174 sb-cis-eQTLs were identified, and the effects of genetic variations on sex-biased gene expression in XJU were evaluated. Our findings are consistent with previous reports that genetic regulation plays an overall weaker role in sex [9, 11, 12]. A large fraction (25%) of ancestral-related sb-cis-eQTLs were detected. We found that the ancestry background was strongly associated with SBGs in different populations and admixture moderates the number of SBGs, consistent with our previously published results and the moderate expression regulation profiles in XJU [59]. Then, we proposed a model to determine the impact of admixture impact on SBGs and applied this model to XJU with different ancestral source genes; this model could serve as an intuitive method for observing the effects of SBGs and provide us with a new perspective on the changes in admixture events on SBGs. Indeed, we found that most of the genes (86%) satisfied this model, even though the sample size was small and strict thresholds were applied, indicating the consistency and rationale of our admixture-induced expression model.
The influences of autosomal and X chromosome interactions on sex-differentiated traits were explored according to the detected potential sex-biased genetic regulation in XJU. We identified 6872 sb-trans-eQTLs-X associated with 1829 sb-trans-eGenes-A and 20 186 sb-trans-eQTLs-A associated with 474 sb-trans-eGenes-X. Surprisingly, the sb-trans-eGenes encoded on autosomes (sb-trans-eGenes-A) associated with variants on X chromosomes (sb-trans-eQTLs-X) are related to immunity. Previous studies have shown that autosomal genes encoding immunological proteins may have sex-differential effects on immunity [60], and that immune sex expression signature genes located on autosomes can distinguish men and women [61]. Therefore, sex chromosomes cannot fully explain sex differentiation in immune-related phenotypes [62], which has deepened our understanding of the differences in immune responses between men and women. Future studies of sex differences in immune system disorders are no longer limited to the sex chromosomes. Based on that, our enrichment results suggested that autosomes play an essential role in shaping sex-differentiated traits, which was consistent with the previous studies [63–66]. Further, interactive genetic effects between autosomes and X chromosome differences across sex could contribute to the sex-biased incidence of the disease-related phenotypes, which has deepened our understanding of the physiological, psychological, and disease susceptibility differences between males and females and their genetic mechanisms.
Hormones play a critical role in sex-biased expression by differentially activating TFs and their downstream targets in each sex [1, 25, 44, 45]. We hypothesized that the genetic regulators could affect sex bias gene expression affected by sex hormones. i.e., SNPs|$\to$|Sex hormones|$\to$|Gene expression. We focused on the overall influences of genetics on sex-biased gene expression. Therefore, the sex hormones could act as an intermediate confounder. The effect of sex hormones on sex bias gene expressions could be reflected in our current association results. For example, a locus of SLE of PRDM1, rs742108, was a sb-cis-eQTL. The SNP rs742108 might recruit more ARs in males, and the AR TF complex might increase the expression level of PRDM1, while rs742108 might recruit more ESRs in females, and the ESR TF complex might suppress PRDM1 expression.
Our analysis of the ancestry influence on sex-biased gene expression was mainly focused on autosomes. However, we also attempted to conduct the analysis on X chromosomes with Hapmix (Supplementary Methods) [67], and a similar pattern to that of autosomes was observed, indicating the robustness of our models and analysis.
The sample size was sufficient for our study (Supplementary Note), but there is a subtle limitation for the subgroup analysis restriction and rare events detection for sex bias. We used peripheral blood as the sample source in our study, which is the gold standard for such kind of analysis based on our previous work [46, 59]. But we should notice that the expression and enrichment profiles might be biased toward immunity and hematological-related traits, which should be careful in further studies. Further efforts are needed to verify our findings in other tissues.
Materials and Methods
Population samples
Peripheral blood samples of 90 unrelated XJU individuals and 40 unrelated Han Chinese (HAN) individuals, between 18 and 24 years of age, were collected from Xinjiang Uyghur Autonomous Region, China. Each individual was the offspring of a nonconsanguineous marriage of members of the same ethnicity within three generations. Each individual provided informed consent before participation. The procedures performed in the study were following the ethical standards of the Helsinki Declaration of 1975 (revised in 2000), and approved by the Biomedical Research Ethics Committee of Shanghai Institutes for Biological Sciences (no. ER-SIBS-261408).
Detection of sex-biased genes
The RNA expression data of samples was sequenced on the Illumina HiSeq2000 platform. Detailed information on the RNA expression data generation of 90 XJU individuals and 40 HAN individuals was provided in a previous paper [59]. For sex-biased gene detection, we defined female and male donors exclusively according to their sex chromosomes (XX or XY, respectively). There were 56 males and 34 females in the XJU population (males: females, 1.65:1) and 20 males and 20 females in the HAN population (males: females, 1:1). The European sex-biased genes were measured in whole blood samples from the Genotype-Tissue Expression cohort, which included 112 males and 112 females (males: females, 1:1). We labeled HAN as EAS and Europeans from GTEx as EUR throughout the manuscript. XJU, EAS, and EUR, genes with extremely low expression levels were excluded from our analysis. Considering the existence of batch effects of gene expression curated from different datasets, which was mainly due to the different sample sources, sequencing platforms and depths between different curated datasets, our downstream analyses were conducted independently without the need for data integration. Then, we used different thresholds to filter genes in different curated datasets to ensure that the filtered genes were in a similar scale: total normalized read counts of 1 gene <10 in XJU and EAS with about 30× coverage ratio of exon region across the genome; total normalized read counts of 1 gene <50 in EUR with a median sequencing depth of 82.1 million mapped reads per sample.
Then, the sex-biased genes of XJU, EAS, and EUR were identified using DESeq2 [68], which provided the method to test the differential expression with negative binomial generalized linear models [with a threshold of |log2-fold change| > 0.4 and false discovery rate (FDR) ≤ 0.1]. With the sensitivity analysis, we found that the female-to-male ratio had little effect on SBGs detection (Fig. S24). Therefore, we have disregarded the gender ratio in the analysis of SBGs. Note that in our current analysis, we only considered X chromosome, and Y chromosome was excluded. “X chromosomes” refers to the two copies of X in females and the one copy in males.
The genotype data of the XJU individuals were obtained from our previously published paper [46]. The SNPs in non-pseudo autosomal regions of the X chromosomes were excluded from our analysis. In total, 30 061 807 SNPs were included in our analysis.
Detection of sex-biased expression quantitative loci and eGenes
The cis and trans sex-biased eQTLs and eGenes were associated in XJU between genotypes and gene expressions across the whole genomes. For genotype data, heterozygotes were retained when the missing rate was <10% and the Hardy–Weinberg disequilibrium was <10−16 by PLINK and vcftools [69, 70]. Further, the variants with a minor genotype count >2 both in the male and female samples should be kept to gain sufficient statistical power. For expression data, all filtered expressed genes in XJU were applied [59].
To detect cis and trans sex-biased eQTLs, we performed an association analysis between adjusted expression values of genes and SNPs with a linear interaction model in the R package “MatrixEQTL” [71]. The genotype-by-sex (G × S) interaction was added in this model to check the effect sizes of the genotypes and sex on the gene expressions:
where exp denoted the observed expression level of the gene, |$\mathrm{\mu}$| was the mean expression level across the population, |${\beta}_1$| was the effect size of genotype. |${\beta}_2$| was the effect size of sex, |${\beta}_3$| was the effect size of G × S interaction on gene expression [9], and |$\varepsilon \sim N\left(0,{\sigma}_{\varepsilon}^2\right)$|. We eliminated instances where genotype is collinear with sex based on the published research [11].
We considered cis association as follows: SNPs within 100 kb upstream of the TSS and 100 kb downstream from a gene’s transcription end site. The 100-kb gene boundary extensive size was considered for detecting the cis eQTLs with strong effects and reducing the burden of multiple testing. To balance the ratio of false positive and false negative of cis eQTLs detection, we re-conducted multiple testing with FDR after “MatrixEQTL” processing. We retained the top 200 significant cis SNP–gene pairs (FDR ≤ 0.35), within which the SNPs were defined as sb-cis-eQTLs and the genes were defined as sb-cis-eGenes. In total, we identified 174 sex-biased sb-cis-eQTLs associated with 60 sb-cis-eGenes.
For trans association, we explored the sex-biased interactive genetic effects at the whole genome level (including autosomes and X chromosomes), considering the association between the genes and the SNPs more than 5 Mb away from the TSS or located on a different chromosome from the genes. To depict the whole scenario of the interactive genetic effects, we tried to identify as many trans SNP–gene pairs as possible. An SNP was defined as an sb-trans-eQTL when p-value <10−6 based on the previously published work [72] (FDR < 0.015). A gene was defined as a sb-trans-eGene when it was associated with one or more sb-trans-eQTLs. In total, we identified 326 694 sb-trans-eQTLs associated with 14 639 sb-trans-eGenes.
Biographical note
We explored the influence of ancestry on sex differences in the human transcriptome and its genetic effects on a Eurasian admixed population.
Sex-biased expressed genes identified in a Eurasian admixed population
Sex-biased gene expression links to ancestry
Genetic admixture induces a moderate degree of sex-biased gene expression
Autosomes and X chromosomes interact to form sex-biased phenotypes
Acknowledgements
We are grateful to Drs Hao Chen and Haiyi Lou for their constructive comments during the preparation of the manuscript. We thank LetPub (www.letpub.com) for its linguistic assistance during the preparation of this manuscript.
Author Contributions
S.X. conceived and designed the study and supervised the project. Y.Y., Y.G., and D.M. contributed to the sample collection. Y.L. contributed to the DNA and RNA sequencing. D.L. and X.W. developed a pipeline for processing next-generation sequencing (NGS) data. L.T., Y.Y., and X.T. contributed to the expression calling pipeline. Y.P. analyzed the natural selection, contributed to the position and function enrichment pipeline, and conducted local ancestry inference. S.C., Z.N., and K.H. accomplished the major analysis, drafted the manuscript, and prepared the additional materials. All authors discussed the results and implications and commented on the manuscript.
Conflict of interest: None declared.
Funding
This study was supported by the National Key Research and Development Program of China (grant number 2023YFC2605400), the National Natural Science Foundation of China (grant numbers 32288101, 32030020, and 32470649), the Office of Global Partnerships (Key Projects Development Fund). This study was supported by the Computing for the Future at Fudan Computing Platform and the Human Phenome Data Center of Fudan University. The funders had no role in the study design, data collection, and analysis, decision to publish, or preparation of the manuscript.
Data Availability Statement
The sb-eQTLs summary statistics can be accessed at NGDC (https://ngdc.cncb.ac.cn/omix/), accession number: OMIX005404.
References
Author notes
These authors contributed equally to this work.