Manuel B Braga-Neto, Joseph M Gaballa, Adebowale O Bamidele, Olga F Sarmento, Phyllis Svingen, Michelle Gonzalez, Guilherme Piovezani Ramos, Mary R Sagstetter, Sayed Obaidullah Aseem, Zhifu Sun, William A Faubion, Deregulation of Long Intergenic Non-coding RNAs in CD4+ T Cells of Lamina Propria in Crohn’s Disease Through Transcriptome Profiling, Journal of Crohn's and Colitis, Volume 14, Issue 1, January 2020, Pages 96–109, https://doi.org/10.1093/ecco-jcc/jjz109
Abstract
The aetiology of Crohn’s disease [CD] involves immune dysregulation in a genetically susceptible individual. Genome-wide association studies [GWAS] have identified 200 loci associated with CD, ulcerative colitis, or both, most of which fall within non-coding DNA regions. Long non-coding RNAs [lncRNAs] regulate gene expression by diverse mechanisms and have been associated with disease activity in inflammatory bowel disease. However, disease-associated lncRNAs have not been characterised in pathogenic immune cell populations.
Terminal ileal samples were obtained from 22 CD patients and 13 controls. RNA from lamina propria CD4+ T cells was sequenced and long intergenic non-coding RNAs [lincRNAs] were detected. Overall expression patterns, differential expression [DE], and pathway and gene enrichment analyses were performed. Knockdown of novel lincRNAs XLOC_000261 and XLOC_000014 was performed. Expression of Th1 or Th17-associated transcription factors, T-bet and RORγt, respectively, was assessed by flow cytometry.
A total of 6402 lincRNAs were expressed, 960 of which were novel. Unsupervised clustering and principal component analysis showed that the lincRNA expression discriminated patients from controls. A total of 1792 lincRNAs were DE, and 295 [79 novel; 216 known] mapped to 267 of 5727 DE protein-coding genes. The novel lincRNAs were enriched in inflammatory and Notch signalling pathways [p <0.05]. Furthermore, DE lincRNAs in CD patients were more frequently found in DNA regions with known inflammatory bowel disease [IBD]-associated loci. The novel lincRNA XLOC_000261 negatively regulated RORγt expression in Th17 cells.
We describe a novel set of DE lincRNAs in CD-associated CD4+ cells and demonstrate that novel lincRNA XLOC_000261 appears to negatively regulate RORγt protein expression in Th17 cells.
1. Introduction
Crohn’s disease [CD] is a chronic inflammatory disease of the gastrointestinal tract. Its pathogenesis is complex, involving immune dysregulation in a genetically susceptible individual in response to the environment. Increased production of Th1 and Th17 cell cytokines and defects in T regulatory cells [Tregs] have been associated with CD.1–3 Large-scale genome-wide association studies [GWAS] identified 200 susceptibility loci (231 independent single nucleotide polymorphisms [SNPs]) associated with inflammatory bowel disease [IBD], 46 of which are specific to CD.4 Notably, most disease-associated SNPs map to non-coding regions of the genome,5,6 complicating requisite gene to function studies, and highlighting the potential relevance of non-coding RNAs in IBD.
Non-coding RNAs [ncRNAs] are a significant part of the human transcriptome and can be classified as short [<200 nucleotides] or long [>200 nucleotides, lncRNAs]. Over 140 000 unique lncRNAs have been identified, although the proportion of those with validated functions remains small.7–9 Whereas most lncRNAs act in cis, activating or repressing nearby genes [e.g., Xist], they can also act in trans, affecting distant genes [e.g., Hotair]. Their canonical function is presumed to be through interaction with RNA, DNA, and proteins, modulating gene transcription and translation.10 In the immune system, lncRNAs regulate the inflammatory response through activation and repression of immune genes [e.g., lincRNA-Cox2] and T cell differentiation programmes [e.g., lincRNA-MAF-4].11,12 In IBD, lncRNAs have been associated with disease activity, intestinal permeability, and cell-intrinsic functions such as apoptosis and proliferation.13–15 However, previous work was performed in whole-tissue biopsies or plasma, precluding the identification of precise cell-specific pathophysiological mechanisms.
As lncRNAs are expressed in a lineage-specific pattern across T helper cell phenotypes, and dysregulation of the adaptive immune system plays a crucial role in Crohn’s disease,12,16,17 we performed comprehensive profiling of lncRNAs in CD-associated CD4+ T cells. We evaluated their differential expression and co-expression with cis protein-coding genes and identified a set of novel lincRNAs associated with pro-inflammatory gene networks. We found that differentially expressed lincRNAs mapped significantly more frequently within known IBD susceptibility loci, suggesting a critical role in the pathophysiology of human IBD. Furthermore, we demonstrate that the novel lincRNA XLOC_000261 appears to negatively regulate RORγt protein expression during Th17 cell differentiation.
2. Materials and Methods
2.1. Selection of Crohn’s disease patients and controls
Patients with active Crohn’s disease undergoing intestinal surgical resection were included in this study. Clinical data from 22 patients with Crohn’s disease were obtained by reviewing electronic medical records at the Mayo Clinic, Rochester, MN, USA. Demographic information, including sex, age, age at disease onset, and age at time of surgical resection, was obtained. Furthermore, data on medication use and Crohn’s disease phenotype were obtained. Intestinal tissue was obtained during colonoscopy from 13 age- and sex-matched controls. Eight control patients were undergoing colon cancer screening, three were being evaluated for diarrhoea, and two for iron deficiency anaemia.
2.2. CD4+ cell isolation and RNA sequencing
Isolation of lamina propria [LP] CD4+ lymphocytes has previously been described.9 Briefly, resection specimens or mucosal biopsies of the terminal ileum were obtained from 22 patients with active Crohn’s disease [inflamed tissue; histopathology information in Supplementary File 1, available as Supplementary data at ECCO-JCC online] and 13 age- and sex-matched healthy control individuals. Isolation of CD4+ LP cells was performed using magnetic bead sorting [CD4+ T Cell Isolation kit, Miltenyi Biotec] and two passes through the LS column on the MACS magnet separator. Purification was confirmed by flow cytometry. Total RNA was extracted using Exiqon’s miRCURY RNA Isolation Kit. The Illumina TruSeq RNA library preparation protocol was used for RNA-seq library preparation. Paired-end sequencing was carried out on the Illumina HiSeq 2000 sequencer at 50 bp [3–4 samples per lane], which generated 42–113 million paired-end reads for each sample.
2.3. Known and novel lincRNA detection
Protein coding genes were analysed and reported in our previous paper.18 MAPR-Seq v1.219 was used, and sequence reads were aligned to the human genome build 37 using TopHat [2.0.6] with Bowtie [0.12.7].20,21 HTSeq was used to perform gene quantification with UCSC refgene annotation.22 Differentially expressed genes were identified using edgeR,23 where genes with false discovery rate [FDR] ≤0.05, fold change ≥1.5, and a mean RPKM ≥1 in at least one group were considered significantly differentially expressed genes [DEGs]. These DEGs were used for comparison with lincRNAs as described below.
LincRNAs were detected through another internally developed pipeline, UClncR.24 As the RNA-seq data in this study were unstranded, transcription from overlapping lncRNAs on opposite strands or within coding regions could not be distinguished; therefore, only long intergenic non-coding RNAs [lincRNAs] could be reliably detected and quantified. The raw RNA-seq data in fastq format were aligned to the human reference GRCh37 by HISAT2 [2-2.0.4] with the dta option.25 Transcripts were constructed by StringTie.26 GENCODE gene annotation [Release 19 for GRCh37.p13], the last release for the reference genome GRCh37, was used to guide alignment and quantify known lincRNAs. Novel lincRNA candidates were predicted through multiple steps. Assembled transcript candidates that did not overlap with any known genes in the GENCODE annotation were defined as novel transcript candidates. Those with coding length >200 bases, protein coding potential <0.1, expression level >20th percentile of known lincRNAs for single exon and fully reconstruction fraction estimation [FRFE] rate27 >0.1 for multiple exon transcripts, and overlap with a repetitive region of the genome less than 5%, were named as novel lincRNA candidates. The predicted novel lincRNAs from each sample were then merged across all samples using the Cuffmerge function in Cufflinks.28 This merged master novel transcript annotation was then combined with the known lincRNA annotation and used to extract gene expression by featureCounts [v.1.4.6].29 All lincRNAs were annotated relative to their nearby protein coding genes with distance to the transcription start site [TSS]. RPKM [reads per kilobase per million mapped reads]30 normalised expression was also generated for global pattern visualisation and correlative analysis.
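The RPKM normalisation and the candidate filters described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not code from UClncR: the `Candidate` field names and the `known_p20_expression` argument are hypothetical, and the reading that single-exon transcripts are judged on expression while multi-exon transcripts are judged on FRFE is an interpretation of the text.

```python
from dataclasses import dataclass


def rpkm(read_count: int, length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)


@dataclass
class Candidate:
    # Hypothetical field names that mirror the criteria in the text.
    length_bp: int
    coding_potential: float
    n_exons: int
    expression: float      # same units as the known-lincRNA cut-off
    frfe: float            # reconstruction fraction (multi-exon transcripts)
    repeat_overlap: float  # fraction of the transcript in repeats, 0 to 1


def is_novel_lincrna(c: Candidate, known_p20_expression: float) -> bool:
    """Apply the filters as read from the text (an interpretation)."""
    if c.length_bp <= 200 or c.coding_potential >= 0.1:
        return False
    if c.repeat_overlap >= 0.05:
        return False
    if c.n_exons == 1:
        # Single-exon: expression above the 20th percentile of known lincRNAs.
        return c.expression > known_p20_expression
    # Multi-exon: FRFE above 0.1.
    return c.frfe > 0.1


print(rpkm(500, 2000, 50_000_000))  # 5.0
```

For example, 500 reads on a 2 kb transcript in a library of 50 million mapped reads gives an RPKM of 5.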
2.4. Differential expression and integrative analysis
Differentially expressed lincRNAs in CD4+ cells between CD patients and age- and sex-matched controls were identified using edgeR [3.16.5].23 In the analysis, known lincRNAs without any reads in any of the samples were first removed; TMM [weighted trimmed mean of M-values] normalisation was applied; the dispersion parameter for each gene was estimated by an empirical Bayes method for the negative binomial distributed count data; and the exact test was performed to identify the differentially expressed lincRNAs. To focus on the lincRNAs that were more reliably expressed and differentially expressed, we applied the filters of false discovery rate [FDR] <0.05, fold change >1.5, and average read count per million [CPM] across all samples >1. For data visualisation, such as principal components analysis [PCA] or unsupervised clustering, log2 RPKM normalised data30 were used.
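As a minimal sketch, the three post-test filters could be applied to a table of edgeR results as follows. The statistics themselves [FDR, log2 fold change] are assumed to come from edgeR upstream; the row values and identifiers are hypothetical, and applying the fold-change cut-off symmetrically to up- and down-regulated lincRNAs is an assumption.

```python
import math


def cpm(count: int, library_size: int) -> float:
    """Counts per million mapped reads for one lincRNA in one sample."""
    return count * 1e6 / library_size


def passes_de_filters(fdr, log2_fc, mean_cpm,
                      fdr_max=0.05, min_fold=1.5, min_cpm=1.0):
    """True when a lincRNA meets all three thresholds quoted in the text."""
    # Assumption: the fold-change cut-off applies in both directions,
    # so |log2FC| is compared with log2(1.5).
    return (fdr < fdr_max
            and abs(log2_fc) > math.log2(min_fold)
            and mean_cpm > min_cpm)


# Hypothetical rows: (identifier, FDR, log2 fold change, mean CPM)
rows = [
    ("lincA", 0.001, 2.3, 12.0),
    ("lincB", 0.030, -0.4, 8.0),   # fold change too small
    ("lincC", 0.200, 3.1, 5.0),    # FDR too high
    ("lincD", 0.010, -1.7, 0.6),   # expression too low
    ("lincE", 0.004, -1.2, 3.5),
]
kept = [name for name, fdr, lfc, m in rows if passes_de_filters(fdr, lfc, m)]
print(kept)  # ['lincA', 'lincE']
```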
Pathway and gene enrichment analysis was conducted using Bioconductor package RITAN [https://bioconductor.org/packages/release/bioc/html/RITAN.html] and DAVID [https://david.ncifcrf.gov/].
2.5. Validation of novel lincRNAs
Peripheral blood mononuclear cells were obtained from three healthy adult donors and the buffy coat was isolated following Ficoll gradient centrifugation. Intestinal resection specimens or mucosal biopsies of the terminal ileum were obtained from an independent group of six patients with active Crohn’s disease and four age- and sex-matched healthy control individuals. Subsequently, CD4+ cells were selected using a magnetic separation kit [Miltenyi Biotec]. Total RNA was extracted according to the protocol included with the RNeasy® Plus Mini Kit [Qiagen]. Approximately 20 ng of total RNA per reaction was converted to cDNA using random primers included with the GoScript™ Reverse Transcription System [Promega]. Validation of the following lincRNAs was performed: XLOC_000261 [nearest protein-coding gene: BATF]; XLOC_000639 [nearest protein-coding gene: RBPJ]; and XLOC_000014 [nearest protein-coding gene IL12RB2].
For quantitative polymerase chain reaction [PCR] performed on intestinal LP CD4+ cells, 2 μl of reverse transcription products were amplified in a 96-well thermal cycler using iTaq Universal SYBR Green Supermix [BioRad]. The following primers were used: [XLOC_000261: Forward 5’-GCAGGATTTCTCCCATCAA-3’; Reverse 5’-CTTGGGACTTAGCCTCAGATATTC-3’; XLOC_000639: Forward 5’-CCATACACCTCATCCCTCTTTC-3’; Reverse 5’-CACAGGAGTGACGTGATACAA-3’; XLOC_000014: Forward 5’-GGCCTCTACAGATGGTGATTTC-3’; Reverse 5’-ACGGGATTGTAGGTGTTGTTC-3’]. GAPDH was used as the housekeeping gene [GAPDH: Forward 5’-CCAGGGCTGCTTTTAACTCT-3’; Reverse 5’-GGACTCCACGACGTACTC-3’]. For conventional PCR, the same set of primers was used. The following PCR thermocycler conditions were used to amplify cDNA: 95°C for 3 min; 35 cycles of denaturation at 95°C for 30 s, annealing at 55°C for 30 s, and extension at 72°C for 60 s; followed by 72°C for 5 min. Subsequently, 30 μL of PCR products were loaded onto a 2.0% agarose gel prepared with ethidium bromide and electrophoresed at 70 V for approximately 90 min. Gels were then imaged using a UV transilluminator to visualise DNA bands and confirm the existence of the predicted novel lincRNAs [Supplementary Figure 1, available as Supplementary data at ECCO-JCC online].
2.6. In vitro T cell differentiation and flow cytometry
Peripheral blood mononuclear cells were obtained from healthy adult donors and the buffy coat was isolated following Ficoll gradient centrifugation. CD4+ cells were then selected using a magnetic separation kit [Miltenyi Biotec]. Subsequently, CD45RA+ cells were isolated using a magnetic separation kit [Miltenyi Biotec] and stimulated with mouse anti-human CD28 [1 µg/mL; BioLegend] and plate-bound anti-CD3 [2 µg/mL; BioLegend]. The following conditions were used for induction of T-cell subtypes: Th1: IL-12 [10 ng/mL; Peprotech]; Th2: IL-4 [10 ng/mL; Peprotech]; Treg: TGF-β1 [5 ng/mL; Peprotech]; and IL-2 [2 U/mL; Peprotech]; Th17: IL-1β1[20 ng/mL; Peprotech]; TGF-β1 [5 ng/mL; Peprotech]; IL-6 [50 ng/mL; Peprotech]; IL-21 [100 ng/mL; Peprotech]; IL-23 [20 ng/mL; Peprotech], mouse anti-human IFN-γ [10 µg/mL; BD Biosciences]; anti-IL4 [2.5 µg/mL]. After 5 days, cells were stimulated with mouse anti-human CD28 and plate-bound anti-CD3. At Day 7, differentiation was confirmed by flow cytometry and cells were transfected. After 24 h, cells were harvested and flow cytometry was performed. Briefly, cells were fixed and permeabilised using True-Nuclear Transcription Factor Buffer Set [BioLegend]. Subsequently, cells were stained with fluorochrome tagged monoclonal antibodies [Biolegend] for FOXP3, T-bet, GATA3, or RORγt, and analysed by flow cytometry.
2.7. Nucleofection
Nucleofection was performed using the Amaxa Human T Cell Nucleofector Kit [Lonza], according to the manufacturer’s protocol with the T-020 programme. Briefly, 5 × 10^6 cells were resuspended in 100 μL of Nucleofection solution containing either non-targeting siRNA [Dharmacon] or siRNA targeting XLOC_000261 [sense: 5’-UGGCAUGCAUUGAUGACUUUU-3’; antisense: 5’P-AAGUCAUCAAUGCAUGCCAUU-3’] or XLOC_000014 [sense: 5’-GAUAUUAAUAGGAGGAAAUU-3’; antisense: 5’P-UUCUCCUCCUAUUAAUAUCUU-3’] at 0.32 µM. Pre-warmed RPMI medium was then added and cells were plated in a 12-well CD3-bound plate with CD28 [1 μg/mL; BioLegend]. Silencing of XLOC_000261 or XLOC_000014 was confirmed after 24 h [Supplementary Figure 2, available as Supplementary data at ECCO-JCC online]. Flow cytometry for RORγt or T-bet was performed 24 h after nucleofection.
2.8. Statistical Analysis
Data are expressed as means ± SEM. Statistical analysis was performed using GraphPad Prism 5.0 software (San Diego, CA). Student’s t test was used to compare two groups. P < 0.05 was the minimum requirement for a statistically significant difference.
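For readers who want to see the arithmetic behind a mean ± SEM summary and a two-group Student’s t statistic, a minimal sketch follows. It assumes the classical equal-variance [pooled] form of the test, which the text does not specify, and the input values are hypothetical. Converting the statistic to a p value requires the t distribution and is left to a statistics package.

```python
import math
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def student_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical measurements for two groups.
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 2.0, 3.0]
t_stat, dof = student_t(group_a, group_b)
print(f"{mean(group_a):.2f} ± {sem(group_a):.2f}")
print(f"t = {t_stat:.3f}, df = {dof}")
```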
3. Results
3.1. Patient demographics
A total of 22 patients with active Crohn’s disease and 13 age- and sex-matched healthy controls were included. There was no difference between the two groups regarding gender [54.5% vs 46.1%, respectively, were females; chi square p = 0.90] or mean age [49.3 vs 39.2 years, respectively; p = 0.11]. Most patients had ileocolonic and stricturing disease. Nine patients had moderate disease activity on microscopic scoring, and seven had severe disease [Table 1]. Detailed information on each individual case is described in Supplementary File 1.
Characteristics | | Crohn’s disease [N = 22] |
---|---|---|
Age at time of surgery [yrs], median [IQR] | | 34.5 [23.7–58.2] |
Age at time of IBD diagnosis [yrs], median [IQR] | | 24.5 [16.7–34.7] |
Male gender, n [%] | | 10 [45.4%] |
Disease location [CD] | L1, ileal, n [%] | 5 [22.7%] |
 | L2, colonic, n [%] | 0 [0%] |
 | L3, ileocolonic, n [%] | 17 [77.3%] |
 | Unknown | 0 [0%] |
History of perianal disease, n [%] | | 6 [27.2%] |
Disease behaviour [CD] | B1, non-stricturing, non-penetrating, n [%] | 0 [0%] |
 | B2, stricturing, n [%] | 13 [59.1%] |
 | B3, penetrating, n [%] | 9 [40.9%] |
Medication usage [ever used] | TNF-alpha inhibitor | 18 [81.8%] |
 | Thiopurine | 10 [45.4%] |
 | Methotrexate | 3 [13.6%] |
 | Budesonide | 4 [18.2%] |
Small bowel [CD] histological severity | Inactive | 0 [0%] |
 | Mild | 4 [18.2%] |
 | Moderate | 9 [40.9%] |
 | Severe | 7 [31.8%] |
 | Missing | 2 [9.1%] |
IQR, interquartile range; yrs, years; IBD, inflammatory bowel disease; CD, Crohn’s disease; TNF, tumour necrosis factor.
3.2. Novel and known lincRNAs identified in CD4+ T cells from patients and controls
From the CD4+ T cells of 35 individuals [22 patients and 13 controls], we found 960 novel lincRNAs not annotated in GENCODE v19, of which 710 were multi-exon and 250 were single-exon lincRNAs. The relatively small number of single-exon lincRNAs was attributable to the stringent filtering criteria we applied: a novel single-exon lincRNA was called only if its expression was at or above the 20th percentile of known single-exon lincRNAs. This was necessary because single-exon lincRNAs tend to be less reliable and can be artefacts such as unspliced pre-mRNA or gene extensions.31,32 Proportionally, more multi-exon novel lincRNAs were detected in the patients with CD than in the controls [81.4% vs 65.6%, respectively; p = 0.001] [Figure 1A]. Among the 7114 annotated lincRNAs in GENCODE v19, 5442 had at least one read mapped in at least one of the 35 samples, and 3880 reached or exceeded the minimum expression required to define the novel lincRNAs. Compared with these known lincRNAs at a similar minimum expression level, the novel lincRNAs had higher overall expression [Figure 1B]. The average expression of many lincRNAs appeared to be lower in the cases than in the controls [Figure 1C].
Figure 1. Novel vs known lincRNAs in CD4+ cells. A. Number of multi-exon [ME] vs single-exon [SE] novel lincRNAs in each sample. B. Average expression of known vs novel lincRNAs across all samples. As all known lincRNAs with expression in at least one sample were included, their expression is lower than that of newly predicted ones, for which a certain level of expression and coverage is needed for discovery. C. Average expression in cases and controls. Red dots represent novel lincRNAs and green dots represent known lincRNAs.
3.3. lincRNA expression patterns differentiate patients from control subjects
We first conducted unsupervised clustering and principal component analysis [PCA] using all lincRNAs, and then separately for known and novel lincRNAs. In each case, patient and control samples formed two distinct clusters [Figure 2A, B, and C], suggesting unique expression profiles associated with the disease process. Although this study was not powered to address clinical heterogeneity between patients, Figure 2A shows no clustering related to anti-tumour necrosis factor [TNF] use, steroid use, or disease activity. To gain further biological insight, we used the PCA loading vectors to identify the top 10 lincRNAs contributing the most variance to each of the first three principal components [PCs]. PC1 accounted for 30% of variance and clearly separated the case from the control samples. Concordant with the gene set enrichment analysis presented below, the lincRNA XLOC_000639, located upstream of the recombining binding protein suppressor of hairless gene [RBPJ] associated with the Notch signalling pathway, emerged as a key driver of PC1. Subsequently, we performed differential expression analysis between cases and controls. At false discovery rate [FDR] <0.05, 1832 lincRNAs were differentially expressed, either up [n = 712] or down [n = 1120] in patient samples relative to controls. Additional filters [fold change greater than 1.5 and mean expression in CPM across samples greater than 1] were applied to generate a list of high-confidence differentially expressed lincRNAs [n = 1792; Figure 3A and Supplementary Table 1, available as Supplementary data at ECCO-JCC online], of which the majority [1103, 61.6%] were down-regulated in patient samples [Figure 3B].
Figure 2. Global lincRNA expression separates patients from controls. A. Unsupervised clustering using all lincRNAs for all samples. Controls are shown in blue font and patients in black font. Disease activity in the patients is highlighted in green [mild], yellow [moderate], or red [severe], or is not highlighted [unknown]. Patients receiving tumour necrosis factor [TNF] inhibitors within a month or less of surgical resection are labelled with a circle, and patients on steroids are labelled with a triangle. B. Principal components analysis [PCA] plot using all lincRNAs. C. PCA plot using novel lincRNAs only.
Figure 3. A. Expression mean vs log2-fold change of differentially expressed lincRNAs [red]. B. Heatmap of the differentially expressed lincRNAs. Most lincRNAs are down-regulated in the CD4+ cells of the patient samples.
3.4. Relationship of lincRNAs and their cis protein-coding genes
To characterise a potential interaction and biological relevance between the lincRNAs and protein-coding genes, we performed two reciprocal analyses. First, from the differentially expressed lincRNAs described above, we generated a list of protein-coding genes whose transcription start sites [TSSs] had a lincRNA within 10 kb upstream or 2 kb downstream, which yielded 1734 candidate genes. Gene set and pathway enrichment analysis of this gene list demonstrated significant enrichment of the IL2_STAT5_SIGNALING and TNFA_SIGNALING_VIA_NFKB MSigDB Hallmark gene sets, supporting a biologically relevant role for lincRNAs in Crohn’s disease pathophysiology. Significantly enriched gene sets [MSigDB and KEGG canonical pathways] are listed in Supplementary Table 1, available as Supplementary data at ECCO-JCC online.
We then performed the reciprocal analysis, beginning with the list of differentially expressed protein-coding genes [DEGs, n = 5727, using the same criteria as for lincRNAs above]. Comparing the genomic coordinates of these DEGs with those of lincRNAs, we found that 528 DEGs had lincRNAs within 10 kb upstream or 2 kb downstream of their TSSs [associated with 600 unique lincRNAs]. Of these lincRNAs, 295 [49%] were themselves differentially expressed and were associated with 267 protein-coding genes [51%] [Figure 4A]. The vast majority of these significant pairs [91%] were differentially expressed in the same direction [both up-regulated or both down-regulated, Figure 4A]. Among the significantly changed lincRNAs [n = 295], 79 were novel and 216 were known [Supplementary Table 2, available as Supplementary data at ECCO-JCC online]. The top 10 differentially expressed pairs of lincRNAs and protein-coding genes are shown in Table 2 and Table 3, respectively. We validated three novel lincRNAs [XLOC_000261, XLOC_000639, and XLOC_000014] associated with the top 10 upregulated DEG protein-coding genes by qPCR, as shown in Figure 5. Their expression was also confirmed by conventional PCR [Supplementary Figure 1, available as Supplementary data at ECCO-JCC online]. Gene set enrichment analysis [RITAN] of the 267 protein-coding genes using MSigDB Hallmark datasets showed significant enrichment within the TNF, KRAS, UV response, HYPOXIA, Inflammatory, and IL6_JAK_STAT3 signalling gene sets [Figure 4B]. When the protein-coding genes associated with the 79 novel lincRNAs were analysed separately, there was significant enrichment of Notch signalling [p = 0.016], cytokine-cytokine receptor interaction [p = 0.017], and chemokine signalling pathways [p = 0.033].
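The coordinate comparison used in both reciprocal analyses can be illustrated with a short sketch. It assumes that a lincRNA is paired with a gene when any part of it overlaps a window spanning 10 kb upstream to 2 kb downstream of the TSS, and that "upstream" is interpreted relative to the gene's strand; neither detail is stated in the text. The class names, gene names, and coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    chrom: str
    tss: int     # coordinate of the transcription start site
    strand: str  # "+" or "-"


@dataclass
class LincRNA:
    name: str
    chrom: str
    start: int
    end: int


def tss_window(gene, upstream=10_000, downstream=2_000):
    """Interval covering `upstream` bp before and `downstream` bp after the TSS."""
    if gene.strand == "+":
        return gene.tss - upstream, gene.tss + downstream
    # On the minus strand, upstream lies at higher coordinates.
    return gene.tss - downstream, gene.tss + upstream


def pair_by_tss(lincs, genes):
    """Return (lincRNA, gene) name pairs where the lincRNA overlaps the window."""
    pairs = []
    for gene in genes:
        lo, hi = tss_window(gene)
        for linc in lincs:
            if linc.chrom == gene.chrom and linc.start <= hi and linc.end >= lo:
                pairs.append((linc.name, gene.name))
    return pairs


genes = [Gene("GENE_A", "chr1", 100_000, "+"), Gene("GENE_B", "chr1", 500_000, "-")]
lincs = [
    LincRNA("linc1", "chr1", 92_000, 95_000),    # 5-8 kb upstream of GENE_A
    LincRNA("linc2", "chr1", 505_000, 507_000),  # upstream of minus-strand GENE_B
    LincRNA("linc3", "chr1", 103_000, 104_000),  # more than 2 kb downstream of GENE_A
    LincRNA("linc4", "chr2", 92_000, 95_000),    # different chromosome
]
pairs = pair_by_tss(lincs, genes)
print(pairs)  # [('linc1', 'GENE_A'), ('linc2', 'GENE_B')]
```

The nested loop is adequate for a few thousand features; an interval tree or a sorted sweep would be the usual choice at genome scale.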
Top 10 differentially expressed lincRNAs and their associated nearby protein-coding genes.
lincRNA ID | lincRNA LogFC | Novel | Chromosome | No. exons | Nearest protein-coding gene | Gene LogFC |
---|---|---|---|---|---|---|
Up-regulated | | | | | | |
ENSG00000268734.1 | 5.897144 | FALSE | chr19 | 1 | FCAR | 0.981776 |
ENSG00000255801.1 | 4.662458 | FALSE | chr12 | 3 | CLEC4E | 4.07331 |
XLOC_000639 | 4.6103 | TRUE | chr4 | 5 | RBPJ | 1.125349 |
ENSG00000223414.2 | 4.365519 | FALSE | chr6 | 6 | PDE10A | 1.511632 |
XLOC_000300 | 4.364112 | TRUE | chr15 | 3 | BLM | 1.415258 |
XLOC_000450 | 4.215082 | TRUE | chr2 | 3 | GNLY | 5.501274 |
XLOC_000245 | 4.210475 | TRUE | chr13 | 1 | ZMYM5 | 0.609866 |
ENSG00000259721.1 | 4.009044 | FALSE | chr15 | 1 | GREM1 | 4.393926 |
ENSG00000226012.1 | 3.843327 | FALSE | chr21 | 2 | KCNJ15 | 1.35121 |
XLOC_000588 | 3.612975 | TRUE | chr3 | 2 | MUC4 | 2.170115 |
Down-regulated | | | | | | |
ENSG00000227588.2 | -5.5388 | FALSE | chr3 | 3 | CNTN4 | -3.39985 |
ENSG00000229233.1 | -4.70435 | FALSE | chr7 | 2 | SCIN | -3.79371 |
XLOC_000841 | -4.39681 | TRUE | chr8 | 1 | SGK223 | -1.11759 |
ENSG00000261012.2 | -4.36301 | FALSE | chr2 | 1 | APOB | -5.5889 |
ENSG00000272050.1 | -4.293 | FALSE | chr20 | 1 | GATA5 | -3.25843 |
ENSG00000270988.1 | -4.29017 | FALSE | chr8 | 1 | KBTBD11 | -2.3771 |
XLOC_000756 | -4.18568 | TRUE | chr6 | 2 | FRMD1 | -4.11944 |
ENSG00000254000.1 | -4.18514 | FALSE | chr8 | 2 | FER1L6 | -1.284 |
ENSG00000253696.2 | -4.09888 | FALSE | chr8 | 3 | KBTBD11 | -2.3771 |
ENSG00000251370.1 | -4.08083 | FALSE | chr5 | 3 | SEMA5A | -3.04607 |
Top 10 differentially expressed protein-coding genes and their associated lincRNAs.
Up-regulated | |||||
---|---|---|---|---|---|
Protein-coding gene | Gene logFC | Chromosome | Associated lincRNA | Novel | lincRNA logFC |
GNLY | 5.501274 | chr2 | XLOC_000450 | TRUE | 4.215082 |
IL12RB2 | 4.413091 | chr1 | XLOC_000014 | TRUE | 2.459516 |
GREM1 | 4.393926 | chr15 | ENSG00000259721.1 | FALSE | 4.009044 |
CLEC4E | 4.07331 | chr12 | ENSG00000255801.1 | FALSE | 4.662458 |
JAKMIP1 | 3.915142 | chr4 | ENSG00000251408.1 | FALSE | 3.129591 |
JAKMIP1 | 3.915142 | chr4 | ENSG00000249896.1 | FALSE | 3.612352 |
CCR8 | 3.728164 | chr3 | XLOC_000595 | TRUE | 3.337482 |
HOXB13 | 3.474395 | chr17 | ENSG00000242407.1 | FALSE | 2.949072 |
BATF | 3.170261 | chr14 | XLOC_000261 | TRUE | 3.499493 |
ZNF812 | 2.987816 | chr19 | XLOC_000416 | TRUE | 2.144413 |
Down-regulated | |||||
Protein-coding gene | Gene logFC | Chromosome | Associated lincRNA | Novel | lincRNA logFC |
G6PC | -6.01619 | chr17 | ENSG00000213373.3 | FALSE | -2.81641 |
APOB | -5.5889 | chr2 | ENSG00000261012.2 | FALSE | -4.36301 |
COL2A1 | -4.52718 | chr12 | ENSG00000258203.1 | FALSE | -2.68737 |
OSR2 | -4.22478 | chr8 | ENSG00000271930.1 | FALSE | -3.3486 |
PAQR5 | -4.19728 | chr15 | ENSG00000259504.1 | FALSE | -1.46608 |
FRMD1 | -4.11944 | chr6 | XLOC_000756 | TRUE | -4.18568 |
DAB1 | -4.04047 | chr1 | ENSG00000227935.1 | FALSE | -2.76251 |
ALDOB | -3.85788 | chr9 | XLOC_000895 | TRUE | -2.97737 |
SCIN | -3.79371 | chr7 | ENSG00000229233.1 | FALSE | -4.70435 |
PHYHIPL | -3.78832 | chr10 | XLOC_000135 | TRUE | -3.22042 |
Figure 4. A. Differentially expressed lincRNAs vs nearby protein-coding genes. Almost all pairs [91%] change in the same direction [quadrants I and III]. Red marks lincRNAs upstream of protein-coding genes [253, <10 kb from the TSS] and green marks those downstream [42, <2 kb from the TSS]. B. Gene set enrichment analysis with associated pathways. TSS, transcript start site.
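The direction-of-change comparison in panel A can be sketched as follows. This is a minimal illustration, not the authors' code: the six pairs are copied from the top-10 tables above, the axis assignment (x = lincRNA logFC, y = gene logFC) is an assumption, and the function names are hypothetical. Because only top-ranked pairs are used, the result does not reproduce the 91% reported for all 295 pairs.

```python
# (lincRNA ID, lincRNA logFC, nearest gene, gene logFC), copied from the top-10 tables.
pairs = [
    ("XLOC_000450", 4.215082, "GNLY", 5.501274),
    ("XLOC_000639", 4.6103, "RBPJ", 1.125349),
    ("XLOC_000261", 3.499493, "BATF", 3.170261),
    ("ENSG00000261012.2", -4.36301, "APOB", -5.5889),
    ("XLOC_000756", -4.18568, "FRMD1", -4.11944),
    ("ENSG00000229233.1", -4.70435, "SCIN", -3.79371),
]


def quadrant(linc_fc, gene_fc):
    """Quadrant of a point, assuming x = lincRNA logFC and y = gene logFC."""
    if linc_fc > 0 and gene_fc > 0:
        return "I"
    if linc_fc < 0 and gene_fc > 0:
        return "II"
    if linc_fc < 0 and gene_fc < 0:
        return "III"
    if linc_fc > 0 and gene_fc < 0:
        return "IV"
    return "axis"  # at least one logFC is exactly zero


def concordance(pairs):
    """Fraction of pairs whose lincRNA and gene change in the same direction."""
    same = sum(1 for _, linc_fc, _, gene_fc in pairs
               if quadrant(linc_fc, gene_fc) in ("I", "III"))
    return same / len(pairs)


print(f"concordant fraction in this subset: {concordance(pairs):.2f}")
```

Applied to the full set of lincRNA-gene pairs, the same calculation would give the quadrant I and III share described in the caption.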
Figure 5. Quantitative polymerase chain reaction [PCR] showing that lincRNAs XLOC_000261, XLOC_000639, and XLOC_000014 are overexpressed in CD4+ cells from intestinal tissue of Crohn’s disease patients [CD] compared with normal healthy controls [N]. Data are mean ± standard error [S.E.] for four healthy controls and six patients with CD; *p <0.05 by Student’s t test.
The significantly changed lincRNAs were distributed across all chromosomes, as shown in Figure 6A. The RBPJ and basic leucine zipper ATF-like transcription factor [BATF] protein-coding genes and their respective nearby lincRNAs XLOC_000639 and XLOC_000261 are illustrated in Figure 6B and 6C. Both lincRNAs are located upstream of these protein-coding genes and are significantly up-regulated in the patients’ CD4+ cells, as are their respective protein-coding genes.
Figure 6. Distribution of changed lincRNA and protein-coding gene pairs in the genome. A. The pairs are scattered across all chromosomes and change in similar directions. B. Example of a lincRNA-protein gene pair [XLOC_000639-RBPJ]. C. Example of a lincRNA-protein gene pair [XLOC_000261-BATF].
3.6. GWAS IBD hits in the coding regions of lincRNAs
Genome-wide association studies have identified and further validated 200 loci [231 independent SNPs] associated with Crohn’s disease, ulcerative colitis, or both, in a large trans-ancestry association study.4 As the majority of disease-associated SNPs map to non-coding DNA, we tested whether the IBD loci [defined as +/- 150 kb around the 201 top SNPs in the loci for the European population from Liu and colleagues4] were more likely to overlap with the Crohn’s-associated lincRNA set we identified. We found 184 SNP loci overlapping with at least one of the 6402 lincRNAs in our final differential expression analysis dataset [Figure 7]. Indeed, the lincRNAs that changed significantly between cases and controls were over-represented in the overlapping regions compared with those not differentially expressed [37.5% vs 27.5%, chi-square test p-value = 0.0003]. When we limited the loci to those for Crohn’s disease only, the distribution was similar but not statistically significant [35.4% vs 27.9%, p-value = 0.31], probably because the smaller number of CD loci [46 loci] provided insufficient statistical power.
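The overlap test described above can be sketched in two steps: flag each lincRNA that intersects a +/- 150 kb window around a lead SNP, then compare the overlapping fraction between differentially expressed and other lincRNAs with a chi-square test on the 2x2 table. The sketch below is an assumption-laden illustration: the coordinates, identifiers, and function names are hypothetical, and the test is a plain Pearson chi-square without continuity correction, which may differ from what the authors used. It does not attempt to reproduce the reported p-values.

```python
import math

WINDOW = 150_000  # +/- 150 kb around each lead SNP, as stated in the text


def overlaps_locus(linc, snps, window=WINDOW):
    """True if the lincRNA interval intersects any SNP +/- window on the same chromosome."""
    chrom, start, end = linc
    return any(
        chrom == snp_chrom and start <= pos + window and end >= pos - window
        for snp_chrom, pos in snps
    )


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(statistic / 2)). All margins must be non-zero.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical toy data: coordinates and DE flags are invented for illustration.
snps = [("chr1", 1_000_000), ("chr2", 5_000_000)]
lincs = {
    "lincA": (("chr1", 900_000, 905_000), True),
    "lincB": (("chr1", 2_000_000, 2_004_000), False),
    "lincC": (("chr2", 5_100_000, 5_120_000), True),
    "lincD": (("chr3", 5_000_000, 5_010_000), False),
}

# Count lincRNAs by (differentially expressed?, overlaps an IBD locus?).
table = {(de, hit): 0 for de in (True, False) for hit in (True, False)}
for interval, is_de in lincs.values():
    table[(is_de, overlaps_locus(interval, snps))] += 1

stat, p = chi_square_2x2(table[(True, True)], table[(True, False)],
                         table[(False, True)], table[(False, False)])
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

With real data the four cells would hold the counts of DE and non-DE lincRNAs inside and outside the IBD windows. A four-row toy table is far too small for the chi-square approximation to be meaningful and serves only to show the mechanics.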
Figure 7. Inflammatory bowel disease [IBD]-associated single nucleotide polymorphism [SNP] loci and overlapping differentially expressed lincRNAs. A total of 1791 differentially expressed lincRNAs [out of 1832], with fold change greater than 1.5 and average expression greater than 1 CPM [counts per million], were identified.
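The two thresholds named in the caption can be expressed as a simple filter. This is a sketch under stated assumptions: logFC values are taken to be log2 fold changes, the 1.5-fold cutoff is applied in both directions, and any significance threshold used upstream is not modelled. The function name and the CPM values in the usage lines are hypothetical.

```python
import math

FOLD_CHANGE_CUTOFF = 1.5  # from the caption
MIN_MEAN_CPM = 1.0        # from the caption


def passes_filter(log2_fc, mean_cpm):
    """Keep a lincRNA if |fold change| > 1.5 (assuming log2 scale) and mean CPM > 1."""
    return abs(log2_fc) > math.log2(FOLD_CHANGE_CUTOFF) and mean_cpm > MIN_MEAN_CPM


# logFC values from the tables above; the mean CPM values are invented placeholders.
print(passes_filter(4.6103, 5.0))    # XLOC_000639 with a hypothetical mean CPM of 5
print(passes_filter(-4.18568, 0.5))  # XLOC_000756 with a hypothetical mean CPM of 0.5
```

On a log2 scale the 1.5-fold cutoff corresponds to an absolute logFC of about 0.585, so every lincRNA in the top-10 tables clears the fold-change threshold by a wide margin.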
3.7. Knockdown of XLOC_000261 enhances frequency of RORγt+ T cells
We evaluated the expression of XLOC_000261 [nearest protein-coding gene: BATF] and XLOC_000014 [nearest protein-coding gene: IL12RB2] RNA in human T cell subsets, Treg, Th1, Th2, and Th17, induced in vitro from naïve CD4+ T cells [CD45RA+ cells]. Flow cytometric analysis confirmed the expression of FOXP3, T-bet, GATA3, and RORγt in Tregs, Th1, Th2, and Th17 cells, respectively [data not shown]. We found lincRNA XLOC_000261 to be highly expressed in Th17 cells and Tregs, whereas lincRNA XLOC_000014 was enriched in Th1 and Th17 cells [Figure 8]. Knockdown of XLOC_000261 in Th17 cells induced a significant increase in RORγt expression as measured by flow cytometry [Figure 9], suggesting that XLOC_000261 is a negative regulator of the Th17-defining transcription factor RORγt and may affect the pro-inflammatory capacity of Th17 cells. Highlighting the importance of functional validation of lncRNAs, knockdown of XLOC_000014 in Th1 cells did not cause a significant change in expression of the transcription factor T-bet [Figure 10]. However, only mild knockdown of XLOC_000014 was achieved [Supplementary Figure 2, available as Supplementary data at ECCO-JCC online], which may at least partly explain why no change in T-bet expression was observed.
Figure 8. Quantitative polymerase chain reaction [PCR] for XLOC_000261 and XLOC_000014 in T cell subtypes. A. Quantitative PCR for XLOC_000261 in induced Treg [iTreg], Th1, Th2, and Th17 cells. B. Quantitative PCR for XLOC_000014 in induced Treg [iTreg], Th1, Th2, and Th17 cells. Data represent three independent experiments, shown with standard error of the mean [S.E.M]; *p <0.05 [unpaired Student’s t test].
Figure 9. Flow cytometry for RORγt in Th17 cells. Expression of RORγt in Th17 cells 24 h after transfection with non-targeting siRNA [CONTROL], shown in the left column, or siXLOC_000261 [siRNA], shown in the middle column, from Donor #1 [A], Donor #2 [B], Donor #3 [C], and Donor #4 [D]. Mean percentage of RORγt represented as histograms in the right column [E], where red is control and blue is siRNA. Data represent four independent experiments, shown with standard error of the mean [S.E.M]; *p <0.05 [unpaired Student’s t test].
Figure 10. Flow cytometry for T-bet in Th1 cells. Expression of T-bet in Th1 cells 24 h after transfection with non-targeting siRNA or siXLOC_000014 from Donor #1 [A], Donor #2 [B], Donor #3 [C], and Donor #4 [D]. Mean percentage of T-bet represented as histograms in the right column [E], where red is control and blue is siRNA. Data represent four independent experiments, shown with standard error of the mean [S.E.M]; *p <0.05 [unpaired Student’s t test].
4. Discussion
The significant contribution of this study is the identification of novel lincRNAs and of differentially expressed lincRNAs involved in the regulation of multiple pro-inflammatory gene networks in CD-associated CD4+ T cells, in contrast to previous studies, which were conducted on mixed-cell biopsies or bulk tissue and examined known lncRNAs only.6,33 We also validated a subset of the novel lincRNAs in an independent cohort. Both known and novel lincRNAs clearly separated patients from controls by clustering or PCA, and a higher proportion of differentially expressed lincRNAs overlapped regions of IBD susceptibility loci. Furthermore, we demonstrate that the novel lincRNA XLOC_000261 negatively regulates expression of RORγt in Th17 cells. This is the first report of the expression and potential biological relevance of lincRNAs in CD-associated immune cells.
The clear separation of CD4+ cells between patient and control samples by PCA or unsupervised clustering is in agreement with previous studies of lncRNA expression in plasma and intestinal biopsies from IBD patients6,14,15; yet our focus on CD4+ T cell isolation allowed deeper insight into novel T cell-specific pathophysiological mechanisms. Discovery of novel lncRNAs relies upon a stringency criterion as outlined in the Methods section, which requires a relatively high level of expression. However, once a novel lncRNA has been identified, validation is then possible using a highly sensitive technique such as PCR even when the expression is quite low, as was evident in peripheral blood mononuclear cells. Interested in the contribution of inflammation to a putative Crohn’s-specific signal, we performed a small independent subanalysis of isolated CD4+ cells from inflamed and non-inflamed tissue at the time of surgical resection, and found that both patient-derived samples generally cluster together and separate from control samples [Supplementary Figure 3 and Supplementary Table 3, available as Supplementary data at ECCO-JCC online]. However, caution is warranted in interpreting the uniformity of the lincRNA signal in inflamed/non-inflamed samples, given the small sample size and the use of the patient as their own control. Thus, we cannot formally exclude a contribution of inflammation to a Crohn’s-specific signal.
Our bioinformatics analysis demonstrated enrichment of pro-inflammatory pathways, including TNF-alpha, IL-6, Th2, and T-cell receptor signalling, consistent with previous work6,15; however, the significance of the Notch signalling pathway has not previously been described. Other examples of lncRNAs regulating human T cell differentiation exist. linc-MAF-4 promotes Th1 development by suppressing the expression of MAF; and the TH2-LCR lncRNA cluster, selectively expressed in Th2 cells, regulates expression of IL-4, IL-5, and IL-13.12,16 Regarding the Notch pathway, recombining binding protein suppressor of hairless gene [RBPJ], a key nuclear effector in Notch signalling, was up-regulated more than 2-fold in CD4+ cells of the patients, likely regulated by one of the top 10 most up-regulated lincRNAs, the novel lincRNA XLOC_000639 located upstream of RBPJ [Figure 6B]. Of relevance to T regulatory [Treg] cell biology, Notch signalling has been shown to play a role in CD4+ T cell differentiation and the survival and function of peripheral Treg cells34–36; yet its role in the immune pathogenesis of IBD has not been explored. We have identified a novel lincRNA, XLOC_000261, relevant to Th17 biology in the intestine of patients with Crohn’s disease, an immune-mediated disease known to be driven by the Th17 pathway. These data highlight the need for, and impact of, discovery forms of experimentation capable of identifying candidate proteins for deeper, more mechanistic disease-relevant analysis.
In treatment-naive, early-onset paediatric patients with Crohn’s disease, a recent study found that lincRNA LINC01272 was significantly up-regulated and may play an important role in the myeloid pro-inflammatory process.33 Our data showed an even greater change [24- vs 9-fold] for this lincRNA [ENSG00000224397.1], which demonstrates the benefit of profiling purified, specific cell populations. Of note, that study examined known lincRNAs only.
In addition, we show that several of the known and novel differentially expressed lncRNAs described in this study are found more often than expected within the IBD susceptibility loci, raising the possibility that these genetic variants may affect gene expression and regulation through lncRNAs. This is in agreement with previous studies in CD and ulcerative colitis in whole-tissue biopsies, which have identified several lncRNAs [e.g., IFNG-AS1] near protein-coding genes associated with pro-inflammatory response, such as IFN-gamma, overlapping IBD susceptibility loci.14 As lincRNAs have been shown to be cell type- and stage-specific, investigating isolated CD4+ T cells from intestinal mucosa adds critical insight into the identification of long non-coding RNAs that are functionally relevant to the immune-mediated pathogenesis of CD.
In conclusion, we performed a comprehensive profiling of long intergenic non-coding RNAs in Crohn’s disease-associated CD4+ T cells from intestinal mucosa through RNA sequencing. We identified novel differentially expressed lincRNAs and found that they are enriched in pro-inflammatory and Notch signalling pathways and more frequently overlap IBD susceptibility loci. In addition, we demonstrate a functional role of lincRNA XLOC_000261 in the differentiation of Th17 cells. Taken together, these findings suggest that lincRNAs may play a role in regulating the immune system in Crohn’s disease. Further studies are warranted to validate and determine functional roles of these lincRNAs.
Funding
This work was supported by National Institutes of Health [grant number RO1 AI089714] and the National Institute of General Medical Sciences [grant number NIGMS GM075148].
Conflict of Interest
The authors declare no conflict of interest.
Author Contributions
MBBN was involved in the concept and design, data collection, analysis and interpretation of the data, drafting first draft of manuscript, and final approval of the article. JMG was involved in data collection, analysis and interpretation of the data, drafting first draft of manuscript, and final approval of the article. OS was involved in the data collection, editing for important intellectual content, and final approval of the article. MG was involved in the data collection, editing for important intellectual content, and final approval of the article. PS was involved in the data collection, editing for important intellectual content, and final approval of the article. GPR was involved in data collection, editing for important intellectual content, and final approval of the article. MRS was involved in the data collection, editing for important intellectual content, and final approval of the article. AOB was involved in the interpretation of the data, editing for important intellectual content, and final approval of the article. SA was involved in the data collection, editing for important intellectual content, and final approval of the article. ZS was involved in the concept and design, analysis and interpretation of the data, editing for important intellectual content, study supervision, and final approval of the article. WAF was involved in the concept and design, analysis and interpretation of the data, editing for important intellectual content, study supervision, and final approval of the article.