Background: Genetic studies have been tremendously successful in identifying genomic regions associated with a wide variety of phenotypes, although their success in identifying causal genes, their variants, and their functional impacts has been more limited.
Methods: We identified 145 genes from inflammatory bowel disease (IBD)-associated genomic loci that are endogenously expressed within the intestinal epithelial cell compartment. Using transcriptomic analyses, we evaluated the impact of lentiviral transfer of the open reading frame (ORF) of each of these IBD genes into the HT-29 intestinal epithelial cell line. Comparing the genes whose expression was modulated by each ORF, as well as the functions enriched within these gene lists, identified ORFs with shared impacts and their putative disease-relevant biological functions.
Results: Analysis of the transcriptomic data for cell lines expressing the ORFs for known causal genes such as HNF4a, IFIH1 and SMAD3 identified functions consistent with what is known for these genes. These analyses also identified two major clusters of genes with shared impact on the transcriptome: Cluster 1 contained the known IBD causal genes IFIH1, SBNO2, NFKB1 and NOD2, as well as genes from other IBD loci (ZFP36L1, IRF1, GIGYF1, OTUD3, AIRE and PITX1), whereas Cluster 2 contained the known causal gene KSR1 and implicated DUSP16 from another IBD locus. Our analyses of these clusters highlighted how multiple IBD gene candidates affect epithelial structure and function, including the protection of the mucosa from intestinal microbiota.
Conclusions: This functional screen, based on expressing IBD genes within an appropriate cellular context, in this instance intestinal epithelial cells, resulted in changes to the cell’s transcriptome that are relevant to these genes' endogenous biological function(s). This not only helped identify likely causal genes within genetic loci but also provided insight into their biological functions. Furthermore, this work has highlighted the central role of intestinal epithelial cells in IBD pathophysiology.
Overall design
This dataset includes 426 samples for 145 ORFs, mostly as independently transduced triplicates. Empty-vector controls are also included.
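As an illustration of the comparison described in the Methods (finding ORFs whose modulated-gene lists overlap), the following is a minimal Python sketch. It is not the authors' pipeline: this record does not state the similarity measure or clustering method used, so the Jaccard index, the single-linkage grouping, the threshold values, and all ORF and gene names (`ORF_A`, `g1`, ...) are assumptions chosen purely for illustration.

```python
from itertools import combinations

# Hypothetical input: for each ORF, the set of genes whose expression was
# modulated relative to empty-vector controls. Names are placeholders only.
EXAMPLE_MODULATED = {
    "ORF_A": {"g1", "g2", "g3", "g4"},
    "ORF_B": {"g2", "g3", "g4", "g5"},
    "ORF_C": {"g8", "g9"},
    "ORF_D": {"g9", "g10"},
}


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(modulated):
    """Map each unordered ORF pair (sorted by name) to its Jaccard index."""
    return {
        (x, y): jaccard(modulated[x], modulated[y])
        for x, y in combinations(sorted(modulated), 2)
    }


def group_orfs(modulated, threshold):
    """Single-linkage grouping: link two ORFs if similarity >= threshold.

    Assumed grouping rule for illustration; returns sorted lists of ORF names.
    """
    parent = {orf: orf for orf in modulated}

    def find(orf):
        # Follow parent links to the group representative (with path halving).
        while parent[orf] != orf:
            parent[orf] = parent[parent[orf]]
            orf = parent[orf]
        return orf

    for (x, y), score in pairwise_similarity(modulated).items():
        if score >= threshold:
            parent[find(x)] = find(y)

    groups = {}
    for orf in sorted(modulated):
        groups.setdefault(find(orf), []).append(orf)
    return sorted(groups.values())


if __name__ == "__main__":
    for pair, score in pairwise_similarity(EXAMPLE_MODULATED).items():
        print(pair, round(score, 3))
    print(group_orfs(EXAMPLE_MODULATED, threshold=0.3))
```

With real data, the per-ORF gene sets would come from a differential-expression analysis of the replicate samples against the empty-vector controls, and a hierarchical clustering routine could replace the simple threshold grouping used here.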