Biology & Bioinformatics
Carissa Fong, carissa_fong1@baylor.edu
Baylor University, with Dr. Mary Lauren Benton
Bioinformatics
Predicting Gene Expression in S. cerevisiae From Random Promoter Sequences Using Machine Learning Methods
Phenotypic variation in eukaryotes is largely determined by gene regulation. While the protein-coding regions of the genome are responsible for creating the compounds that result in a specific phenotype, the regulatory regions of the genome add further complexity that determines the observed phenotype. One such region is the promoter, which is critical for the expression of all genes. However, the "regulatory code" of the promoter (that is, how its sequence relates to the expression level of its associated gene) is not well understood. In the present study, we aim to better understand the relationship between a promoter's sequence and its strength by using machine learning methods to predict the level of gene expression from a yeast promoter sequence. We also compare the performance of various machine learning algorithms to determine the optimal strategy for predicting gene regulation. We find that models trained on 3-mer data performed moderately well in both classification (accuracy = 0.69, F1 = 0.69) and regression (RMSE = 1.95, R² = 0.32); however, there is still much room for improvement. In the future, we plan to test nonlinear models such as neural networks to determine whether they are better able to capture the relationship between sequence and expression. We also hope to train and test our models on a more powerful machine to improve runtimes and enable the use of larger datasets. Further research is needed to determine whether models trained on yeast promoters can be extrapolated to other organisms, including humans.
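The abstract does not say how the 3-mer data were computed. As a minimal sketch, assuming overlapping 3-mer counts over the A/C/G/T alphabet (one common choice, not necessarily the authors' method), a promoter sequence can be turned into a 64-element feature vector as follows. The names `KMERS`, `KMER_INDEX`, and `three_mer_counts` are hypothetical.

```python
from itertools import product

# All 64 possible 3-mers over the DNA alphabet, in a fixed order.
# (Assumption: features are raw overlapping counts; the abstract does not specify.)
KMERS = ["".join(p) for p in product("ACGT", repeat=3)]
KMER_INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def three_mer_counts(sequence):
    """Return a list of 64 counts, one per 3-mer, using overlapping windows.

    Windows containing characters other than A, C, G, or T are skipped.
    """
    counts = [0] * len(KMERS)
    seq = sequence.upper()
    for i in range(len(seq) - 3 + 1):
        idx = KMER_INDEX.get(seq[i:i + 3])
        if idx is not None:
            counts[idx] += 1
    return counts
```

A vector like this could then be passed to any standard classifier or regressor, with one row per promoter and the measured expression level as the target.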
Sabrina Hardin, sabrina.hardin@my.utsa.edu
University of Texas at San Antonio, with Dr. Brian P. Hermann
Biology
Determining Function of Id4-CreERT2 through Expression Profiling and Functional Characterization
Spermatogenesis is the complex process through which numerous spermatozoa are generated daily within the testis. Spermatozoa are generated continually throughout a man’s lifetime and are dependent on specialized stem cells located within the testis called spermatogonial stem cells (SSCs). Undifferentiated SSCs are found within the gonads and can either self-renew to create more SSCs or differentiate to initiate spermatogenesis. Currently, there is minimal research on the biological mechanism that determines whether SSCs self-renew or differentiate. Studies have found that inhibitor of differentiation (Id) molecules play regulatory roles in numerous species, and that the number of Id4+ cells correlates positively with the number of SSCs. CreERT2 is a tamoxifen-inducible Cre-estrogen receptor fusion protein that helps researchers induce and track Id4 within the testes to correlate the number of self-renewing or differentiating SSCs. However, when transgenes are produced and genes like Id4 are genetically manipulated, it is important to validate that the Id4 gene maintains its original function. We do this through mRNA processing, DNA genotyping, and LacZ staining of different tissues from Id4+ and Id4- mice to determine whether Id4 is expressed in the tissues where it is normally expressed, namely the sex organs and the colon. This validation supports past and future research using Id4-CreERT2 and SSCs. Current findings from LacZ staining show that the testes, epididymis, and colon are positive for Id4.
Yaseen Arab, yassenarab@gmail.com
Baylor University, with Dr. Mary Lauren Benton
Biomedical Engineering/Bioinformatics
GeneRegulate Tool
Gene regulatory elements, such as enhancers and silencers, play a significant role in regulating the degree of gene expression. Identifying these regulatory elements can improve our understanding of how genes are expressed differently across tissues and reveal possible therapeutic targets for precision medicine. GeneRegulate addresses this need by connecting enhancer and silencer data with gene expression to identify the key regulatory components associated with tissue-specific expression. Our tool allows researchers to analyze complex genomic data across six human tissues. GeneRegulate offers two options: users can upload their own gene lists, enhancers, and silencers, or use our preloaded datasets. We process the data in Python, connecting regulatory elements with their associated genes using an existing set of chromatin loop annotations. Users can analyze the regulatory landscape of individual genes across multiple tissues and display the results with a variety of data visualization options, making it easier to identify tissue-specific expression patterns and regulatory trends. GeneRegulate is a powerful tool for studying gene regulation and tissue-specific expression. Analyzing these data can increase our understanding of regulatory mechanisms, with potential implications for precision medicine and drug development. As GeneRegulate evolves, our goal is to add more genomic data and analyze more regulatory elements. By doing this, we aim to uncover new findings in gene regulation and contribute to the advancement of precision medicine.
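The abstract does not describe how the linking step is implemented. One plausible reading, sketched below under stated assumptions, is that a regulatory element is linked to a gene when the element overlaps one anchor of a chromatin loop and the gene (for example, its promoter window) overlaps the other anchor. The `Interval` type, the function names, and the half-open coordinate convention are all hypothetical and are not taken from GeneRegulate.

```python
from collections import namedtuple

# Hypothetical genomic interval; coordinates are assumed to be half-open [start, end).
Interval = namedtuple("Interval", ["chrom", "start", "end"])


def overlaps(a, b):
    """True if two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def link_elements_to_genes(elements, genes, loops):
    """Pair each regulatory element with genes at the opposite anchor of a loop.

    elements: dict mapping element name -> Interval
    genes:    dict mapping gene name -> Interval (e.g. a promoter window)
    loops:    list of (anchor_a, anchor_b) Interval pairs
    Returns a sorted list of (element name, gene name) tuples.
    """
    links = set()
    for anchor_a, anchor_b in loops:
        # Check both orientations: the element may sit at either anchor.
        for near, far in ((anchor_a, anchor_b), (anchor_b, anchor_a)):
            for elem_name, elem in elements.items():
                if not overlaps(elem, near):
                    continue
                for gene_name, gene in genes.items():
                    if overlaps(gene, far):
                        links.add((elem_name, gene_name))
    return sorted(links)
```

The nested loops keep the sketch readable; for genome-scale inputs, an interval index or a dedicated genomics library would be a more practical choice.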
Session Location
- Foster 203
Session Date/Time
- Thursday, 10:00 - 11:00am
Session Type
- Oral Student Presentations
- Student Presentations