Neural network analysis of mRNA secondary structure across transcriptomes, 2010

Collection:
Atlanta University and Clark Atlanta University Theses and Dissertations
Title:
Neural network analysis of mRNA secondary structure across transcriptomes, 2010
Creator:
Lockhart, Edward Ronald, Jr.
Contributor to Resource:
Seffens, William
Chaudhary, Jaideep
Odero-Marah, Valerie
Date of Original:
2010-12-01
Subject:
Degrees, Academic
Dissertations, Academic
Location:
United States, Georgia, Fulton County, Atlanta, 33.749, -84.38798
Medium:
dissertations
theses
Type:
Text
Format:
application/pdf
Description:
Degree Type: dissertation
Degree Name: Doctor of Philosophy (PhD)
Date of Degree: 2010
Granting Institution: Clark Atlanta University
Department/ School: School of Arts and Sciences, Biology
This study examines mRNAs of less than 5000 base pairs in size to determine the effect of base composition on folding free energy. Each native mRNA was compared statistically with randomized versions of its own sequence. Across human, chimp, chicken, mouse, and several other transcriptomes, the native mRNAs were more stable than their randomized counterparts: when length and base composition are conserved, native mRNA sequences fold with a more negative free energy than random mRNA sequences. This negative bias in free energy can be measured statistically as a Z-score, which normalizes for sequence length. To determine whether sequence patterns correlate with secondary structure, a neural network (JavaNNS) was trained separately on three training sets (Negative-Z, Near Zero-Z, Positive-Z) to compare how well the network learns from gene sequences with different folding characteristics. Training typically ran for up to 100,000 generations, and the resulting sum square errors were saved periodically. The negative Z-score training set gave lower neural network sum square errors than the positive Z-score training set, and the training set with Z-scores near zero gave the highest training error. This indicates that there are more detectable sequence patterns in genes with more secondary structure than in genes exhibiting more positive Z-scores.
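The Z-score comparison described in the abstract can be illustrated with a minimal sketch. It assumes the conventional definition, Z = (E_native - mean(E_random)) / sd(E_random), where the random sequences are shuffles of the native sequence that keep its length and base composition. The record does not state the formula or the folding program used, so both are assumptions here. The function names are hypothetical, and `toy_energy` is a stand-in that only counts adjacent complementary bases; a real analysis would call an RNA folding program instead.

```python
import random
import statistics
from typing import Callable


def shuffle_preserving_composition(seq: str, rng: random.Random) -> str:
    """Return a random permutation of seq (same length, same base counts)."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def folding_z_score(
    native: str,
    fold_energy: Callable[[str], float],
    n_random: int = 100,
    seed: int = 0,
) -> float:
    """Z-score of the native folding energy against shuffled controls.

    Assumed definition: (E_native - mean(E_random)) / sd(E_random).
    A negative value means the native sequence is more stable than
    its randomized counterparts.
    """
    rng = random.Random(seed)
    native_energy = fold_energy(native)
    random_energies = [
        fold_energy(shuffle_preserving_composition(native, rng))
        for _ in range(n_random)
    ]
    mean = statistics.fmean(random_energies)
    sd = statistics.stdev(random_energies)
    if sd == 0:
        raise ValueError("randomized energies have zero variance")
    return (native_energy - mean) / sd


def toy_energy(seq: str) -> float:
    """Hypothetical stand-in for a folding program, NOT a real energy model.

    Scores -1.0 for every adjacent complementary pair (GC, CG, AU, UA)
    so that the result depends on base order, not only on composition.
    """
    complementary = {"GC", "CG", "AU", "UA"}
    return -float(sum(seq[i:i + 2] in complementary for i in range(len(seq) - 1)))


# Illustrative input only: a highly ordered made-up sequence.
example = "GC" * 8 + "AU" * 8
z = folding_z_score(example, toy_energy)
print(f"Z-score of example sequence: {z:.2f}")
```

Passing the energy function as a parameter keeps the Z-score calculation independent of the folding program. Sequences could then be binned into the Negative-Z, Near Zero-Z, and Positive-Z sets by thresholds on the returned value, though the record does not give the thresholds used.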
Metadata URL:
http://hdl.handle.net/20.500.12322/cau.td:2010_lockhart_edward_r_jr
Language:
eng
Holding Institution:
Clark Atlanta University
Rights:
Rights Statement information
