Conformational changes are crucial for a wide range of biological processes, including biomolecular folding and the operation of key cellular machinery. The transcription complex, the translation complex, and the RNA-Induced Silencing Complex (RISC) are cellular machines of particular interest because of their roles in gene expression. Misregulation of these machines is a major factor in cancer and other human diseases. Extensive genetic, biochemical, and structural experiments have been performed to understand these systems. However, probing the mechanisms of these complexes at atomic resolution is very difficult experimentally, and without these details it is impossible to understand the fundamental chemistry they perform.
Computer simulations can complement such experiments by providing dynamic information at the atomic level. Such information is presently inaccessible to experimental observation, and it can lead to predictions that can then be tested experimentally. Our research focuses on understanding and manipulating fundamental biological processes associated with conformational changes by developing and applying novel computational tools that bridge the gap between experiments and simulations.
Ongoing projects in the group:
Simulating biologically relevant timescales at atomic resolution is a challenging task since typical atomistic simulations are orders of magnitude shorter. In order to overcome this timescale gap, we will develop and apply novel computational tools.
Multi-resolution Markov State Models (MSMs)
Markov State Models (MSMs) provide one means of overcoming this gap without sacrificing atomic resolution by extracting long time dynamics from short simulations. MSMs coarse grain space by dividing conformational space into long-lived, or metastable, states. This is equivalent to coarse graining time by integrating out fast motions within metastable states. Such separation of timescales ensures that the model is Markovian, in that the probability of being in a given state at time t+∆t depends only on the state at time t. This property allows MSMs to be built from many short simulations (tens to hundreds of nanoseconds), and then propagated to give global long timescale dynamics.
Figure: Folding kinetics and thermodynamics from MSMs
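The MSM construction described above can be illustrated with a minimal sketch. It assumes the simulations have already been assigned to discrete metastable states (integer labels per frame), and it uses a simple row-normalized count estimator; the function names, the toy trajectories, and the estimator choice are illustrative assumptions, not the group's actual software.

```python
import numpy as np

def estimate_transition_matrix(trajectories, n_states, lag=1):
    """Count transitions at the given lag time (in frames) and row-normalize."""
    counts = np.zeros((n_states, n_states))
    for traj in trajectories:
        traj = np.asarray(traj)
        # Pair each frame with the frame `lag` steps later.
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one observed outgoing transition")
    return counts / row_sums

def propagate(p0, T, n_steps):
    """Markov property: p(t + n*lag) = p(t) T^n."""
    return np.asarray(p0, dtype=float) @ np.linalg.matrix_power(T, n_steps)

def stationary_distribution(T):
    """Left eigenvector of T with eigenvalue 1, normalized to sum to 1."""
    eigvals, eigvecs = np.linalg.eig(T.T)
    idx = np.argmax(eigvals.real)
    pi = np.abs(eigvecs[:, idx].real)
    return pi / pi.sum()

# Hypothetical toy data: two short trajectories over two metastable states.
toy_trajectories = [[0, 0, 1, 1, 0], [1, 1, 1, 0, 0]]
T = estimate_transition_matrix(toy_trajectories, n_states=2)
long_time_populations = propagate([1.0, 0.0], T, n_steps=200)
pi = stationary_distribution(T)
```

Each trajectory here is far shorter than the timescale reached by propagating T, which is the point of the approach: many short simulations determine the transition probabilities, and repeated application of T gives the long-time behavior. In practice the lag time must be long enough for the Markov assumption to hold.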
By varying the degree of coarse graining, one can vary the resolution of an MSM; MSMs are therefore inherently multi-resolution. We have developed a new algorithm, Super-level-set Hierarchical Clustering (SHC), which is, to our knowledge, the first algorithm focused on constructing MSMs at multiple resolutions. The key insight of this algorithm is to generate a set of super levels covering different density regions of phase space, then cluster each super level separately, and finally recombine this information into a single MSM. SHC can produce MSMs at different resolutions by using different sets of super density levels.
We continue to work in this direction by developing new tools for conformational dynamics based on topological and geometric data mining, in collaboration with Prof. Yuan Yao (Math, Peking U.) and Dr. Jian Sun (Computer Science, Princeton U.).
Generalized Ensemble Sampling (GES) Methods
Understanding conformational changes in biomolecules is challenging because it is difficult to sample their rugged, high-dimensional free energy landscapes. Simulations tend to become trapped in local minima for extended periods of time. GES methods, such as simulated tempering and parallel tempering, were developed to overcome this trapping problem by inducing a random walk in temperature or in a modified Hamiltonian space. Time spent at high temperatures (or under the modified Hamiltonian) facilitates barrier crossing, while canonical sampling of the free energy landscape at the physiological temperature (or under the original Hamiltonian) is still achieved. GES methods have therefore been widely used to study conformational changes. However, it has recently been shown that GES does not clearly speed up convergence for systems in which the relevant barrier is predominantly entropic, so that sampling is limited by a conformational search rather than by the crossing of energy barriers. We are interested in developing new algorithms to improve the efficiency of GES methods.
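As an illustration of how the random walk in temperature is generated, the following minimal sketch shows the standard Metropolis criterion for exchanging configurations between two replicas in parallel tempering. The function names, the energy unit (kcal/mol), and the example values are assumptions made for illustration only; this is not the group's implementation.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K); unit choice is an assumption

def swap_acceptance_probability(energy_i, energy_j, temp_i, temp_j, k_b=K_B):
    """Probability of swapping the configurations held at temp_i and temp_j.

    Uses min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), which preserves
    canonical sampling at every temperature in the ladder.
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    p = swap_acceptance_probability(energy_i, energy_j, temp_i, temp_j)
    return rng.random() < p

# Hypothetical example: the 300 K replica holds the lower-energy configuration,
# so the swap is accepted only with some probability below 1.
p_example = swap_acceptance_probability(-110.0, -100.0, 300.0, 350.0)
```

The criterion depends only on potential energies, which is consistent with the limitation noted above: when the bottleneck is entropic, raising the temperature does little to speed up the conformational search.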
Elucidating the mechanism of transcription is crucial for understanding fundamental cellular processes and many human diseases because of the key role it plays in gene regulation. However, elucidating such mechanisms in sufficient chemical detail (atomic resolution) is difficult for experimental techniques. For instance, X-ray crystallography is one of the few experimental techniques that can give insight into the atomic details of a system, but X-ray structures are only static snapshots. They leave the mechanisms by which RNA polymerase II (Pol II) moves between its different conformational states a mystery (such as the translocation process described by the figure below).
We use computer simulations to study the dynamics of Pol II transcription elongation at atomic resolution, filling the gaps between the known states identified by X-ray crystallography. We are applying and developing novel computational algorithms such as MSMs to overcome the main challenge here for computer simulations: reaching biologically relevant timescales which are orders of magnitude longer than most atomistic simulations.
Biomolecular folding is another area where conformational change is important. For example, proteins must undergo conformational changes to fold into their native structures in order to perform their functions. When proteins misfold, they can aggregate and form fibrils. This has been suggested to be the cause of many human diseases, such as Huntington’s disease.
We are interested in developing novel computational tools to elucidate the folding and misfolding thermodynamics and kinetics of biomolecules. In particular, we aim to perform quantitative comparisons with experiments. The connections between our simulation results and experimental observables are made in collaboration with Prof. Wei Zhuang (Chinese Academy of Sciences), who specializes in computing spectra.