The CYCLIZE Program
The CYCLIZE program can be divided into four main parts. The first
part is defining the "cyclization parameters", which describe
how to perform the calculation. The second part is defining the
"sequence parameters", which describe how the chain on which
the simulation is performed is constructed. The third part is
calling the C programs that run the simulation. The fourth part
is analyzing the results.
1) Defining the "cyclization parameters" hash
The cyclization parameters hash (usually called %cyclize_parameters)
contains all the information necessary for running the Monte Carlo
simulation EXCEPT information on the chain itself. For example:
    %cyclize_parameters = (
        "whole_chains"     => 1e8,
        "nrad_stats"       => 1e7,
        "icalcs"           => 100,
        "radial_cutoff"    => 60,
        "axial_cutoff"     => 40,
        "torsional_cutoff" => 36,
        "nkeepers"         => 10,
    );
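These cyclization parameters will perform a calculation with:
    1e8 whole chains analyzed in total
    radial statistics kept for 1e7 whole chains
    100 independent calculations
    a radial cutoff of 60 angstroms
    an axial cutoff of 40 degrees
    a torsional cutoff of 36 degrees
    10 cyclized chains stored to disk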
2) Generating the "sequence parameters" hash
The "sequence parameters" hash (usually called %seq) contains the
information necessary to construct the particular chain being studied.
Ultimately, it is assumed that all DNA chain information can be
represented in terms of seven values per base pair step:
tilt, tilt flex, roll, roll flex, twist, twist flex, and
rise along the helix axis (dz).
For instance, two base pairs of B-form DNA at 25 degrees C might look
like this using these values:
num     tilt   flex   roll   flex   twist   flex     dz
  1 B  0.000  4.842  0.000  4.842  34.450  4.388  3.400
  2 B  0.000  4.842  0.000  4.842  34.450  4.388  3.400
CYCLIZE holds this information in a sequence parameters hash.
A number of PERL subroutines have been written to facilitate
creating the %seq hash. Refer to their descriptions to see how to
use them, or look at some of the sample scripts.
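As a rough illustration of the seven per-step values described above, the
following sketch stores the two B-form steps from the table in a plain PERL
array of hashes and sums the twist and rise over the chain. The layout
(@steps and its key names) is an assumption made for this example only; it
is not the actual internal structure of %seq, which should be built with
the CYCLIZE subroutines. Angles are taken to be in degrees and dz in
angstroms.

```perl
use strict;
use warnings;

# Hypothetical layout (NOT the real %seq structure): one hash per base
# pair step, holding the seven values named in the text.
my @steps = (
    { tilt  => 0.000,  tilt_flex  => 4.842,
      roll  => 0.000,  roll_flex  => 4.842,
      twist => 34.450, twist_flex => 4.388,
      dz    => 3.400 },
    { tilt  => 0.000,  tilt_flex  => 4.842,
      roll  => 0.000,  roll_flex  => 4.842,
      twist => 34.450, twist_flex => 4.388,
      dz    => 3.400 },
);

# Sum the equilibrium twist and the rise over all steps.
my ($total_twist, $total_rise) = (0, 0);
for my $step (@steps) {
    $total_twist += $step->{twist};
    $total_rise  += $step->{dz};
}

printf "steps: %d, total twist: %.3f, total rise: %.3f\n",
    scalar(@steps), $total_twist, $total_rise;
```

Summing twist and dz in this way gives a quick sanity check on a chain
before running a simulation, since both totals follow directly from the
per-step values.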
3) Running the Monte Carlo simulation and analysis
In order to perform the Monte Carlo simulation, the "cyclization
parameters" and "sequence parameters" hashes must be defined.
Once they have been built, the PERL subroutine "Cyclize" will
perform the MC simulation.
This PERL subroutine calls two C programs to perform the
simulation and analysis: generate_chains and analyze_chains.
Below is a description of how to use each program (note: you needn't
know any of this; you can simply call Cyclize, and the subroutine
will call these programs appropriately):
generate_chains
This program uses the sequence.in file as input and creates two
files called "chain_1.dat" and "chain_2.dat" (assuming 2 partitions)
that contain "nchains" half-chains generated using Monte Carlo.
USAGE:
    generate_chains file seed nchains nparts
Where: file    = DNA params file
       seed    = random number generator seed (odd 6 digit integer)
       nchains = total number of chains to generate
       nparts  = total number of chain partitions to generate
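The usage line above can be sketched in PERL as a small helper that checks
the arguments and assembles the command. The helper name
generate_chains_command is hypothetical and is not part of CYCLIZE. The
checks also assume that a "6 digit" seed has no leading zero and that
nchains and nparts are positive integers. The argument values are examples
only.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CYCLIZE): validate the arguments and
# return the generate_chains command as a list, in the documented order.
sub generate_chains_command {
    my ($file, $seed, $nchains, $nparts) = @_;

    # The seed must be an odd 6 digit integer (assumed: no leading zero).
    die "seed must be an odd 6 digit integer\n"
        unless $seed =~ /^[1-9][0-9]{5}$/ && $seed % 2 == 1;

    # Assumption: both counts are positive integers.
    die "nchains and nparts must be positive integers\n"
        unless $nchains =~ /^[1-9][0-9]*$/ && $nparts =~ /^[1-9][0-9]*$/;

    return ('generate_chains', $file, $seed, $nchains, $nparts);
}

my @cmd = generate_chains_command('sequence.in', 123457, 100000, 2);
print join(' ', @cmd), "\n";

# To actually run the program, something like this could be used:
#   system(@cmd) == 0 or die "generate_chains failed: $?\n";
```

Passing the command to system() as a list rather than a single string
avoids shell quoting problems if the file name contains spaces.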
analyze_chains
This program takes as input the chain_1.dat and chain_2.dat files and
combines the chains following the algorithm described in the "method"
section of this manual. Ultimately, the total number of whole chains
analyzed will be (nchains^2) / nicalcs.
USAGE:
analyze_chains file nchains nparts nrad_stats nkeepers rcut acut tcut
Where: file       = DNA parameters file
       nchains    = number of half-chains to analyze
       nicalcs    = number of independent calculations
       nrad_stats = number of whole chains for which to keep radial stats
       nkeepers   = number of cyclized chains to store to disk
       rcut       = radial cutoff (in angstroms)
       acut       = axial cutoff (in degrees)
       tcut       = torsional cutoff (in degrees)
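The relation (nchains^2) / nicalcs given above can be worked through with a
short PERL sketch. The function names are hypothetical. The inverse
calculation assumes that the "whole_chains" and "icalcs" entries of the
cyclization parameters hash correspond to the total number of whole chains
and to nicalcs, which the text does not state explicitly.

```perl
use strict;
use warnings;

# Total number of whole chains analyzed, following the formula in the
# text: (nchains^2) / nicalcs.
sub whole_chains {
    my ($nchains, $nicalcs) = @_;
    return $nchains**2 / $nicalcs;
}

# Inverse of the formula: number of half-chains needed to reach a target
# number of whole chains (assumed use, not confirmed by the manual).
sub half_chains_needed {
    my ($whole_chains, $nicalcs) = @_;
    return sqrt($whole_chains * $nicalcs);
}

printf "whole chains analyzed: %g\n", whole_chains(1e5, 100);
printf "half-chains needed:    %g\n", half_chains_needed(1e8, 100);
```

Under these assumptions, the example values of 1e8 whole chains and 100
independent calculations correspond to 1e5 half-chains.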
4) Further analysis of the results
A number of additional programs/scripts are provided to facilitate
further analysis of the results after finishing with the MC step.
Jon Lapham
Yale University
Department of Chemistry - Crothers Lab