Menu

Tree [417539] master /
 History

HTTPS access


File Date Author Commit
 examples 2011-10-10 Fredrik Lysholm Fredrik Lysholm [7b11a4] Initial commit
 gen 2012-08-17 Fredrik Lysholm Fredrik Lysholm [e2b6c3] several bugfixes including fix of handling nflo...
 src 2012-08-21 Fredrik Lysholm Fredrik Lysholm [c59dc2] Added stable output
 AUTHOR 2011-10-19 Fredrik Lysholm Fredrik Lysholm [aa4f1e] added
 LICENSE 2011-10-10 Fredrik Lysholm Fredrik Lysholm [7b11a4] Initial commit
 README 2012-01-12 Fredrik Lysholm Fredrik Lysholm [654950] Newline added
 fragsim 2012-08-24 Fredrik Lysholm Fredrik Lysholm [417539] Fixed U/C notation
 makefile 2012-08-21 Fredrik Lysholm Fredrik Lysholm [c59dc2] Added stable output
 test454sim 2012-08-21 Fredrik Lysholm Fredrik Lysholm [c59dc2] Added stable output

Read Me

#
# INFORMATION about 454sim
#

454sim process standard FASTA either by command line argument or on standard in 
and will generate reads, one for each FASTA entry present in the file starting 
from the first base and until either the sequence ends or the simulated read 
ends (due to quality deteriorates or all flows have been spent).

As a consequence it is very useful to utilize a script which will generate 
fragments from a larger genome as is done in the preparation face with 454 
sequencing. We include a perl-script in all zip-files called “fragsim” 
which will produce fragments given an genome which could then be further 
feed through 454sim.

454sim takes a number of parameters visible via the --help flag:
-a   Processors to use count (default=8)
-n   Number of flows to simulate per sample (default=800)
-g   generation to simulate (available=GS,FLX,Ti, default=Ti)
-d   directory with generations (default=gen)
-o   output file (default=none specified)
-i   simulation info file (default=none written)


Typical run-line in a Linux environment:

./fragsim -c 1000000 -l 1000 genome.fasta | ./454sim -o genome.sff

or

./fragsim -c 1000000 -l 1000 genome.fasta > genome.fragments.fasta
./454sim -o genome.sff genome.fragments.fasta

or (compressing intermediate data)

./fragsim -c 1000000 -l 1000 genome.fasta | gzip > genome.fragments.fasta.gz
zcat genome.fragments.fasta.gz | ./454sim -o genome.sff

The above example will generation 1 million reads from genome.fasta and store 
the output in genome.sff (the first example will squash intermediate fragment 
output while the second/third will store the output in genome.fragments.fasta.)



#
# Generation files
#

The generation files are by default found in the gen folder. 454sim is shipped 
with a couple of generation files, for example ti.gen, which describes the the 
Roche Titantium 454 chemistry/instrument. A more detailed explanation of 
parameters and their values can be found in the included generations files, 
like the ti.gen file.


#
# Compiling
#

# gcc
make

# intel
make CC=intel

# gcc 32-bit (on 64-bit system, 32-bit is used by default on a 32-bit system)
make CC=gcc32



#
# Testing with test454sim
#

A test script is provided in order to perform simple testing of 454sim with 
default or custom parameters.

test454sim takes the following parameters:

-b <binary>          default = ./454sim
-d <output-details>  default = 2 [0 = simple, 1 = normal, 2 = detailed]
-f <fasta>           default = examples/example.fragments.fasta
-p <454sim-param>    default = ''                                               # these parameters are passed on to 454sim (use quotes)

-s <sff-out>         default = /tmp/454sim.test.sff
-n <info-out>        default = /tmp/454sim.test.sff.txt                         # this file is analysed


An example run just testing 454sim (on Linux with Perl in /usr/bin/perl):
./test454sim -d0

Detailed info of the GS chemistry with custom FASTA
./test454sim -d2 -f[my-fasta-file] -p'-gGS'

An example run testing 454sim (Windows version) using wine:
./test454sim -b'wine 454sim.exe'


Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.