INTEGRATE is a tool that calls gene fusions with exact fusion junctions and genomic breakpoints by combining RNA-Seq and WGS data. It achieves high sensitivity and accuracy by applying a fast split-read mapping algorithm based on the Burrows-Wheeler transform.
INTEGRATE can be downloaded at https://sourceforge.net/projects/integrate-fusion/files/
INTEGRATE_0_1c.tar.gz contains the source code.
test-data.tar.gz contains input data for a test case; it also contains Example.pdf which walks through the commands to run the test case.
Reference_Manual_0.1c_5_1_2014.pdf walks through download, compile, data preparation, command lines, and output.
annot.ucsc.txt contains the gene annotation of the UCSC gene track, downloaded from the UCSC Genome Browser http://genome.ucsc.edu/. You can use your own annotation or download other tracks from the UCSC Genome Browser; see [annotation].
For installation details, go to [Installation].
(1) Get annotation from UCSC Genome Browser http://genome.ucsc.edu
For details of downloading annotation, go to [annotation].
annot.ucsc.txt, which contains UCSC genes, can be downloaded at https://sourceforge.net/projects/integrate-fusion/files/.
(2) Get accepted_hits.bam and unmapped.bam
by mapping tumor RNA-Seq reads with tools like TopHat http://tophat.cbcb.umd.edu/.

Note: please use TopHat 2.
(3) Get dna.tumor.bam
by mapping WGS tumor reads with tools like BWA http://bio-bwa.sourceforge.net/.
Note: BWA turns soft-clipping on by default. Make sure soft-clipping is turned on if using other tools.
(4) If there are normal WGS reads, get dna.normal.bam
(5) Index all the BAM files with samtools index (refer to http://samtools.sourceforge.net/).
(6) Build BWTs
mkdir ./bwts
Integrate mkbwt reference.fasta
Note: This step takes about 20-30 minutes, but only needs to run once.
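Step (5) above can be sketched as a small shell helper. This is a minimal sketch, not part of INTEGRATE: the helper names `run` and `index_bams` and the `DRY_RUN` variable are hypothetical, and the BAM file names are the ones used in the steps above. Files that do not exist (for example dna.normal.bam when there are no normal WGS reads) are skipped.

```shell
#!/bin/sh
# Sketch: index every BAM file used by INTEGRATE with "samtools index".
# run, index_bams and DRY_RUN are illustrative names, not INTEGRATE features.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Index each BAM given as an argument; skip files that are not present,
# unless we are only printing the commands.
index_bams() {
    for bam in "$@"; do
        if [ -f "$bam" ] || [ "${DRY_RUN:-0}" = "1" ]; then
            run samtools index "$bam"
        fi
    done
}

# File names follow the steps above; adjust the paths to your own layout.
index_bams accepted_hits.bam unmapped.bam dna.tumor.bam dna.normal.bam
```

Setting DRY_RUN=1 prints the samtools commands instead of running them, which is a cheap way to check the file list first.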
Integrate fusion (options) reference.fasta annotation.txt directory_to_bwt accepted_hits.bam unmapped.bam (dna.tumor.bam dna.normal.bam)
For details of options, go to [options]
For details of output files, go to [output]
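As an illustration of the fusion command line above, the following sketch checks that the input files exist and prints the command with the positional arguments in the documented order. The helper names `check_inputs` and `fusion_cmd` are hypothetical, no options are passed (see [options] for those), and the file names are the ones used in the preparation steps.

```shell
#!/bin/sh
# Sketch: validate inputs, then print an "Integrate fusion" command line.
# check_inputs and fusion_cmd are illustrative helpers, not part of INTEGRATE.

# Report every missing path on stderr; return non-zero if any is missing.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Print the command: five required arguments, then the optional DNA BAMs.
fusion_cmd() {
    if [ "$#" -lt 5 ]; then
        echo "need: reference annotation bwt_dir accepted_hits.bam unmapped.bam [dna BAMs]" >&2
        return 1
    fi
    echo "Integrate fusion $*"
}

# Example with tumor and normal WGS BAMs; drop the last two if unavailable.
fusion_cmd reference.fasta annot.ucsc.txt ./bwts \
    accepted_hits.bam unmapped.bam dna.tumor.bam dna.normal.bam
```

A typical use would be to call check_inputs on the same arguments first and run the printed command only when it succeeds.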
OCT 09 2014 INTEGRATE version 0.1e
Fixed an efficiency issue in loading transcripts. Fixed an issue that caused a failure to designate output files.
MAY 28 2015 INTEGRATE version 0.2.0
Supports BAMs from multiple read alignment tools; see [alignment-tools]. More output formats added; see [output].
JUL 14 2016 INTEGRATE version 0.2.5
a. Format of annotation file updated.
For details of downloading annotation file, go to [annotation].
An example annotation file, annot.ensembl.txt, containing Ensembl genes in this updated format can be downloaded at https://sourceforge.net/projects/integrate-fusion/files/. It was created using the UCSC Genome Browser. [annotation] also describes an alternative way of creating an annotation file from a GTF file.
b. A parameter -minDel is added for WGS reads or WES reads.
WES reads can be used to get some SVs if WGS reads are not available. However, due to the high coverage of WES reads, some false-positive (FP) encompassing DNA reads may be introduced when using previous versions. This parameter was added to address this.
c. More robust code for CMake compilation.
This addresses compilation issues that some users reported when compiling gtest and samtools on certain systems.
d. Updated bedpe format for gene fusions.
This updated version of bedpe supports the DREAM SMC-RNA challenge.
AUG 26 2016 INTEGRATE version 0.2.6
Fixed an error introduced in 0.2.5 that incorrectly output junctions at intron |--- exon.
A detailed walkthrough of an example is at [example].
Wiki: Installation
Wiki: alignment-tools
Wiki: annotation
Wiki: example
Wiki: options
Wiki: output