LoFreq Activity

Fast and sensitive variant-calling from sequencing data

Brought to you by: nnnagara, onde

3 years ago
Luca Mologni posted a comment on discussion General Discussion

Hi all, I am analyzing amplicon deep-seq data, going for rare variants. While I've always run LoFreqFilter with default strand bias filtering (multiple-testing correction, FDR), now I would like to filter on a specific SB threshold. Can you advise on a reasonable value? Should I consider only variants with SB=0 as true variants? For example, how do you see this: DP=12219;AF=0.009657;SB=5;DP4=822,11273,11,107 Thanks, Luca
4 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Catherine, most variant callers produce rather arbitrary variant scores. LoFreq's variant quality scores are "proper" error probabilities converted into a Phred score. The error probabilities are computed using a Poisson-binomial distribution, which takes multiple quality scores (mapping quality, alignment quality, base quality) into account. If you look up the definition of Phred scores you will see that Q20 corresponds to an error probability of 0.01, Q30 to 0.001 etc. 49314 is simply the...
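The Phred relationship mentioned in this reply can be checked with a few lines of Python. This is only a sketch of the standard definition Q = -10 * log10(p); the function names are made up for illustration and are not part of LoFreq.

```python
import math

def phred_to_prob(q):
    """Convert a Phred-scaled quality into an error probability."""
    return 10 ** (-q / 10)

def prob_to_phred(p):
    """Convert an error probability (0 < p <= 1) into a Phred-scaled quality."""
    return -10 * math.log10(p)

print(phred_to_prob(20))      # about 0.01
print(phred_to_prob(30))      # about 0.001
print(prob_to_phred(0.0001))  # about 40
```

By the same formula, a QUAL value in the tens of thousands corresponds to an error probability far too small to represent as a double-precision number, which is why such values are best compared on the Phred scale rather than converted back.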
4 years ago
Catherine Arnold posted a comment on discussion General Discussion

Hi, I am using LoFreq in combination with another SNV caller. The other SNV caller lists quality scores as Q20, Q30, etc. My LoFreq output is giving a number string with the highest being 49314 in the QUAL column. How is this score calculated and how does it compare to a Phred score call like Q20?
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

Dear Eugenia, thanks for your patience while waiting for a reply. Source quality was a rather experimental attempt to add one more error source to LoFreq's core: it tries to account for contamination/mismappings etc. by looking at the number of mismatches in a read (think of it as a variation of mapping quality). An accumulation of mismatches in a particular read leads to a penalty. However, you will want to ignore known variants during the mismatch counting, and for this you can for example use...
7 years ago
Eugenia Zarza posted a comment on discussion General Discussion

Hi, I would like to know what 'Source quality' means, and how the -s and -S options affect its computation. I'm trying to call human variants, including indels. As suggested in the online documentation, I would like to use dbSNP, however NCBI holds several databases and I'm not sure which one to use. I hope that understanding what 'source quality' means, will help me decide what is the most suitable database for my current problem. Thank you, Eugenia
7 years ago
Camilo posted a comment on discussion General Discussion

Thanks for the answers.
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

That's only indirectly possible. You can run it with all filters off on the region of interest: lofreq call -r sq:start-end --no-default-filter -a 1 Andreas On 23 May 2018 at 11:53, Camilo cvillaman@users.sourceforge.net wrote: Thanks for the answers, they are very helpful. I have a final question, though. Is there a way to check why a possible variant is not being called by LoFreq? Source quality and ignore VCF in single tumor sample. https://sourceforge.net/p/lofreq/discussion/general/thread/cdeddc89/?limit=25#e4b0/2978/3285/2592/75a0...
7 years ago
Camilo posted a comment on discussion General Discussion

Thanks for the answers, they are very helpful. I have a final question, though. Is there a way to check why a possible variant is not being called by LoFreq?
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

Oh I see. In general, using source quality will give you more conservative calls. There is a chance that it will undercall in mutational hotspots. Variants in the "ignore vcf" file are just used to tune the source quality computation. Normally reads with lots of variants get a low source quality, however, variants listed in the aforementioned file are ignored for this. These variants are not used to mask final calls! Hope this answers the question, Andreas On 22 May 2018 at 23:14, Camilo cvillaman@users.sourceforge.net...
7 years ago
Camilo posted a comment on discussion General Discussion

I'm running lofreq call to call the variants, not lofreq somatic, and since I'm using human samples, according to the online documentation I should be using -s (source quality) in combination with -S.
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Camilo, -S won't mask variants. It just affects the somatic variant quality score. In fact, adding dbSNP here should have increased the quality of this call. What happens if you run it without the extra -S? Also, there is no (lowercase) '-s' option. Was that a typo? Best, Andreas On 18 May 2018 at 23:13, Camilo cvillaman@users.sourceforge.net wrote: Hello, I'm using LoFreq to call variants on some human tumor samples. We had analyzed those samples beforehand, so I had an idea about which variants...
7 years ago
Camilo posted a comment on discussion General Discussion

Hello, I'm using LoFreq to call variants on some human tumor samples. We had analyzed those samples beforehand, so I had an idea about which variants should be called. The samples had some variants reported in dbSNP, and according to the recommendations on the home page I decided to enable the -s flag and use -S with a dbSNP VCF file. However, those variants weren't being called. So, I've been wanting to ask: Does the -S option mask/remove the variants at the positions in the file?
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Steve, for duplicate marking (if needed) you can use any tool of your choice, e.g. sambamba. For realignment you can use LoFreq's own realigner lofreq viterbi (requires resorting afterwards). For base quality calibration you can still use GATK or alternatively Lacer https://www.biorxiv.org/content/early/2017/04/25/130732. You should get decent results even without recalibration. Best, Andreas On 16 May 2018 at 22:04, Steve stevekm@users.sourceforge.net wrote: In the documentation for LoFreq,...
7 years ago
Steve posted a comment on discussion General Discussion

In the documentation for LoFreq, it is suggested: For Illumina data, we suggest that you preprocess your BAM files by following GATK’s best practice protocol, i.e. that you mark duplicates (not for very high coverage data though), realign indels and recalibrate base qualities with GATK (BQSR). The latter will also add indel qualities, which is needed for indel calling (alternatively use lofreq indelqual). However, GATK has upgraded to version 4, and has dropped many of these tools since they've been...
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Steve, yes, these variants are not filtered, even though if you just look at the p-value/quality, they should be. The reason is that strand-bias is a messy beast and we use some hacks: No one really knows why it happens (AFAIK). In viral amplicon data (for which LoFreq was originally designed) we often saw cases where simply due to the ultra high coverage, you'd get very high p-values even though nothing seems wrong with these variants if you were to evaluate them by eye (plenty of coverage for...
7 years ago
Steve posted a comment on discussion General Discussion

Thanks Andreas. a significance threshold of 0.01 I was looking in the source code and saw here: https://github.com/CSB5/lofreq/blob/master/src/lofreq/lofreq_filter.c#L1093 if (! no_defaults) { if (cfg.sb_filter.mtc_type==MTC_NONE && ! cfg.sb_filter.thresh) { LOG_VERBOSE("%s\n", "Setting default SB filtering method to FDR"); cfg.sb_filter.mtc_type = MTC_FDR; cfg.sb_filter.alpha = 0.001; } Does this mean that the default Strand Bias filter is at a p-value of 0.001? (cfg.sb_filter.alpha = 0.001) As...
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Steve, sure. The basics are explained in the NAR paper (Wilm, 2012): We compute a Poisson-binomial distribution taking error probabilities at each pileup site into consideration and derive a p-value from that. Error probabilities were originally just converted base qualities (because that's what they are). In later LoFreq versions we merged base alignment, mapping and base quality into one error probability per base. The logic goes like this: either the read is misaligned (mapping quality) or...
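To make the Poisson-binomial idea in this reply concrete, here is a minimal sketch of a tail probability computed by straightforward dynamic programming. It assumes independent per-base error probabilities and is not LoFreq's actual (optimized) implementation; the function name and the toy pileup are hypothetical.

```python
def poisson_binomial_tail(error_probs, k):
    """Return P(X >= k), where X counts errors among independent bases
    and base i is wrong with probability error_probs[i]."""
    # dist[j] holds P(exactly j errors) among the bases processed so far.
    dist = [1.0]
    for p in error_probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1.0 - p)   # this base is correct
            nxt[j + 1] += mass * p       # this base is an error
        dist = nxt
    return sum(dist[k:])

# Hypothetical pileup column: 1000 bases at Q30, 12 of them non-reference.
probs = [0.001] * 1000
print(poisson_binomial_tail(probs, 12))
```

The returned value plays the role of a p-value: the probability of seeing at least k non-reference bases if all of them were errors. The quadratic running time is fine for a toy column but would be too slow for ultra-deep data without further tricks.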
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Steve, the strand-bias p-value is turned into a Phred quality, whose upper bound depends on the precision of the float. In practice it can get much higher than 1900. The fact that you see Phred values <60 in other programs is simply because they are mostly arbitrarily capped there. Andreas On 4 May 2018 at 03:50, Steve stevekm@users.sourceforge.net wrote: I have another question about the SB score values from the .vcf output. It is my understanding that these values are Phred quality scores, which...
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Steve, not sure why the actual quality filtering is not mentioned there. Let me look into this. Anyway, the main filtering step works on the variant qualities (which are converted p-values) and is by default based on Bonferroni correction and a significance threshold of 0.01. Best, Andreas On 4 May 2018 at 07:12, Steve stevekm@users.sourceforge.net wrote: The FAQ page for LoFreq says Do I need to filter LoFreq predictions? You usually don't. Predicted variants are already filtered using...
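As a rough illustration of what a Bonferroni-corrected threshold means on the Phred scale, the sketch below divides the significance level by a number of tests and converts the result to a minimum quality. The threshold of 0.01 comes from the reply above; how LoFreq counts the number of tests is not stated here, so `num_tests` is purely an assumed example value and the function name is hypothetical.

```python
import math

def bonferroni_min_phred(alpha, num_tests):
    """Smallest Phred-scaled quality that is still significant at level
    alpha after Bonferroni correction for num_tests tests."""
    corrected_alpha = alpha / num_tests
    return -10 * math.log10(corrected_alpha)

# alpha = 0.01 is from the post; 30000 tests is a made-up illustration value.
print(bonferroni_min_phred(0.01, 30000))
```

With a single test the cutoff is simply Q20; every tenfold increase in the number of tests raises the required quality by 10 Phred units.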
7 years ago
Steve posted a comment on discussion General Discussion

The FAQ page for LoFreq says Do I need to filter LoFreq predictions? You usually don't. Predicted variants are already filtered using default parameters (which include coverage, strand-bias, snv-quality etc). However, I do not see any details about what these default filtering parameters are. Is there a description anywhere? When I try to run lofreq filter --verbose, the only output I get is: Setting default SB filtering method to FDR Setting default minimum coverage to 10 What other criteria are...
7 years ago
Steve modified a comment on discussion General Discussion

I have another question about the SB score values from the .vcf output. It is my understanding that these values are Phred quality scores, which usually are in the range of 0 - 50. However, I am getting many with values of 500 - 1900. Is this expected? And if SB=0 means no strand bias, then does this mean that these regions are extremely strand biased? Also, in this thread you state: 2147483647: This corresponds to a p-value close to zero, i.e. a highly significant SNV. What is the meaning of 2147483647...
7 years ago
Steve posted a comment on discussion General Discussion

I have another question about the SB score values from the .vcf output. It is my understanding that these values are Phred quality scores, which usually are in the range of 0 - 50. However, I am getting many with values of 500 - 1900. Is this expected? And if SB=0 means no strand bias, then does this mean that these regions are extremely strand biased?
7 years ago
Steve modified a comment on discussion General Discussion

"As sources of errors, it takes base-qualities, mapping qualities etc into account." Thanks for this. However I was wondering if there was a more thorough explanation of each of the values that are used in calculation of the 'QUAL' score values that are output in the VCF? I did not see it covered in the publication (maybe I missed it?) and wasn't able to figure out what was going on in the source code.
7 years ago
Steve posted a comment on discussion General Discussion

"As sources of errors, it takes base-qualities, mapping qualities etc into account." Thanks for this. However I was wondering if there was a more thorough explanation of each of the values that are used in calculation of the 'QUAL' score values that are output in the VCF?
7 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Francisco, LoFreq doesn't have an AF filter. The default filter is based on variant quality only. Furthermore, it doesn't actually report genotypes. Taken together this makes it likely that your collaborator post-processed the vcf file somehow. Hope this helps, Andreas On 24 March 2018 at 13:45, Francisco De La Vega ribozyme@users.sourceforge.net wrote: I have received a VCF from LoFreq from a collaborator that used it for calling SNVs from a cfDNA targeted sequencing assay at a high depth of coverage....
7 years ago
Francisco De La Vega posted a comment on discussion General Discussion

I have received a VCF from LoFreq from a collaborator that used it for calling SNVs from a cfDNA targeted sequencing assay at a high depth of coverage. They developed scripts to use UMIs in the adapters to error correct the aligned reads and then produce a BAM file to feed to LoFreq. The aim is to detect somatic variants in the range of 0.5-2% VAF. However, it appears LoFreq is not adding the PASS filter tag to variants under ~2% VAF. Further, since these variants are not passed, the genotypes are reported...
8 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Nils, in short: the BAM file was created with a different reference. The checkref subcommand checks whether the reference fasta given on the command line matches the one given in the BAM header. In your case the BAM header contains a sequence named "1", which is not part of the fasta file. Hope this helps, Andreas On 13 November 2017 at 06:05, Nils Engel nils321@users.sf.net wrote: Hi, I have a problem using lofreq with human sequencing data and hg19 or GRCh38 reference sequences ( downloaded...
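The kind of mismatch described in this reply can be spotted by comparing sequence names by hand. The sketch below is not how `lofreq checkref` works internally; it only compares the `SN:` names from a SAM/BAM text header (for instance as printed by `samtools view -H`) with the first word of each FASTA header line. All function names and the toy inputs are hypothetical.

```python
def bam_header_names(sam_header_text):
    """Collect sequence names from the @SQ lines of a SAM/BAM text header."""
    names = []
    for line in sam_header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names

def fasta_names(fasta_text):
    """Collect sequence names (first word after '>') from FASTA text."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.append(parts[0])
    return names

def missing_in_fasta(sam_header_text, fasta_text):
    """Return BAM header sequence names that the FASTA does not contain."""
    known = set(fasta_names(fasta_text))
    return [n for n in bam_header_names(sam_header_text) if n not in known]

# Toy inputs: BAM mapped against "1", "2"; FASTA uses accession-style names.
header = "@HD\tVN:1.6\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:2\tLN:900\n"
fasta = ">NC_000001.11 chromosome 1\nACGT\n>NC_000002.12 chromosome 2\nACGT\n"
print(missing_in_fasta(header, fasta))
```

Any name printed here is a sequence the aligner used that the given FASTA cannot provide, which is the situation behind the "sequence not found" error quoted in the thread.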
8 years ago
Nils Engel posted a comment on discussion General Discussion

Hi, I have a problem using lofreq with human sequencing data and hg19 or GRCh38 reference sequences (downloaded from NCBI with manually changed file extension .fna -> .fa). I guess it might be a problem with improper file format or index. I get an output as follows: nils321$ lofreq checkref GRCh38_latest_genomic.fa 1214474-H8.bam [fai_load] build FASTA index. [fai_fetch_seq] The sequence "1" not found FATAL(samutils.c|checkref:653): Failed to fetch sequence 1 from fasta file Failed A fasta index...
8 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hello, DP4 only lists the reference and variant base counts. There are usually other bases present as well, which are taken into account for computing AF. Hoping this explains the discrepancy, Andreas On 26 October 2017 at 04:58, siva siva80@users.sf.net wrote: Hi I have several variants (especially those with almost hom-alt allele) that have different allele fraction estimates from DP4 and the AF= tag. for example DP=4088;AF=0.872798;SB=171;DP4=9,33,3329,685 Here from DP4, the AF can be estimated...
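The numbers in this thread can be pulled apart with a short script. This sketch only parses the INFO string quoted in the question and prints the different ratios; it does not reproduce how LoFreq computes AF, and the helper name `parse_info` is made up for illustration.

```python
def parse_info(info):
    """Split a VCF INFO string such as 'DP=1;AF=0.5' into a dict of strings."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields

info = parse_info("DP=4088;AF=0.872798;SB=171;DP4=9,33,3329,685")
ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in info["DP4"].split(","))
depth = int(info["DP"])
ref = ref_fwd + ref_rev
alt = alt_fwd + alt_rev

print(alt / (ref + alt))   # alt share among ref + alt bases only
print(alt / depth)         # alt share of total depth, about 0.9819
print(depth - ref - alt)   # bases counted in DP but absent from DP4
print(float(info["AF"]))   # AF as reported in the INFO field
```

The third value shows that DP is larger than the sum of the four DP4 counts, which is consistent with the point that DP4 lists only reference and variant base counts.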
8 years ago
siva posted a comment on discussion General Discussion

Hi I have several variants (especially those with almost hom-alt allele) that have different allele fraction estimates from DP4 and the AF= tag. for example DP=4088;AF=0.872798;SB=171;DP4=9,33,3329,685 Here from DP4, the AF can be estimated to be about 0.98189 which is very different from what is published in the AF= tag. Could you please explain?
8 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hello, strand-bias is defined as in samtools: reference and alternate base counts on forward and reverse strand are used as input for Fisher's exact test. This tries to quantify how far the reference and alternate counts on forward and reverse strand differ, i.e. you'll get high p-values if you have lots of reference bases on one and lots of alternate bases on the other strand. It does not test, however, whether both reference and alternate bases are mainly on the same strand. I hope this explanation...
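For readers who want to experiment with DP4 values themselves, here is a minimal two-sided Fisher's exact test on the four counts, followed by a Phred conversion. It is a plain-Python sketch of the textbook test, not LoFreq's or samtools' code; the function names are hypothetical, and the cap at 1e-300 before taking the logarithm is an arbitrary choice made here to avoid log(0).

```python
import math

def log_hypergeom(a, b, c, d):
    """Log-probability of the 2x2 table [[a, b], [c, d]] with fixed margins."""
    lg = math.lgamma
    return (lg(a + b + 1) + lg(c + d + 1) + lg(a + c + 1) + lg(b + d + 1)
            - lg(a + 1) - lg(b + 1) - lg(c + 1) - lg(d + 1)
            - lg(a + b + c + d + 1))

def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test p-value for DP4-style counts."""
    row1 = ref_fwd + ref_rev          # all reference bases
    row2 = alt_fwd + alt_rev          # all alternate bases
    col1 = ref_fwd + alt_fwd          # all forward-strand bases
    observed = log_hypergeom(ref_fwd, ref_rev, alt_fwd, alt_rev)
    total = 0.0
    # Enumerate every table with the same margins; sum those that are
    # no more likely than the observed one.
    for a in range(max(0, col1 - row2), min(row1, col1) + 1):
        logp = log_hypergeom(a, row1 - a, col1 - a, row2 - col1 + a)
        if logp <= observed + 1e-7:
            total += math.exp(logp)
    return min(1.0, total)

def to_phred(p):
    """Phred-scale a p-value, capping tiny values to avoid log(0)."""
    return -10 * math.log10(max(p, 1e-300))

p = fisher_exact_two_sided(822, 11273, 11, 107)  # DP4 from a post in this feed
print(p, round(to_phred(p)))
```

A table where reference and alternate bases are split across strands in the same proportion gives a p-value near 1 (Phred near 0), while a table with all reference bases on one strand and all alternate bases on the other gives a very small p-value and hence a large Phred-scaled score.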
8 years ago
Kiril Dimitrov posted a comment on discussion General Discussion

Hello, we have analyzed some viral genomes where the strand bias has been estimated as zero. In these results, we have noticed that when the count is zero for the forward or the reverse strand carrying the alternate base, SB=0. Is it that in most cases when the count on one of the alternate strands is zero, SB will be zero (implying no bias) when there actually is bias just by looking at the DP4 data? And maybe such results should not be considered at all? And then is the last example, where...
9 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Erik, hard to tell from this output. Might be because of strand bias. Could you...
9 years ago
Erik Reckase posted a comment on discussion General Discussion

Can someone tell me why a call was not made at this location? lofreq call -f /var/www/hg19.fa...
10 years ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Jessica, these are SNVs that show significant strand bias (sb) and are therefore...
10 years ago
jessica preston posted a comment on discussion General Discussion

Hello, I'm sorry but I can't seem to find this information in the manual. Can you...
1 decade ago
LoFreq released /lofreq_star-2.1.2_macosx.tgz
1 decade ago
LoFreq released /lofreq_star-2.1.2_linux-x86-64.tgz
1 decade ago
LoFreq released /lofreq_star-2.1.2.tar.gz
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Erik, when you switch off default filtering in the call subcommand[s] LoFreq will...
1 decade ago
Erik Reckase posted a comment on discussion General Discussion

I have a dataset that I am processing with the --bed flag set to a list of mutations...
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Sorry, I know what's happening: the filters will only affect the actual SNV calling...
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi gmy, that is indeed a bit strange. Which exact LoFreq version are you using? Would...
1 decade ago
gmy posted a comment on discussion General Discussion

Hi, Andreas Sorry for late reply. The corresponding output of lofreq is : gi|57116681|ref|NC_000962.2|...
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi gmy, I would strongly encourage you to stick to default parameters in LoFreq,...
1 decade ago
gmy posted a comment on discussion General Discussion

Hi, I want to filter bases with quality below 20. And I use command like this "lofreq...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

Yes actually I went through the results back and forth and it seems I do not have...
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Chris, LoFreq results are already filtered relatively stringently (1% p-value threshold...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

Hi Andreas, I have used your LoFreq on my normal/tumor pair to retrieve INDELS....
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hey Chris, if you get a final vcf file, then there is no need to rerun LoFreq. Whether...
1 decade ago
chris.cornor modified a comment on discussion General Discussion

Hi Andreas, Thank you very much for the reply. I would like to let you know that...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

Hi Andreas, Thank you very much for the reply. I would like to let you know that...
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hey Chris, yes, the file somatic_final_minus-dbsnp.snvs.vcf.gz is not there, because...
1 decade ago
chris.cornor modified a comment on discussion General Discussion

Hi Andreas, Thank you for your reply. I would like to say that I had a successful...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

Hi Andreas, Thank you for your reply. I would like to say that I had a successful...
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Chris, When you call somatic SNVs then you only need to look at the file that...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

Also I carefully noticed some log comments where I see these comments WARNING [2015-01-15...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

I am looking at 3 different outputs, 2 are of snvs (one is relaxed and the other is stringent)...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

Yes the tool is working now, I modified the bed file of the company to a general...
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Oh ok. That looks like an extension of the bed format. LoFreq (and samtools) expect...
1 decade ago
chris.cornor modified a comment on discussion General Discussion

The format of the bed file looks like this head -5 S03723314_Covered.bed browser...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

The format of the bed file looks like this head -5 S03723314_Covered.bed browser...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

I have used the S03723314_Covered.bed file which you can download from the agilent...
1 decade ago
chris.cornor posted a comment on discussion General Discussion

I am not sure how I can share the bed file. Can I host it anywhere? It's almost...
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Chris, this looks like an unhandled error triggered in the bed reading function....
1 decade ago
chris.cornor posted a comment on discussion General Discussion

Hi Andreas, I am trying to use lofreq for somatic indel calling using the somatic...
1 decade ago
LoFreq released /lofreq_star-2.1.1_macosx.tgz
1 decade ago
LoFreq released /lofreq_star-2.1.1_linux-x86-64.tgz
1 decade ago
LoFreq released /lofreq_star-2.1.1.tar.gz
1 decade ago
Andreas Wilm modified a wiki page

Home
1 decade ago
Andreas Wilm created a blog post

Moved website and blog to github: http://csb5.github.io/lofreq/
1 decade ago
Andreas Wilm modified a blog post

Release LoFreq 2.1
1 decade ago
Andreas Wilm modified a blog post

Release LoFreq 2.1
1 decade ago
Andreas Wilm modified a blog post

Release LoFreq 2.1
1 decade ago
Andreas Wilm created a blog post

Release LoFreq 2.1
1 decade ago
Andreas Wilm renamed a blog post

Release LoFreq 2.1
1 decade ago
LoFreq released /lofreq_star-2.1.0.tar.gz
1 decade ago
LoFreq released /lofreq_star-2.1.0_linux-x86-64.tgz
1 decade ago
LoFreq released /lofreq_star-2.1.0_macosx.tgz
1 decade ago
jessica preston posted a comment on discussion General Discussion

OK great, thanks!
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Jessica, the strand-bias test checks whether the proportion of bases on forward...
1 decade ago
jessica preston posted a comment on discussion General Discussion

Hi, I am running Lofreq on data that has been run through the program SeqPrep, which...
1 decade ago
Andreas Wilm created a blog post

LoFreq as Docker container
1 decade ago
Andreas Wilm created a blog post

Alpha testers for release 2.1 needed
1 decade ago
Andreas Wilm modified a wiki page

LoFreq-Star-Best-Practices
1 decade ago
Andreas Wilm created a blog post

Performance issues when using bed-file with many regions
1 decade ago
Joon posted a comment on discussion General Discussion

Thanks, Andreas. I will try again with the suggested argument. Joon
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Joon There are at least two things going on here. One of the errors seems to come...
1 decade ago
Joon posted a comment on discussion General Discussion

Hi, While running LoFreq, I've got the following error. /cm/local/apps/sge/var/spool/usnee1-lph001-n062/job_scripts/6960547:...
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Brian, thanks for pointing this out! Parsing of the "--cons-as-ref" option was...
1 decade ago
Brian Ondov posted a comment on discussion General Discussion

It actually seems to have to do with the --cons-as-ref option. Without that, it works....
1 decade ago
Andreas Wilm posted a comment on discussion General Discussion

Hi Brian, this is very likely caused by an error in the argument list, i.e. wrong...
1 decade ago
Brian Ondov posted a comment on discussion General Discussion

Thanks for posting the release. Version 2.0.0 (Linux) does get past the previous...
1 decade ago
Andreas Wilm modified a wiki page

LoFreq-Star-Usage
1 decade ago
Andreas Wilm modified a wiki page

LoFreq-Star-Usage
1 decade ago
Andreas Wilm modified a wiki page

LoFreq-Star-Best-Practices
1 decade ago
Andreas Wilm modified a wiki page

LoFreq-Star-Installation
1 decade ago
Andreas Wilm modified a wiki page

LoFreq-Star-Installation
1 decade ago
Andreas Wilm modified a wiki page

LoFreq-Star-Installation
1 decade ago
Andreas Wilm created a blog post

Release of final 2.0.0