Activity for LoFreq

  • Luca Mologni posted a comment on discussion General Discussion

    Hi all, I am analyzing amplicon deep-seq data, going for rare variants. While I've always run LoFreqFilter with default strand bias filtering (multiple-testing correction, FDR), now I would like to filter on a specific SB threshold. Can you advise on a reasonable value? Should I consider only variants with SB=0 as true? For example, how do you see this: DP=12219;AF=0.009657;SB=5;DP4=822,11273,11,107 Thanks, Luca

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Catherine, most variant callers produce rather arbitrary variant scores. LoFreq's variant quality scores are "proper" error probabilities converted into a Phred score. The error probabilities are computed using a Poisson binomial distribution, which takes multiple quality scores (mapping quality, alignment quality, base quality) into account. If you look up the definition of Phred scores you will see that Q20 corresponds to an error probability of 0.01, Q30 to 0.001 etc. 49314 is simply the...
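
The Phred relationship mentioned in this reply (Q = -10 * log10(p)) can be sketched in a few lines of Python. This is only an illustration; the function names are made up and are not part of LoFreq. For very large values such as 49314 the probability itself is too small for a double, so the sketch also returns the base-10 exponent.

```python
import math

def phred_to_prob(q):
    """Error probability for a Phred score: p = 10^(-Q/10)."""
    return 10 ** (-q / 10.0)

def prob_to_phred(p):
    """Phred score for an error probability: Q = -10 * log10(p)."""
    return -10.0 * math.log10(p)

def phred_to_log10_prob(q):
    """log10 of the error probability; usable for huge Q where p underflows."""
    return -q / 10.0

print(phred_to_prob(20))            # error probability for Q20
print(phred_to_prob(30))            # error probability for Q30
print(phred_to_log10_prob(49314))   # exponent of the error probability for QUAL=49314
```

Read this way, a QUAL of 49314 corresponds to an error probability of roughly 10 to the power of -4931.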

  • Catherine Arnold posted a comment on discussion General Discussion

    Hi, I am using LoFreq in combination with another SNV caller. The other SNV caller lists quality scores as Q20, Q30, etc. My LoFreq output is giving a number string with the highest being 49314 in the QUAL column. How is this score calculated and how does it compare to a Phred score call like Q20?

  • Andreas Wilm posted a comment on discussion General Discussion

    Dear Eugenia, thanks for your patience while waiting for a reply. Source quality was a rather experimental attempt to add one more error source to LoFreq's core: it tries to account for contamination/mismappings etc. by looking at the number of mismatches in a read (think of it as a variation of mapping quality). An accumulation of mismatches in a particular read leads to a penalty. However, you will want to ignore known variants during the mismatch counting, and for this you can for example use...

  • Eugenia Zarza posted a comment on discussion General Discussion

    Hi, I would like to know what 'Source quality' means, and how the -s and -S options affect its computation. I'm trying to call human variants, including indels. As suggested in the online documentation, I would like to use dbSNP; however, NCBI holds several databases and I'm not sure which one to use. I hope that understanding what 'source quality' means will help me decide which database is most suitable for my current problem. Thank you, Eugenia

  • Camilo posted a comment on discussion General Discussion

    Thanks for the answers.

  • Andreas Wilm posted a comment on discussion General Discussion

    That's only indirectly possible. You can run it with all filters off on the region of interest: lofreq call -r sq:start-end --no-default-filter -a 1 Andreas On 23 May 2018 at 11:53, Camilo cvillaman@users.sourceforge.net wrote: Thanks for the answers, they are very helpful. I have a final question, though. Is there a way to check why a possible variant is not being called by LoFreq? Source quality and ignore VCF in single tumor sample. https://sourceforge.net/p/lofreq/discussion/general/thread/cdeddc89/?limit=25#e4b0/2978/3285/2592/75a0...

  • Camilo posted a comment on discussion General Discussion

    Thanks for the answers, they are very helpful. I have a final question, though. Is there a way to check why a possible variant is not being called by LoFreq?

  • Andreas Wilm posted a comment on discussion General Discussion

    Oh I see. In general, using source quality will give you more conservative calls. There is a chance that it will undercall in mutational hotspots. Variants in the "ignore vcf" file are just used to tune the source quality computation. Normally reads with lots of variants get a low source quality, however, variants listed in the aforementioned file are ignored for this. These variants are not used to mask final calls! Hope this answers the question, Andreas On 22 May 2018 at 23:14, Camilo cvillaman@users.sourceforge.net...

  • Camilo posted a comment on discussion General Discussion

    I'm running lofreq call to call the variants, not lofreq somatic, and since I'm using human samples, according to the online documentation I should be using -s (source quality) in combination with -S.

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Camilo, -S won't mask variants. It just affects the somatic variant quality score. In fact, adding dbSNP here should have increased the quality of this call. What happens if you run it without the extra -S? Also, there is no (lowercase) '-s' option. Was that a typo? Best, Andreas On 18 May 2018 at 23:13, Camilo cvillaman@users.sourceforge.net wrote: Hello, I'm using LoFreq to call variants on some human tumor samples. We had analyzed those samples beforehand, so I had an idea about which variants...

  • Camilo posted a comment on discussion General Discussion

    Hello, I'm using LoFreq to call variants on some human tumor samples. We had analyzed those samples beforehand, so I had an idea about which variants should be called. The samples had some variants reported in dbSNP and, according to the recommendations on the home page, I decided to enable the -s flag and use -S with a dbSNP VCF file. However, those variants weren't being called. So, I've been wanting to ask: Does the -S option mask/remove the variants at the positions in the file?

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Steve, for duplicate marking (if needed) you can use any tool of your choice, e.g. sambamba. For realignment you can use LoFreq's own realigner, lofreq viterbi (requires re-sorting afterwards). For base quality calibration you can still use GATK or alternatively Lacer https://www.biorxiv.org/content/early/2017/04/25/130732. You should get decent results even without recalibration. Best, Andreas On 16 May 2018 at 22:04, Steve stevekm@users.sourceforge.net wrote: In the documentation for LoFreq,...

  • Steve posted a comment on discussion General Discussion

    In the documentation for LoFreq, it is suggested: For Illumina data, we suggest that you preprocess your BAM files by following GATK’s best practice protocol, i.e. that you mark duplicates (not for very high coverage data though), realign indels and recalibrate base qualities with GATK (BQSR). The latter will also add indel qualities, which is needed for indel calling (alternatively use lofreq indelqual). However, GATK has upgraded to version 4, and has dropped many of these tools since they've been...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Steve, yes, these variants are not filtered, even though if you just look at the pvalue/quality, they should be. The reason is that strand-bias is a messy beast and we use some hacks: No one really knows why it happens (AFAIK). In viral amplicon data (for which LoFreq was originally designed) we often saw cases where, simply due to the ultra high coverage, you'd get highly significant p-values even though nothing seems wrong with these variants if you were to evaluate them by eye (plenty of coverage for...

  • Steve posted a comment on discussion General Discussion

    Thanks Andreas. "a significance threshold of 0.01" I was looking in the source code and saw here: https://github.com/CSB5/lofreq/blob/master/src/lofreq/lofreq_filter.c#L1093

        if (! no_defaults) {
            if (cfg.sb_filter.mtc_type==MTC_NONE && ! cfg.sb_filter.thresh) {
                LOG_VERBOSE("%s\n", "Setting default SB filtering method to FDR");
                cfg.sb_filter.mtc_type = MTC_FDR;
                cfg.sb_filter.alpha = 0.001;
            }

    Does this mean that the default Strand Bias filter is at a p-value of 0.001? (cfg.sb_filter.alpha = 0.001) As...
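
To illustrate what an FDR-based filter with alpha = 0.001 could look like, here is a minimal Python sketch of the Benjamini-Hochberg procedure applied to strand-bias p-values. Benjamini-Hochberg is one common FDR procedure; the feed does not say which one LoFreq implements, so treat that choice as an assumption. The SB values 5 and 171 appear elsewhere in this feed; the other two are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.001):
    """Return one boolean per p-value: True where the test is significant under BH."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Everything ranked at or below the cutoff is called significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant

# SB is a Phred-scaled p-value, so convert back before testing.
sb_phred = [5, 0, 171, 42]  # 0 and 42 are hypothetical values
pvals = [10 ** (-sb / 10.0) for sb in sb_phred]
flagged = benjamini_hochberg(pvals, alpha=0.001)
print(flagged)  # True marks sites flagged as significantly strand-biased
```

Under this sketch, alpha is the target false discovery rate across all tested sites rather than a fixed per-site p-value cutoff.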

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Steve, sure. The basics are explained in the NAR paper (Wilm et al., 2012): We compute a Poisson-binomial distribution taking error probabilities at each pileup site into consideration and derive a p-value from that. Error probabilities were originally just converted base qualities (because that's what they are). In later LoFreq versions we merged base alignment, mapping and base quality into one error probability per base. The logic goes like this: either the read is misaligned (mapping quality) or...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Steve, the strand-bias p-value is turned into a Phred quality, whose upper bound depends on the precision of the float. In practice it can get much higher than 1900. The fact that you see Phred values <60 in other programs is simply because it's mostly arbitrarily capped there. Andreas On 4 May 2018 at 03:50, Steve stevekm@users.sourceforge.net wrote: I have another question about the SB score values from the .vcf output. It is my understanding that these values are Phred quality scores, which...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Steve, not sure why the actual quality filtering is not mentioned there. Let me look into this. Anyway, the main filtering step works on the variant qualities (which are converted p-values) and it's by default based on Bonferroni correction and a significance threshold of 0.01. Best, Andreas On 4 May 2018 at 07:12, Steve stevekm@users.sourceforge.net wrote: The FAQ page for LoFreq says Do I need to filter LoFreq predictions? You usually don't. Predicted variants are already filtered using...
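
A Bonferroni-corrected quality filter of the kind described here can be sketched as follows. This is an illustration only: the function names are made up, and how LoFreq counts the number of tests is not stated in this feed, so `num_tests` is a hypothetical input.

```python
import math

def bonferroni_min_qual(num_tests, alpha=0.01):
    """Smallest Phred-scaled QUAL that survives Bonferroni correction."""
    return -10.0 * math.log10(alpha / num_tests)

def passes_bonferroni(qual, num_tests, alpha=0.01):
    """True if p-value * num_tests < alpha, with p = 10^(-QUAL/10)."""
    # Work in log10 space so very large QUAL values do not underflow.
    log10_p = -qual / 10.0
    return log10_p + math.log10(num_tests) < math.log10(alpha)

num_tests = 30000  # hypothetical number of tests
print(bonferroni_min_qual(num_tests))       # QUAL cutoff implied by alpha=0.01
print(passes_bonferroni(49314, num_tests))  # a very high QUAL passes
print(passes_bonferroni(20, num_tests))     # Q20 alone does not
```

The point of the sketch is that the effective QUAL cutoff grows with the number of tests, so a plain Q20 threshold is not equivalent to the corrected filter.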

  • Steve posted a comment on discussion General Discussion

    The FAQ page for LoFreq says Do I need to filter LoFreq predictions? You usually don't. Predicted variants are already filtered using default parameters (which include coverage, strand-bias, snv-quality etc). However, I do not see any details about what these default filtering parameters are. Is there a description anywhere? When I try to run lofreq filter --verbose, the only output I get is: Setting default SB filtering method to FDR Setting default minimum coverage to 10 What other criteria are...

  • Steve modified a comment on discussion General Discussion

    I have another question about the SB score values from the .vcf output. It is my understanding that these values are Phred quality scores, which usually are in the range of 0 - 50. However, I am getting many with values of 500 - 1900. Is this expected? And if SB=0 means no strand bias, then this means that these regions are extremely strand biased? Also, in this thread you state: 2147483647: This corresponds to a p-value close to zero, i.e. a highly significant SNV. What is the meaning of 2147483647...

  • Steve posted a comment on discussion General Discussion

    I have another question about the SB score values from the .vcf output. It is my understanding that these values are Phred quality scores, which usually are in the range of 0 - 50. However, I am getting many with values of 500 - 1900. Is this expected? And if SB=0 means no strand bias, then this means that these regions are extremely strand biased?

  • Steve modified a comment on discussion General Discussion

    "As sources of errors, it takes base-qualities, mapping qualities etc into account." Thanks for this. However I was wondering if there was a more thorough explanation of each of the values that are used in calculation of the 'QUAL' score values that are output in the VCF? I did not see it covered in the publication (maybe I missed it?) and wasn't able to figure out what was going on in the source code.

  • Steve posted a comment on discussion General Discussion

    "As sources of errors, it takes base-qualities, mapping qualities etc into account." Thanks for this. However I was wondering if there was a more thorough explanation of each of the values that are used in calculation of the 'QUAL' score values that are output in the VCF?

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Francisco, LoFreq doesn't have an AF filter. The default filter is based on variant quality only. Furthermore, it doesn't actually report genotypes. Taken together this makes it likely that your collaborator post-processed the VCF file somehow. Hope this helps, Andreas On 24 March 2018 at 13:45, Francisco De La Vega ribozyme@users.sourceforge.net wrote: I have received a VCF from LoFreq from a collaborator that used it for calling SNVs from a cfDNA targeted sequencing assay at a high depth of coverage....

  • Francisco De La Vega posted a comment on discussion General Discussion

    I have received a VCF from LoFreq from a collaborator that used it for calling SNVs from a cfDNA targeted sequencing assay at a high depth of coverage. They developed scripts to use UMIs in the adapters to error correct the aligned reads and then produce a BAM file to feed to LoFreq. The aim is to detect somatic variants in the range of 0.5-2% VAF. However, it appears LoFreq is not adding the PASS filter tag to variants under ~2% VAF. Further, since these variants are not passed, the genotypes are reported...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Nils, in short: the BAM file was created with a different reference. The checkref subcommand checks whether the reference fasta given on the command line matches the one given in the BAM header. In your case the BAM header contains a sequence named "1", which is not part of the fasta file. Hope this helps, Andreas On 13 November 2017 at 06:05, Nils Engel nils321@users.sf.net wrote: Hi, I have a problem using lofreq with human sequencing data and hg19 or GRCh38 reference sequences (downloaded...
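
The kind of mismatch described in this reply can be checked by hand by comparing sequence names. The Python sketch below is not LoFreq's checkref implementation; it only compares FASTA header names against the SN: fields of @SQ lines in a SAM-format header (as printed by `samtools view -H`). The toy input strings are made up for illustration; only the sequence name "1" comes from the thread.

```python
def fasta_names(fasta_text):
    """Sequence names from FASTA headers (text up to the first whitespace)."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.append(parts[0])
    return names

def bam_header_names(sam_header_text):
    """Sequence names from the SN: field of @SQ header lines."""
    names = []
    for line in sam_header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names

def missing_in_fasta(fasta_text, sam_header_text):
    """BAM header sequence names that have no matching FASTA record."""
    known = set(fasta_names(fasta_text))
    return [n for n in bam_header_names(sam_header_text) if n not in known]

# Toy inputs: an NCBI-style FASTA name versus a BAM header that uses "1".
fasta = ">NC_000001.11 Homo sapiens chromosome 1\nACGT\n"
header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:1\tLN:248956422\n"
print(missing_in_fasta(fasta, header))  # ['1']
```

A non-empty result means the BAM was aligned against a reference with different sequence names than the FASTA file being supplied.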

  • Nils Engel posted a comment on discussion General Discussion

    Hi, I have a problem using lofreq with human sequencing data and hg19 or GRCh38 reference sequences (downloaded from NCBI with manually changed file extension .fna -> .fa). I guess it might be a problem with improper file format or index. I get an output as follows: nils321$ lofreq checkref GRCh38_latest_genomic.fa 1214474-H8.bam [fai_load] build FASTA index. [fai_fetch_seq] The sequence "1" not found FATAL(samutils.c|checkref:653): Failed to fetch sequence 1 from fasta file Failed A fasta index...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hello, DP4 only lists the reference and variant base counts. There are usually other bases present as well, which are taken into account for computing AF. Hoping this explains the discrepancy, Andreas On 26 October 2017 at 04:58, siva siva80@users.sf.net wrote: Hi I have several variants (especially those with almost hom-alt allele) that have different allele fraction estimates from DP4 and the AF= tag. for example DP=4088;AF=0.872798;SB=171;DP4=9,33,3329,685 Here from DP4, the AF can be estimated...
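
The arithmetic behind the question can be reproduced with a short Python sketch that parses the INFO string quoted in the thread. It only shows the DP4-based estimates next to the reported AF; how LoFreq itself derives AF is not spelled out in this feed, so the sketch makes no claim about that. The helper name `parse_info` is illustrative.

```python
def parse_info(info):
    """Parse a VCF INFO string such as 'DP=..;AF=..;DP4=a,b,c,d' into a dict."""
    return dict(item.split("=", 1) for item in info.split(";") if "=" in item)

info = "DP=4088;AF=0.872798;SB=171;DP4=9,33,3329,685"
fields = parse_info(info)

# DP4 order: ref-forward, ref-reverse, alt-forward, alt-reverse
ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in fields["DP4"].split(","))
dp = int(fields["DP"])
alt = alt_fwd + alt_rev
dp4_total = ref_fwd + ref_rev + alt_fwd + alt_rev

af_alt_over_dp = alt / dp          # alt count over total depth
af_dp4_only = alt / dp4_total      # alt count over ref + alt only
not_in_dp4 = dp - dp4_total        # bases counted in DP but not in DP4

print(af_alt_over_dp, af_dp4_only, not_in_dp4, float(fields["AF"]))
```

The value of about 0.9819 quoted in the question corresponds to the alt count divided by DP.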

  • siva posted a comment on discussion General Discussion

    Hi I have several variants (especially those with almost hom-alt allele) that have different allele fraction estimates from DP4 and the AF= tag. for example DP=4088;AF=0.872798;SB=171;DP4=9,33,3329,685 Here from DP4, the AF can be estimated to be about 0.98189 which is very different from what is published in the AF= tag. Could you please explain?

  • Andreas Wilm posted a comment on discussion General Discussion

    Hello, strand-bias is defined as in samtools: reference and alternate base counts on forward and reverse strand are used as input for Fisher's exact test. This tries to quantify how far the reference and alternate counts on forward and reverse strand differ, i.e. you'll get highly significant p-values (and thus high SB values) if you have lots of reference bases on one and lots of alternate bases on the other strand. It does not test, however, whether both reference and alternate bases are mainly on the same strand. I hope this explanation...
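
The test described in this reply can be sketched in plain Python: a two-sided Fisher's exact test on the 2x2 table of DP4 counts, with the p-value converted to a Phred scale. This is a minimal sketch, not the samtools or LoFreq code; details such as sidedness and rounding are assumptions, so the result need not match a reported SB value exactly. The DP4 values are taken from a question earlier in this feed.

```python
import math

def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_p(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - _log_comb(n, col1)

    observed = log_p(a)
    total = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        lp = log_p(x)
        if lp <= observed + 1e-7:  # tables at most as likely as the observed one
            total += math.exp(lp)
    return min(1.0, total)

def phred(p):
    """Phred-scale a p-value; an underflowed p of 0 maps to infinity."""
    return float("inf") if p <= 0.0 else max(0.0, -10.0 * math.log10(p))

# DP4 order: ref-forward, ref-reverse, alt-forward, alt-reverse
ref_fwd, ref_rev, alt_fwd, alt_rev = 822, 11273, 11, 107
p = fisher_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev)
print(p, phred(p))
```

As the reply notes, the table only compares how reference and alternate counts split across strands, so a variant whose reference and alternate reads all sit on the same strand can still score close to SB=0.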

  • Kiril Dimitrov posted a comment on discussion General Discussion

    Hello, we have analyzed some viral genomes where the strand bias has been estimated as zero. In these results, we have noticed that when the value is zero for the forward or the reverse strands that have the alternate base, the SB=0. Is it that in most cases when in the alternative strands the value is zero, SB will be zero (implying no bias) when there actually is bias just by looking at the DP4 data? And maybe such results should not be considered at all? And then is the last example, where...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Erik, hard to tell from this output. Might be because of strand bias. Could you...

  • Erik Reckase posted a comment on discussion General Discussion

    Can someone tell me why a call was not made at this location? lofreq call -f /var/www/hg19.fa...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Jessica, these are SNVs that show significant strand bias (sb) and are therefore...

  • jessica preston posted a comment on discussion General Discussion

    Hello, I'm sorry but I can't seem to find this information in the manual. Can you...

  • LoFreq released /lofreq_star-2.1.2_macosx.tgz

  • LoFreq released /lofreq_star-2.1.2_linux-x86-64.tgz

  • LoFreq released /lofreq_star-2.1.2.tar.gz

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Erik, when you switch off default filtering in the call subcommand[s] LoFreq will...

  • Erik Reckase posted a comment on discussion General Discussion

    I have a dataset that I am processing with the --bed flag set to a list of mutations...

  • Andreas Wilm posted a comment on discussion General Discussion

    Sorry, I know what's happening: the filters will only affect the actual SNV calling...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi gmy, that is indeed a bit strange. Which exact LoFreq version are you using? Would...

  • gmy posted a comment on discussion General Discussion

    Hi Andreas, sorry for the late reply. The corresponding output of lofreq is: gi|57116681|ref|NC_000962.2|...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi gmy, I would strongly encourage you to stick to default parameters in LoFreq,...

  • gmy posted a comment on discussion General Discussion

    Hi, I want to filter bases with quality below 20. And I use a command like this "lofreq...

  • chris.cornor posted a comment on discussion General Discussion

    Yes actually I went through the results back and forth and it seems I do not have...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Chris, LoFreq results are already filtered relatively stringently (1% p-value threshold...

  • chris.cornor posted a comment on discussion General Discussion

    Hi Andreas, I have used your LoFreq on my normal/tumor pair to retrieve indels....

  • Andreas Wilm posted a comment on discussion General Discussion

    Hey Chris, if you get a final vcf file, then there is no need to rerun LoFreq. Whether...

  • chris.cornor modified a comment on discussion General Discussion

    Hi Andreas, Thank you very much for the reply. I would like to let you know that...

  • chris.cornor posted a comment on discussion General Discussion

    Hi Andreas, Thank you very much for the reply. I would like to let you know that...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hey Chris, yes, the file somatic_final_minus-dbsnp.snvs.vcf.gz is not there, because...

  • chris.cornor modified a comment on discussion General Discussion

    Hi Andreas, Thank you for your reply. I would like to say that I had a successful...

  • chris.cornor posted a comment on discussion General Discussion

    Hi Andreas, Thank you for your reply. I would like to say that I had a successful...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Chris, When you call somatic SNVs then you only need to look at the file that...

  • chris.cornor posted a comment on discussion General Discussion

    Also I carefully noticed some log comments where I see these comments WARNING [2015-01-15...

  • chris.cornor posted a comment on discussion General Discussion

    I am looking at 3 different outputs, 2 are of SNVs (one is relaxed and the other is stringent)...

  • chris.cornor posted a comment on discussion General Discussion

    Yes the tool is working now, I modified the bed file of the company to a general...

  • Andreas Wilm posted a comment on discussion General Discussion

    Oh ok. That looks like an extension of the bed format. LoFreq (and samtools) expect...

  • chris.cornor modified a comment on discussion General Discussion

    The format of the bed file looks like this head -5 S03723314_Covered.bed browser...

  • chris.cornor posted a comment on discussion General Discussion

    The format of the bed file looks like this head -5 S03723314_Covered.bed browser...

  • chris.cornor posted a comment on discussion General Discussion

    I have used the S03723314_Covered.bed file which you can download from the Agilent...

  • chris.cornor posted a comment on discussion General Discussion

    I am not sure how I can share the bed file, can I host it anywhere? It's almost...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Chris, this looks like an unhandled error triggered in the bed reading function....

  • chris.cornor posted a comment on discussion General Discussion

    Hi Andreas, I am trying to use lofreq for somatic indel calling using the somatic...

  • LoFreq released /lofreq_star-2.1.1_macosx.tgz

  • LoFreq released /lofreq_star-2.1.1_linux-x86-64.tgz

  • LoFreq released /lofreq_star-2.1.1.tar.gz

  • Andreas Wilm modified a wiki page

    Home

  • Andreas Wilm created a blog post

    Moved website and blog to github: http://csb5.github.io/lofreq/

  • Andreas Wilm modified a blog post

    Release LoFreq 2.1

  • Andreas Wilm modified a blog post

    Release LoFreq 2.1

  • Andreas Wilm modified a blog post

    Release LoFreq 2.1

  • Andreas Wilm created a blog post

    Release LoFreq 2.1

  • Andreas Wilm renamed a blog post

    Release LoFreq 2.1

  • LoFreq released /lofreq_star-2.1.0.tar.gz

  • LoFreq released /lofreq_star-2.1.0_linux-x86-64.tgz

  • LoFreq released /lofreq_star-2.1.0_macosx.tgz

  • jessica preston posted a comment on discussion General Discussion

    OK great, thanks!

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Jessica, the strand-bias test checks whether the proportion of bases on forward...

  • jessica preston posted a comment on discussion General Discussion

    Hi, I am running LoFreq on data that has been run through the program SeqPrep, which...

  • Andreas Wilm created a blog post

    LoFreq as Docker container

  • Andreas Wilm created a blog post

    Alpha testers for release 2.1 needed

  • Andreas Wilm modified a wiki page

    LoFreq-Star-Best-Practices

  • Andreas Wilm created a blog post

    Performance issues when using bed-file with many regions

  • Joon posted a comment on discussion General Discussion

    Thanks, Andreas. I will try again with the suggested argument. Joon

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Joon There are at least two things going on here. One of the errors seems to come...

  • Joon posted a comment on discussion General Discussion

    Hi, While running LoFreq, I've got the following error. /cm/local/apps/sge/var/spool/usnee1-lph001-n062/job_scripts/6960547:...

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Brian, thanks for pointing this out! Parsing of the "--cons-as-ref" option was...

  • Brian Ondov posted a comment on discussion General Discussion

    It actually seems to have to do with the --cons-as-ref option. Without that, it works....

  • Andreas Wilm posted a comment on discussion General Discussion

    Hi Brian, this is very likely caused by an error in the argument list, i.e. wrong...

  • Brian Ondov posted a comment on discussion General Discussion

    Thanks for posting the release. Version 2.0.0 (Linux) does get past the previous...

  • Andreas Wilm modified a wiki page

    LoFreq-Star-Usage

  • Andreas Wilm modified a wiki page

    LoFreq-Star-Usage

  • Andreas Wilm modified a wiki page

    LoFreq-Star-Best-Practices

  • Andreas Wilm modified a wiki page

    LoFreq-Star-Installation

  • Andreas Wilm modified a wiki page

    LoFreq-Star-Installation

  • Andreas Wilm modified a wiki page

    LoFreq-Star-Installation

  • Andreas Wilm created a blog post

    Release of final 2.0.0
