-
Notifications
You must be signed in to change notification settings - Fork 355
Description
First I want to say I enjoy reading your code. It's well-designed, well-organized and readable. Thank you for sharing this with the world!
I've been working on customizing the code and came across a memory allocation problem. I think I tracked down the cause and a fix will be pull-requested soon.
More specifically, after making the program on Github CodeSpace, I ran the following command for an output that contains an --overlapped_out option. I got the output of the stats and an error at the end, which will abort the program before the output on another computer.
$ ./fastp -i ./testdata/R1.fq -I ./testdata/R2.fq -m -o ./testdata/outm1.fq -O ./testdata/outm2.fq --merged_out ./testdata/outmerged2.fq --overlapped_out ./testdata/outstrictmerged2.fq
Read1 before filtering:
total reads: 9
total bases: 1208
Q20 bases: 1078(89.2384%)
Q30 bases: 1005(83.1954%)
Read2 before filtering:
total reads: 9
total bases: 1359
Q20 bases: 1100(80.9419%)
Q30 bases: 959(70.5666%)
Merged and filtered:
total reads: 8
total bases: 1552
Q20 bases: 1336(86.0825%)
Q30 bases: 1256(80.9278%)
Filtering result:
reads passed filter: 16
reads failed due to low quality: 0
reads failed due to too many N: 0
reads failed due to too short: 2
reads with adapter trimmed: 0
bases trimmed due to adapters: 0
reads corrected by overlap analysis: 7
bases corrected by overlap analysis: 14
Duplication rate: 55.5556%
Insert size peak (evaluated by paired-end reads): 187
Read pairs merged: 8
% of original read pairs: 88.8889%
% in reads after filtering: 100%
free(): invalid pointer
Aborted (core dumped)
This seems to result from a duplicate deletion of pointers generated in src/peprocessor.cpp
line 462 Read* overlappedRead = new Read(r1->mName, new string(r1->mSeq->substr(max(0,ov.offset)), ov.overlap_len), r1->mStrand, new string(r1->mQuality->substr(max(0,ov.offset)), ov.overlap_len));
Since Read() takes in the string name, overlappedRead takes the same r1->mName and r1->mStrand as the r1 and or1 object created a while above. Then overlappedRead is recycled to be deleted later.
line 464-
recycleToPool1(tid, overlappedRead);
At the cleaning up phase of the function processPairEnd(), or1 (and r1) are also recycled.
line 580-
if(or1) {
recycleToPool1(tid, or1);
or1 = NULL;
}
These two objects or1 and overlappedOut will provide pointers for the memory release. However, since their or1->mName and overlappedOut->mName are pointing to the same address (same for ->mStrand), they will be deleted twice, which will be captured by the system and throw the memory error.
A way to fix it is to designate overlappedRead->mName and overlappedRead->mStrand to nullptr /*before recycling*/ and then delete this read item right after use. A revised pull request is out now.
line 463-
overlappedRead->appendToString(overlappedOut);
overlappedRead->mName = nullptr;
overlappedRead->mStrand = nullptr;
delete overlappedRead;
//recycleToPool1(tid, overlappedRead);
Happy to hear any feedback!
Best,
genommm