MarDRe

MarDRe is a de novo MapReduce-based parallel tool that removes duplicate and near-duplicate DNA reads by clustering single-end and paired-end sequences from FASTQ/FASTA datasets. It lets bioinformaticians avoid analyzing unnecessary reads, reducing the runtime of subsequent processing of the dataset.

MarDRe is the Big Data counterpart of ParDRe, which employs HPC technologies (i.e., hybrid MPI/multithreading) to reduce runtime on multicore systems. MarDRe instead takes advantage of the MapReduce programming model to significantly improve on ParDRe's performance on distributed systems, especially cloud-based infrastructures. Written in pure Java to maximize cross-platform compatibility, MarDRe is built on the open-source Apache Hadoop project, a widely used distributed computing framework for Big Data processing.
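
To make the idea of removing duplicates and near-duplicates through clustering more concrete, here is a minimal in-memory sketch in Java. It is not MarDRe's actual implementation and does not use the Hadoop API. The class name DedupSketch, the choice to group reads by a fixed-length prefix, and the parameters prefixLen and maxMismatches are all assumptions made for illustration. The "map" step assigns each read a key, and the "reduce" step keeps one representative per cluster of similar reads within each key group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: map/reduce-style near-duplicate removal, simulated in memory.
class DedupSketch {

    // "Map" step (assumed keying scheme): group reads by their first prefixLen bases.
    static Map<String, List<String>> mapByPrefix(List<String> reads, int prefixLen) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String read : reads) {
            String key = read.substring(0, Math.min(prefixLen, read.length()));
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(read);
        }
        return groups;
    }

    // Count mismatching positions; reads of different length are never treated as duplicates.
    static int mismatches(String a, String b) {
        if (a.length() != b.length()) {
            return Integer.MAX_VALUE;
        }
        int count = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                count++;
            }
        }
        return count;
    }

    // "Reduce" step: within one key group, keep the first read of each cluster
    // and drop any read within maxMismatches of an already kept read.
    static List<String> reduceGroup(List<String> group, int maxMismatches) {
        List<String> kept = new ArrayList<>();
        for (String read : group) {
            boolean duplicate = false;
            for (String representative : kept) {
                if (mismatches(read, representative) <= maxMismatches) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                kept.add(read);
            }
        }
        return kept;
    }

    // Run the map step, then the reduce step on every group, and collect the survivors.
    static List<String> deduplicate(List<String> reads, int prefixLen, int maxMismatches) {
        List<String> result = new ArrayList<>();
        for (List<String> group : mapByPrefix(reads, prefixLen).values()) {
            result.addAll(reduceGroup(group, maxMismatches));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> reads = Arrays.asList("ACGTACGT", "ACGTACGT", "ACGTACGA", "TTGACCAA");
        // Allow one mismatch: the exact copy and the one-base variant are both dropped.
        System.out.println(deduplicate(reads, 4, 1)); // prints [ACGTACGT, TTGACCAA]
    }
}
```

In a real Hadoop job the groups would be spread across reducers rather than held in one map, so each key group could be processed independently on a different node. With maxMismatches set to 0 the sketch removes only exact duplicates.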


License

GNU General Public License version 3.0 (GPLv3)

Additional Project Details

Operating Systems

Linux

Intended Audience

Information Technology, Healthcare Industry, Science/Research

User Interface

Console/Terminal, Command-line

Programming Language

Java

Related Categories

Java Bio-Informatics Software, Java Big Data Tool

Registered

2017-01-30
