Abstract

Summary: BigBWA is a new tool that uses the Big Data technology Hadoop to boost the performance of the Burrows–Wheeler aligner (BWA). Substantial reductions in execution time were observed when using this tool. In addition, BigBWA is fault tolerant and requires no modification of the original BWA source code.

Availability and implementation: BigBWA is available at the project GitHub repository: https://github.com/citiususc/BigBWA

Contact:  [email protected]

Supplementary information:  Supplementary data are available at Bioinformatics online.

1 Introduction

Burrows–Wheeler aligner (BWA) is a very popular software package for mapping sequence reads to a large reference genome. It comprises three algorithms: BWA-backtrack (Li and Durbin, 2009), BWA-SW (Li and Durbin, 2010) and BWA-MEM (Li, 2013). The first algorithm is designed for short Illumina sequence reads of up to 100 bp, whereas the other two target longer reads. BWA-MEM, the latest of the three, is preferred over BWA-SW for reads of 70 bp or longer, as it is faster and more accurate. In addition, BWA-MEM has shown better performance than several other state-of-the-art read aligners for mapping reads of 100 bp or longer.

Sequence alignment is a very time-consuming process, and the problem becomes even more acute as millions or even billions of reads need to be aligned. For instance, new sequencing technologies such as the Illumina HiSeq X Ten generate up to 6 billion reads per run, which takes BWA more than 4 days to process on a single 16-core machine. NGS professionals therefore demand scalable solutions that boost aligner performance and deliver results in reasonable time.

In this article, we introduce BigBWA, a new tool that takes advantage of Hadoop as Big Data technology to increase the performance of BWA. The main advantages of our tool are the following. First, the alignment process is performed in parallel using a proven and scalable technology, which reduces execution times dramatically. Second, BigBWA is fault tolerant, exploiting the fault-tolerance capabilities of the underlying Big Data technology on which it is based. Finally, no modification of BWA is required to use BigBWA; as a consequence, any release of BWA (future or legacy) is compatible with BigBWA.

2 Approach

BigBWA uses Hadoop as Big Data technology. Hadoop is the most successful open-source implementation of the MapReduce programming model introduced by Google (Dean and Ghemawat, 2008). Hadoop applications are typically developed in Java, but BWA is implemented in C. To overcome this issue, BigBWA takes advantage of the Java Native Interface (JNI) (Liang, 1999), which allows Java code to call native code written in languages such as C and C++. Two independent software layers were created in BigBWA: the first corresponds to the BWA software package, whereas the second is, strictly speaking, our tool. This design avoids any modification of the BWA source code, which ensures the compatibility of BigBWA with any BWA version.
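As an illustration of this two-layer design, the following minimal sketch (with hypothetical class and method names; the actual BigBWA bridge may differ) shows how a Java class can expose BWA's entry point through JNI:

```java
// Minimal JNI bridge sketch (hypothetical names). The native library
// "bwa" is assumed to be built from the unmodified BWA sources plus a
// thin JNI glue layer, so no changes to BWA itself are needed.
public class BwaJniBridge {

    static {
        // Loads libbwa.so from java.library.path at class-load time.
        System.loadLibrary("bwa");
    }

    // Mirrors BWA's command-line entry point: the C glue receives the
    // arguments and forwards them to BWA's own main function.
    public native int bwaMain(String[] args);

    public static void main(String[] args) {
        // e.g. args = {"aln", "reference.fa", "reads.fq"}
        System.exit(new BwaJniBridge().bwaMain(args));
    }
}
```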

The complete BigBWA workflow consists of four steps: (i) convert the FASTQ input files to a Hadoop-compatible format, (ii) copy the input data to the Hadoop cluster (HDFS), (iii) perform the alignment and (iv) copy the output back from HDFS to the local filesystem. For more details, refer to the Supplementary Material.
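As a concrete example, steps (ii) and (iv) can be carried out with Hadoop's standard FileSystem API; the sketch below uses hypothetical HDFS paths:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of workflow steps (ii) and (iv), with hypothetical paths:
// stage the converted reads in HDFS before the alignment job and
// fetch the merged output afterwards.
public class StageData {
    public static void main(String[] args) throws Exception {
        FileSystem hdfs = FileSystem.get(new Configuration());

        // Step (ii): copy the Hadoop-formatted input into HDFS.
        hdfs.copyFromLocalFile(new Path("reads.prepared"),
                new Path("/user/bigbwa/input/reads.prepared"));

        // ... step (iii), the MapReduce alignment job, runs here ...

        // Step (iv): copy the resulting alignment back to local disk.
        hdfs.copyToLocalFile(new Path("/user/bigbwa/output/result.sam"),
                new Path("result.sam"));
    }
}
```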

Regarding the alignment process, BigBWA divides the computation into Map and Reduce phases. In the Map phase, BigBWA splits the reads into subsets and assigns each subset to a mapper process. Each mapper applies the chosen BWA algorithm to the reads assigned to it by BigBWA. Since the mappers run concurrently, the alignment process is sped up considerably. If any mapper fails, BigBWA automatically launches an identical mapper process to replace the faulty one. At the end of the Map phase, BigBWA generates one output file per mapper; in the Reduce phase, these files are merged into a single result. Users may optionally skip the Reduce phase.
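A simplified mapper illustrating this scheme is sketched below (hypothetical names; the real BigBWA mapper handles paired-end input and other details). Each mapper buffers its assigned reads into a local FASTQ file and invokes BWA on it through the JNI bridge once all of its reads have been received:

```java
import java.io.PrintWriter;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Simplified map-side logic (hypothetical names). Each mapper writes
// its subset of reads to a local FASTQ file and runs the unmodified
// BWA over that file through the JNI bridge, yielding one partial
// SAM file per mapper.
public class BwaAlignMapper
        extends Mapper<LongWritable, Text, LongWritable, Text> {

    private PrintWriter localFastq;
    private String fastqPath;

    @Override
    protected void setup(Context context) throws java.io.IOException {
        fastqPath = "reads-" + context.getTaskAttemptID() + ".fq";
        localFastq = new PrintWriter(fastqPath);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context) {
        // One input record per read, produced by the format-conversion
        // step of the workflow.
        localFastq.println(value.toString());
    }

    @Override
    protected void cleanup(Context context)
            throws java.io.IOException, InterruptedException {
        localFastq.close();
        // Align this mapper's reads; the glue code is assumed here to
        // redirect BWA's SAM output to a per-task file.
        String samPath = "part-" + context.getTaskAttemptID() + ".sam";
        new BwaJniBridge().bwaMain(
                new String[] {"mem", "reference.fa", fastqPath});
        // Tell the (optional) reducer where the partial SAM lives so
        // the per-mapper files can be merged into a single result.
        context.write(new LongWritable(0), new Text(samPath));
    }
}
```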

Approaches similar to BigBWA are SEAL (Pireddu et al., 2011) and pBWA (Peters et al., 2012). SEAL uses Pydoop (Leo and Zanetti, 2010), a Python implementation of the MapReduce programming model that runs on top of Hadoop; it allows users to write their programs in Python, calling BWA methods by means of a wrapper. As we show in the next section, using Pydoop introduces an overhead compared with using JNI. pBWA parallelizes BWA using MPI, a standard parallel programming paradigm, but unlike BigBWA it lacks fault-tolerance mechanisms. There are other important differences between these tools and BigBWA. First, SEAL and pBWA only work with a particular modified version of BWA, whereas BigBWA works directly with the original BWA implementation; no modifications to the BWA source code are required, preserving compatibility with future and legacy BWA versions. Second, both SEAL and pBWA are based on BWA version 0.5, which does not include the newer BWA-MEM algorithm. To the best of our knowledge, BigBWA is therefore the first tool to parallelize the BWA-MEM algorithm using Big Data technologies.

BWA has its own parallel implementation, but it only supports shared-memory machines, so its scalability is limited by the number of threads (cores) available on a single computing node. BigBWA, in contrast, can be executed on clusters consisting of hundreds of computing nodes.

3 Discussion

Performance: BigBWA was tested using data from the 1000 Genomes Project (Altshuler et al., 2010) (see Table 1 for details). Measurements were performed on a five-node AWS cluster with 16 cores per node (Intel Xeon E5-2670 CPUs at 2.5 GHz), running Hadoop 2.6.0. Detailed information about the experimental setup is provided in the Supplementary Material. The performance results for BigBWA and the other evaluated tools consider only the alignment time, calculated as the average of five runs per data point after one warm-up execution. Table 2 compares BigBWA with SEAL and pBWA for the BWA-backtrack algorithm. BigBWA clearly outperforms these tools, especially when the number of cores is high, reaching speedups of 36.4× with respect to the sequential case (using the original BWA tool as reference). It can also be observed that SEAL scales worse, owing to the overhead introduced by Pydoop with respect to the use of JNI.
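The speedups reported in Tables 2 and 3 follow the usual definition, taking the single-core time in the corresponding table as the baseline; as a worked check against Table 2:

```latex
% Speedup with p cores, relative to the 1-core baseline time T(1):
\[
  S(p) = \frac{T(1)}{T_{\mathrm{tool}}(p)},
  \qquad \text{e.g.}\quad
  S(64) \approx \frac{556.9}{15.3} \approx 36.4
  \quad \text{for BigBWA on dataset D2.}
\]
```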

Table 1.

Main characteristics of the input datasets

Tag | Name              | Number of reads | Read length (bp) | Size (GB)
D1  | NA12750/ERR000589 | 12 × 10^6       | 51               | 3.9
D2  | HG00096/SRR062634 | 24.1 × 10^6     | 100              | 13.4
D3  | 150140/SRR642648  | 98.8 × 10^6     | 100              | 54.7

All the datasets were extracted from the 1000 Genomes Project (Altshuler et al., 2010).


Table 2.

Comparison of the performance for the BWA-backtrack algorithm

Execution time (minutes):

Dataset | Tool   | 1 core | 4 cores      | 8 cores     | 16 cores    | 32 cores    | 64 cores
D1      | SEAL   | 148.5  | 55.7 ± 1.6   | 28.3 ± 1.0  | 22.2 ± 0.6  | 11.1 ± 0.1  | 5.7 ± 0.0
D1      | pBWA   |        | 42.0 ± 0.7*  | 25.3 ± 1.1  | 17.7 ± 0.5  | 9.2 ± 0.1   | 5.1 ± 0.1
D1      | BigBWA |        | 42.4 ± 0.9   | 23.8 ± 0.7* | 15.4 ± 0.4* | 8.5 ± 0.2*  | 4.5 ± 0.1*
D2      | SEAL   | 556.9  | 186.5 ± 1.7  | 92.6 ± 0.8  | 68.1 ± 1.9  | 35.4 ± 0.7  | 18.5 ± 0.3
D2      | pBWA   |        | 155.0 ± 0.4  | 94.5 ± 1.6  | 61.2 ± 1.5  | 32.7 ± 0.4  | 17.1 ± 0.3
D2      | BigBWA |        | 152.0 ± 0.3* | 82.3 ± 1.6* | 57.2 ± 0.8* | 30.3 ± 0.5* | 15.3 ± 0.1*

Speedup:

Dataset | Tool   | 4 cores | 8 cores | 16 cores | 32 cores | 64 cores
D1      | SEAL   | 2.7     | 5.2     | 6.7      | 13.4     | 26.0
D1      | pBWA   | 3.5     | 5.9     | 8.4      | 16.1     | 29.1
D1      | BigBWA | 3.5     | 6.2     | 9.6      | 17.3     | 33.0
D2      | SEAL   | 2.9     | 6.0     | 8.2      | 15.7     | 30.1
D2      | pBWA   | 3.6     | 5.9     | 9.0      | 17.0     | 32.6
D2      | BigBWA | 3.7     | 6.8     | 9.7      | 18.3     | 36.4

An asterisk (*) marks the best tool for each number of cores. For a fair comparison with the other tools, BigBWA obtained these results using BWA version 0.5.10. Tool versions: pBWA 0.5.9 and SEAL 0.4.0.


The performance of BWA-MEM is shown in Table 3. It was measured only for BWA (threaded version) and BigBWA, because SEAL and pBWA do not support this algorithm. We also include results for a hybrid version of BigBWA in which each mapper processes its input using BWA with two threads. The results show that, with a small number of cores, BWA behaves slightly better than BigBWA. Note that BWA can only execute on a single cluster node, so we cannot provide results for it beyond 16 cores. At 16 cores, BigBWA is always the best solution; however, owing to the memory assigned per map task in our cluster configuration, only 13 concurrent tasks can be executed on one node, so BigBWA always distributes its tasks across two nodes when using 16 cores. In addition, BigBWA scales well for all the datasets considered, executing up to 36.6× faster than the sequential case. Additional performance results are shown in the Supplementary Material.

Table 3.

Comparison of the performance for the BWA-MEM algorithm

Execution time (minutes):

Dataset | Tool            | 1 core | 4 cores      | 8 cores      | 16 cores     | 32 cores     | 64 cores
D1      | BWA-Threads     | 106.6  | 27.6 ± 0.1*  | 14.3 ± 0.1*  | 10.9 ± 0.0   |              |
D1      | BigBWA (hybrid) |        | 29.6 ± 0.2   | 15.1 ± 0.3   | 11.8 ± 0.1   | 6.8 ± 0.3    | 3.6 ± 0.1
D1      | BigBWA          |        | 29.1 ± 0.3   | 15.7 ± 0.1   | 7.9 ± 0.1*   | 4.5 ± 0.1*   | 3.0 ± 0.1*
D2      | BWA-Threads     | 258.0  | 66.0 ± 0.1*  | 33.7 ± 0.1*  | 24.9 ± 0.0   |              |
D2      | BigBWA (hybrid) |        | 69.6 ± 1.3   | 36.4 ± 0.6   | 24.5 ± 0.5   | 15.3 ± 0.1   | 8.8 ± 0.1
D2      | BigBWA          |        | 69.1 ± 1.4   | 37.5 ± 0.4   | 20.7 ± 0.5*  | 10.9 ± 0.3*  | 7.2 ± 0.3*
D3      | BWA-Threads     | 3208.6 | 816.8 ± 2.5* | 408.1 ± 1.7* | 333.3 ± 0.3  |              |
D3      | BigBWA (hybrid) |        | 828.8 ± 9.5  | 431.1 ± 9.0  | 221.9 ± 4.0* | 183.3 ± 2.2  | 107.2 ± 0.8
D3      | BigBWA          |        | 848.8 ± 13.6 | 444.9 ± 8.2  | 229.2 ± 5.1  | 120.1 ± 1.4* | 87.8 ± 0.2*

Speedup:

Dataset | Tool            | 4 cores | 8 cores | 16 cores | 32 cores | 64 cores
D1      | BWA-Threads     | 3.9     | 7.4     | 9.8      |          |
D1      | BigBWA (hybrid) | 3.6     | 7.0     | 9.0      | 15.7     | 29.6
D1      | BigBWA          | 3.7     | 6.8     | 13.4     | 23.5     | 35.5
D2      | BWA-Threads     | 3.9     | 7.6     | 10.4     |          |
D2      | BigBWA (hybrid) | 3.7     | 7.1     | 10.5     | 16.8     | 29.3
D2      | BigBWA          | 3.7     | 6.9     | 12.5     | 23.6     | 35.8
D3      | BWA-Threads     | 3.9     | 7.9     | 9.6      |          |
D3      | BigBWA (hybrid) | 3.9     | 7.4     | 14.5     | 17.5     | 29.9
D3      | BigBWA          | 3.8     | 7.2     | 14.0     | 26.7     | 36.6

An asterisk (*) marks the best tool for each number of cores. These results were obtained using BWA version 0.7.12.


Correctness: We verified the correctness of BigBWA by comparing its output file with the one generated by BWA. Differences range from 0.06% to 1% of the uniquely mapped reads (mapping quality greater than zero), similar to the differences exhibited by the threaded version of BWA with respect to the sequential case.
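A check of this kind is straightforward to script. The sketch below (hypothetical file names, plain-text SAM parsing) counts uniquely mapped reads, i.e. records that are mapped with mapping quality greater than zero, so that two outputs can be compared:

```java
import java.io.BufferedReader;
import java.io.FileReader;

// Counts uniquely mapped reads (mapped, MAPQ > 0) in a SAM file so
// that two alignment outputs can be compared. File names are
// hypothetical.
public class CountUniqueMapped {

    static long countUnique(String samFile) throws Exception {
        long unique = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(samFile))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("@")) continue;   // skip header lines
                String[] f = line.split("\t");
                int flag = Integer.parseInt(f[1]);    // SAM FLAG field
                int mapq = Integer.parseInt(f[4]);    // SAM MAPQ field
                // Mapped read (0x4 unset) with non-zero mapping quality.
                if ((flag & 0x4) == 0 && mapq > 0) unique++;
            }
        }
        return unique;
    }

    public static void main(String[] args) throws Exception {
        long bwa = countUnique("bwa.sam");
        long big = countUnique("bigbwa.sam");
        System.out.printf("relative difference: %.4f%%%n",
                100.0 * Math.abs(bwa - big) / bwa);
    }
}
```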

Funding

This work was partially supported by MINECO (Spain) grants TIN2013-41129-P and TIN2014-54565-JIN.

Conflict of Interest: none declared.

References

Altshuler,D. et al. (2010) A map of human genome variation from population-scale sequencing. Nature, 467, 1061–1073.

Dean,J. and Ghemawat,S. (2008) MapReduce: simplified data processing on large clusters. Commun. ACM, 51, 107–113.

Leo,S. and Zanetti,G. (2010) Pydoop: a Python MapReduce and HDFS API for Hadoop. In: Proceedings of the 19th ACM Symposium on High Performance Distributed Computing (HPDC), ACM, Chicago, IL, USA, pp. 819–825.

Li,H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv, 1303.3997v2.

Li,H. and Durbin,R. (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics, 25, 1754–1760.

Li,H. and Durbin,R. (2010) Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics, 26, 589–595.

Liang,S. (1999) Java Native Interface: Programmer's Guide and Reference, 1st edn. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.

Peters,D. et al. (2012) Speeding up large-scale next generation sequencing data analysis with pBWA. J. Appl. Bioinform. Comput. Biol., 1, 1–6.

Pireddu,L. et al. (2011) SEAL: a distributed short read mapping and duplicate removal tool. Bioinformatics, 27, 2159–2160.

Author notes

Associate Editor: Inanc Birol
