Supplementary Figure 1. Performance evaluation of the six copy number variation (CNV) algorithms using the simulated data with the loose criteria: at least 60% overlap between the inferred and ground truth CNV segments and inclusion of CNV segments ≥0.5 Mbp only. A) True positive rate (TPR), B) False discovery rate (FDR), and C) F1 score of the CNV detections achieved by the different tools as the read coverage is varied. The data points are based on the window size comparison results (Supplementary Figures 2-6), from which we selected the window settings that yielded the highest F1 score for each algorithm at each read coverage. Error bars denote the standard error across 20 different random subsets.

Supplementary Figure 2. Analysis of how the window size affects the performance of CNVnator at different read coverages with simulated data. A) True positive rate (TPR), B) False discovery rate (FDR), and C) F1 score. The hard criteria (minimum overlap of 0.8 and no filtering by size) were used in the analysis.

Supplementary Figure 3. Analysis of how the window size affects the performance of BICseq2 at different read coverages with simulated data. A) True positive rate (TPR), B) False discovery rate (FDR), and C) F1 score. The default window size is 0.1 kbp. The hard criteria (minimum overlap of 0.8 and no filtering by size) were used in the analysis.

Supplementary Figure 4. Analysis of how the window size affects the performance of FREEC at different read coverages with simulated data. A) True positive rate (TPR), B) False discovery rate (FDR), and C) F1 score. A coefficient of variation of 0.05 is the default value of FREEC's built-in method for selecting the window size based on the coverage. The hard criteria (minimum overlap of 0.8 and no filtering by size) were used in the analysis.

Supplementary Figure 5. Analysis of how the window size affects the performance of HMMcopy at different read coverages with simulated data. A) True positive rate (TPR), B) False discovery rate (FDR), and C) F1 score. The hard criteria (minimum overlap of 0.8 and no filtering by size) were used in the analysis.

Supplementary Figure 6. Analysis of how the window size affects the performance of QDNAseq at different read coverages with simulated data. A) True positive rate (TPR), B) False discovery rate (FDR), and C) F1 score. The hard criteria (minimum overlap of 0.8 and no filtering by size) were used in the analysis.

Supplementary Figure 7. Visualization of the CNVs detected in the H9-AB-p116 dataset by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Figure 8. Visualization of the CNVs detected in the H9-AB-p113 dataset by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Figure 9. Visualization of the CNVs detected in the H9-p38 dataset by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Figure 10. Visualization of the CNVs detected in the H9-p41 dataset by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Figure 11. Visualization of the CNVs detected in all the chromosomes in the combined sample H9-AB by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Figure 12. Visualization of the CNVs detected in all the chromosomes in the combined sample H9-NO by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Figure 13. Visualization of the CNVs detected in all the chromosomes in the combined sample H9-AB-p116 by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x.

Supplementary Figure 14. Visualization of the CNVs detected in all the chromosomes in the combined sample H9-AB-p113 by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Figure 15. Visualization of the CNVs detected in all the chromosomes in the combined sample H9-NO-p41 by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Figure 16. Visualization of the CNVs detected in all the chromosomes in the combined sample H9-NO-p38 by the six algorithms, along with the array-based benchmark CNV segments at the respective chromosomal locations. Deletions are marked in red and gains in blue. The bottom part of the visualization depicts the depth of read coverage in each 50 kbp window. The visualization includes every CNV found with each tool using the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Figure 17. Performance evaluation of the six algorithms using the cell line data with the stringent criteria: at least 80% overlap between the inferred and array-validated CNV segments and a minimum length of 0.5 Mbp for the detected CNV segments. A,D) True positive rate (TPR), B,E) False discovery rate (FDR), and C,F) F1 score of the CNV detections. The red and blue dots depict the abnormal and normal samples, respectively. With each tool, we used the window size that yielded the best performance for the simulated data at a coverage of 0.1x (see Supplementary Figures 2-6).

Supplementary Table 1. Simulated CNV segments that were used to evaluate the tools.

Supplementary Table 2. Number of bases and read coverage of the cell line samples, both for each sample individually and for the combined samples.

Supplementary Table 3. Array-based CNV segments ≥500 kbp used to evaluate the tools.

Supplementary Table 4. Failure rates for different read coverages with varying window size settings and 20 different downsamplings using simulated data.
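The overlap- and size-based evaluation criteria used throughout the figures (e.g. minimum overlap of 0.8 and a 0.5 Mbp size filter for the stringent criteria, 0.6 for the loose criteria) can be sketched as a minimal matching procedure. This is an illustrative sketch only; the function names and the exact matching logic are assumptions, not the evaluation code used in the study.

```python
# Illustrative sketch of overlap-based CNV evaluation (not the authors' code).
# Segments are half-open (start, end) coordinate pairs in bp.

def overlap_fraction(a, b):
    """Fraction of segment a's length covered by segment b."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    return inter / (a[1] - a[0])

def evaluate(calls, truth, min_overlap=0.8, min_size=500_000):
    """Compute (TPR, FDR, F1) for a set of CNV calls against truth segments.

    A truth segment counts as detected if some call covers >= min_overlap
    of it; a call counts as a true positive if it is covered >= min_overlap
    by some truth segment. Both sets are size-filtered first. Pass
    min_overlap=0.6 to mimic the loose criteria, min_size=0 to disable
    the size filter (as in the hard criteria).
    """
    calls = [s for s in calls if s[1] - s[0] >= min_size]
    truth = [s for s in truth if s[1] - s[0] >= min_size]
    tp_calls = sum(any(overlap_fraction(c, t) >= min_overlap for t in truth)
                   for c in calls)
    detected = sum(any(overlap_fraction(t, c) >= min_overlap for c in calls)
                   for t in truth)
    tpr = detected / len(truth) if truth else 0.0
    fdr = 1 - tp_calls / len(calls) if calls else 0.0
    precision = 1 - fdr
    f1 = (2 * precision * tpr / (precision + tpr)
          if precision + tpr > 0 else 0.0)
    return tpr, fdr, f1
```

For example, two truth segments with one matched call and one spurious call give TPR = 0.5, FDR = 0.5, and F1 = 0.5 under the default thresholds.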