ngsShoRT - A Next Generation Sequencing Short Read Trimmer

Welcome to ngsShoRT

ngsShoRT (Next Generation Sequencing Short Read Trimmer) is a comprehensive and flexible open-source software package that implements novel algorithms developed by our group together with many commonly used pre-processing algorithms from the literature. It pre-processes Single Read (SR) or Paired-end (PE)/Mate-pair (MP) reads in FastQ format or Illumina's native QSEQ format (with compressed file support). It provides parallel processing through multi-threading to handle large volumes of data and reduce running time. Another distinctive feature of ngsShoRT is that it handles PE/MP reads generically using paired-end-specific modules. All algorithms and methods are implemented in Perl, a platform-independent programming language.

  The current version (2.1) of ngsShoRT has the following features:
  • Compatible with different kinds of input files and formats:
    1. FastQ and Illumina QSEQ formats (ngsShoRT auto-detects the format).
    2. Single-Read (SR) or Paired-End (PE) short-read dataset files.
    3. Single or multiple SR or PE files (with or without merging the final trimmed output).
  • Handling of surviving (widowed) reads in paired-end trimming:
  • Read filtering methods (LQR, nperc, ncutoff, 5adpt[kr], qseq0, qseqB[kr]) occasionally remove only one read of a pair while its mate remains intact. The remaining read is called a "widowed read." In such cases, the entire pair is excluded from the final PE output, and the surviving/widowed read is saved in a separate single-read file, surviving_SR_reads.fastq.
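
The widowed-read rule described above can be sketched as follows. This is a minimal illustration in Python, not ngsShoRT's Perl code: the function names (split_pairs, max_n_fraction), the tuple layout (name, sequence, quality), and the N-fraction filter are all assumptions made for the example, with the filter standing in for whichever filtering method is in use.

```python
def split_pairs(pairs, keep):
    """Partition read pairs into intact pairs and widowed single reads.

    pairs: iterable of (read1, read2) tuples
    keep:  hypothetical predicate, True if a read passes the filter
    """
    paired, widowed = [], []
    for r1, r2 in pairs:
        ok1, ok2 = keep(r1), keep(r2)
        if ok1 and ok2:
            paired.append((r1, r2))   # both mates pass: stays in PE output
        elif ok1:
            widowed.append(r1)        # mate 2 filtered out: mate 1 is widowed
        elif ok2:
            widowed.append(r2)        # mate 1 filtered out: mate 2 is widowed
        # if both mates fail, nothing is kept
    return paired, widowed


def max_n_fraction(limit):
    """Build a hypothetical filter that rejects reads with too many N bases."""
    def keep(read):
        _name, seq, _qual = read
        return len(seq) > 0 and seq.upper().count("N") / len(seq) <= limit
    return keep


# Toy data: reads are (name, sequence, quality) tuples.
pairs = [
    (("r1/1", "ACGT", "IIII"), ("r1/2", "TTGA", "IIII")),  # both pass
    (("r2/1", "NNNT", "IIII"), ("r2/2", "ACGT", "IIII")),  # mate 1 fails
    (("r3/1", "NNNN", "IIII"), ("r3/2", "NNNN", "IIII")),  # both fail
]
paired, widowed = split_pairs(pairs, max_n_fraction(0.5))
```

In this sketch, paired would be written to the PE output and widowed to a separate single-read file such as surviving_SR_reads.fastq.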

  • Parallel processing:
  • Multi-threading, implemented with Perl's Thread module, significantly reduces trimming time for large datasets.
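
The general idea of splitting reads into chunks and trimming them on several worker threads might look like the sketch below. It is written in Python for illustration and does not reproduce ngsShoRT's Perl threading code; trim_3prime, chunked, trim_parallel, the quality threshold, and the chunk size are hypothetical, and the 3'-end quality trim is only a stand-in for a real trimming method. Note that in Python, threads mainly help with I/O-bound work, so the sketch shows the structure rather than a guaranteed speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def trim_3prime(read, min_q=20, offset=33):
    """Hypothetical trimmer: drop low-quality bases from the 3' end."""
    name, seq, qual = read
    end = len(seq)
    # Walk back from the 3' end while the Phred score is below min_q.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return (name, seq[:end], qual[:end])


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def trim_parallel(reads, workers=4, chunk_size=2):
    """Trim reads chunk by chunk on a thread pool, preserving input order."""
    chunks = list(chunked(reads, chunk_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        results = pool.map(lambda c: [trim_3prime(r) for r in c], chunks)
        return [read for chunk in results for read in chunk]


# Toy data: 'I' encodes Phred 40 and '#' encodes Phred 2 (offset 33).
reads = [
    ("a", "ACGT", "II##"),
    ("b", "ACGT", "IIII"),
    ("c", "ACGT", "####"),
]
trimmed = trim_parallel(reads)
```

Because the chunks are processed independently and recombined in input order, the output matches what a single-threaded pass over the same reads would produce.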

  • Many algorithms/methods: