
This function uses the NCBI SRA Toolkit via system2() to download files from SRA and convert them to fastq.gz. To process files in parallel, register a parallel backend, e.g., using doParallel::registerDoParallel(). Beware that intermediate files created by fasterq-dump are uncompressed and could require hundreds of gigabytes if files are processed in parallel.
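As a minimal sketch of the parallel setup described above, the following registers a backend before any call to fetch(). It assumes the doParallel package is installed. The worker count of 2 is an arbitrary illustrative value, chosen low because each worker may hold uncompressed intermediate files on disk at the same time.

```r
# Register a small parallel backend (assumes doParallel is installed).
# Fewer workers means fewer uncompressed intermediate files on disk at once.
nWorkers = 2  # arbitrary illustrative value, not a recommendation

if (requireNamespace("doParallel", quietly = TRUE)) {
  doParallel::registerDoParallel(cores = nWorkers)
  backendRegistered = TRUE
} else {
  message("doParallel is not installed; no parallel backend was registered.")
  backendRegistered = FALSE
}
```

When disk space is tight, it may be safer to start with few workers and increase the count only after checking how much space one accession needs.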

Usage

fetch(
  accessions,
  outputDir,
  overwrite = FALSE,
  keepSra = FALSE,
  prefetchCmd = "prefetch",
  prefetchArgs = NULL,
  fasterqdumpCmd = "fasterq-dump",
  fasterqdumpArgs = NULL,
  pigzCmd = "pigz",
  pigzArgs = NULL
)

Arguments

accessions

Character vector of SRA run accessions.

outputDir

String indicating the local directory in which to save the files. Will be created if it doesn't exist.

overwrite

Logical indicating whether to overwrite files that already exist in outputDir.

keepSra

Logical indicating whether to keep the ".sra" files.

prefetchCmd

String indicating command for prefetch, which downloads ".sra" files.

prefetchArgs

Character vector indicating arguments to pass to prefetch.

fasterqdumpCmd

String indicating command for fasterq-dump, which uses ".sra" files to create ".fastq" files.

fasterqdumpArgs

Character vector indicating arguments to pass to fasterq-dump.

pigzCmd

String indicating command for pigz, which converts ".fastq" files to ".fastq.gz" files.

pigzArgs

Character vector indicating arguments to pass to pigz.
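The sketch below shows one way the command and argument parameters might be used together. It first checks that the three default commands are on the PATH, then passes extra arguments as character vectors. The thread counts, the placeholder accession "SRR0000000", and the "fetch_example" directory are hypothetical values for illustration. Whether fetch() already sets any of these flags itself is not stated here, so treat the extra arguments as assumptions to verify.

```r
# Commands that fetch() calls by default, per the Usage section.
cmds = c(prefetch = "prefetch", fasterqdump = "fasterq-dump", pigz = "pigz")

# Sys.which() returns "" for commands that are not found on the PATH.
cmdFound = nzchar(Sys.which(cmds))
names(cmdFound) = names(cmds)
print(cmdFound)

# Illustrative extra arguments (assumptions, not package defaults).
fasterqdumpArgs = c("--threads", "4")  # fasterq-dump thread count
pigzArgs = c("-p", "4")                # pigz process count

runExample = FALSE  # set to TRUE to actually download data

# Only call fetch() if requested, the tools exist, and fetch() is loaded.
if (runExample && all(cmdFound) && exists("fetch", mode = "function")) {
  result = fetch(
    accessions = "SRR0000000",  # hypothetical placeholder accession
    outputDir = file.path(tempdir(), "fetch_example"),
    fasterqdumpArgs = fasterqdumpArgs,
    pigzArgs = pigzArgs)
}
```

If a tool is installed outside the PATH, its full path can be supplied through prefetchCmd, fasterqdumpCmd, or pigzCmd instead.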

Value

A list. As the function runs, it updates a tab-delimited log file in outputDir called "progress.tsv".
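Because the log is a tab-delimited file, it can be inspected while a long download is still running. The helper below, readProgress(), is a hypothetical name introduced for illustration. It assumes the file has a header row and makes no assumption about which columns it contains.

```r
# Read the progress log written by fetch(), if it exists.
# The column layout is not documented here, so none is assumed.
readProgress = function(outputDir) {
  path = file.path(outputDir, "progress.tsv")
  if (!file.exists(path)) {
    return(NULL)  # nothing has been logged yet
  }
  # read.delim() reads tab-delimited text and assumes a header row.
  read.delim(path, stringsAsFactors = FALSE)
}

# "fetch_example" is a hypothetical output directory.
progress = readProgress(file.path(tempdir(), "fetch_example"))
if (is.null(progress)) {
  message("No progress.tsv found yet.")
}
```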