cassiopeia.pp.convert_fastqs_to_unmapped_bam

cassiopeia.pp.convert_fastqs_to_unmapped_bam(fastq_fps, chemistry, output_directory, name=None, n_threads=1)

Converts FASTQs into an unmapped BAM based on a chemistry.

This function converts a set of FASTQs into an unmapped BAM with appropriate BAM tags.

Parameters:
fastq_fps List[str]

List of paths to FASTQ files. Usually, this argument contains two FASTQs, where the first contains the barcode and UMI sequences and the second contains cDNA. The FASTQs may be gzipped.

chemistry Literal['dropseq', '10xv2', '10xv3', 'indropsv3', 'slideseq2']

Sample-prep/sequencing chemistry used. The following chemistries are supported:

  • dropseq: Droplet-based scRNA-seq chemistry described in Macosko et al. 2015

  • 10xv2: 10x Genomics 3’ version 2

  • 10xv3: 10x Genomics 3’ version 3

  • indropsv3: inDrops version 3 by Zilionis et al. 2017

  • slideseq2: Slide-seq version 2

output_directory str

The directory to which the unmapped BAM is written. This directory must exist before this function is called.

name Optional[str] (default: None)

Name of the reads in the FASTQs. This name is set as the read group name for the reads in the output BAM, as well as the filename prefix of the output BAM. If not provided, a short random UUID is used as the read group name, but not as the filename prefix of the BAM.

n_threads int (default: 1)

Number of threads to use. Defaults to 1.
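The parameter notes above imply two checks a caller may want to make before invoking the function: the chemistry must be one of the listed values, and the output directory must already exist. The following is a minimal sketch of such a pre-flight step. The helper `prepare_conversion_args`, the constant `SUPPORTED_CHEMISTRIES`, and the FASTQ file names are hypothetical and not part of cassiopeia; the list of chemistries is copied from the documentation above.

```python
import os
import tempfile

# Chemistries listed in the parameter documentation above (copied by hand;
# this tuple is an assumption of the sketch, not a cassiopeia constant).
SUPPORTED_CHEMISTRIES = ("dropseq", "10xv2", "10xv3", "indropsv3", "slideseq2")


def prepare_conversion_args(fastq_fps, chemistry, output_directory,
                            name=None, n_threads=1):
    """Hypothetical helper: validate inputs and return keyword arguments."""
    if chemistry not in SUPPORTED_CHEMISTRIES:
        raise ValueError(
            f"Unsupported chemistry {chemistry!r}; "
            f"expected one of {SUPPORTED_CHEMISTRIES}"
        )
    # The documented function requires the directory to exist beforehand,
    # so create it here if it is missing.
    os.makedirs(output_directory, exist_ok=True)
    return {
        "fastq_fps": list(fastq_fps),
        "chemistry": chemistry,
        "output_directory": output_directory,
        "name": name,
        "n_threads": n_threads,
    }


# Example with placeholder FASTQ names (barcode/UMI read first, cDNA second).
demo_dir = os.path.join(tempfile.mkdtemp(), "unmapped")
kwargs = prepare_conversion_args(
    ["sample_R1.fastq.gz", "sample_R2.fastq.gz"], "10xv3", demo_dir,
    name="sample1", n_threads=4,
)
```

The resulting dictionary can then be unpacked into the call, for example `convert_fastqs_to_unmapped_bam(**kwargs)`. Passing `name` explicitly also gives the output BAM a predictable filename prefix.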

Return type:

str

Returns:

Path to written BAM

Raises:

PreprocessError if the provided chemistry is not one of the supported chemistries.
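Because the function returns the path of the written BAM, a caller can verify the result right after the call. The sketch below is illustrative only: `convert_and_check` and its `converter` argument are hypothetical names introduced here, and the deferred import assumes the package is installed and importable as `cassiopeia`.

```python
import os


def convert_and_check(fastq_fps, chemistry, output_directory,
                      name=None, n_threads=1, converter=None):
    """Hypothetical wrapper: run the conversion and confirm the BAM exists."""
    if converter is None:
        # Deferred import so the wrapper can be defined without cassiopeia
        # installed; assumes the documented function lives at this path.
        import cassiopeia as cas
        converter = cas.pp.convert_fastqs_to_unmapped_bam

    # Arguments are passed exactly as in the documented signature.
    bam_fp = converter(fastq_fps, chemistry, output_directory,
                       name=name, n_threads=n_threads)

    # The documented return value is the path to the written BAM.
    if not os.path.isfile(bam_fp):
        raise FileNotFoundError(f"Expected BAM was not written: {bam_fp}")
    return bam_fp
```

The `converter` parameter exists only so that the wrapper can be exercised with a stand-in function; in normal use it is left as `None`. An unsupported chemistry is reported by the library itself through `PreprocessError`, which this wrapper lets propagate unchanged.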