cassiopeia.pp.convert_fastqs_to_unmapped_bam

cassiopeia.pp.convert_fastqs_to_unmapped_bam(fastq_fps, chemistry, output_directory, name=None, n_threads=1)

Converts FASTQs into an unmapped BAM based on a chemistry.

This function converts a set of FASTQs into an unmapped BAM with appropriate BAM tags.

Parameters
fastq_fps : List[str]

List of paths to FASTQ files. Usually, this argument contains two FASTQs, where the first contains the barcode and UMI sequences and the second contains cDNA. The FASTQs may be gzipped.

chemistry : Literal['dropseq', '10xv2', '10xv3', 'indropsv3', 'slideseq2']

Sample-prep/sequencing chemistry used. The following chemistries are supported:

  • dropseq: Droplet-based scRNA-seq chemistry described in Macosko et al. 2015

  • 10xv2: 10x Genomics 3’ version 2

  • 10xv3: 10x Genomics 3’ version 3

  • indropsv3: inDrops version 3 by Zilionis et al. 2017

  • slideseq2: Slide-seq version 2

output_directory : str

The directory where the unmapped BAM will be written. This directory must exist before this function is called.

name : Optional[str] (default: None)

Name of the reads in the FASTQs. This name is set as the read group name for the reads in the output BAM, as well as the filename prefix of the output BAM. If not provided, a short random UUID is used as the read group name, but not as the filename prefix of the BAM.

n_threads : int (default: 1)

Number of threads to use. Defaults to 1.

Return type

str

Returns

Path to the written BAM.

Raises

PreprocessError if the provided chemistry is not supported.
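
The requirements described above (a supported chemistry, existing FASTQ files, and an output directory that already exists) can be checked before the conversion is started. The following is a minimal sketch, not part of cassiopeia: the helper names prepare_conversion and fastqs_to_bam are hypothetical, and the choice to create the output directory rather than fail is an assumption made for convenience.

```python
import os
from typing import List, Optional

# Chemistry names taken from the parameter description above.
SUPPORTED_CHEMISTRIES = ("dropseq", "10xv2", "10xv3", "indropsv3", "slideseq2")


def prepare_conversion(
    fastq_fps: List[str], chemistry: str, output_directory: str
) -> None:
    """Hypothetical pre-flight check; this helper is not part of cassiopeia."""
    if chemistry not in SUPPORTED_CHEMISTRIES:
        raise ValueError(f"Unsupported chemistry: {chemistry!r}")
    missing = [fp for fp in fastq_fps if not os.path.isfile(fp)]
    if missing:
        raise FileNotFoundError(f"FASTQ file(s) not found: {missing}")
    # The output directory must exist before the conversion is called,
    # so create it here if necessary (an assumption of this sketch).
    os.makedirs(output_directory, exist_ok=True)


def fastqs_to_bam(
    fastq_fps: List[str],
    chemistry: str,
    output_directory: str,
    name: Optional[str] = None,
    n_threads: int = 1,
) -> str:
    """Hypothetical wrapper: validate the inputs, then run the conversion."""
    prepare_conversion(fastq_fps, chemistry, output_directory)
    # Imported lazily so the check above works without cassiopeia installed.
    import cassiopeia as cas

    return cas.pp.convert_fastqs_to_unmapped_bam(
        fastq_fps, chemistry, output_directory, name=name, n_threads=n_threads
    )
```

With placeholder file names, a call might look like fastqs_to_bam(["R1.fastq.gz", "R2.fastq.gz"], "10xv3", "out", name="sample1", n_threads=4), where the first FASTQ holds the barcode and UMI sequences and the second holds the cDNA reads. The return value is the path reported by convert_fastqs_to_unmapped_bam.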