scReadSim.Utility.scATAC_CreateFeatureSets
- scReadSim.Utility.scATAC_CreateFeatureSets(INPUT_bamfile, samtools_directory, bedtools_directory, outdirectory, genome_size_file, peak_mode='macs3', macs3_directory=None, INPUT_peakfile=None, INPUT_nonpeakfile=None, OUTPUT_peakfile=None, superset_peakfile=None)
Create the foreground and background feature sets for the input scATAC-seq BAM file.
- Parameters:
INPUT_bamfile (str) – Path to the input BAM file.
samtools_directory (str) – Directory of software samtools.
bedtools_directory (str) – Directory of software bedtools.
outdirectory (str) – Output directory.
genome_size_file (str) – Path to the genome sizes file. The file should be a tab-delimited text file with two columns: the first column gives the chromosome name and the second gives its size.
peak_mode (str (default: macs3)) – Mode for generating trustworthy peaks and non-peaks. Must be one of “macs3”, “user”, or “superset”.
macs3_directory (str (default: None)) – Path to the MACS3 software. Must be specified under peak_mode “macs3” or “superset”, i.e. when INPUT_peakfile and INPUT_nonpeakfile are None.
INPUT_peakfile (str (default: None)) – Path to the user-specified input peak file. Must be specified under peak_mode “user”.
INPUT_nonpeakfile (str (default: None)) – Path to the user-specified input non-peak file. Must be specified under peak_mode “user”.
superset_peakfile (str (default: None)) – Path to a file containing a superset of potential open chromatin regions, drawn from sources such as the ENCODE cCRE (candidate cis-regulatory elements) collection. Must be specified under peak_mode “superset”.
OUTPUT_peakfile (str (default: None)) – Path to the user-specified output peak file. Synthetic scATAC-seq reads will be generated using the peaks in OUTPUT_peakfile as ground truth. Note that OUTPUT_peakfile does not determine the names of the feature files generated by scATAC_CreateFeatureSets.
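The per-mode requirements in the parameter list above can be checked before calling the function. The following is a minimal sketch, not part of scReadSim: the helpers write_genome_size_file and missing_peak_mode_args are hypothetical, and the chromosome sizes and file paths are placeholders chosen for illustration only.

```python
import os
import tempfile


def write_genome_size_file(path, chrom_sizes):
    """Write a two-column, tab-delimited genome sizes file (name, size)."""
    with open(path, "w") as handle:
        for chrom, size in chrom_sizes.items():
            handle.write(f"{chrom}\t{size}\n")


def missing_peak_mode_args(peak_mode="macs3", macs3_directory=None,
                           INPUT_peakfile=None, INPUT_nonpeakfile=None,
                           superset_peakfile=None):
    """Hypothetical helper: list the arguments that the documentation says
    must be specified for the given peak_mode but are still None."""
    required = {
        "macs3": {"macs3_directory": macs3_directory},
        "user": {"INPUT_peakfile": INPUT_peakfile,
                 "INPUT_nonpeakfile": INPUT_nonpeakfile},
        "superset": {"macs3_directory": macs3_directory,
                     "superset_peakfile": superset_peakfile},
    }
    if peak_mode not in required:
        raise ValueError(
            f"peak_mode must be one of {sorted(required)}, got {peak_mode!r}")
    return [name for name, value in required[peak_mode].items()
            if value is None]


with tempfile.TemporaryDirectory() as outdirectory:
    # Placeholder chromosome sizes, not taken from any real genome build.
    genome_size_file = os.path.join(outdirectory, "genome.sizes.txt")
    write_genome_size_file(genome_size_file, {"chr1": 1000, "chr2": 800})

    # Placeholder paths for peak_mode "user"; replace with real files.
    mode_args = dict(peak_mode="user",
                     INPUT_peakfile="peaks.bed",
                     INPUT_nonpeakfile="nonpeaks.bed")
    missing = missing_peak_mode_args(**mode_args)
    print("missing arguments:", missing)
    # With nothing missing, mode_args could be passed, together with
    # INPUT_bamfile, samtools_directory, bedtools_directory, outdirectory
    # and genome_size_file, to scReadSim.Utility.scATAC_CreateFeatureSets.
```

Checking the arguments up front is only a convenience for catching a mismatched peak_mode early; the function's own behaviour on missing arguments is not described in this section.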