src.utils package
Subpackages
- src.utils.demultiplexor_adapter package
- src.utils.table_manager package
- Submodules
- src.utils.table_manager.csv_table_manager module
- src.utils.table_manager.excel_table_manager module
- src.utils.table_manager.i_table_manager module
- src.utils.table_manager.sample_sheet_builder module
- src.utils.table_manager.sample_sheet_container module
- src.utils.table_manager.table_manager_factory module
- src.utils.table_manager.xml_table_manager module
- Module contents
Submodules
src.utils.reads_merger module
FAST(A|Q) Reads Merger.
This script consolidates paired-end FASTQ files from a specified directory, merging multiple files per sample where necessary and copying single files directly. It identifies samples by a provided pattern and processes R1 and R2 read files separately.
- Usage:
- python script_name.py -i <input> -o <output>
-id <id_regex> -r1 <r1_pattern> -r2 <r2_pattern>
- Options:
- --path
Input directory containing FASTQ files.
- --outpath
Output directory for merged FASTQ files.
- --id_pattern
Regex pattern to extract sample IDs from filenames.
- --r1_pattern
Regex pattern for R1 read files.
- --r2_pattern
Regex pattern for R2 read files.
Example
python3.13 read_merger.py --path <workdir>/input_dir --outpath <workdir>/output_dir --id_pattern '<sample_base>_[\d]{4}_([^_]*){1,2}' --r1_pattern '.*R1.*.fastq.gz' --r2_pattern '.*R2.*.fastq.gz'
- src.utils.reads_merger.parse_args() → Namespace
Parses command-line arguments for input and output paths, and patterns.
- Returns:
Parsed arguments namespace.
- Return type:
Namespace
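A minimal sketch of how the documented options could be declared with `argparse`. The helper name `build_parser`, which options are required, and the default values are assumptions for illustration; the actual `parse_args` implementation may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options listed in the module docstring."""
    parser = argparse.ArgumentParser(description="FAST(A|Q) Reads Merger (illustrative sketch)")
    parser.add_argument("--path", required=True,
                        help="Input directory containing FASTQ files.")
    parser.add_argument("--outpath", required=True,
                        help="Output directory for merged FASTQ files.")
    parser.add_argument("--id_pattern", required=True,
                        help="Regex pattern to extract sample IDs from filenames.")
    # Defaults below are assumptions, not the module's real defaults.
    parser.add_argument("--r1_pattern", default=r".*R1.*\.fastq\.gz",
                        help="Regex pattern for R1 read files.")
    parser.add_argument("--r2_pattern", default=r".*R2.*\.fastq\.gz",
                        help="Regex pattern for R2 read files.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--path", "input_dir", "--outpath", "output_dir", "--id_pattern", r"sample_[\d]{4}"]
)
print(args.path, args.outpath, args.id_pattern)
```

The result is an `argparse.Namespace`, so each option is read as an attribute, for example `args.r1_pattern`.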
- src.utils.reads_merger.merge_fastq(path: PathLike, outpath: PathLike, id_pattern: str = '(?:russco_[\\d]{4}_(?:ffpe_cr|leu))', r1_pattern: str = '[^\\s]*R1[^\\s]*(?:\\.fa(?:st(?:a|q)))(?:\\.(?:gz|bz|bgz))?', r2_pattern: str = '[^\\s]*R2[^\\s]*(?:\\.fa(?:st(?:a|q)))(?:\\.(?:gz|bz|bgz))?')
Merges R1 and R2 FASTQ files per sample based on provided patterns.
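The merge-or-copy behaviour described for this module might look roughly like the sketch below. It is not the module's implementation. The function name `merge_fastq_sketch`, the output naming scheme `<sample_id>_<read><extension>`, the use of `fullmatch` for the read patterns, and the returned mapping are all assumptions. Files are concatenated byte-wise, which is valid for plain text and for gzip (a multi-member gzip stream is still readable); mixing compression formats within one sample is not handled.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

COMPRESSED = {".gz", ".bz", ".bgz"}


def _extension(name: str) -> str:
    """Return e.g. '.fastq.gz' or '.fastq' (hypothetical helper)."""
    suffixes = Path(name).suffixes
    keep = 2 if suffixes and suffixes[-1] in COMPRESSED else 1
    return "".join(suffixes[-keep:])


def merge_fastq_sketch(path, outpath, id_pattern, r1_pattern, r2_pattern):
    src, dst = Path(path), Path(outpath)
    dst.mkdir(parents=True, exist_ok=True)
    id_re = re.compile(id_pattern)
    read_res = {"R1": re.compile(r1_pattern), "R2": re.compile(r2_pattern)}

    # Group input files by (sample ID, read direction); sorted for a stable merge order.
    groups = defaultdict(list)
    for file in sorted(src.iterdir()):
        if not file.is_file():
            continue
        sample = id_re.search(file.name)
        if sample is None:
            continue  # filename carries no recognisable sample ID
        for read, read_re in read_res.items():
            if read_re.fullmatch(file.name):
                groups[(sample.group(0), read)].append(file)
                break

    written = {}
    for (sample_id, read), files in groups.items():
        target = dst / f"{sample_id}_{read}{_extension(files[0].name)}"
        if len(files) == 1:
            shutil.copyfile(files[0], target)  # single file: copy directly
        else:
            with open(target, "wb") as out:  # several files: concatenate in order
                for file in files:
                    with open(file, "rb") as chunk:
                        shutil.copyfileobj(chunk, out)
        written[(sample_id, read)] = target
    return written
```

R1 and R2 files are grouped under separate keys, so the two read directions are never mixed within one output file.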
src.utils.report_merger module
src.utils.util module
This module provides utility functions for handling configuration-based data generation and other helper operations.
- Currently, it includes:
- reg_tuple_generator:
Generates a tuple containing a region identifier and the corresponding mpileup file path based on a given configuration and chromosome interval.
- Dependencies:
- src.configurator.Configurator:
A configuration handler that provides configuration data.
- Usage:
Import functions from this module to facilitate region and file path generation based on configured settings.
- src.utils.util.reg_tuple_generator(configurator: Configurator, chr_interval: str) → tuple[str, str]
Generate a tuple (region, mpileup_filepath) based on the configuration.
- src.utils.util.depth_filter(filepath: PathLike, depth: int = 10, logger: Logger | None = None) → None
Filters lines in an mpileup file based on the depth value in the fourth field.
Reads the specified file line by line and writes only those lines where the integer value in the fourth field (index 3) is greater than or equal to the specified 'depth' threshold. The original file is atomically replaced with the filtered content.
- Parameters:
filepath (PathLike[AnyStr]) – Path to the input file to be filtered.
depth (int, optional) – Minimum depth value to retain lines. Defaults to 10.
logger (Optional[logging.Logger], optional) – Logger instance for warnings and critical messages. If None, messages are printed to standard output.
- Raises:
FileNotFoundError – If the input file does not exist.
PermissionError – If there are insufficient permissions to read/write the file.
SystemError, IOError, OSError – For other I/O related errors.
Example
depth_filter('data.txt', depth=15)
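The filter-then-replace behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the name `depth_filter_sketch` is hypothetical, fields are assumed to be tab-separated (as in samtools mpileup output), malformed lines are silently dropped, and the logger handling is omitted. Atomic replacement is approximated by writing to a temporary file in the same directory and swapping it in with `os.replace`.

```python
import os
import tempfile
from pathlib import Path


def depth_filter_sketch(filepath, depth=10):
    """Keep only lines whose fourth tab-separated field is an integer >= depth."""
    target = Path(filepath)
    # The temporary file lives next to the target so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp, open(target, "r") as src:
            for line in src:
                fields = line.rstrip("\n").split("\t")
                # Assumption: lines with a missing or non-numeric depth are skipped.
                if len(fields) > 3 and fields[3].isdigit() and int(fields[3]) >= depth:
                    tmp.write(line)
        os.replace(tmp_name, target)  # swap the filtered file in as one step
    except BaseException:
        # Leave the original untouched and clean up the partial temporary file.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Writing to a sibling temporary file means a failure midway leaves the original mpileup file intact. Note that `tempfile.mkstemp` creates the file with owner-only permissions, so a real implementation may want to restore the original file mode.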