src.utils package

Subpackages

Submodules

src.utils.reads_merger module

FAST(A|Q) Reads Merger.

This script consolidates paired-end FASTQ files from a specified directory, merging multiple files per sample where necessary and copying single files directly. It identifies samples using a provided pattern and processes R1 and R2 read files separately.

Usage:
python script_name.py -i <input> -o <output> -id <id_regex> -r1 <r1_pattern> -r2 <r2_pattern>

Options:
--path

Input directory containing FASTQ files.

--outpath

Output directory for merged FASTQ files.

--id_pattern

Regex pattern to extract sample IDs from filenames.

--r1_pattern

Regex pattern for R1 read files.

--r2_pattern

Regex pattern for R2 read files.

Example

python3.13 read_merger.py --path <workdir>/input_dir --outpath <workdir>/output_dir --id_pattern '<sample_base>_[\d]{4}_([^_]*){1,2}' --r1_pattern '.*R1.*.fastq.gz' --r2_pattern '.*R2.*.fastq.gz'

src.utils.reads_merger.parse_args() → Namespace

Parses command-line arguments for input and output paths, and patterns.

Returns:

Parsed arguments namespace.

Return type:

Namespace
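A minimal sketch of how such a parser might be built with argparse, assuming only the five long options listed in the module description. The name parse_args_sketch, the choice of which options are required, and the absence of defaults are assumptions made for illustration; they are not taken from the module's source.

```python
import argparse


def parse_args_sketch(argv=None) -> argparse.Namespace:
    """Hypothetical stand-in for parse_args(); option names follow the docs."""
    parser = argparse.ArgumentParser(description="FAST(A|Q) Reads Merger (sketch)")
    # Marking these two as required is an assumption of this sketch.
    parser.add_argument("--path", required=True,
                        help="Input directory containing FASTQ files.")
    parser.add_argument("--outpath", required=True,
                        help="Output directory for merged FASTQ files.")
    # Pattern options are left optional here; the real defaults may differ.
    parser.add_argument("--id_pattern",
                        help="Regex pattern to extract sample IDs from filenames.")
    parser.add_argument("--r1_pattern", help="Regex pattern for R1 read files.")
    parser.add_argument("--r2_pattern", help="Regex pattern for R2 read files.")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)


# Passing an explicit list keeps the example independent of the real command line.
args = parse_args_sketch([
    "--path", "input_dir",
    "--outpath", "output_dir",
    "--r1_pattern", r".*R1.*\.fastq\.gz",
])
print(args.path, args.outpath, args.r1_pattern)
```

Accepting an optional argv list is a design choice of the sketch that makes the parser easy to exercise from tests.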

src.utils.reads_merger.merge_fastq(path: PathLike, outpath: PathLike, id_pattern: str = '(?:russco_[\\d]{4}_(?:ffpe_cr|leu))', r1_pattern: str = '[^\\s]*R1[^\\s]*(?:\\.fa(?:st(?:a|q)))(?:\\.(?:gz|bz|bgz))?', r2_pattern: str = '[^\\s]*R2[^\\s]*(?:\\.fa(?:st(?:a|q)))(?:\\.(?:gz|bz|bgz))?')

Merges R1 and R2 FASTQ files per sample based on provided patterns.
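The per-sample merge described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the module's implementation: the function name merge_fastq_sketch, the output naming scheme (<sample_id>_<mate><extension>), the use of re.search for the ID and re.fullmatch for the read patterns, and the return value are all hypothetical.

```python
import gzip
import re
import shutil
import tempfile
from collections import defaultdict
from pathlib import Path


def merge_fastq_sketch(path, outpath, id_pattern, r1_pattern, r2_pattern):
    """Hypothetical sketch: concatenate R1 and R2 files per sample."""
    src, dst = Path(path), Path(outpath)
    dst.mkdir(parents=True, exist_ok=True)
    id_re = re.compile(id_pattern)
    mate_res = {"R1": re.compile(r1_pattern), "R2": re.compile(r2_pattern)}

    # Map (sample_id, mate) to its input files, in sorted filename order.
    groups = defaultdict(list)
    for f in sorted(src.iterdir()):
        m = id_re.search(f.name)
        if not f.is_file() or m is None:
            continue  # not a file, or no sample ID in the name
        for mate, mate_re in mate_res.items():
            if mate_re.fullmatch(f.name):
                groups[(m.group(0), mate)].append(f)
                break

    written = []
    for (sample_id, mate), files in sorted(groups.items()):
        # Reuse the extension of the first input, e.g. ".fastq.gz".
        ext = re.search(r"\.(?:fasta|fastq)(?:\.\w+)?$", files[0].name)
        target = dst / f"{sample_id}_{mate}{ext.group(0) if ext else ''}"
        if len(files) == 1:
            shutil.copyfile(files[0], target)  # single file: copy directly
        else:
            # Byte-wise concatenation works for plain text and for gzip,
            # since concatenated gzip members form a valid gzip stream.
            with open(target, "wb") as out:
                for f in files:
                    with open(f, "rb") as part:
                        shutil.copyfileobj(part, out)
        written.append(target)
    return written


# Demo on throwaway data: one sample sequenced on two lanes.
with tempfile.TemporaryDirectory() as tmp:
    indir, outdir = Path(tmp, "in"), Path(tmp, "out")
    indir.mkdir()
    for lane in ("L001", "L002"):
        for mate in ("R1", "R2"):
            with gzip.open(indir / f"s1_{lane}_{mate}.fastq.gz", "wt") as fh:
                fh.write(f"@read_{lane}\nACGT\n+\nIIII\n")
    written = merge_fastq_sketch(indir, outdir, r"s\d+",
                                 r".*R1.*\.fastq\.gz", r".*R2.*\.fastq\.gz")
    merged_names = [p.name for p in written]
    with gzip.open(written[0], "rt") as fh:
        merged_r1_lines = fh.read().splitlines()
```

Sorting the inputs before concatenation keeps R1 and R2 in the same lane order, which matters because paired reads are matched by position in the two files.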

src.utils.report_merger module

src.utils.util module

This module provides utility functions for handling configuration-based data generation and other helper operations.

Currently, it includes:
  • reg_tuple_generator:

    Generates a tuple containing a region identifier and the corresponding mpileup file path based on a given configuration and chromosome interval.

Dependencies:
  • src.configurator.Configurator:

    A configuration handler that provides configuration data.

Usage:

Import functions from this module to facilitate region and file path generation based on configured settings.

src.utils.util.reg_tuple_generator(configurator: Configurator, chr_interval: str) → tuple[str, str]

Generate a tuple (region, mpileup_filepath) based on the configuration.

src.utils.util.depth_filter(filepath: PathLike, depth: int = 10, logger: Logger | None = None) → None

Filters lines in an mpileup file based on the depth value in the fourth field.

Reads the specified file line by line and writes only those lines where the integer value in the fourth field (index 3) is greater than or equal to the specified depth threshold. The original file is then atomically replaced with the filtered content.

Parameters:
  • filepath (PathLike[AnyStr]) – Path to the input file to be filtered.

  • depth (int, optional) – Minimum depth value to retain lines. Defaults to 10.

  • logger (Optional[logging.Logger], optional) – Logger instance for warnings and critical messages. If None, messages are printed to standard output.

Raises:

Example

depth_filter('data.txt', depth=15)
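The filter-then-replace behaviour described above might look roughly like the sketch below. It is an illustration under assumptions rather than the module's code: the name depth_filter_sketch, the tab-separated field splitting, and the handling of malformed lines (warn and skip) are hypothetical. The atomic swap relies on os.replace with a temporary file created in the same directory as the target.

```python
import os
import tempfile


def depth_filter_sketch(filepath, depth=10, logger=None):
    """Hypothetical sketch: keep lines whose fourth field is >= depth."""
    # Fall back to print when no logger is supplied, as the docs describe.
    warn = logger.warning if logger is not None else print
    directory = os.path.dirname(os.path.abspath(filepath))
    # The temporary file lives next to the target so that os.replace
    # stays on one filesystem and swaps the files atomically.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as out, open(filepath) as src:
            for lineno, line in enumerate(src, start=1):
                fields = line.rstrip("\n").split("\t")
                try:
                    keep = int(fields[3]) >= depth
                except (IndexError, ValueError):
                    # Assumption: malformed lines are reported and dropped.
                    warn(f"Skipping malformed line {lineno} in {filepath}")
                    continue
                if keep:
                    out.write(line)
        os.replace(tmp_path, filepath)
    except BaseException:
        os.unlink(tmp_path)  # leave the original file untouched on failure
        raise


# Demo on a throwaway three-line mpileup-like file.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.mpileup")
    with open(demo, "w") as fh:
        fh.write("chr1\t100\tA\t12\t............\tIIIIIIIIIIII\n")
        fh.write("chr1\t101\tC\t3\t...\tIII\n")
        fh.write("chr1\t102\tG\t15\t...............\tIIIIIIIIIIIIIII\n")
    depth_filter_sketch(demo, depth=10)
    with open(demo) as fh:
        kept_positions = [line.split("\t")[1] for line in fh]
    print(kept_positions)
```

Writing to a sibling temporary file means a crash midway leaves the original input intact instead of a half-written file.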

Module contents