src package
Subpackages
- src.core package
- Subpackages
- src.core.analyzer package
- Submodules
- src.core.analyzer.adapter_trimmer module
- src.core.analyzer.amplicon_coverage_computer module
- src.core.analyzer.annotation_adapter module
- src.core.analyzer.bam_grouper module
- src.core.analyzer.bqsr_performer module
- src.core.analyzer.i_data_preparator module
- src.core.analyzer.i_variant_caller module
- src.core.analyzer.primer_cutter module
- src.core.analyzer.sequence_aligner module
- src.core.analyzer.variant_caller module
- src.core.analyzer.variant_caller_factory module
- Module contents
- src.core.configurator package
- src.core.analyzer package
- Submodules
- src.core.base module
- src.core.sample_data_container module
- SampleDataContainer
- SampleDataContainer.r1_source
- SampleDataContainer.r2_source
- SampleDataContainer.sid
- SampleDataContainer.processing_path
- SampleDataContainer.processing_logpath
- SampleDataContainer.report_path
- SampleDataContainer.target_regions
- SampleDataContainer.bam_filepath
- SampleDataContainer.vcf_filepath
- SampleDataContainer.parse_regions()
- src.core.sample_data_factory module
- Module contents
- Subpackages
- src.utils package
- Subpackages
- src.utils.demultiplexor_adapter package
- src.utils.table_manager package
- Submodules
- src.utils.table_manager.csv_table_manager module
- src.utils.table_manager.excel_table_manager module
- src.utils.table_manager.i_table_manager module
- src.utils.table_manager.sample_sheet_builder module
- src.utils.table_manager.sample_sheet_container module
- src.utils.table_manager.table_manager_factory module
- src.utils.table_manager.xml_table_manager module
- Module contents
- Submodules
- src.utils.reads_merger module
- src.utils.report_merger module
- src.utils.util module
- Module contents
- Subpackages
Submodules
src.analyzer module
This module defines the Analyzer class, responsible for orchestrating the entire genomic data analysis pipeline.
It manages data preparation, alignment, variant calling, annotation, and conversion steps, using various components like SequenceAligner, BamGrouper, BQSRPerformer, and variant callers. It also handles command execution, logging, and file management throughout the process.
Main Features:
- Initializes with a configurator and command caller.
- Prepares data by performing sequence alignment, grouping reads, and recalibration.
- Analyzes samples by calling variants, annotating them, converting formats, and generating reports.
- Manages paths, logs, and subprocess execution.
- class src.analyzer.Analyzer(configurator: Configurator, cmd_caller: CommandExecutor | callable = None)
Bases:
Protocol
Protocol class for designing your own analysis stage, which manages the entire genomic data analysis pipeline.
This class orchestrates the steps involved in processing sequencing data, including data preparation, read alignment, variant calling, annotation, and format conversion. It leverages various components such as SequenceAligner, BamGrouper, BQSRPerformer, variant callers, and annotation tools to perform the analysis systematically.
- configurator
Configuration object containing paths, parameters, and logger.
- Type:
Configurator
- cmd_caller
Function or object to execute system commands.
- Type:
Union[CommandExecutor, callable]
- prepare_data(sample)
Prepares raw sequencing data by trimming, aligning, and recalibrating.
- prepare_data(sample: SampleDataContainer) -> SampleDataContainer
Prepares raw sequencing data for analysis, including alignment, read grouping, and recalibration.
- Parameters:
sample (SampleDataContainer) – Sample with raw data and metadata.
- Returns:
Updated sample with paths to intermediate and final files.
- Return type:
SampleDataContainer
- analyze(sample: SampleDataContainer) -> SampleDataContainer
Performs variant calling, annotation, and format conversion.
- Parameters:
sample (SampleDataContainer) – Sample with aligned data.
- Returns:
Updated sample with annotated variants and reports.
- Return type:
SampleDataContainer
- _abc_impl = <_abc._abc_data object>
- _is_protocol = True
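As a rough illustration of how the protocol above might be implemented, the sketch below defines a minimal analyzer with the two documented stage methods. `SampleStub`, `EchoAnalyzer`, the command strings, and the file-naming scheme are hypothetical stand-ins and not part of the `src` package; a real implementation would receive a `Configurator` and operate on `SampleDataContainer`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SampleStub:
    """Hypothetical stand-in for SampleDataContainer (only a few fields)."""
    sid: str
    bam_filepath: Optional[str] = None
    vcf_filepath: Optional[str] = None


class EchoAnalyzer:
    """Hypothetical analyzer exposing the two documented stage methods."""

    def __init__(self, configurator: object,
                 cmd_caller: Optional[Callable[[str], int]] = None):
        self.configurator = configurator
        # Fall back to a no-op caller so the sketch never runs real commands.
        self.cmd_caller = cmd_caller or (lambda cmd: 0)

    def prepare_data(self, sample: SampleStub) -> SampleStub:
        # A real stage would align, group, and recalibrate reads here.
        self.cmd_caller(f"align {sample.sid}")
        sample.bam_filepath = f"{sample.sid}.bam"
        return sample

    def analyze(self, sample: SampleStub) -> SampleStub:
        # A real stage would call and annotate variants here.
        self.cmd_caller(f"call-variants {sample.bam_filepath}")
        sample.vcf_filepath = f"{sample.sid}.vcf"
        return sample


# Record the commands instead of executing them.
commands = []
analyzer = EchoAnalyzer(configurator=None,
                        cmd_caller=lambda cmd: commands.append(cmd) or 0)
sample = analyzer.analyze(analyzer.prepare_data(SampleStub(sid="S1")))
```

Passing the command caller in from outside, as the documented constructor does, keeps the stages testable without invoking external tools.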
- class src.analyzer.BRCAAnalyzer(configurator: Configurator, cmd_caller: CommandExecutor | callable = None)
Bases:
Analyzer
Main analyzer class that manages the entire genomic data analysis pipeline.
This class orchestrates the steps involved in processing sequencing data, including data preparation, read alignment, variant calling, annotation, and format conversion. It leverages various components such as SequenceAligner, BamGrouper, BQSRPerformer, variant callers, and annotation tools to perform the analysis systematically.
- configurator
Configuration object containing paths, parameters, and logger.
- Type:
Configurator
- cmd_caller
Function or object to execute system commands.
- Type:
Union[CommandExecutor, callable]
- prepare_data(sample)
Prepares raw sequencing data by trimming, aligning, and recalibrating.
- prepare_data(sample: SampleDataContainer) -> SampleDataContainer
Prepares raw sequencing data for analysis, including alignment, read grouping, and recalibration.
- Parameters:
sample (SampleDataContainer) – Sample with raw data and metadata.
- Returns:
Updated sample with paths to intermediate and final files.
- Return type:
SampleDataContainer
- analyze(sample: SampleDataContainer) -> SampleDataContainer
Performs variant calling, annotation, and format conversion.
- Parameters:
sample (SampleDataContainer) – Sample with aligned data.
- Returns:
Updated sample with annotated variants and reports.
- Return type:
SampleDataContainer
- _abc_impl = <_abc._abc_data object>
- _is_protocol = False
- class src.analyzer.TestAnalyzerhg19(configurator: Configurator, cmd_caller: CommandExecutor | callable = None)
Bases:
Analyzer…
- _abc_impl = <_abc._abc_data object>
- _is_protocol = False
- prepare_data(sample: SampleDataContainer) -> SampleDataContainer
Prepares raw sequencing data for analysis, including alignment, read grouping, and recalibration.
- Parameters:
sample (SampleDataContainer) – Sample with raw data and metadata.
- Returns:
Updated sample with paths to intermediate and final files.
- Return type:
SampleDataContainer
- analyze(sample: SampleDataContainer) -> SampleDataContainer
Performs variant calling, annotation, and format conversion.
- Parameters:
sample (SampleDataContainer) – Sample with aligned data.
- Returns:
Updated sample with annotated variants and reports.
- Return type:
SampleDataContainer
src.configurator module
This module contains the implementation of the Configurator class, which manages the configuration setup for an analysis pipeline.
It handles command-line argument parsing, logging configuration, output directory validation/creation, and loading configuration parameters from a configuration file.
The Configurator class is designed as a singleton to ensure a single point of configuration management throughout the application.
It utilizes other components such as PathValidator, LoggingConfigurator, and ConfigLoader to perform its tasks.
- Usage:
Instantiate the Configurator class to initialize configuration, logging, and output directories.
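The singleton behaviour described above can be sketched as follows. This is only one common way to build a singleton in Python (overriding `__new__`); the source does not say how `Configurator` implements it, and `SingletonSketch` with its `output_dir` attribute is a hypothetical example.

```python
class SingletonSketch:
    """Hypothetical singleton: every instantiation returns the same object."""

    _instance = None

    def __new__(cls, *args, **kwargs):
        # Create the instance only once; later calls reuse it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialized = False
        return cls._instance

    def __init__(self, output_dir: str = "out"):
        # __init__ runs on every call, so guard against re-initialization.
        if self._initialized:
            return
        self.output_dir = output_dir
        self._initialized = True


first = SingletonSketch("results")
second = SingletonSketch("ignored")  # returns the existing instance
```

With this pattern the arguments of the second call are ignored, so any code that instantiates the class later sees the configuration established by the first call.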
- class src.configurator.Configurator(*args, **kwargs)
Bases:
object
Singleton class responsible for managing configuration, logging, and output directory setup for the analysis pipeline.
This class handles parsing command-line arguments, setting up logging, validating and creating the output directory, and loading configuration parameters from a configuration file.
- args
Parsed command-line arguments.
- Type:
argparse.Namespace
- logger
Logger instance for logging messages.
- Type:
logging.Logger
- _load_configuration(config_path)
Loads configuration parameters.
- parse_configuration(base_config_filepath, target_section)
Loads specific configuration sections.
- static _parse_args() -> Namespace
Parses command-line arguments using argparse.
- Returns:
Parsed arguments object containing command-line parameters.
- Return type:
Namespace
- static _setup_logger(log_filename: PathLike, args: Namespace = None) -> tuple[PathLike, Logger]
Sets up the logging system with the specified log file.
- Parameters:
log_filename (PathLike[AnyStr]) – Path to the log file.
- Returns:
A tuple containing the absolute path to the log file and the configured Logger object.
- Return type:
tuple[PathLike, Logger]
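A minimal sketch of a logger setup that returns the same kind of tuple, assuming a plain file handler from the standard `logging` module. The function name `setup_logger`, the logger name, and the message format are illustrative assumptions; the actual `_setup_logger` may configure logging differently and also takes an `args` parameter that is not modelled here.

```python
import logging
import os
import tempfile


def setup_logger(log_filename: str,
                 name: str = "pipeline_sketch") -> tuple[str, logging.Logger]:
    """Attach a file handler and return (absolute log path, logger)."""
    log_path = os.path.abspath(log_filename)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return log_path, logger


# Write the log into a temporary directory for the demonstration.
tmp_dir = tempfile.mkdtemp()
log_path, logger = setup_logger(os.path.join(tmp_dir, "run.log"))
logger.info("pipeline started")
```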
- _setup_output_directory(output_dir: PathLike) -> PathLike
Validates the output directory path, creates it if it doesn’t exist, and handles existing directory conflicts based on user input.
- Parameters:
output_dir (PathLike[AnyStr]) – Path to the desired output directory.
- Returns:
Absolute path to the validated or created output directory.
- Return type:
PathLike[AnyStr]
- parse_configuration(base_config_filepath: PathLike | None, target_section: AnyStr = 'Pathes') -> dict
Loads a specific section of the configuration from a base configuration file.
- Parameters:
base_config_filepath (PathLike[AnyStr], optional) – Path to the base configuration file. Defaults to ‘src/conf/config.ini’ relative to current directory.
target_section (str, optional) – The section within the configuration file to load. Defaults to ‘Pathes’.
- Returns:
Dictionary of configuration parameters from the specified section.
- Return type:
dict
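Loading one section of an INI file into a dictionary might look like the sketch below, which uses the standard `configparser` module. Whether `parse_configuration` uses `configparser` internally is an assumption; `load_section` and the keys `reference` and `output` are hypothetical, and the sketch reads from a string rather than from `src/conf/config.ini`.

```python
import configparser


def load_section(config_text: str, target_section: str = "Pathes") -> dict:
    """Return one section of an INI document as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(target_section):
        raise KeyError(f"Section {target_section!r} not found")
    return dict(parser.items(target_section))


# Hypothetical configuration content for illustration only.
example = """
[Pathes]
reference = /data/ref/hg19.fa
output = /data/out
"""
paths = load_section(example)
```

Note that `configparser` lowercases option names by default, so keys in the returned dictionary are lowercase regardless of how they are written in the file.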
src.demultiplexor_adapter module
Module for demultiplexor adapter functionality.
src.dependency_handler module
This module provides the DependencyHandler class, which is responsible for managing dependencies, checking the existence of reference files or archives, extracting archives, verifying module installation, attempting to install missing modules via pip, and verifying or creating paths.
- Classes:
- DependencyHandler:
Singleton class to handle dependencies and reference files.
- Functions:
Various static methods for module loading, installation, path verification, and file extension resolution.
- Usage:
Instantiate the DependencyHandler singleton to perform dependency management tasks, such as checking reference files, installing modules, and verifying paths.
- class src.dependency_handler.DependencyHandler(*args, **kwargs)
Bases:
object
Singleton class to manage dependencies and reference files.
Provides methods to set loggers, check and resolve reference files (including archives), verify and install modules, and verify or create filesystem paths.
- set_logger(new_logger: Logger) -> None
Sets a new logger for the handler, replacing the current one.
- Parameters:
new_logger (logging.Logger) – The new logger to set.
- Raises:
RuntimeError – If the current logger is not set or if the new logger is None.
- check_reference(ref_filepath: PathLike, ref_dirpath: PathLike = '.') -> PathLike
Checks that a file or archive containing the reference sequence exists. If only an archive is present, it is extracted to the current reference directory.
Returns the path to the reference file if it exists; otherwise raises FileNotFoundError.
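The lookup-then-extract behaviour described above can be sketched as follows. The source does not state which archive formats are supported, so this example assumes a gzip archive named `<reference>.gz`; `check_reference_sketch` and the file names are hypothetical.

```python
import gzip
import os
import shutil
import tempfile


def check_reference_sketch(ref_filepath: str, ref_dirpath: str = ".") -> str:
    """Return the reference path, extracting '<ref>.gz' if only that exists."""
    if os.path.isfile(ref_filepath):
        return ref_filepath
    archive = ref_filepath + ".gz"  # assumed archive naming, not from the docs
    if os.path.isfile(archive):
        target = os.path.join(ref_dirpath, os.path.basename(ref_filepath))
        with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return target
    raise FileNotFoundError(f"No reference file or archive for {ref_filepath}")


# Demonstration: only the archive exists, so it gets extracted.
work_dir = tempfile.mkdtemp()
fasta = os.path.join(work_dir, "ref.fa")
with gzip.open(fasta + ".gz", "wb") as fh:
    fh.write(b">chr1\nACGT\n")
resolved = check_reference_sketch(fasta, work_dir)
```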
- static is_module_loaded(module_name: AnyStr) -> bool
Checks if a module is loaded in the current environment.
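One way such a check could be written is with `importlib.util.find_spec`, which reports whether a module can be found without importing it. This is an assumption about the intent of the method; the real implementation may instead inspect `sys.modules` or attempt an import, and `is_module_available` is a hypothetical name.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True if the import system can locate the module."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names with a missing parent or malformed names.
        return False
```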
- static try_to_install_module(module_name: AnyStr, logger: Logger | None = None) -> bool
Attempts to install a module via pip.
- Parameters:
module_name (str) – Name of the module to install.
logger (logging.Logger) – A logger for the function call.
- Returns:
True if installation succeeded, False otherwise.
- Return type:
bool
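A pip-based installation attempt is commonly written by invoking pip through the current interpreter, as in the sketch below. The helper name `try_to_install` and its injectable `runner` parameter are assumptions added so the example can be exercised without actually installing anything; they are not part of the documented signature.

```python
import subprocess
import sys
from typing import Callable, Sequence


def try_to_install(module_name: str,
                   runner: Callable[[Sequence[str]], int] = subprocess.call
                   ) -> bool:
    """Run 'python -m pip install <module>' and report success as a bool."""
    # Using sys.executable targets the interpreter that runs the pipeline.
    command = [sys.executable, "-m", "pip", "install", module_name]
    try:
        return runner(command) == 0
    except OSError:
        return False


# Demonstration with a fake runner that records the command and succeeds.
recorded = []
ok = try_to_install("example-package",
                    runner=lambda cmd: recorded.append(cmd) or 0)
```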
- static fetch_dependency(module_name: AnyStr, logger: Logger | None = None) -> None
Attempts to fetch a dependency module from pip, prompting the user if needed.
- Parameters:
module_name (str) – Name of the module to fetch.
logger (logging.Logger) – A logger for the function call.
- Exits:
Exits with code os.EX_SOFTWARE if installation fails.
Exits with code os.EX_OK if user chooses not to install.
Exits with code os.EX_USAGE if command is unrecognized.
- static verify_path(src: str, logger: Logger | None = None) -> bool
Checks the existence of the file or directory at the given path src.
If the path does not exist, creates the necessary directories and the file.
- Parameters:
src (str) – The path to the file or directory to check or create.
logger (logging.Logger) – A logger for the function call.
- Returns:
True if the path exists or was successfully created, otherwise False.
- Return type:
bool
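The check-or-create behaviour can be sketched with `os.makedirs` and an append-mode `open`. The name `verify_path_sketch` is hypothetical, logging is omitted, and the sketch always creates a file for a missing path, which may differ from how the real method treats directories.

```python
import os
import tempfile


def verify_path_sketch(src: str) -> bool:
    """Return True if src exists or could be created as an empty file."""
    if os.path.exists(src):
        return True
    try:
        parent = os.path.dirname(src)
        if parent:
            os.makedirs(parent, exist_ok=True)
        # Append mode creates the file without truncating anything.
        with open(src, "a"):
            pass
        return True
    except OSError:
        return False


# Demonstration inside a temporary directory.
base = tempfile.mkdtemp()
new_file = os.path.join(base, "logs", "run", "pipeline.log")
created = verify_path_sketch(new_file)
```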
- static resolve_file_path_by_extensions(base_name: PathLike, extension_list: list[str]) -> PathLike
Searches for the first existing file that matches the base name with any of the provided extensions.
- Parameters:
base_name (PathLike[AnyStr]) – Base name of the file to search for.
extension_list (list[str]) – Extensions to try with the base name.
- Returns:
The full path to the first existing file that matches the base name with one of the extensions.
- Return type:
PathLike[AnyStr]
- Raises:
FileNotFoundError – If no file matching the base name with any of the provided extensions is found.
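The extension search can be sketched as a simple loop over candidates. This example assumes that each extension includes its leading dot and is appended directly to the base name; the source does not specify this, and `resolve_by_extensions` and the file names are hypothetical.

```python
import os
import tempfile


def resolve_by_extensions(base_name: str, extension_list: list[str]) -> str:
    """Return the first existing path of the form base_name + extension."""
    for ext in extension_list:
        candidate = f"{base_name}{ext}"
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(
        f"No file found for {base_name} with extensions {extension_list}")


# Demonstration: only the '.fasta' variant exists on disk.
ref_dir = tempfile.mkdtemp()
base_path = os.path.join(ref_dir, "reference")
with open(base_path + ".fasta", "w") as fh:
    fh.write(">chr1\nACGT\n")
found = resolve_by_extensions(base_path, [".fa", ".fasta", ".fna"])
```

Because the list is scanned in order, callers can express a preference by placing the preferred extension first.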