src.utils.report_agregator package

Submodules

src.utils.report_agregator.annotation_data_container module

Module containing the AnnotationDataContainer class. This class encapsulates data for a single annotation entry from an annotation section (ANN), crucial for analyzing variant impact and location. It inherits from IReportDataContainer.

class src.utils.report_aggregator.annotation_data_container.AnnotationDataContainer(allele: str, annotation: str, annotation_impact: str, gene_name: str, gene_id: str, mutation_type: str, mutation_id: str, transcript_biotype: str, exon: str, hgvs_cds: str, hgvs_protein: str, c_dna: str, cds: str, aminoacid: str, distance: str, info: str)

Bases: IReportDataContainer

The annotation section (ANN) holds a set of annotations, each split into the 16 fields listed below:

Allele

Annotation

Annotation_Impact

Gene_Name

Gene_ID

Feature_Type

Feature_ID

Transcript_BioType

Rank

HGVS.c

HGVS.p

cDNA.pos / cDNA.length

CDS.pos / CDS.length

AA.pos / AA.length

Distance

ERRORS / WARNINGS / INFO

Distance to feature

All items in this field are optional, so the field may be empty.

* Up/Downstream: distance to first / last codon
* Intergenic: distance to the closest gene
* Distance to the closest intron boundary in exon (+/- up/downstream). If same, use positive number.
* Distance to the closest exon boundary in intron (+/- up/downstream)
* Distance to first base in MOTIF
* Distance to first base in miRNA
* Distance to exon-intron boundary in splice_site or splice_region
* ChipSeq peak: distance to summit (or peak center)
* Histone mark / Histone state: distance to summit (or peak center)

allele: str

annotation: str

annotation_impact: str

gene_name: str

gene_id: str

mutation_type: str

mutation_id: str

transcript_biotype: str

exon: str

hgvs_cds: str

hgvs_protein: str

c_dna: str

cds: str

aminoacid: str

distance: str

info: str

to_dict()

Convert the dataclass instance to a dictionary.

Parameters: self – The instance of the data container.
Returns: A dictionary with attribute names as keys and attribute values as values.
Return type: dict
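
As a rough illustration of how one ANN entry could map onto the 16 fields above, the sketch below splits a pipe-separated entry and converts it to a dictionary. `AnnotationEntry` and `parse_ann_entry` are hypothetical stand-ins that only mirror the documented field names; they are not the project's code. The comments pairing attributes with ANN columns follow the order of the constructor signature and are an assumption, and the sample entry is invented.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class AnnotationEntry:
    """Hypothetical stand-in mirroring the documented attribute names."""

    allele: str
    annotation: str
    annotation_impact: str
    gene_name: str
    gene_id: str
    mutation_type: str       # assumed to hold Feature_Type
    mutation_id: str         # assumed to hold Feature_ID
    transcript_biotype: str
    exon: str                # assumed to hold Rank
    hgvs_cds: str            # assumed to hold HGVS.c
    hgvs_protein: str        # assumed to hold HGVS.p
    c_dna: str
    cds: str
    aminoacid: str
    distance: str
    info: str

    def to_dict(self) -> dict:
        # asdict() maps attribute names to attribute values.
        return asdict(self)


def parse_ann_entry(entry: str) -> AnnotationEntry:
    """Split one pipe-separated ANN entry into its fields."""
    parts = entry.split("|")
    expected = len(fields(AnnotationEntry))
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}")
    return AnnotationEntry(*parts)


# Invented sample entry; the last two fields (distance, info) are empty.
sample = (
    "A|missense_variant|MODERATE|GENE1|ID1|transcript|TX1|protein_coding|"
    "2/5|c.100G>A|p.Val34Ile|150/2000|100/1500|34/499||"
)
entry = parse_ann_entry(sample)
print(entry.to_dict()["annotation_impact"])
```

Because every item in the distance field is optional, an empty string is kept as-is here rather than being converted to a number.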

src.utils.report_agregator.i_report_data_container module

Defines the IReportDataContainer interface, which provides methods for serializing and deserializing data container instances, along with a string representation method.

class src.utils.report_aggregator.i_report_data_container.IReportDataContainer

Bases: ABC

Interface for report data containers, providing utility methods for converting to dictionary, creating instances from lists, and generating string representations.

classmethod to_dict(self)

Convert the dataclass instance to a dictionary.

Parameters: self – The instance of the data container.
Returns: A dictionary with attribute names as keys and attribute values as values.
Return type: dict

classmethod from_list(data_list)

Create an instance of the class from a list of values.

Parameters: data_list (list) – List of values corresponding to the class fields.
Returns: An instance of the class initialized with the provided values.
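
A minimal sketch of how such an interface might look, assuming every concrete container is a dataclass. `ReportDataContainerSketch` and `GeneHit` are hypothetical names used only for illustration, and the method bodies are one plausible implementation rather than the project's actual code. The sketch treats `to_dict` as an instance method, which is also an assumption.

```python
from abc import ABC
from dataclasses import asdict, dataclass, fields


class ReportDataContainerSketch(ABC):
    """Hypothetical stand-in; subclasses are assumed to be dataclasses."""

    def to_dict(self) -> dict:
        # Attribute names become keys, attribute values become values.
        return asdict(self)

    @classmethod
    def from_list(cls, data_list: list):
        # Pair the values with the dataclass fields in declaration order.
        names = [f.name for f in fields(cls)]
        if len(data_list) != len(names):
            raise ValueError(
                f"{cls.__name__} expects {len(names)} values, got {len(data_list)}"
            )
        return cls(**dict(zip(names, data_list)))

    def __str__(self) -> str:
        return ", ".join(f"{k}={v}" for k, v in self.to_dict().items())


@dataclass
class GeneHit(ReportDataContainerSketch):
    """Tiny example container with two invented fields."""

    gene_name: str
    impact: str


hit = GeneHit.from_list(["GENE1", "HIGH"])
print(hit)
print(hit.to_dict())
```

Checking the list length up front gives a clearer error than the `TypeError` a bare `cls(*data_list)` would raise when a row has too few or too many columns.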

src.utils.report_agregator.report_agregator module

src.utils.report_agregator.variant_data_container module

This module defines the VariantDataContainer class, which encapsulates data related to a genomic variant. It includes information about the variant’s location, alleles, associated genes, and potentially functional consequences. This class is designed for structured representation of variant data, facilitating analysis and reporting.

VariantDataContainer is a dataclass that inherits from IReportDataContainer (defined in src.utils.report_aggregator.i_report_data_container) and provides methods for accessing and manipulating variant data.

The module also imports src.utils.report_aggregator.annotation_data_container so that annotation data can be included where needed.

class src.utils.report_aggregator.variant_data_container.ClinvarVariantAnnotationContainer(allele_id: str, disease_name: str, disease_database: str, review_status: str, clinical_sign: str, onco_disease_name: str, onco_disease_database: str, onco_review_status: str, oncogenicity_factor: str, somatic_clinical_impact_disease_name: str, somatic_clinical_impact_disease_database: str, somatic_clinical_impact_review_status: str, somatic_clinical_impact: str)

Bases: IReportDataContainer

Description of the ClinVar database headers:

SCI:

Aggregate somatic clinical impact for this single variant

SCIDN:

ClinVar’s preferred disease name for the concept specified by disease identifiers in SCIDISDB

SCIDISDB:

Tag-value pairs of disease database name and identifier submitted for somatic clinical impact classifications, e.g. MedGen:NNNNNN

SCIREVSTAT:

ClinVar review status of somatic clinical impact for the Variation ID

ONC:

Aggregate oncogenicity classification for the variant

ONCREVSTAT:

ClinVar review status of oncogenicity classification for the Variation ID

ONCCONF:

Conflicting oncogenicity classifications for the variant

allele_id: str

disease_name: str

disease_database: str

review_status: str

clinical_sign: str

onco_disease_name: str

onco_disease_database: str

onco_review_status: str

oncogenicity_factor: str

somatic_clinical_impact_disease_name: str

somatic_clinical_impact_disease_database: str

somatic_clinical_impact_review_status: str

somatic_clinical_impact: str
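
The headers described above arrive as `KEY=value` pairs in a semicolon-separated INFO string. The sketch below shows one way to pull them out and rename them to the attribute names of this container. The tag-to-attribute mapping is an assumption inferred from the names alone, `extract_clinvar_fields` is a hypothetical helper, and the sample values are invented. ONCCONF is left out because no attribute obviously corresponds to it.

```python
# Assumed mapping from ClinVar INFO tags to container attribute names.
TAG_TO_ATTRIBUTE = {
    "SCI": "somatic_clinical_impact",
    "SCIDN": "somatic_clinical_impact_disease_name",
    "SCIDISDB": "somatic_clinical_impact_disease_database",
    "SCIREVSTAT": "somatic_clinical_impact_review_status",
    "ONC": "oncogenicity_factor",
    "ONCREVSTAT": "onco_review_status",
}


def extract_clinvar_fields(info: str, missing: str = ".") -> dict:
    """Return the mapped attributes found in a semicolon-separated INFO string."""
    pairs = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        if sep:  # skip flag-style entries that carry no value
            pairs[key] = value
    # Tags absent from the INFO string fall back to the placeholder.
    return {attr: pairs.get(tag, missing) for tag, attr in TAG_TO_ATTRIBUTE.items()}


# Invented sample INFO string.
result = extract_clinvar_fields("ONC=Oncogenic;ONCREVSTAT=reviewed;SCI=Tier_I")
print(result["oncogenicity_factor"])
```

Filling absent tags with a placeholder keeps every attribute a string, which matches the all-`str` field types of the container.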

class src.utils.report_aggregator.variant_data_container.VariantDataContainer(chromosome: str, start: str, end: str, reference: str, alternate: str, gene_function: str, gene_name: str, gene_detail: str, exonic_function: str, aminoacid_change: str, clinvar: ClinvarVariantAnnotationContainer, one_thousand_genomics: str, other_info: str)

Bases: IReportDataContainer

Represents a container for variant data, inheriting from IReportDataContainer.

This class encapsulates the necessary information about a variant, including its genomic location, reference and alternate alleles, associated gene information, and functional effects. It’s designed to be used within a reporting system and can likely hold additional attributes.

chromosome

The chromosome the variant is located on.

Type: str

start

The start position of the variant.

Type: str

end

The end position of the variant.

Type: str

reference

The reference allele.

Type: str

alternate

The alternate allele.

Type: str

gene_function

Functional impact on the gene.

Type: str

gene_name

The name of the gene affected.

Type: str

gene_detail

Additional details about the gene.

Type: str

exonic_function

Impact on the exonic region.

Type: str

aminoacid_change

Description of any amino acid changes.

Type: str

clinvar

Dataclass holding the ClinVar annotation fields.

Type: ClinvarVariantAnnotationContainer

one_thousand_genomics

Annotation from 1K_Genomics.

Type: str

other_info

The part of the annotation file row that is not part of the first annotation section.

Type: str

chromosome: str

start: str

end: str

reference: str

alternate: str

gene_function: str

gene_name: str

gene_detail: str

exonic_function: str

aminoacid_change: str

clinvar: ClinvarVariantAnnotationContainer

one_thousand_genomics: str

other_info: str
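
Since `clinvar` is itself a dataclass, converting a variant to a dictionary yields a nested structure. The sketch below uses trimmed, hypothetical stand-ins (`ClinvarSketch`, `VariantSketch`) with only a few of the documented fields, plus an assumed helper `flatten_row` that prefixes the nested keys so the result fits a single report row. All values are invented.

```python
from dataclasses import asdict, dataclass


@dataclass
class ClinvarSketch:
    """Trimmed stand-in for the ClinVar annotation container."""

    allele_id: str
    clinical_sign: str


@dataclass
class VariantSketch:
    """Trimmed stand-in for the variant container."""

    chromosome: str
    start: str
    end: str
    reference: str
    alternate: str
    gene_name: str
    clinvar: ClinvarSketch


def flatten_row(variant: VariantSketch) -> dict:
    """Flatten the nested clinvar dict into prefixed top-level keys."""
    row = asdict(variant)  # asdict() recurses into nested dataclasses
    nested = row.pop("clinvar")
    row.update({f"clinvar_{key}": value for key, value in nested.items()})
    return row


variant = VariantSketch(
    chromosome="chr1",
    start="12345",
    end="12345",
    reference="G",
    alternate="A",
    gene_name="GENE1",
    clinvar=ClinvarSketch(allele_id="1", clinical_sign="Pathogenic"),
)
row = asdict(variant)
flat = flatten_row(variant)
print(flat)
```

The nested form suits JSON output, while the flattened form is easier to write to a tabular report.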
