src.utils.report_aggregator package

Submodules

src.utils.report_aggregator.annotation_data_container module

Module containing the AnnotationDataContainer class. This class encapsulates data for a single annotation entry from an annotation section (ANN), crucial for analyzing variant impact and location. It inherits from IReportDataContainer.

class src.utils.report_aggregator.annotation_data_container.AnnotationDataContainer(allele: str, annotation: str, annotation_impact: str, gene_name: str, gene_id: str, mutation_type: str, mutation_id: str, transcript_biotype: str, exon: str, hgvs_cds: str, hgvs_protein: str, c_dna: str, cds: str, aminoacid: str, distance: str, info: str)[source]

Bases: IReportDataContainer

Each entry in the annotation section (ANN) is divided into 16 fields, listed below:

Allele
Annotation
Annotation_Impact
Gene_Name
Gene_ID
Feature_Type
Feature_ID
Transcript_BioType
Rank
HGVS.c
HGVS.p
cDNA.pos / cDNA.length
CDS.pos / CDS.length
AA.pos / AA.length
Distance
ERRORS / WARNINGS / INFO

Distance to feature

All items in this field are optional, so the field may be empty.

  • Up/Downstream: distance to first / last codon

  • Intergenic: distance to the closest gene

  • Distance to the closest intron boundary in exon (+/- up/downstream). If the two distances are equal, use the positive number.

  • Distance to the closest exon boundary in intron (+/- up/downstream)

  • Distance to first base in MOTIF

  • Distance to first base in miRNA

  • Distance to exon-intron boundary in splice_site or splice_region

  • ChipSeq peak: distance to summit (or peak center)

  • Histone mark / Histone state: distance to summit (or peak center)


allele: str
annotation: str
annotation_impact: str
gene_name: str
gene_id: str
mutation_type: str
mutation_id: str
transcript_biotype: str
exon: str
hgvs_cds: str
hgvs_protein: str
c_dna: str
cds: str
aminoacid: str
distance: str
info: str
to_dict()[source]

Convert the dataclass instance to a dictionary.

Parameters:

self – The instance of the data container.

Returns:

A dictionary mapping attribute names to attribute values.

Return type:

dict
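
A minimal sketch of how one ANN entry could be mapped onto these attributes, assuming the entry arrives as a single pipe-separated string (as in SnpEff-style ANN output) and that the attributes follow the field order listed above. AnnotationSketch and parse_ann_entry are hypothetical stand-ins, not the project's code, and the example values are made up.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class AnnotationSketch:
    """Hypothetical stand-in mirroring the 16 documented attributes."""

    allele: str
    annotation: str
    annotation_impact: str
    gene_name: str
    gene_id: str
    mutation_type: str       # assumed to hold Feature_Type
    mutation_id: str         # assumed to hold Feature_ID
    transcript_biotype: str
    exon: str                # assumed to hold Rank
    hgvs_cds: str            # HGVS.c
    hgvs_protein: str        # HGVS.p
    c_dna: str               # cDNA.pos / cDNA.length
    cds: str                 # CDS.pos / CDS.length
    aminoacid: str           # AA.pos / AA.length
    distance: str
    info: str                # ERRORS / WARNINGS / INFO

    def to_dict(self) -> dict:
        # Attribute names become keys, attribute values become values.
        return asdict(self)


def parse_ann_entry(entry: str) -> AnnotationSketch:
    """Split one pipe-separated ANN entry into its 16 fields."""
    parts = entry.split("|")
    expected = len(fields(AnnotationSketch))
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}")
    return AnnotationSketch(*parts)


# Made-up entry; the last two fields (distance, info) are empty.
example = (
    "A|missense_variant|MODERATE|GENE1|GENE1_ID|transcript|TX1|"
    "protein_coding|2/5|c.100G>A|p.Val34Ile|150/1200|100/900|34/299||"
)
annotation = parse_ann_entry(example)
```

Empty fields stay as empty strings here, which matches the note that the Distance field may be empty; a real parser might prefer a different placeholder.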

src.utils.report_aggregator.i_report_data_container module

Defines the IReportDataContainer interface, which provides methods for serializing and deserializing data container instances, along with a string representation method.

class src.utils.report_aggregator.i_report_data_container.IReportDataContainer[source]

Bases: ABC

Interface for report data containers, providing utility methods for converting to dictionary, creating instances from lists, and generating string representations.

classmethod to_dict(self)[source]

Convert the dataclass instance to a dictionary.

Parameters:

self – The instance of the data container.

Returns:

A dictionary mapping attribute names to attribute values.

Return type:

dict

classmethod from_list(data_list)[source]

Create an instance of the class from a list of values.

Parameters:

data_list (list) – List of values corresponding to the class fields.

Returns:

An instance of the class initialized with the provided values.
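
The two methods above can be sketched as follows. This is one possible way to get the documented behaviour with the standard dataclasses module, not the project's actual implementation; ReportContainerSketch and Locus are hypothetical names, and the length check in from_list is an added assumption.

```python
from abc import ABC
from dataclasses import asdict, dataclass, fields


class ReportContainerSketch(ABC):
    """Hypothetical stand-in for the interface described above."""

    def to_dict(self) -> dict:
        # asdict() maps each dataclass field name to its value.
        return asdict(self)

    @classmethod
    def from_list(cls, data_list: list):
        # Pair the values with the dataclass fields in declaration order.
        names = [f.name for f in fields(cls)]
        if len(data_list) != len(names):
            raise ValueError(
                f"{cls.__name__} expects {len(names)} values, got {len(data_list)}"
            )
        return cls(**dict(zip(names, data_list)))


@dataclass
class Locus(ReportContainerSketch):
    """Tiny example subclass with two string fields."""

    chromosome: str
    start: str


locus = Locus.from_list(["chr1", "100"])
as_dict = locus.to_dict()
```

Because from_list relies on declaration order, the order of values in data_list has to match the order of the fields in the subclass.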

src.utils.report_aggregator.report_agregator module

src.utils.report_aggregator.variant_data_container module

This module defines the VariantDataContainer class, which encapsulates data related to a genomic variant. It includes information about the variant’s location, alleles, associated genes, and potentially functional consequences. This class is designed for structured representation of variant data, facilitating analysis and reporting.

VariantDataContainer is a dataclass that inherits from IReportDataContainer (defined in another module), from which it gets the to_dict and from_list methods. The module also defines ClinvarVariantAnnotationContainer, a dataclass for the ClinVar annotation fields.

The module src.utils.report_aggregator.annotation_data_container is imported for potentially including annotation data.

class src.utils.report_aggregator.variant_data_container.ClinvarVariantAnnotationContainer(allele_id: str, disease_name: str, disease_database: str, review_status: str, clinical_sign: str, onco_disease_name: str, onco_disease_database: str, onco_review_status: str, oncogenicity_factor: str, somatic_clinical_impact_disease_name: str, somatic_clinical_impact_disease_database: str, somatic_clinical_impact_review_status: str, somatic_clinical_impact: str)[source]

Bases: IReportDataContainer

Description of the ClinVar database headers:

SCI:

Aggregate somatic clinical impact for this single variant

SCIDN:

ClinVar’s preferred disease name for the concept specified by disease identifiers in SCIDISDB

SCIDISDB:

Tag-value pairs of disease database name and identifier submitted for somatic clinical impact classifications, e.g. MedGen: NNNNNN

SCIREVSTAT:

ClinVar review status of somatic clinical impact for the Variation ID

ONC:

Aggregate oncogenicity classification for the variant

ONCREVSTAT:

ClinVar review status of oncogenicity classification for the Variation ID

ONCCONF:

Conflicting oncogenicity classifications for the variant

allele_id: str
disease_name: str
disease_database: str
review_status: str
clinical_sign: str
onco_disease_name: str
onco_disease_database: str
onco_review_status: str
oncogenicity_factor: str
somatic_clinical_impact_disease_name: str
somatic_clinical_impact_disease_database: str
somatic_clinical_impact_review_status: str
somatic_clinical_impact: str
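
As an illustration of how the headers described above might feed these attributes, the sketch below reads a semicolon-separated KEY=VALUE INFO string, as used in VCF files. The key-to-attribute mapping is a guess based on the names alone, and parse_clinvar_info, KEY_TO_ATTRIBUTE, and the "." placeholder for missing keys are hypothetical; the project may populate the container differently.

```python
# Assumed mapping from ClinVar INFO keys to the documented attribute names.
KEY_TO_ATTRIBUTE = {
    "SCI": "somatic_clinical_impact",
    "SCIDN": "somatic_clinical_impact_disease_name",
    "SCIDISDB": "somatic_clinical_impact_disease_database",
    "SCIREVSTAT": "somatic_clinical_impact_review_status",
    "ONC": "oncogenicity_factor",
    "ONCREVSTAT": "onco_review_status",
}


def parse_clinvar_info(info: str, missing: str = ".") -> dict:
    """Return attribute-name -> value for the keys found in an INFO string."""
    pairs = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        if sep:  # skip flag-style entries that have no "=value" part
            pairs[key] = value
    return {attr: pairs.get(key, missing) for key, attr in KEY_TO_ATTRIBUTE.items()}


# Made-up INFO string; the values are placeholders, not real ClinVar data.
info = "SCI=example_impact;SCIDN=example_disease;SCIDISDB=MedGen:NNNNNN;ONC=example_class"
clinvar_fields = parse_clinvar_info(info)
```

Keys absent from the INFO string (here SCIREVSTAT and ONCREVSTAT) fall back to the placeholder, so every mapped attribute always receives a string.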
class src.utils.report_aggregator.variant_data_container.VariantDataContainer(chromosome: str, start: str, end: str, reference: str, alternate: str, gene_function: str, gene_name: str, gene_detail: str, exonic_function: str, aminoacid_change: str, clinvar: ClinvarVariantAnnotationContainer, one_thousand_genomics: str, other_info: str)[source]

Bases: IReportDataContainer

Represents a container for variant data, inheriting from IReportDataContainer.

This class encapsulates the information about a variant: its genomic location, reference and alternate alleles, associated gene information, and functional effects. It is designed for use within a reporting system.

chromosome

The chromosome the variant is located on.

Type:

str

start

The start position of the variant.

Type:

str

end

The end position of the variant.

Type:

str

reference

The reference allele.

Type:

str

alternate

The alternate allele.

Type:

str

gene_function

Functional impact on the gene.

Type:

str

gene_name

The name of the gene affected.

Type:

str

gene_detail

Additional details about the gene.

Type:

str

exonic_function

Impact on the exonic region.

Type:

str

aminoacid_change

Description of any amino acid changes.

Type:

str

clinvar

Dataclass holding the ClinVar annotation fields.

Type:

ClinvarVariantAnnotationContainer

one_thousand_genomics

Annotation with 1K_Genomics.

Type:

str

other_info

The part of the annotation file row that is not part of the first annotation section.

Type:

str

chromosome: str
start: str
end: str
reference: str
alternate: str
gene_function: str
gene_name: str
gene_detail: str
exonic_function: str
aminoacid_change: str
clinvar: ClinvarVariantAnnotationContainer
one_thousand_genomics: str
other_info: str
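
Since clinvar holds another container rather than a string, serializing a variant yields a nested structure. The sketch below shows this with reduced stand-ins that carry only a subset of the documented fields; VariantSketch and ClinvarSketch are hypothetical, the values are made up, and whether the project's to_dict recurses into the nested container in the same way is an assumption.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ClinvarSketch:
    """Reduced stand-in: only two of the documented ClinVar fields."""

    allele_id: str
    clinical_sign: str


@dataclass
class VariantSketch:
    """Reduced stand-in: location, alleles, and the nested ClinVar part."""

    chromosome: str
    start: str
    end: str
    reference: str
    alternate: str
    clinvar: ClinvarSketch


variant = VariantSketch(
    chromosome="chr1",
    start="100",
    end="100",
    reference="G",
    alternate="A",
    clinvar=ClinvarSketch(allele_id="12345", clinical_sign="example_significance"),
)

# asdict() recurses into nested dataclasses, so "clinvar" becomes a dict too.
record = asdict(variant)

# All values are plain strings or dicts, so the record serializes to JSON.
report_line = json.dumps(record, sort_keys=True)
```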

Module contents