Variant Consequences
gentropy.assets.variant_consequences
Common module representing Ensembl Variation Variant Consequences.
This module contains the Consequence dataclass and the VariantConsequence enum.
The VariantConsequence enum contains all Consequence instances defined by the Ensembl Variation API (used by Variant Effect Predictor - VEP).
The full definition of the consequence was derived from the Ensembl Variation API.
Consequence
dataclass
Base class for the variant consequence term.
Note
This class is used as a base class for the VariantConsequence enum, which
contains all the valid consequence terms defined by the Ensembl Variation API.
Warning
Building new instances of this class is not recommended, as it may lead to inconsistencies
in the way the Consequence.score is calculated.
Rather than creating new instances of this class, subclass it
and override the score property with custom logic.
Examples:
>>> c = VariantConsequence.MISSENSE_VARIANT.value
>>> c.id
'SO_0001583'
>>> c.label
'missense_variant'
>>> c.impact
'MODERATE'
>>> c.score
0.68
>>> c.rank
13
>>> str(c)
'Consequence(id=SO_0001583, label=missense_variant, impact=MODERATE, rank=13)'
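The subclassing approach recommended in the warning can be sketched as follows. This is a minimal illustration, not gentropy's code: `Consequence` here is a stand-in dataclass rebuilt from the documented attributes (`id`, `label`, `impact`, `rank`), its `score` assumes a linear inverse of the rank over 41 terms, and `ImpactScoredConsequence` together with its impact weights is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consequence:
    """Stand-in for gentropy's Consequence (assumed fields, not the real class)."""

    id: str
    label: str
    impact: str
    rank: int

    @property
    def score(self) -> float:
        # Assumption: linear inverse of the rank over 41 consequence terms.
        return round(1 - self.rank / 41, 2)


# Hypothetical weights per impact category; not part of gentropy or VEP.
IMPACT_SCORES = {"HIGH": 1.0, "MODERATE": 0.66, "LOW": 0.33, "MODIFIER": 0.1}


class ImpactScoredConsequence(Consequence):
    """Hypothetical subclass that scores by impact category instead of rank."""

    @property
    def score(self) -> float:
        # Override only the scoring logic; the identity fields stay untouched.
        return IMPACT_SCORES[self.impact]


base = Consequence("SO_0001583", "missense_variant", "MODERATE", 13)
custom = ImpactScoredConsequence("SO_0001583", "missense_variant", "MODERATE", 13)
print(base.score, custom.score)
```

Keeping the custom logic in a subclass leaves the rank-based score of the original terms unchanged, so the two scoring schemes cannot be confused.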
Source code in src/gentropy/assets/variant_consequences.py
score: float
property
Scores the impact of a variant consequence.
Note
The consequence scores are derived from the rank that ensembl-variation assigns to each Sequence Ontology consequence term.
The ranking is derived from Constants.pm.
The score is the inverse of the ranking, computed as

$$
score = 1 - \frac{rank}{max(rank)}
$$

where $max(rank)$ is the maximum rank of the consequences (41 in this case). The score is then rounded to 2 decimal places.
- The score derived this way follows the VEP consequence ranking, so it stays in sync with the actual mostSevereConsequence term
- The score is different for each consequence term
- The score follows the severity measure (0.98 - highest severity, 0 - lowest severity)
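The relationship between rank and score described above can be checked with a short sketch. It assumes a linear inverse of the rank, `1 - rank / max(rank)`, which reproduces the documented values (0.68 for rank 13, 0.98 for rank 1, 0 for rank 41); the helper name `consequence_score` is hypothetical and not part of gentropy.

```python
MAX_RANK = 41  # number of consequence terms stated in the note above


def consequence_score(rank: int, max_rank: int = MAX_RANK) -> float:
    """Hypothetical helper: map a severity rank (1 = most severe) to a score."""
    if not 1 <= rank <= max_rank:
        raise ValueError(f"rank must be between 1 and {max_rank}, got {rank}")
    # Assumed formula: linear inverse of the rank, rounded to 2 decimal places.
    return round(1 - rank / max_rank, 2)


print(consequence_score(1))   # most severe term
print(consequence_score(13))  # missense_variant
print(consequence_score(41))  # least severe term
```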
VariantConsequence
Bases: Enum
Enum representing Ensembl Variation Variant Consequences.
This enum contains all Consequence instances defined by the Ensembl Variation API (used by Variant Effect Predictor - VEP).
The full definition of the consequence was derived from the Ensembl Variation API. See issue for more details.
Attributes:
| Name | Type | Description |
|---|---|---|
| TRANSCRIPT_ABLATION | Consequence | A feature ablation whereby the deleted region includes a transcript feature |
| SPLICE_ACCEPTOR_VARIANT | Consequence | A splice variant that changes the 2 base region at the 3' end of an intron |
| SPLICE_DONOR_VARIANT | Consequence | A splice variant that changes the 2 base region at the 5' end of an intron |
| STOP_GAINED | Consequence | A sequence variant whereby at least one base of a codon is changed, resulting in a premature stop codon, leading to a shortened transcript |
| FRAMESHIFT_VARIANT | Consequence | A sequence variant which causes a disruption of the translational reading frame, because the number of nucleotides inserted or deleted is not a multiple of three |
| STOP_LOST | Consequence | A sequence variant where at least one base of the terminator codon (stop) is changed, resulting in an elongated transcript |
| START_LOST | Consequence | A codon variant that changes at least one base of the canonical start codon |
| TRANSCRIPT_AMPLIFICATION | Consequence | A feature amplification of a region containing a transcript |
| FEATURE_ELONGATION | Consequence | A sequence variant that causes the extension of a genomic feature, with regard to the reference sequence |
| FEATURE_TRUNCATION | Consequence | A sequence variant that causes the reduction of a genomic feature, with regard to the reference sequence |
| INFRAME_INSERTION | Consequence | An inframe non synonymous variant that inserts bases into the coding sequence |
| INFRAME_DELETION | Consequence | An inframe non synonymous variant that deletes bases from the coding sequence |
| MISSENSE_VARIANT | Consequence | A sequence variant that changes one or more bases, resulting in a different amino acid sequence but where the length is preserved |
| PROTEIN_ALTERING_VARIANT | Consequence | A sequence_variant which is predicted to change the protein encoded in the coding sequence |
| SPLICE_DONOR_5TH_BASE_VARIANT | Consequence | A sequence variant that causes a change at the 5th base pair after the start of the intron in the orientation of the transcript |
| SPLICE_REGION_VARIANT | Consequence | A sequence variant in which a change has occurred within the region of the splice site, either within 1-3 bases of the exon or 3-8 bases of the intron |
| SPLICE_DONOR_REGION_VARIANT | Consequence | A sequence variant that falls in the region between the 3rd and 6th base after splice junction (5' end of intron) |
| SPLICE_POLYPYRIMIDINE_TRACT_VARIANT | Consequence | A sequence variant that falls in the polypyrimidine tract at 3' end of intron between 17 and 3 bases from the end (acceptor -3 to acceptor -17) |
| INCOMPLETE_TERMINAL_CODON_VARIANT | Consequence | A sequence variant where at least one base of the final codon of an incompletely annotated transcript is changed |
| START_RETAINED_VARIANT | Consequence | A sequence variant where at least one base in the start codon is changed, but the start remains |
| STOP_RETAINED_VARIANT | Consequence | A sequence variant where at least one base in the terminator codon is changed, but the terminator remains |
| SYNONYMOUS_VARIANT | Consequence | A sequence variant where there is no resulting change to the encoded amino acid |
| CODING_SEQUENCE_VARIANT | Consequence | A sequence variant that changes the coding sequence |
| MATURE_MIRNA_VARIANT | Consequence | A transcript variant located within the sequence of the mature miRNA |
| 5_PRIME_UTR_VARIANT | Consequence | A UTR variant of the 5' UTR |
| 3_PRIME_UTR_VARIANT | Consequence | A UTR variant of the 3' UTR |
| NON_CODING_TRANSCRIPT_EXON_VARIANT | Consequence | A sequence variant that changes non-coding exon sequence in a non-coding transcript |
| INTRON_VARIANT | Consequence | A transcript variant occurring within an intron |
| NMD_TRANSCRIPT_VARIANT | Consequence | A variant in a transcript that is the target of NMD |
| NON_CODING_TRANSCRIPT_VARIANT | Consequence | A transcript variant of a non coding RNA gene |
| CODING_TRANSCRIPT_VARIANT | Consequence | A transcript variant of a protein coding gene |
| UPSTREAM_GENE_VARIANT | Consequence | A sequence variant located 5' of a gene |
| DOWNSTREAM_GENE_VARIANT | Consequence | A sequence variant located 3' of a gene |
| TFBS_ABLATION | Consequence | A feature ablation whereby the deleted region includes a transcription factor binding site |
| TFBS_AMPLIFICATION | Consequence | A feature amplification of a region containing a transcription factor binding site |
| TF_BINDING_SITE_VARIANT | Consequence | A sequence variant located within a transcription factor binding site |
| REGULATORY_REGION_ABLATION | Consequence | A feature ablation whereby the deleted region includes a regulatory region |
| REGULATORY_REGION_AMPLIFICATION | Consequence | A feature amplification of a region containing a regulatory region |
| REGULATORY_REGION_VARIANT | Consequence | A sequence variant located within a regulatory region |
| INTERGENIC_VARIANT | Consequence | A sequence variant located in the intergenic region, between genes |
| SEQUENCE_VARIANT | Consequence | A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alterations |
Source code in src/gentropy/assets/variant_consequences.py
map_score() -> dict[str, float]
classmethod
Return the mapping of the Consequence.label (key) and Consequence.score (value).
Returns:
| Type | Description |
|---|---|
| dict[str, float] | Mapping of consequence label to score. |
Examples:
>>> s = VariantConsequence.map_score()
>>> s["missense_variant"]
0.68
>>> len(s)
41
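A typical use of such a label-to-score mapping is picking the most severe term among several annotations. The sketch below uses a hand-written excerpt rather than the real mapping: only the `missense_variant` value (0.68) is documented above, while the other two values assume the linear rank formula with ranks 4 and 28. The `most_severe` helper is hypothetical.

```python
# Excerpt standing in for VariantConsequence.map_score().
# Only 0.68 is documented; 0.9 and 0.32 are assumed from ranks 4 and 28.
score_map = {
    "stop_gained": 0.9,
    "missense_variant": 0.68,
    "intron_variant": 0.32,
}


def most_severe(labels: list[str], scores: dict[str, float]) -> str:
    """Hypothetical helper: return the label with the highest severity score."""
    if not labels:
        raise ValueError("labels must not be empty")
    # Unknown labels raise KeyError, which surfaces typos instead of hiding them.
    return max(labels, key=lambda label: scores[label])


print(most_severe(["intron_variant", "missense_variant"], score_map))
```

With gentropy installed, `score_map` would presumably be replaced by the result of `VariantConsequence.map_score()`.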
Source code in src/gentropy/assets/variant_consequences.py
map_sequence_ontology() -> dict[str, str]
classmethod
Return the mapping of the Consequence.label (key) to the Consequence.id (value), the Sequence Ontology term identifier.
Returns:
| Type | Description |
|---|---|
| dict[str, str] | Mapping of consequence label to ID. |
Examples:
>>> m = VariantConsequence.map_sequence_ontology()
>>> m["missense_variant"]
'SO_0001583'
>>> len(m)
41
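One way a classmethod like this can be built on an enum whose values are dataclass instances is sketched below. It is not gentropy's implementation: `Consequence` is a stand-in with assumed fields, `MiniVariantConsequence` is a hypothetical three-member enum (the real one documents 41), and the ranks 4 and 28 are assumed from the order of the attribute table.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class Consequence:
    """Stand-in for gentropy's Consequence (assumed fields, not the real class)."""

    id: str
    label: str
    impact: str
    rank: int


class MiniVariantConsequence(Enum):
    """Hypothetical three-member stand-in for VariantConsequence."""

    STOP_GAINED = Consequence("SO_0001587", "stop_gained", "HIGH", 4)
    MISSENSE_VARIANT = Consequence("SO_0001583", "missense_variant", "MODERATE", 13)
    INTRON_VARIANT = Consequence("SO_0001627", "intron_variant", "MODIFIER", 28)

    @classmethod
    def map_sequence_ontology(cls) -> dict[str, str]:
        # Iterating over the enum class yields its members in definition order.
        return {member.value.label: member.value.id for member in cls}


mapping = MiniVariantConsequence.map_sequence_ontology()
print(mapping["missense_variant"])
```

Deriving the mapping from the enum members keeps labels and identifiers defined in a single place.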
Source code in src/gentropy/assets/variant_consequences.py