Variant index
Bases: Dataset
Variant index dataset.
The variant index dataset is the result of intersecting the variant annotation (gnomAD) dataset with the set of variants for which V2D information is available.
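The intersection described above amounts to keeping only the annotation rows whose variant also appears in the V2D set. A minimal sketch in plain Python follows. It is not the Spark-based implementation in `variant_index.py`; the helper name `intersect_variants`, the sample rows, and the identifier format are hypothetical and chosen only for illustration.

```python
def intersect_variants(variant_annotation, v2d_variant_ids):
    """Keep annotation rows whose variantId appears in the V2D set.

    Hypothetical helper: a plain-Python stand-in for a semi-join on variantId.
    """
    wanted = set(v2d_variant_ids)
    return [row for row in variant_annotation if row["variantId"] in wanted]


# Illustrative annotation rows (identifier format is an assumption).
annotation = [
    {"variantId": "1_1000_A_G", "chromosome": "1", "position": 1000},
    {"variantId": "2_2000_C_T", "chromosome": "2", "position": 2000},
    {"variantId": "3_3000_G_A", "chromosome": "3", "position": 3000},
]

# Illustrative identifiers of variants that have V2D information.
v2d_ids = ["2_2000_C_T", "3_3000_G_A", "4_4000_T_C"]

index_rows = intersect_variants(annotation, v2d_ids)
print([row["variantId"] for row in index_rows])
```

Variants present only in the annotation, or only in the V2D set, are dropped, so the result never contains a variant without annotation.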
Source code in src/otg/dataset/variant_index.py
from_variant_annotation(variant_annotation)
classmethod
Initialise a VariantIndex from a pre-existing variant annotation dataset.
Source code in src/otg/dataset/variant_index.py
get_schema()
classmethod
Provides the schema for the VariantIndex dataset.
Source code in src/otg/dataset/variant_index.py
Schema
root
 |-- variantId: string (nullable = false)
 |-- chromosome: string (nullable = false)
 |-- position: integer (nullable = false)
 |-- referenceAllele: string (nullable = false)
 |-- alternateAllele: string (nullable = false)
 |-- chromosomeB37: string (nullable = true)
 |-- positionB37: integer (nullable = true)
 |-- alleleType: string (nullable = false)
 |-- alleleFrequencies: array (nullable = false)
 |    |-- element: struct (containsNull = true)
 |    |    |-- populationName: string (nullable = true)
 |    |    |-- alleleFrequency: double (nullable = true)
 |-- cadd: struct (nullable = true)
 |    |-- phred: float (nullable = true)
 |    |-- raw: float (nullable = true)
 |-- mostSevereConsequence: string (nullable = true)
 |-- rsIds: array (nullable = true)
 |    |-- element: string (containsNull = true)
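To make the nullability rules in the schema concrete, the sketch below copies the top-level field names and their `nullable` flags into a plain Python mapping and checks a row against them. This is only an illustration: the real schema is a Spark schema returned by `get_schema()`, and the names `TOP_LEVEL_FIELDS` and `missing_required`, as well as the sample values, are hypothetical.

```python
# Top-level fields and their nullable flags, copied from the schema above.
TOP_LEVEL_FIELDS = {
    "variantId": False,
    "chromosome": False,
    "position": False,
    "referenceAllele": False,
    "alternateAllele": False,
    "chromosomeB37": True,
    "positionB37": True,
    "alleleType": False,
    "alleleFrequencies": False,
    "cadd": True,
    "mostSevereConsequence": True,
    "rsIds": True,
}


def missing_required(row):
    """Return the sorted names of non-nullable fields that are absent or None."""
    return sorted(
        name
        for name, nullable in TOP_LEVEL_FIELDS.items()
        if not nullable and row.get(name) is None
    )


# Illustrative row; all values are made up for the example.
example_row = {
    "variantId": "1_1000_A_G",
    "chromosome": "1",
    "position": 1000,
    "referenceAllele": "A",
    "alternateAllele": "G",
    "alleleType": "snp",
    "alleleFrequencies": [{"populationName": "nfe", "alleleFrequency": 0.01}],
    # Nullable fields (chromosomeB37, positionB37, cadd, ...) may be omitted.
}

print(missing_required(example_row))          # no required field is missing
print(missing_required({"variantId": "x"}))   # every other required field
```

Seven of the twelve top-level fields are non-nullable, so a valid row must always carry the variant identifier, its coordinates, both alleles, the allele type, and the allele frequency array (which may be empty but not null).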