Variant index
gentropy.dataset.variant_index.VariantIndex
dataclass
Bases: Dataset
Variant index dataset.
The variant index dataset is the result of intersecting the variant annotation dataset with the set of variants for which V2D information is available.
Source code in src/gentropy/dataset/variant_index.py
from_variant_annotation(variant_annotation: VariantAnnotation, study_locus: StudyLocus) -> VariantIndex
classmethod
Initialise a VariantIndex from a pre-existing variant annotation dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
variant_annotation | VariantAnnotation | Variant annotation dataset | required |
study_locus | StudyLocus | Study locus dataset with the variants to intersect with the variant annotation dataset | required |
Returns:
Name | Type | Description |
---|---|---|
VariantIndex | VariantIndex | Variant index dataset |
Source code in src/gentropy/dataset/variant_index.py
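The intersection described for from_variant_annotation can be illustrated with a minimal sketch. It is not the gentropy implementation: plain Python dictionaries stand in for the Spark datasets, the helper name `intersect_annotation_with_study_locus` is hypothetical, and the variant identifiers are made up for illustration. It only shows the idea of keeping the annotated variants that also appear in the study locus dataset.

```python
from typing import Dict, Iterable, List

# Conceptual sketch only: lists of dicts stand in for Spark DataFrames,
# and this helper is hypothetical (it is not part of gentropy).
def intersect_annotation_with_study_locus(
    annotation_rows: List[Dict[str, object]],
    study_locus_variant_ids: Iterable[str],
) -> List[Dict[str, object]]:
    """Keep the annotation rows whose variantId occurs in the study locus set."""
    wanted = set(study_locus_variant_ids)
    return [row for row in annotation_rows if row["variantId"] in wanted]


# Made-up example data; the identifier format is an assumption.
annotation_rows = [
    {"variantId": "1_1000_A_G", "chromosome": "1", "position": 1000},
    {"variantId": "1_2000_C_T", "chromosome": "1", "position": 2000},
    {"variantId": "2_3000_G_A", "chromosome": "2", "position": 3000},
]
study_locus_variant_ids = ["1_2000_C_T", "2_3000_G_A", "3_4000_T_C"]

variant_index_rows = intersect_annotation_with_study_locus(
    annotation_rows, study_locus_variant_ids
)
print([row["variantId"] for row in variant_index_rows])
```

A variant that is present in the study locus data but has no annotation (the third identifier above) does not appear in the result, which is what an intersection implies.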
get_schema() -> StructType
classmethod
Provides the schema for the VariantIndex dataset.
Returns:
Name | Type | Description |
---|---|---|
StructType | StructType | Schema for the VariantIndex dataset |
Source code in src/gentropy/dataset/variant_index.py
Schema
root
 |-- variantId: string (nullable = false)
 |-- chromosome: string (nullable = false)
 |-- position: integer (nullable = false)
 |-- referenceAllele: string (nullable = false)
 |-- alternateAllele: string (nullable = false)
 |-- chromosomeB37: string (nullable = true)
 |-- positionB37: integer (nullable = true)
 |-- alleleType: string (nullable = false)
 |-- alleleFrequencies: array (nullable = false)
 |    |-- element: struct (containsNull = true)
 |    |    |-- populationName: string (nullable = true)
 |    |    |-- alleleFrequency: double (nullable = true)
 |-- inSilicoPredictors: struct (nullable = false)
 |    |-- cadd: struct (nullable = true)
 |    |    |-- raw: float (nullable = true)
 |    |    |-- phred: float (nullable = true)
 |    |-- revelMax: double (nullable = true)
 |    |-- spliceaiDsMax: float (nullable = true)
 |    |-- pangolinLargestDs: double (nullable = true)
 |    |-- phylop: double (nullable = true)
 |    |-- siftMax: double (nullable = true)
 |    |-- polyphenMax: double (nullable = true)
 |-- mostSevereConsequence: string (nullable = true)
 |-- rsIds: array (nullable = true)
 |    |-- element: string (containsNull = true)
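
To make the nullability rules in the schema concrete, here is a small sketch in plain Python. It assumes a record is available as a dictionary (for example after collecting a row). The helper names `missing_required_fields` and `cadd_phred` are hypothetical and not part of gentropy, and all record values are invented for illustration. The list of required fields is copied from the top-level entries marked `nullable = false` above.

```python
from typing import Any, Dict, List, Optional

# Top-level fields marked "nullable = false" in the schema above.
REQUIRED_FIELDS = (
    "variantId",
    "chromosome",
    "position",
    "referenceAllele",
    "alternateAllele",
    "alleleType",
    "alleleFrequencies",
    "inSilicoPredictors",
)


def missing_required_fields(record: Dict[str, Any]) -> List[str]:
    """Return the non-nullable top-level fields that are absent or None."""
    return [name for name in REQUIRED_FIELDS if record.get(name) is None]


def cadd_phred(record: Dict[str, Any]) -> Optional[float]:
    """Read inSilicoPredictors.cadd.phred, tolerating the nullable cadd struct."""
    cadd = (record.get("inSilicoPredictors") or {}).get("cadd") or {}
    return cadd.get("phred")


# Invented example records; none of these values come from real data.
complete_record = {
    "variantId": "1_1000_A_G",
    "chromosome": "1",
    "position": 1000,
    "referenceAllele": "A",
    "alternateAllele": "G",
    "alleleType": "example_type",
    "alleleFrequencies": [
        {"populationName": "example_population", "alleleFrequency": 0.12}
    ],
    "inSilicoPredictors": {"cadd": {"raw": 1.5, "phred": 15.0}},
}
incomplete_record = {"variantId": "1_1000_A_G", "chromosome": "1"}

print(missing_required_fields(complete_record))
print(missing_required_fields(incomplete_record))
print(cadd_phred(complete_record))
```

Nullable fields such as chromosomeB37, mostSevereConsequence and rsIds are deliberately left out of the check, since the schema allows them to be missing.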