Summary Statistics
gentropy.datasource.ukb_ppp_eur.summary_stats.UkbPppEurSummaryStats
dataclass
Summary statistics dataset for UKB PPP (EUR).
Source code in src/gentropy/datasource/ukb_ppp_eur/summary_stats.py
from_source(spark: SparkSession, raw_summary_stats_path: str, tmp_variant_annotation_path: str, chromosome: str, study_index_path: str) -> SummaryStatistics
classmethod
Ingest and harmonise all summary stats for UKB PPP (EUR) data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
spark | SparkSession | Spark session object. | required |
raw_summary_stats_path | str | Input raw summary stats path. | required |
tmp_variant_annotation_path | str | Input variant annotation dataset path. | required |
chromosome | str | Which chromosome to process. | required |
study_index_path | str | Path to the study index, which is needed in some cases to populate the sample size column. | required |
Returns:
Name | Type | Description |
---|---|---|
SummaryStatistics | SummaryStatistics | Processed summary statistics dataset for a given chromosome. |
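Because from_source processes one chromosome per call, a caller will typically assemble the same set of paths for each chromosome. The following is a minimal sketch of that pattern, not part of gentropy itself: the helper names `build_from_source_args` and `ingest_chromosome`, the base directory, and the sub-path names are all hypothetical, and the chromosome value "22" is only a placeholder, since this page does not state the expected format of that string.

```python
from __future__ import annotations

from typing import Any


def build_from_source_args(chromosome: str, base_dir: str) -> dict[str, str]:
    """Collect the four non-Spark arguments of from_source for one chromosome.

    Hypothetical helper: base_dir and the sub-paths are placeholders,
    not locations defined by gentropy.
    """
    # The documented signature types chromosome as str, so reject e.g. the int 22.
    if not isinstance(chromosome, str):
        raise TypeError("chromosome must be a str, as in the from_source signature")
    return {
        "raw_summary_stats_path": f"{base_dir}/raw_summary_stats",
        "tmp_variant_annotation_path": f"{base_dir}/tmp_variant_annotation",
        "chromosome": chromosome,
        "study_index_path": f"{base_dir}/study_index",
    }


def ingest_chromosome(spark: Any, chromosome: str, base_dir: str) -> Any:
    """Call from_source for one chromosome (needs gentropy and PySpark installed)."""
    # Imported lazily so the argument helper above works without gentropy.
    from gentropy.datasource.ukb_ppp_eur.summary_stats import UkbPppEurSummaryStats

    return UkbPppEurSummaryStats.from_source(
        spark=spark, **build_from_source_args(chromosome, base_dir)
    )


# Building the arguments does not require Spark or gentropy.
args = build_from_source_args("22", "/data/ukb_ppp_eur")
print(sorted(args))
```

Given an existing SparkSession named `spark`, `ingest_chromosome(spark, "22", "/data/ukb_ppp_eur")` would then return the SummaryStatistics dataset for that chromosome, assuming the placeholder paths point at real inputs.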
Source code in src/gentropy/datasource/ukb_ppp_eur/summary_stats.py