Genomics and Applied Biology 2026, Vol.17, No.3, 138-153 http://bioscipublisher.com/index.php/gab

At the modeling level, these approaches differ mainly in the prior distribution placed on effect sizes, which encodes distinct assumptions about the underlying genetic architecture. For instance, the LDpred family of methods typically adopts a point-normal mixture prior to capture sparsity in effect sizes explicitly, whereas PRS-CS employs a continuous shrinkage prior that lets effect sizes vary more smoothly across the genome and enables data-driven estimation of global hyperparameters (Sima et al., 2024). These differences in prior specification represent alternative trade-offs between sparsity and continuity in modeling genetic effects. From a unified statistical perspective, these methods can be expressed as:

$$\hat{\beta} = \mathrm{shrinkage}\left(\hat{\beta}_{\mathrm{GWAS}},\ \mathrm{LD},\ \mathrm{prior}\right)$$

That is, they perform regularized estimation of GWAS effects under LD constraints and prior assumptions. Compared to C+T, these approaches avoid discrete LD pruning and instead achieve near-optimal linear combinations within LD regions, typically resulting in improved predictive performance and greater parameter stability. However, their performance depends critically on the consistency between the LD reference panel and the target population. When ancestry mismatch or structural variation exists, LD discrepancies may bias effect estimation and degrade prediction accuracy (Jayasinghe et al., 2024). In addition, large-scale LD matrix computation imposes a substantial computational burden, and although methods such as LDpred2 have improved efficiency, trade-offs between accuracy and scalability remain.

1.3 Functional annotation and prior integration

To better approximate causal variants and reduce “tagging effects,” recent methodological advances have incorporated functional annotation information into effect estimation.
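As a concrete illustration of prior-informed shrinkage under LD, the following minimal sketch treats the simplest Gaussian case. It assumes standardized genotypes, a phenotype with unit variance, and independent zero-mean Gaussian priors with per-variant variances, so the posterior mean solves a ridge-like linear system. The function name `gaussian_shrinkage` and all numerical values are hypothetical; this is not the point-normal or continuous shrinkage prior of LDpred or PRS-CS, nor the implementation of any annotation-based method.

```python
import numpy as np

def gaussian_shrinkage(beta_hat, ld, prior_var, n):
    """Posterior-mean joint effects under independent Gaussian priors.

    beta_hat  : marginal GWAS effects on standardized genotypes, shape (m,)
    ld        : LD correlation matrix R from a reference panel, shape (m, m)
    prior_var : per-variant prior variance of the joint effect, shape (m,)
    n         : GWAS sample size
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    ld = np.asarray(ld, dtype=float)
    prior_var = np.asarray(prior_var, dtype=float)
    # Assumed model: beta_hat | beta ~ N(R beta, R / n), beta_j ~ N(0, prior_var_j).
    # The posterior mean then solves (R + diag(1 / (n * prior_var))) beta = beta_hat.
    penalty = np.diag(1.0 / (n * prior_var))
    return np.linalg.solve(ld + penalty, beta_hat)

# Two variants in strong LD (illustrative values only).
ld = np.array([[1.0, 0.8],
               [0.8, 1.0]])
beta_hat = np.array([0.05, 0.04])
n = 10_000

# Same prior variance for both variants: uniform shrinkage.
uniform = gaussian_shrinkage(beta_hat, ld, [1e-4, 1e-4], n)

# Larger prior variance for variant 0 (e.g. a hypothetical functional annotation):
# it is shrunk less, while variant 1 is shrunk more.
annotated = gaussian_shrinkage(beta_hat, ld, [4e-4, 2.5e-5], n)

print("uniform prior:  ", uniform)
print("annotated prior:", annotated)
```

Setting every prior variance to the same value recovers an infinitesimal-style shrinkage estimator, whereas letting the variance depend on annotation category is one simple way to assign weaker shrinkage to variants considered more likely to be functional.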
Stratified LD score regression (S-LDSC) estimates heritability contributions across annotation categories within the baseline-LD framework, providing a basis for constructing informative priors (Sima et al., 2024). Representative methods such as AnnoPred and LDpred-funct integrate functional annotations into Bayesian priors, assigning higher inclusion probabilities or weaker shrinkage to biologically relevant variants (e.g., eQTLs or variants in open chromatin regions), thereby improving the signal-to-noise ratio and predictive performance. Because some functional annotations are relatively conserved across populations, these approaches can partially mitigate cross-population bias induced by differences in LD structure and allele frequencies.

From a statistical modeling perspective, incorporating functional annotations amounts to integrating external biological information into the effect estimation process. This is typically achieved by imposing structured priors that differ across variant categories, which gives a more structured representation of the effect size distribution. As a result, the model no longer relies solely on data-driven estimation but also uses prior information to guide shrinkage, improving both signal detection and estimation stability. However, functional annotations are often tissue- and environment-specific and subject to measurement and annotation biases, which may lead to prior misspecification. In practice, therefore, multi-tissue integration and small-sample recalibration in the target population are recommended to enhance robustness (Jayasinghe et al., 2024).

1.4 Multi-ancestry transfer and cross-population models

The primary objective of multi-ancestry PRS is to balance sample size and genetic heterogeneity across populations by jointly leveraging GWAS data from multiple ancestries and ancestry-specific LD structures, thereby improving generalization performance (Ruan et al., 2021; Zhang et al., 2023).
For example, PRS-CSx performs continuous shrinkage separately within each ancestry and integrates results using shared hyperparameters and data-driven weights, capturing both shared and population-specific effects (Sima et al., 2024).
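The "data-driven weights" step can be sketched as a linear regression of the phenotype on ancestry-specific scores in a small validation set from the target population. The example below is a minimal illustration under stated assumptions: the ancestry-specific PRS are already computed and standardized, the trait is quantitative, and the data are simulated. The function names `fit_combination_weights` and `combine_scores` are hypothetical and are not part of the PRS-CSx software.

```python
import numpy as np

def fit_combination_weights(scores, phenotype):
    """Least-squares intercept and weights for combining ancestry-specific PRS.

    scores    : array (n_individuals, n_ancestries), one PRS column per ancestry
    phenotype : array (n_individuals,), quantitative trait in the validation set
    """
    scores = np.asarray(scores, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    # Design matrix with an intercept column followed by the PRS columns.
    design = np.column_stack([np.ones(len(phenotype)), scores])
    coef, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    return coef[0], coef[1:]

def combine_scores(scores, intercept, weights):
    """Apply fitted weights to obtain a single combined score per individual."""
    return intercept + np.asarray(scores, dtype=float) @ np.asarray(weights, dtype=float)

# Simulated validation set: two ancestry-specific scores and a noisy trait.
rng = np.random.default_rng(0)
scores = rng.standard_normal((500, 2))
phenotype = 0.3 * scores[:, 0] + 0.6 * scores[:, 1] + rng.standard_normal(500)

intercept, weights = fit_combination_weights(scores, phenotype)
combined = combine_scores(scores, intercept, weights)
print("fitted weights:", weights)
```

In a real analysis the weights would be fitted in a validation sample and the combined score evaluated in an independent test sample, since fitting and evaluating on the same individuals overstates predictive accuracy.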
