Supplementary data to: Age-specific epigenetic drift in late-onset Alzheimer's disease
MALDI-TOF mass spectrometry in post-mortem brain samples and lymphocytes
The following tables provide the methylation data from the MALDI-TOF mass spectrometry experiments (Wang et al., 2008). The graphs are in Excel format.
Discriminant Analysis
Using Diagonal Linear Discriminant Analysis (DLDA) and Diagonal Quadratic Discriminant Analysis (DQDA) we tried to verify our marker selection from the ROC curve model, resulting in a similar set of classification markers (see Supplementary Table). Intriguingly, when stratifying for potential sex effects within the LOAD group, a set of new CpG dinucleotides could be identified that was specific for the male cases. The comparison of male LOAD cases with male controls resulted in 7 significant CpG dinucleotides as discriminators of the disease, all within the DNMT1 promoter.
We took the groups of cases and controls and built a classifier from them, using Diagonal Linear Discriminant Analysis (DLDA) and Diagonal Quadratic Discriminant Analysis (DQDA). Combined with forward step-wise variable selection (FSVS), this approach finds a minimal set of CpG sites in the dataset from which one can build a classifier with the same predictive power. That is, the output of the Discriminant Analysis is a set of CpG sites that optimally separates the dataset into LOAD cases and controls. DLDA and DQDA usually give different results; for example, DLDA has a lower misclassification rate than DQDA, and hence it is recommended to use both approaches in combination. If a combination of CpG sites had missing values, we replaced each missing value with the value averaged across individuals (using the geometric mean) to avoid too many dropouts. This approach has the smallest impact on the results for up to 20% missing values.
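The procedure described above can be illustrated with a minimal sketch. It is not the code used for the analysis: the function names are hypothetical, the leave-one-out misclassification rate is an assumed selection criterion (the text does not state which criterion FSVS used), and the toy methylation values are invented purely for illustration.

```python
import numpy as np

def impute_geometric_mean(X):
    """Replace each NaN by the geometric mean of the observed values at that
    CpG site (column). Assumes observed values are strictly positive."""
    X = np.array(X, dtype=float)
    for j in range(X.shape[1]):
        col = X[:, j]                      # view into X, edited in place
        missing = np.isnan(col)
        if missing.any():
            col[missing] = np.exp(np.mean(np.log(col[~missing])))
    return X

def fit_diagonal_da(X, y, quadratic=False):
    """Fit DLDA (pooled per-site variance) or DQDA (per-class per-site variance)."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    if quadratic:
        variances = np.array([X[y == c].var(axis=0, ddof=1) for c in classes])
    else:
        resid = X - means[np.searchsorted(classes, y)]
        pooled = (resid ** 2).sum(axis=0) / (len(y) - len(classes))
        variances = np.tile(pooled, (len(classes), 1))
    priors = np.array([np.mean(y == c) for c in classes])
    return classes, means, variances + 1e-9, priors  # small constant avoids /0

def predict_diagonal_da(model, X):
    """Assign each sample to the class with the highest Gaussian log-score,
    treating CpG sites as independent (diagonal covariance)."""
    classes, means, variances, priors = model
    diff = X[:, None, :] - means[None, :, :]
    scores = -0.5 * (diff ** 2 / variances + np.log(variances)).sum(axis=2)
    return classes[np.argmax(scores + np.log(priors), axis=1)]

def loo_error(X, y, quadratic):
    """Leave-one-out misclassification rate (assumed selection criterion)."""
    n, wrong = len(y), 0
    for i in range(n):
        keep = np.arange(n) != i
        model = fit_diagonal_da(X[keep], y[keep], quadratic)
        wrong += int(predict_diagonal_da(model, X[i:i + 1])[0] != y[i])
    return wrong / n

def forward_select(X, y, quadratic=False, max_sites=5):
    """Greedy forward selection: add the site that lowers the error most,
    stop as soon as no remaining site improves it."""
    selected, best_err = [], 1.0
    while len(selected) < max_sites:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        errs = [loo_error(X[:, selected + [j]], y, quadratic) for j in candidates]
        k = int(np.argmin(errs))
        if errs[k] >= best_err:
            break
        selected.append(candidates[k])
        best_err = errs[k]
    return selected, best_err

# Toy data (invented): 6 cases (0) and 6 controls (1), three CpG sites.
# Only site 1 differs between the groups; one value at site 0 is missing.
y = np.array([0] * 6 + [1] * 6)
X = np.column_stack([
    [0.5, 0.6, 0.4, 0.5, 0.6, 0.4] * 2,
    [0.18, 0.20, 0.22, 0.19, 0.21, 0.20, 0.78, 0.80, 0.82, 0.79, 0.81, 0.80],
    [0.3, 0.7, 0.5, 0.7, 0.3, 0.5] * 2,
])
X[0, 0] = np.nan
X_imp = impute_geometric_mean(X)
dlda_sites, dlda_err = forward_select(X_imp, y, quadratic=False)
dqda_sites, dqda_err = forward_select(X_imp, y, quadratic=True)
```

Running both variants on the same data and comparing the selected sites mirrors the combined use of DLDA and DQDA recommended above; on real data the two selections need not agree.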
Supplementary web table 1: Combined Diagonal Linear Discriminant Analysis (DLDA) and Diagonal Quadratic Discriminant Analysis (DQDA) identified several significantly discriminating CpG sites in LOAD brains.
Marker             AD brain group   Control brain group   Covariance structure
PSEN#19            0.01500          0.02300               0.00028
PSEN#10            0.18583          0.33390               0.03692
APOE 3’-CGI #11    0.83750          0.78800               0.01186
APOE#1             0.11657          0.53124               0.38061
HTATIP#16          0.69658          0.91953               0.07253
HTATIP#15          0.13104          0.16650               0.00375
APP#11             0.79983          0.84820               0.10358
TFAM#8             0.28103          0.65500               0.16522
Supplementary web table 2: Combined DQDA and DLDA (in brackets) analysis of the male subset of LOAD cases compared to the controls revealed a clustering of potential markers within the DNMT1 gene promoter.
Marker          AD brain group      Control brain group   Covariance structure
DNMT1 CpG#3     0.19833             0.22000               0.00015
DNMT1 CpG#4     0.19833             0.22000               0.00015
DNMT1 CpG#5     0.63961 (0.91004)   0.476986 (0.67644)    0.01714 (0.03242)
DNMT1 CpG#9     0.77076             0.85961               0.10258
DNMT1 CpG#10    2.20906 (2.20906)   1.59970 (1.59970)     0.18798 (0.15037)
DNMT1 CpG#12    0.27250             0.31750               0.00788
DNMT1 CpG#17    0.48848             0.56750               0.00308
Supplementary web table 3: Description of the DNA samples used (for more details see the manuscript supplement):