To suggest a functional relationship of loci associated with IgG glycosylation, we performed pairwise association analyses of glycome-wide effects of lead SNPs. Glycome-wide effect of the SNP was defined as the vector of z scores [z score = β/se(β)] of this SNP on each of glycans.

Before analysis, we removed 15 derived traits (IGP41 to IGP54) that represent directly measured glycans (IGP1 to IGP15) that were normalized with the surface area of neutrally charged glycans (see the Supplementary Note for details).

We then constructed glycome-wide effects for all 27 lead SNPs from the meta-analysis results for 62 glycans, resulting in 27 vectors of 62 z scores. For every pair of SNPs, we computed the Spearman’s correlation coefficient and corresponding P value between 27 glycome-wide effects.

The network of significant Spearman’s correlations [P value corrected for (27 × 26)/2 = 351 tests, P ≤ 1.4 × 10−4] was visualized using Cytoscape (44), where each node represents one lead SNP annotated with the gene prioritized in the region of that SNP and width and color intensity of edges present squared Spearman’s correlation coefficient of the two nodes. We next performed clustering analysis of the full correlation matrix by applying hierarchical clustering with Euclidean distance and complete linkage. To validate the network, we performed permutation analysis and compared the network with STRING PPI networks (16).

The permutation analysis was performed by estimating the correlation of glycome-wide effects of every top SNP and glycome-wide effects of 100,000 random SNPs. To ensure that no glycosylation-associated SNPs were included in the permutation analysis, the following procedure was used. From the list of all HapMap release 22 SNPs (2,574,585 SNPs), all variants from the associated loci were removed, resulting in 2,510,568 remaining SNPs. An additional 6641 SNPs were removed because they had PGWAS ≤ 5 × 10−5 in at least one glycosylation GWAS. The final list contained 2,503,927 SNPs. From this list, 100,000 SNPs were randomly selected for the permutation analysis. A Spearman’s correlation between glycome-wide effects of each of the 27 top SNPs with glycome-wide effects of 100,000 random SNPs was computed. These correlations were then compared with correlations used to construct the network.

Additional validation was performed by comparing the network obtained by submitting IgG N-glycosylation candidate genes to the STRING database of PPIs (version 10.5, accessed in September 2017) (16).

ChIP-seq data (.narrowPeak files) for the TFs were downloaded from for the GM12878 LCL. The peak-motifs tool (18) in the Regulatory Sequence Analysis Tools was used for motif discovery in the peak sequences of the ChIP-seq datasets. To assess whether glycosylation-associated SNPs disrupt or introduce TF-binding sites, we first filtered out all SNPs with no predicted effect on TF binding, with TF weight score P values higher than 1.8 × 10−8 (0.05/2,764,712, Bonferroni corrected for the number of tests performed). We then further filtered out the SNPs whose alternate alleles had similar TF-binding scores and focused on SNPs for which one allele results in loss of TF-binding site. To compare frequencies of TF-binding site disruptions of glycosylation SNPs and other SNPs in the region, we performed the same analysis using all significantly and suggestively associated glycosylation SNPs (associated SNPs) and all SNPs within 50 kb of every glycosylation locus whose association P value was ≤5 × 10−4 (nonassociated SNPs) and compared frequency of SNPs that are predicted to significantly (Bonferroni-corrected P ≤ 1.8 × 10−8) disrupt binding for the given TF.

Hi-C data derived from GM12878, digested with Mbo I, were interrogated using the online tool available at (45). More details can be found in the Supplementary Note.