We aligned the putative amino acid sequences translated from the iHSMGC against the proteins or domains in the KEGG database (release 84.0; genes from animals or plants were excluded) using BLASTP (v2.7.1, default parameters except for -outfmt 6 -evalue 1e-6). An alignment coverage of at least 30% was required for each gene. Each protein was assigned to a KEGG orthologue (KO) based on its best-hit gene in the database. Using this approach, 6.42 million genes in the combined gene catalog could be assigned a KO.
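The filtering and best-hit assignment described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the function name `assign_kos`, the `query_lengths` and `gene_to_ko` lookup tables, the definition of coverage as the aligned query span divided by query length, and the choice of bit score to rank hits are all assumptions. The only detail taken from BLAST itself is the default column order of `-outfmt 6` (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore).

```python
import csv


def assign_kos(blast_lines, query_lengths, gene_to_ko, min_coverage=0.30):
    """Assign one KO per query gene from BLASTP tabular (-outfmt 6) hits.

    blast_lines   : iterable of tab-separated lines in default outfmt 6 order
    query_lengths : dict mapping query ID -> protein length (hypothetical input;
                    default outfmt 6 does not report query length)
    gene_to_ko    : dict mapping KEGG gene ID -> KO ID (hypothetical input)
    min_coverage  : minimum fraction of the query covered by the alignment
                    (assumed definition of "alignment coverage")
    """
    best = {}  # query ID -> (subject ID, bit score) of the best passing hit
    for row in csv.reader(blast_lines, delimiter="\t"):
        if not row:
            continue
        qseqid, sseqid = row[0], row[1]
        qstart, qend = int(row[6]), int(row[7])
        bitscore = float(row[11])

        # Drop hits that cover less than the required fraction of the query.
        coverage = (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if coverage < min_coverage:
            continue

        # Keep the highest-scoring hit per query (assumed meaning of "best hit").
        if qseqid not in best or bitscore > best[qseqid][1]:
            best[qseqid] = (sseqid, bitscore)

    # A query receives a KO only if its best-hit gene has one.
    return {q: gene_to_ko[s] for q, (s, _) in best.items() if s in gene_to_ko}
```

In this sketch a query whose best-hit gene carries no KO annotation is left unassigned rather than falling back to a lower-ranked hit; whether the original analysis behaved this way is not stated in the text.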