The Aspergillus genus contains leading industrial microorganisms that excel at producing bioactive compounds and enzymes. Using synthetic biology and bioinformatics, we aim to re-engineer these organisms for applications in human health, pharmaceuticals, environmental engineering, and food production. In this project, we compare the genomes of more than 300 species from the Aspergillus genus to generate a high-resolution pan-genomic map representing genetic diversity that spans ~200 million years. We are identifying genes specific to species and clades to allow guilt-by-association mapping of genotype to phenotype. To achieve this, we have developed orthologous protein prediction software that exploits genus-wide genetic diversity. The approach is optimized for large data sets: it is based on BLASTp, taking protein identity and alignment coverage into account, and clusters proteins by single linkage of bi-directional hits. The result is a set of orthologous protein families that describe the genomic and functional features of individual species, clades, and the core/pan-genome of Aspergillus. The approach is also applicable to genotype-to-phenotype analyses in other microbial genera.
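The clustering step described above can be illustrated with a minimal sketch. It is not the project's software: the thresholds (`MIN_IDENTITY`, `MIN_COVERAGE`), the hit tuple layout, and all function names are assumptions made for illustration, and the abstract does not state the cutoffs actually used. The sketch keeps BLASTp hits that pass an identity and a coverage filter, retains only pairs that pass in both directions, and joins those pairs into families by single linkage using a union-find structure.

```python
from collections import defaultdict

# Hypothetical cutoffs; the actual values used by the software are not given.
MIN_IDENTITY = 50.0  # percent identity
MIN_COVERAGE = 50.0  # percent alignment coverage


def passing_hits(hits, min_identity=MIN_IDENTITY, min_coverage=MIN_COVERAGE):
    """Return directed (query, subject) pairs that pass both filters."""
    kept = set()
    for query, subject, identity, coverage in hits:
        if query == subject:
            continue  # ignore self-hits
        if identity >= min_identity and coverage >= min_coverage:
            kept.add((query, subject))
    return kept


def bidirectional_pairs(directed):
    """Keep only pairs where both a->b and b->a passed the filters."""
    return {tuple(sorted((q, s))) for q, s in directed if (s, q) in directed}


def single_linkage(proteins, pairs):
    """Group proteins into connected components (single linkage)."""
    parent = {p: p for p in proteins}

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = defaultdict(set)
    for p in proteins:
        families[find(p)].add(p)
    return sorted(sorted(members) for members in families.values())


def cluster_orthologs(hits):
    """Full sketch: filter hits, require reciprocity, cluster."""
    proteins = {name for hit in hits for name in hit[:2]}
    pairs = bidirectional_pairs(passing_hits(hits))
    return single_linkage(proteins, pairs)


# Toy hits: (query, subject, percent identity, percent coverage).
EXAMPLE_HITS = [
    ("A1", "B1", 90.0, 95.0), ("B1", "A1", 88.0, 92.0),  # reciprocal
    ("B1", "C1", 70.0, 80.0), ("C1", "B1", 72.0, 85.0),  # reciprocal
    ("A1", "C1", 40.0, 90.0), ("C1", "A1", 45.0, 90.0),  # identity too low
    ("A1", "D1", 80.0, 90.0), ("D1", "A1", 80.0, 30.0),  # one direction fails
]

families = cluster_orthologs(EXAMPLE_HITS)
```

In this toy input, A1 and C1 do not pass the identity filter against each other, yet they end up in the same family because both are linked to B1; this chaining is the defining property of single linkage. D1 stays a singleton because its hit against A1 passes in only one direction.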