TY - JOUR
T1 - PDBminer to Find and Annotate Protein Structures for Computational Analysis
AU - Degn, Kristine
AU - Beltrame, Ludovica
AU - Tiberti, Matteo
AU - Papaleo, Elena
N1 - Publisher Copyright: © 2023 American Chemical Society.
PY - 2023
Y1 - 2023
N2 - Computational methods relying on protein structure strongly depend on the structure selected for investigation. Typical sources of protein structures include experimental structures available at the Protein Data Bank (PDB) and high-quality in silico model structures, such as those available at the AlphaFold Protein Structure Database. Either option has significant advantages and drawbacks, and exploring the wealth of available structures to identify the most suitable ones for specific applications can be a daunting task. We provide an open-source software package, PDBminer, with the purpose of making structure identification and selection easier, faster, and less error-prone. PDBminer searches the AlphaFold Database and the PDB for available structures of interest and provides an up-to-date, quality-ranked table of structures applicable for further use. PDBminer provides an overview of the available protein structures for one or more input proteins, parallelizing the runs if multiple cores are specified. The output table reports the coverage of the protein structures aligned to the UniProt sequence, overcoming numbering differences in PDB structures and providing information regarding model quality, protein complexes, ligands, and nucleic acid chain binding. The PDBminer2coverage and PDBminer2network tools assist in visualizing the results. PDBminer can be applied to overcome the tedious task of choosing a PDB structure without losing the wealth of additional information available in the PDB. Here, we showcase the main functionalities of the package on the p53 tumor suppressor protein. The package is available at http://github.com/ELELAB/PDBminer.
U2 - 10.1021/acs.jcim.3c00884
DO - 10.1021/acs.jcim.3c00884
M3 - Journal article
C2 - 37977136
AN - SCOPUS:85175686434
SN - 1549-9596
VL - 63
SP - 7274
EP - 7281
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 23
ER -