Protein Data Bank (PDB): The Single Global Macromolecular Structure Archive

Protein Crystallography

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1607))

Abstract

The Protein Data Bank (PDB), the single global repository of experimentally determined 3D structures of biological macromolecules and their complexes, was established in 1971, becoming the first open-access digital resource in the biological sciences. As of May 2017, the PDB archive houses ~130,000 entries. It is managed by the Worldwide Protein Data Bank organization (wwPDB; wwpdb.org), which includes the RCSB Protein Data Bank (RCSB PDB; rcsb.org), the Protein Data Bank Japan (PDBj; pdbj.org), the Protein Data Bank in Europe (PDBe; pdbe.org), and BioMagResBank (BMRB; www.bmrb.wisc.edu). The four wwPDB partners operate a unified global software system (deposit.wwpdb.org) that enforces community-agreed data standards and supports Deposition, Biocuration, and Validation of ~11,000 new PDB entries annually. The RCSB PDB currently acts as the archive keeper, ensuring disaster recovery of PDB data and coordinating weekly updates. wwPDB partners disseminate the same archival data from multiple FTP sites, while operating complementary websites that provide their own views of PDB data with selected value-added information and links to related data resources. At present, the PDB archives experimental data, associated metadata, and atomic-level 3D structural models derived from three well-established methods: crystallography, nuclear magnetic resonance (NMR) spectroscopy, and electron microscopy (3DEM). wwPDB partners are working closely with experts in related experimental areas (small-angle scattering, chemical cross-linking/mass spectrometry, Förster resonance energy transfer or FRET, etc.) to establish a federation of data resources that will support sustainable archiving and validation of 3D structural models and experimental data derived from integrative or hybrid methods.
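Because the partner sites serve the same archival files, an entry can in principle be located by the same relative path on any mirror. The following is a minimal sketch of that idea, not an official client. It assumes the "divided" archive layout, in which an entry is filed under the middle two characters of its four-character PDB ID; the base URL, the function names, and the ID pattern are illustrative assumptions and should be checked against current wwPDB documentation before use.

```python
import re

# Assumption: base location of the "divided" mmCIF tree on one distribution
# site. Other partner mirrors would use a different base with the same
# relative path; verify the exact URL against current wwPDB documentation.
ASSUMED_BASE = "https://files.wwpdb.org/pub/pdb/data/structures/divided/mmCIF"

# Assumption: classic four-character PDB IDs (a digit followed by three
# alphanumeric characters).
_PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")


def divided_mmcif_path(pdb_id: str) -> str:
    """Return the relative path of an entry in the divided mmCIF layout."""
    if not _PDB_ID.fullmatch(pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    pid = pdb_id.lower()
    # Entries are grouped into directories named by the middle two characters.
    return f"{pid[1:3]}/{pid}.cif.gz"


def entry_url(pdb_id: str, base: str = ASSUMED_BASE) -> str:
    """Join a mirror base URL with the entry's relative path."""
    return f"{base.rstrip('/')}/{divided_mmcif_path(pdb_id)}"
```

Passing a different `base` points the same relative path at another mirror, which is the practical consequence of all partners distributing identical archival data.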

Acknowledgments

The RCSB PDB is supported by the National Science Foundation (DBI 1338415), National Institutes of Health, and the Department of Energy; PDBe by the Wellcome Trust, BBSRC, MRC, EU, CCP4, and EMBL-EBI; PDBj by JST-NBDC; and BMRB by the National Institute of General Medical Sciences (GM109046). We thank Christine Zardecki for expert help with manuscript preparation.

Corresponding author

Correspondence to Stephen K. Burley.

Copyright information

© 2017 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Burley, S.K., Berman, H.M., Kleywegt, G.J., Markley, J.L., Nakamura, H., Velankar, S. (2017). Protein Data Bank (PDB): The Single Global Macromolecular Structure Archive. In: Wlodawer, A., Dauter, Z., Jaskolski, M. (eds) Protein Crystallography. Methods in Molecular Biology, vol 1607. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7000-1_26

  • DOI: https://doi.org/10.1007/978-1-4939-7000-1_26

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-6998-2

  • Online ISBN: 978-1-4939-7000-1

  • eBook Packages: Springer Protocols
