Unexpected features of the ‘dark’ proteome
We surveyed the ‘dark’ proteome, i.e., regions of proteins that remain stubbornly inaccessible to both experimental structure determination and modeling. Building upon a recent structural modeling study covering 546,000 proteins across many organisms, we find 44–54% of the proteome in eukaryotes and viruses is dark, compared with only 14% for archaea and bacteria. This includes 68,621 dark proteins, in which the entire sequence lacks reliable similarity to any known structure. Surprisingly, most dark proteins cannot be accounted for by conventional explanations (e.g., intrinsic disorder, transmembrane regions, or compositional bias). Dark proteins are most strongly associated with secretion, the endoplasmic reticulum, specific tissues, disulfide bonding, proteolytic cleavage, and shorter sequence length. They also have surprisingly few interactions with other proteins, and, in humans, some association with cancer and retroviruses. Our results suggest new research directions in structural and computational biology.
|Authors||Perdigao, N.; Heinrich, J.; Stolte, C.; Sabir, K.S.; Buckley, M.J.; Tabor, B.; Signal, B.; Gloss, B.S.; Hammang, C.J.; Rost, B.; Schafferhans, A.; O'Donoghue, S.I.|
|Publisher Name||Proceedings of the National Academy of Sciences of the United States of America|
|URL link to publisher's version||http://www.ncbi.nlm.nih.gov/pubmed/26578815|
|OpenAccess link to author's accepted manuscript version||https://publications.gimr.garvan.org.au/open-access/12554|