The Open Protein Structure Annotation Network

DUFs

    PFAM DUF families solved by PSI centers

     

    The genome projects have unearthed an enormous diversity of novel genes of unknown function that require biological and biochemical characterization to assess their role in the organism(s) from which they were derived. These genes, like all others, can be grouped into families based on sequence similarity.

    duf_sizes.PNG

    The PFAM database 23.0 contains over 2200 such families, referred to as Domains of Unknown Function (DUF). In a coordinated effort, the four large-scale centers of the NIH Protein Structure Initiative have determined the first three-dimensional structures for more than 250 of these DUF families. Analysis of the first 248, solved by October 2008, reveals that they vary considerably in size (with an average of 252 proteins) and in the contributions from sequenced genomes and from metagenomic data (see the chart duf_sizes.PNG).

    The analysis also shows that about two thirds of the DUF families likely represent very divergent branches of already known and well-characterized families, which allows us to propose hypotheses about their biological function. The remainder can be formally categorized as new folds or topologies, although about one third of these show significant sub-structure similarity to previously characterized folds. Homology to functionally annotated protein families remains an important clue in proposing hypotheses about the functions of DUF families, but it is usually not sufficient for a reliable functional annotation.

    The chart below shows the overall percentages of DUF families with new folds, new folds partially similar to previously known folds, putative analogs, putative homologs and recognizable homologs. The inset pie charts show the percentage of DUF families with a proposed hypothesis about function in each of these categories.

    homology.PNG

    From a more general perspective, our results suggest that, despite the enormous increase in the number and diversity of new genes being uncovered, the fold space of the proteins encoded by those genes is gradually becoming saturated. These previously unexplored sectors of the protein universe are therefore shaped primarily by extreme diversification of known protein families, which enables organisms to evolve new functions and adapt to particular niches and habitats. Nevertheless, these DUF families still constitute the richest source for discovery of the remaining protein folds and topologies.

    Our structural analysis of the DUF families solved by PSI centers was published in PLoS Biology:

    http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1000205  

    A list of PFAM DUF families solved by PSI centers 

    DUFs: 248

    Displaying: 0 - 10

    Annotation | Solved by | Fold Type

    PF01519: Pfam family PFAM:PF01519 {Protein of unknown function DUF16} with 42 members in NR database and additional 2 members in the metagenomic datasets is represented in Archaea and Bacteria. The first st... | BSGC | Homolog

    PF01796: Pfam family PFAM:PF01796 {Domain of unknown function DUF35} with 1010 members in NR database and additional 561 members in the metagenomic datasets is represented in Archaea and Bacteria. The first... | JCSG | Homolog

    PF01861: Pfam family PFAM:PF01861 {Protein of unknown function DUF43} with 30 members in NR database. The first structural representative solved (PDB Id: TOPSAN:2qm3) was subject to FATCAT structural similarit... | MCSG | Homolog

    PF01865: Pfam family PFAM:PF01865 {Protein of unknown function DUF47} with 595 members in NR database and additional 352 members in the metagenomic datasets is represented in Archaea and Bacteria. The first... | JCSG | Homolog

    PF01877: Pfam family PFAM:PF01877 {Protein of unknown function DUF54} with 175 members in NR database and additional 89 members in the metagenomic datasets is represented in Archaea only. The first structural... | NYSGXRC | Putative Analog

    PF01883: Pfam family PFAM:PF01883 {Domain of unknown function DUF59} with 3219 members in NR database and additional 2322 members in the metagenomic datasets is represented in Archaea, Bacteria, and Eukaryota... | JCSG | Putative Homolog

    PF01893: Pfam family PFAM:PF01893 {Uncharacterized protein family UPF0058} with 41 members in NR database and additional 5 members in the metagenomic datasets is represented in Archaea. The first structural r... | NESG | Putative Analog

    PF01904: Pfam family PFAM:PF01904 {Protein of unknown function DUF72} with 478 members in NR database and additional 168 members in the metagenomic datasets is represented in Archaea, Bacteria, and Eukaryota.... | JCSG | Putative Homolog

    PF01906: Pfam family PFAM:PF01906 {Domain of unknown function DUF74} with 756 members in NR database and additional 468 members in the metagenomic datasets is represented in Archaea, Bacteria, and Eukaryota. ... | MCSG | Putative Analog

    PF01908: Pfam family PFAM:PF01908 {Protein of unknown function DUF75} with 848 members in NR database and additional 664 members in the metagenomic datasets is represented in Archaea, Bacteria, and Eukaryota. ... | MCSG | Putative Analog
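
    The annotation lines in the list above follow a fixed wording, so the family identifier and the member counts can be pulled out mechanically. The following is a minimal sketch, not part of the TOPSAN site: the function name parse_annotation and the regular expressions are assumptions based only on the phrasing of the entries shown here, and the two sample strings are shortened copies of the PF01519 and PF01861 entries.

```python
import re

# Assumed patterns, derived from the wording of the entries in this list.
PFAM_ID = re.compile(r"\bPF\d{5}\b")
NR_MEMBERS = re.compile(r"with (\d+) members in NR database")
META_MEMBERS = re.compile(r"additional (\d+) members in the metagenomic datasets")


def parse_annotation(line):
    """Return (pfam_id, nr_members, metagenomic_members) for one entry.

    Hypothetical helper: raises ValueError if the line does not carry
    a Pfam identifier and an NR member count.
    """
    pfam = PFAM_ID.search(line)
    nr = NR_MEMBERS.search(line)
    meta = META_MEMBERS.search(line)
    if pfam is None or nr is None:
        raise ValueError("line does not look like a DUF annotation")
    # Entries without a metagenomic clause (such as PF01861) count as zero.
    meta_count = int(meta.group(1)) if meta else 0
    return pfam.group(0), int(nr.group(1)), meta_count


# Shortened copies of two entries from the list (truncated text omitted).
entries = [
    "PF01519: Pfam family PFAM:PF01519 {Protein of unknown function DUF16 } "
    "with 42 members in NR database and additional 2 members in the "
    "metagenomic datasets is represented in Archaea, and Bacteria.",
    "PF01861: Pfam family PFAM:PF01861 {Protein of unknown function DUF43} "
    "with 30 members in NR database.",
]

parsed = [parse_annotation(entry) for entry in entries]
for pfam_id, nr_count, meta_count in parsed:
    # Print the identifier, both counts, and their sum.
    print(pfam_id, nr_count, meta_count, nr_count + meta_count)
```

    Treating a missing metagenomic clause as zero is a design choice of this sketch; an entry such as PF01861 simply does not mention metagenomic members, and the source does not say whether that means none were found.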

    All content on this site is licensed under a Creative Commons Attribution 3.0 License