DDRprot: DDR Proteins Database Documentation

FAQs

1- How are the homologous sequences calculated?

We started from a set of 129 manually curated human proteins, detected their orthologues using modified parameters, and checked that the PFAM domain architectures were conserved. For details, see our paper here.
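As an illustration of the domain-architecture check, here is a minimal Python sketch. It assumes an architecture is the ordered list of PFAM accessions found along a protein, and that "conserved" means an exact match of that list; the actual criterion is described in the paper and may be more permissive. The function names, protein names and accessions below are hypothetical placeholders, not data from DDRprot.

```python
def same_architecture(human_domains, candidate_domains):
    """True if both proteins carry the same domains in the same order.

    Assumption: strict equality of the ordered accession lists.
    """
    return list(human_domains) == list(candidate_domains)


def filter_by_architecture(human_domains, candidates):
    """Keep the candidate orthologues whose architecture matches the human one.

    `candidates` maps a protein name to its ordered list of domain accessions.
    """
    return [
        name
        for name, domains in candidates.items()
        if same_architecture(human_domains, domains)
    ]


# Hypothetical example: placeholder accessions, not real annotations.
human = ["PF_A", "PF_B", "PF_B"]
candidates = {
    "mouse_protein": ["PF_A", "PF_B", "PF_B"],  # same architecture
    "yeast_protein": ["PF_A", "PF_B"],          # one domain missing
    "fly_protein": ["PF_B", "PF_A", "PF_B"],    # same domains, different order
}
kept = filter_by_architecture(human, candidates)
```

With these placeholder inputs only `mouse_protein` is kept, because the other two candidates differ in domain count or order.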

2- How are ages calculated?

We assigned ages following a species tree, available here. The age of a gene is the deepest group of the tree in which a homologue of the human protein is found. For details, see our paper here.
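The age rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lineage below is a hypothetical, simplified path from human to the root, not the DDRprot species tree, and `assign_age` is a name introduced here for the example.

```python
# Hypothetical lineage, ordered from the shallowest group (closest to human)
# to the deepest one. The real species tree is the one linked above.
LINEAGE = ["Mammalia", "Vertebrata", "Metazoa", "Opisthokonta", "Eukaryota"]


def assign_age(groups_with_homologue, lineage=LINEAGE):
    """Return the deepest group in `lineage` that contains a homologue.

    Returns None when no group in the lineage has a homologue.
    """
    age = None
    for group in lineage:  # walk from shallowest to deepest
        if group in groups_with_homologue:
            age = group  # keep overwriting, so the deepest hit wins
    return age


# A gene with homologues in mammals and in other metazoans, but none deeper.
example_age = assign_age({"Mammalia", "Metazoa"})
```

Here `example_age` is `"Metazoa"`: the absence of a hit in an intermediate group (Vertebrata) does not make the gene younger, because only the deepest group with a homologue counts.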

3- Where do the post-translational modifications come from?

They come from the available literature. Links to the sources are provided in the database.

4- How are pathways assigned?

Pathways were assigned according to the available literature and checked by experts in the field.

5- Why are some proteins missing from some pathways (e.g. paralogues of repair genes)?

In the paper we aimed to build a reliable, bona fide set with as little redundancy as possible, so that statistical analyses could be run on it. For this reason, many paralogues have not been included in the database.

Links to methods used to generate data

Inparanoid, a method to identify groups of orthologous proteins across multiple genomes
MAFFT, a multiple sequence alignment tool
Count, to analyze phylogenetic profiles
iTOL, an interactive phylogenetic tree visualizer
MrBayes, to compute phylogenetic probabilistic trees
Jalview, a sophisticated multiple sequence alignment visualizer
Belvu, a simple multiple sequence alignment visualizer
HMMER V.3, sequence searches using hidden Markov models
BioJS, WebGL protein structure viewer.
And our paper in Mol. Biol. Evol.

Workflow

Downloads

Download the database here as a MySQL dump. (Schema)
Download the full FASTA file with all human DDR proteins and their orthologues. (Species codes)
Download all protein family alignments and trees. (Family codes)
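Once downloaded, the FASTA file can be read with a few lines of plain Python. The sketch below is a generic FASTA reader, not a DDRprot tool: it makes no assumption about the header layout and simply keeps the whole header line as the key. The demo records are invented for illustration; to read the real file, pass an open file handle instead of the list.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {header: sequence}.

    `lines` can be any iterable of strings, for example an open file.
    Sequences split over several lines are joined into one string.
    """
    records = {}
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)  # store the previous record
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)  # store the last record
    return records


# Invented records, for illustration only.
demo = [">protein_1 example", "MKTA", "YIAK", "", ">protein_2", "MSDE"]
sequences = read_fasta(demo)
```

For the downloaded file, the call would look like `read_fasta(open(path))`, where `path` is wherever the file was saved.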

The DDRprot team

Ildefonso Cases
Eduardo Andres-Leon
Aida Arcas
Ana M. Rojas