FAQs
1- How are the homologous sequences calculated?
We started from the human proteins (129, manually curated), detected their orthologues using modified parameters, and checked that the PFAM domain architectures were conserved. For details, see our paper here.
2- How are ages calculated?
We assigned ages following a species tree, available here. The age of a gene is the deepest group of the tree in which a homologue of the human protein is found. For details, see our paper here.
3- Where do the post-translational modifications come from?
From the available literature. Links are provided in the database.
4- How are pathways assigned?
According to the available literature, and checked by experts in the field.
5- Why are some proteins missing from some pathways (e.g. paralogues of repair genes)?
In the paper we aimed to build a reliable, bona fide set with as little redundancy as possible, so that statistical calculations could be carried out on it. This is why many paralogues have not been included in the database.
Links to methods used to generate data
- Inparanoid, a method to identify groups of orthologous proteins across multiple genomes
- MAFFT, a multiple sequence alignment tool
- Count, to analyze phylogenetic profiles
- iTOL, an interactive phylogenetic tree visualizer
- MrBayes, to compute phylogenetic probabilistic trees
- Jalview, a sophisticated multiple sequence alignment visualizer
- Belvu, a simple multiple sequence alignment visualizer
- HMMER V.3, sequence searches using hidden Markov models
- BioJS, WebGL protein structure viewer.
- And our paper in Mol. Biol. Evol.
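As a rough illustration of the age assignment described in FAQ 2, the sketch below picks the deepest group of a lineage in which a homologue is found. It is only a sketch under stated assumptions: the group names, the species-to-group mapping, and the function name `assign_age` are hypothetical and are not taken from the species tree or the pipeline used for the database.

```python
# Hypothetical lineage, ordered from the youngest group to the deepest one.
# The real groups are defined by the species tree linked in FAQ 2.
LINEAGE = ["Mammalia", "Vertebrata", "Metazoa", "Opisthokonta", "Eukaryota"]

# Hypothetical mapping: species -> group at which it joins the human lineage.
SPECIES_GROUP = {
    "mouse": "Mammalia",
    "zebrafish": "Vertebrata",
    "fruit_fly": "Metazoa",
    "yeast": "Opisthokonta",
    "arabidopsis": "Eukaryota",
}


def assign_age(species_with_homologue):
    """Return the deepest group containing a species with a homologue.

    Returns None when no listed species has a homologue.
    """
    depths = [
        LINEAGE.index(SPECIES_GROUP[species])
        for species in species_with_homologue
        if species in SPECIES_GROUP
    ]
    if not depths:
        return None
    # The largest index corresponds to the deepest (oldest) group.
    return LINEAGE[max(depths)]
```

For example, a protein with homologues in mouse and yeast would be assigned the deepest of the two groups, here the hypothetical "Opisthokonta".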
Downloads
- Download the database here as a MySQL dump. (Schema)
- Download the full FASTA file with all human DDR proteins and their orthologs. (Species codes)
- Download all protein family alignments and trees. (Family codes)
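Once downloaded, the FASTA file can be read with a few lines of code. The following is a minimal sketch that relies only on the general FASTA convention (a header line starting with `>` followed by sequence lines). The function name `read_fasta` and the file name in the usage note are hypothetical, and nothing is assumed about how the headers of this particular file are laid out.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of header -> sequence."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Small in-memory example with made-up entries.
example = [
    ">protein_A",
    "MKTAYIAK",
    "QRQISFVK",
    ">protein_B",
    "MSDNE",
]
sequences = read_fasta(example)
```

To read a downloaded file, pass an open file handle instead of a list, for instance `with open("ddr_proteins.fasta") as handle: sequences = read_fasta(handle)`, where the file name is a placeholder for wherever the download was saved.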