Available Databanks at the Institut Pasteur


Page revised: Tue Jan 31 04:51:41 2012 (CET)

The databanks are downloaded from their original sites over the Internet. An automatic system checks the original version of each databank daily and downloads the newest data needed to build an update.
The simplest format (flat or FASTA) is retrieved; local formatters then process it to build the other available formats (Blast2, SRS, GCG, ...).
This task runs each night and takes a few hours.
The "Date of preparation" is the date and time at which the databank became available locally to users. Databanks are normally changed twice a week: on Wednesday and Sunday at midnight.
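As an illustration of the twice-weekly schedule, the following minimal Python sketch computes the next scheduled change after a given moment. It rests on assumptions that are not stated on this page: "midnight" is taken to mean 00:00 at the start of Wednesday and Sunday, times are naive local times (no time zone handling), and the function name next_switch is hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: "midnight" means 00:00 at the start of Wednesday and Sunday.
# datetime.weekday() counts from Monday=0, so Wednesday=2 and Sunday=6.
SWITCH_WEEKDAYS = {2, 6}


def next_switch(now: datetime) -> datetime:
    """Return the first Wednesday or Sunday 00:00 strictly after `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    for offset in range(1, 8):
        candidate = midnight + timedelta(days=offset)
        if candidate.weekday() in SWITCH_WEEKDAYS:
            return candidate
    raise AssertionError("unreachable: any 7 consecutive days include a Wednesday")


# The page revision time shown above falls on a Tuesday.
print(next_switch(datetime(2012, 1, 31, 4, 51)))  # 2012-02-01 00:00:00
```

Because the schedule is described as "normally" twice a week, the result is only the expected time of the next change, not a guarantee.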
Article in French: Databanks at the Institut Pasteur / Les banques de séquences biologiques à l'Institut Pasteur - F. Galisson (2000)
Detailed list

DNA databanks

Name           Version   Date of preparation  Description
Embl           110       25 Jan 2012 20:32    EMBL Nucleotide Sequence Database
Genbank        187.0     30 Jan 2012 15:52    NIH DNA sequence database
Imgt           201147-6  29 Nov 2011 01:10    IMGT/LIGM-DB, ImMunoGeneTics sequence database
RefSeq         51        31 Jan 2012 04:23    NCBI Reference Sequence (RefSeq) Database
Vector         -         11 Sep 2010 00:16    Vector subset of GenBank(R), NCBI
WGS            -         30 Jan 2012 22:37    Genbank - Whole Genome Shotgun

Protein databanks

Name           Version   Date of preparation  Description
Genpept        186.0     25 Jan 2012 21:45    Translations of Genbank
Nrprot         01.31     30 Jan 2012 23:02    NCBI non-redundant: (Genbank CDS translations+PDB+Swissprot+PIR)
UniRef90       -         26 Jan 2012 01:48    Clustered sets of sequences from UniProt Knowledgebase
Uniprot        2012_01   27 Jan 2012 02:48    Universal Protein Resource = SwissProt + TrEMBL + PIR

Structure databanks

Name           Version   Date of preparation  Description
Dssp           -         25 Jan 2012 19:30    Secondary structure assignments database
Hssp           -         27 Jan 2012 04:15    Homology-derived secondary structure of proteins database
Pdb            -         24 Jan 2012 14:03    Brookhaven Protein Databank

Genome databanks

Name           Version   Date of preparation  Description
Borrelia       -         25 Jan 2012 21:45    Borrelia burgdorferi B31, complete genome
Bsubtilis      -         10 Jan 2012 11:36    Bacillus subtilis subsp. subtilis str. 168, complete genome
Cpneumoniae    -         11 Dec 2011 00:17    Chlamydophila pneumoniae CWL029, complete genome
Ecoli          -         26 Jan 2012 01:49    Escherichia coli K12, complete genome
Genitalium     -         26 Jan 2012 01:49    Mycoplasma genitalium G-37, complete genome
Hpylori        -         11 Dec 2011 00:18    Helicobacter pylori 26695, complete genome
Mtuberculosis  -         13 Mar 2011 00:13    Mycobacterium tuberculosis H37Rv, complete genome
Pneumoniae     -         26 Jan 2012 01:50    Mycoplasma pneumoniae M129, complete genome
UFMG           -         13 Apr 2011 09:54    UnFinished Microbial Genomes (UFMG)
Yeast          -         19 Sep 2011 00:20    Yeast chromosomes
Ypestis        -         26 Jan 2012 21:05    Yersinia pestis unfinished genome

Other databanks

Name           Version   Date of preparation  Description
Alu            -         29 Jan 2012 15:35    Select Alu repeats from REPBASE
Enzyme         20120131  26 Jan 2012 01:49    Enzyme Nomenclature Database
Epd            105       30 Mar 2011 22:03    Eukaryotic Promoter Database
Jaspar         -         24 Jul 2009 15:39    High-quality transcription factor binding profile database
Pfam           26.0      31 Jan 2012 04:47    Collection of Protein Domain Families
Prints         41.1      30 Mar 2011 15:53    Protein Motif Fingerprint Database
Prosite        20.79     26 Jan 2012 01:51    Dictionary of Protein Sites and Patterns
RDPII          10_27     9 Dec 2011 15:54     Ribosomal Database Project II database
Rebase         201       10 Jan 2012 16:50    The Restriction Enzyme Database
Taxodb         -         31 Jan 2012 04:51    Taxonomy database
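
Every row in the lists above has the same layout: a name, a version (or "-" when none is given), a date of preparation written as day, month abbreviation, year and hh:mm, and then a free-text description. For readers who want to process the listing with a script, here is a minimal Python sketch that parses one row. The names Databank, ROW and parse_row are hypothetical and chosen for illustration; this is not a tool provided by the Institut Pasteur. The date is kept as a naive local time.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# English month abbreviations as they appear in the listing.
MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


class Databank(NamedTuple):
    name: str
    version: Optional[str]  # None when the listing shows "-"
    prepared: datetime      # naive local time, no time zone attached
    description: str


# Fields are separated by one or more spaces; the description takes the rest.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<version>\S+)\s+"
    r"(?P<day>\d{1,2})\s+(?P<month>[A-Z][a-z]{2})\s+(?P<year>\d{4})\s+"
    r"(?P<hour>\d{2}):(?P<minute>\d{2})\s+"
    r"(?P<description>\S.*)$"
)


def parse_row(line: str) -> Databank:
    """Parse one listing row; raise ValueError for headers or other text."""
    m = ROW.match(line.strip())
    if m is None or m["month"] not in MONTHS:
        raise ValueError(f"not a databank row: {line!r}")
    prepared = datetime(int(m["year"]), MONTHS[m["month"]], int(m["day"]),
                        int(m["hour"]), int(m["minute"]))
    version = None if m["version"] == "-" else m["version"]
    return Databank(m["name"], version, prepared, m["description"])


row = parse_row("Vector - 11 Sep 2010 00:16 Vector subset of GenBank(R), NCBI")
print(row.name, row.version, row.prepared.isoformat())
# Vector None 2010-09-11T00:16:00
```

The month table is spelled out by hand rather than parsed with strptime and %b, because %b depends on the locale of the machine running the script.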