UGA RCC: Research Computing Center
 
 

Blast Database

NCBI Blast Database | COG | KEGG | Refseq | FISH | Uniprot | Arabidopsis | Rice (TIGR) | Populus | Medicago | NCBI Bacteria | Homo_sapiens | Sus_scrofa | Installation
Program on: altix | inQuiry | pcluster | rcluster, IOB

Category(ies): Bioinformatics

NCBI Blast Database:

Updated on 07/14/2008
The latest version is always at /db/ncbiblast-latest/
07/2008 version is at /db/ncbiblast/07142008 (est_others, nr, nt)
03/2008 version is at /db/ncbiblast/03142008 (est_others, nr, nt)
11/2007 version is at /db/ncbiblast/11262007 (est_others, nr, nt)
06/2007 version is at /db/ncbiblast/06042007 (est_others, nr, nt)
02/2007 version is at /db/ncbiblast/02082007 (est_others, nr, nt)
11/2006 version is at /db/ncbiblast/11162006 (nr, nt)
08/2006 version is at /db/ncbiblast/082006 (ecoli, est, nr, nt)
For historical reasons, /db/ncbiblast still points to the 08/2006 version (deprecated)

nr, nt, est, ecoli

/db/ncbiblast-latest/nr
/db/ncbiblast-latest/nt
/db/ncbiblast-latest/est_others
/db/ncbiblast/ecoli
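
As a usage sketch: the paths above are what the legacy blastall program (the search tool that pairs with formatdb) takes as its -d argument. The helper below only prints a command line so it can be reviewed before being run or submitted to a queue. The function name blast_cmd, the query file query.fasta, the output file name, and the -e 1e-5 -m 8 settings are illustrative assumptions, not site defaults.

```shell
#!/bin/sh
# Directory holding the current NCBI databases (from the listing above).
DB_ROOT=/db/ncbiblast-latest

blast_cmd() {
    # $1 = blastall program (blastp, blastn, blastx, ...)
    # $2 = database name under DB_ROOT (nr, nt, est_others)
    # $3 = query FASTA file (hypothetical name)
    # Output goes to <query basename>.<database>.out in tabular format (-m 8).
    printf 'blastall -p %s -d %s/%s -i %s -o %s.%s.out -e 1e-5 -m 8\n' \
        "$1" "$DB_ROOT" "$2" "$3" "${3%.*}" "$2"
}

# Protein query against nr.
blast_cmd blastp nr query.fasta
```

To pin a run to one dated release instead of the moving latest copy, set DB_ROOT to a dated directory such as /db/ncbiblast/07142008.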

Updated from NCBI
Details at NCBI BLAST


COG:
COG (Clusters of Orthologous Groups of proteins); the updated version includes eukaryotes.

Not available on altix

Clusters of Orthologous Groups from 7 eukaryotic complete genomes (KOGs): kyva
Clusters of Orthologous Groups from 66 complete genomes: myva

/db/ncbiblast/COG/kyva
/db/ncbiblast/COG/myva

Updated from NCBI COG on 04/07/2006
Details at COG, KOG


KEGG:
Last updated: 07/2008
Made by:
formatdb -n kgenespep0708 -i genes.pep -l kgenespep0708.log -p T
formatdb -n kgenesgenome0708 -i genes.genome -l kgenome0708.log -p T
formatdb -n kgenesnuc0708 -i genes.nuc -l kgenesnuc0708.log -p F
Details at KEGG

Three databases, built separately in 07/2008:
kgenespep0708 for genes peptide
kgenesgenome0708 for genome
kgenesnuc0708 for genes nucleotide

/db/ncbiblast/KEGG/kgenespep0708
/db/ncbiblast/KEGG/kgenome0708
/db/ncbiblast/KEGG/kgenes-nt0708

Three databases, built separately on 06/25/2007:
kgenespep07 for genes peptide
kgenesgenome07 for genome
kgenesnuc07 for genes nucleotide

/db/ncbiblast/KEGG/kgenespep07
/db/ncbiblast/KEGG/kgenome07
/db/ncbiblast/KEGG/kgenes-nt07



Three databases, built separately in 2006:
kgenes for genes
kgenome for genome
kgenes-nt for nucleotide

/db/ncbiblast/KEGG/kgenes
/db/ncbiblast/KEGG/kgenome
/db/ncbiblast/KEGG/kgenes-nt

FISH:
Not available on altix

catfish-est-ncbi
zebra-fish-genome-ensembl
zebra-fish-est-ncbi
tetrahymena-thermophila-est-ncbi
tetrahymena-thermophilasgenome-TIGR
plasmodium-falciparum-3D7-genome-ncbi
paramecium-tetraurelia-genome-genoscope

/db/ncbiblast/fish/catfish-est-ncbi
/db/ncbiblast/fish/zebra-fish-genome-ensembl
/db/ncbiblast/fish/zebra-fish-est-ncbi
/db/ncbiblast/fish/tetrahymena-thermophila-est-ncbi
/db/ncbiblast/fish/tetrahymena-thermophilasgenome-TIGR
/db/ncbiblast/fish/plasmodium-falciparum-3D7-genome-ncbi
/db/ncbiblast/fish/paramecium-tetraurelia-genome-genoscope

Updated on 04/07/2006
Details at FISH

Uniprot:
Not available on altix

Latest version:
UniProt Knowledgebase Release 13.6 consists of:
UniProtKB/Swiss-Prot Release 55.6
UniProtKB/TrEMBL Release 38.6

Released on 01-Jul-2008; installed at /db/uniprot/ver13.6
Older versions are at /db/uniprot/ver7.0, /db/uniprot/ver9.6, /db/uniprot/ver13.1

uniprot_trembl
uniprot_sprot
uniprot_sprot_varsplic
uniref100

/db/uniprot/latest-version/uniprot_trembl
/db/uniprot/latest-version/uniprot_sprot
/db/uniprot/latest-version/uniprot_sprot_varsplic
/db/uniprot/latest-version/uniref100


formatdb -i uniprot_trembl.fasta -n uniprot_trembl -t uniprot_trembl -l uniprot_trembl.log -p T
formatdb -i uniprot_sprot.fasta -n uniprot_sprot -t uniprot_sprot -l uniprot_sprot.log -p T
formatdb -i uniprot_sprot_varsplic.fasta -n uniprot_sprot_varsplic -t uniprot_sprot_varsplic -l uniprot_sprot_varsplic.log -p T
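
The three commands above follow one pattern: -i is the input FASTA file, -n the database base name, -t the title, -l the log file, and -p the sequence type (T for protein, F for nucleotide). A minimal sketch of that pattern as a helper that prints the command for each file is shown here; the function name formatdb_cmd is hypothetical, and the helper assumes input files end in .fasta.

```shell
#!/bin/sh
formatdb_cmd() {
    # $1 = FASTA file ending in .fasta
    # $2 = T for protein, F for nucleotide
    # The database name, title, and log name are all derived from the file name.
    name=$(basename "$1" .fasta)
    printf 'formatdb -i %s -n %s -t %s -l %s.log -p %s\n' \
        "$1" "$name" "$name" "$name" "$2"
}

# Print the commands for the UniProt protein sets listed above.
for db in uniprot_trembl uniprot_sprot uniprot_sprot_varsplic; do
    formatdb_cmd "$db.fasta" T
done
```

Printing rather than executing keeps the sketch safe to run on a host where formatdb or the FASTA files are absent; the output can be piped to sh once it has been checked.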

Details at Uniprot


Refseq:
Refseq from NCBI.
Updated on 07/14/2008
The latest version is always at /db/ncbiblast/refseq-latest/
Version 07/2008 is at /db/ncbiblast/refseq/072008
Version 03/2008 is at /db/ncbiblast/refseq/032008
Version 06/2007 is at /db/ncbiblast/refseq
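
Because the latest path moves with every update, it can be useful to record which dated release a job actually used. The sketch below assumes that refseq-latest is a symbolic link to one of the dated directories; the page does not state this, so treat it as an assumption. The function name resolve_db_dir is hypothetical.

```shell
#!/bin/sh
resolve_db_dir() {
    # Print the physical directory behind a (possibly symlinked) path.
    # Returns non-zero and prints nothing if the directory does not exist.
    ( cd "$1" 2>/dev/null && pwd -P )
}

# Report the release behind the moving path, or a note if it is absent here.
resolve_db_dir /db/ncbiblast/refseq-latest ||
    echo "refseq-latest not found on this host"
```

Writing the resolved path into a job log makes it possible to rerun the search later against the same release.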

Directly downloaded from NCBI

/db/ncbiblast/refseq-latest/refseq_genomic
/db/ncbiblast/refseq-latest/refseq_protein
/db/ncbiblast/refseq-latest/refseq_rna

Updated from NCBI
Details at NCBI Refseq


Arabidopsis:
Release 8 at:
/db/ncbiblast/TAIR8/T83u
/db/ncbiblast/TAIR8/T85u
/db/ncbiblast/TAIR8/T8bacCon
/db/ncbiblast/TAIR8/T8cdna
/db/ncbiblast/TAIR8/T8cds
/db/ncbiblast/TAIR8/T8downstream1000
/db/ncbiblast/TAIR8/T8downstream3000
/db/ncbiblast/TAIR8/T8downstream500
/db/ncbiblast/TAIR8/T8dsTranslationStart1000
/db/ncbiblast/TAIR8/T8dsTranslationStart3000
/db/ncbiblast/TAIR8/T8intergenic
/db/ncbiblast/TAIR8/T8intron
/db/ncbiblast/TAIR8/T8pep
/db/ncbiblast/TAIR8/T8seq
/db/ncbiblast/TAIR8/T8upstream1000
/db/ncbiblast/TAIR8/T8upstream3000
/db/ncbiblast/TAIR8/T8upstream500
/db/ncbiblast/TAIR8/T8usTranslationStart1000
/db/ncbiblast/TAIR8/T8usTranslationStart3000

Release 7 at:
/db/ncbiblast/TAIR7/
/db/ncbiblast/TAIR7/TAIR7_cdna_20070425
/db/ncbiblast/TAIR7/TAIR7_cds_20070425
/db/ncbiblast/TAIR7/TAIR7_pep_20070425

Release 6 at:
/db/ncbiblast/TAIR6/TAIR6_cdna_20060907
/db/ncbiblast/TAIR6/TAIR6_cds_20060907
/db/ncbiblast/TAIR6/TAIR6_pep_20060907

For more details on the databases, please refer to the README.
Made by:

formatdb -i TAIR8_3_utr_20080228.fasta -n T83u -t T83u -l T83u.log -p F
formatdb -i TAIR8_5_utr_20080228.fasta -n T85u -t T85u -l T85u.log -p F
formatdb -i TAIR8_bac_con_20080229.fasta -n T8bacCon -t T8bacCon -l T8bacCon.log -p F
formatdb -i TAIR8_cdna_20080412.fasta -n T8cdna -t T8cdna -l T8cdna.log -p F
formatdb -i TAIR8_cds_20080412.fasta -n T8cds -t T8cds -l T8cds.log -p F
formatdb -i TAIR8_downstream_1000_20080229.fasta -n T8downstream1000 -t T8downstream1000 -l T8downstream1000.log -p F
formatdb -i TAIR8_downstream_3000_20080229.fasta -n T8downstream3000 -t T8downstream3000 -l T8downstream3000.log -p F
formatdb -i TAIR8_downstream_500_20080228.fasta -n T8downstream500 -t T8downstream500 -l T8downstream500.log -p F
formatdb -i TAIR8_downstream_translation_start_1000_20080708.fasta -n T8dsTranslationStart1000 -t T8dsTranslationStart1000 -l T8dsTranslationStart1000.log -p F
formatdb -i TAIR8_downstream_translation_start_3000_20080708.fasta -n T8dsTranslationStart3000 -t T8dsTranslationStart3000 -l T8dsTranslationStart3000.log -p F
formatdb -i TAIR8_intergenic_20080229.fasta -n T8intergenic -t T8intergenic -l T8intergenic.log -p F
formatdb -i TAIR8_intron_20080228.fasta -n T8intron -t T8intron -l T8intron.log -p F
formatdb -i TAIR8_pep_20080412.fasta -n T8pep -t T8pep -l T8pep.log -p T
formatdb -i TAIR8_seq_20080412.fasta -n T8seq -t T8seq -l T8seq.log -p F
formatdb -i TAIR8_upstream_1000_20080229.fasta -n T8upstream1000 -t T8upstream1000 -l T8upstream1000.log -p F
formatdb -i TAIR8_upstream_3000_20080229.fasta -n T8upstream3000 -t T8upstream3000 -l T8upstream3000.log -p F
formatdb -i TAIR8_upstream_500_20080228.fasta -n T8upstream500 -t T8upstream500 -l T8upstream500.log -p F
formatdb -i TAIR8_upstream_translation_start_1000_20080708.fasta -n T8usTranslationStart1000 -t T8usTranslationStart1000 -l T8usTranslationStart1000.log -p F
formatdb -i TAIR8_upstream_translation_start_3000_20080708.fasta -n T8usTranslationStart3000 -t T8usTranslationStart3000 -l T8usTranslationStart3000.log -p F
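
With this many files, the -p flag is easy to get wrong: it must be T for protein input and F for nucleotide input. The following sketch is a rough heuristic, not part of formatdb, that guesses the flag from the residues in the first 200 sequence lines of a FASTA file. The function name guess_p_flag and the 10% threshold are assumptions chosen for illustration.

```shell
#!/bin/sh
guess_p_flag() {
    # Print T (protein) or F (nucleotide) for the FASTA file in $1.
    # Header lines are dropped, the sample is uppercased, and the share of
    # characters outside A, C, G, T, U, N is measured.
    grep -v '^>' "$1" | head -n 200 | tr -d '[:space:]' |
        tr '[:lower:]' '[:upper:]' |
        awk '{ total = length($0); gsub(/[ACGTUN]/, ""); other = length($0) }
             END { if (total > 0 && other / total > 0.1) print "T"; else print "F" }'
}

# Check two of the TAIR8 inputs if they are present in the current directory.
for f in TAIR8_pep_20080412.fasta TAIR8_cds_20080412.fasta; do
    if [ -f "$f" ]; then
        echo "$f: -p $(guess_p_flag "$f")"
    fi
done
```

The guess is only a sanity check; nucleotide files that use many ambiguity codes other than N could be misclassified, so the result should be confirmed against the file's documentation.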

Data Source: TAIR and TAIR FTP


Rice (TIGR):
Not available on altix

The latest version is TIGR Rice Annotation Release 5.0 (January 24, 2007)

/db/ncbiblast/rice.tigr/allcon
/db/ncbiblast/OSA/allcdna
/db/ncbiblast/OSA/allcds
/db/ncbiblast/OSA/allpep


The databases were built with the following commands:

formatdb -t allcon -i all.con -l allconP.log -p F -n allcon
formatdb -i all.cDNA -n allcdna -t allcdna -l allcdna.log -p F
formatdb -i all.cds -n allcds -t allcds -l allcds.log -p F
formatdb -i all.pep -n allpep -t allpep -l allpep.log -p T

Sequence files were downloaded from TIGR

Populus:
From PTR_JGI.seqs.tgz

/db/ncbiblast/PTR/ptrcdna
/db/ncbiblast/PTR/ptrcds
/db/ncbiblast/PTR/ptrtran
/db/ncbiblast/PTR/ptrprotein

Made by:

formatdb -i ptr.cdna -n ptrcdna -t ptrcdna -l ptrcdna.log -p F
formatdb -i ptr.cds -n ptrcds -t ptrcds -l ptrcds.log -p F
formatdb -i transcripts.Poptr1_1.JamboreeModels.fasta -n ptrtran -t ptrtran -l ptrtran.log -p F
formatdb -i proteins.Poptr1_1.JamboreeModels.fasta -n ptrprotein -t ptrprotein -l ptrprotein.log -p T -o T

Data source: JGI


Medicago :
From MGSC
MT2:
/db/ncbiblast/MT2/Mt2cds
/db/ncbiblast/MT2/Mt2prot
/db/ncbiblast/MT2/MtChr2

MT1:
/db/ncbiblast/MT1/Mt1cdna
/db/ncbiblast/MT1/Mt1pep

For details on the databases, please refer to the README.

Made by:

formatdb -i 20080227_imgag_cdsMAPPED_NO_OVERLAP.fa -n Mt2cds -t Mt2cds -l Mt2cds.log -p F
formatdb -i 20080227_imgag_protMAPPED_NO_OVERLAP.fa -n Mt2prot -t Mt2prot -l Mt2prot.log -p T
cat Mtchr* > Mt2Chr.fa
formatdb -i Mt2Chr.fa -n Mt2Chr -t Mt2Chr -l Mt2Chr.log -p F

formatdb -i Mt1.cds -n Mt1cdna -t Mt1cds -l Mt1cds.log -p F
formatdb -i Mt1.pep -n Mt1pep -t Mt1pep -l Mt1pep.log -p T
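
When several chromosome files are concatenated before formatting, as with the Mtchr* files above, a quick check is to confirm that the merged file holds as many FASTA records as the parts. The helper below is a minimal sketch; the function name count_fasta_records is hypothetical, and Mt2Chr.fa is the merged file name taken from the formatdb command above.

```shell
#!/bin/sh
count_fasta_records() {
    # Count FASTA header lines (lines starting with '>') across all given files.
    cat "$@" | grep -c '^>'
}

# Compare the merged file with its parts if it is present in this directory.
if [ -f Mt2Chr.fa ]; then
    echo "merged: $(count_fasta_records Mt2Chr.fa)"
    echo "parts:  $(count_fasta_records Mtchr*)"
fi
```

If the two counts differ, the merge was incomplete or picked up an extra file, and the database should not be formatted from it.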

Data source: Medicago


NCBI Bacteria:
From NCBI
Updated on 11/10/2007
The latest version is always at /db/ncbi-genomes-bacteria/Bacteria-latest/
11/2007 version is at /db/ncbi-genomes-bacteria/Bacteria112007
For historical reasons, /db/ncbi-genomes-bacteria/Bacteria still points to the 08/2006 version (deprecated)

/db/ncbi-genomes-bacteria/Bacteria-latest


Homo_sapiens:
From Homo_sapien
Updated on 19/03/2008 (Homo_sapiens.NCBI36.48.dna_rm.*)
The latest version is always at /db/homo_sapien/latest/
19/03/2008 version is at /db/homo_sapien/NCBI36.48

/db/homo_sapien/homo_sapien-latest/Homo_sapiens


Sus_scrofa:
From EBI
Updated on 19/03/2008.
To get the sequences: select a chromosome from the pull-down box, click the [GO] button, then use “Export sequence as FASTA”. The file structure is explained at http://pre.ensembl.org/info/data/ftp_files.html .

The latest version is always at /db/sus_scrofa/latest/
19/03/2008 version is at /db/sus_scrofa/0308

/db/sus_scrofa/latest/sus_scofa


Installation:

System(s): Unix


 