get.seq {bio3d}    R Documentation

Download FASTA Sequence Files

Description

Downloads FASTA sequence files from the NR or SWISSPROT/UNIPROT databases.

Usage

get.seq(ids, outfile = "seqs.fasta", db = "nr")

Arguments

ids A character vector of one or more database codes/identifiers for the sequences to be downloaded.
outfile A single-element character vector specifying the name of the local file to which sequences will be written.
db A single-element character vector specifying the database from which sequences are to be obtained.

Details

This is a basic function to automate sequence file download from the NR and SWISSPROT/UNIPROT databases.

Value

If all files are successfully downloaded, a list object with two components is returned:

ali an alignment character matrix with a row per sequence and a column per equivalent amino acid/nucleotide.
ids sequence names as identifiers.

This is similar to the object returned by read.fasta. However, if some files were not successfully downloaded, a vector listing the ids that were not found is returned instead.
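
Because the return value can take either of two shapes, calling code may want to check which one it received before using it. The following is a minimal sketch, not part of bio3d: the helper name seqs_downloaded is hypothetical, and the two objects are mock values standing in for real get.seq output so that the example runs without network access.

```r
# Hypothetical helper (not a bio3d function): TRUE if 'res' looks like the
# list described above (components 'ali' and 'ids'), FALSE otherwise.
seqs_downloaded <- function(res) {
  is.list(res) && all(c("ali", "ids") %in% names(res))
}

# Mock return values (assumed shapes, not real downloads)
ok <- list(
  ali = matrix(c("M", "K", "-", "T", "E", "Y"), nrow = 2, byrow = TRUE),
  ids = c("seq1", "seq2")
)
failed <- c("badid1", "badid2")   # ids that were not found

seqs_downloaded(ok)       # TRUE
seqs_downloaded(failed)   # FALSE
```

In practice the same check could be applied directly to the result of a get.seq call before passing it on to other functions.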

Note

For a description of FASTA format see: http://www.ebi.ac.uk/help/formats_frame.html. When reading alignment files, the dash ‘-’ is interpreted as the gap character.
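
As an illustration of the gap convention, the sketch below counts gap characters per sequence in an alignment matrix. The matrix is a mock object built by hand in the row-per-sequence layout described under Value; it is not output from get.seq or read.fasta.

```r
# Mock alignment matrix: one row per sequence, one column per position
ali <- matrix(c("M", "K", "-", "T",
                "M", "-", "-", "T"),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("seq1", "seq2"), NULL))

# Count positions holding the gap character '-' in each sequence
gap_counts <- rowSums(ali == "-")
gap_counts
```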

Author(s)

Barry Grant

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695–2696.

See Also

blast.pdb, read.fasta, read.fasta.pdb, get.pdb

Examples

##pdb <- read.pdb( get.pdb("5p21", URLonly=TRUE) )
##blast <- blast.pdb( seq.pdb(pdb), database = "swiss" )
##ids <- plot.blast( blast )
##get.seq(ids)

[Package bio3d version 1.0-6 Index]