read.pdb {bio3d} | R Documentation |
Read a Protein Data Bank (PDB) coordinate file.
read.pdb(file, maxlines = 50000, multi = FALSE, rm.insert = FALSE, rm.alt = TRUE, het2atom=FALSE, verbose = TRUE)
file |
the name of the PDB file to be read. |
maxlines |
the maximum number of lines to read before giving up with large files. Default is 50,000 lines. |
multi |
logical, if TRUE multiple ATOM records are read for all models in multi-model files. |
rm.insert |
logical, if TRUE PDB insert records are ignored. |
rm.alt |
logical, if TRUE PDB alternate records are ignored. |
het2atom |
logical, if TRUE HETATM PDB records are stored as ATOM records and returned in the output as such. Use with caution. |
verbose |
logical, if TRUE details of the reading process are printed. |
maxlines may require increasing for some large multi-model files. The preferred means of reading such data is via binary DCD format trajectory files (see the read.dcd function).
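As a rough illustration of choosing maxlines, the sketch below counts the lines in a file before reading it. It uses base R only; count_lines is a hypothetical helper that is not part of bio3d, and the temporary file holds a few illustrative stand-in lines rather than a real PDB entry.

```r
# Hypothetical helper (not part of bio3d): count the lines in a file so
# that 'maxlines' can be set high enough to cover the whole file.
count_lines <- function(path) length(readLines(path, warn = FALSE))

# Stand-in file so the sketch runs without a real PDB file.
# The lines are illustrative only.
tmp <- tempfile(fileext = ".pdb")
writeLines(c("ATOM      1  N   MET A   1      10.000  20.000  30.000  1.00 15.00",
             "ATOM      2  CA  MET A   1      11.000  21.000  31.000  1.00 14.00",
             "END"), tmp)

n_lines <- count_lines(tmp)

# Never go below the documented default of 50,000 lines.
needed <- max(50000, n_lines)

# With bio3d installed, the value could then be passed on, for example:
# pdb <- read.pdb(tmp, maxlines = needed)
```

Counting lines first costs one extra pass over the file, which is usually acceptable for a one-off read of a large multi-model file.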
Returns a list of class "pdb" with the following components:
atom |
a character matrix containing all atomic coordinate ATOM data, with a row per ATOM and a column per record type. See below for details of the record type naming convention (useful for accessing columns). |
het |
a character matrix containing atomic coordinate records for atoms within “non-standard” HET groups (see atom). |
helix |
‘start’, ‘end’ and ‘length’ of H type (helix) secondary structure elements (SSEs), where start and end are residue numbers “resno”. |
sheet |
‘start’, ‘end’ and ‘length’ of E type (sheet) secondary structure elements (SSEs), where start and end are residue numbers “resno”. |
seqres |
sequence from SEQRES field. |
xyz |
a numeric vector of ATOM coordinate data. |
calpha |
logical vector with length equal to nrow(atom) with TRUE values indicating a C-alpha “elety”. |
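The relationship between the xyz and calpha components can be sketched with mock data in base R. The values below are invented stand-ins, and the ordering of xyz as x1, y1, z1, x2, y2, z2, ... is an assumption that this page does not state explicitly.

```r
# Mock stand-ins for pdb$xyz and pdb$calpha (three atoms, one C-alpha).
# ASSUMPTION: xyz is ordered x1, y1, z1, x2, y2, z2, ...
xyz    <- c(10, 20, 30,  11, 21, 31,  12, 22, 32)
calpha <- c(FALSE, TRUE, FALSE)

# Reshape the flat vector: one row per atom, one column per axis.
coords <- matrix(xyz, ncol = 3, byrow = TRUE,
                 dimnames = list(NULL, c("x", "y", "z")))

# calpha has one entry per atom, so it indexes the rows of 'coords'.
ca_coords <- coords[calpha, , drop = FALSE]

# To index the flat vector directly, repeat each flag three times.
ca_xyz <- xyz[rep(calpha, each = 3)]
```

Under the stated ordering assumption, both routes select the same three numbers for the single C-alpha atom.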
For both atom and het list components the column names can be used as a convenient means of data access, namely:
Atom serial number “eleno”,
Atom type “elety”,
Alternate location indicator “alt”,
Residue name “resid”,
Chain identifier “chain”,
Residue sequence number “resno”,
Code for insertion of residues “insert”,
Orthogonal coordinates “x”,
Orthogonal coordinates “y”,
Orthogonal coordinates “z”,
Occupancy “o”, and
Temperature factor “b”.
See examples for further details.
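Because atom and het are character matrices, numeric columns need converting before any arithmetic. The following base R sketch uses a small mock matrix with the column names listed above; the values are invented for illustration and do not come from a real structure.

```r
# Mock stand-in for pdb$atom: a character matrix with the documented
# column names (values are illustrative only).
cols <- c("eleno", "elety", "alt", "resid", "chain", "resno",
          "insert", "x", "y", "z", "o", "b")
atom <- matrix(
  c("1", "N",  NA, "MET", "A", "1", NA, "10.000", "20.000", "30.000", "1.00", "15.00",
    "2", "CA", NA, "MET", "A", "1", NA, "11.000", "21.000", "31.000", "1.00", "14.00",
    "3", "C",  NA, "MET", "A", "1", NA, "12.000", "22.000", "32.000", "1.00", "16.00"),
  ncol = 12, byrow = TRUE, dimnames = list(NULL, cols)
)

# Select columns by name, then convert from character to numeric.
b       <- as.numeric(atom[, "b"])
xyz_mat <- apply(atom[, c("x", "y", "z")], 2, as.numeric)

# Row selection by atom type, equivalent in spirit to the calpha component.
is_ca   <- atom[, "elety"] == "CA"
ca_rows <- atom[is_ca, c("resid", "resno", "x", "y", "z"), drop = FALSE]

mean_b <- mean(b)
```

Using drop = FALSE keeps the result a matrix even when only one row matches, which avoids surprises in later indexing by column name.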
Barry Grant
Grant, B.J. et al. (2006) Bioinformatics 22, 2695–2696.
For a description of PDB format (version 2.2) see:
http://www.rcsb.org/pdb/file_formats/pdb/pdbguide2.2/guide2.2_frame.html.
atom.select, write.pdb, read.dcd, read.fasta.pdb, read.fasta
# Read a PDB file
pdb <- read.pdb( system.file("examples/1bg2.pdb", package="bio3d") )

# Print data for the first atom
pdb$atom[1,]

# Look at the first het atom
pdb$het[1,]

# Print some coordinate data
pdb$atom[, c("x","y","z")]

# Print C-alpha coordinates (can also use 'atom.select')
# Note: index the 'atom' matrix here; 'xyz' is a plain numeric vector
# and has no named columns.
pdb$atom[pdb$calpha, c("resid","x","y","z")]

# Print SSE data (for helix and sheet)
pdb$helix
pdb$sheet$start

# Print SEQRES data
pdb$seqres

# Renumber residues so that the first residue is number 1
nums <- as.numeric(pdb$atom[,"resno"])
pdb$atom[,"resno"] <- nums - (nums[1] - 1)

# Write out renumbered PDB file
write.pdb(pdb=pdb, file="eg.pdb")