dm {bio3d} | R Documentation |
Construct a distance matrix for a given protein structure.
dm(pdb, selection = "calpha", verbose = TRUE)
dm.xyz(xyz, grpby = NULL, scut = NULL, mask.lower = TRUE)
pdb |
a pdb structure object as returned by
read.pdb or a numeric vector of ‘xyz’ coordinates. |
selection |
a character string for selecting the pdb atoms to undergo comparison (see atom.select). |
verbose |
logical, if TRUE possible warnings are printed. |
xyz |
a numeric vector of Cartesian coordinates. |
grpby |
a vector in which runs of consecutive duplicated elements indicate the elements of xyz that should be considered as a group (e.g. atoms from a particular residue). |
scut |
a neighbour cutoff value: atoms, or groups, that are sequentially within this value of each other are excluded.
mask.lower |
logical, if TRUE the lower matrix elements (i.e. those below the diagonal) are returned as NA. |
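As a rough illustration of how grpby, scut and mask.lower interact, the following base R sketch builds a group-wise minimum-distance matrix by hand. It is not the dm.xyz implementation: the helper name group_min_dist and the toy coordinates are hypothetical, groups are identified by their unique labels rather than by runs of consecutive duplicates, and the exact treatment of the scut boundary is an assumption.

```r
# Hypothetical helper: group-wise minimum distance matrix in base R.
# Illustrates the idea behind grpby/scut/mask.lower; not bio3d's own code.
group_min_dist <- function(xyz, grpby, scut = NULL, mask.lower = TRUE) {
  # xyz is a flat vector: x1, y1, z1, x2, y2, z2, ...
  coords <- matrix(xyz, ncol = 3, byrow = TRUE)
  stopifnot(nrow(coords) == length(grpby))

  d <- as.matrix(dist(coords))   # all atom-atom distances
  groups <- unique(grpby)        # one row/column per group label
  n <- length(groups)
  out <- matrix(NA_real_, n, n)

  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # smallest distance between any atom of group i and any atom of group j
      out[i, j] <- min(d[grpby == groups[i], grpby == groups[j]])
    }
  }

  if (!is.null(scut)) {
    # Assumption: pairs fewer than 'scut' positions apart are excluded
    sep <- abs(outer(seq_len(n), seq_len(n), "-"))
    out[sep < scut] <- NA
  }
  if (mask.lower) out[lower.tri(out)] <- NA
  out
}

# Toy input: 4 atoms along the x axis, assigned to 3 groups
xyz   <- c(0, 0, 0,   1, 0, 0,   4, 0, 0,   8, 0, 0)
grpby <- c(1, 1, 2, 3)
m <- group_min_dist(xyz, grpby, scut = 1)
```

With scut = 1 only the diagonal (each group compared with itself) is excluded; larger values would also blank out near-diagonal bands.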
Distance matrices, also called distance plots or distance maps, are an established means of describing and comparing protein conformations (e.g. Phillips, 1970; Holm, 1993).
A distance matrix is a 2D representation of 3D structure that is independent of the coordinate reference frame and, ignoring chirality, contains enough information to reconstruct the 3D Cartesian coordinates (e.g. Havel, 1983).
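The frame independence and the chirality caveat can be checked numerically. The sketch below uses only base R (dist rather than dm) and toy coordinates chosen for illustration: it rotates and translates four points, then mirrors them, and compares the resulting distance matrices with the original.

```r
# Four points as a 4 x 3 coordinate matrix (toy data)
coords <- rbind(c(0,   0, 0),
                c(1.5, 0, 0),
                c(1.5, 2, 0),
                c(0,   2, 3))

# Rotate by 40 degrees about the z axis, then translate
theta <- 40 * pi / 180
rot <- rbind(c(cos(theta), -sin(theta), 0),
             c(sin(theta),  cos(theta), 0),
             c(0,           0,          1))
shift <- matrix(c(10, -5, 2), nrow = 4, ncol = 3, byrow = TRUE)
moved <- coords %*% t(rot) + shift

# Mirror image: reflect through the xy plane
mirrored <- coords %*% diag(c(1, 1, -1))

d_orig   <- as.matrix(dist(coords))
d_moved  <- as.matrix(dist(moved))
d_mirror <- as.matrix(dist(mirrored))

# The distances survive a change of reference frame ...
same_after_move <- isTRUE(all.equal(d_orig, d_moved))
# ... and also a reflection, so they cannot distinguish mirror images
same_after_mirror <- isTRUE(all.equal(d_orig, d_mirror))
```

Both comparisons hold, which is why a distance matrix determines the coordinates only up to rigid motion and reflection.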
Returns a numeric matrix of class "dmat", with all N by N distances, where N is the number of selected atoms.
The input selection can be any character string or pattern interpretable by the function atom.select. For example, shortcuts "calpha", "back", "all" and selection strings of the form /segment/chain/residue number/residue name/element number/element name/; see atom.select for details.
If a coordinate vector is provided as input (rather than a pdb object) the selection option is redundant and the input vector should be pruned instead to include only desired positions.
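Pruning a coordinate vector amounts to keeping three consecutive entries per retained atom. A minimal base R sketch, assuming the usual flat layout x1, y1, z1, x2, y2, z2, ... and a hypothetical choice of atoms to keep:

```r
# Toy coordinate vector for 4 atoms: x1, y1, z1, x2, y2, z2, ...
xyz <- c(0, 0, 0,   1, 0, 0,   1, 1, 0,   1, 1, 1)

# Hypothetical choice: keep atoms 1, 3 and 4
keep <- c(1, 3, 4)

# Atom i occupies positions 3i-2, 3i-1 and 3i of the vector
xyz_inds <- as.vector(rbind(3 * keep - 2, 3 * keep - 1, 3 * keep))
xyz_pruned <- xyz[xyz_inds]

# Distances among the kept atoms, computed in base R for illustration
d <- as.matrix(dist(matrix(xyz_pruned, ncol = 3, byrow = TRUE)))
```

With bio3d loaded, the pruned vector would then be the input to dm.xyz in place of a selection string.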
Barry Grant
Grant, B.J. et al. (2006) Bioinformatics 22, 2695–2696.
Phillips (1970) Biochem. Soc. Symp. 31, 11–28.
Holm (1993) J. Mol. Biol. 233, 123–138.
Havel (1983) Bull. Math. Biol. 45, 665–720.
plot.dmat, read.pdb, atom.select
##--- Distance Matrix Plot
pdb <- read.pdb( system.file("examples/d1bg2__.ent", package = "bio3d") )
k <- dm(pdb, selection="calpha")
filled.contour(k, nlevels = 4)

##--- DDM: Difference Distance Matrix
# Read aligned PDBs
aln <- read.fasta(system.file("examples/kif1a.fa", package="bio3d"))
pdb.path <- paste(system.file(package="bio3d"), "/examples/", sep="")
m <- read.fasta.pdb(aln, pdb.path = pdb.path, pdbext = ".ent")

# Get distance matrix
a <- dm(m$xyz[2,])
b <- dm(m$xyz[3,])

# Calculate DDM
c <- a - b

# Plot DDM
plot(c, key=FALSE, grid=FALSE)
plot(c, axis.tick.space=10,
     resnum.1=m$resno[1,],
     resnum.2=m$resno[2,],
     grid.col="gray",
     xlab="Residue No. (1i6i)", ylab="Residue No. (1i5s)")

## Not run:
##-- Residue-wise distance matrix based on the
##   minimal distance between all available atoms
l <- dm.xyz(pdb$xyz, grpby=pdb$atom[,"resno"], scut=3)

##--- Extract all-atom contacts
pdb <- read.pdb( system.file("examples/d1bg2__.ent", package = "bio3d") )
l <- dm(pdb, selection="all")
l[upper.tri(l)] <- NA  # set the upper triangle to NA

# Find residues with contacting atoms (<=5 Angstrom)
inds.stru <- which(l<=5, arr.ind=TRUE)

# Find non-consecutive residues (>5 residues sequence separation)
seq.sep <- abs(as.numeric(pdb$atom[inds.stru[,1],"resno"]) -
               as.numeric(pdb$atom[inds.stru[,2],"resno"]))
inds.seq <- which(seq.sep>5)  # separated by > 5 residues

# All-atom contacts (indices)
inds <- inds.stru[inds.seq,]

# All-atom contacts (now in terms of residue numbers)
tmp <- unique( paste(pdb$atom[inds[,1],"resno"],
                     pdb$atom[inds[,2],"resno"], sep="#") )
contacts <- matrix(as.numeric(unlist(strsplit(tmp, split="#"))),
                   ncol=2, byrow=TRUE )

# Plot residue contacts
freq <- table(contacts)
xaxis <- as.vector(bounds(as.numeric(names(freq)))[,c(1,2)])
x11()
plot(freq, type="h", xlab="Residue Number", xaxt="n", ylab="Number of Contacts" )
axis(1, at=xaxis, labels=xaxis)
## End(Not run)