Package CedarBackup2 :: Module filesystem :: Class BackupFileList

Inheritance hierarchy: object --> list --> FilesystemList --> BackupFileList
List of files to be backed up.

A BackupFileList is a FilesystemList containing a list of files to be backed
up. It contains only files, not directories (soft links are treated like
files). On top of the generic functionality provided by FilesystemList, this
class adds functionality to keep a hash (checksum) for each file in the list,
a method to calculate the total size of the files in the list, and a way to
export the list into tar form.
Method Summary

  __init__(self): Initializes a list with no configured exclusions.
  addDir(self, path): Adds a directory to the list.
  totalSize(self): Returns the total size among all files in the list.
  generateSizeMap(self): Generates a mapping from file to file size in bytes.
  generateDigestMap(self, stripPrefix=None): Generates a mapping from file to file digest.
  generateFitted(self, capacity, algorithm='worst_fit'): Generates a list of items that fit in the indicated capacity.
  generateTarfile(self, path, mode='tar', ignore=False, flat=False): Creates a tar file containing the files in the list.
  removeUnchanged(self, digestMap, captureDigest=False): Removes unchanged entries from the list.
  _generateDigest(path): Generates an SHA digest for a given file on disk. (Static method)
Inherited from FilesystemList

  Adds the contents of a directory to the list (addDirContents).
  Adds a file to the list.
  Normalizes the list, ensuring that each entry is unique.
  Removes directory entries from the list.
  Removes file entries from the list.
  Removes from the list all entries that do not exist on disk.
  Removes soft link entries from the list.
  Removes from the list all entries matching a pattern.
  Verifies that all entries in the list exist on disk.
  Internal implementation of addDirContents.
  Property targets used to get and set the exclusion-related properties
  (the exclude directories, files and soft links flags, the exclude paths
  and patterns lists, and the ignore file).
Inherited from list

  Standard list operators (+, *, in, comparisons, indexing, slicing,
  iteration, len(), hash(), repr(), reversed()) and methods: append,
  count, extend, index, insert, pop, remove, reverse, sort.

Inherited from object

  __delattr__, __reduce__ and __reduce_ex__ (pickle helpers),
  __setattr__, __str__.
Property Summary

Inherited from FilesystemList

  excludeDirs: Boolean indicating whether directories should be excluded.
  excludeFiles: Boolean indicating whether files should be excluded.
  excludeLinks: Boolean indicating whether soft links should be excluded.
  excludePaths: List of absolute paths to be excluded.
  excludePatterns: List of regular expression patterns to be excluded.
  ignoreFile: Name of a file which will cause directory contents to be ignored.
Instance Method Details

__init__(self)

  Initializes a list with no configured exclusions.
addDir(self, path)

  Adds a directory to the list. Note that this class does not allow
  directories to be added by themselves (a backup list contains only files).
  However, since links to directories are technically files, we allow them to
  be added. This method is implemented in terms of the superclass method, with
  one additional validation: the superclass method is called only if the
  passed-in path is both a directory and a link. All of the superclass's
  existing validations and restrictions apply.
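The extra validation can be sketched as follows (a simplified, standalone sketch of the rule described above, not the library's actual implementation; the function name is illustrative):

```python
import os

def add_dir_if_link(entries, path):
    """Append path to entries only when it is a soft link pointing at a
    directory; plain directories are rejected, since a backup list is
    meant to contain only files (and links, which count as files)."""
    # os.path.isdir() follows the link, os.path.islink() checks the link itself
    if os.path.isdir(path) and os.path.islink(path):
        entries.append(path)
        return True
    return False
```

A plain directory fails the islink() check and is rejected; a symlink to a directory passes both checks and is kept.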
totalSize(self)

  Returns the total size among all files in the list. Only files are counted.
  Soft links that point at files are ignored. Entries which do not exist on
  disk are ignored.
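The counting rule reads roughly like this (a minimal standalone sketch of the behaviour described above, not the library's code):

```python
import os

def total_size(entries):
    """Sum sizes (in bytes) of the regular files in entries; soft links
    and entries that no longer exist on disk are skipped."""
    total = 0
    for path in entries:
        # isfile() rejects missing paths; islink() filters soft links
        if os.path.isfile(path) and not os.path.islink(path):
            total += os.path.getsize(path)
    return total
```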
generateSizeMap(self)

  Generates a mapping from file to file size in bytes. The mapping does
  include soft links, which are listed with size zero. Entries which do not
  exist on disk are ignored.
generateDigestMap(self, stripPrefix=None)

  Generates a mapping from file to file digest. Currently, the digest is an
  SHA hash, which should be pretty secure. In the future, this might be a
  different kind of hash, but we guarantee that the type of the hash will not
  change unless the library major version number is bumped. Entries which do
  not exist on disk are ignored. Soft links are also ignored: we would
  otherwise end up generating a digest for the file that the soft link points
  at, which doesn't make any sense. If stripPrefix is passed in, then that
  prefix will be stripped from each key when the map is generated. This can be
  useful in generating two "relative" digest maps to be compared to one
  another.
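The behaviour, including the stripPrefix handling, can be sketched like this (a standalone sketch, not the library's implementation; hashlib.sha1 stands in for the old sha module):

```python
import hashlib
import os

def generate_digest_map(entries, strip_prefix=None):
    """Map each existing, non-link file in entries to a hex digest.
    When strip_prefix is given it is removed from the front of each
    key, producing a "relative" map that can be compared against a
    map built from a different directory tree."""
    digests = {}
    for path in entries:
        # skip soft links and entries that no longer exist on disk
        if os.path.islink(path) or not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            digest = hashlib.sha1(f.read()).hexdigest()
        key = path
        if strip_prefix is not None and key.startswith(strip_prefix):
            key = key[len(strip_prefix):]
        digests[key] = digest
    return digests
```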
generateFitted(self, capacity, algorithm='worst_fit')

  Generates a list of items that fit in the indicated capacity. Sometimes,
  callers would like to include every item in a list, but are unable to
  because not all of the items fit in the space available. This method returns
  a copy of the list, containing only the items that fit in a given capacity.
  A copy is returned so that we don't lose any information if for some reason
  the fitted list is unsatisfactory. The fitting is done using the functions
  in the knapsack module. By default, the worst fit algorithm is used (per the
  signature's default), but you can also choose from first fit, best fit and
  alternate fit.
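For illustration, the simplest of those strategies, first fit, can be sketched over (name, size) pairs (a standalone sketch, not the knapsack module's actual code):

```python
def first_fit(items, capacity):
    """Greedy first-fit: walk the (name, size) pairs in order and keep
    each one that still fits in the remaining capacity.  A new list is
    returned; the input is left untouched."""
    kept, used = [], 0
    for name, size in items:
        if used + size <= capacity:
            kept.append((name, size))
            used += size
    return kept
```

Best fit, worst fit and alternate fit differ only in how they order the candidates before this greedy pass.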
generateTarfile(self, path, mode='tar', ignore=False, flat=False)

  Creates a tar file containing the files in the list. By default, this
  method will create uncompressed tar files. If you pass in mode 'targz' or
  'tarbz2', it will create a gzipped or bzipped tar file instead. The tar
  file will be created as a GNU tar archive, which enables extended file name
  lengths, etc. Since GNU tar is so prevalent, I've decided that the extra
  functionality outweighs the disadvantage of not being "standard". If you
  pass in flat=True, the files will be added at the root of the archive
  rather than under their full paths. By default, the whole method call fails
  if there are problems adding any of the files to the archive, resulting in
  an exception; under these circumstances, callers are advised that they
  might want to call removeInvalid() and try again. If you want to, you can
  pass in ignore=True, and problem files will be skipped rather than causing
  the call to fail.
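Similar behaviour can be sketched with the standard library's tarfile module (a simplified stand-in, not the library's implementation; the compression argument is illustrative):

```python
import tarfile

def create_tarfile(entries, archive_path, compression=""):
    """Write the listed files into a GNU tar archive.  compression may
    be "" (plain tar), "gz" or "bz2".  A missing file makes tar.add()
    raise, mirroring the default fail-on-error behaviour."""
    mode = "w:" + compression if compression else "w"
    with tarfile.open(archive_path, mode, format=tarfile.GNU_FORMAT) as tar:
        for path in entries:
            # recursive=False: add the named entry only, never descend
            tar.add(path, recursive=False)
```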
removeUnchanged(self, digestMap, captureDigest=False)

  Removes unchanged entries from the list. This method relies on a digest map
  as returned from generateDigestMap: each entry whose current digest matches
  the digest stored in the map is removed from the list. This method offers a
  convenient way for callers to filter unneeded entries from a list. The idea
  is that a caller will capture a digest map during one backup run and pass it
  in on the next run, so that only files which have actually changed remain in
  the list. If captureDigest is passed in as True, then a new digest map is
  captured as well. To preserve backwards compatibility, if captureDigest is
  False, then we'll just return a single value representing the number of
  entries removed. Otherwise, we'll return a tuple of (entries removed, digest
  map). The returned digest map will be in exactly the form returned by
  generateDigestMap.
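The filtering idea can be sketched as follows (a standalone sketch of the behaviour described above, not the library's code; files absent from the map are conservatively kept):

```python
import hashlib
import os

def remove_unchanged(entries, digest_map):
    """Remove from entries every file whose current digest matches the
    one recorded in digest_map; returns the number removed.  Files not
    present in the map (or gone from disk) are treated as changed and
    kept."""
    removed = 0
    for path in list(entries):  # iterate over a copy while mutating
        if path not in digest_map or not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            current = hashlib.sha1(f.read()).hexdigest()
        if current == digest_map[path]:
            entries.remove(path)
            removed += 1
    return removed
```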
Static Method Details
_generateDigest(path)   (static method)

  Generates an SHA digest for a given file on disk. The original code for
  this function used this simplistic implementation, which requires reading
  the entire file into memory at once in order to generate a digest value:

      sha.new(open(path).read()).hexdigest()

  Not surprisingly, this isn't an optimal solution. The "Simple file hashing"
  Python Cookbook recipe describes how to incrementally generate a hash value
  by reading in chunks of data rather than reading the file all at once. The
  recipe relies on the digest object's update() method. In my tests using a
  110 MB file on CD, the original implementation requires 111 seconds, while
  this implementation requires only 40-45 seconds, which is a pretty
  substantial speed-up. Practice shows that reading in around 4 kB (4096
  bytes) at a time yields the best performance. Smaller reads are quite a bit
  slower, and larger reads don't make much of a difference. The 4 kB number
  makes me a little suspicious, and I think it might be related to the size of
  a filesystem read at the hardware level. However, I've decided to just
  hardcode 4096 until I have evidence that shows it's worthwhile making the
  read size configurable.
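In modern Python the chunked approach looks roughly like this (a sketch of the recipe described above, not the library's exact code; hashlib.sha1 replaces the long-obsolete sha module):

```python
import hashlib

def generate_digest(path, chunk_size=4096):
    """Incrementally hash a file in chunk_size reads (4 kB by default,
    per the measurements above) rather than slurping it into memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            digest.update(chunk)
    return digest.hexdigest()
```

The result is identical to hashing the whole file at once; only the peak memory use changes.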
Generated by Epydoc 2.1 on Mon Dec 18 22:53:29 2006 (http://epydoc.sf.net)