Module CedarBackup2.action

Provides implementation of various backup-related actions.

The command-line interface is mostly implemented in terms of the action-related functionality in this module. There is one process function for each high-level backup action (collect, stage, store, purge, rebuild). In turn, each of the action functions is mostly implemented in terms of lower-level functionality in the other Cedar Backup modules. This is mostly "glue" code.

All of the public action functions in this file implement the Cedar Backup Extension Architecture Interface, i.e. the same interface that extensions will implement. There's no particular reason it has to be this way, except that it seems more straightforward to do it this way.

The code is organized into three rough sections: general utility code, attribute "getter" functions, and public functions. Attribute getter functions encode the rules for getting the correct value for various attributes, for instance, what to do when the device type is unset or when a collect dir doesn't have an ignore file set. They are grouped roughly by the action that they are associated with. Other utility functions related to a single public function are grouped with that function (below it, typically).

Author: Kenneth J. Pronovici <pronovic@ieee.org>

Function Summary
  executeCollect(configPath, options, config)
Executes the collect backup action.
  executeStage(configPath, options, config)
Executes the stage backup action.
  executeStore(configPath, options, config)
Executes the store backup action.
  executePurge(configPath, options, config)
Executes the purge backup action.
  executeRebuild(configPath, options, config)
Executes the rebuild backup action.
  executeValidate(configPath, options, config)
Executes the validate action.
  _checkDir(path, writable, logfunc, prefix)
Checks that the indicated directory is OK.
  _collectDirectory(config, absolutePath, tarfilePath, collectMode, archiveMode, ignoreFile, resetDigest, digestPath, excludePaths, excludePatterns)
Collects a configured collect directory.
  _collectFile(config, absolutePath, tarfilePath, collectMode, archiveMode, resetDigest, digestPath)
Collects a configured collect file.
  _consistencyCheck(config, stagingDirs)
Runs a consistency check against media in the backup device.
  _createStagingDirs(config, dailyDir, peers)
Creates staging directories as required.
  _deriveDayOfWeek(dayName)
Converts English day name to numeric day of week as from time.localtime.
  _executeBackup(config, backupList, absolutePath, tarfilePath, collectMode, archiveMode, resetDigest, digestPath)
Execute the backup process for the indicated backup list.
  _findCorrectDailyDir(options, config)
Finds the correct daily staging directory to be written to disk.
  _findRebuildDirs(config)
Finds the set of directories to be included in a disc rebuild.
  _getArchiveMode(config, item)
Gets the archive mode that should be used for a collect directory or file.
  _getCollectMode(config, item)
Gets the collect mode that should be used for a collect directory or file.
  _getDailyDir(config)
Gets the daily staging directory.
  _getDeviceType(config)
Gets the device type that should be used for storing.
  _getDigestPath(config, item)
Gets the digest path associated with a collect directory or file.
  _getExclusions(config, collectDir)
Gets exclusions (file and patterns) associated with a collect directory.
  _getIgnoreFile(config, item)
Gets the ignore file that should be used for a collect directory or file.
  _getLocalPeers(config)
Return a list of LocalPeer objects based on configuration.
  _getMediaType(config)
Gets the media type that should be used for storing.
  _getRcpCommand(config, remotePeer)
Gets the RCP command associated with a remote peer.
  _getRemotePeers(config)
Return a list of RemotePeer objects based on configuration.
  _getRemoteUser(config, remotePeer)
Gets the remote user associated with a remote peer.
  _getTarfilePath(config, item, archiveMode)
Gets the tarfile path (including correct extension) associated with a collect directory.
  _getWriter(config)
Gets a writer object based on current configuration.
  _loadDigest(digestPath)
Loads the indicated digest path from disk into a dictionary.
  _validateCollect(config, logfunc)
Execute runtime validations on collect configuration.
  _validateExtensions(config, logfunc)
Execute runtime validations on extensions configuration.
  _validateOptions(config, logfunc)
Execute runtime validations on options configuration.
  _validatePurge(config, logfunc)
Execute runtime validations on purge configuration.
  _validateReference(config, logfunc)
Execute runtime validations on reference configuration.
  _validateStage(config, logfunc)
Execute runtime validations on stage configuration.
  _validateStore(config, logfunc)
Execute runtime validations on store configuration.
  _writeCollectIndicator(config)
Writes a collect indicator file into a target collect directory.
  _writeDigest(config, digest, digestPath)
Writes the digest dictionary to the indicated digest path on disk.
  _writeImage(config, entireDisc, stagingDirs)
Builds and writes an ISO image containing the indicated stage directories.
  _writeStageIndicator(config, dailyDir)
Writes a stage indicator file into the daily staging directory.
  _writeStoreIndicator(config, stagingDirs)
Writes a store indicator file into staging directories.
  buildNormalizedPath(absPath)
Returns a "normalized" path based on an absolute path.
  isStartOfWeek(startingDay)
Indicates whether "today" is the backup starting day per configuration.

Variable Summary
str COLLECT_INDICATOR = 'cback.collect'
str DIGEST_EXTENSION = 'sha'
str DIR_TIME_FORMAT = '%Y/%m/%d'
Logger logger = <logging.Logger instance at 0x3b00392c>
str PREFIX_TIME_FORMAT = '%Y/%m/%d'
int SECONDS_PER_DAY = 86400                                                                 
str STAGE_INDICATOR = 'cback.stage'
str STORE_INDICATOR = 'cback.store'

Function Details

executeCollect(configPath, options, config)

Executes the collect backup action.
Parameters:
configPath - Path to configuration file on disk.
           (type=String representing a path on disk.)
options - Program command-line options.
           (type=Options object.)
config - Program configuration.
           (type=Config object.)
Raises:
ValueError - Under many generic error conditions
TarError - If there is a problem creating a tar file

Note: When the collect action is complete, we will write a collect indicator to the collect directory, so it's obvious that the collect action has completed. The stage process uses this indicator to decide whether a peer is ready to be staged.
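
The indicator mechanism described in the note can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper names write_collect_indicator and is_collect_complete are hypothetical, and it assumes the indicator is an empty file whose existence alone carries the signal.

```python
import os

COLLECT_INDICATOR = "cback.collect"  # value listed in the Variable Summary


def write_collect_indicator(collect_dir):
    """Create the indicator file in collect_dir and return its path."""
    path = os.path.join(collect_dir, COLLECT_INDICATOR)
    # Assumption: only the file's existence matters, so it is left empty.
    with open(path, "w"):
        pass
    return path


def is_collect_complete(collect_dir):
    """Return True if the collect indicator exists in collect_dir."""
    return os.path.isfile(os.path.join(collect_dir, COLLECT_INDICATOR))
```

A stage-like process could call is_collect_complete on a peer's collect directory before copying anything from it.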

executeStage(configPath, options, config)

Executes the stage backup action.
Parameters:
configPath - Path to configuration file on disk.
           (type=String representing a path on disk.)
options - Program command-line options.
           (type=Options object.)
config - Program configuration.
           (type=Config object.)
Raises:
ValueError - Under many generic error conditions
IOError - If there are problems reading or writing files.

Notes:

  • The daily directory is derived once and then we stick with it, just in case a backup happens to span midnight.
  • As portions of the stage action are completed, we will write various indicator files so that it's obvious what actions have been completed. Each peer gets a stage indicator in its collect directory, and then the master gets a stage indicator in its daily staging directory. The store process uses the master's stage indicator to decide whether a directory is ready to be stored. Currently, nothing uses the indicator at each peer, and it exists for reference only.

executeStore(configPath, options, config)

Executes the store backup action.

Note that the rebuild action and the store action are very similar. The main difference is that while store only stores a single day's staging directory, the rebuild action operates on multiple staging directories.
Parameters:
configPath - Path to configuration file on disk.
           (type=String representing a path on disk.)
options - Program command-line options.
           (type=Options object.)
config - Program configuration.
           (type=Config object.)
Raises:
ValueError - Under many generic error conditions
IOError - If there are problems reading or writing files.

Note: When the store action is complete, we will write a store indicator to the daily staging directory we used, so it's obvious that the store action has completed.

executePurge(configPath, options, config)

Executes the purge backup action.

For each configured directory, we create a purge item list, remove from the list anything that's younger than the configured retain days value, and then purge from the filesystem what's left.
Parameters:
configPath - Path to configuration file on disk.
           (type=String representing a path on disk.)
options - Program command-line options.
           (type=Options object.)
config - Program configuration.
           (type=Config object.)
Raises:
ValueError - Under many generic error conditions
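
The retain-days filter described above might look roughly like this. It is a sketch under assumptions: the function name select_purgeable is hypothetical, age is measured from a modification time supplied by the caller, and an item exactly retain_days old is treated as purgeable.

```python
import time

SECONDS_PER_DAY = 86400  # value listed in the Variable Summary


def select_purgeable(entries, retain_days, now=None):
    """Return the paths whose age is at least retain_days.

    entries is an iterable of (path, mtime) pairs, with mtime in seconds
    since the epoch.
    """
    now = time.time() if now is None else now
    max_age = retain_days * SECONDS_PER_DAY
    # Drop anything younger than the retention window; what is left
    # is the list that would be purged from the filesystem.
    return [path for path, mtime in entries if now - mtime >= max_age]
```

Passing now explicitly keeps the function deterministic, which makes the boundary behavior easy to check.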

executeRebuild(configPath, options, config)

Executes the rebuild backup action.

This function exists mainly to recreate a disc that has been "trashed" due to media or hardware problems. Note that the "stage complete" indicator isn't checked for this action.

Note that the rebuild action and the store action are very similar. The main difference is that while store only stores a single day's staging directory, the rebuild action operates on multiple staging directories.
Parameters:
configPath - Path to configuration file on disk.
           (type=String representing a path on disk.)
options - Program command-line options.
           (type=Options object.)
config - Program configuration.
           (type=Config object.)
Raises:
ValueError - Under many generic error conditions
IOError - If there are problems reading or writing files.

executeValidate(configPath, options, config)

Executes the validate action.

This action validates each of the individual sections in the config file. This is a "runtime" validation. The config file itself is already valid in a structural sense, so what we check here is that we can actually use the configuration without any problems.

There's a separate validation function for each of the configuration sections. Each validation function returns a true/false indication for whether configuration was valid, and then logs any configuration problems it finds. This way, one pass over configuration indicates most or all of the obvious problems, rather than finding just one problem at a time.

Any reported problems will be logged at the ERROR level normally, or at the INFO level if the quiet flag is enabled.
Parameters:
configPath - Path to configuration file on disk.
           (type=String representing a path on disk.)
options - Program command-line options.
           (type=Options object.)
config - Program configuration.
           (type=Config object.)
Raises:
ValueError - If some configuration value is invalid.
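
The "one pass reports everything" behavior can be sketched as below. The names pick_logfunc and run_validations, and the logger name, are hypothetical; the sketch only assumes that each validator takes (config, logfunc) and returns a boolean, as the validation functions in this module are documented to do.

```python
import logging

logger = logging.getLogger("example.validate")  # hypothetical logger name


def pick_logfunc(quiet):
    """Log problems at INFO when quiet, otherwise at ERROR."""
    return logger.info if quiet else logger.error


def run_validations(config, validators, logfunc):
    """Run every validator and return True only if all of them pass."""
    # Build the full list first so that all() cannot short-circuit and
    # skip later validators after an early failure.
    results = [validate(config, logfunc) for validate in validators]
    return all(results)
```

Collecting the results before combining them is what lets a single run surface problems from every section rather than stopping at the first one.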

_checkDir(path, writable, logfunc, prefix)

Checks that the indicated directory is OK.

The path must exist, must be a directory, must be readable and executable, and must optionally be writable.
Parameters:
path - Path to check.
writable - Check that path is writable.
logfunc - Function to use for logging errors.
prefix - Prefix to use on logged errors.
Returns:
True if the directory is OK, False otherwise.
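
A directory check along these lines could be written with os.path and os.access. This is an illustrative sketch, not the module's implementation: the function name check_dir and the message wording are assumptions.

```python
import os


def check_dir(path, writable, logfunc, prefix):
    """Return True if path is a usable directory, logging the first problem."""
    if not os.path.exists(path):
        logfunc("%s [%s] does not exist." % (prefix, path))
        return False
    if not os.path.isdir(path):
        logfunc("%s [%s] is not a directory." % (prefix, path))
        return False
    if not os.access(path, os.R_OK | os.X_OK):
        logfunc("%s [%s] is not readable and executable." % (prefix, path))
        return False
    # The writable check is optional, controlled by the caller.
    if writable and not os.access(path, os.W_OK):
        logfunc("%s [%s] is not writable." % (prefix, path))
        return False
    return True
```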

_collectDirectory(config, absolutePath, tarfilePath, collectMode, archiveMode, ignoreFile, resetDigest, digestPath, excludePaths, excludePatterns)

Collects a configured collect directory.

The indicated collect directory is collected into the indicated tarfile. For directories that are collected incrementally, we'll use the indicated digest path and pay attention to the reset digest flag (basically, when the reset digest flag is set, any existing digest is ignored, but a new digest is always written).

The caller must decide what the collect and archive modes are, since they can be on both the collect configuration and the collect directory itself.
Parameters:
config - Config object.
absolutePath - Absolute path of directory to collect.
tarfilePath - Path to tarfile that should be created.
collectMode - Collect mode to use.
archiveMode - Archive mode to use.
ignoreFile - Ignore file to use.
resetDigest - Reset digest flag.
digestPath - Path to digest file on disk, if needed.
excludePaths - List of absolute paths to exclude.
excludePatterns - List of patterns to exclude.

_collectFile(config, absolutePath, tarfilePath, collectMode, archiveMode, resetDigest, digestPath)

Collects a configured collect file.

The indicated collect file is collected into the indicated tarfile. For files that are collected incrementally, we'll use the indicated digest path and pay attention to the reset digest flag (basically, when the reset digest flag is set, any existing digest is ignored, but a new digest is always written).

The caller must decide what the collect and archive modes are, since they can be on both the collect configuration and the collect file itself.
Parameters:
config - Config object.
absolutePath - Absolute path of file to collect.
tarfilePath - Path to tarfile that should be created.
collectMode - Collect mode to use.
archiveMode - Archive mode to use.
resetDigest - Reset digest flag.
digestPath - Path to digest file on disk, if needed.

_consistencyCheck(config, stagingDirs)

Runs a consistency check against media in the backup device.

It seems that sometimes, it's possible to create a corrupted multisession disc (i.e. one that cannot be read) although no errors were encountered while writing the disc. This consistency check makes sure that the data read from disc matches the data that was used to create the disc.

The function mounts the device at a temporary mount point in the working directory, and then compares the indicated staging directories in the staging directory and on the media. The comparison is done via functionality in filesystem.py.

If no exceptions are thrown, there were no problems with the consistency check. A positive confirmation of "no problems" is also written to the log with info priority.
Parameters:
config - Config object.
stagingDirs - Dictionary mapping directory path to date suffix.
Raises:
ValueError - If the two directories are not equivalent.
IOError - If there is a problem working with the media.

Warning: The implementation of this function is very UNIX-specific and is probably Linux-specific as well.

_createStagingDirs(config, dailyDir, peers)

Creates staging directories as required.

The main staging directory is the passed-in daily directory, something like staging/2002/05/23. Then, individual peers get their own directories, e.g. staging/2002/05/23/host.
Parameters:
config - Config object.
dailyDir - Daily staging directory.
peers - List of all configured peers.
Returns:
Dictionary mapping peer name to staging directory.
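
The layout described above can be sketched as follows. The name create_staging_dirs is hypothetical, the sketch takes plain peer names rather than peer objects, and it ignores any ownership or permission handling the real function may perform.

```python
import os


def create_staging_dirs(daily_dir, peer_names):
    """Create daily_dir/<peer> for each peer and return {peer: directory}."""
    mapping = {}
    for name in peer_names:
        path = os.path.join(daily_dir, name)
        # exist_ok lets a repeated run reuse directories created earlier.
        os.makedirs(path, exist_ok=True)
        mapping[name] = path
    return mapping
```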

_deriveDayOfWeek(dayName)

Converts English day name to numeric day of week as from time.localtime.

For instance, the day monday would be converted to the number 0.
Parameters:
dayName - Day of week to convert
           (type=string, i.e. "monday", "tuesday", etc.)
Returns:
Integer, where Monday is 0 and Sunday is 6.
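
One way to express this conversion is a simple ordered list lookup. The name derive_day_of_week and the DAY_NAMES list are illustrative; how the real function treats unknown names is not documented here, so the ValueError raised by this sketch is an assumption.

```python
import datetime

DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday"]


def derive_day_of_week(day_name):
    """Return 0 for "monday" through 6 for "sunday"."""
    # list.index raises ValueError for a name that is not in the list.
    return DAY_NAMES.index(day_name)


# The numbering agrees with the standard library: 4 September 2006 was a Monday.
monday_index = datetime.date(2006, 9, 4).weekday()
```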

_executeBackup(config, backupList, absolutePath, tarfilePath, collectMode, archiveMode, resetDigest, digestPath)

Execute the backup process for the indicated backup list.

This function exists mainly to consolidate functionality between the _collectFile and _collectDirectory functions. Those functions build the backup list; this function causes the backup to execute properly and also manages usage of the digest file on disk as explained in their comments.

For collect files, the digest file will always just contain the single file that is being backed up. This might be a little wasteful in terms of the number of files that we keep around, but it's consistent and easy to understand.
Parameters:
config - Config object.
backupList - List to execute backup for
absolutePath - Absolute path of directory or file to collect.
tarfilePath - Path to tarfile that should be created.
collectMode - Collect mode to use.
archiveMode - Archive mode to use.
resetDigest - Reset digest flag.
digestPath - Path to digest file on disk, if needed.

_findCorrectDailyDir(options, config)

Finds the correct daily staging directory to be written to disk.

In Cedar Backup v1.0, we assumed that the correct staging directory matched the current date. However, that has problems. In particular, it breaks down if collect is on one side of midnight and stage is on the other, or if certain processes span midnight.

For v2.0, I'm trying to be smarter. I'll first check the current day. If that directory is found, it's good enough. If it's not found, I'll look for a valid directory from the day before or day after which has not yet been staged, according to the stage indicator file. The first one I find, I'll use. If I use a directory other than for the current day and config.store.warnMidnite is set, a warning will be put in the log.

There is one exception to this rule. If the options.full flag is set, then the special "span midnite" logic will be disabled and any existing store indicator will be ignored. I did this because I think that most users who run cback --full store twice in a row expect the command to generate two identical discs. With the other rule in place, running that command twice in a row could result in an error ("no unstored directory exists") or could even cause a completely unexpected directory to be written to disc (if some previous day's contents had not yet been written).
Parameters:
options - Options object.
config - Config object.
Returns:
Correct staging dir, as a dict mapping directory to date suffix.
Raises:
IOError - If the staging directory cannot be found.

Note: This code is probably longer and more verbose than it needs to be, but at least it's straightforward.

_findRebuildDirs(config)

Finds the set of directories to be included in a disc rebuild.

The rebuild action is supposed to recreate "last week's" disc. This won't always be possible if some of the staging directories are missing. However, the general procedure is to look back into the past no further than the previous "starting day of week", and then work forward from there trying to find all of the staging directories between then and now that still exist and have a stage indicator.
Parameters:
config - Config object.
Returns:
Correct staging dir, as a dict mapping directory to date suffix.
Raises:
IOError - If we do not find at least one staging directory.
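
The look-back procedure might be sketched as below. This is one reading of the description, not the actual implementation: the name find_rebuild_dirs is hypothetical, the starting day is passed as a weekday number (Monday is 0), and when today is itself the starting day the sketch looks only at today.

```python
import datetime
import os

DIR_TIME_FORMAT = "%Y/%m/%d"     # value listed in the Variable Summary
STAGE_INDICATOR = "cback.stage"  # value listed in the Variable Summary


def find_rebuild_dirs(staging_root, start_weekday, today):
    """Return {staging directory: date suffix} from the starting day to today."""
    # Days since the most recent starting day (0 if today is that day).
    days_back = (today.weekday() - start_weekday) % 7
    found = {}
    for offset in range(days_back, -1, -1):  # work forward toward today
        day = today - datetime.timedelta(days=offset)
        suffix = day.strftime(DIR_TIME_FORMAT)
        path = os.path.join(staging_root, *suffix.split("/"))
        # Only directories that still exist and were staged are included.
        if os.path.isfile(os.path.join(path, STAGE_INDICATOR)):
            found[path] = suffix
    if not found:
        raise IOError("Unable to find any staging directories for rebuild.")
    return found
```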

_getArchiveMode(config, item)

Gets the archive mode that should be used for a collect directory or file. If possible, use the one on the file or directory, otherwise take from collect section.
Parameters:
config - Config object.
item - CollectFile or CollectDir object
Returns:
Archive mode to use.
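
The "item first, then collect section" rule is the same pattern used by several getters in this module. A minimal sketch, assuming hypothetical attribute names (item.archiveMode and config.collect.archiveMode) and using SimpleNamespace objects as stand-ins for the real configuration classes:

```python
from types import SimpleNamespace


def get_archive_mode(config, item):
    """Prefer the item's own archive mode; fall back to the collect section."""
    if item.archiveMode is not None:
        return item.archiveMode
    return config.collect.archiveMode


# Stand-in objects for illustration only; the values are made up.
config = SimpleNamespace(collect=SimpleNamespace(archiveMode="targz"))
item_with_mode = SimpleNamespace(archiveMode="tarbz2")
item_without_mode = SimpleNamespace(archiveMode=None)
```

With these stand-ins, item_with_mode keeps its own setting while item_without_mode inherits the collect-level value.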

_getCollectMode(config, item)

Gets the collect mode that should be used for a collect directory or file. If possible, use the one on the file or directory, otherwise take from collect section.
Parameters:
config - Config object.
item - CollectFile or CollectDir object
Returns:
Collect mode to use.

_getDailyDir(config)

Gets the daily staging directory.

This is just a directory in the form staging/YYYY/MM/DD, e.g. staging/2000/10/07, except it will be an absolute path based on config.stage.targetDir.
Parameters:
config - Config object
Returns:
Path of daily staging directory.
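
Building such a path from DIR_TIME_FORMAT can be sketched like this. The name get_daily_dir is hypothetical, and the target directory is passed in directly rather than read from a Config object.

```python
import os
import time

DIR_TIME_FORMAT = "%Y/%m/%d"  # value listed in the Variable Summary


def get_daily_dir(target_dir, when=None):
    """Return target_dir/YYYY/MM/DD for the given time (default: now)."""
    when = time.localtime() if when is None else when
    return os.path.join(target_dir, time.strftime(DIR_TIME_FORMAT, when))
```

Accepting an explicit time value makes it possible to derive the directory once and reuse it, which matters when a backup spans midnight.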

_getDeviceType(config)

Gets the device type that should be used for storing.

Use the configured device type if not None, otherwise use config.DEFAULT_DEVICE_TYPE.
Parameters:
config - Config object.
Returns:
Device type to be used.

_getDigestPath(config, item)

Gets the digest path associated with a collect directory or file.
Parameters:
config - Config object.
item - CollectFile or CollectDir object
Returns:
Absolute path to the digest associated with the collect directory or file.

_getExclusions(config, collectDir)

Gets exclusions (file and patterns) associated with a collect directory.

The returned files value is a list of absolute paths to be excluded from the backup for a given directory. It is derived from the collect configuration absolute exclude paths and the collect directory's absolute and relative exclude paths.

The returned patterns value is a list of patterns to be excluded from the backup for a given directory. It is derived from the list of patterns from the collect configuration and from the collect directory itself.
Parameters:
config - Config object.
collectDir - Collect directory object.
Returns:
Tuple (files, patterns) indicating what to exclude.
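
The way the two lists are combined can be sketched as follows. The name get_exclusions and its flattened parameter list are hypothetical, and the sketch assumes relative exclude paths are resolved against the collect directory's own absolute path.

```python
import os


def get_exclusions(collect_abs_excludes, collect_patterns,
                   dir_path, dir_abs_excludes, dir_rel_excludes, dir_patterns):
    """Return (files, patterns) to exclude for one collect directory."""
    files = list(collect_abs_excludes) + list(dir_abs_excludes)
    # Assumption: relative paths are relative to the collect directory.
    files += [os.path.join(dir_path, rel) for rel in dir_rel_excludes]
    patterns = list(collect_patterns) + list(dir_patterns)
    return files, patterns
```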

_getIgnoreFile(config, item)

Gets the ignore file that should be used for a collect directory or file. If possible, use the one on the file or directory, otherwise take from collect section.
Parameters:
config - Config object.
item - CollectFile or CollectDir object
Returns:
Ignore file to use.

_getLocalPeers(config)

Return a list of LocalPeer objects based on configuration.
Parameters:
config - Config object.
Returns:
List of LocalPeer objects.

_getMediaType(config)

Gets the media type that should be used for storing.

Use the configured media type if not None, otherwise use DEFAULT_MEDIA_TYPE.

Once we figure out what configuration value to use, we return a media type value that is valid in writer.py, one of MEDIA_CDR_74, MEDIA_CDRW_74, MEDIA_CDR_80 or MEDIA_CDRW_80.
Parameters:
config - Config object.
Returns:
Media type to be used as a writer media type value.
Raises:
ValueError - If the media type is not valid.

_getRcpCommand(config, remotePeer)

Gets the RCP command associated with a remote peer. Use peer's if possible, otherwise take from options section.
Parameters:
config - Config object.
remotePeer - Configuration-style remote peer object.
Returns:
RCP command associated with remote peer.

_getRemotePeers(config)

Return a list of RemotePeer objects based on configuration.
Parameters:
config - Config object.
Returns:
List of RemotePeer objects.

_getRemoteUser(config, remotePeer)

Gets the remote user associated with a remote peer. Use peer's if possible, otherwise take from options section.
Parameters:
config - Config object.
remotePeer - Configuration-style remote peer object.
Returns:
Name of remote user associated with remote peer.

_getTarfilePath(config, item, archiveMode)

Gets the tarfile path (including correct extension) associated with a collect directory.
Parameters:
config - Config object.
item - CollectFile or CollectDir object
archiveMode - Archive mode to use for this tarfile.
Returns:
Absolute path to the tarfile associated with the collect directory.

_getWriter(config)

Gets a writer object based on current configuration.

This function creates and returns a writer based on configuration. This is done to abstract action methods from knowing what kind of writer is in use. Since all writers implement the same interface, there's no need for actions to care which one they're working with.

Right now, only the cdwriter device type is allowed, which results in a CdWriter object. An exception will be raised if any other device type is used.

This function also checks to make sure that the device isn't mounted before creating a writer object for it. Experience shows that sometimes if the device is mounted, we have problems with the backup. We may as well do the check here first, before instantiating the writer.
Parameters:
config - Config object.
Returns:
Writer that can be used to write a directory to some media.
Raises:
ValueError - If there is a problem getting the writer.
IOError - If there is a problem creating the writer object.

_loadDigest(digestPath)

Loads the indicated digest path from disk into a dictionary.

If we can't load the digest successfully (either because it doesn't exist or for some other reason), then an empty dictionary will be returned - but the condition will be logged.
Parameters:
digestPath - Path to the digest file on disk.
Returns:
Dictionary representing contents of digest path.

_validateCollect(config, logfunc)

Execute runtime validations on collect configuration.

The following validations are enforced:
  • The target directory must exist and must be writable
  • Each of the individual collect directories must exist and must be readable
Parameters:
config - Program configuration.
logfunc - Function to use for logging errors
Returns:
True if configuration is valid, False otherwise.

_validateExtensions(config, logfunc)

Execute runtime validations on extensions configuration.

The following validations are enforced:
  • Each indicated extension function must exist.
Parameters:
config - Program configuration.
logfunc - Function to use for logging errors
Returns:
True if configuration is valid, False otherwise.

_validateOptions(config, logfunc)

Execute runtime validations on options configuration.

The following validations are enforced:
  • The options section must exist
  • The working directory must exist and must be writable
  • The backup user and backup group must exist
Parameters:
config - Program configuration.
logfunc - Function to use for logging errors
Returns:
True if configuration is valid, False otherwise.

_validatePurge(config, logfunc)

Execute runtime validations on purge configuration.

The following validations are enforced:
  • Each purge directory must exist and must be writable
Parameters:
config - Program configuration.
logfunc - Function to use for logging errors
Returns:
True if configuration is valid, False otherwise.

_validateReference(config, logfunc)

Execute runtime validations on reference configuration.

We only validate that reference configuration exists at all.
Parameters:
config - Program configuration.
logfunc - Function to use for logging errors
Returns:
True if configuration is valid, False otherwise.

_validateStage(config, logfunc)

Execute runtime validations on stage configuration.

The following validations are enforced:
  • The target directory must exist and must be writable
  • Each local peer's collect directory must exist and must be readable
Parameters:
config - Program configuration.
logfunc - Function to use for logging errors
Returns:
True if configuration is valid, False otherwise.

Note: We currently do not validate anything having to do with remote peers, since we don't have a straightforward way of doing it. It would require adding an rsh command rather than just an rcp command to configuration, and that just doesn't seem worth it right now.

_validateStore(config, logfunc)

Execute runtime validations on store configuration.

The following validations are enforced:
  • The source directory must exist and must be readable
  • The backup device (path and SCSI device) must be valid
Parameters:
config - Program configuration.
logfunc - Function to use for logging errors
Returns:
True if configuration is valid, False otherwise.

_writeCollectIndicator(config)

Writes a collect indicator file into a target collect directory.
Parameters:
config - Config object.

_writeDigest(config, digest, digestPath)

Writes the digest dictionary to the indicated digest path on disk.

If we can't write the digest successfully for any reason, we'll log the condition but won't throw an exception.
Parameters:
config - Config object.
digest - Digest dictionary to write to disk.
digestPath - Path to the digest file on disk.
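
The "log but don't raise" behavior of the digest load and write functions can be sketched as below. The on-disk format is not stated in this page, so the use of pickle here is an assumption, as are the function names, the logger name, and the ERROR log level.

```python
import logging
import pickle

logger = logging.getLogger("example.digest")  # hypothetical logger name


def load_digest(digest_path):
    """Return the digest dictionary, or an empty one if it cannot be loaded."""
    try:
        with open(digest_path, "rb") as handle:
            return pickle.load(handle)
    except Exception as error:
        logger.error("Failed loading digest [%s]: %s", digest_path, error)
        return {}


def write_digest(digest, digest_path):
    """Write the digest dictionary, logging (not raising) on failure."""
    try:
        with open(digest_path, "wb") as handle:
            pickle.dump(digest, handle)
    except Exception as error:
        logger.error("Failed writing digest [%s]: %s", digest_path, error)
```

Treating an unreadable digest as empty means the next incremental collect simply behaves like a full one for that item.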

_writeImage(config, entireDisc, stagingDirs)

Builds and writes an ISO image containing the indicated stage directories.

The generated image will contain each of the staging directories listed in stagingDirs. The directories will be placed into the image at the root by date, so staging directory /opt/stage/2005/02/10 will be placed into the disc at /2005/02/10.
Parameters:
config - Config object.
entireDisc - Indicates whether entire disc should be used
stagingDirs - Dictionary mapping directory path to date suffix.
Raises:
ValueError - Under many generic error conditions
IOError - If there is a problem writing the image to disc.
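
The placement rule can be illustrated with the stagingDirs dictionary itself. This sketch only shows the path mapping, not image creation; the name image_paths is hypothetical.

```python
def image_paths(staging_dirs):
    """Map each staging directory to the path it would occupy on the disc.

    staging_dirs maps directory path to date suffix, for example
    {"/opt/stage/2005/02/10": "2005/02/10"}.
    """
    return {path: "/" + suffix for path, suffix in staging_dirs.items()}
```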

_writeStageIndicator(config, dailyDir)

Writes a stage indicator file into the daily staging directory.

Note that there is a stage indicator on each peer (to indicate that a collect directory has been staged) and in the daily staging directory itself (to indicate that the staging directory has been utilized). This just deals with the daily staging directory.
Parameters:
config - Config object.
dailyDir - Daily staging directory.

_writeStoreIndicator(config, stagingDirs)

Writes a store indicator file into staging directories.

The store indicator is written into each of the staging directories when either a store or rebuild action has written the staging directory to disc.
Parameters:
config - Config object.
stagingDirs - Dictionary mapping directory path to date suffix.

buildNormalizedPath(absPath)

Returns a "normalized" path based on an absolute path.

A "normalized" path has its leading '/' or '.' characters removed, and then converts all remaining whitespace and '/' characters to the '_' character. As a special case, the absolute path / will be normalized to just '-'.
Parameters:
absPath - Absolute path
Returns:
Normalized path.
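
A sketch of the normalization rule, written from the description above rather than from the source code. The name build_normalized_path is illustrative, and stripping every leading '/' and '.' character (in any combination) is this sketch's interpretation of the rule.

```python
import re


def build_normalized_path(abs_path):
    """Return abs_path with separators and whitespace flattened to '_'."""
    if abs_path == "/":
        return "-"  # special case described above
    # Remove leading '/' and '.' characters, then flatten what remains.
    stripped = abs_path.lstrip("/.")
    return re.sub(r"[\s/]", "_", stripped)
```

For example, "/home/user/my docs" becomes "home_user_my_docs" under this interpretation.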

isStartOfWeek(startingDay)

Indicates whether "today" is the backup starting day per configuration.

If the current day's English name matches the indicated starting day, then today is a starting day.
Parameters:
startingDay - Configured starting day.
           (type=string, i.e. "monday", "tuesday", etc.)
Returns:
Boolean indicating whether today is the starting day.
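
The comparison can be sketched without depending on the system locale by using a fixed list of English names. The name is_start_of_week and the DAY_NAMES list are illustrative, and the exact-match (lowercase) comparison is an assumption.

```python
import datetime

DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday"]


def is_start_of_week(starting_day, today=None):
    """Return True if today's English day name equals starting_day."""
    today = datetime.date.today() if today is None else today
    # date.weekday() numbers Monday as 0, matching the list order.
    return DAY_NAMES[today.weekday()] == starting_day
```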

Variable Details

COLLECT_INDICATOR

Type:
str
Value:
'cback.collect'                                                        

DIGEST_EXTENSION

Type:
str
Value:
'sha'                                                                  

DIR_TIME_FORMAT

Type:
str
Value:
'%Y/%m/%d'                                                             

logger

Type:
Logger
Value:
<logging.Logger instance at 0x3b00392c>                                

PREFIX_TIME_FORMAT

Type:
str
Value:
'%Y/%m/%d'                                                             

SECONDS_PER_DAY

Type:
int
Value:
86400                                                                 

STAGE_INDICATOR

Type:
str
Value:
'cback.stage'                                                          

STORE_INDICATOR

Type:
str
Value:
'cback.store'                                                          

Generated by Epydoc 2.1 on Mon Sep 4 13:49:34 2006 http://epydoc.sf.net