org.jets3t.service.utils
Class FileComparer

java.lang.Object
  extended by org.jets3t.service.utils.FileComparer

public class FileComparer
extends Object

File comparison utility to compare files on the local computer with objects present in an S3 account and determine whether there are any differences. This utility contains methods to build maps of the contents of the local file system or S3 account for comparison, and buildDiscrepancyLists methods to find differences in these maps.

File comparisons are based primarily on MD5 hashes of the files' contents. If a local file does not match an object in S3 with the same name, this utility determines which of the items is newer by comparing the last modified dates.
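
The comparison rule described above can be sketched roughly as follows. This is a minimal illustration and not JetS3t's own code: the class `Md5CompareSketch`, its method names, and the result labels are hypothetical, and treating equal timestamps as "remote newer" is an arbitrary assumption.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of JetS3t.
class Md5CompareSketch {

    /** Returns the MD5 hash of the given content as lowercase hex. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /**
     * Compares hashes first; only when they differ are the last modified
     * times (milliseconds) consulted to decide which side is newer.
     * Assumption: a timestamp tie is reported as REMOTE_NEWER.
     */
    static String classify(String localMd5, long localModified,
                           String remoteMd5, long remoteModified) {
        if (localMd5.equals(remoteMd5)) {
            return "SAME";
        }
        return localModified > remoteModified ? "LOCAL_NEWER" : "REMOTE_NEWER";
    }
}
```

A real comparison would normally stream file contents through the digest rather than hold them in memory; the byte array keeps the sketch short.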

Author:
James Murty

Nested Class Summary
 class FileComparer.PartialObjectListing
           
 
Constructor Summary
FileComparer(Jets3tProperties jets3tProperties)
          Constructs the class.
 
Method Summary
 FileComparerResults buildDiscrepancyLists(Map filesMap, Map s3ObjectsMap)
          Compares the contents of a directory on the local file system with the contents of an S3 resource.
 FileComparerResults buildDiscrepancyLists(Map filesMap, Map s3ObjectsMap, BytesProgressWatcher progressWatcher)
          Compares the contents of a directory on the local file system with the contents of an S3 resource.
 Map buildFileMap(File[] files, boolean includeDirectories)
          Builds a File Map containing the given files.
 Map buildFileMap(File rootDirectory, String fileKeyPrefix, boolean includeDirectories)
          Builds a File Map containing all the files and directories inside the given root directory, where the map's key for each file is the relative path to the file.
 Map buildS3ObjectMap(S3Service s3Service, S3Bucket bucket, String targetPath, boolean skipMetadata, S3ServiceEventListener s3ServiceEventListener)
          Builds an S3 Object Map containing all the objects within the given target path, where the map's key for each object is the relative path to the object.
 Map buildS3ObjectMap(S3Service s3Service, S3Bucket bucket, String targetPath, S3Object[] s3ObjectsIncomplete, boolean skipMetadata, S3ServiceEventListener s3ServiceEventListener)
          Builds an S3 Object Map containing all the given objects, by retrieving HEAD details about all the objects and using populateS3ObjectMap(String, S3Object[]) to produce an object/key map.
 FileComparer.PartialObjectListing buildS3ObjectMapPartial(S3Service s3Service, S3Bucket bucket, String targetPath, String priorLastKey, boolean completeListing, boolean skipMetadata, S3ServiceEventListener s3ServiceEventListener)
          Builds an S3 Object Map containing a partial set of objects within the given target path, where the map's key for each object is the relative path to the object.
static FileComparer getInstance()
           
static FileComparer getInstance(Jets3tProperties jets3tProperties)
           
 S3Object[] listObjectsThreaded(S3Service s3Service, String bucketName, String targetPath)
          Lists the objects in a bucket using a partitioning technique to divide the object namespace into separate partitions that can be listed by multiple simultaneous threads.
 S3Object[] listObjectsThreaded(S3Service s3Service, String bucketName, String targetPath, String delimiter, int toDepth)
          Lists the objects in a bucket using a partitioning technique to divide the object namespace into separate partitions that can be listed by multiple simultaneous threads.
 Map populateS3ObjectMap(String targetPath, S3Object[] s3Objects)
          Builds a map of key/object pairs, where each object is associated with a key based on its location in the S3 target path.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FileComparer

public FileComparer(Jets3tProperties jets3tProperties)
Constructs the class.

Parameters:
jets3tProperties - the object containing the properties that will be applied in this class.

Method Detail

getInstance

public static FileComparer getInstance(Jets3tProperties jets3tProperties)
Parameters:
jets3tProperties - the object containing the properties that will be applied in the instance.
Returns:
a FileComparer instance.

getInstance

public static FileComparer getInstance()
Returns:
a FileComparer instance initialized with the default JetS3tProperties object.

buildFileMap

public Map buildFileMap(File[] files,
                        boolean includeDirectories)
Builds a File Map containing the given files. If any of the given files are actually directories, the contents of the directory are included.

File keys are delimited with '/' characters.

Any file or directory matching a path in a .jets3t-ignore file will be ignored.

Parameters:
files - the set of files/directories to include in the file map.
includeDirectories - If true, all directories, including empty ones, will be included in the Map. These directories will be mere place-holder objects with the content type Mimetypes.MIMETYPE_JETS3T_DIRECTORY. If this variable is false, directory objects will not be included in the Map, and it will not be possible to store empty directories in S3.
Returns:
a Map of file path keys to File objects.

buildFileMap

public Map buildFileMap(File rootDirectory,
                        String fileKeyPrefix,
                        boolean includeDirectories)
Builds a File Map containing all the files and directories inside the given root directory, where the map's key for each file is the relative path to the file.

File keys are delimited with '/' characters.

Any file or directory matching a path in a .jets3t-ignore file will be ignored.

Parameters:
rootDirectory - The root directory containing the files/directories of interest. The root directory is not included in the result map.
fileKeyPrefix - A prefix added to each file path key in the map, e.g. the name of the root directory the files belong to. If provided, a '/' suffix is always added to the end of the prefix. If null or empty, no prefix is used.
includeDirectories - If true, all directories, including empty ones, will be included in the Map. These directories will be mere place-holder objects with the content type Mimetypes.MIMETYPE_JETS3T_DIRECTORY. If this variable is false, directory objects will not be included in the Map, and it will not be possible to store empty directories in S3.
Returns:
A Map of file path keys to File objects.
See Also:
buildDiscrepancyLists(Map, Map), buildS3ObjectMap(S3Service, S3Bucket, String, S3Object[], boolean, S3ServiceEventListener)
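
The key-building rules for buildFileMap(File, String, boolean) can be sketched as below. This is an illustrative approximation under stated assumptions, not the library's implementation: `FileMapSketch` is a hypothetical class, .jets3t-ignore handling is omitted, and giving directory keys no trailing '/' is an assumption.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch for illustration only; not part of JetS3t.
class FileMapSketch {

    /** Maps '/'-delimited relative path keys to the files below rootDirectory. */
    static Map<String, File> buildFileMap(File rootDirectory, String fileKeyPrefix,
                                          boolean includeDirectories) {
        Map<String, File> map = new TreeMap<>();
        // A non-empty prefix always gets a '/' suffix; null or empty means no prefix.
        String prefix = (fileKeyPrefix == null || fileKeyPrefix.isEmpty())
                ? "" : fileKeyPrefix + "/";
        collect(rootDirectory, prefix, includeDirectories, map);
        return map;
    }

    private static void collect(File dir, String prefix, boolean includeDirectories,
                                Map<String, File> map) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            String key = prefix + child.getName();
            if (child.isDirectory()) {
                if (includeDirectories) {
                    map.put(key, child); // place-holder entry, even when empty
                }
                collect(child, key + "/", includeDirectories, map);
            } else {
                map.put(key, child);
            }
        }
    }
}
```

The root directory itself never becomes a key; only its descendants do, which matches the description of the rootDirectory parameter.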

listObjectsThreaded

public S3Object[] listObjectsThreaded(S3Service s3Service,
                                      String bucketName,
                                      String targetPath,
                                      String delimiter,
                                      int toDepth)
                               throws S3ServiceException
Lists the objects in a bucket using a partitioning technique to divide the object namespace into separate partitions that can be listed by multiple simultaneous threads. This method divides the object namespace using the given delimiter, traverses this space up to the specified depth to identify prefix names for multiple "partitions", and then lists the objects in each partition. It returns the complete list of objects in the bucket path.

This partitioning technique will work best for buckets with many objects that are divided into a number of virtual subdirectories of roughly equal size.
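
To make the partitioning idea concrete, the sketch below shows how object keys fall into partitions for a given delimiter and depth. It is an assumption-laden illustration: `PartitionSketch` is a hypothetical class that works on an in-memory list of keys, whereas the real method discovers prefixes through S3 listing requests and lists each partition on its own thread.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch for illustration only; not part of JetS3t.
class PartitionSketch {

    /** Returns the prefix covering at most toDepth delimiter levels of the key. */
    static String partitionOf(String key, String delimiter, int toDepth) {
        if (delimiter == null || delimiter.isEmpty() || toDepth <= 0) {
            return ""; // no partitioning: everything shares one partition
        }
        int end = 0;
        for (int level = 0; level < toDepth; level++) {
            int next = key.indexOf(delimiter, end);
            if (next < 0) {
                break; // key has fewer levels than toDepth
            }
            end = next + delimiter.length();
        }
        return key.substring(0, end);
    }

    /** Groups keys by partition prefix; each key lands in exactly one group. */
    static Map<String, List<String>> partition(Collection<String> keys,
                                               String delimiter, int toDepth) {
        Map<String, List<String>> partitions = new TreeMap<>();
        for (String key : keys) {
            partitions.computeIfAbsent(partitionOf(key, delimiter, toDepth),
                    p -> new ArrayList<>()).add(key);
        }
        return partitions;
    }
}
```

With delimiter "/" and depth 2, the keys a/b/1, a/b/2, a/c/3 and d/4 would fall into the partitions a/b/, a/c/ and d/. Partitions of similar size give each listing thread a similar amount of work, which is why evenly divided buckets benefit most.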

Parameters:
s3Service - the service object that will be used to perform listing requests.
bucketName - the name of the bucket whose contents will be listed.
targetPath - a root path within the bucket to be listed. If this parameter is null, all the bucket's objects will be listed. Otherwise, only the objects below the virtual path specified will be listed.
delimiter - the delimiter string used to identify virtual subdirectory partitions in a bucket. If this parameter is null, or it has a value that is not present in your object names, no partitioning will take place.
toDepth - the number of delimiter levels this method will traverse to identify subdirectory partitions. If this value is zero, no partitioning will take place.
Returns:
the list of objects under the target path in the bucket.
Throws:
S3ServiceException

listObjectsThreaded

public S3Object[] listObjectsThreaded(S3Service s3Service,
                                      String bucketName,
                                      String targetPath)
                               throws S3ServiceException
Lists the objects in a bucket using a partitioning technique to divide the object namespace into separate partitions that can be listed by multiple simultaneous threads. This method divides the object namespace using the delimiter configured for the bucket, traverses this space up to the configured depth to identify prefix names for multiple "partitions", and then lists the objects in each partition. It returns the complete list of objects in the bucket path.

This partitioning technique will work best for buckets with many objects that are divided into a number of virtual subdirectories of roughly equal size.

The delimiter and depth properties that define how this method will partition the bucket's namespace are set in the jets3t.properties file with the setting: filecomparer.bucket-listing.<bucketname>=<delim>,<depth>
For example: filecomparer.bucket-listing.my-bucket=/,2
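
As a rough illustration of how a value in this format could be read, the sketch below parses the setting into a delimiter and a depth. `BucketListingSetting` is a hypothetical class, not JetS3t's parser, and treating a missing setting as "no partitioning" (null delimiter, depth 0) is an assumption.

```java
import java.util.Properties;

// Hypothetical parser for illustration only; not part of JetS3t.
class BucketListingSetting {
    final String delimiter; // null means no partitioning
    final int depth;        // 0 means no partitioning

    BucketListingSetting(String delimiter, int depth) {
        this.delimiter = delimiter;
        this.depth = depth;
    }

    /** Reads filecomparer.bucket-listing.<bucketname>=<delim>,<depth>. */
    static BucketListingSetting forBucket(Properties props, String bucketName) {
        String value = props.getProperty("filecomparer.bucket-listing." + bucketName);
        if (value == null) {
            // Assumption: an absent setting disables partitioning.
            return new BucketListingSetting(null, 0);
        }
        // Split on the last comma so the delimiter itself may contain commas.
        int comma = value.lastIndexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("Expected <delim>,<depth> but got: " + value);
        }
        String delimiter = value.substring(0, comma);
        int depth = Integer.parseInt(value.substring(comma + 1).trim());
        return new BucketListingSetting(delimiter, depth);
    }
}
```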

Parameters:
s3Service - the service object that will be used to perform listing requests.
bucketName - the name of the bucket whose contents will be listed.
targetPath - a root path within the bucket to be listed. If this parameter is null, all the bucket's objects will be listed. Otherwise, only the objects below the virtual path specified will be listed.
Returns:
the list of objects under the target path in the bucket.
Throws:
S3ServiceException

buildS3ObjectMap

public Map buildS3ObjectMap(S3Service s3Service,
                            S3Bucket bucket,
                            String targetPath,
                            boolean skipMetadata,
                            S3ServiceEventListener s3ServiceEventListener)
                     throws S3ServiceException
Builds an S3 Object Map containing all the objects within the given target path, where the map's key for each object is the relative path to the object.

Parameters:
s3Service -
bucket -
targetPath -
skipMetadata -
s3ServiceEventListener -
Returns:
mapping of keys/S3Objects
Throws:
S3ServiceException
See Also:
buildDiscrepancyLists(Map, Map), buildFileMap(File, String, boolean)

buildS3ObjectMapPartial

public FileComparer.PartialObjectListing buildS3ObjectMapPartial(S3Service s3Service,
                                                                 S3Bucket bucket,
                                                                 String targetPath,
                                                                 String priorLastKey,
                                                                 boolean completeListing,
                                                                 boolean skipMetadata,
                                                                 S3ServiceEventListener s3ServiceEventListener)
                                                          throws S3ServiceException
Builds an S3 Object Map containing a partial set of objects within the given target path, where the map's key for each object is the relative path to the object.

If the method is asked to perform a complete listing, it will use the listObjectsThreaded(S3Service, String, String) method to list the objects in the bucket, potentially taking advantage of any bucket name partitioning settings you have applied.

If the method is asked to perform only a partial listing, no bucket name partitioning will be applied.
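
The partial-listing pattern, where each call returns a chunk of results plus the key to resume from, can be sketched as follows. Everything here is hypothetical: `PartialListingSketch`, `Page`, and the page size stand in for the real S3 listing, and the sketch assumes a null prior last key means there is nothing further to list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableSet;
import java.util.SortedSet;

// Hypothetical sketch for illustration only; not part of JetS3t.
class PartialListingSketch {

    /** One chunk of keys plus the key to pass to the follow-up call, or null when done. */
    static final class Page {
        final List<String> keys;
        final String priorLastKey;

        Page(List<String> keys, String priorLastKey) {
            this.keys = keys;
            this.priorLastKey = priorLastKey;
        }
    }

    /** Lists up to pageSize keys that sort strictly after priorLastKey. */
    static Page listPage(NavigableSet<String> allKeys, String priorLastKey, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be at least 1");
        }
        SortedSet<String> remaining =
                priorLastKey == null ? allKeys : allKeys.tailSet(priorLastKey, false);
        List<String> keys = new ArrayList<>();
        Iterator<String> it = remaining.iterator();
        while (it.hasNext() && keys.size() < pageSize) {
            keys.add(it.next());
        }
        String next = it.hasNext() ? keys.get(keys.size() - 1) : null;
        return new Page(keys, next);
    }

    /** Drives listPage repeatedly, feeding each prior last key back in. */
    static List<String> listAll(NavigableSet<String> allKeys, int pageSize) {
        List<String> result = new ArrayList<>();
        String priorLastKey = null;
        do {
            Page page = listPage(allKeys, priorLastKey, pageSize);
            result.addAll(page.keys);
            priorLastKey = page.priorLastKey;
        } while (priorLastKey != null);
        return result;
    }
}
```

Processing one page at a time keeps memory use bounded for very large buckets, at the cost of the parallel listing available to a complete listing.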

Parameters:
s3Service -
bucket -
targetPath -
priorLastKey - the prior last key value returned by a prior invocation of this method, if any.
completeListing - if true, this method will perform a complete listing of an S3 target. If false, the method will list a partial set of objects commencing from the given prior last key.
Returns:
an object containing a mapping of key names to S3Objects, and the prior last key (if any) that should be used to perform follow-up method calls.
Throws:
S3ServiceException
See Also:
buildDiscrepancyLists(Map, Map), buildFileMap(File, String, boolean)

buildS3ObjectMap

public Map buildS3ObjectMap(S3Service s3Service,
                            S3Bucket bucket,
                            String targetPath,
                            S3Object[] s3ObjectsIncomplete,
                            boolean skipMetadata,
                            S3ServiceEventListener s3ServiceEventListener)
                     throws S3ServiceException
Builds an S3 Object Map containing all the given objects, by retrieving HEAD details about all the objects and using populateS3ObjectMap(String, S3Object[]) to produce an object/key map.

Parameters:
s3Service -
bucket -
targetPath -
skipMetadata -
s3ObjectsIncomplete -
Returns:
mapping of keys/S3Objects
Throws:
S3ServiceException
See Also:
buildDiscrepancyLists(Map, Map), buildFileMap(File, String, boolean)

populateS3ObjectMap

public Map populateS3ObjectMap(String targetPath,
                               S3Object[] s3Objects)
Builds a map of key/object pairs, where each object is associated with a key based on its location in the S3 target path.
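
One plausible reading of "a key based on its location in the S3 target path" is that the target path is stripped from the front of each object key. The sketch below shows that interpretation only; `RelativeKeySketch` is hypothetical, it maps relative keys to full key strings rather than S3Object instances, and the exact rules JetS3t applies are not stated here.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch for illustration only; not part of JetS3t.
class RelativeKeySketch {

    /** Strips the target path (and its '/' separator) from the front of a key. */
    static String relativeKey(String targetPath, String objectKey) {
        if (targetPath == null || targetPath.isEmpty()) {
            return objectKey; // no target path: keys are used as they are
        }
        String prefix = targetPath.endsWith("/") ? targetPath : targetPath + "/";
        // Assumption: keys outside the target path are left unchanged.
        return objectKey.startsWith(prefix)
                ? objectKey.substring(prefix.length()) : objectKey;
    }

    /** Maps each relative key to the full object key it came from. */
    static Map<String, String> relativeKeys(String targetPath, List<String> objectKeys) {
        Map<String, String> map = new TreeMap<>();
        for (String key : objectKeys) {
            map.put(relativeKey(targetPath, key), key);
        }
        return map;
    }
}
```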

Parameters:
targetPath -
s3Objects -
Returns:
a map of key/S3Object pairs.

buildDiscrepancyLists

public FileComparerResults buildDiscrepancyLists(Map filesMap,
                                                 Map s3ObjectsMap)
                                          throws NoSuchAlgorithmException,
                                                 FileNotFoundException,
                                                 IOException,
                                                 ParseException
Compares the contents of a directory on the local file system with the contents of an S3 resource. This comparison is performed on a map of files and a map of S3 objects previously generated using other methods in this class.
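
The general shape of a map-against-map comparison is sketched below. It is a simplified stand-in rather than the library's algorithm: `DiscrepancySketch` and its list names are hypothetical, and both maps hold precomputed hash strings (assumed non-null) instead of File and S3Object values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch for illustration only; not part of JetS3t.
class DiscrepancySketch {
    final List<String> onlyLocal = new ArrayList<>();
    final List<String> onlyRemote = new ArrayList<>();
    final List<String> different = new ArrayList<>();
    final List<String> identical = new ArrayList<>();

    /** Sorts every key from either map into exactly one of the four lists. */
    static DiscrepancySketch compare(Map<String, String> localHashes,
                                     Map<String, String> remoteHashes) {
        DiscrepancySketch result = new DiscrepancySketch();
        TreeSet<String> allKeys = new TreeSet<>(localHashes.keySet());
        allKeys.addAll(remoteHashes.keySet());
        for (String key : allKeys) {
            String local = localHashes.get(key);
            String remote = remoteHashes.get(key);
            if (remote == null) {
                result.onlyLocal.add(key);
            } else if (local == null) {
                result.onlyRemote.add(key);
            } else if (local.equals(remote)) {
                result.identical.add(key);
            } else {
                result.different.add(key); // a date check would decide which is newer
            }
        }
        return result;
    }
}
```

Because both maps use the same relative-path keys, the comparison reduces to set operations on the keys plus a hash check for keys present on both sides.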

Parameters:
filesMap - a map of keys/Files built using the method buildFileMap(File, String, boolean)
s3ObjectsMap - a map of keys/S3Objects built using the method buildS3ObjectMap(S3Service, S3Bucket, String, S3Object[], boolean, S3ServiceEventListener)
Returns:
an object containing the results of the file comparison.
Throws:
NoSuchAlgorithmException
FileNotFoundException
IOException
ParseException

buildDiscrepancyLists

public FileComparerResults buildDiscrepancyLists(Map filesMap,
                                                 Map s3ObjectsMap,
                                                 BytesProgressWatcher progressWatcher)
                                          throws NoSuchAlgorithmException,
                                                 FileNotFoundException,
                                                 IOException,
                                                 ParseException
Compares the contents of a directory on the local file system with the contents of an S3 resource. This comparison is performed on a map of files and a map of S3 objects previously generated using other methods in this class.

Parameters:
filesMap - a map of keys/Files built using the method buildFileMap(File, String, boolean)
s3ObjectsMap - a map of keys/S3Objects built using the method buildS3ObjectMap(S3Service, S3Bucket, String, boolean, S3ServiceEventListener)
progressWatcher - watches the progress of file hash generation.
Returns:
an object containing the results of the file comparison.
Throws:
NoSuchAlgorithmException
FileNotFoundException
IOException
ParseException