Section 3.1: Overview
- The AFS Cache Manager is a kernel-resident agent with the following duties and responsibilities:
- Users are to be given the illusion that files stored in the AFS distributed file system are in fact part of the local unix file system of their client machine. There are several areas in which this illusion is not fully realized:
- Semantics: Full unix semantics are not maintained by the set of agents implementing the AFS distributed file system. The largest deviation involves the time when changes made to a file are seen by others who also have the file open. In AFS, modifications made to a cached copy of a file are not necessarily reflected immediately to the central copy (the one hosted by File Server disk storage), and thus to other cache sites. Rather, the changes are only guaranteed to be visible to others who simultaneously have their own cached copies open when the modifying process executes a unix close() operation on the file.
- This differs from the semantics expected from the single-machine, local unix environment, where writes performed on one open file descriptor are immediately visible to all processes reading the file via their own file descriptors. Thus, instead of the standard "last writer wins" behavior, users see "last closer wins" behavior on their AFS files. Incidentally, other DFSs, such as NFS, do not implement full unix semantics in this case either.
- Partial failures: A panic experienced by a local, single-machine unix file system will, by definition, cause all local processes to terminate immediately. On the other hand, any hard or soft failure experienced by a File Server process or the machine upon which it is executing does not cause any of the Cache Managers interacting with it to crash. Rather, the Cache Managers will now have to reflect their failures in getting responses from the affected File Server back up to their callers. Network partitions also induce the same behavior. From the user's point of view, part of the file system tree has become inaccessible. In addition, certain system calls (e.g., open() and read()) may return unexpected failures to their users. Thus, certain coding practices that have become common amongst experienced (single-machine) unix programmers (e.g., not checking error codes from operations that "can't" fail) cause these programs to misbehave in the face of partial failures.
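The coding practice alluded to above can be made concrete. The sketch below (hypothetical helper, not AFS code) checks every return, including the one from close(), which in AFS is precisely where a store to an unreachable File Server would surface as an error.

```python
# Defensive I/O in the face of partial failure: in AFS even close()
# can report an error, because the dirty chunks are stored back to
# the File Server at close time. Hypothetical sketch.

import errno
import os

def copy_out(path, data):
    """Write data to path, reporting rather than ignoring failures."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        os.write(fd, data)
    except OSError as e:
        return "write failed: %s" % errno.errorcode.get(e.errno, str(e.errno))
    try:
        os.close(fd)    # in AFS, the store happens here and can fail
    except OSError as e:
        return "close failed: %s" % errno.errorcode.get(e.errno, str(e.errno))
    return "ok"
```

A single-machine programmer who assumes close() "can't" fail would silently lose the data in the second case.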
- To support this transparent access paradigm, the Cache Manager proceeds to:
- Intercept all standard unix operations directed towards AFS objects, mapping them to references aimed at the corresponding copies in the local cache.
- Keep a synchronized local cache of AFS files referenced by the client machine's users. If the chunks involved in an operation reading data from an object are either stale or do not exist in the local cache, then they must be fetched from the File Server(s) on which they reside. This may require a query to the volume location service in order to locate the place(s) of residence. Authentication challenges from File Servers needing to verify the caller's identity are handled by the Cache Manager, and the chunk is then incorporated into the cache.
- Upon receipt of a unix close, all dirty chunks belonging to the object will be flushed back to the appropriate File Server.
- Callback deliveries and withdrawals from File Servers must be processed, keeping the local cache in close synchrony with the state of affairs at the central store.
- Interfaces are also provided for those principals who wish to perform AFS-specific operations, such as Access Control List (ACL) manipulations or changes to the Cache Manager's configuration.
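The read path described in the duties above can be sketched as follows. All names here (ChunkCache, the vldb and fileservers maps) are hypothetical illustrations of the flow, not actual AFS data structures: a chunk is served from the local cache when fresh, and otherwise located via the volume location service and fetched from the hosting File Server.

```python
# Sketch of the Cache Manager read path (hypothetical names): check
# the local cache for the chunk; if it is missing or stale, consult
# the volume location service for the hosting server, fetch the
# chunk, and incorporate it into the cache.

class ChunkCache:
    def __init__(self, vldb, fileservers):
        self.vldb = vldb                # volume name -> server name
        self.fileservers = fileservers  # server -> {(fid, chunk): data}
        self.cache = {}                 # (fid, chunk) -> (data, valid)

    def read_chunk(self, volume, fid, chunk):
        entry = self.cache.get((fid, chunk))
        if entry is not None and entry[1]:   # present, callback valid
            return entry[0]
        server = self.vldb[volume]           # volume location query
        data = self.fileservers[server][(fid, chunk)]   # fetch RPC
        self.cache[(fid, chunk)] = (data, True)
        return data
```

Once a chunk is cached with a valid callback, repeated reads never touch the network; only a callback break or eviction forces a refetch.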
- This chapter takes a tour of the Cache Manager's architecture, and examines how it supports these roles and responsibilities. First, the set of AFS agents with which it must interact is discussed. Next, some of the Cache Manager's implementation and interface choices are examined. Finally, the Cache Manager's ability to arbitrarily dispose of callback information without affecting the correctness of the cache consistency algorithm is explained.
Section 3.2: Interactions
- The main AFS agent interacting with a Cache Manager is the File Server. The most common operation performed by the Cache Manager is to act as its users' agent in fetching and storing files to and from the centralized repositories. Related to this activity, a Cache Manager must be prepared to answer queries from a File Server concerning its health. It must also be able to accept callback revocation notices generated by File Servers. Since the Cache Manager not only engages in data transfer but must also determine where the data is located in the first place, it also directs inquiries to Volume Location Server agents. There must also be an interface allowing direct interactions with both common and administrative users. Certain AFS-specific operations must be made available to these parties. In addition, administrative users may desire to dynamically reconfigure the Cache Manager. For example, information about a newly-created cell may be added without restarting the client's machine.
Section 3.3: Implementation Techniques
- The roles and behaviors described above, along with the desire to maximize portability, influenced the implementation choices and methods used to construct the Cache Manager. This section begins by showing how the VFS/vnode interface, pioneered and standardized by Sun Microsystems, provides not only the necessary fine-grain access to user file system operations, but also facilitates Cache Manager ports to new hardware and operating system platforms. Next, the use of unix system calls is examined. Finally, the threading structure employed is described.
Section 3.3.1: VFS Interface
- As mentioned above, Sun Microsystems has introduced and propagated an important concept in the file system world, that of the Virtual File System (VFS) interface. This abstraction defines a core collection of file system functions which cover all operations required for users to manipulate their data. System calls are written in terms of these standardized routines. Also, the associated vnode concept generalizes the original unix inode idea and provides hooks for differing underlying environments. Thus, to port a system to a new hardware platform, the system programmers have only to construct implementations of this base array of functions consistent with the new underlying machine.
- The VFS abstraction also allows multiple file systems (e.g., vanilla unix, DOS, NFS, and AFS) to coexist on the same machine without interference. Thus, to make a machine AFS-capable, a system designer first extends the base vnode structure in well-defined ways in order to store AFS-specific operations with each file description. Then, the base function array is coded so that calls upon the proper AFS agents are made to accomplish each function's standard objectives. In effect, the Cache Manager consists of code that interprets the standard set of unix operations imported through this interface and executes the AFS protocols to carry them out.
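The dispatch idea can be illustrated in miniature. This is a hypothetical sketch, not the real kernel interface: each mounted file system registers its own implementations of a standard operation vector, and the system-call layer dispatches through the vector without knowing which file system is underneath.

```python
# The VFS/vnode idea in miniature (hypothetical sketch): a per-mount
# table of operation implementations, with system calls written in
# terms of the table rather than any particular file system.

def local_open(path):
    return "local: opened " + path

def afs_open(path):
    return "afs: fetched " + path + " into cache"

vnodeops = {
    "/":    {"open": local_open},   # vanilla unix file system
    "/afs": {"open": afs_open},     # AFS mounted under /afs
}

def vfs_open(path):
    """System-call layer: pick the vector by mount point, then dispatch."""
    mount = "/afs" if path.startswith("/afs") else "/"
    return vnodeops[mount]["open"](path)
```

In the real interface the vector is an array of C function pointers attached to each vnode; the Cache Manager is, in effect, the AFS-specific column of that table.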
Section 3.3.2: System Calls
- As mentioned above, many unix system calls are implemented in terms of the base function array of vnode-oriented operations. In addition, one existing system call has been modified and two new system calls have been added to perform AFS-specific operations apart from the Cache Manager's unix 'emulation' activities. The standard ioctl() system call has been augmented to handle AFS-related operations on objects accessed via open unix file descriptors. One of the brand-new system calls is pioctl(), which is much like ioctl() except it names targeted objects by pathname instead of file descriptor. Another is afs_call(), which is used to initialize the Cache Manager threads, as described in the section immediately following.
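The distinction between the two calls is purely one of naming, which a toy dispatcher makes plain. The names and tables below are hypothetical illustrations, not the real system-call interface: both calls funnel into the same AFS operation handler, one resolving a file descriptor first, the other taking the pathname directly.

```python
# Toy dispatcher (hypothetical names): ioctl() names its target by
# open file descriptor, pioctl() by pathname; both reach the same
# AFS-specific handler.

objects = {"/afs/cell/dir": {"acl": "system:anyuser rl"}}
fd_table = {3: "/afs/cell/dir"}      # open descriptors -> pathnames

def afs_op(obj, cmd):
    if cmd == "GETACL":
        return objects[obj]["acl"]
    raise ValueError("unknown command: " + cmd)

def my_ioctl(fd, cmd):               # target named by file descriptor
    return afs_op(fd_table[fd], cmd)

def my_pioctl(path, cmd):            # target named by pathname
    return afs_op(path, cmd)
```

The pathname form matters because many AFS operations (such as ACL queries) apply to directories the caller may never open as a file descriptor.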
Section 3.3.3: Threading
- In order to execute its many roles, the Cache Manager is organized as a multi-threaded entity. It is implemented with (potentially multiple instantiations of) the following three thread classes:
- CallBack Listener: This thread implements the Cache Manager callback RPC interface, as described in Section 6.5.
- Periodic Maintenance: Certain maintenance and checkup activities need to be performed at five set intervals. Currently, the frequency of each of these operations is hard-wired. It would be a simple matter, though, to make these times configurable by adding command-line parameters to the Cache Manager.
- Thirty seconds: Flush pending writes for NFS clients coming in through the NFS-AFS Translator facility.
- One minute: Make sure local cache usage is below the assigned quota, write out dirty buffers holding directory data, and keep flock()s alive.
- Three minutes: Check for the resuscitation of File Servers previously determined to be down, and check the cache of previously computed access information in light of any newly expired tickets.
- Ten minutes: Check health of all File Servers marked as active, and garbage-collect old RPC connections.
- One hour: Check the status of the root AFS volume as well as all cached information concerning read-only volumes.
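The five intervals above can be modeled as a single scheduler loop that fires each task when its hard-wired period elapses. This is a hypothetical sketch of the structure, not the actual thread code; periods are in seconds.

```python
# Periodic-maintenance thread in miniature (hypothetical sketch):
# each task fires whenever its hard-wired period divides the clock.

tasks = {
    30:   "flush pending NFS translator writes",
    60:   "enforce cache quota / flush directory buffers / renew flocks",
    180:  "probe down File Servers / expire cached access rights",
    600:  "probe up File Servers / GC old RPC connections",
    3600: "check root AFS volume and read-only volume state",
}

def due_tasks(now):
    """Return the descriptions of tasks due at time `now` (seconds)."""
    return [desc for period, desc in sorted(tasks.items())
            if now % period == 0]
```

At the one-hour mark every period divides evenly, so all five tasks run together; making the periods configurable would simply mean populating this table from command-line parameters.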
- Background Operations: The Cache Manager is capable of prefetching file system objects, as well as carrying out delayed stores, occurring sometime after a close() operation. At least two threads are created at Cache Manager initialization time and held in reserve to carry out these objectives. This class of background threads implements the following three operations:
- Prefetch operation: Fetches particular file system object chunks in the expectation that they will soon be needed.
- Path-based prefetch operation: The prefetch daemon mentioned above operates on objects already at least partly resident in the local cache, referenced by their vnode. The path-based prefetch daemon performs the same actions, but on objects named solely by their unix pathname.
- Delayed store operation: Flush all modified chunks from a file system object to the appropriate File Server's disks.
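The background-thread structure above amounts to a request queue drained by a small pool of reserved workers. The sketch below is a hypothetical illustration using Python threads, not the kernel implementation: requests (prefetches and delayed stores) are enqueued, and the pool processes them asynchronously.

```python
# Background daemons in miniature (hypothetical sketch): a queue of
# (operation, target) requests drained by a reserved pool of worker
# threads, as with AFS prefetches and delayed stores.

import queue
import threading

def run_background(requests, nworkers=2):
    q = queue.Queue()
    done = []
    lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            if item is None:            # shutdown sentinel
                break
            op, target = item
            with lock:                  # record the completed operation
                done.append(op + ":" + target)

    threads = [threading.Thread(target=worker) for _ in range(nworkers)]
    for t in threads:
        t.start()
    for r in requests:
        q.put(r)
    for _ in threads:
        q.put(None)
    for t in threads:
        t.join()
    return done
```

Holding at least two workers in reserve means a long-running delayed store cannot starve an incoming prefetch request.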
Section 3.4: Disposal of Cache Manager Records
- The Cache Manager is free to throw away any or all of the callbacks it has received from the set of File Servers from which it has cached files. This housecleaning does not in any way compromise the correctness of the AFS cache consistency algorithm. The File Server RPC interface described in this paper provides a call allowing a Cache Manager to advise the File Server of such unilateral jettisoning; however, failure to use this routine still leaves the machine's cache consistent. Let us examine the case of a Cache Manager on machine C disposing of its callback on file X from File Server F. The next user access to file X on machine C will cause the Cache Manager to notice that it does not currently hold a callback on the file (although the File Server will think it does), so the Cache Manager on C attempts to revalidate its entry, even though it is entirely possible that the file is still in sync with the central store. In response, the File Server will extend the existing callback information it has and deliver the new promise to the Cache Manager on C. Now consider the case where file X is modified by a party on a machine other than C before such an access occurs on C. Under these circumstances, the File Server will break its callback on file X before performing the central update, and the Cache Manager on C will receive one of these "break callback" messages. Since it no longer has a callback on file X, the Cache Manager on C will simply acknowledge the File Server's notification and move on to other matters. In either case, the callback information held by both parties will eventually resynchronize. The only potential penalty is the extra revalidation inquiries made by the Cache Manager, resulting in reduced performance rather than incorrect operation.
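Both cases of the argument can be walked through in a toy simulation. The classes below are hypothetical models of the protocol, not AFS code: the client discards its callback unilaterally, and the worst outcome is an extra revalidation RPC, never stale data.

```python
# Toy model of unilateral callback disposal (hypothetical names).
# Case 1: the discarding client revalidates and finds the file
# unchanged. Case 2: a remote update breaks the callback the server
# thinks the client holds; the client merely acknowledges.

class FileServer:
    def __init__(self):
        self.version = 1
        self.callbacks = set()        # clients holding promises

    def fetch_status(self, client):   # revalidation: extends callback
        self.callbacks.add(client)
        return self.version

    def update(self):                 # remote write: break callbacks first
        broken, self.callbacks = self.callbacks, set()
        self.version += 1
        return broken                 # clients that were notified

class CacheManager:
    def __init__(self, name, server):
        self.name, self.server = name, server
        self.cached_version = server.fetch_status(name)
        self.has_callback = True

    def discard_callback(self):       # unilateral housecleaning
        self.has_callback = False

    def read(self):                   # user access to the cached file
        if not self.has_callback:     # must revalidate with the server
            self.cached_version = self.server.fetch_status(self.name)
            self.has_callback = True
        return self.cached_version
```

In case 2, the break notification delivered to a client that already discarded its callback is a harmless acknowledgment; the next access fetches fresh status, so both parties resynchronize.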