wvencoder.h

00001 /* -*- Mode: C++ -*-
00002  * Worldvisions Weaver Software:
00003  *   Copyright (C) 1997-2002 Net Integration Technologies, Inc.
00004  *
00005  * A top-level data encoder class and a few useful encoders.
00006  */
00007 #ifndef __WVENCODER_H
00008 #define __WVENCODER_H
00009 
00010 #include "wvbuf.h"
00011 #include "wvlinklist.h"
00012 #include "wvstring.h"
00013 
00014 /**
00015  * The base encoder class.
00016  * 
00017  * Encoders read data from an input buffer, transform it in some
00018  * way, then write the results to an output buffer.  The resulting
00019  * data may be of a different size or data type, and may or may
00020  * not depend on previous data.
00021  * 
00022  * Encoders may or may not possess the following characteristics:
00023  * 
00024  *  - Statefulness: encoding of successive input elements may
00025  *     depend on previous ones
00026  *  - Error states: encoding may enter an error state indicated
00027  *     by isok() == false due to problems detected
00028  *     in the input, or by the manner in which the encoder has
00029  *     been used
00030  *  - Minimum input block size: data will not be drawn from the
00031  *     input buffer until enough is available or the encoder
00032  *     is flushed
00033  *  - Minimum output block size: data will not be written to the
00034  *     output buffer until enough free space is available
00035  *  - Synchronization boundaries: data is processed or generated
00036  *     in chunks which can be manipulated independently of any
00037  *     others, in which case flush() may cause the encoder to
00038  *     produce such a boundary in its output
00039  *  - Recognition of end-of-data mark: a special sequence marks
00040  *     the end of input, after which the encoder transitions to
00041  *     isfinished() == true
00042  *  - Generation of end-of-data mark: a special sequence marks
00043  *     the end of output when the encoder transitions to
00044  *     isfinished() == true, usually by an explicit
00045  *     call to finish()
00046  *  - Reset support: the encoder may be reset to its initial
00047  *     state and thereby recycled at minimum cost
00048  * 
00049  * 
00050  * Helper functions are provided for encoding data from plain
00051  * memory buffers and from strings.  Some have no encode(...)
00052  * equivalent because they cannot incrementally encode from
00053  * the input, hence they always use the flush option.
00054  * 
00055  * The 'mem' suffix has been tacked on to these functions to
00056  * resolve ambiguities dealing with 'char *' that should be
00057  * promoted to WvString.  For instance, consider the signatures
00058  * of strflushmem(const void*, size_t) and strflushstr(WvStringParm,
00059  * bool).
00060  * 
00061  * Another reason for these suffixes is to simplify overloading
00062  * the basic methods in subclasses since C++ would require the
00063  * subclass to redeclare all of the other signatures for
00064  * an overloaded method.
00065  * 
00066  */
00067 class WvEncoder
00068 {
00069 protected:
00070     bool okay; /*!< false iff setnotok() was called */
00071     bool finished; /*!< true iff setfinished()/finish() was called */
00072     WvString errstr; /*!< the error message */
00073 
00074 public:
00075     /** Creates a new WvEncoder. */
00076     WvEncoder();
00077 
00078     /** Destroys the encoder.  Unflushed data is lost. */
00079     virtual ~WvEncoder();
00080     
00081     /**
00082      * Returns true if the encoder has not encountered an error.
00083      * 
00084      * This should only be used to record permanent failures.
00085      * Transient errors (e.g. bad block, but recoverable) should be
00086      * detected in a different fashion.
00087      * 
00088      * Returns: true if the encoder is ok
00089      */
00090     bool isok() const
00091         { return okay && _isok(); }
00092 
00093     /**
00094      * Returns true if the encoder can no longer encode data.
00095      * 
00096      * This will be set when the encoder detects an end-of-data
00097      * mark in its input, or when finish() is called.
00098      * 
00099      * Returns: true if the encoder is finished
00100      */
00101     bool isfinished() const
00102         { return finished || _isfinished(); }
00103 
00104     /**
00105      * Returns an error message if any is available.
00106      *
00107      * Returns: the error message, or the null string if isok() == true
00108      */
00109     WvString geterror() const;
00110 
00111     /**
00112      * Reads data from the input buffer, encodes it, and writes the result
00113      * to the output buffer.
00114      * 
00115      * If flush == true, the input buffer will be drained and the output
00116      * buffer will contain all of the encoded data including any that
00117      * might have been buffered internally from previous calls.  Thus it
00118      * is possible that new data will be written to the output buffer even
00119      * though the input buffer was empty when encode() was called.  If the
00120      * buffer could not be fully drained because there was insufficient
00121      * data, this function returns false and leaves the remaining unflushed
00122      * data in the buffer.
00123      * 
00124      * If flush == false, the encoder will read and encode as much data
00125      * as possible (or as is convenient) from the input buffer and store
00126      * the results in the output buffer.  Partial results may be buffered
00127      * internally by the encoder to be written to the output buffer later
00128      * when the encoder is flushed.
00129      * 
00130      * If finish == true, the encode() will be followed up by a call to
00131      * finish().  The return values will be ANDed together to yield the
00132      * final result.  Most useful when flush is also true.
00133      *
00134      * If a permanent error occurs, then isok() will return false, this
00135      * function will return false and the input buffer will be left in an
00136      * undefined state.
00137      * 
00138      * If a recoverable error occurs, the encoder should discard the
00139      * problematic data from the input buffer and return false from this
00140      * function, but isok() will remain true.
00141      * 
00142      * An encoder might become isfinished() == true if an
00143      * encoder-specific end-of-data marker was detected in the input.
00144      * 
00145      * "inbuf" is the input buffer
00146      * "outbuf" is the output buffer
00147      * "flush" is if true, flushes the encoder
00148      * "finish" is if true, calls finish() on success
00149      * Returns: true on success
00150      * @see _encode for the actual implementation
00151      */
00152     bool encode(WvBuf &inbuf, WvBuf &outbuf, bool flush = false,
00153         bool finish = false);
00154 
00155     /**
00156      * Flushes the encoder and optionally finishes it.
00157      *
00158      * "inbuf" is the input buffer
00159      * "outbuf" is the output buffer
00160      * "finish" is if true, calls finish() on success
00161      * Returns: true on success
00162      */
00163     bool flush(WvBuf &inbuf, WvBuf &outbuf,
00164         bool finish = false)
00165         { return encode(inbuf, outbuf, true, finish); }
00166 
00167     /**
00168      * Tells the encoder that NO MORE DATA will ever be encoded.
00169      * 
00170      * The encoder will flush out any internally buffered data
00171      * and write out whatever end-of-data marking it needs to the
00172      * supplied output buffer before returning.
00173      * 
00174      * Clients should invoke flush() on the input buffer before
00175      * finish() if the input buffer was not yet empty.
00176      * 
00177      * It is safe to call this function multiple times.
00178      * Subsequent calls simply return isok() and do nothing else.
00179      * 
00180      * "outbuf" is the output buffer
00181      * Returns: true on success
00182      * @see _finish for the actual implementation
00183      */
00184     bool finish(WvBuf &outbuf);
00185 
00186     /**
00187      * Asks an encoder to reset itself to its initial state at
00188      * creation time, if supported.
00189      * 
00190      * This function may be called at any time, even if
00191      * isok() == false, or isfinished() == true.
00192      * 
00193      * If the behaviour is not supported or an error occurs,
00194      * then false is returned and afterwards isok() == false.
00195      * 
00196      * Returns: true on success
00197      * @see _reset for the actual implementation
00198      */
00199     bool reset();
00200 
00201     /**
00202      * Flushes data through the encoder from a string to a buffer.
00203      *
00204      * "instr" is the input string
00205      * "outbuf" is the output buffer
00206      * "finish" is if true, calls finish() on success
00207      * Returns: true on success
00208      */
00209     bool flushstrbuf(WvStringParm instr, WvBuf &outbuf,
00210         bool finish = false);
00211         
00212     /**
00213      * Flushes data through the encoder from a string to a string.
00214      * 
00215      * The output data is appended to the target string.
00216      * 
00217      * "instr" is the input string
00218      * "outstr" is the output string
00219      * "finish" is if true, calls finish() on success
00220      * Returns: true on success
00221      */
00222     bool flushstrstr(WvStringParm instr, WvString &outstr,
00223         bool finish = false);
00224 
00225     /**
00226      * Encodes data from a buffer to a string.
00227      * 
00228      * The output data is appended to the target string.
00229      * 
00230      * "inbuf" is the input buffer
00231      * "outstr" is the output string
00232      * "flush" is if true, flushes the encoder
00233      * "finish" is if true, calls finish() on success
00234      * Returns: true on success
00235      */   
00236     bool encodebufstr(WvBuf &inbuf, WvString &outstr,
00237         bool flush = false, bool finish = false);
00238 
00239     /**
00240      * Flushes data through the encoder from a buffer to a string.
00241      * 
00242      * The output data is appended to the target string.
00243      * 
00244      * "inbuf" is the input buffer
00245      * "outstr" is the output string
00246      * "finish" is if true, calls finish() on success
00247      * Returns: true on success
00248      */   
00249     bool flushbufstr(WvBuf &inbuf, WvString &outstr,
00250         bool finish = false)
00251         { return encodebufstr(inbuf, outstr, true, finish); }
00252     
00253     /**
00254      * Flushes data through the encoder from a string to a string.
00255      *
00256      * "instr" is the input string
00257      * "finish" is if true, calls finish() on success
00258      * Returns: the resulting encoded string, does not signal errors
00259      */   
00260     WvString strflushstr(WvStringParm instr, bool finish = false);
00261     
00262     /**
00263      * Flushes data through the encoder from a buffer to a string.
00264      *
00265      * "inbuf" is the input buffer
00266      * "finish" is if true, calls finish() on success
00267      * Returns: the resulting encoded string, does not signal errors
00268      */   
00269     WvString strflushbuf(WvBuf &inbuf, bool finish = false);
00270 
00271     /**
00272      * Flushes data through the encoder from memory to a buffer.
00273      *
00274      * "inmem" is the input data pointer
00275      * "inlen" is the input data length
00276      * "outbuf" is the output buffer
00277      * "finish" is if true, calls finish() on success
00278      * Returns: true on success
00279      */
00280     bool flushmembuf(const void *inmem, size_t inlen, WvBuf &outbuf,
00281         bool finish = false);
00282         
00283     /**
00284      * Flushes data through the encoder from memory to memory.
00285      * 
00286      * The outlen parameter specifies by reference
00287      * the length of the output buffer.  It is updated in place to
00288      * reflect the number of bytes copied to the output buffer.
00289      * If the buffer was too small to hold the data, the overflow
00290      * bytes will be discarded and false will be returned.
00291      * 
00292      * "inmem" is the input data pointer
00293      * "inlen" is the input data length
00294      * "outmem" is the output data pointer
00295      * "outlen" is the output data length, by reference
00296      * "finish" is if true, calls finish() on success
00297      * Returns: true on success
00298      */
00299     bool flushmemmem(const void *inmem, size_t inlen, void *outmem,
00300         size_t *outlen, bool finish = false);
00301         
00302     /**
00303      * Encodes data from a buffer to memory.
00304      * 
00305      * The outlen parameter specifies by reference
00306      * the length of the output buffer.  It is updated in place to
00307      * reflect the number of bytes copied to the output buffer.
00308      * If the buffer was too small to hold the data, the overflow
00309      * bytes will be discarded and false will be returned.
00310      * 
00311      * "inbuf" is the input buffer
00313      * "outmem" is the output data pointer
00314      * "outlen" is the output data length, by reference
00315      * "flush" is if true, flushes the encoder
00316      * "finish" is if true, calls finish() on success
00317      * Returns: true on success
00318      */
00319     bool encodebufmem(WvBuf &inbuf, void *outmem, size_t *outlen,
00320         bool flush = false, bool finish = false);   
00321         
00322     /**
00323      * Flushes data through the encoder from a buffer to memory.
00324      * 
00325      * The outlen parameter specifies by reference
00326      * the length of the output buffer.  It is updated in place to
00327      * reflect the number of bytes copied to the output buffer.
00328      * If the buffer was too small to hold the data, the overflow
00329      * bytes will be discarded and false will be returned.
00330      * 
00331      * "inbuf" is the input buffer
00332      * "outmem" is the output data pointer
00333      * "outlen" is the output data length, by reference
00334      * "finish" is if true, calls finish() on success
00335      * Returns: true on success
00336      */
00337     bool flushbufmem(WvBuf &inbuf, void *outmem, size_t *outlen,
00338         bool finish = false)
00339         { return encodebufmem(inbuf, outmem, outlen, true, finish); }
00340 
00341     /**
00342      * Flushes data through the encoder from a string to memory.
00343      * 
00344      * The outlen parameter specifies by reference
00345      * the length of the output buffer.  It is updated in place to
00346      * reflect the number of bytes copied to the output buffer.
00347      * If the buffer was too small to hold the data, the overflow
00348      * bytes will be discarded and false will be returned.
00349      * 
00350      * "instr" is the input string
00351      * "outmem" is the output data pointer
00352      * "outlen" is the output data length, by reference
00353      * "finish" is if true, calls finish() on success
00354      * Returns: true on success
00355      */
00356     bool flushstrmem(WvStringParm instr, void *outmem, size_t *outlen,
00357         bool finish = false);
00358 
00359     /**
00360      * Flushes data through the encoder from memory to a string.
00361      *
00362      * "inmem" is the input data pointer
00363      * "inlen" is the input data length
00364      * "finish" is if true, calls finish() on success
00365      * Returns: the resulting encoded string, does not signal errors
00366      */
00367     WvString strflushmem(const void *inmem, size_t inlen, bool finish = false);
00368 
00369 protected:
00370     /** Sets 'okay' to false explicitly. */
00371     void setnotok()
00372         { okay = false; }
00373 
00374     /** Sets an error condition, then setnotok(). */
00375     void seterror(WvStringParm message)
00376         { errstr = message; setnotok(); }
00377 
00378     /** Sets an error condition, then setnotok(). */
00379     void seterror(WVSTRING_FORMAT_DECL)
00380         { seterror(WvString(WVSTRING_FORMAT_CALL)); }
00381 
00382     /** Sets 'finished' to true explicitly. */
00383     void setfinished()
00384         { finished = true; }
00385 
00386 protected:
00387     /**
00388      * Template method implementation of isok().
00389      * 
00390      * Not called if any of the following cases are true:
00391      * 
00392      *  - okay == false
00393      * 
00394      * 
00395      * Most implementations do not need to override this.
00396      * 
00397      * Returns: true if the encoder is ok
00398      * @see setnotok
00399      */
00400     virtual bool _isok() const
00401         { return true; }
00402 
00403     /**
00404      * Template method implementation of isfinished().
00405      * 
00406      * Not called if any of the following cases are true:
00407      * 
00408      *  - finished == true
00409      * 
00410      * 
00411      * Most implementations do not need to override this.
00412      * 
00413      * Returns: true if the encoder is finished
00414      * @see setfinished
00415      */
00416     virtual bool _isfinished() const
00417         { return false; }
00418 
00419     /**
00420      * Template method implementation of geterror().
00421      * 
00422      * Not called if any of the following cases are true:
00423      * 
00424      *  - isok() == true
00425      *  - errstr is not null
00426      * 
00427      * 
00428      * Most implementations do not need to override this.
00429      * 
00430      * Returns: the error message, or the null string if _isok() == true
00431      * @see seterror
00432      */
00433     virtual WvString _geterror() const
00434         { return WvString::null; }
00435 
00436     /**
00437      * Template method implementation of encode().
00438      * 
00439      * Not called if any of the following cases are true:
00440      * 
00441      *  - okay == false
00442      *  - finished == true
00443      *  - inbuf.used() == 0 && flush == false
00444      * 
00445      * 
00446      * All implementations MUST define this.
00447      * 
00448      * If you also override _isok() or _isfinished(), note that they
00449      * will NOT be consulted when determining whether or not to
00450      * invoke this function.  This allows finer control over the
00451      * semantics of isok() and isfinished() with respect to encode().
00452      * 
00453      * "inbuf" is the input buffer
00454      * "outbuf" is the output buffer
00455      * "flush" is if true, flushes the encoder
00456      * Returns: true on success
00457      * @see encode
00458      */
00459     virtual bool _encode(WvBuf &inbuf, WvBuf &outbuf, bool flush) = 0;
00460 
00461     /**
00462      * Template method implementation of finish().
00463      * 
00464      * Not called if any of the following cases are true:
00465      * 
00466      *  - okay == false
00467      *  - finished == true
00468      * 
00469      * 
00470      * The encoder is marked finished AFTER this function exits.
00471      * 
00472      * Many implementations do not need to override this.
00473      * 
00474      * If you also override _isok() or _isfinished(), note that they
00475      * will NOT be consulted when determining whether or not to
00476      * invoke this function.  This allows finer control over the
00477      * semantics of isok() and isfinished() with respect to finish().
00478      * 
00479      * "outbuf" is the output buffer
00480      * Returns: true on success
00481      * @see finish
00482      */
00483     virtual bool _finish(WvBuf &outbuf)
00484         { return true; }
00485 
00486     /**
00487      * Template method implementation of reset().
00488      * 
00489      * When this method is invoked, the current local state will
00490      * be okay == true and finished == false.  If false is returned,
00491      * then okay will be set to false.
00492      * 
00493      * May set a detailed error message if an error occurs.
00494      * 
00495      * Returns: true on success, false on error or if not supported
00496      * @see reset
00497      */
00498     virtual bool _reset()
00499         { return false; }
00500 };
00501 
00502 
00503 /** An encoder that discards all of its input. */
00504 class WvNullEncoder : public WvEncoder
00505 {
00506 protected:
00507     virtual bool _encode(WvBuf &in, WvBuf &out, bool flush);
00508     virtual bool _reset(); // supported: does nothing
00509 };
00510 
00511 
00512 /**
00513  * A very efficient passthrough encoder that just merges the
00514  * input buffer into the output buffer.
00515  * 
00516  * Counts the number of bytes it has processed.
00517  * 
00518  * Supports reset().
00519  * 
00520  */
00521 class WvPassthroughEncoder : public WvEncoder
00522 {
00523     size_t total;
00524     
00525 public:
00526     WvPassthroughEncoder();
00527     virtual ~WvPassthroughEncoder() { }
00528 
00529     /**
00530      * Returns the number of bytes processed so far.
00531      * Returns: the number of bytes
00532      */
00533     size_t bytes_processed() { return total; }
00534     
00535 protected:
00536     virtual bool _encode(WvBuf &in, WvBuf &out, bool flush);
00537     virtual bool _reset(); // supported: resets the count to zero
00538 };
00539 
00540 
00541 /**
00542  * An encoder chain owns a list of encoders that are used in sequence
00543  * to transform data from a source buffer to a target buffer.
00544  * 
00545  * Supports reset() if all the encoders it contains also support
00546  * reset().
00547  * 
00548  */
00549 class WvEncoderChain : public WvEncoder
00550 {
00551     class WvEncoderChainElem
00552     {
00553     public:
00554         WvEncoder *enc;
00555         WvDynBuf out;
00556         bool auto_free;
00557 
00558         WvEncoderChainElem(WvEncoder *enc, bool auto_free) :
00559             enc(enc), auto_free(auto_free) { }
00560         ~WvEncoderChainElem() { if (auto_free) delete enc; }
00561     };
00562     DeclareWvList2(WvEncoderChainElemListBase, WvEncoderChainElem);
00563 
00564     WvEncoderChainElemListBase encoders;
00565     WvPassthroughEncoder passthrough;
00566 public:
00567     /** Creates an initially empty chain of encoders. */
00568     WvEncoderChain();
00569 
00570     /**
00571      * Destroys the encoder chain.
00572      * 
00573      * Destroys any encoders that were added with auto_free == true.
00574      * 
00575      */
00576     virtual ~WvEncoderChain();
00577 
00578     /**
00579      * Appends an encoder to the tail of the chain.
00580      *
00581      * "enc" is the encoder
00582      * "auto_free" is if true, takes ownership of the encoder
00583      */
00584     void append(WvEncoder *enc, bool auto_free);
00585 
00586     /**
00587      * Prepends an encoder to the head of the chain.
00588      *
00589      * "enc" is the encoder
00590      * "auto_free" is if true, takes ownership of the encoder
00591      */
00592     void prepend(WvEncoder *enc, bool auto_free);
00593 
00594     /**
00595      * Unlinks the encoder from the chain.
00596      * 
00597      * Destroys the encoder if it was added with auto_free == true.
00598      * 
00599      * "enc" is the encoder
00600      */
00601     void unlink(WvEncoder *enc);
00602 
00603     /**
00604      * Clears the encoder chain.
00605      * 
00606      * Destroys any encoders that were added with auto_free == true.
00607      * 
00608      */
00609     void zap();
00610 
00611 protected:
00612     /**
00613      * Returns true if the encoder has not encountered an error.
00614      * 
00615      * WvEncoderChain is special in that it may transition from
00616      * isok() == false to isok() == true if the offending encoders
00617      * are removed from the list.
00618      * 
00619      * Returns: true iff all encoders return isok() == true
00620      * @see WvEncoder::_isok
00621      */
00622     virtual bool _isok() const;
00623     
00624     /**
00625      * Returns true if the encoder can no longer encode data.
00626      * 
00627      * WvEncoderChain is special in that it may transition from
00628      * isfinished() == true to isfinished() == false if the offending
00629      * encoders are removed from the list, but not if finish() is
00630      * called.
00631      * 
00632      * Returns: false iff all encoders return isfinished() == false
00633      */
00634     virtual bool _isfinished() const;
00635 
00636     /**
00637      * Returns the error message, if any.
00638      * 
00639      * WvEncoderChain is special in that it may transition from
00640      * !geterror() == false to !geterror() == true if the offending
00641      * encoders are removed from the list.
00642      * 
00643      * Returns: the first non-null error message in the chain
00644      */
00645     virtual WvString _geterror() const;
00646     
00647     /**
00648      * Passes the data through the entire chain of encoders.
00649      *
00650      * Returns: true iff all encoders return true.
00651      */
00652     virtual bool _encode(WvBuf &in, WvBuf &out, bool flush);
00653     
00654     /**
00655      * Finishes the chain of encoders.
00656      * 
00657      * Invokes finish() on the first encoder in the chain, then
00658      * flush() on the second encoder if new data was generated,
00659      * then finish() on the second encoder, and so on until all
00660      * encoders have been flushed and finished (assuming the first
00661      * encoder had already been flushed).
00662      * 
00663      * Returns: true iff all encoders return true.
00664      */
00665     virtual bool _finish(WvBuf & out);
00666 
00667     /**
00668      * Resets the chain of encoders.
00669      * 
00670      * Resets all of the encoders in the chain and discards any
00671      * pending buffered input.  Preserves the list of encoders.
00672      * 
00673      * Returns: true iff all encoders return true.
00674      */
00675     virtual bool _reset();
00676 };
00677 
00678 #endif // __WVENCODER_H
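
The header above documents a template-method split: the public encode() and finish() check the encoder's state and only then call the protected _encode() and _finish(). The following is a minimal, self-contained sketch of that gating, not the WvStreams implementation. SketchEncoder, UpperEncoder, the Buf typedef (a std::string standing in for WvBuf) and the finish_too parameter name are all hypothetical, and returning false for a skipped call is an assumption, since the header does not state that return value.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for WvBuf: a std::string used as a byte queue.
typedef std::string Buf;

// Simplified model of the public/protected split documented for WvEncoder.
class SketchEncoder
{
protected:
    bool okay;
    bool finished;

public:
    SketchEncoder() : okay(true), finished(false) { }
    virtual ~SketchEncoder() { }

    bool isok() const { return okay; }
    bool isfinished() const { return finished; }

    // Applies the documented guards, then delegates to _encode().
    bool encode(Buf &inbuf, Buf &outbuf, bool flush = false,
        bool finish_too = false)
    {
        // Assumption: a skipped call is reported as false.
        bool success = okay && !finished && (!inbuf.empty() || flush);
        if (success)
            success = _encode(inbuf, outbuf, flush);
        if (finish_too)
            success = finish(outbuf) && success; // results are ANDed
        return success;
    }

    bool finish(Buf &outbuf)
    {
        if (finished)
            return isok(); // repeated calls do nothing else
        bool success = okay && _finish(outbuf);
        finished = true; // marked finished AFTER _finish() exits
        return success;
    }

protected:
    virtual bool _encode(Buf &inbuf, Buf &outbuf, bool flush) = 0;
    virtual bool _finish(Buf &) { return true; }
};

// Hypothetical subclass: uppercases its input and counts _encode() calls.
class UpperEncoder : public SketchEncoder
{
public:
    int calls;
    UpperEncoder() : calls(0) { }

protected:
    virtual bool _encode(Buf &inbuf, Buf &outbuf, bool)
    {
        ++calls;
        for (std::string::size_type i = 0; i < inbuf.size(); ++i)
            outbuf += static_cast<char>(
                std::toupper(static_cast<unsigned char>(inbuf[i])));
        inbuf.clear(); // the input is fully consumed
        return true;
    }
};
```

In this sketch a subclass only supplies _encode(); the guards for an error state, a finished encoder, and an empty input without flush live in one place, which is the design benefit the "Not called if..." notes in the header point at.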

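WvEncoderChain is declared above with one intermediate output buffer per chain element (the "out" member of WvEncoderChainElem). The sketch below shows one way such a chain could move data: each stage reads from the previous stage's buffer and the last stage writes to the caller's output. It is an assumption-laden illustration, not the library's code. SketchChain, StageEncoder, DupEncoder, DashEncoder and Buf are hypothetical names, ownership (auto_free) is left out, and treating an empty chain as a plain copy is a guess based on the WvPassthroughEncoder member in the header.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::string Buf; // hypothetical stand-in for WvBuf

// Hypothetical minimal encoder interface for the sketch.
struct StageEncoder
{
    virtual ~StageEncoder() { }
    virtual bool encode(Buf &inbuf, Buf &outbuf, bool flush) = 0;
};

// Writes every input byte twice.
struct DupEncoder : StageEncoder
{
    virtual bool encode(Buf &inbuf, Buf &outbuf, bool)
    {
        for (std::size_t i = 0; i < inbuf.size(); ++i)
            outbuf.append(2, inbuf[i]);
        inbuf.clear();
        return true;
    }
};

// Writes a '-' after every input byte.
struct DashEncoder : StageEncoder
{
    virtual bool encode(Buf &inbuf, Buf &outbuf, bool)
    {
        for (std::size_t i = 0; i < inbuf.size(); ++i)
        {
            outbuf += inbuf[i];
            outbuf += '-';
        }
        inbuf.clear();
        return true;
    }
};

class SketchChain
{
    // One intermediate buffer per stage, as in WvEncoderChainElem.
    struct Elem { StageEncoder *enc; Buf out; };
    std::vector<Elem> elems;

public:
    // Appends a stage to the tail; the chain does not own it.
    void append(StageEncoder *enc)
    {
        Elem e;
        e.enc = enc;
        elems.push_back(e);
    }

    // Runs the data through every stage; true iff all stages return true.
    bool encode(Buf &inbuf, Buf &outbuf, bool flush)
    {
        if (elems.empty())
        {
            outbuf += inbuf; // assumption: empty chain acts as passthrough
            inbuf.clear();
            return true;
        }
        bool success = true;
        Buf *src = &inbuf;
        for (std::size_t i = 0; i < elems.size(); ++i)
        {
            // The last stage writes straight to the caller's buffer.
            Buf *dst = (i + 1 == elems.size()) ? &outbuf : &elems[i].out;
            if (!elems[i].enc->encode(*src, *dst, flush))
                success = false;
            src = dst;
        }
        return success;
    }
};
```

Because each stage consumes the previous stage's output, the order of append() calls changes the result: duplicating and then dashing "ab" differs from dashing and then duplicating it.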
Generated on Wed Dec 15 15:08:11 2004 for WvStreams by doxygen 1.3.9.1