HTParse: URL parsing in the WWW Library
HTParse
This module of the WWW library contains
code to parse URLs and various related
things. Implemented by HTParse.c.
#ifndef HTPARSE_H
#define HTPARSE_H
#include "HTUtils.h"
The following are flag bits which
may be ORed together to form a number
to give the 'wanted' argument to
HTParse.
#define PARSE_ACCESS 16
#define PARSE_HOST 8
#define PARSE_PATH 4
#define PARSE_ANCHOR 2
#define PARSE_PUNCTUATION 1
#define PARSE_ALL 31
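As a small illustration of how the flag bits combine, the sketch below
repeats the definitions above and adds a hypothetical helper, mask_has,
which is not part of the library. It only shows the bit arithmetic and
does not call HTParse itself.

```c
/* Flag values repeated from the definitions above. */
#define PARSE_ACCESS 16
#define PARSE_HOST 8
#define PARSE_PATH 4
#define PARSE_ANCHOR 2
#define PARSE_PUNCTUATION 1
#define PARSE_ALL 31

/* Hypothetical helper (not in the library): returns non-zero if
** the given flag bit is set in the 'wanted' mask. */
int mask_has(int wanted, int flag)
{
    return (wanted & flag) != 0;
}
```

A caller interested only in the access scheme and host, with the
separating punctuation, would pass
PARSE_ACCESS | PARSE_HOST | PARSE_PUNCTUATION, which is 25.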
HTParse: Parse a URL relative to
another URL
This returns those parts of a name
which are given (and requested) substituting
bits from the related name where
necessary.
On entry
- aName
- A filename given
- relatedName
- A name relative to which
aName is to be parsed
- wanted
- A mask for the bits which
are wanted.
On exit,
- returns
- A pointer to a malloc'd string
which MUST BE FREED
extern char * HTParse PARAMS((const char * aName, const char * relatedName, int wanted));
HTStrip: Strip white space off a
string
On exit
Return value points to first non-white
character, or to 0 if none.
All trailing white space is OVERWRITTEN
with zero.
#ifdef __STDC__
extern char * HTStrip(char * s);
#else
extern char * HTStrip();
#endif
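The behaviour described for HTStrip can be sketched as follows. This
is not the code in HTParse.c: the name strip_sketch is hypothetical,
white space is taken to mean whatever isspace() accepts, and "points
to 0 if none" is assumed to mean a pointer to the terminating zero.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical sketch of the stripping described above. */
char *strip_sketch(char *s)
{
    size_t len = strlen(s);

    /* Overwrite every trailing white space character with zero. */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';

    /* Skip leading white space; stops at the terminating zero. */
    while (isspace((unsigned char)*s))
        s++;

    return s;
}
```

Because the result may point into the middle of the buffer, a caller
that allocated the buffer must free the original pointer, not the
returned one.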
HTSimplify: Simplify a URL
A URL is allowed to contain the sequence
xxx/../ which may be replaced by
"", and the sequence "/./" which
may be replaced by "/". Simplification
helps us recognize duplicate filenames.
It doesn't deal with soft links,
though. The new (shorter) filename
overwrites the old.
/*
** Thus, /etc/junk/../fred becomes /etc/fred
** /etc/junk/./fred becomes /etc/junk/fred
*/
#ifdef __STDC__
extern void HTSimplify(char * filename);
#else
extern void HTSimplify();
#endif
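The two rewrites can be sketched in place on a plain path, as below.
This is an illustration only, not the code in HTParse.c: the name
simplify_sketch is hypothetical, the input is assumed to be a path
without an access scheme or host, and a segment is collapsed only when
it follows a slash and is not itself "..".

```c
#include <string.h>

/* Hypothetical sketch: collapse "/./" and "xxx/../" in place. */
void simplify_sketch(char *path)
{
    char *p = path;

    while (*p != '\0') {
        if (strncmp(p, "/./", 3) == 0) {
            /* "/./" becomes "/": drop the two characters "/." */
            memmove(p, p + 2, strlen(p + 2) + 1);
        } else if (strncmp(p, "/../", 4) == 0 && p > path) {
            /* Walk back to the slash that starts the previous segment. */
            char *q = p - 1;
            while (q > path && *q != '/')
                q--;
            if (*q == '/' && !(p - q == 3 && q[1] == '.' && q[2] == '.')) {
                /* Remove "/xxx/..", keeping the slash that follows. */
                memmove(q, p + 3, strlen(p + 3) + 1);
                p = q;      /* re-examine from the joined position */
            } else {
                p++;
            }
        } else {
            p++;
        }
    }
}
```

The string only ever shrinks, which is why the result can overwrite
the original buffer.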
HTRelative: Make Relative (Partial)
URL
This function creates and returns
a string which gives an expression
of one address as related to another.
Where there is no relation, an absolute
address is returned.
On entry,
Both names must be absolute, fully
qualified names of nodes (no anchor
bits)
On exit,
The return result points to a newly
allocated name which, if parsed by
HTParse relative to relatedName,
will yield aName. The caller is responsible
for freeing the resulting name later.
#ifdef __STDC__
extern char * HTRelative(const char * aName, const char *relatedName);
#else
extern char * HTRelative();
#endif
HTEscape: Encode unacceptable characters
in string
This function takes a string containing
any sequence of ASCII characters,
and returns a malloced string containing
the same information but with all
"unacceptable" characters represented
in the form %xy where x and y are
two hex digits.
extern char * HTEscape PARAMS((CONST char * str, unsigned char mask));
The following are valid mask values.
The terms are the BNF names in the
URL document.
#define URL_XALPHAS (unsigned char) 1
#define URL_XPALPHAS (unsigned char) 2
#define URL_PATH (unsigned char) 4
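The %xy encoding itself can be sketched as follows. The sketch does
not reproduce the library's table of acceptable characters or its
mask handling: the name escape_sketch is hypothetical, and it simply
assumes that everything other than ASCII letters and digits is
"unacceptable".

```c
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch only: letters and digits are acceptable. */
static int acceptable_sketch(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9');
}

/* Hypothetical sketch: returns a malloc'd string which the caller
** must free, or NULL if allocation fails. */
char *escape_sketch(const char *str)
{
    static const char hex[] = "0123456789ABCDEF";
    char *out = malloc(3 * strlen(str) + 1);   /* worst case: all escaped */
    char *q = out;

    if (out == NULL)
        return NULL;
    for (; *str != '\0'; str++) {
        unsigned char c = (unsigned char)*str;
        if (acceptable_sketch(c)) {
            *q++ = (char)c;
        } else {
            *q++ = '%';
            *q++ = hex[c >> 4];     /* high hex digit */
            *q++ = hex[c & 15];     /* low hex digit */
        }
    }
    *q = '\0';
    return out;
}
```

Under these assumptions the string "a b/c" is encoded as "a%20b%2Fc".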
HTUnEscape: Decode %xx escaped characters
This function takes a pointer to
a string in which characters may
have been encoded in %xy form, where
xy is the ASCII hex code for character
16x+y. The string is converted in
place, as it will never grow.
extern char * HTUnEscape PARAMS(( char * str));
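In-place decoding can be sketched as below. The names hex_value_sketch
and unescape_sketch are hypothetical, and two choices are assumptions
rather than documented behaviour: a '%' not followed by two hex digits
is copied through unchanged, and the return value is the original
pointer.

```c
/* Hypothetical helper: value of a hex digit, or -1 if not one. */
static int hex_value_sketch(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical sketch: decode %xy sequences in place. */
char *unescape_sketch(char *str)
{
    char *src = str;    /* read position */
    char *dst = str;    /* write position, never ahead of src */

    while (*src != '\0') {
        if (src[0] == '%' && hex_value_sketch(src[1]) >= 0 &&
            hex_value_sketch(src[2]) >= 0) {
            /* Character code is 16x+y. */
            *dst++ = (char)(16 * hex_value_sketch(src[1]) +
                            hex_value_sketch(src[2]));
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return str;
}
```

Each three-character escape shrinks to one character, so the write
position can never overtake the read position.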
Prevent Security Holes
HTCleanTelnetString()
makes sure that the given string
doesn't contain characters that could cause security holes, such as
newlines in ftp, gopher, news or telnet URLs; more specifically:
allows everything between hexadecimal ASCII 20-7E, and also A0-FE,
inclusive.
- str
- the string that is *modified* if necessary. The string will be
truncated at the first illegal character that is encountered.
- returns
- YES, if the string was modified.
NO, otherwise.
PUBLIC BOOL HTCleanTelnetString PARAMS((char * str));
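The check described above can be sketched as follows. The name
clean_sketch is hypothetical, and plain int values 1 and 0 stand in
for the library's YES and NO.

```c
/* Hypothetical sketch: truncate at the first character outside
** the ranges 20-7E and A0-FE (hexadecimal), inclusive.
** Returns 1 if the string was modified, 0 otherwise. */
int clean_sketch(char *str)
{
    unsigned char *p;

    for (p = (unsigned char *)str; *p != '\0'; p++) {
        int ok = (*p >= 0x20 && *p <= 0x7E) ||
                 (*p >= 0xA0 && *p <= 0xFE);
        if (!ok) {
            *p = '\0';      /* truncate at the illegal character */
            return 1;
        }
    }
    return 0;
}
```

For example, a host string with an embedded newline followed by extra
commands is cut off at the newline, so the extra commands never reach
the remote server.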
#endif /* HTPARSE_H */
end of HTParse