OK, here's a second shot at documenting the TDS protocol. This document attempts to cover TDS 4.2 (Sybase SQLServer <= 10?, MS SQLServer6.5), TDS 5.0 (Sybase SQLServer >= 11?), and TDS 7.0 (MS SQLServer7).

Common Terms

Variable types used in this document:
  CHAR      8-bit char
    CHAR[6]   string of 6 chars
    CHAR[n]   variable length string
  INT8      8-bit int
  INT16     16-bit int
  INT32     32-bit int
  UCS2LE    Unicode in UCS2LE format
Note: FreeTDS uses TDS_TINYINT for INT8 and TDS_SMALLINT for INT16.

Typical Usage sequences

These are TDS 4.2 and not meant to be 100% correct, but I thought they might be helpful to get an overall view of what goes on.

--> Login
<-- Login acknowledgement

--> INSERT SQL statement
<-- Result Set Done

--> SELECT SQL statement
<-- Column Names
<-- Column Info
<-- Row Result
<-- Row Result
<-- Result Set Done

--> call stored procedure
<-- Column Names
<-- Column Info
<-- Row Result
<-- Row Result
<-- Done Inside Process
<-- Column Names
<-- Column Info
<-- Row Result
<-- Row Result
<-- Done Inside Process
<-- Return Status
<-- Process Done

The packet format

All packets start with the following 8 byte header.

 INT8       INT8          INT16      4 bytes
+----------+-------------+----------+--------------------+
|  packet  | last packet |  packet  |    unknown         |
|   type   |  indicator  |   size   |                    |
+----------+-------------+----------+--------------------+

Fields:
packet type 
     0x01 TDS 4.2 or 7.0 query
     0x02 TDS 4.2 or 5.0 login packet
     0x03 RPC
     0x04 responses from server
     0x06 cancels
     0x07 Used in Bulk
     0x0F TDS 5.0 query
     0x10 TDS 7.0 login packet
last packet indicator 
     0x00 if more packets
     0x01 if last packet
packet size
     (in network byte order)
unknown?
     always 0x00
     this has something to do with server to server communication/rpc stuff
The remainder of the packet depends on the type of information it is providing. As noted above, packets break down into the types query, login, response, and cancels. Response packets are further split into multiple sub-types denoted by the first byte (a.k.a. the token) following the above header.

Note: A TDS packet that is longer than 512 bytes is split on the 512 byte boundary and the "more packets" bit is set. The full TDS packet is reassembled from its component 512 byte packets with the 8-byte headers stripped out. (I believe the 512 is the block_size in the login packet, so it could be set to a different value. *mjs*)
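
Here is a minimal Python sketch of reading the 8 byte header and gluing split packets back together. The function names (parse_header, reassemble, make_packet) are made up for illustration. It assumes that "packet size" counts the 8 header bytes as well as the data, which this document does not state.

```python
import struct

HEADER_LEN = 8  # every packet starts with an 8 byte header


def parse_header(packet: bytes):
    """Return (packet type, last packet indicator, packet size)."""
    # ">BBH4x": INT8, INT8, INT16 in network byte order, 4 unknown bytes skipped
    return struct.unpack(">BBH4x", packet[:HEADER_LEN])


def reassemble(stream: bytes) -> bytes:
    """Concatenate packet data (headers stripped) up to the last packet."""
    payload = bytearray()
    pos = 0
    while pos + HEADER_LEN <= len(stream):
        ptype, last, size = parse_header(stream[pos:pos + HEADER_LEN])
        if size < HEADER_LEN:
            raise ValueError("packet size smaller than the header")
        # Assumption: size includes the header, so data runs to pos + size.
        payload += stream[pos + HEADER_LEN:pos + size]
        pos += size
        if last == 0x01:  # 0x01 marks the last packet
            break
    return bytes(payload)


def make_packet(ptype: int, last: int, data: bytes) -> bytes:
    """Build one packet; hypothetical helper used only for the demo."""
    return struct.pack(">BBH4x", ptype, last, HEADER_LEN + len(data)) + data


# Two response packets (type 0x04); the first has "more packets" set.
demo = make_packet(0x04, 0x00, b"abc") + make_packet(0x04, 0x01, b"def")
print(reassemble(demo))
```

A real client would read from a socket rather than a byte string, but the header handling would be the same.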

Login Packet

Packet type (first byte) is 2. The numbers on the left are decimal offsets including the 8 byte packet header.

byte   var type    description
------------------------------
   8   CHAR[30]    host_name
  38   INT8        host_name_length
  39   CHAR[30]    user_name
  69   INT8        user_name_length
  70   CHAR[30]    password
 100   INT8        password_length
 101   CHAR[30]    host_process
 131   INT8        host_process_length
 132   ?           magic1[6]          /* mystery stuff */
 138   INT8        bulk_copy 
 139   ?           magic2[9]          /* mystery stuff */
 148   CHAR[30]    app_name
 178   INT8        app_name_length
 179   CHAR[30]    server_name
 209   INT8        server_name_length
 210   ?           magic3[1]          /* 0, don't know this one either */
 211   INT8        password2_length
 212   CHAR[30]    password2
 242   CHAR[223]   magic4
 465   INT8        password2_length_plus2
 466   INT16       major_version      /* TDS version */
 468   INT16       minor_version      /* TDS version */
 470   CHAR[10]    library_name       /* "Ct-Library" or "DB-Library" */
 480   INT8        library_length
 481   INT16       major_version2     /* program version */
 483   INT16       minor_version2     /* program version */
 485   ?           magic6[3]          /* ? last two octets are 13 and 17 */
                                      /* bdw reports last two as 12 and 16 here  */
                                      /* possibly a bitset flag  */
 488   CHAR[30]    language           /* e.g. "us-english" */
 518   INT8        language_length
 519   ?           magic7[1]          /*  mystery stuff */
 520   INT16       old_secure         /* explanation? */
 522   INT8        encrypted          /* 1 means encrypted; all password fields are blank */
 523   ?           magic8[1]          /*  no clue... zeros */
 524   CHAR[9]     sec_spare          /* explanation? */
 533   CHAR[30]    char_set           /* e.g. "iso_1" */
 563   INT8        char_set_length
 564   INT8        magic9[1]          /* 1 */ 
 565   CHAR[6]     block_size         /*  in text */
 571   INT8        block_size_length 
 572   ?           magic10[25]        /* lots of stuff here...no clue */
Any help with the magic numbers would be most appreciated.
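
Most of the named fields above follow the same pattern: a fixed CHAR[30] buffer followed by an INT8 holding the used length. A minimal Python sketch of packing the first four fields (offsets 8 through 131) might look like this. The helper names are made up, and zero-filling the unused bytes and using ASCII are assumptions.

```python
def fixed_field(value: str, width: int = 30) -> bytes:
    """Encode CHAR[width] followed by an INT8 with the used length."""
    raw = value.encode("ascii")[:width]
    # Assumption: unused bytes of the buffer are zero-filled.
    return raw.ljust(width, b"\x00") + bytes([len(raw)])


def login_prefix(host: str, user: str, password: str, process: str) -> bytes:
    """Bytes for offsets 8..131 of the login packet (header not included)."""
    return b"".join(fixed_field(v) for v in (host, user, password, process))


prefix = login_prefix("myhost", "sa", "secret", "1234")

# Offsets in the table include the 8 byte header, so subtract 8 here.
print(len(prefix))       # 4 fields * 31 bytes
print(prefix[38 - 8])    # host_name_length
print(prefix[69 - 8])    # user_name_length
```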

TDS7.0 Login Packet

byte  var type  description
---------------------------
  0   INT16	total packet size
  2   INT8[5]?	00000
  7   INT8	TDS Version?	0x70 for TDS7, 0x80 for TDS8
  8   INT8[7]?	0000000
 15   INT8[4]	magic   0x0682f2f8
 19   INT8[4]	PID of client
 23   INT8[13]  magic  0x00e003000088ffffff36040000 
                (seen  0x00e0030000c4ffffff10040000)
                third byte is 0x83 for NT authentication
 36   INT16	position of client hostname (86)
 38   INT16	hostname length
 40   INT16	position of username
 42   INT16	username length
 44   INT16	position of password
 46   INT16	password length
 48   INT16	position of app name
 50   INT16	app name length
 52   INT16	position of server name
 54   INT16	server name length
 56   INT16	0
 58   INT16	0
 60   INT16	position of library name
 62   INT16	library name length
 64   INT16	position of language
 66   INT16	language name length
                (for italian "Italiano" coded UCS2)
 68   INT16	position of database name
 70   INT16	database name length
 72   INT8[6]	MAC address of client
 78   INT16	position of auth portion
 80   INT16	NT authentication length
 82   INT16	next position (same as total packet size)
 84   INT16	0
 86   UCS2LE[n] hostname
      UCS2LE[n]	username
      UCS2LE[n]	encrypted password
      UCS2LE[n]	app name
      UCS2LE[n]	server name
      UCS2LE[n]	library name
      UCS2LE[n]	language name
      UCS2LE[n]	database name
      NT Authentication packet

NT Authentication packet
  0   CHAR[8]	authentication id "NTLMSSP\0"
  8   INT32     1  message type
 12   INT32	0xb201 flags
 16   INT16	domain length
 18   INT16     domain length
 20   INT32     domain offset
 24   INT16     hostname length
 26   INT16     hostname length
 28   INT32     hostname offset
 32   CHAR[n]   hostname
      CHAR[n]   domain
See documentation on Samba for detail (or search ntlm authentication for IIS)
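
A minimal Python sketch of building the NT Authentication packet laid out above. The function name ntlm_type1 is made up. It assumes little-endian integers, plain 8-bit names, and that the two copies of each length carry the same value; this document does not spell out any of these.

```python
import struct


def ntlm_type1(hostname: bytes, domain: bytes) -> bytes:
    """Build the message: fixed 32 byte part, then hostname, then domain."""
    fixed_len = 32
    host_offset = fixed_len                  # hostname starts at offset 32
    domain_offset = host_offset + len(hostname)
    return (
        b"NTLMSSP\x00"                                   # authentication id
        + struct.pack("<II", 1, 0xB201)                  # message type, flags
        + struct.pack("<HHI", len(domain), len(domain), domain_offset)
        + struct.pack("<HHI", len(hostname), len(hostname), host_offset)
        + hostname
        + domain
    )


msg = ntlm_type1(b"WORKSTATION", b"DOMAIN")
print(msg.hex())
```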
"current pos" is the starting byte address for a Unicode string within the packet. The length of that Unicode string immediately follows. That implies there are at least 2 more strings that could be defined. (character set??)

Password and username are not used if NT authentication is used (they are set to empty).
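
A minimal Python sketch of resolving one (position, length) pair from the TDS 7.0 login packet to its string. The helper name read_string is made up. It assumes the INT16 fields are little-endian, that positions count from offset 0 of the TDS7.0 table, and that lengths count characters (2 bytes each in UCS2LE); none of this is stated in this document.

```python
import struct


def read_string(login: bytes, field_offset: int) -> str:
    """Read the INT16 position and INT16 length at field_offset, then decode."""
    pos, length = struct.unpack_from("<HH", login, field_offset)
    # Assumption: length is in characters, so the string spans 2 * length bytes.
    return login[pos:pos + 2 * length].decode("utf-16-le")


# Demo: an 86 byte fixed part with only the hostname pair (offsets 36/38) set.
hostname = "box1"
fixed = bytearray(86)
struct.pack_into("<HH", fixed, 36, 86, len(hostname))
login = bytes(fixed) + hostname.encode("utf-16-le")

print(read_string(login, 36))
```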

Collate type - TDS8

The collate structure contains information on character set encoding and comparison method.

 INT16      INT16    INT8
+----------+--------+------------+
| codepage | flags  | charset_id |
+----------+--------+------------+

codepage    windows codepage (see http://www.microsoft.com/globaldev/nlsweb/)
            also specified in lcid column of master..syslanguages
flags       sort flags
            0x100 binary compare
            0x080 width insensitive
            0x040 Kana type insensitive
            0x020 accent insensitive
            0x010 case insensitive
            If binary flag is specified other flags are not present
            Low nibble of flags is a charset specifier (like chinese dialect)
charset_id  charset id in master..syscharsets table, or zero for non-SQL collations

Collation names can be obtained with the query: select name from ::fn_helpcollations()
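
A minimal Python sketch of decoding the 5 byte collate structure as drawn above. The function name and the sample values are made up, and little-endian byte order is an assumption.

```python
import struct

# Sort flags as listed in this section.
SORT_FLAGS = {
    0x100: "binary compare",
    0x080: "width insensitive",
    0x040: "kana type insensitive",
    0x020: "accent insensitive",
    0x010: "case insensitive",
}


def decode_collate(raw: bytes) -> dict:
    """Split INT16 codepage, INT16 flags, INT8 charset_id."""
    codepage, flags, charset_id = struct.unpack("<HHB", raw[:5])
    return {
        "codepage": codepage,
        "flags": [name for bit, name in SORT_FLAGS.items() if flags & bit],
        "charset_specifier": flags & 0x0F,  # low nibble of flags
        "charset_id": charset_id,
    }


# Hypothetical sample: accent and case insensitive, charset id 52.
raw = struct.pack("<HHB", 1252, 0x0030, 52)
info = decode_collate(raw)
print(info)
```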


Server Responses

Responses from the server start with a single octet (token) identifying its type. If variable length, they generally have the length as the second and third bytes.

Tokens encountered thus far:

0x21    33   "Language packet" ? 	5.0 only, client-side?
0x71   113   "Logout"			5.0? ct_close(), client-side?
0x79   121   Return Status
0x7C   124   Process ID			4.2 only
0x81   129   7.0 Result			7.0 only
0xA0   160   Column Name		4.2 only
0xA1   161   Column Info --- Row Result	4.2 only
0xA4   164   "tabname" ?
0xA5   165   "col_info" ?
0xA7   167   compute related ?  Also "control" ?
0xA8   168   Column Info --- Compute Result
0xA9   169   Order By
0xAA   170   Error Message
0xAB   171   Non-error Message
0xAC   172   Output Parameters
0xAD   173   Login Acknowledgement
0xAE   174   "control" ?
0xD1   209   Data --- Row Result
0xD3   211   Data --- Compute Result
0xD7   215   "param packet" ? **bdw**
0xE2   226   "capability packet" ?
0xE3   227   Environment Change (database change, packet size, etc...)
0xE5   229   Extended Error Message
0xE6   230   "DBRPC" ? **bdw**
0xEC   236   "param format packet" ?
0xEE   238   Result Set
0xFD   253   Result Set Done 
0xFE   254   Process Done
0xFF   255   Done inside Process
 
"Language" (0x21 33)

 int?     INT8     CHAR[n]
+--------+--------+--------+
| length | status | string |
+--------+--------+--------+
 
"Logout" (0x71 113)

No information. (1 byte, value=0 ?)  


Return Status (0x79 121)

 4 bytes
+---------------+
| Return status |
+---------------+
The return value of a stored procedure.

Process ID (0x7C 124)

 8 bytes
+----------------+
| process number |
+----------------+
Presumably the process ID number for an executing stored procedure. (I'm not sure how this would ever be used by a client. *mjs*)

Result - TDS 7.0+ (0x81 129)

 INT16  
+----------+-------------+
| #columns | column_info | 
+----------+-------------+
The TDS 7.0 column_info is formatted as follows for each column:
 INT16  INT16  INT8   varies  varies     INT8[5]      INT8          UCS2LE[n]
+------+------+------+-------+----------+------------+-------------+---------+
| 0?   | 1?   | type | size  | optional | collate    | name length | name    | 
|      |      |      | (opt) | (opt)    | info(TDS8) |             |         |
+------+------+------+-------+----------+------------+-------------+---------+

type		data type, values >128 indicate a large type
size		none for fixed size types
		4 bytes for blob and text
		2 bytes for large types
		1 byte for all others
optional
                               INT8        INT8
                              +-----------+-------+
  numeric/decimal types:      | precision | scale |
                              +-----------+-------+

                               INT16               UCS2LE[n]
                              +-------------------+------------+
  blob/text types:            | table name length | table name |
                              +-------------------+------------+

  collate info is available only when using TDS8, and only for character types
  (but not for old types like short VARCHAR; only the 2-byte length versions)
 
Column Name (0xA0 160)

 INT16          INT8      CHAR[n]               INT8      CHAR[n] 
+--------------+---------+--------------+------+---------+--------------+
| total length | length1 | column1 name | .... | lengthN | columnN name |
+--------------+---------+--------------+------+---------+--------------+
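
A minimal Python sketch of walking the Column Name token. The function name is made up. It assumes the INT16 is little-endian and that "total length" counts the bytes that follow it.

```python
import struct


def parse_column_names(body: bytes) -> list:
    """body is everything after the 0xA0 token byte."""
    (total,) = struct.unpack_from("<H", body, 0)
    names = []
    pos, end = 2, 2 + total
    while pos < end:
        n = body[pos]                                   # INT8 length
        names.append(body[pos + 1:pos + 1 + n].decode("ascii"))
        pos += 1 + n
    return names


# Demo with two made-up column names.
items = b"\x04spid\x06status"
body = struct.pack("<H", len(items)) + items
print(parse_column_names(body))
```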
   
Column Info - Row Result (0xA1 161)
Column Info - Compute Result (0xA8 168)

 INT8          CHAR[n]        INT8      INT16     INT16     INT16  
+-------------+--------------+---------+---------+---------+---------+
| column name | column name  | unknown |  user   | unknown | column  |
|   length    |              |         |  type   |         |  type   |
+-------------+--------------+---------+---------+---------+---------+

 INT8          INT8       INT8       INT8       CHAR[n]      1 byte
+-------------+----------+----------+----------+------------+----------+
| column size |precision |  scale   | t length | table name | unknown  |
| (optional)  |(optional)|(optional)|(optional)| (optional) |          |
+-------------+----------+----------+----------+------------+----------+

column name length 
column name        column name in result set, not necessarily db column name
unknown            unknown (0, 16 ?)
user type          usertype column from syscolumns
unknown            always 0's
column type        (need an appendix for discussion of column types)
column size        not present for fixed size columns
precision          present only for SYBDECIMAL and SYBNUMERIC
scale              present only for SYBDECIMAL and SYBNUMERIC
t length           present only for SYBTEXT and SYBIMAGE, length of table name
table name         present only for SYBTEXT and SYBIMAGE
unknown            always 0x00
 
"tabname" (0xA4 164)

No information.  


"col info" (0xA5 165)

No information.    


compute "control" ? (0xA7 167)
"control" (0xAE 174)

Miscellaneous note (from *bdw* ?) found with 0xAE:

  has one byte for each column, 
  comes between result(238) and first row(209),
  I believe computed column info is stored here, need to investigate
 
Order By (0xA9 169)

 INT16    variable (1 byte per col)
+--------+---------+
| length | orders  |
+--------+---------+

length		Length of packet (and number of cols)
orders          one byte per order by indicating the
                column # in the output matching the
                order from Column Info and Column Names
                and data in following Row Data items.
                A 0 indicates the column is not in the
                resulting rows.

An example:
  select first_name, last_name, number from employee
  order by salary, number
Assuming the columns are returned in the order queried
(first_name, then last_name, then number), we would have:
----------------
|  2   | 0 | 3 |
----------------
where length = 2. The orders then evaluate as:
0 for salary, meaning there is no salary data returned
3 for number, meaning the 3rd data item corresponding
to a column is the number
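
The same example as a minimal Python sketch. The function name is made up, and the INT16 length is assumed to be little-endian.

```python
import struct


def parse_order_by(body: bytes) -> list:
    """body is everything after the 0xA9 token byte."""
    (length,) = struct.unpack_from("<H", body, 0)
    return list(body[2:2 + length])      # one byte per order-by column


# length = 2, then 0 (salary, not returned) and 3 (number, third column)
body = b"\x02\x00\x00\x03"
orders = parse_order_by(body)

columns = ["first_name", "last_name", "number"]
# Column numbers are 1-based; 0 means the column is not in the result.
resolved = [columns[i - 1] if i else None for i in orders]
print(orders, resolved)
```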
 
Error Message (0xAA 170)
Non-error Message (0xAB 171)
Extended Error Message (0xE5 229)
 INT16    4 bytes      INT8    INT8    
+--------+------------+-------+-------+
| length | msg number | state | level |
+--------+------------+-------+-------+

 INT16      CHAR[n]   INT8       CHAR[n]   INT8       CHAR[n]   INT16  
+----------+---------+----------+---------+----------+---------+-------+
| m length | message | s length | server  | p length | process | line# |
+----------+---------+----------+---------+----------+---------+-------+

length		Length of packet
msg number	SQL message number
state		?
level		An error if level > 10, a message if level <= 10
m length	Length of message
message		Text of error/message
s length	Length of server name
server		Name of "server" ?
p length	Length of process name
process name	Stored procedure name, if any
line#		Line number of input which generated the message
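
A minimal Python sketch of parsing these message tokens. The names parse_message and build_message are made up, the sample values are invented, and little-endian integers and ASCII text are assumptions.

```python
import struct


def parse_message(body: bytes) -> dict:
    """body is everything after the 0xAA / 0xAB / 0xE5 token byte."""
    length, number, state, level, mlen = struct.unpack_from("<HIBBH", body, 0)
    pos = 10                                  # size of the fields read so far
    message = body[pos:pos + mlen].decode("ascii")
    pos += mlen
    slen = body[pos]
    server = body[pos + 1:pos + 1 + slen].decode("ascii")
    pos += 1 + slen
    plen = body[pos]
    process = body[pos + 1:pos + 1 + plen].decode("ascii")
    pos += 1 + plen
    (line,) = struct.unpack_from("<H", body, pos)
    return {
        "number": number, "state": state, "level": level,
        "is_error": level > 10,               # error if level > 10
        "message": message, "server": server,
        "process": process, "line": line,
    }


def build_message(number, state, level, text, server, process, line) -> bytes:
    """Hypothetical helper that produces a body for the demo."""
    rest = (struct.pack("<IBBH", number, state, level, len(text)) + text
            + bytes([len(server)]) + server
            + bytes([len(process)]) + process
            + struct.pack("<H", line))
    return struct.pack("<H", len(rest)) + rest


body = build_message(208, 1, 16, b"Invalid object name", b"SRV", b"", 1)
msg = parse_message(body)
print(msg)
```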
 
Output Parameters (0xAC 172)

Output parameters of a stored procedure.

 INT16    INT8       CHAR[n]   5 bytes   INT8
+--------+----------+---------+---------+----------+------+
| length | c length | colname | unknown | datatype | .... | 
+--------+----------+---------+---------+----------+------+

length		Length of packet
c length	Length of colname
colname		Name of column
datatype	Type of data returned
The trailing information depends on whether the datatype is
a fixed size datatype.
				 N bytes
				+---------+
  Datatype of fixed size N	| data    |
				+---------+

				 INT8          INT8            N bytes
				+-------------+---------------+--------+
  Otherwise			| column size | actual size N | data   |
				+-------------+---------------+--------+
 
Login Acknowledgement (0xAD 173)

 INT16    INT8    4 bytes   INT8       CHAR[n]  variable
+--------+-------+---------+----------+--------+----------+
| length |  ack  | version | t length |  text  |  magic   |
+--------+-------+---------+----------+--------+----------+

length		length of packet
ack		0x01 success	4.2
		0x05 success	5.0
		0x06 failure	5.0
version		4 bytes:  major.minor.?.?
t length	length of text
text		server name?  'Microsoft SQL Server'
magic		?
   
Data - Row Result (0xD1 209)
Data - Compute Result (0xD3 211)
 INT8       variable size
+----------+--------------------+
|  token   |   row data         |
+----------+--------------------+
Row data starts with the one byte token (decimal 209). For variable length types, a one byte length field precedes the data; for fixed length types, just the data appears.
Note: nullable integers and floats are variable length.

For example: sp_who

The first field is spid, a smallint
The second field is status, a char(12), in our example "recv sleep" padded with spaces to 12 characters

The row would look like this:

  byte  0 is the token
  bytes 1-2 are a smallint in little-endian
  byte  3 is the length of the char field
  bytes 4-15 is the char field

byte  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
hex  D1  01  00  0C  72  65  63  76  20  73  6C  65  65  70  20  20
    209   1   0  12   r   e   c   v ' '   s   l   e   e   p ' ' ' '
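
The same row decoded with a minimal Python sketch. The function name is made up, and the layout (a smallint followed by a length-prefixed char field) is hard-coded for this one example rather than taken from column info.

```python
import struct

# The 16 bytes from the dump above.
row = bytes.fromhex("D1 01 00 0C 72 65 63 76 20 73 6C 65 65 70 20 20")


def parse_sp_who_row(data: bytes):
    """Decode token, smallint spid, then a length-prefixed char field."""
    if data[0] != 0xD1:
        raise ValueError("not a row result token")
    (spid,) = struct.unpack_from("<h", data, 1)   # bytes 1-2, little-endian
    n = data[3]                                   # byte 3: length of char field
    status = data[4:4 + n].decode("ascii")        # bytes 4-15
    return spid, status


print(parse_sp_who_row(row))
```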
 
Parameter packet (0xD7 215)

No information.

Capability packet (0xE2 226)
 INT16    variable
+--------+--------------+
| length | capabilities |
+--------+--------------+

length		Length of capability string
capabilities	Server capabilities?  Related to login magic?
 
Environment change (0xE3 227)
 INT16    INT8       INT8        CHAR[n]   INT8        CHAR[n] 
+--------+----------+-----------+---------+-----------+---------+
| length | env code | t1 length |  text1  | t2 length |  text2  |
+--------+----------+-----------+---------+-----------+---------+

env code	Code for what part of environment changed
	0x01  database context
	0x03  character set
text1		?Old value
text2		?New value
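
A minimal Python sketch of splitting an Environment change token into its parts. The function name and sample strings are made up, and little-endian length and ASCII text are assumptions. Since the meaning of text1 and text2 is only guessed at above, the sketch returns them unlabeled.

```python
import struct

ENV_CODES = {0x01: "database context", 0x03: "character set"}


def parse_env_change(body: bytes):
    """body is everything after the 0xE3 token byte."""
    length, code = struct.unpack_from("<HB", body, 0)
    pos = 3
    n1 = body[pos]
    text1 = body[pos + 1:pos + 1 + n1].decode("ascii")
    pos += 1 + n1
    n2 = body[pos]
    text2 = body[pos + 1:pos + 1 + n2].decode("ascii")
    return ENV_CODES.get(code, "unknown"), text1, text2


# Demo: env code 0x01 with two made-up database names.
rest = bytes([0x01]) + b"\x06master" + b"\x04pubs"
body = struct.pack("<H", len(rest)) + rest
print(parse_env_change(body))
```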
 
DB RPC ? (0xE6 230)

No information.

Param format (sent by client?) (0xEC 236)
 INT16     INT16        variable size
+---------+------------+-------------------+
| length  | number of  | parameter info    |
|         | parameters |                   |
+---------+------------+-------------------+

length            	length of message following this field
number of parameters	number of parameter formats following
list of formats		I (*bdw*) imagine it uses the column format structure.
 
Result Set (0xEE 238)
 INT16     INT16        variable size
+---------+------------+-----------------+
| length  | number of  | column info     |
|         | columns    |                 |
+---------+------------+-----------------+


Fields:
length             length of message following this field
number of columns  number of columns in the result set, this many column
                   information fields will follow.
column info        column info
 
Result Set Done (0xFD 253)
Process Done (0xFE 254)
Done Inside Process (0xFF 255)
 INT16       INT16     4 bytes
+-----------+---------+-----------+
| bit flags | unknown | row count |
+-----------+---------+-----------+

Fields:
bit flags          0x01 more results
		   0x02 invalid sql ?
		   0x10 succeeded ?
		   0x20 cancelled
unknown            2,0  /* something to do with block size perhaps */
row count          number of rows affected / returned in the result set. 
		(FIXME check if "affected / returned" is correct)
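
A minimal Python sketch of decoding the 8 bytes that follow any of these three tokens. The function name is made up and little-endian byte order is an assumption. Two of the flag meanings are marked uncertain in the list above, so treat the names accordingly.

```python
import struct

DONE_FLAGS = {
    0x01: "more_results",
    0x02: "invalid_sql",   # marked uncertain in this document
    0x10: "succeeded",     # marked uncertain in this document
    0x20: "cancelled",
}


def parse_done(body: bytes):
    """body is everything after the 0xFD / 0xFE / 0xFF token byte."""
    flags, unknown, row_count = struct.unpack_from("<HHI", body, 0)
    names = [name for bit, name in DONE_FLAGS.items() if flags & bit]
    return names, row_count


# Demo: flags 0x11, unknown field 2, row count 5 (made-up values).
body = struct.pack("<HHI", 0x11, 2, 5)
print(parse_done(body))
```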
"Result Set Complete" is the end of a query that doesn't create a process on the server. I.e., it doesn't call a stored procedure.

"Process Done" is the end of a stored procedure

"Done In Process" means that a query internal to a stored procedure has finished, but the stored procedure isn't done overall.


Acknowledgements
The following people have contributed to this document:

Brian Bruns (first draft, protocol discovery)
Brian Wheeler (protocol discovery)
Mark Schaal (second draft)

(short list)