This chapter tells you how to build your own driver for Erlang.
A driver in Erlang is a library written in C that is linked to the Erlang emulator and called from Erlang. Drivers can be used when C is more suitable than Erlang, to speed things up, or to provide access to OS resources not directly accessible from Erlang.
A driver can be dynamically loaded, as a shared library (known as a DLL on Windows), or statically loaded, that is, linked with the emulator when it is compiled and linked. Only dynamically loaded drivers are described here; statically linked drivers are beyond the scope of this chapter.
When a driver is loaded, it is executed in the context of the emulator and shares the same memory and the same thread. This means that all operations in the driver must be non-blocking, and that any crash in the driver will bring the whole emulator down. In short: you have to be extremely careful!
This is a simple driver for accessing a Postgres database using the libpq C client library. Postgres is used because it is free and open source. For information on Postgres, refer to the website www.postgres.org.
The driver is synchronous: it uses the synchronous calls of the client library. This is only for simplicity and is generally not good practice, since it halts the emulator while waiting for the database. The asynchronous sample driver further below improves on this.
The code is quite straightforward: all communication between Erlang and the driver is done with port_control/3, and the driver returns data back using the rbuf.
An Erlang driver exports only one function: the driver entry function. This is defined with a macro, DRIVER_INIT, and returns a pointer to a C struct containing the entry points that are called from the emulator. The struct defines the entries that the emulator calls to invoke the driver, with a NULL pointer for entries that are not defined and used by the driver.
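The idea of an entry table with NULL for unused callbacks can be sketched in plain C without the driver headers. Everything here (toy_entry, toy_start, toy_stop, call_init) is a hypothetical stand-in for illustration, not the real ErlDrvEntry layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a driver entry table. */
typedef struct toy_entry {
    int   (*init)(void);                 /* may be NULL */
    void *(*start)(const char *command);
    void  (*stop)(void *state);
    const char *driver_name;
} toy_entry;

static int toy_state = 0;

static void *toy_start(const char *command)
{
    (void)command;
    toy_state = 1;
    return &toy_state;
}

static void toy_stop(void *state)
{
    *(int *)state = 0;
}

/* init is left NULL: this toy driver needs no load-time initialization. */
static toy_entry toy_driver = { NULL, toy_start, toy_stop, "toy" };

/* The caller's side: invoke init only if the driver provides it. */
static int call_init(const toy_entry *e)
{
    assert(e != NULL);
    return e->init ? e->init() : 0;
}
```

The caller checks each pointer before use, which is why a driver can leave out the entries it does not need.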
The start entry is called when the driver is opened as a port with open_port/2. Here we allocate memory for a user data structure. This user data will be passed every time the emulator calls us. First we store the driver handle, because it is needed in subsequent calls. We allocate memory for the connection handle that is used by LibPQ. We also set the port to return allocated driver binaries, by setting the flag PORT_CONTROL_FLAG_BINARY with a call to set_port_control_flags. (This is because we don't know whether our data will fit in the result buffer of control, which has a default size set up by the emulator, currently 64 bytes.)
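The size concern behind that flag can be illustrated with a small sketch of the underlying decision: reuse the caller's fixed-size buffer when the reply fits, otherwise hand back a separately allocated one. The helper put_reply is hypothetical, and malloc stands in for the driver's own allocation functions; in binary mode the real driver returns a driver binary instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: place n bytes of reply data where the caller finds it.
 * If the data fits in the preallocated buffer *rbuf (capacity rlen), it is
 * copied there; otherwise a new buffer is allocated and *rbuf is redirected.
 * *allocated is set to 1 when the caller now owns a heap buffer.
 * Returns n on success, -1 if allocation fails.
 */
static int put_reply(char **rbuf, int rlen, const char *data, int n,
                     int *allocated)
{
    assert(rbuf != NULL && *rbuf != NULL && data != NULL && n >= 0);
    *allocated = 0;
    if (n > rlen) {
        char *big = malloc((size_t)n);
        if (big == NULL)
            return -1;
        *rbuf = big;
        *allocated = 1;
    }
    memcpy(*rbuf, data, (size_t)n);
    return n;
}
```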
There is an entry init which is called when the driver is loaded, but we don't use this, since it is executed only once, and we want to have the possibility of several instances of the driver.
The stop entry is called when the port is closed.
The control entry is called from the emulator when the Erlang code calls port_control/3, to do the actual work. We have defined a simple set of commands: connect to log in to the database, disconnect to log out, and select to send an SQL query and get the result. All results are returned through rbuf.
The library ei in erl_interface is used to encode data in binary term format. The result is returned to the emulator as binary terms, so binary_to_term is called in Erlang to convert the result to term form.
The code is available in pg_sync.c in the sample directory of erts.
The driver entry contains the functions that will be called by the emulator. In our simple example, we only provide start, stop and control.
    /* Driver interface declarations */
    static ErlDrvData start(ErlDrvPort port, char *command);
    static void stop(ErlDrvData drv_data);
    static int control(ErlDrvData drv_data, unsigned int command, char *buf,
                       int len, char **rbuf, int rlen);

    static ErlDrvEntry pq_driver_entry = {
        NULL,                        /* init */
        start,
        stop,
        NULL,                        /* output */
        NULL,                        /* ready_input */
        NULL,                        /* ready_output */
        "pg_sync",                   /* the name of the driver */
        NULL,                        /* finish */
        NULL,                        /* handle */
        control,
        NULL,                        /* timeout */
        NULL,                        /* outputv */
        NULL,                        /* ready_async */
        NULL,                        /* flush */
        NULL,                        /* call */
        NULL                         /* event */
    };
We have a structure to store state needed by the driver; in this case we only need to keep the database connection.
    typedef struct our_data_s {
        PGconn* conn;
    } our_data_t;
These are control codes we have defined.
    /* Keep the following definitions in alignment with the
     * defines in erl_pq_sync.erl
     */
    #define DRV_CONNECT             'C'
    #define DRV_DISCONNECT          'D'
    #define DRV_SELECT              'S'
This just returns the driver structure. The macro DRIVER_INIT defines the only exported function. All the other functions are static, and will not be exported from the library.
    /* INITIALIZATION AFTER LOADING */

    /*
     * This is the init function called after this driver has been loaded.
     * It must *not* be declared static. Must return the address to
     * the driver entry.
     */
    DRIVER_INIT(pq_drv)
    {
        return &pq_driver_entry;
    }
Here we do some initialization; start is called from open_port. The data will be passed to control and stop.
    /* DRIVER INTERFACE */
    static ErlDrvData start(ErlDrvPort port, char *command)
    {
        our_data_t* data;

        data = (our_data_t*)driver_alloc(sizeof(our_data_t));
        data->conn = NULL;
        set_port_control_flags(port, PORT_CONTROL_FLAG_BINARY);
        return (ErlDrvData)data;
    }
We call disconnect to log out from the database. (This should have been done from Erlang, but just in case.)
    static int do_disconnect(our_data_t* data, ei_x_buff* x);

    static void stop(ErlDrvData drv_data)
    {
        do_disconnect((our_data_t*)drv_data, NULL);
    }
We use the binary format only to return data to the emulator; input data is a string parameter for connect and select. The returned data consists of Erlang terms.
The functions get_s and ei_x_to_new_binary are utilities that are used to make the code shorter. get_s duplicates the string and zero-terminates it, since the Postgres client library wants that. ei_x_to_new_binary takes an ei_x_buff buffer, allocates a binary, and copies the data there. This binary is returned in *rbuf. (Note that this binary is freed by the emulator, not by us.)
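The text describes get_s but does not show its body. A minimal sketch of what such a helper could look like follows; it is an assumption based on the description, not the sample's actual code. malloc is used here so the snippet stands alone, whereas the driver would use driver_alloc, because control releases the string with driver_free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy len bytes from buf into a fresh, zero-terminated string.
 * malloc is a stand-in; in the driver, driver_alloc would be the
 * matching allocator for driver_free.
 * Returns NULL if allocation fails.
 */
static char *get_s(const char *buf, int len)
{
    char *s;

    assert(buf != NULL && len >= 0);
    s = malloc((size_t)len + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, buf, (size_t)len);
    s[len] = '\0';
    return s;
}
```

The copy is needed because the buffer handed to control is length-delimited and is not guaranteed to end in a zero byte.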
    static char* get_s(const char* buf, int len);
    static int do_connect(const char *s, our_data_t* data, ei_x_buff* x);
    static int do_select(const char* s, our_data_t* data, ei_x_buff* x);

    /* Since we are operating in binary mode, the return value from control
     * is irrelevant, as long as it is not negative.
     */
    static int control(ErlDrvData drv_data, unsigned int command, char *buf,
                       int len, char **rbuf, int rlen)
    {
        int r;
        ei_x_buff x;
        our_data_t* data = (our_data_t*)drv_data;
        char* s = get_s(buf, len);

        ei_x_new_with_version(&x);
        switch (command) {
        case DRV_CONNECT:    r = do_connect(s, data, &x);  break;
        case DRV_DISCONNECT: r = do_disconnect(data, &x);  break;
        case DRV_SELECT:     r = do_select(s, data, &x);   break;
        default:             r = -1;                       break;
        }
        *rbuf = (char*)ei_x_to_new_binary(&x);
        ei_x_free(&x);
        driver_free(s);
        return r;
    }
do_connect is where we log in to the database. If the connection was successful, we store the connection handle in our driver data and return ok. Otherwise, we return the error message from Postgres and store NULL in the driver data.
    static int do_connect(const char *s, our_data_t* data, ei_x_buff* x)
    {
        PGconn* conn = PQconnectdb(s);
        if (PQstatus(conn) != CONNECTION_OK) {
            encode_error(x, conn);
            PQfinish(conn);
            conn = NULL;
        } else {
            encode_ok(x);
        }
        data->conn = conn;
        return 0;
    }
If we are connected (if the connection handle is not NULL), we log out from the database. We need to check whether we should encode an ok, since we might get here from the stop function, which doesn't return data to the emulator.
    static int do_disconnect(our_data_t* data, ei_x_buff* x)
    {
        if (data->conn == NULL)
            return 0;
        PQfinish(data->conn);
        data->conn = NULL;
        if (x != NULL)
            encode_ok(x);
        return 0;
    }
We execute a query and encode the result. Encoding is done in another C module, pg_encode.c, which is also provided as sample code.
    static int do_select(const char* s, our_data_t* data, ei_x_buff* x)
    {
        PGresult* res = PQexec(data->conn, s);
        encode_result(x, res, data->conn);
        PQclear(res);
        return 0;
    }
Here we simply check the result from Postgres, and if it's data we encode it as lists of lists with column data. Everything from Postgres is C strings, so we just use ei_x_encode_string to send the result as strings to Erlang. (The head of the list contains the column names.)
    void encode_result(ei_x_buff* x, PGresult* res, PGconn* conn)
    {
        int row, n_rows, col, n_cols;

        switch (PQresultStatus(res)) {
        case PGRES_TUPLES_OK:
            n_rows = PQntuples(res);
            n_cols = PQnfields(res);
            ei_x_encode_tuple_header(x, 2);
            encode_ok(x);
            ei_x_encode_list_header(x, n_rows+1);
            ei_x_encode_list_header(x, n_cols);
            for (col = 0; col < n_cols; ++col) {
                ei_x_encode_string(x, PQfname(res, col));
            }
            ei_x_encode_empty_list(x);
            for (row = 0; row < n_rows; ++row) {
                ei_x_encode_list_header(x, n_cols);
                for (col = 0; col < n_cols; ++col) {
                    ei_x_encode_string(x, PQgetvalue(res, row, col));
                }
                ei_x_encode_empty_list(x);
            }
            ei_x_encode_empty_list(x);
            break;
        case PGRES_COMMAND_OK:
            ei_x_encode_tuple_header(x, 2);
            encode_ok(x);
            ei_x_encode_string(x, PQcmdTuples(res));
            break;
        default:
            encode_error(x, conn);
            break;
        }
    }
The driver should be compiled and linked to a shared library (DLL on Windows). With gcc this is done with the flags -shared and -fpic. Since we use the ei library, we should include it too. There are several versions of ei, compiled for debug or non-debug and multi-threaded or single-threaded. In the makefile for the samples, the obj directory is used for the ei library, meaning that we use the non-debug, single-threaded version.
Before a driver can be called from Erlang, it must be loaded and opened. Loading is done using the erl_ddll module (the erl_ddll driver that loads dynamic drivers is actually a driver itself). If loading succeeds, the port can be opened with open_port/2. The port name must match the name of the shared library and the name in the driver entry structure.
When the port has been opened, the driver can be called. In the pg_sync example, we don't receive any data from the port, only the return value from port_control.
The following code is the Erlang part of the synchronous Postgres driver, pg_sync.erl.
    -module(pg_sync).

    %% These codes must match the DRV_* defines in pg_sync.c.
    -define(DRV_CONNECT, $C).
    -define(DRV_DISCONNECT, $D).
    -define(DRV_SELECT, $S).

    -export([connect/1, disconnect/1, select/2]).

    connect(ConnectStr) ->
        case erl_ddll:load_driver(".", "pg_sync") of
            ok -> ok;
            {error, already_loaded} -> ok;
            E -> exit({error, E})
        end,
        Port = open_port({spawn, ?MODULE}, []),
        case binary_to_term(port_control(Port, ?DRV_CONNECT, ConnectStr)) of
            ok -> {ok, Port};
            Error -> Error
        end.

    disconnect(Port) ->
        R = binary_to_term(port_control(Port, ?DRV_DISCONNECT, "")),
        port_close(Port),
        R.

    select(Port, Query) ->
        binary_to_term(port_control(Port, ?DRV_SELECT, Query)).
The API is simple: connect/1 loads the driver, opens it, and logs on to the database, returning the Erlang port if successful; select/2 sends a query to the driver and returns the result; disconnect/1 closes the database connection and the driver. (It does not unload it, however.) The connection string should be a connection string for Postgres.
The driver is loaded with erl_ddll:load_driver/2, and if this is successful, or if it's already loaded, it is opened. This will call the start function in the driver.
We use the port_control/3 function for all calls into the driver. The result from the driver is returned immediately and converted to terms by calling binary_to_term/1. (We trust that the terms returned from the driver are well-formed; otherwise the binary_to_term calls could be wrapped in a catch.)
Sometimes database queries can take a long time to complete. In our pg_sync driver, the emulator halts while the driver is doing its job. This is often not acceptable, since no other Erlang process gets a chance to do anything. To improve on our Postgres driver, we reimplement it using the asynchronous calls in LibPQ.
The asynchronous version of the driver is in the sample files pg_async.c and pg_async.erl.
    /* Driver interface declarations */
    static ErlDrvData start(ErlDrvPort port, char *command);
    static void stop(ErlDrvData drv_data);
    static int control(ErlDrvData drv_data, unsigned int command, char *buf,
                       int len, char **rbuf, int rlen);
    static void ready_io(ErlDrvData drv_data, ErlDrvEvent event);

    static ErlDrvEntry pq_driver_entry = {
        NULL,                        /* init */
        start,
        stop,
        NULL,                        /* output */
        ready_io,                    /* ready_input */
        ready_io,                    /* ready_output */
        "pg_async",                  /* the name of the driver */
        NULL,                        /* finish */
        NULL,                        /* handle */
        control,
        NULL,                        /* timeout */
        NULL,                        /* outputv */
        NULL,                        /* ready_async */
        NULL,                        /* flush */
        NULL,                        /* call */
        NULL                         /* event */
    };

    typedef struct our_data_t {
        PGconn* conn;
        ErlDrvPort port;
        int socket;
        int connecting;
    } our_data_t;
Here some things have changed from pg_sync.c: we use the entry ready_io for both ready_input and ready_output, so it is called from the emulator only when the socket has input to be read or is ready for output. (Actually, the socket is used in a select function inside the emulator, and when the socket is signalled, indicating there is data to read, the ready_input entry is called. More on this below.)
Our driver data is also extended: we keep track of the socket used for communication with Postgres, and also the port, which is needed when we send data to the port with driver_output. We have a flag connecting to tell whether the driver is waiting for a connection or waiting for the result of a query. (This is needed since the entry ready_io will be called both when connecting and when there is a query result.)
    static int do_connect(const char *s, our_data_t* data)
    {
        int socket;
        PGconn* conn = PQconnectStart(s);
        if (PQstatus(conn) == CONNECTION_BAD) {
            ei_x_buff x;
            ei_x_new_with_version(&x);
            encode_error(&x, conn);
            PQfinish(conn);
            data->conn = NULL;
            driver_output(data->port, x.buff, x.index);
            ei_x_free(&x);
            return 0;   /* nothing to poll or select on after a failed start */
        }
        PQconnectPoll(conn);
        socket = PQsocket(conn);
        data->socket = socket;
        driver_select(data->port, (ErlDrvEvent)socket, DO_READ, 1);
        driver_select(data->port, (ErlDrvEvent)socket, DO_WRITE, 1);
        data->conn = conn;
        data->connecting = 1;
        return 0;
    }
The connect function looks a bit different too. We connect using the asynchronous PQconnectStart function. After the connection is started, we retrieve the socket for the connection with PQsocket. This socket is used with the driver_select function to wait for the connection. When the socket is ready for input or for output, the ready_io function will be called.
Note that we only return data (with driver_output) if there is an error here; otherwise we wait for the connection to be completed, in which case our ready_io function will be called.
    static int do_select(const char* s, our_data_t* data)
    {
        data->connecting = 0;
        PGconn* conn = data->conn;
        /* if there's an error return it now */
        if (PQsendQuery(conn, s) == 0) {
            ei_x_buff x;
            ei_x_new_with_version(&x);
            encode_error(&x, conn);
            driver_output(data->port, x.buff, x.index);
            ei_x_free(&x);
        }
        /* else wait for ready_output to get results */
        return 0;
    }
The do_select function initiates a select, and returns if there is no immediate error. The actual result will be returned when ready_io is called.
    static void ready_io(ErlDrvData drv_data, ErlDrvEvent event)
    {
        PGresult* res = NULL;
        our_data_t* data = (our_data_t*)drv_data;
        PGconn* conn = data->conn;
        ei_x_buff x;
        ei_x_new_with_version(&x);
        if (data->connecting) {
            ConnStatusType status;
            PQconnectPoll(conn);
            status = PQstatus(conn);
            if (status == CONNECTION_OK)
                encode_ok(&x);
            else if (status == CONNECTION_BAD)
                encode_error(&x, conn);
        } else {
            PQconsumeInput(conn);
            if (PQisBusy(conn)) {
                ei_x_free(&x);   /* release the buffer before the early return */
                return;
            }
            res = PQgetResult(conn);
            encode_result(&x, res, conn);
            PQclear(res);
            for (;;) {
                res = PQgetResult(conn);
                if (res == NULL)
                    break;
                PQclear(res);
            }
        }
        if (x.index > 1) {
            driver_output(data->port, x.buff, x.index);
            if (data->connecting)
                driver_select(data->port, (ErlDrvEvent)data->socket, DO_WRITE, 0);
        }
        ei_x_free(&x);
    }
The ready_io function will be called when the socket we got from Postgres is ready for input or output. Here we first check if we are connecting to the database. In that case we check the connection status and return ok if the connection is successful, or error if it's not. If the connection is not yet established, we simply return; ready_io will be called again.
If we have a result from a connect, indicated by data in the x buffer, we no longer need to select on output (ready_output), so we remove this by calling driver_select.
If we're not connecting, we're waiting for results from a PQsendQuery, so we get the result and return it. The encoding is done with the same functions as in the earlier example.
We should add error handling here, for instance checking that the socket is still open, but this is just a simple example.
The Erlang part of the asynchronous driver consists of the sample file pg_async.erl.
    -module(pg_async).

    -define(DRV_CONNECT, $C).
    -define(DRV_DISCONNECT, $D).
    -define(DRV_SELECT, $S).

    -export([connect/1, disconnect/1, select/2]).

    connect(ConnectStr) ->
        case erl_ddll:load_driver(".", "pg_async") of
            ok -> ok;
            {error, already_loaded} -> ok;
            _ -> exit({error, could_not_load_driver})
        end,
        Port = open_port({spawn, ?MODULE}, [binary]),
        port_control(Port, ?DRV_CONNECT, ConnectStr),
        case return_port_data(Port) of
            ok ->
                {ok, Port};
            Error ->
                Error
        end.

    disconnect(Port) ->
        port_control(Port, ?DRV_DISCONNECT, ""),
        R = return_port_data(Port),
        port_close(Port),
        R.

    select(Port, Query) ->
        port_control(Port, ?DRV_SELECT, Query),
        return_port_data(Port).

    return_port_data(Port) ->
        receive
            {Port, {data, Data}} ->
                binary_to_term(Data)
        end.
The Erlang code is slightly different, because we don't return the result synchronously from port_control; instead we get it from driver_output as data in the message queue. The function return_port_data above receives data from the port. Since the data is in binary format, we use binary_to_term/1 to convert it to an Erlang term. Note that the driver is opened in binary mode: open_port/2 is called with the option [binary]. This means that data sent from the driver to the emulator is sent as binaries. Without the binary option, they would have been lists of integers.
As a final example we demonstrate the use of driver_async. We also use the driver term interface. The driver is written in C++. This enables us to use an algorithm from STL. We will use the next_permutation algorithm to get the next permutation of a list of integers. For large lists (more than 100000 elements), this will take some time, so we will perform this as an asynchronous task.
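For readers unfamiliar with the algorithm, the following is a plain C sketch of the usual lexicographic next-permutation technique, which is the behaviour std::next_permutation provides. The function names are illustrative only; the sample driver calls the STL algorithm instead.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Reverse the half-open range [first, last). */
static void reverse_ints(int *first, int *last)
{
    while (first < last && first < --last) {
        swap_int(first, last);
        ++first;
    }
}

/*
 * Rearrange a[0..n-1] into the next lexicographic permutation.
 * Returns 1 if a next permutation existed, or 0 after wrapping
 * around to ascending order.
 */
static int next_perm(int *a, size_t n)
{
    size_t i, j;

    assert(a != NULL || n == 0);
    if (n < 2)
        return 0;
    /* Find the start i of the longest non-increasing suffix. */
    i = n - 1;
    while (i > 0 && a[i - 1] >= a[i])
        --i;
    if (i == 0) {               /* whole array is descending: wrap around */
        reverse_ints(a, a + n);
        return 0;
    }
    /* a[i-1] is the pivot; find the rightmost element greater than it. */
    j = n - 1;
    while (a[j] <= a[i - 1])
        --j;
    swap_int(&a[i - 1], &a[j]);
    reverse_ints(a + i, a + n); /* make the suffix ascending */
    return 1;
}
```

Each call scans the array from the right, so the cost grows linearly with the list length in the worst case.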
The asynchronous API for drivers is quite complicated. First of all, the work must be prepared. In our example we do this in output. We could have used control just as well, but we want some variation in our examples. In our driver, we allocate a structure that contains everything needed for the asynchronous task to do the work. This is done in the main emulator thread. Then the asynchronous function is called from a driver thread, separate from the main emulator thread. Note that the driver functions are not reentrant, so they shouldn't be used from the asynchronous function. Finally, after the function is completed, the driver callback ready_async is called from the main emulator thread; this is where we return the result to Erlang. (We can't return the result from within the asynchronous function, since we can't call the driver functions.)
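The ownership hand-off across the three phases can be sketched in plain C. This is only an illustration under stated assumptions: work_item, prepare, do_work and complete are hypothetical names, and the phases are run one after another on a single thread here, whereas in a real driver the middle phase runs on a separate driver thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical work item: owns a private copy of the input. */
typedef struct work_item {
    int   *data;
    size_t n;
    long   sum;     /* result, filled in by the worker */
} work_item;

/* Phase 1 (main thread): copy the input, since the caller's buffer
 * is not guaranteed to stay valid after this function returns. */
static work_item *prepare(const int *input, size_t n)
{
    work_item *w;

    assert(input != NULL || n == 0);
    w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->data = malloc(n ? n * sizeof *w->data : 1);
    if (w->data == NULL) {
        free(w);
        return NULL;
    }
    if (n)
        memcpy(w->data, input, n * sizeof *w->data);
    w->n = n;
    w->sum = 0;
    return w;
}

/* Phase 2 (worker thread in a real driver): touches only the work item. */
static void do_work(void *arg)
{
    work_item *w = arg;
    size_t i;

    for (i = 0; i < w->n; ++i)
        w->sum += w->data[i];
}

/* Phase 3 (main thread again): hand back the result, release the item. */
static long complete(work_item *w)
{
    long r = w->sum;

    free(w->data);
    free(w);
    return r;
}
```

Because the worker only reads and writes the work item, it needs no access to driver functions, which matches the restriction described above.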
The code below is from the sample file next_perm.cc.
The driver entry looks like before, but also contains the callback ready_async.
    static ErlDrvEntry next_perm_driver_entry = {
        NULL,                        /* init */
        start,
        NULL,                        /* stop */
        output,
        NULL,                        /* ready_input */
        NULL,                        /* ready_output */
        "next_perm",                 /* the name of the driver */
        NULL,                        /* finish */
        NULL,                        /* handle */
        NULL,                        /* control */
        NULL,                        /* timeout */
        NULL,                        /* outputv */
        ready_async,
        NULL,                        /* flush */
        NULL,                        /* call */
        NULL                         /* event */
    };
The output function allocates the work area of the asynchronous function. Since we use C++, we use a struct and store the data in it. We have to copy the original data, since it is not valid after we have returned from the output function, and the do_perm function will be called later, from another thread. We return no data here; instead it will be sent later from the ready_async callback.
The async_data will be passed to the do_perm function. We do not use an async_free function (the last argument to driver_async); it is only used if the task is cancelled programmatically.
    struct our_async_data {
        bool prev;
        vector<int> data;
        our_async_data(ErlDrvPort p, int command, const char* buf, int len);
    };

    our_async_data::our_async_data(ErlDrvPort p, int command,
                                   const char* buf, int len)
        : prev(command == 2),
          data((int*)buf, (int*)buf + len / sizeof(int))
    {
    }

    static void do_perm(void* async_data);

    static void output(ErlDrvData drv_data, char *buf, int len)
    {
        if (*buf < 1 || *buf > 2) return;
        ErlDrvPort port = reinterpret_cast<ErlDrvPort>(drv_data);
        void* async_data = new our_async_data(port, *buf, buf+1, len);
        /* No async_free callback is supplied, as described in the text. */
        driver_async(port, NULL, do_perm, async_data, NULL);
    }
In do_perm we simply do the work, operating on the structure that was allocated in output.
    static void do_perm(void* async_data)
    {
        our_async_data* d = reinterpret_cast<our_async_data*>(async_data);
        if (d->prev)
            prev_permutation(d->data.begin(), d->data.end());
        else
            next_permutation(d->data.begin(), d->data.end());
    }
In the ready_async function, the output is sent back to the emulator. We use the driver term format instead of ei. This is the only way to send Erlang terms directly from a driver, without having the Erlang code call binary_to_term/1. In our simple example this works well, and we don't need to use ei to handle the binary term format.
When the data is returned we deallocate our data.
    static void ready_async(ErlDrvData drv_data, ErlDrvThreadData async_data)
    {
        ErlDrvPort port = reinterpret_cast<ErlDrvPort>(drv_data);
        our_async_data* d = reinterpret_cast<our_async_data*>(async_data);
        int n = d->data.size(), result_n = n*2 + 3;
        ErlDrvTermData* result = new ErlDrvTermData[result_n], * rp = result;
        for (vector<int>::iterator i = d->data.begin();
             i != d->data.end(); ++i) {
            *rp++ = ERL_DRV_INT;
            *rp++ = *i;
        }
        *rp++ = ERL_DRV_NIL;
        *rp++ = ERL_DRV_LIST;
        *rp++ = n+1;
        driver_output_term(port, result, result_n);
        delete[] result;
        delete d;
    }
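The array arithmetic in ready_async (two slots per integer, plus three for the list terminator and header) can be checked with a stand-alone C sketch. The tag values TOY_INT, TOY_NIL and TOY_LIST are placeholders invented for this illustration; the real constants come from erl_driver.h.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder tags; the real values are defined by the driver headers. */
enum { TOY_INT = 1, TOY_NIL = 2, TOY_LIST = 3 };

/* Number of array slots needed to describe a proper list of n integers. */
static size_t term_slots(size_t n)
{
    return 2 * n + 3;
}

/*
 * Fill out[] with a flat description of a list of n integers:
 * a (tag, value) pair per element, then the empty-list tag, then the
 * list tag with its length. The length is n + 1 because the empty-list
 * tail counts as one of the list's terms.
 * Returns the number of slots written.
 */
static size_t build_int_list(const int *v, size_t n, long *out)
{
    size_t k = 0, i;

    assert(out != NULL && (v != NULL || n == 0));
    for (i = 0; i < n; ++i) {
        out[k++] = TOY_INT;
        out[k++] = v[i];
    }
    out[k++] = TOY_NIL;
    out[k++] = TOY_LIST;
    out[k++] = (long)(n + 1);
    return k;
}
```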
This driver is called like the others from Erlang. However, since we use driver_output_term, there is no need to call binary_to_term. The Erlang code is in the sample file next_perm.erl.
The input is changed into a list of integers and sent to the driver.
    -module(next_perm).

    -export([next_perm/1, prev_perm/1, load/0, all_perm/1]).

    load() ->
        case whereis(next_perm) of
            undefined ->
                case erl_ddll:load_driver(".", "next_perm") of
                    ok -> ok;
                    {error, already_loaded} -> ok;
                    E -> exit(E)
                end,
                Port = open_port({spawn, "next_perm"}, []),
                register(next_perm, Port);
            _ ->
                ok
        end.

    list_to_integer_binaries(L) ->
        [<<I:32/integer-native>> || I <- L].

    next_perm(L) ->
        next_perm(L, 1).

    prev_perm(L) ->
        next_perm(L, 2).

    next_perm(L, Nxt) ->
        load(),
        B = list_to_integer_binaries(L),
        %% The driver implements output, not control, so the command byte
        %% and the data are sent together with port_command/2.
        port_command(next_perm, [Nxt, B]),
        receive
            Result ->
                Result
        end.

    all_perm(L) ->
        New = prev_perm(L),
        all_perm(New, L, [New]).

    all_perm(L, L, Acc) ->
        Acc;
    all_perm(L, Orig, Acc) ->
        New = prev_perm(L),
        all_perm(New, Orig, [New | Acc]).
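On the C side, a message made of one command byte followed by native 32-bit integers can be unpacked without relying on alignment. The sketch below is an assumption about how such a decoder could be written; decode_ints is a hypothetical helper and is not part of the sample driver, which constructs its vector directly from the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical decoder for a message laid out as:
 *   1 command byte, then zero or more 32-bit integers in native byte order.
 * On success returns a malloc'ed array (caller frees) and stores the
 * command and element count. Returns NULL on malformed input or if
 * allocation fails.
 */
static int32_t *decode_ints(const char *buf, size_t len,
                            int *command, size_t *count)
{
    size_t payload;
    int32_t *out;

    assert(buf != NULL && command != NULL && count != NULL);
    if (len < 1 || (len - 1) % sizeof(int32_t) != 0)
        return NULL;
    payload = len - 1;
    out = malloc(payload ? payload : 1);
    if (out == NULL)
        return NULL;
    /* memcpy avoids reading int32_t values through a misaligned pointer. */
    memcpy(out, buf + 1, payload);
    *command = (unsigned char)buf[0];
    *count = payload / sizeof(int32_t);
    return out;
}
```

Copying the payload out also gives the asynchronous task its own data, independent of the lifetime of the input buffer.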