Chouette - REST API Framework
Chouette is a framework for making asynchronous HTTP services. It makes some opinionated design choices, but is otherwise fairly flexible.
AnyEvent is used as the glue to connect all the asynchronous libraries, although Chouette depends on Feersum and therefore EV for its event loop. It uses Feersum in PSGI mode so it can use Plack for request parsing, and has support for Plack::Middleware wrappers. Feersum is the least conservative choice in the stack but there aren't very many alternatives (Twiggy is a possibility but it is somewhat buggy and you need a hack to use unix sockets).
Chouette generally assumes that its input will be application/x-www-form-urlencoded
. Plack::Request::WithEncoding is used so that text is properly decoded (we recommend UTF-8 of course). For output, the default is application/json
encoded with JSON::XS. Both the input and output types can be modified, although this is only partially documented so far.
Chouette apps can optionally load a config file and its format is YAML
, loaded with the YAML module. Regexp::Assemble is used for efficient route-dispatch.
The above aside, Chouette's main purpose is to glue together several of my own modules into a cohesive whole. These modules have been designed to work together and I have used them to build numerous services, some of which handle a considerable amount of traffic and/or have very complicated requirements.
Chouette was extracted from some of these services I have built before, and I have put in the extra effort required so that all the modules work together in the ways they were designed:
- AnyEvent::Task
-
Allows us to perform blocking operations without holding up other requests.
- Callback::Frame
-
Makes exception handling simple and convenient. You can
die
anywhere and it will only affect the request being currently handled.Important note: If you are using 3rd-party libraries that accept callbacks, please understand how Callback::Frame works. You will usually need to pass
fub {}
instead ofsub {}
to these libraries. See the EXCEPTIONS section for more details. - Session::Token
-
For random identifiers such as session tokens (obviously).
- Log::Defer
-
Structured logging, properly integrated with AnyEvent::Task so your tasks can log messages into the proper request log contexts.
Note that Chouette also depends on Log::Defer::Viz so
log-defer-viz
will be available for viewing logs. - Log::File::Rolling
-
Store logs in files and rotate them periodically. Also maintains a current symlink so you can simply run the following in a shell and you'll always see the latest logs as you need them:
$ log-defer-viz -F /var/myapi/logs/myapi.current.log
Chouette will always depend on AnyEvent::Task, Callback::Frame, Session::Token, and Log::Defer so if your app also uses these modules then it is sufficient to depend on Chouette
alone.
Where does the name "Chouette" come from? A chouette is a multi-player, fast-paced backgammon game with lots of stuff going on at once, kind of like an asynchronous REST API server... Hmmm, a bit of a stretch isn't it? To be honest it's just a cool name and I love backgammon, especially chouettes with friends and beer. :)
To start a server, create a Chouette
object. The constructor accepts a hash ref with the following parameters. Most are optional. See the bin/myapi
file below for a full example.
config_defaults
-
This hash is where you provide default config values. These values can be overridden by the config file.
You can use the config store for values specific to your application (it is accessible with the
config
method of the context), but here are the values thatChouette
itself looks for:var_dir
- This directory must exist and be writable.Chouette
will use this to store log files and AnyEvent::Task sockets.listen
- This is the location the Chouette server will listen on. Examples:8080
127.0.0.1:8080
unix:/var/myapi/myapi.socket
logging.file_prefix
- The prefix for log file names (default isapp
).logging.timezone
- Eithergmtime
orlocaltime
(gmtime
is default, see Log::File::Rolling).The only required config parameters are
var_dir
andlisten
(though these can be omitted from the defaults assuming they will be specified in the config file, see below). config_file
-
If you want a config file, this path is where it will be read from. The file's format is YAML. The values in this file over-ride the values in
config_defaults
. If this parameter is not provided then it will not attempt to load a config file and defaults will be used. routes
-
Routes are specified as a hash-ref of route paths, mapping to hash-refs of methods, mapping to package+function names or callbacks. For example:
routes => { '/myapi/resource' => { POST => 'MyAPI::Resource::create', GET => 'MyAPI::Resource::get_all', }, '/myapi/resource/:resource_id' => { GET => 'MyAPI::Resource::get_by_id', POST => sub { my $c = shift; die "400: can't update ID " . $c->route_params->{resource_id}; }, }, '/myapi/upload' => { PUT => 'MyAPI::Upload::upload', } },
For each route, if a package+function name is used it will try to
require
the package specified, and obtain the function specified for each HTTP method. If the package or function doesn't exists, an error will be thrown.You can use
:param
path elements in your routes to extract parameters from the path. They are accessible via theroute_params
method of the context (seelib/MyAPI/Resource.pm
below).Note that routes are combined with Regexp::Assemble so we don't have to loop over every possible route for every request, in case you have a lot of routes. For example, here is the regexp used for the above routes:
\A/myapi/(?:resource(?:/(?<resource_id>[^/]+)\z(?{2})|\z(?{1}))|upload\z(?{0}))
See the
bin/myapi
file below for an example. pre_route
-
A package+function or callback that will be called with a context and a resume callback. If the function determines the request processing should continue, it should call the resume callback.
See the
lib/MyAPI/Auth.pm
file below for an example of the function. middleware
-
Any array-ref of Plack::Middleware packages. Each element is either a string representing a package+function, or an array-ref where the first element is the package+function and the rest of the elements are the arguments to the middleware.
The strings representing packages can either be prefixed with
Plack::Middleware::
or not. If not, it will try torequire
the package as is and if that doesn't exist, it will try again with thePlack::Middleware::
prefix.middleware => [ 'Plack::Middleware::ContentLength', 'ETag', ['Plack::Middleware::CrossOrigin', origins => '*'], ['Plack::Middleware::RealIP', header => 'X-Real-IP', trusted_proxy => [qw(127.0.0.1/24)], ], ],
tasks
-
This is a hash-ref of AnyEvent::Task servers/clients to create.
tasks => { db => { pkg => 'LPAPI::Task::DB', checkout_caching => 1, client => { timeout => 20, }, server => { hung_worker_timeout => 60, }, }, },
Route handlers can acquire checkouts by calling the
task
method on the context object.checkout_caching
means that if a checkout is obtained and released, it will be cached for the duration of the request so if another checkout for this task is obtained, then the original will be returned. This is useful forpre_route
handlers that use DBI for example, because we want the authenticate handler to run in the same transaction as the handler (for both correctness and efficiency reasons).Additional arguments to AnyEvent::Task::Client and <AnyEvent::Task::Server> can be passed in via
client
andserver
.See the
bin/myapi
,lib/MyAPI/Task/PasswordHasher.pm
, andlib/MyAPI/Task/DB.pm
files for examples. quiet
-
If set, suppress the start-up message which looks like so:
=============================================================================== Chouette 0.100 PID = 31713 UID/GIDs = 1000/1000 4 20 24 27 30 46 113 129 1000 Listening on: http://0.0.0.0:8080 Follow log messages: log-defer-viz -F /var/myapi/logs/myapi.current.log ===============================================================================
After the Chouette
object is obtained, you should call serve
or run
. They are basically the same except serve
returns whereas run
enters the AnyEvent event loop. These are equivalent:
$chouette->run;
and
$chouette->serve;
AE::cv->recv;
For every request a Chouette::Context
object is created. This object is passed into the handler for the request. Typically we name the object $c
. Your code interacts with the request via the following methods on the context object:
respond
-
The respond method sends a JSON response, the contents of which are encoded from the first argument:
$c->respond({ a => 1, b => 2, });
Note: After responding, this method returns and your code continues. This is useful if you wish to do additional work after sending the response. However, if you call
respond
on this context again an error will logged. The second response will not be sent (it can't be since the connection is probably already closed).If you wish to stop processing after sending the response, you can
die
with the result fromrespond
since it returns a special object for this purpose:die $c->respond({ a => 1, });
See the EXCEPTIONS section for more details on the use of exceptions in Chouette.
respond
takes an optional second argument which is the HTTP response code (defaults to 200):$c->respond({ error => "access denied" }, 403);
Note that processing continues here also. If you wish to terminate the processing right away, prefix with
die
as above, or use the following shortcut:die "403: access denied";
The client will receive an HTTP response with the Feersum default message ("Forbidden" in this case) and the JSON body will be
{"error":"access denied"}
.This works too, except the value of
error
in the JSON body of the response will just be "HTTP code 403":die 403;
done
-
If you wish to stop processing but not send a response:
$c->done;
You will need to send a response later, usually from an async callback. Note: If the last reference to the context is destroyed without a response being sent, the message
no response was sent, sending 500
will be logged and a 500 "internal server error" response will be sent.You don't ever need to call
done
. You can justreturn
from the handler instead.done
is only for convenience in case you are deeply nested in callbacks and don't want to worry about writing a bunch of nested returns. respond_raw
-
Similar to
respond
except it doesn't assume JSON encoding:$c->respond_raw(200, 'text/plain', 'here is some plain text');
logger
-
Returns the Log::Defer object associated with the request:
$c->logger->info("some stuff is happening"); { my $timer = $c->logger->timer('doing big_computation'); big_computation(); }
See the Log::Defer docs for more details. For viewing the log messages, check out Log::Defer::Viz.
config
-
Returns the
config
hash. See the "CHOUETTE OBJECT" section for details. req
-
Returns the Plack::Request object created for this request.
my $name = $c->req->parameters->{name};
res
-
One would think this would return a Plack::Response object. Unfortunately this isn't yet implemented and will instead throw an error.
generate_token
-
Generates a random string using a default-config Session::Token generator. The generator is created when the first token is needed so as to avoid a "cold" entropy pool immediately after a reboot (see the Session::Token docs).
task
-
Returns an AnyEvent::Task checkout object for the task with the given name:
$c->task('db')->selectrow_hashref(q{ SELECT * FROM sometable WHERE id = ? }, undef, $id, sub { my ($dbh, $row) = @_; die $c->respond($row); });
Checkout options can be passed after the task name:
$c->task('db', timeout => 5)->selectrow_hashref(...);
See AnyEvent::Task for more details.
Assuming you are familiar with asynchronous programming, most of Chouette should feel straightforward. The only thing that might be unfamiliar is how exceptions are used.
The first unusual thing about how Chouette uses exceptions is that it uses them for error conditions, in contrast to many other asynchronous frameworks.
Most asynchronous frameworks are unable to use exceptions to signal errors since an error may occur in a callback being run from the event loop. If this callback throws an exception, there will be nothing to catch it, except perhaps a catch block installed by the event loop. Even if the event loop does catch it, it won't know which connection the exception is for, and therefore won't be able to send a 500 error to that connection or add a message to that connection's log.
Consider the AnyEvent::DBI library. This is how its error handling works:
$dbh->exec("SELECT * FROM no_such_table", sub {
my ($dbh, $rows, $rv) = @_;
if ($#_) {
# success
} else {
# failure. error message is in $@
}
});
Even if exec
failed, the callback still gets called. Whether or not it succeeded is indicated by its parameters. You can think of this as a sort of "in-band" signalling. The fact that there was an error, and what exactly that error was, needs to be indicated by the callback's parameters in some way. Unfortunately every library does this slightly differently. Another alternative used by some libraries is to accept 2 callbacks, one of which is called in the success case, and the other in the failure case.
But with both of these methods, what should the callback do when it is notified of an error? It can't just die
because nothing will catch the exception. With the EV event loop you will see this:
EV: error in callback (ignoring): failure: ERROR: relation "no_such_table" does not exist
Even if you wrap an eval
or a Try::Tiny try {} catch {}
around the code the same thing happens. The try/catch is in effect while installing the callback, but not when the callback is called.
As a consequence of all this, asynchronous web frameworks usually cannot indicate errors with exceptions. Instead, they require you to respond to the client from inside the callback:
$dbh->exec("SELECT * FROM no_such_table", sub {
my ($dbh, $rows, $rv) = @_;
if (!$#_) {
$context->respond_500_error("DB error: $@");
return;
}
# success
});
There are several down-sides to this approach:
The error must be handled locally in each callback, rather than once in a catch-all error handler.
Everywhere an error might occur needs to have access to the context object. This often requires passing it as an argument around everywhere.
You might forget to handle an error (or it might be too inconvenient so you don't bother) and your success-case code will run on garbage data.
Perhaps most importantly, if some unexpected exception is thrown by your callback (or something that it calls) then the event loop will receive an exception and nothing will get logged or replied to.
For these reasons, Chouette uses Callback::Frame to deal with exceptions. The idea is that the exception handling code is carried around with your callbacks. For instance, this is how you would accomplish the same thing with Chouette:
my $dbh = $c->task('db');
$dbh->selectrow_arrayref("SELECT * FROM no_such_table", undef, sub {
my ($dbh, $rows) = @_;
# success
# even if I do die here it will get routed to the right request
});
The callback will only be invoked in the success case. If selectrow_arrayref
fails, an exception will be raised in the dynamic scope that was in effect when the callback was installed. The same goes if an exception is thrown by the callback passed to selectrow_arrayref
. Because Chouette installs a catch
handler for each request, the appropriate errors will be sent to the client and added to the Chouette logs.
Important note: Libraries like AnyEvent::Task (which is what task
in the above example uses) are Callback::Frame-aware. This means that you can pass sub {}
callbacks into them and they will automatically convert them to fub {}
callbacks for you (but you can pass in fub
s too, that's harmless).
When using 3rd-party libraries, you must pass fub {}
instead. Also, you'll need to figure out how the library handles error cases and throw exceptions as appropriate. For example, if you really wanted to use AnyEvent::DBI (even though the AnyEvent::Task version is superior in pretty much every way) this is what you would do:
$dbh->exec("SELECT * FROM no_such_table", fub {
my ($dbh, $rows, $rv) = @_;
if (!$#_) {
die "DB error: $@";
}
# success
});
Note that the sub
has been changed to fub
and an exception is thrown for the error case.
In summary, when installing callbacks you must use fub
except when the library is Callback::Frame-aware (in which case fub
is optional).
Please see the Callback::Frame documentation for more specifics.
The other unusual thing about how Chouette uses exceptions is that it uses them for control flow as well as errors.
As you can see in the respond
method documentation of the "CHOUETTE OBJECT" section, you can die
with the result of the respond
method:
die $c->respond({ status => 'ok' });
This works because respond
returns a special object specifically intended for this purpose. When it gets an exception, the main catch block checks if it is this object. If so, it just ignores the exception. This lets you terminate your current callback without worrying about return
ing.
This catch block also checks if your exception starts with 3 digits followed by a word-break. If so, it considers this a special exception intended to send an HTTP response. For example, the following code will send a 404 Not Found response:
die "404: no such resource";
The body of the response will be:
{"error":"no such resource"}
You can even just throw a number:
die 404;
Some people might consider this usage of exceptions to be kind of a hack, but it does make for really nice code if you'll give it a chance.
These files represent a complete-ish Chouette application that I have extracted from a real-world app. Warning: untested!
bin/myapi
-
#!/usr/bin/env perl use common::sense; use Chouette; my $chouette = Chouette->new({ config_file => '/etc/myapi.conf', config_defaults => { var_dir => '/var/myapi', listen => '8080', logging => { file_prefix => 'myapi', timezone => 'localtime', }, }, middleware => [ 'Plack::Middleware::ContentLength', ['Plack::Middleware::RealIP', header => 'X-Real-IP', trusted_proxy => [qw(127.0.0.1/24)], ], ], pre_route => 'MyAPI::Auth::authenticate', routes => { '/myapi/unauth/login' => { POST => 'MyAPI::User::login', }, '/myapi/resource' => { POST => 'MyAPI::Resource::create', GET => 'MyAPI::Resource::get_all', }, '/myapi/resource/:resource_id' => { GET => 'MyAPI::Resource::get_by_id', }, }, tasks => { passwd => { pkg => 'MyAPI::Task::PasswordHasher', }, db => { pkg => 'MyAPI::Task::DB', checkout_caching => 1, ## so same dbh is used in authenticate and handler }, }, }); $chouette->run;
lib/MyAPI/Auth.pm
-
package MyAPI::Auth; use common::sense; sub authenticate { my ($c, $cb) = @_; if ($c->{env}->{PATH_INFO} =~ m{^/myapi/unauth/}) { return $cb->(); } my $session = $c->req->parameters->{session}; $c->task('db')->selectrow_hashref(q{ SELECT user_id FROM session WHERE session_token = ? }, undef, $session, sub { my ($dbh, $row) = @_; die 403 if !$row; $c->{user_id} = $row->{user_id}; $cb->(); }); } 1;
lib/MyAPI/User.pm
-
package MyAPI::User; use common::sense; sub login { my $c = shift; my $username = $c->req->parameters->{username}; my $password = $c->req->parameters->{password}; $c->task('db')->selectrow_hashref(q{ SELECT user_id, password_hashed FROM myuser WHERE username = ? }, undef, $username, sub { my ($dbh, $row) = @_; die 403 if !$row; $c->task('passwd')->verify_password($row->{password_hashed}, $password, sub { die 403 if !$_[1]; my $session_token = $c->generate_token(); $dbh->do(q{ INSERT INTO session (session_token, user_id) VALUES (?, ?) }, undef, $session_token, $row->{user_id}, sub { $dbh->commit(sub { die $c->respond({ sess => $session_token }); }); }); }); }); } 1;
lib/MyAPI/Resource.pm
-
package MyAPI::Resource; use common::sense; sub create { my $c = shift; die "500 not implemented"; } sub get_all { my $c = shift; $c->logger->warn("denying access to get_all"); die 403; } sub get_by_id { my $c = shift; my $resource_id = $c->route_params->{resource_id}; die $c->respond({ resource_id => $resource_id, }); } 1;
lib/MyAPI/Task/PasswordHasher.pm
-
package MyAPI::Task::PasswordHasher; use common::sense; use Authen::Passphrase::BlowfishCrypt; use Encode; sub new { my ($class, %args) = @_; my $self = {}; bless $self, $class; open($self->{dev_urandom}, '<:raw', '/dev/urandom') || die "open urandom: $!"; setpriority(0, $$, 19); ## renice our process so we don't hold up more important processes return $self; } sub hash_password { my ($self, $plaintext_passwd) = @_; read($self->{dev_urandom}, my $salt, 16) == 16 || die "bad read from urandom"; return Authen::Passphrase::BlowfishCrypt->new(cost => 10, salt => $salt, passphrase => encode_utf8($plaintext_passwd // '')) ->as_crypt; } sub verify_password { my ($self, $crypted_passwd, $plaintext_passwd) = @_; return Authen::Passphrase::BlowfishCrypt->from_crypt($crypted_passwd // '') ->match(encode_utf8($plaintext_passwd // '')); } 1;
lib/MyAPI/Task/DB.pm
-
package MyAPI::Task::DB; use common::sense; use AnyEvent::Task::Logger; use DBI; sub new { my ($class, $config) = @_; ## $config is your app's config, so you can get username/password/etc my $dbh = DBI->connect("dbi:Pg:dbname=myapi", '', '', {AutoCommit => 0, RaiseError => 1, PrintError => 0, }) || die "couldn't connect to db"; return $dbh; } sub CHECKOUT_DONE { my ($dbh) = @_; $dbh->rollback; } 1;
More documentation can be found in the modules linked in the DESCRIPTION section.
Doug Hoyte, <doug@hcsw.org>
Copyright 2017 Doug Hoyte.
This module is licensed under the same terms as perl itself.