Skip to content

hoytech/Chouette

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NAME

Chouette - REST API Framework

DESCRIPTION

Chouette is a framework for making asynchronous HTTP services. It makes some opinionated design choices, but is otherwise fairly flexible.

AnyEvent is used as the glue to connect all the asynchronous libraries, although Chouette depends on Feersum and therefore EV for its event loop. It uses Feersum in PSGI mode so it can use Plack for request parsing, and has support for Plack::Middleware wrappers. Feersum is the least conservative choice in the stack but there aren't very many alternatives (Twiggy is a possibility but it is somewhat buggy and you need a hack to use unix sockets).

Chouette generally assumes that its input will be application/x-www-form-urlencoded. Plack::Request::WithEncoding is used so that text is properly decoded (we recommend UTF-8 of course). For output, the default is application/json encoded with JSON::XS. Both the input and output types can be modified, although this is only partially documented so far.

Chouette apps can optionally load a config file and its format is YAML, loaded with the YAML module. Regexp::Assemble is used for efficient route-dispatch.

The above aside, Chouette's main purpose is to glue together several of my own modules into a cohesive whole. These modules have been designed to work together and I have used them to build numerous services, some of which handle a considerable amount of traffic and/or have very complicated requirements.

Chouette was extracted from some of these services I have built before, and I have put in the extra effort required so that all the modules work together in the ways they were designed:

AnyEvent::Task

Allows us to perform blocking operations without holding up other requests.

Callback::Frame

Makes exception handling simple and convenient. You can die anywhere and it will only affect the request being currently handled.

Important note: If you are using 3rd-party libraries that accept callbacks, please understand how Callback::Frame works. You will usually need to pass fub {} instead of sub {} to these libraries. See the EXCEPTIONS section for more details.

Session::Token

For random identifiers such as session tokens (obviously).

Log::Defer

Structured logging, properly integrated with AnyEvent::Task so your tasks can log messages into the proper request log contexts.

Note that Chouette also depends on Log::Defer::Viz so log-defer-viz will be available for viewing logs.

Log::File::Rolling

Store logs in files and rotate them periodically. Also maintains a current symlink so you can simply run the following in a shell and you'll always see the latest logs as you need them:

$ log-defer-viz -F /var/myapi/logs/myapi.current.log

Chouette will always depend on AnyEvent::Task, Callback::Frame, Session::Token, and Log::Defer so if your app also uses these modules then it is sufficient to depend on Chouette alone.

Where does the name "Chouette" come from? A chouette is a multi-player, fast-paced backgammon game with lots of stuff going on at once, kind of like an asynchronous REST API server... Hmmm, a bit of a stretch isn't it? To be honest it's just a cool name and I love backgammon, especially chouettes with friends and beer. :)

CHOUETTE OBJECT

To start a server, create a Chouette object. The constructor accepts a hash ref with the following parameters. Most are optional. See the bin/myapi file below for a full example.

config_defaults

This hash is where you provide default config values. These values can be overridden by the config file.

You can use the config store for values specific to your application (it is accessible with the config method of the context), but here are the values that Chouette itself looks for:

var_dir - This directory must exist and be writable. Chouette will use this to store log files and AnyEvent::Task sockets.

listen - This is the location the Chouette server will listen on. Examples: 8080 127.0.0.1:8080 unix:/var/myapi/myapi.socket

logging.file_prefix - The prefix for log file names (default is app).

logging.timezone - Either gmtime or localtime (gmtime is default, see Log::File::Rolling).

The only required config parameters are var_dir and listen (though these can be omitted from the defaults assuming they will be specified in the config file, see below).

config_file

If you want a config file, this path is where it will be read from. The file's format is YAML. The values in this file over-ride the values in config_defaults. If this parameter is not provided then it will not attempt to load a config file and defaults will be used.

routes

Routes are specified as a hash-ref of route paths, mapping to hash-refs of methods, mapping to package+function names or callbacks. For example:

routes => {
    '/myapi/resource' => {
        POST => 'MyAPI::Resource::create',
        GET => 'MyAPI::Resource::get_all',
    },

    '/myapi/resource/:resource_id' => {
        GET => 'MyAPI::Resource::get_by_id',
        POST => sub {
            my $c = shift;
            die "400: can't update ID " . $c->route_params->{resource_id};
        },
    },

    '/myapi/upload' => {
        PUT => 'MyAPI::Upload::upload',
    }
},

For each route, if a package+function name is used it will try to require the package specified, and obtain the function specified for each HTTP method. If the package or function doesn't exists, an error will be thrown.

You can use :param path elements in your routes to extract parameters from the path. They are accessible via the route_params method of the context (see lib/MyAPI/Resource.pm below).

Note that routes are combined with Regexp::Assemble so we don't have to loop over every possible route for every request, in case you have a lot of routes. For example, here is the regexp used for the above routes:

\A/myapi/(?:resource(?:/(?<resource_id>[^/]+)\z(?{2})|\z(?{1}))|upload\z(?{0}))

See the bin/myapi file below for an example.

pre_route

A package+function or callback that will be called with a context and a resume callback. If the function determines the request processing should continue, it should call the resume callback.

See the lib/MyAPI/Auth.pm file below for an example of the function.

middleware

Any array-ref of Plack::Middleware packages. Each element is either a string representing a package+function, or an array-ref where the first element is the package+function and the rest of the elements are the arguments to the middleware.

The strings representing packages can either be prefixed with Plack::Middleware:: or not. If not, it will try to require the package as is and if that doesn't exist, it will try again with the Plack::Middleware:: prefix.

middleware => [
    'Plack::Middleware::ContentLength',
    'ETag',
    ['Plack::Middleware::CrossOrigin', origins => '*'],
    ['Plack::Middleware::RealIP', header => 'X-Real-IP', trusted_proxy => [qw(127.0.0.1/24)], ],
],
tasks

This is a hash-ref of AnyEvent::Task servers/clients to create.

tasks => {
    db => {
        pkg => 'LPAPI::Task::DB',
        checkout_caching => 1,
        client => {
            timeout => 20,
        },
        server => {
            hung_worker_timeout => 60,
        },
    },
},

Route handlers can acquire checkouts by calling the task method on the context object.

checkout_caching means that if a checkout is obtained and released, it will be cached for the duration of the request so if another checkout for this task is obtained, then the original will be returned. This is useful for pre_route handlers that use DBI for example, because we want the authenticate handler to run in the same transaction as the handler (for both correctness and efficiency reasons).

Additional arguments to AnyEvent::Task::Client and <AnyEvent::Task::Server> can be passed in via client and server.

See the bin/myapi, lib/MyAPI/Task/PasswordHasher.pm, and lib/MyAPI/Task/DB.pm files for examples.

quiet

If set, suppress the start-up message which looks like so:

===============================================================================

Chouette 0.100

PID = 31713
UID/GIDs = 1000/1000 4 20 24 27 30 46 113 129 1000
Listening on: http://0.0.0.0:8080

Follow log messages:
    log-defer-viz -F /var/myapi/logs/myapi.current.log

===============================================================================

After the Chouette object is obtained, you should call serve or run. They are basically the same except serve returns whereas run enters the AnyEvent event loop. These are equivalent:

$chouette->run;

and

$chouette->serve;
AE::cv->recv;

CONTEXT OBJECT

For every request a Chouette::Context object is created. This object is passed into the handler for the request. Typically we name the object $c. Your code interacts with the request via the following methods on the context object:

respond

The respond method sends a JSON response, the contents of which are encoded from the first argument:

$c->respond({ a => 1, b => 2, });

Note: After responding, this method returns and your code continues. This is useful if you wish to do additional work after sending the response. However, if you call respond on this context again an error will logged. The second response will not be sent (it can't be since the connection is probably already closed).

If you wish to stop processing after sending the response, you can die with the result from respond since it returns a special object for this purpose:

die $c->respond({ a => 1, });

See the EXCEPTIONS section for more details on the use of exceptions in Chouette.

respond takes an optional second argument which is the HTTP response code (defaults to 200):

$c->respond({ error => "access denied" }, 403);

Note that processing continues here also. If you wish to terminate the processing right away, prefix with die as above, or use the following shortcut:

die "403: access denied";

The client will receive an HTTP response with the Feersum default message ("Forbidden" in this case) and the JSON body will be {"error":"access denied"}.

This works too, except the value of error in the JSON body of the response will just be "HTTP code 403":

die 403;
done

If you wish to stop processing but not send a response:

$c->done;

You will need to send a response later, usually from an async callback. Note: If the last reference to the context is destroyed without a response being sent, the message no response was sent, sending 500 will be logged and a 500 "internal server error" response will be sent.

You don't ever need to call done. You can just return from the handler instead. done is only for convenience in case you are deeply nested in callbacks and don't want to worry about writing a bunch of nested returns.

respond_raw

Similar to respond except it doesn't assume JSON encoding:

$c->respond_raw(200, 'text/plain', 'here is some plain text');
logger

Returns the Log::Defer object associated with the request:

$c->logger->info("some stuff is happening");

{
    my $timer = $c->logger->timer('doing big_computation');
    big_computation();
}

See the Log::Defer docs for more details. For viewing the log messages, check out Log::Defer::Viz.

config

Returns the config hash. See the "CHOUETTE OBJECT" section for details.

req

Returns the Plack::Request object created for this request.

my $name = $c->req->parameters->{name};
res

One would think this would return a Plack::Response object. Unfortunately this isn't yet implemented and will instead throw an error.

generate_token

Generates a random string using a default-config Session::Token generator. The generator is created when the first token is needed so as to avoid a "cold" entropy pool immediately after a reboot (see the Session::Token docs).

task

Returns an AnyEvent::Task checkout object for the task with the given name:

$c->task('db')->selectrow_hashref(q{ SELECT * FROM sometable WHERE id = ? },
                                  undef, $id, sub {
    my ($dbh, $row) = @_;

    die $c->respond($row);
});

Checkout options can be passed after the task name:

$c->task('db', timeout => 5)->selectrow_hashref(...);

See AnyEvent::Task for more details.

EXCEPTIONS

Assuming you are familiar with asynchronous programming, most of Chouette should feel straightforward. The only thing that might be unfamiliar is how exceptions are used.

ERROR HANDLING

The first unusual thing about how Chouette uses exceptions is that it uses them for error conditions, in contrast to many other asynchronous frameworks.

Most asynchronous frameworks are unable to use exceptions to signal errors since an error may occur in a callback being run from the event loop. If this callback throws an exception, there will be nothing to catch it, except perhaps a catch block installed by the event loop. Even if the event loop does catch it, it won't know which connection the exception is for, and therefore won't be able to send a 500 error to that connection or add a message to that connection's log.

Consider the AnyEvent::DBI library. This is how its error handling works:

$dbh->exec("SELECT * FROM no_such_table", sub {
    my ($dbh, $rows, $rv) = @_;

    if ($#_) {
        # success
    } else {
        # failure. error message is in $@
    }
});

Even if exec failed, the callback still gets called. Whether or not it succeeded is indicated by its parameters. You can think of this as a sort of "in-band" signalling. The fact that there was an error, and what exactly that error was, needs to be indicated by the callback's parameters in some way. Unfortunately every library does this slightly differently. Another alternative used by some libraries is to accept 2 callbacks, one of which is called in the success case, and the other in the failure case.

But with both of these methods, what should the callback do when it is notified of an error? It can't just die because nothing will catch the exception. With the EV event loop you will see this:

EV: error in callback (ignoring): failure: ERROR:  relation "no_such_table" does not exist

Even if you wrap an eval or a Try::Tiny try {} catch {} around the code the same thing happens. The try/catch is in effect while installing the callback, but not when the callback is called.

As a consequence of all this, asynchronous web frameworks usually cannot indicate errors with exceptions. Instead, they require you to respond to the client from inside the callback:

$dbh->exec("SELECT * FROM no_such_table", sub {
    my ($dbh, $rows, $rv) = @_;

    if (!$#_) {
        $context->respond_500_error("DB error: $@");
        return;
    }

    # success
});

There are several down-sides to this approach:

  • The error must be handled locally in each callback, rather than once in a catch-all error handler.

  • Everywhere an error might occur needs to have access to the context object. This often requires passing it as an argument around everywhere.

  • You might forget to handle an error (or it might be too inconvenient so you don't bother) and your success-case code will run on garbage data.

  • Perhaps most importantly, if some unexpected exception is thrown by your callback (or something that it calls) then the event loop will receive an exception and nothing will get logged or replied to.

For these reasons, Chouette uses Callback::Frame to deal with exceptions. The idea is that the exception handling code is carried around with your callbacks. For instance, this is how you would accomplish the same thing with Chouette:

my $dbh = $c->task('db');

$dbh->selectrow_arrayref("SELECT * FROM no_such_table", undef, sub {
    my ($dbh, $rows) = @_;

    # success

    # even if I do die here it will get routed to the right request
});

The callback will only be invoked in the success case. If selectrow_arrayref fails, an exception will be raised in the dynamic scope that was in effect when the callback was installed. The same goes if an exception is thrown by the callback passed to selectrow_arrayref. Because Chouette installs a catch handler for each request, the appropriate errors will be sent to the client and added to the Chouette logs.

Important note: Libraries like AnyEvent::Task (which is what task in the above example uses) are Callback::Frame-aware. This means that you can pass sub {} callbacks into them and they will automatically convert them to fub {} callbacks for you (but you can pass in fubs too, that's harmless).

When using 3rd-party libraries, you must pass fub {} instead. Also, you'll need to figure out how the library handles error cases and throw exceptions as appropriate. For example, if you really wanted to use AnyEvent::DBI (even though the AnyEvent::Task version is superior in pretty much every way) this is what you would do:

$dbh->exec("SELECT * FROM no_such_table", fub {
    my ($dbh, $rows, $rv) = @_;

    if (!$#_) {
        die "DB error: $@";
    }

    # success
});

Note that the sub has been changed to fub and an exception is thrown for the error case.

In summary, when installing callbacks you must use fub except when the library is Callback::Frame-aware (in which case fub is optional).

Please see the Callback::Frame documentation for more specifics.

CONTROL FLOW

The other unusual thing about how Chouette uses exceptions is that it uses them for control flow as well as errors.

As you can see in the respond method documentation of the "CHOUETTE OBJECT" section, you can die with the result of the respond method:

die $c->respond({ status => 'ok' });

This works because respond returns a special object specifically intended for this purpose. When it gets an exception, the main catch block checks if it is this object. If so, it just ignores the exception. This lets you terminate your current callback without worrying about returning.

This catch block also checks if your exception starts with 3 digits followed by a word-break. If so, it considers this a special exception intended to send an HTTP response. For example, the following code will send a 404 Not Found response:

die "404: no such resource";

The body of the response will be:

{"error":"no such resource"}

You can even just throw a number:

die 404;

Some people might consider this usage of exceptions to be kind of a hack, but it does make for really nice code if you'll give it a chance.

EXAMPLE

These files represent a complete-ish Chouette application that I have extracted from a real-world app. Warning: untested!

bin/myapi
#!/usr/bin/env perl

use common::sense;

use Chouette;

my $chouette = Chouette->new({
    config_file => '/etc/myapi.conf',

    config_defaults => {
        var_dir => '/var/myapi',
        listen => '8080',

        logging => {
            file_prefix => 'myapi',
            timezone => 'localtime',
        },
    },

    middleware => [
        'Plack::Middleware::ContentLength',
        ['Plack::Middleware::RealIP', header => 'X-Real-IP', trusted_proxy => [qw(127.0.0.1/24)], ],
    ],

    pre_route => 'MyAPI::Auth::authenticate',

    routes => {
        '/myapi/unauth/login' => {
            POST => 'MyAPI::User::login',
        },

        '/myapi/resource' => {
            POST => 'MyAPI::Resource::create',
            GET => 'MyAPI::Resource::get_all',
        },

        '/myapi/resource/:resource_id' => {
            GET => 'MyAPI::Resource::get_by_id',
        },
    },

    tasks => {
        passwd => {
            pkg => 'MyAPI::Task::PasswordHasher',
        },
        db => {
            pkg => 'MyAPI::Task::DB',
            checkout_caching => 1, ## so same dbh is used in authenticate and handler
        },
    },
});

$chouette->run;
lib/MyAPI/Auth.pm
package MyAPI::Auth;

use common::sense;

sub authenticate {
    my ($c, $cb) = @_;

    if ($c->{env}->{PATH_INFO} =~ m{^/myapi/unauth/}) {
        return $cb->();
    }

    my $session = $c->req->parameters->{session};

    $c->task('db')->selectrow_hashref(q{ SELECT user_id FROM session WHERE session_token = ? },
                                      undef, $session, sub {
        my ($dbh, $row) = @_;

        die 403 if !$row;

        $c->{user_id} = $row->{user_id};

        $cb->();
    });
}

1;
lib/MyAPI/User.pm
package MyAPI::User;

use common::sense;

sub login {
    my $c = shift;

    my $username = $c->req->parameters->{username};
    my $password = $c->req->parameters->{password};

    $c->task('db')->selectrow_hashref(q{ SELECT user_id, password_hashed FROM myuser WHERE username = ? }, undef, $username, sub {
        my ($dbh, $row) = @_;

        die 403 if !$row;

        $c->task('passwd')->verify_password($row->{password_hashed}, $password, sub {
            die 403 if !$_[1];

            my $session_token = $c->generate_token();

            $dbh->do(q{ INSERT INTO session (session_token, user_id) VALUES (?, ?) },
                     undef, $session_token, $row->{user_id}, sub {

                $dbh->commit(sub {
                    die $c->respond({ sess => $session_token });
                });
            });
        });
    });
}

1;
lib/MyAPI/Resource.pm
package MyAPI::Resource;

use common::sense;

sub create {
    my $c = shift;
    die "500 not implemented";
}

sub get_all {
    my $c = shift;
    $c->logger->warn("denying access to get_all");
    die 403;
}

sub get_by_id {
    my $c = shift;
    my $resource_id = $c->route_params->{resource_id};
    die $c->respond({ resource_id => $resource_id, });
}

1;
lib/MyAPI/Task/PasswordHasher.pm
package MyAPI::Task::PasswordHasher;

use common::sense;

use Authen::Passphrase::BlowfishCrypt;
use Encode;


sub new {
    my ($class, %args) = @_;

    my $self = {};
    bless $self, $class;

    open($self->{dev_urandom}, '<:raw', '/dev/urandom') || die "open urandom: $!";

    setpriority(0, $$, 19); ## renice our process so we don't hold up more important processes

    return $self;
}

sub hash_password {
    my ($self, $plaintext_passwd) = @_;

    read($self->{dev_urandom}, my $salt, 16) == 16 || die "bad read from urandom";

    return Authen::Passphrase::BlowfishCrypt->new(cost => 10,
                                                  salt => $salt,
                                                  passphrase => encode_utf8($plaintext_passwd // ''))
                                            ->as_crypt;

}

sub verify_password {
    my ($self, $crypted_passwd, $plaintext_passwd) = @_;

    return Authen::Passphrase::BlowfishCrypt->from_crypt($crypted_passwd // '')
                                            ->match(encode_utf8($plaintext_passwd // ''));
}

1;
lib/MyAPI/Task/DB.pm
package MyAPI::Task::DB;

use common::sense;

use AnyEvent::Task::Logger;

use DBI;


sub new {
    my ($class, $config) = @_;

    ## $config is your app's config, so you can get username/password/etc

    my $dbh = DBI->connect("dbi:Pg:dbname=myapi", '', '', {AutoCommit => 0, RaiseError => 1, PrintError => 0, })
        || die "couldn't connect to db";

    return $dbh;
}


sub CHECKOUT_DONE {
    my ($dbh) = @_;

    $dbh->rollback;
}

1;

SEE ALSO

More documentation can be found in the modules linked in the DESCRIPTION section.

Chouette github repo

AUTHOR

Doug Hoyte, <doug@hcsw.org>

COPYRIGHT & LICENSE

Copyright 2017 Doug Hoyte.

This module is licensed under the same terms as perl itself.