-
Notifications
You must be signed in to change notification settings - Fork 0
pawjy/dongry
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
 |  | |||
Repository files navigation
=head1 NAME Dongry::Database - Dongry MySQL database client object =head1 SYNOPSIS use Dongry::Database; my $db = Dongry::Database->new (sources => {default => {dsn => $dsn}, master => {dsn => $dsn, writable => 1}}); $db->execute ('CREATE TABLE table1 (col1 INT, date DATE) ENGINE=InnoDB'); $db->execute ('INSERT INTO table1 (col1) VALUES (?), (?)', [10, 2]); my $transaction = $db->transaction; $db->insert ('table1', [{col1 => 12, date => '2012-01-02'}]); $db->update ('table1', {date => '2001-10-02'}, where => {col1 => {'<', 5}}); $transaction->commit; $db->select ('table1', {col1 => $value})->each_as_row (sub { warn $_->get ('date')->ymd ('/'), "\n"; }); my $table = $db->table ('table1'); # Dongry::Table my $query = $db->query (...); # Dongry::Query =head1 DESCRIPTION B<Dongry> is a lightweight Perl interface to MySQL or MariaDB databases. It provides different levels of abstractions for applications to handle database operations in easy, memorable, and safe ways without stealing full control for how and when SQL queries are executed from them. The C<Dongry::Database> module contains the implementation of the database object, which is the most fundamental and important object in Dongry. All of the features in Dongry can be directly or indirectly accessible from the database object, as described in the following sections. In addition to the B<deprecated> synchronous processing model, Dongry also supports the asynchronous event-based model by using the L<AnyEvent::MySQL::Client> module as its backend. Since both modes can be switched or combined easily, you can transit one from the other, or you can partially change the processing mode, in very low cost. + - - - - - - - - - - - - - - - - - - - - - + : : : Application : : : +---+------------+ : | D | Query obj. | : | o +------------+-----------+ : | n | Schema-aware ops. | : | g +------------------------+------+ : | r | Structured SQL ops. | : | y +-------------------------------+---+ : | | Bare SQL w/named placeholders | : | +-----------------------------------+---+ | | Bare SQL ops. | | +---------------------------------------+ | | Connection management | +---+---------------+-----------------------+ | DBI | | | DBD::mysql |AnyEvent::MySQL::Client| | DBD::MariaDB | | +-------------------+-----------------------+ | ^ | | v | +-------------------------------------------+ | MySQL or MariaDB Database | +-------------------------------------------+ Fig. Various levels of features in Dongry =head1 METHODS OF DATABASE OBJECT The most important object in Dongry is the database object. The database object is an instance of the C<Dongry::Database> class. It has following methods: =head2 Instantiation There are two ways to instantiate database objects. One is the C<new> constructor method, by which you can always obtain a new instance of the database object. The other is the C<load> method to lookup the database object from the registry by the short name for the object. =over 4 =item $db = Dongry::Database->new (%opts) Create and return a new instance of the L<Dongry::Database> database object. Options C<sources>, C<master_only>, C<onconnect>, C<onerror>, C<schema>, and C<table_name_normalizer> can be specified as key/value pairs. See following subsections for their semantics and acceptable values. Example: my $db = Dongry::Database->new (sources => { default => {dsn => ..., username => ..., password => ...}, master => {dsn => ..., username => ..., password => ...}, }, onconnect => sub { my ($self, %args) = @_; warn $args{source_name}, ": Connected"; }); =item $Dongry::Database::Registry->{$db_name} = {...} =item $reg = Dongry::Database->create_registry There is the database definition registry, which can be accessed by the hash reference C<$Dongry::Database::Registry>. In addition, an application can create a local registry object by invoking the C<create_registry> class method. Registries are used with the C<load> method, as described in the next subsection. =item $db = Dongry::Database->load ($db_name) =item $db = $reg->load ($db_name) Instead of creating a new instance each time, a named database object instance can be retrieved by the C<load> method. Before the first invocation with the database name, the application has to register a short name for the database object with its initialization options, as a key/value pair of C<%$Dongry::Database::Registry> hash (for the global registry) or the C<< %{$reg->{Registry}} >> hash (for a local registry). The hash key is the short database name and the hash value is the hash reference containing options used to instantiate the database object. Following options allowed for the C<new> method is also applicable here: C<sources>, C<master_only>, C<onconnect>, C<onerror>, C<schema>, and C<table_name_normalizer>. Alteranatively, some of these values can be specified as code reference which is invoked to get the actual value when an object is initialized, by prefixing C<get_> to the hash key, i.e. C<get_sources>, C<get_onconnect>, C<get_onerror>, C<get_schema>, and C<get_table_name_normalizer>. The code reference will be invoked without any argument. The code reference must return a value that is allowed for the option (e.g. C<get_onconnect> code reference must return an C<onconnect> handler code reference). Then, you can use the short database name as the argument to the C<load> method to get the instance initialized with the specified options. The instantiation happens only once (per registry). Invocations of the method with the same database name always return the same database object instance. You can control the lifetime of the instances by accessing (e.g. clearing or C<local>izing) the C<$Dongry::Database::Instances> hash reference (for the global registry) or the C<< $reg->{Instances} >> hash reference (for a local registry). Example: $Dongry::Database::Registry->{user} = { sources => { master => ..., default => ..., }, onconnect => sub { warn "User connected"; }, get_schema => sub { return My::DatabaseInfo->schema }, }; $db = Dongry::Database->load ('user'); $db2 = Dongry::Database->load ('user'); # same as $db =back =head2 Connections A database object can have multiple (physically different) connections to databases. A connection is identified by a short string known as I<data source name>, e.g. C<default> and C<master>. Multiple connections can be used to adopt to different database server deployment strategies, such as: =over 4 =item A single database server. In this case, the only data source name C<master> should be used. =item A master-slave database server pair. In this case, the data source name C<master> should be used for the master server and C<default> for the slave server. Then the C<default> data source name is used for read-only SQL statements and the C<master> for any other, unless explicitly specified. =item A master-slave database server pair with additional servers for "heavy" queries. In this case, in addition to C<master> and C<default> data source names, any application-specific non-false short string, such as C<heavy> or C<batch>, can be used to identify additional servers. Then such additional source names can be specified to SQL executions as options. =item A complex set of database servers. In this case, any application-specific non-false short string can be used as data source names, but the appropriate data source name must be specified for all SQL executions explicitly. =back Operations in a transaction are always executed with the C<master> data source name. It might also be useful to define synchronous and asyncrhonous data source names for transitional applications that are only partially operated in terms of L<AnyEvent>. =over 4 =item $db->source ($source_name => $source_info) =item $source_info = $db->source ($source_name) =item $db = Dongry::Database->new (sources => {$source_name => $source_info, ...}) Get or set data source information for the specified data source name. It is a hash reference containg following key/value pairs: =over 4 =item dsn => string (REQUIRED) The "dsn" string used to connect the database. See L<DBI>, <https://metacpan.org/pod/distribution/DBD-mysql/lib/DBD/mysql.pm#connect>, and <https://metacpan.org/dist/DBD-MariaDB/view/lib/DBD/MariaDB.pod#connect> for more information. =item username => string The user name used to connect the database. This is optional if the database or the database driver module does not require user name. If both of C<username> and C<dsn>'s C<user> are specified, how they are used is database driver dependent. For the L<AnyEvent::MySQL::Client> mode, C<username> is used if defined. =item password => string The password used to connect the database. This is optional if the database or the database driver module does not require password. If both of C<password> and C<dsn>'s C<password> are specified, how they are used is database driver dependent. For the L<AnyEvent::MySQL::Client> mode, C<password> is used if defined. =item writable => boolean Whether the database is writable or not. If not writable, the database module refuse to issue write operations as far as possible. Default is false (i.e. not writable). =item anyevent => boolean Whether the asynchronous mode using L<AnyEvent::MySQL::Client> should be enabled or not. This option should be set to true (i.e. the asynchronous mode should be used). =item label => string Informative name of the data source. It is not used to connect to the data source. This value might or might not be useful for debugging. Default is same as C<dsn>. =item sql_comment => string Comment string to be embedded at the end of the SQL statements. As described in L</"INTERPREATAION OF STRINGS">, non-ASCII characters, as well as utf8-flaged strings, should not be specified. This option might be useful for debugging and loggin purposes. =back Although it is in theory possible to modify the hash reference after it is set to the database object, such modifications might not be reflected to the actual connection, depending on the exact timing of the modification and establishment of the connection. Therefore, you should not try to modify these values once it is set to the database object. TLS or SSL can be used to connect to the server by specifying C<mysql_*> or C<mariadb_*> options in C<dsn>. However, please note that old versions of L<DBD::mysql>, which is used in the synchronous mode through L<DBI> with dsn C<dbi:mysql:>, might not be compiled with SSL-support enabled, in which case L<DBD::mysql> silently ignores C<mysql_*> options (!). Once connected, the application can determine whether TLS is used or not by the following code: $tls_enabled = $db->execute ('SHOW STATUS LIKE "Ssl_cipher"') ->first->{Value} ne ''; =item $db->onconnect ($code) =item $code = $db->onconnect =item $db = Dongry::Database->new (onconnect => $code) Get or set the code reference which is invoked when a new connection is established. Use of the C<onconnect> handler is B<deprecated>. The code reference, when invoked, receives the "self" argument followed by zero or more key/value pairs. The "self" argument is the database object. The key/value pairs would contain: =over 4 =item source_name => string (ALWAYS) The name of the data source used to establish the connection. =back The default C<onconnect> handler does nothing. Please be aware the C<onconnect> handler is not safe place to execute SQL statements in general because it could result in establishing another connection within the C<onconnect> handler. It is, however, safe to execute an SQL statement using the same data source name as given by the C<source_name> option. It might be an interesting use case of this handler to execute some configuration statements (e.g. C<SET time_zone> statement) for the connection in the handler. Strictly speaking, there might not be a physical connection to the database server when the C<onconnect> handler is invoked. In particular, if the C<anyevent> option of the data source is set, the C<onconnect> handler is invoked after the L<AnyEvent::MySQL::Client> object is instantiated, but the actual connection might not been established at that time. You can assume that the C<onconnect> handler is invoked before any SQL statements or start of transactions are queued to execute with the connection, even when the C<anyevent> option is true. Example: $Dongry::Database::Registry->{hoge} = { ... onconnect => sub { my ($self, %args) = @_; $self->execute ('set time_zone = "+00:00"', [], even_if_read_only => 1, ## This is important! source_name => $args{source_name}); }, }: =item $db->onerror ($code) =item $code = $db->onerror =item $db = Dongry::Database->new (onerror => $code) Get or set the code reference which is invoked when an error is detected during various operations, such as connection to databases or execution of queries. Use of the C<onerror> handler is B<deprecated>. The code reference, when invoked, receives the "self" argument followed by zero or more key/value pairs. The "self" argument is the database object. The key/value pairs would contain: =over 4 =item source_name => string (ALWAYS) The name of the data source for the relevant database connection. =item text => string A short string describing the situation, if available. This is typically an error message from the underlying L<DBI> or database driver modules. =item sql => string The SQL statement being executed, if any. =item file_name => string =item line => number The location of the error. These values are only available in asynchronous mode. In synchronous mode, the L<Carp> module is useful to detect the error location. In asynchronous mode, however, the module only denotes somewhere in the implementation of the event loop. =back The default C<onerror> handler prints the error message to the standard error output. Please don't do anything much more complex than just reporting the error to someone else. In particular, don't try to execute another SQL statement in the error handler, as it would likely result in another SQL execution error, which would recursively invoke the same error handler. Error handling in synchronous (L<DBI> and L<DBD::mysql> or L<DBD::MariaDB>) mode and asynchronous (L<AnyEvent::MySQL::Client>) mode are different in several ways. In the synchronous mode, errors are also handled in synchronously. If the connection or an SQL statement causes an error because of a method invocation, any C<onerror> handler is invoked and then an exception is thrown within the method. If you specify an error handler, you can throw an exception (or simply C<die>) within the handler. Please note that the C<cb> (callback) function given to the method to access the database, if any, is not invoked for the error in synchronous mode. You can continue to use the connection to execute subsequent SQL statements, if desired. (Please note that if the error is fatal and permanent, the subsequent executions would also result in error. In such situation, you might want to invoke the C<disconnect> method explicitly to reestablish the connection.) In the asynchronous mode of L<AnyEvent::MySQL::Client>, as SQL statements are executed asynchronously, errors are also reported asynchronously. If the method to access to the database accepts the C<cb> (callback) option, the callback function is invoked even if there is an error, but with the argument describing the error (See the documentation of the C<execute> method for details). Then, the C<onerror> handler of the database object is invoked for the error. No exception is thrown by Dongry. The C<onerror> handler and C<cb> functions should not throw any exception in asynchronous mode. That means that if another statement is enqueued after an errorneous statement, that statement I<IS> executed after the error. If it is not desired, the result of the previous statement must be checked before the next statement is enqueued. =item $db->connect ($source_name, %opts) Connect to the data source whose name is specified by the argument, if any, or the C<default> data source, otherwise. If there is already a connection established for the specified data source, the method does nothing. After a connection has been established, the C<onconnect> handler is invoked. You don't have to invoke this method explicitly in general. This method is automatically invoked at the first time an operation that requires the connection is enforced. Returning from the method does not necessarily mean the physical connection to the database has been established. In particular, if the asynchronous mode is used for the connection, the method only does instantiate the L<AnyEvent::MySQL::Client> method and the actual connection is established asynchronously. In the asynchronous mode, the returned object is I<thenable>. The following option is available: =over 4 =item cb => code-reference (DEPRECATED) The callback function invoked when the connection has been established (or immediately if the connection is already established). In the synchronous mode, the code reference is synchronously invoked when the connection has been successfully established. In the asynchronos mode, the code reference is asynchronously invoked when the connection has been successfully or erronerously established, except when the C<connect> method is invoked within the C<onconnect> handler for the same C<source_name>, in which case the callback is synchronously invoked. The code reference receives two arguments: the database object and a return object which contains whether the connection has been successfully established (C<< $_[1]->is_success >>) or not (C<< $_[1]->is_error >>). The code reference should not throw any exception. See the C<onerror> method for rationale. In the asynchronous mode, the C<then> method of the returned object should be used instead of the callback function. =back =item $db->disconnect ($source_name, %opts) If a data source name is specified, disconnect the connection for the data source. Otherwise, disconnect all the connections hold by the database object. If there is no connection for the data source, the method does nothing. Any B<synchronous> connection is disconnected immediately. If there is an uncommitted synchronous transaction, it is rollbacked. Any B<asynchronous> connection is disconnected once it becomes free. That is, any ongoing or queued SQL statement or asynchronous transaction is still executed after the C<disconnect> method call. The application should explicitly invoke this method to grecefully shutdown the connection to the server. Otherwise some of the ongoing or queued statements might be discarded by the client or by the server. In the asynchronous mode, the returned object is I<thenable>. The following option is available: =over 4 =item cb => code-reference (DEPRECATED) The callback function invoked when the connection has been closed (or immediately if the connection is already closed). In the synchronous mode, the code reference is synchronously invoked when the connection has been closed. In the asynchronos mode, the code reference is asynchronously invoked when the connection has been closed. The code reference receives a arguments: the database object. The code reference should not throw any exception. See the C<onerror> method for rationale. In the asynchronous mode, the C<then> method of the returned object should be used instead of the callback function. =back =back By default SQL executions are automatically committed (i.e. C<AutoCommit> option of the L<DBI> object is set to true). Use of transaction object can override this behavior: =over 4 =item $transaction = $db->transaction (when master is in synchronous mode) If the source C<master> is in the synchronous mode, the method starts a transaction and returns the synchronous transaction object for the transaction. The synchronous transaction object is a "guard" object; The transaction is effective as long as the object is live. Once you get a synchronous transaction object, you can invoke SQL operations as usual, through various methods of Dongry. In addition, the C<lock> option for C<select> and other retrieval methods becomes available in the transaction. Within a transaction you can only use the C<master> data source. The default data source is C<master>, not C<default>, within a transaction. =item $db->transaction->then (sub { $transaction = $_[0] }) (when master is in asynchronous mode) If the source C<master> is in the asynchronous mode, the method returns a promise that is fulfillwed after a transaction is started. The promise is fulfilled with an asynchronous transaction object for the transaction. =back Once you get an B<asynchronous> transaction object, you can invoke SQL operations using the methods of the asynchronous transaction object: =over 4 =item $result = $transaction->execute (...) =item $result = $transaction->select (...) =item $result = $transaction->insert (...) =item $result = $transaction->update (...) =item $result = $transaction->delete (...) Send an SQL statement. These methods are equivalent to the methods of the Dongry database object but SQL statements are associated with the transaction. The C<source_name> option is not allowed. They are always assumed as C<master>. The C<select> method supports the C<lock> option. =back After execution of zero or more SQL statements, you have to commit or rollback the transaction to save or discard the modifications, using a method of the synchronous or asynchronous transaction object: =over 4 =item $transaction->commit (%opts) Commit the transaction. Changes will be persisted by the database. The C<cb> option is available. See C<execute> method for more information. The result object for the C<commit> method only contains whether there is an error or not and description for the error, if any. =item $transaction->rollback (%opts) Rollback the transaction. Changes will be discarded by the database. The C<cb> option is available. See C<execute> method for more information. The result object for the C<rollback> method only contains whether there is an error or not and description for the error, if any. =item $string = $transaction->debug_info Return a string that shortly describes the object. It might or might not be useful for debugging. =back Exactly one of C<commit> and C<rollback> methods must be invoked to terminate the transaction. Once one of them are invoked, the synchronous or asynchronous transaction object lost its effect, even if it has not been destroyed by Perl. This is different from usual "guard" objects, including the "force source" object described below. If you forgot to commit or rollback the changes, they will be discarded when the synchronous or asynchronous transaction object is destroyed by Perl. Don't rely on this behavior since it is not guaranteed by Perl exactly when an object is destroyed and the rollback statement might not be successfully sent to the server depending on how the event loop and object destruction interact together. After the C<commit> or C<rollback> method is invoked on a B<synchronous> transaction object, you can start another synchronous transaction session by calling the C<< $db->transaction >> method again. A new synchronous transaction cannot be initiated while there is another synchronous transaction. During there is an B<asynchronous> transaction object, the Dongry database object's statement methods are also available but they are I<not> associated with the transaction. Such statements are sent to the server after the conclusion of the transaction. Likewise, a new asynchronous transaction can be started even when there is anothor ongoing transaction. When you just want to use a particular data source for some operations, you can use the "force source" feature: =over 4 =item $force = $db->force_source_name ($source_name) Return a "guard" object that forces the use of a data source specified as the argument. While the I<$force> object is in scope (or before the C<end> method of the object is invoked), the C<execute> method and the other methods to execute SQL statements act as if the C<source_name> option with the given source name is specified, except that if the C<source_name> is explicitly specified to use another data source for the method, the method would report an error instead. It is an error to call this method while there is an active I<$force> object. In other word, you cannot force a data source while another data source is forced. Likewise, you cannot start a transaction when a data source is forced; and you cannot force a data source within a transaction. The transaction automatically forces the C<master> data source. This method is not supported in asynchronous mode. =back The "force" object has following methods: =over 4 =item $force->end End to force the data source. Although this method is automatically called when the I<$force> object is destroyed by Perl, it is encouraged to call this method explicitly as Perl does not guarantee the exact timing for the destruction of an object. =item $string = $force->debug_info Return a short string that might or might not be useful for debugging. =back The scope of the forcing object affects the queueing, not the actual execution in asynchronous mode, as for the transaction object. =head2 Bare SQL operations The bare SQL execution API of Dongry, i.e. the C<execute> method, executes SQL statements provided by the application as is, with few modifications. It would be useful if you'd like to issue a complex or rarely-used kind of SQL statement and if you really and completely understand what you are doing. For common operations such as insertion and selection, please consider using more abstract APIs described in following sections. =over 4 =item $result = $db->execute ($sql, [$value1, $value2, ...], %opts) =item $result = $db->execute ($sql, {param1 => $value1, ...}, %opts) Execute an SQL statement. B<SQL with anonymous placeholders>: The first argument is the SQL statement to execute, which possibly contains placeholders (C<?>). The second argument is an array reference, which contains values bound to the placeholders. The second argument can be omitted (i.e. can be C<undef>) if there are no placeholders. B<SQL with named placeholders>: The first argument is the SQL statement to execute, which possibly contains named placeholders such as C<:params>. The second argument is a hash reference, which contains name-value pairs bound to the placeholders. For more information on the I<named placeholders>, see L<Dongry::SQL>. In addition, following key/value pairs can be specified as options: =over 4 =item source_name => string The name of the data source used to select the database connection. If there is no connection established for the data source, the C<connect> method with the source name is automatically invoked and then the SQL statement is executed over that connection. If this option is not specified, an appropriate data source is determined. In general the C<master> data source is chosen. However, if the SQL is known to read-only, i.e. it begins by C<SELECT>, C<DESC>, or C<SHOW>, the C<default> data source is chosen. Note that transactions and "force source" objects would modify this default. =item must_be_writable => boolean If this option is set to true, the method would raise an error without executing the SQL statement when the selected data source is I<not> marked as I<writable>. If this option is set to true and the C<source_name> option is I<not> specified, the data source is set to C<master> even if the SQL statement is considered as read-only. =item even_if_read_only => boolean If this option is set to true, the method would not raise an error even when the SQL statement is not considered as read-only but the selected data source is I<not> marked as I<writable>. This option would be useful to issue an C<SET> statement to set configuration options for the connection, as the method does not consider the C<SET> statements as read-only but it should be applied even to a read-only data source. You should not specify both C<must_be_writable> and C<even_if_read_only> options at the same time. =item table_name => table-name If this option is set to a value, it is used as the value of the result object's C<table_name> property, if the SQL execution has been succeeded. Otherwise, the result object returned by the C<execute> method has the C<table_name> value of C<undef>, which means that C<*_as_row> methods on the result object and the C<each_as_row_cb> option of this method cannot be used. =item each_cb => code-reference =item each_as_row_cb => code-reference The callback functions that is invoked whenever rows are received from the server. They behave as if the argument of the C<each> or C<each_as_row> method of the result object returned by the C<execute> method. In asynchronous mode, as C<each> and C<each_as_row> methods of the return object are not available, these callbacks must be used. In synchronous mode, both those methods and these callbacks are available, but if callbacks are specified, methods are disabled. Anyway, at most one of these callbacks can be specified. The callback is not expected to throw any exception. =item cb => code-reference (DEPRECATED) The callback function invoked when the SQL has been executed. In the synchronous mode, the code reference is synchronously invoked when the SQL has successfully been executed. In the asynchronous mode, the code reference is asynchronously invoked when the SQL has been executed, successfully or erroneously. The code reference receives two arguments: the database object and the result object that would be returned by the method in synchronous mode. In asynchronous mode, the result object describes the error. The code reference should not throw any exception in asynchronous mode. See the C<onerror> method for rationale. In the asynchronous mode, the C<then> method of the returned object should be used instead of the callback function. =back The method would return a result object which is I<not> bound to any table. The number of affected rows (or the rows to be fetched) can be retrieved from the C<row_count> method of the result object. If the SQL statement returns rows (e.g. C<SELECT> or C<SHOW>), they are accessible from various methods of the result object. The method does not return a value in asynchronous mode. In the asynchronous mode, the returned object is I<thenable>. =back =head2 Structured SQL operations Methods described in this section construct an SQL statement from the given arguments and then execute it. Commonly-used patterns of SQL statements such as C<SELECT> and C<INSERT> are supported by these methods such that an application does not have to concatenate, quote, or otherwise modify bare SQL statements by itself in most cases. =over 4 =item $result = $db->insert ($table_name, [$values1, $values2, ...], %opts) Insert one or more rows into the table specified by the first argument. The second argument must be an array reference which contains one or more hash references, each of them is a row to be inserted (represented as column name/value pairs). Unspecified columns will be set to their default values by the database. In addition, following options can be specified as key/value pairs: =over 4 =item source_name => string The name of the data source used to select the database connection. For more information on how data source is selected, see the description for the C<execute> method. =item duplicate => "ignore" / "replace" / arrayref / hashref How "duplicate entry" error is handled by the database. If C<ignore> is specified, an C<INSERT IGNORE> statement is issued such that any duplicate insertion will be ignored without errors being reported. Please note that this statement might ignore any other possible error as well. If C<replace> is specified, an C<REPLACE> statement is issued such that any duplicate rows will be discarded before the insertion. If an array or hash reference is specified, the C<ON DUPLICATE KEY UPDATE> clause is appended to the C<INSERT> statement with the values specified in the array or hash. The format of the hash reference is same as the second argument of the C<update> method. An array whose content is equal to a hash can be specified to preserve order of key/value pairs. Note that the order of SQL fragments are significant when they contain column names. If this option is not specified, then a normal C<INSERT> statement is issued such that any duplication would be reported as an SQL execution error through the C<onerror> handler. =item cb => code-reference (DEPRECATED) The callback function invoked after the execution of the SQL statement. See the C<execute> method for more information. In the asynchronous mode, the C<then> method of the returned object should be used instead of the callback function. =back This method, in the synchronous mode, returns a result object bound to the table specified as the first argument. The number of affected rows can be retrieved from the C<row_count> method of the result object. Althugh inserted rows can be accessible from various methods of the result object, please note that the object only holds the data as received as the argument to the C<insert> method. It might not reflect any SQL-level default values, auto-increment values, server-side value normalization, handling of C<duplicate> options, use of bare SQL fragments, and so on. If you really like to know what is inserted (or not inserted) to the database, you have to select the inserted row(s) by yourself. In the case you have explicitly specified the primary key values in the argument, the C<reload> method of table row objects obtained from the result object might be useful for this purpose. Example: $db->insert ('mytable', [ {id => 12, name => 'Foo', date => 0}, {id => 13, name => 'Bar'}, {id => 14, name => undef, date => '2012-03-01'}, ]); # INSERT INTO `mytable` (`id`, `name`, `date`) VALUES # ('12', 'Foo', '0'), # ('13', 'Bar', DEFAULT), # ('14', NULL, '2012-03-01') In the asynchronous mode, the returned object is I<thenable>. =item $result = $db->select ($table_name, $where, %opts) Select rows from the table specified as the first argument. Conditions to select rows, i.e. the C<WHERE> clause of the SQL statement, must be specified as the second argument, in that format described in L<Dongry::SQL>. In addition, following options can be specified as key/value pairs: =over 4 =item source_name => string The name of the data source used to select the database connection. For more information on how data source is selected, see the description for the C<execute> method. =item must_be_writable => boolean If this option is set to true, the method would raise an error without executing the SQL statement if the selected data source is marked as I<writable>. For more information on how this option applies, see the description for the C<execute> method. =item distinct => boolean If this options is set to true, the C<DISTINCT> keyword is put in the statement such that only a row is returned for a duplicate set of rows. =item fields => fields-struct The columns to retrieve, to be specified just after the C<SELECT> keyword of the issued SQL statement. The value must be a fields struct, as described in L<Dongry::SQL>. Examples: [undef, 'c1'] # *, `c1` [{-count => undef}, 'c1', 'c2'] # COUNT(*), `c1`, `c2` [{-count => 'c1', as => 'c', distinct => 1}] # COUNT(DISTINCT `c1`) AS `c` =item group => [column1, column2, ...] The columns to use grouping, i.e. column names for the C<GROUP BY> clause in the SQL statement. Column names must be specified as an array reference of strings. =item order => [column1 => direction, column2 => direction, ...] The columns and their orders for the sorting, i.e. values for the C<ORDER BY> clause in the SQL statement. The value must be a order struct, as described in L<Dongry::SQL>. Examples: [col1 => 1, col2 => -1] # `col1` ASC, `col2` DESC [col1 => 'ASC', col2 => 'DESC'] # `col1` ASC, `col2` DESC =item offset => number, limit => number The offset and the number of rows to retrieve, i.e. values for the C<LIMIT> clause in the SQL statement. They must be specified as non-negative integers. If C<offset> is specified but C<limit> is not specified, C<limit> is defaulted to C<1>. Otherwise, the default for C<offset> is zero and the default for C<limit> is inifinity. =item lock => update/share The lock mode for the C<SELECT> operation. If the value is C<update>, C<FOR UPDATE> option is included in the C<SELECT> statement. If the value is C<share>, C<LOCK IN SHARE MODE> option is included in the C<SELECT> statement. =item each_cb => code-reference =item each_as_row_cb => code-reference The callback function invoked whenever rows are received from the server. See the C<execute> method for more information. =item cb => code-reference (DEPRECATED) The callback function invoked after the execution of the SQL statement. See the C<execute> method for more information. In the asynchronous mode, the C<then> method of the returned object should be used instead of the callback function. =back This method returns a result object bound to the table specified as the first argument. The number of available rows can be retrieved from the C<row_count> method of the result object. The selected rows can be accessible from one of C<all>, C<each>, and C<first> methods, or their C<_as_row(s)> variants C<all_as_rows>, C<each_as_row>, and C<first_as_row>. Examples: my $result = $db->select ('table1', {col1 => 'hoge', col2 => 123, date => {'<=', '2001-02-02'}}, order => [date => -1, col1 => 1, col2 => -1]); # SELECT * FROM `table1` # WHERE `col1` = 'hoge' AND `col2` = '123' AND # `date` <= '2001-02-02' # ORDER BY `date` DESC, `col1` ASC, `col2` DESC; $result->each (sub { my $values = shift; warn $values->{date}, "\t", $values->{col1}, "\n"; }); my $result = $db->select ('table1', {col1 => 'hoge'}, fields => [{-count => undef, distinct => 1, as => 'count'}, 'col1', 'col2'], group => ['col2']); # SELECT COUNT(DISTINCT *) AS `count` FROM `table1` # WHERE `col1` = 'hoge' GROUP BY `col2` $result->all_as_rows->each (sub { my $row = $_[0]; warn $row->get ('col1'), ",", $row->get ('col2'); warn $row->get ('count'); }); In the asynchronous mode, the returned object is I<thenable>. Example: $db->source (ae => {dsn => $dsn, anyevent => 1}); my $cv = AnyEvent->condvar; $db->select ('foo', {user_id => $uid}, source_name => 'ae')->then (sub { print $_[0]->all->map (sub { $_->get ('name') })->join ("\t"); })->catch (sub { warn $_[0]; })->then (sub { $cv->send; }); $cv->recv; =item $result = $db->update ($table_name, $values, where => $where, %opts) Update rows in the specified table using the SQL C<UPDATE> statement. The new values must be specified by the second argument, as hash reference containing key/value pairs representing the column names and their new values. A value can be specified as a scalar value (a text, number, or C<undef> value), or a bare SQL fragment object (a return value of the C<bare_sql_fragment> method), which would be inserted to the value portion of the SQL statement. The conditions to specify the updated rows, i.e. the C<WHERE> clause, must be specified as the thrid argument, in the same format as the C<select> method. In addition, following options can be specified as key/value pairs: =over 4 =item source_name => string The name of the data source used to select the database connection. For more information on how data source is selected, see the description for the C<execute> method. =item duplicate => "ignore" Specifies how "duplicate entry" error is handled by the database. If C<ignore> is specified, an C<UPDATE IGNORE> statement is issued such that any attempt to generate duplicate rows is ignored without errors being reported. Please note that this statement might ignore any other possible errors as well. If this option is not specified, then a normal C<UPDATE> statement is issued such that any duplication would be reported as an SQL execution error through the C<onerror> handler. =item order => [column1 => direction, column2 => direction, ...] The columns and their orders for the sorting, i.e. values for the C<ORDER BY> clause in the SQL statement. The value must be a order struct, as described in L<Dongry::SQL>. This option would have no effect unless there is also the C<limit> option. =item limit => number The maximum number of rows to update, i.e. the value for the C<LIMIT> clause in the SQL statement. It must be specified as a non-negative integer. If the option is not specified, all the matching rows will be updated. Unlike the C<select> method, the C<offset> option is not available for this method. =item cb => code-reference (DEPRECATED) The callback function invoked after the execution of the SQL statement. See the C<execute> method for more information. In the asynchronous mode, the C<then> method of the returned object should be used instead of the callback function. =back This method returns a result object bound to the table specified as the first argument. The number of affected rows can be retrieved from the C<row_count> method of the result object. Methods such as C<all>, C<each>, and C<first> are not available for the result object generated by this method. Example: $db->update ('table1', {col1 => 12, col2 => $db->bare_sql_fragment ('col2 + 2')}, where => {created => {'<=', '2012-01-01 00:00:00'}}); # UPDATE `table1` SET `col1` = 12, `col2` = col2 + 2 # WHERE `created` <= '2012-01-01 00:00:00' In the asynchronous mode, the returned object is I<thenable>. =item $baresql = $db->bare_sql_fragment ($sql_fragment) Return a bare SQL fragment object for the given SQL statement fragment. In general, values specified for the C<update> or C<select> methods are interpreted as is, possibly stringified. However, you might sometimes want to specify a SQL expression such as C<columnname + 1> or C<NOW()>. You can specify such a SQL fragment by wrapping it by the bare SQL fragment object using this method. Bare SQL fragment objects can be specified for the values of C<update> method, and the C<duplicate> values option of the C<insert> method. CAUTION! SQL statement fragment is inserted into the SQL statement constructed by those methods as is without any modifications or validations. It means that you have to prevent the fragment from syntax error, SQL injection vulnerability, or any other possible errors. =item $db->delete ($table_name, $where, %opts) Delete rows in the specified table. The second argument must specify the conditions to select the rows to delete, in the same format as the second argument of the C<select> method. In addition, following options can be specified as key/value pairs: =over 4 =item source_name => string The name of the data source used to select the database connection. For more information on how data source is selected, see the description for the C<execute> method. =item order => [column1 => direction, column2 => direction, ...] The columns and their orders for the sorting, i.e. values for the C<ORDER BY> clause in the SQL statement. The value must be a order struct, as described in L<Dongry::SQL>. This option would have no effect unless there is also the C<limit> option. =item limit => number The maximum number of rows to delete, i.e. the value for the C<LIMIT> clause in the SQL statement. It must be specified as a non-negative integer. If the option is not specified, all the matching rows will be deleted. Unlike the C<select> method, the C<offset> option is not available for this method. =item cb => code-reference (DEPRECATED) The callback function invoked after the execution of the SQL statement. See the C<execute> method for more information. In the asynchronous mode, the C<then> method of the returned object should be used instead of the callback function. =back This method returns a result object bound to the table specified as the first argument. The number of affected rows can be retrieved from the C<row_count> method of the result object. Methods such as C<all>, C<each>, and C<first> are not available for the result object generated by this method. Example: $db->delete ('table1', {created => {'<=', '2012-01-01 00:00:00'}}); # DELETE FROM `table1` WHERE `created` <= '2012-01-01 00:00:00' In the asynchronous mode, the returned object is I<thenable>. =item $db->set_tz ($tz_offset, source_name => $source_name, cb => $cb) Set the time zone for the database connection, using C<SET time_zone> statement. The time zone offset argument, which is default to UTC, must be in the format understood by the database, i.e. C<+01:00>. The C<source_name> option, defaulted to the C<master>, specifies the data source for which the time zone is set. The C<cb> (callback) function, if specified, is invoked after the execution of the SQL statement. See the C<execute> method for more information. The C<cb> function is B<deprecated>. The default time zone used by the database when this statement is not issued depends on the configuration and the implementation of the database in use. Example. Setting the time zone used in the current connection to the database: $db->onconnect (sub { my ($self, %args) = @_; $self->set_tz ('+00:00', source_name => $args{source_name}); }); =item $db->uuid_short ($n, source_name => $source_name)->then (...) Get the result of the C<UUID_SHORT()> function, repeatedly invoked I<$n> times. This method is equivalent to the C<execute> method with SQL statement C<SELECT UUID_SHORT()> where the function is repeated as specified by the argument. This method does not work in the synchronous mode. In the asynchronous mode, the returned object is I<thenable>. It is to be fulfilled with an array reference of the obtained values (i.e. I<$n> UUID short numbers). The C<source_name> option, defaulted to the C<master>, specifies the data source over which the SQL statement is executed. =back =head2 Schema-aware operations The following methods provide access to schema-aware APIs. What are schema-aware APIs? Methods mentioned in the previous sections are not schema-aware. That is, no SQL data is handled differently from strings for the purpose of those methods. The schema-aware APIs are built on top of them to offer an abstract layer for convertion of database and Perl data types. For example, if you associated the C<created> column (declared as SQL C<TIMESTAMP> data type in the database schema) with the C<timestamp_as_DateTime> type handler (defined in L<Dongry::Type::DateTime>), you can use a Perl L<DateTime> object to set the value for the column through a schema-aware method, or you can retrieve the column value as a C<DateTime> object from a schema-aware method. =over 4 =item $db->schema ($schema) =item $schema = $db->schema =item $db = Dongry::Database->new (schema => $schema) Get or set the schema definition as used by the database object. The schema definition has to be set before any schema-aware operation is actually invoked. Please note that although the I<schema> concept of Dongry is modelled after the schema of SQL and database implementations, they have no direct relationship in fact. You can declare a C<INT> column in SQL as C<date> in Dongry, even though such a broken mapping would not be useful. If you changed the schema in the database, you might also have to modify the schema definition for Dongry. Although this could be inconvinient for some use cases, it would give an application great control over how they handle data. A I<schema> is just a hash reference, where key/value pairs are table names and table schema for them. A I<table schema> is also a hash reference with following key/value pairs: =over 4 =item type => type-definition Define the type handler for columns in the table. The value must be a hash reference containing key/value pairs representing column names and its types. A type is a string as described in L<Dongry::Type>. Types are used to intepret column values in schema-aware operations such as C<find> and C<create> methods of the table object. =item primary_keys => [$column1, $column2, ...] Define the primary keys for the table. The value must be an array reference containing the column names used as the columns of primary keys for the table. Primary keys are used to identify the row in operations of the table row object, such as C<update> and C<delete>. The specified columns do not have to be defined as primary keys in the actual database schema in fact. However, there should at least unique keys for the columns, or the operations over the row object would cause unexpected results. =item default => default-definition Define the default values for columns in the table. The value must be a hash reference whose key/value pairs representing column names and their default values. A default value can be specified as the code reference, in which case the code reference would be invoked each time the default value is necessary. The code reference is invoked without any argument. It is expected to return the default value. In any case, the default value is interpreted as in the data type of the column, as defined by the C<type> clause. Default values are used when rows are inserted through a schema-aware method such as the C<create> method of the table object. =back Even though all of these keys in the table schema is optional, the table schema it self must be specified in the schema to enable schema-aware APIs for the table. =item $db->table_name_normalizer ($code) =item $code = $db->table_name_normalizer =item $db = Dongry::Database->new (table_name_normalizer => $code) Get or set the code to normalize table names for the purpose of table schema lookup. Sometimes a number of tables in the database share the same table structure. For example, consider a database where C<hoge_1>, C<hoge_2>, C<hoge_3>, and other tables share the common definition. Rather than repeating the table schema definitions for all tables, you have to register just a table definition by setting the following table name normalizer code: $db->table_name_normalizer (sub { my $name = shift; $name =~ s/_[0-9]+$/_n/; return $name; }); $db->schema ({ foo_n => { ... }, bar_n => { ... }, }); $db->table ('foo_2') # -> foo_n $db->table ('bar_130') # -> bar_n =item $table = $db->table ($table_name) Create a table object for the specified table. The table object provides schema-aware APIs for the table. For more information on the table object see documentation for the L<Dongry::Table> module. =item $query = $db->query (%opts) Create a query object with the specified options. The query object is an abstraction for a particular schema-aware C<SELECT> operation. If the C<query_class> option is specified, the class is used to instantiate the query object. Otherwise, the L<Dongry::Query> class is used. For more information, including the list of the other available options, see documentation for the L<Dongry::Query> module. =back =head2 Development There is a method to facilitate development of applications: =over 4 =item $string = $db->debug_info Return a not-so-long string to describe the database object. It might or might not be useful for debugging. =back You might want to dump the SQL statements issued by Dongry for the purpose of development. Though Dongry itself does not have such debugging feature. Perl modules such as L<DBIx::ShowSQL>, L<Devel::KYTProf>, and L<DBIx::QueryLog> for synchronous mode, or L<AnyEvent::MySQL::Client::ShowLog> for asynchronous mode, would be useful. If the C<SQL_DEBUG> environmental variable has a true value, Dongry loads the L<DBIx::ShowSQL> or L<AnyEvent::MySQL::Client::ShowLog> module. If the C<SQL_DEBUG> environmental variable contains substring C<embed_caller>, or if the C<$Dongry::Database::EmbedCallerInSQL> variable has a true value, Dongry embeds the file name and the line number of the method invocations that request SQL executions into the SQL statement, as an SQL comment. =head1 METHODS OF RESULT OBJECT SQL execution methods of the database object, such as C<execute>, C<select>, C<insert>, and so on, return the result object. In the synchronous mode, the object returned by the method itself contains the result of the method. In the asynchronous mode, the object returned by the method does not contain the result, as it is not yet available. If the returned object is referred to as I<thenable>, the returned object has the C<then> method. Callback functions registered by the C<then> method will receive the actual result object of the original method as the argument. The callback function specified by the C<cb> option also receives a result object as the second argument in many methods. Such an result object also contains the result of the method. A result object holds the execution result of the SQL statement, which can be accessible from following methods: =over 4 =item $boolean = $result->is_success =item $boolean = $result->is_error Return whether the result is success or error. =item $string = $result->error_text If the result is error, return a short text that describes the error. If no more information is available, or if the result is not error, C<undef> is returned. =item $string = $result->error_sql If the result is error during the execution of an SQL statement, return the SQL statement that is being executed when the error is reported. If it is not available, or if the result is not error, C<undef> is returned. =item $number = $result->row_count Return the number of rows affected by the SQL execution (for write operations), or the number of rows found by the SQL execution (for read operations). Please note that the semantics of the number depends on the kind of the executed SQL statement and the database implementation and might not be always useful for the application. For example, the row count for the C<INSERT IGNORE> statement might not be equal to the number of rows actually inserted into the table. =item $result->each ($code) Iterate for each row in the result. The argument must be a code reference. The code is invoked for each row, with an argument which represents the row as a hash reference where key/value pairs represent column names and their values. If there is no row in the result, the code reference will not be invoked at all. This method is not available in asynchronous mode. Use C<each_cb> option instead. =item $list = $result->all Return a list which contains all the rows in the result. The method returns a list object which contains zero or more hash references representing the rows. The hash reference consists of key/value pairs representing column names and their values. =item $values = $result->first Return the first row in the result. If there is a row, the method returns the row as a hash reference where key/value pairs represent column names and their values. If there is no row in the result, C<undef> is returned. =item $string = $result->debug_info Return a short string to describe the result object. It might or might not be useful for debugging. =back The result object is either bound to a table or not bound to any table. If a method receives the table name explicitly as an argument (e.g. the C<select> method of the database object), the result object of the method is bound to that table. Otherwise (e.g. the C<execute> method of the database object) the result object is not bound to any table. When the result object is bound to a table, following schema-aware methods can be used: =over 4 =item $name = $result->table_name Return the name of the table bound to the result object, if any, or C<undef>, otherwise. =item $result->each_as_row ($code) Iterate for rows like the C<each> method does, but gives the table row object for each row to the specified code reference instead of a hash reference. This method is not available in asynchronous mode. Use C<each_as_row_cb> option instead. =item $list = $result->all_as_rows Return a list of the table rows like the C<all> method does, but puts the table row objects for the rows in the list instead of hash references. =item $row = $result->first_as_row Return the first table row, if any, like the C<first> method does, but gives the table row object for the first row instead of a hash reference. =back Methods returning table row objects can be called only once for a result object. For example, if the C<each> method is once called, the other methods can no longer be called. In addition, the C<each> method itself become invalid. As described in the section of the C<insert> method, row data or table row objects accessible from the result object returned by the C<insert> method is based on the argument to the method and might be different from the actual data on the database. A I<thenable> result object has the following method: =over 4 =item $promise = $result->then ($onfulfill, $onreject) Register callback functions invoked when the operation of the original method has succeeded or failed, respectively. One of these callback is invoked once the operation result is available, with the result object containing the actual result of the method. This method is actually the C<then> method of a promise object. Both fulfill and reject callbacks can be omitted. The method returns a new promise. The new promise is to be resolved with the returned value of the invoked callback function, or to be rejected by the exception thrown by the callback function. =back A I<thenable> result object behaves as if it were a L<Promise> object, except there is no C<catch> method. As it has the C<then> method, it can be used as the return value of a promise callback or the argument of the C<< Promise->resolve >> method to create a promise to be resolved with the actual result object. For more information on promise, see documentation of L<Promise> <https://github.com/wakaba/perl-promise>. =head1 COMPATIBILITY Although the synchronous mode of Dongry is built on the top of the L<DBI>'s standard interface to various kinds of database backends, its full functionality can be used only when the database to be connected is MySQL version 5. The asynchronous mode of Dongry requires L<AnyEvent::MySQL::Client>, which only supports MySQL client/server protocol as implemented by MySQL version 5. Some of SQL statements generated by Dongry are not supported by other database management systems. Some of features provided by Dongry do not work as intended depending on the database engine in use. Specifically, transaction does not work for tables using the MyISAM engine. =head1 INTERPREATAION OF STRINGS Lower-level interfaces of Dongry, i.e. methods of the database object such as C<execute>, C<select>, C<insert>, and so on, leave the interpretation of byte- or character-string (or the utf8 flag) to the underlying layers. The underlying L<DBI> module used in the synchronous mode, in turn, delegates the handling of it to the database driver such as L<DBD::mysql> and L<DBD::MariaDB>. See the documentation for database driver modules for their implementation. The L<AnyEvent::MySQL::Client> module requires the input being non-utf8 flagged byte string. For portability, you should only use byte strings in queries and values. Higher-level interfaces of Dongry, namely the schema-aware APIs of the table and table row objects, handles strings as defined by the data type of the column. Unless the type is explicitly specified, you should always specify or receive a byte (or unflagged) string. If a column is expected to contain a character string, the column should be declared as of C<text> or C<text_as_ref> such that you can specify or receive a character (or flagged) string. If you write a type handler, be careful for the parser to interpret byte strings and for the serializer to return character strings if the data type is defined as sequence(s) of characters. Dongry does intentionally not support non-ASCII table and column names. How they are interpreted if used is unspecified. Example: my $result = $db->execute ('SELECT * FROM table1 WHERE name = :name', {name => encode 'utf-8', $name}); $name = decode 'utf-8', $result->first->{name}; $db->schema ({table1 => {type => {name => 'text'}}}); $row = $db->table ('table1')->find ({name => $flagged_name}); $flagged_name = $row->get ('name'); $unflagged_name = $row->get_bare ('name'); =head1 AVAILABILITY The latest version of Dongry is available from the Git repository <https://github.com/wakaba/dongry>. =head1 DEPENDENCY Perl 5.12 or later is required. For the synchronous mode, the modules L<DBI> and L<DBD::mysql> or L<DBD::MariaDB> are required. For the asynchronous mode, the L<AnyEvent::MySQL::Client> module in the GitHub repository <https://github.com/wakaba/perl-anyevent-mysql-client> is required. Dongry uses the L<List::Ish> module as list object implementation by default, but this can be configured by the C<$Dongry::Database::ListClass> variable. You can use other modules with similar API, such as L<List::Rubyish> and L<DBIx::MoCo::List>, instead. The L<List::Ish> module is available at <https://github.com/wakaba/perl-ooutils/blob/master/lib/List/Ish.pm>. Additional modules might be used by L<Dongry::Type> modules, but they are not loaded unless you are explicitly using them. See their documentations for more information. The GitHub repository of Dongry contains submodules. Submodule C<modules/perl-ooutils> contains the L<List::Ish> module. Submodule C<modules/perl-json-functions-xs> is required by L<Dongry::Type::JSON>. The submodule C<modules/perl-rdb-utils> is used for debugging and testing. =head1 SEE ALSO L<DBI>, L<DBD::mysql>, L<DBD::MariaDB>. L<AnyEvent::MySQL::Client> <https://github.com/wakaba/perl-anyevent-mysql-client>. MySQL Reference Manuals <https://dev.mysql.com/doc/>. L<DBIx::ShowSQL> <https://github.com/wakaba/perl-rdb-utils/blob/master/lib/DBIx/ShowSQL.pm>. L<Promise> <https://github.com/wakaba/perl-promise>. =head1 AUTHOR Wakaba <wakaba@suikawiki.org>. =head1 HISTORY Design and interfaces of various features in Dongry is largely inspired by Perl modules L<DBI>, L<AnyEvent::DBI>, L<DBIx::MoCo>, L<DBIx::MoCo::Query>, L<DBIx::MoCo::ColumnMethods>, L<DBIx::MoCo::TableExtras>, L<SQL::Abstract>, L<SQL::NamedPlaceholder>, and L<List::Rubyish>. Thanks to hatz48. This Git repository was located at <https://github.com/wakaba/dongry> until 7 March, 2022. =head1 LICENSE Copyright 2011-2024 Wakaba <wakaba@suikawiki.org>. This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =cut