MariaDB Community Server 10.5 has many enhancements that will provide an improved user experience with faster performance and improved manageability.
Earlier versions of MariaDB ColumnStore have been available to the open source community as a separate fork of MariaDB. Now, with MariaDB Community Server 10.5, ColumnStore 1.5 is included as a pluggable storage engine making a columnar storage engine for analytics easily available to MariaDB Community Server users out-of-the-box.
MariaDB ColumnStore can be deployed as the analytics component of MariaDB's single stack Hybrid Transactional/Analytical Processing (HTAP) solution or as a standalone columnar database for interactive, ad hoc analytics at scale.
The following is a walkthrough to get you up and running with a MariaDB Community Server (with ColumnStore) (using a Docker Image) and an HTAP database instance in less than 15 minutes.
Before getting started with this walkthrough you need to have Docker installed.
This walk-through will step you through the process of installing, accessing and configuring HTAP within a MariaDB Community Server (with ColumnStore) instance.
Pull down the MariaDB Community Server (with ColumnStore) image and create a new container by executing the following command in a terminal window:
$ docker run -d -p 3306:3306 --name mcs_container mariadb/columnstore
$ docker exec -it mcs_container bash
$ yum install git
$ git clone https://github.com/mariadb-corporation/dev-example-htap-community.git
This repository includes the following schemas:
- innodb_db (database)
- airlines (InnoDB table)
- airports (InnoDB table) - all airports within the United States of America
- flights (InnoDB table)
- columnstore_db (database)
- flights (ColumnStore table)
In this sample, the create_and_load.sh script will be used to create the schemas (via schema.sql) and load the following tables:
- innodb_db.airlines - using data/airlines.csv
- innodb_db.airports - using data/airports.csv
- columnstore.db_flights - using data/flights.csv
Execute the script to create schemas and load data.
$ ./create_and_load.sh
You should output similar to the following:
creating schema...
schema created
loading data...
- airlines.csv loaded into innodb_db.airlines
- airports.csv loaded into innodb_db.airports
Locale = en_US.UTF-8Column delimiter : ,
Enclosed by Character : "
Using table OID 3042 as the default JOB ID
Input file(s) will be read from : /dev-example-htap-community
Job description file : /var/lib/columnstore/data/bulk/tmpjob/3042_D20200626_T165130_S148211_Job_3042.xml
Log file for this job: /var/lib/columnstore/data/bulk/log/Job_3042.log
2020-06-26 16:51:30 (1343) INFO : successfully loaded job file /var/lib/columnstore/data/bulk/tmpjob/3042_D20200626_T165130_S148211_Job_3042.xml
2020-06-26 16:51:30 (1343) INFO : Job file loaded, run time for this step : 0.043752 seconds
2020-06-26 16:51:30 (1343) INFO : PreProcessing check starts
2020-06-26 16:51:30 (1343) INFO : input data file /dev-example-htap-community/data/flights.csv
2020-06-26 16:51:30 (1343) INFO : PreProcessing check completed
2020-06-26 16:51:30 (1343) INFO : preProcess completed, run time for this step : 0.0360899 seconds
2020-06-26 16:51:30 (1343) INFO : No of Read Threads Spawned = 1
2020-06-26 16:51:30 (1343) INFO : No of Parse Threads Spawned = 3
2020-06-26 16:51:30 (1343) INFO : For table columnstore_db.flights: 679996 rows processed and 679996 rows inserted.
Confirm that schema creation and data loading has been successful.
- Access MariaDB
$ mariadb
- Confirm databases have been created.
$ SHOW DATABASES();
You should see the following output:
+---------------------+
| Database |
+---------------------+
| calpontsys |
| columnstore_db |
| columnstore_info |
| infinidb_querystats |
| information_schema |
| innodb_db |
| mysql |
| performance_schema |
+---------------------+
- View table content (and confirm they have data)
$ SELECT COUNT(*) FROM innodb_db.airlines;
$ SELECT COUNT(*) FROM innodb_db.airports;
$ SELECT COUNT(*) FROM columnstore_db.flights;
And because there is a Cross Engine user included within the Community Server 10.5 container you can also test out cross engine (ex. innodb_db <-> columnstore_db) joins.
$ SELECT a.airline, AVG(f.dep_delay) FROM innodb_db.airlines a INNER JOIN columnstore_db.flights f ON a.iata_code = f.carrier GROUP BY a.airline ORDER BY a.airline;
+-------------------------+------------------+
| airline | AVG(f.dep_delay) |
+-------------------------+------------------+
| Air Wisconsin Airlines | 9.6194 |
| Alaska Airlines | 11.3068 |
| Allegiant Air | 13.1918 |
| American Airlines | 9.8433 |
| American Eagle Airlines | 10.0892 |
| CommutAir | 34.8564 |
| Compass AIrlines | 6.9968 |
| Delta Airlines | 7.7799 |
| Empire Airlines | 9.2762 |
| Endeavor Air | 10.4156 |
| ExpressJet | 18.7828 |
| Frontier Airlines | 13.4475 |
| GoJet Airlines | 20.8848 |
| Hawaiian Airlines | 1.3769 |
| Horizon Air | 10.8914 |
| JetBlue Airways | 25.3461 |
| Mesa Airlines Inc | 19.5124 |
| Piedmont Airlines | 8.7041 |
| PSA Airlines | 13.5451 |
| Republic Airways | 10.6948 |
| Skywest Airlines | 16.4908 |
| Southwest Airlines | 12.4597 |
| Spirit Airlines | 10.4121 |
| Trans States Airlines | 16.5218 |
| United Airline | 11.9805 |
+-------------------------+------------------+
25 rows in set (0.154 sec)
- Exit MariaDB
$ exit
The following configuration changes will be used to set up replication between innodb_db and columnstore_db.
Open /etc/my.cnf.d/columnstore.cnf for editing.
$ vi /etc/my.cnf.d/columnstore.cnf
Make the following configuration changes:
# Required for Schema Sync
server-id = 1
log_bin=mariadb-bin
log_slave_updates = OFF
binlog_format = STATEMENT
# HTAP filtering rules
# 1. Transactions are replicated from itself
replicate_same_server_id = ON
# 2. Only write queries that touch innodb_db to the binary log
binlog_do_db = innodb_db
# 3. Rewrite innodb_db to columnstore_db prior to applying transaction
replicate_rewrite_db = innodb_db->columnstore_db
Optional: If you want to restrict replication to occur on certain tables you can use replicate_wild_do_table
.
# 4. Only replicate tables that begin with "htap"
replicate_wild_do_table = columnstore_db.htap%
Save file and exit.
$ mariadb < sql/replication.sql
$ columnstore-restart
You can test replication by inserting a new record into innodb_db.flights
and confirming that it exists in columnstore_db.flights
.
- Access MariaDB
$ mariadb
- Insert a new record into
innodb_db.flights
.
$ USE innodb_db;
$ INSERT INTO flights VALUES (2020, 6, 25, 1, '2020-6-25', 'DL', 'N9999', 1000, 'ORD', 'LAX', '0600', '0600', 10);
- Confirm the new record exists in
columnstore_db.flights
.
$ SELECT * FROM columnstore_db.flights WHERE year = 2020;
- Deploy HTAP with MariaDB ColumnStore 1.5 and MariaDB Community 10.5
- Sign up for MariaDB SkySQL
- Official MariaDB SkySQL Documentation
Please feel free to submit PR's, issues or requests to this project project directly.
If you have any other questions, comments, or looking for more information on MariaDB please check out:
Or reach out to us diretly via: