A demo of a basic pipeline that extracts data from Elasticsearch and loads it into Microsoft SQL Server.
- Get Docker and docker-compose (at least version 1.29.0)
- Run `docker-compose up`
- NOTE: On first startup, Logstash will throw some errors because the source and destination databases are not yet available at that point
- When it's all up, navigate to http://localhost:5601 (Kibana) and load the sample ecommerce dataset
- Then connect to the SQL Server running on port 9999 and run the queries in `sqlserver/schema.sql`
- Restart the Docker containers and wait for a minute or so; the data will start getting pushed to SQL Server
A Logstash configuration is defined in `logstash/es-loader.conf`. This configuration specifies a pipeline that extracts data from Elasticsearch, transforms it slightly, and passes it on to SQL Server.
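At a high level, the file follows Logstash's usual three-stage layout (a sketch only; each stage is described below):

```
input {
  # extract: pull documents out of Elasticsearch on a schedule
}
filter {
  # transform: reshape fields before loading
}
output {
  # load: write the transformed events to SQL Server
}
```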
The extraction is specified in the `input` section. `elasticsearch` is set as the source and is configured to pull new orders from the `kibana_sample_data_ecommerce` index that was created in *Setting it all up* above. New orders are pulled every 10 minutes, as specified by the `schedule` parameter. The schedule's value uses a syntax similar to crontab(5).
A query that does the extraction from Elasticsearch is also defined. It simply pulls all documents whose order date falls within the last 10 minutes.
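A minimal sketch of that input (the Elasticsearch host and the exact range query are assumptions based on the description above):

```
input {
  elasticsearch {
    hosts    => ["elasticsearch:9200"]        # assumed service name from docker-compose
    index    => "kibana_sample_data_ecommerce"
    schedule => "*/10 * * * *"                # crontab(5)-like: run every 10 minutes
    # Assumed range query: only orders placed within the last 10 minutes
    query    => '{ "query": { "range": { "order_date": { "gte": "now-10m" } } } }'
  }
}
```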
Transformation is defined in the `filter` section. The transformation in this setup is done on the `customer_gender` field only. It simply extracts the first letter and converts it to uppercase. This is achieved by applying the `mutate` filter plugin twice: first the text is converted to uppercase, and then the trailing "ALE" and "EMALE" are stripped out.
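A sketch of those two `mutate` applications (the exact pattern used to strip the suffixes is an assumption):

```
filter {
  mutate {
    uppercase => ["customer_gender"]   # "female" -> "FEMALE", "male" -> "MALE"
  }
  mutate {
    # Strip the trailing "EMALE"/"ALE", leaving just "F" or "M"
    gsub => ["customer_gender", "EMALE$|ALE$", ""]
  }
}
```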
Loading is specified in the `output` section. Two outputs are defined in this configuration: the first is the `jdbc` output, and the second simply writes to STDOUT. For the `jdbc` output, we specify the driver to be used in communicating with SQL Server. A connection string and a query that binds to the output of the filter stage are also provided. If there is a need to split the data into multiple tables, multiple `jdbc` outputs can be specified, each with its own query.
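A sketch of what those outputs could look like; the driver path, connection string, table, and column names are assumptions (the real schema lives in `sqlserver/schema.sql`):

```
output {
  jdbc {
    driver_jar_path   => "/usr/share/logstash/mssql-jdbc.jar"   # assumed driver location
    driver_class      => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    connection_string => "jdbc:sqlserver://sqlserver:1433;databaseName=ecommerce"  # assumed
    # Bind fields from the filter stage to the statement's placeholders
    statement => [
      "INSERT INTO orders (order_id, customer_gender) VALUES(?, ?)",
      "order_id", "customer_gender"
    ]
  }
  stdout { codec => rubydebug }   # print each event; useful for debugging
}
```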
- NOTE: The `jdbc` output doesn't come built in with Logstash; it needs to be installed manually. Refer to `logstash/Dockerfile` for how to do this. You also need a database driver for SQL Server, which Microsoft provides and a quick web search will turn up.