ds-test-tools
is a small library to help test Datasplash pipelines.
[com.oscaro/ds-test-tools "0.2.1"]
Then:
(ns your.project
(:require [ds-test-tools.core :as dt]))
The only function you need is dt/run-pipeline
.
It takes inputs as Clojure data, mapping of keys to result files, and a function to call in order to build the pipeline. It dumps the input data in the appropriate files; builds a configuration map; pass it to your function; run the pipeline; collect the results; and return them to you.
;; Your pipeline
(defn my-job [conf p]
(->> p
(ds/read-edn-file (:numbers conf))
(ds/map inc)
(ds/write-edn-file (str (:output conf) "/higher.edn"))))
(let [{:keys [result]} (dt/run-pipeline
{:numbers [1 2 3 4]}
{:result "higher"}
my-job)]
(println (sort result))) ; '(2 3 4 5)
The inputs config map uses the same format as your configuration map. Your build function should take a map of keywords to file paths:
(defn my-job [{:keys [people houses output]} p]
(let [people (ds/read-edn-file people p)
houses (ds/read-edn-file houses p)]
(->> (ds/join-by (fn [p h] [p :lives-in h])
[[people :house-id {:type :required}]
[houses :id {:type :required}]])
(ds/write-edn-file (tio/join-path output "housing.edn")))))
The pipeline above would use the following inputs config map:
{:people [{:name "John" :house-id 1} {:name "Jane" :house-id 2} ...]
:houses [{:id 1 :name "Red House"} {:id 2 :name "Green House"} ...]}
By default, it assumes you use EDN as inputs and outputs. You can change that
by setting the :reader
(used to read outputs) and :writer
(used to write
inputs) keys in the optional options map:
(dt/run-pipeline
{:reader :jsons ; or :edns (the default)
:writer :jsons} ; or :edns (the default)
inputs-config
outputs-config)
If you have mixed formats or something else than EDN/JSONS, you can also provide a function of two (readers) or three (writers) arguments:
(dt/run-pipeline
{:reader (fn [k filename] ; k is the key in your inputs-config map
(case k
:my-jsons-output (tio/read-jsons-file filename)
:my-edns-output (tio/read-edns-file filename)))
:writer (fn [k filename data]
(case k
:my-jsons-input (tio/write-jsons-file filename data)
:my-csv-input (tio/write-csv-file filename data)
:my-text-input (tio/write-text-file filename data)))}
inputs-config
outputs-config)
Copyright © 2018-2019 Oscaro
Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.