4. UDF transformations

@dawrutowicz edited this page Jun 8, 2019 · 1 revision

cleanframes lets you provide a custom transformation written in Scala, which is then resolved and applied as a UDF transformation.

Define a simple UdfModel case class:

case class UdfModel(col1: Option[Int],
                    col2: Option[Double],
                    col3: Option[Float])

Define the Scala functions used for the transformations:

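// Rename the Java boxed types on import so that Double and Float in the
// signatures below keep referring to the Scala types.
import java.lang.{Double => JDouble, Float => JFloat}

implicit val col1Transformer: String => Int = a => Integer.parseInt(a) * 2
implicit val col2Transformer: String => Double = a => JDouble.parseDouble(a) * 3
implicit val col3Transformer: String => Float = a => JFloat.parseFloat(a) + 100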

Add the relevant imports:

import cleanframes.instances.tryToOption._
import cleanframes.instances.higher._

The first import lets us use a plain Scala function: the function is executed inside a scala.util.Try, and the result is mapped to a scala.Option.
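The Try-to-Option step can be illustrated in plain Scala, independently of cleanframes. The following is a minimal sketch of the idea only, not the library's implementation; the names `TryToOptionSketch`, `lift`, `doubleTheInt` and `safeDoubleTheInt` are hypothetical.

```scala
import scala.util.Try

object TryToOptionSketch {
  // Hypothetical helper: turns a function that may throw into one that
  // returns Some(result) on success and None on a non-fatal exception.
  def lift[A](f: String => A): String => Option[A] =
    input => Try(f(input)).toOption

  // Same logic as col1Transformer, written with the fully qualified Java call.
  val doubleTheInt: String => Int = a => java.lang.Integer.parseInt(a) * 2

  val safeDoubleTheInt: String => Option[Int] = lift(doubleTheInt)

  def main(args: Array[String]): Unit = {
    println(safeDoubleTheInt("21"))  // Some(42)
    println(safeDoubleTheInt("abc")) // None: the NumberFormatException is captured by Try
  }
}
```

With this shape, a value that cannot be parsed ends up as None instead of failing the whole job.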

The second import materializes the resulting transformation inside a UDF and exposes it as a cleanframes.Cleaner instance.
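For comparison, a similar effect can be written by hand with Spark's own `udf` function. This is only a sketch of the kind of code the imports are meant to spare you, assuming a local Spark session; it does not show how cleanframes builds its Cleaner instances. The object name `ManualUdfSketch`, the method `cleanManually` and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}
import scala.util.Try

object ManualUdfSketch {
  // Each UDF runs the parsing logic inside Try and returns an Option,
  // so values that fail to parse become null in the resulting column.
  private val col1Udf = udf((s: String) => Try(java.lang.Integer.parseInt(s) * 2).toOption)
  private val col2Udf = udf((s: String) => Try(java.lang.Double.parseDouble(s) * 3).toOption)

  // Hypothetical helper: applies the UDFs to the string columns col1 and col2.
  def cleanManually(frame: DataFrame): DataFrame =
    frame.select(
      col1Udf(col("col1")).as("col1"),
      col2Udf(col("col2")).as("col2")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("manual-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input: every column arrives as a string.
    val frame = Seq(("1", "2.5"), ("oops", "3")).toDF("col1", "col2")

    cleanManually(frame).show()
    spark.stop()
  }
}
```

Writing one such UDF per column quickly becomes repetitive, which is the boilerplate the implicit functions and the two imports are intended to remove.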

Rename columns and call the API:

import cleanframes.syntax._ 

frame
 .toDF("col1", "col2", "col3")
 .clean[UdfModel]

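Notice that, apart from tryToOption and higher, nothing else is imported from the cleanframes.instances package: the transformations come from the implicit functions defined above.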

A full running code example can be found here.