A simple library and CLI tool to convert a CSV file to an Avro file. The main challenge in this conversion is generating the Avro schema. To that end, this tool borrows heavily from Spark's CSV schema inference code.
- JDK v1.8+
You can download the latest CLI from the release page.
There are two modules in this tool:
- cli
- lib
Command line tool to convert a CSV to Avro
csv-avro-converter -i input.csv -o output.avro
The library can be used in any project that needs to convert CSV data to Avro.
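import me.jairam.csv.CsvReader
import me.jairam.avro.AvroWriter
import me.jairam.schema.Builder.buildSchema

// inputFile and outputFile are assumed to be java.io.File instances defined elsewhere.
val csvReader = new CsvReader(inputFile)
val avroWriter = new AvroWriter(outputFile)

// The write runs only if reading the rows, inferring the schema,
// and building the Avro schema all yield a value.
for {
  rows           <- csvReader.rows()
  internalSchema <- csvReader.inferSchema()
  avroSchema     <- buildSchema(internalSchema, inputFile.getName)
} {
  avroWriter.write(rows, avroSchema)
}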
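
For intuition about what the schema inference step involves, here is a minimal sketch of the general idea: infer the narrowest type for each cell, widen it as more rows are seen, and mark columns with empty cells as nullable. This is not this tool's implementation, nor Spark's. Every name in it (`InferenceSketch`, `inferCell`, `merge`, `inferColumns`, `toAvroJson`) is hypothetical, it handles only a handful of primitive types, and it uses only the Scala standard library.

```scala
import scala.util.Try

// Hypothetical illustration only; not part of this library's API.
object InferenceSketch {
  // Candidate column types, each mapped to an Avro primitive type name.
  sealed trait ColType { def avroName: String }
  case object IntType     extends ColType { val avroName = "int" }
  case object LongType    extends ColType { val avroName = "long" }
  case object DoubleType  extends ColType { val avroName = "double" }
  case object BooleanType extends ColType { val avroName = "boolean" }
  case object StringType  extends ColType { val avroName = "string" }

  final case class Column(name: String, tpe: ColType, nullable: Boolean)

  // Pick the narrowest type that can represent a single non-empty cell.
  def inferCell(cell: String): ColType = {
    val v = cell.trim
    if (Try(v.toInt).isSuccess) IntType
    else if (Try(v.toLong).isSuccess) LongType
    else if (Try(v.toDouble).isSuccess) DoubleType
    else if (v.equalsIgnoreCase("true") || v.equalsIgnoreCase("false")) BooleanType
    else StringType
  }

  // Widen two types to one that can hold both; fall back to string.
  def merge(a: ColType, b: ColType): ColType = (a, b) match {
    case (x, y) if x == y => x
    case (IntType, LongType) | (LongType, IntType) => LongType
    case (IntType | LongType, DoubleType) | (DoubleType, IntType | LongType) => DoubleType
    case _ => StringType
  }

  // Infer one Column per header entry from the data rows.
  def inferColumns(header: Seq[String], rows: Seq[Seq[String]]): Seq[Column] =
    header.zipWithIndex.map { case (name, i) =>
      // Rows shorter than the header are treated as having empty cells.
      val cells    = rows.map(r => if (i < r.length) r(i) else "")
      val nonEmpty = cells.filter(_.trim.nonEmpty)
      val types    = nonEmpty.map(inferCell)
      // A column with no values at all defaults to string.
      val tpe = if (types.isEmpty) StringType else types.tail.foldLeft(types.head)(merge)
      Column(name, tpe, nullable = nonEmpty.length < cells.length)
    }

  private def quote(s: String): String = "\"" + s + "\""

  // Render the columns as an Avro record schema in JSON.
  // Nullable columns become a union of "null" and the inferred type.
  // Column names are used verbatim; no Avro name validation is done here.
  def toAvroJson(recordName: String, cols: Seq[Column]): String = {
    val fields = cols.map { c =>
      val t = if (c.nullable) s"[${quote("null")}, ${quote(c.tpe.avroName)}]"
              else quote(c.tpe.avroName)
      s"{${quote("name")}: ${quote(c.name)}, ${quote("type")}: $t}"
    }
    s"{${quote("type")}: ${quote("record")}, ${quote("name")}: ${quote(recordName)}, " +
      s"${quote("fields")}: [${fields.mkString(", ")}]}"
  }
}
```

Calling `InferenceSketch.inferColumns(header, rows)` and passing the result to `toAvroJson` yields a schema string that an Avro parser could consume. Falling back to string whenever two types cannot be reconciled keeps the conversion lossless at the cost of a less precise schema.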