Query is a command-line tool for viewing big data files (Avro, Parquet, ORC, CSV, JSON and all other Spark-supported formats) as tables and querying them from an interactive console. It supports syntax colouring and autocompletion of keywords, table names and column names. The tool is built on top of Spark SQL 3, supports every format Spark supports, and auto-detects the columns of the tables.
Local files and directories can be mounted, as well as any Spark-supported path (e.g. s3n, though not tested yet). This means the command can run on a developer's box while accessing data on any path that Spark recognizes.
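To make the idea of "mounting" concrete, here is a hedged sketch of what it amounts to in plain Spark SQL (this is not query's actual API; the paths and table names are hypothetical):

```scala
import org.apache.spark.sql.SparkSession

object MountSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("mount-sketch")
      .master("local[*]") // runs locally on the developer's box
      .getOrCreate()

    // Mount a local parquet directory as a table (path is hypothetical)...
    spark.read.parquet("/tmp/tweets-parquet").createOrReplaceTempView("tweets")
    // ...or any Spark-recognised path, e.g. an S3 bucket,
    // provided the right credentials and connector jars are available:
    // spark.read.parquet("s3a://my-bucket/tweets").createOrReplaceTempView("tweets_s3")

    // The columns are auto-detected from the file's schema,
    // so the table can be queried immediately.
    spark.sql("select * from tweets limit 10").show()
    spark.stop()
  }
}
```

The same local/remote symmetry is what lets the tool query S3 data from a laptop: only the path string changes, not the query.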
Note: feel free to ask questions in the "Discussions" board at the top of this GitHub page.
scala-cli is the only requirement to use query. The recommended way is to check out this repository and start modifying the example scala-cli scripts in the examples directory:
```
git clone https://github.com/kostaskougios/query.git
cd query/examples
cat Readme.md
```
For example, tweets.sc is a script that mounts tweets tables in Parquet and Avro format, and generate-sample-data.sc is the script that creates the sample tweets data.
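For orientation, a data-generation script along those lines might look like the following minimal sketch in plain Spark (this is not the repository's actual generate-sample-data.sc; the column names, rows and output paths are made up, and Avro output assumes the spark-avro module is on the classpath):

```scala
import org.apache.spark.sql.SparkSession

object GenerateSampleData {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("generate-sample-data")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical tweet rows: (id, user, text)
    val tweets = Seq(
      (1L, "alice", "hello spark"),
      (2L, "bob", "big data as tables")
    ).toDF("id", "user", "text")

    // Write the same data in two formats so both can be mounted and queried.
    tweets.write.mode("overwrite").parquet("/tmp/sample-tweets-parquet")
    tweets.write.mode("overwrite").format("avro").save("/tmp/sample-tweets-avro")
    spark.stop()
  }
}
```

Once written, either directory can be mounted as a table and queried from the console.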