improve nesting
PS-davetemplin committed Jul 12, 2019
1 parent a88d55d commit 6e62e3a
Showing 5 changed files with 47 additions and 13 deletions.
3 changes: 2 additions & 1 deletion .vscode/launch.json
@@ -11,9 +11,10 @@
"host": "127.0.0.1",
"program": "${fileDirname}",
"env": {
"GOPATH": "c:/Users/Dave/go"
"GOPATH": "c:/Users/DaveTemplin/go" // MAKE SURE THIS PATH IS CORRECT!
},
"cwd": "../bigdata",
"args": ["pricespider/map_clients"],
"showLog": true
}
]
19 changes: 10 additions & 9 deletions README.md
@@ -1,25 +1,25 @@
# Bigboy
Extract data from SQL Server, PostgreSQL, or MySQL, transforming SQL-to-JSON or JSON-to-JSON.
High data-rate SQL-to-JSON extraction from SQL Server, PostgreSQL, or MySQL.

Written by Dave Templin

# Overview
Bigboy is a tool that extracts data from SQL Server, PostgreSQL, or MySQL databases and transforms SQL-to-JSON or JSON-to-JSON; basically performing the **E** and the **T** part of **ETL** *(Extract/Transform/Load)*. The tool provides a simple model for configuring SQL extraction queries and optionally Javascript functions for transformations. A simple but powerful command-line interface (CLI) makes it easy to perform both adhoc and batch processing scenarios (BASH, CRON, etc.). The tool is also designed to maximize available local compute resources to extract and transform massive data volumes in a time-efficient way.
Bigboy is basically a **SQL-TO-JSON** tool that extracts data from SQL Server, PostgreSQL, or MySQL databases.
It is designed to handle extremely high data extraction rates by running multiple concurrent queries.
The tool provides a simple configuration model for managing any number of extractions.
It also exposes a simple and minimal command-line interface (CLI) that works great for adhoc or batch/cron use-cases.

## Features
* Extract data from SQL Server, PostgreSQL, or MySQL
* Perform SQL-to-JSON or JSON-to-JSON transformations
* Perform simple SQL-to-JSON transformations
* Nest rows to form complex hierarchical (or document-oriented) data
* Leverage Javascript functions to perform arbitrarily complex data transformations
* Define command driven parameters to create dynamic queries and scripts
* Define configuration parameters to manage dynamic queries
* Run queries in parallel from a configurable thread pool for high data rates
* Combine data from multiple different database sources
* Apply timezone to dates stored without a timezone
* Configure the tool to maximize local compute resources and minimize processing time

## Quickstart



# Concepts

## Connections
@@ -31,6 +31,7 @@ fetch, prefetch

## Transforms
nest, script, split, timezone
_ for value only nesting



@@ -148,6 +149,7 @@ Install [golang](https://golang.org/dl/)
$ go get github.com/denisenkom/go-mssqldb
$ go get github.com/lib/pq
$ go get github.com/go-sql-driver/mysql
$ go get golang.org/x/crypto/md4 # required if cross building from windows
$ git clone https://github.com/davetemplin/bigboy.git
$ go build
```
@@ -160,7 +162,6 @@ $ build mac
```



# References

There are lots of ways to approach ETL, and lots of vendors that want to sell you a solution!
1 change: 1 addition & 0 deletions build.cmd
@@ -10,6 +10,7 @@ SETLOCAL
 SET GOOS=windows
 SET GOARCH=amd64
 go build -o bin/windows/bigboy.exe
+COPY /Y %GOROOT%\lib\time\zoneinfo.zip bin\windows\
 ENDLOCAL
 GOTO end

8 changes: 6 additions & 2 deletions nest.go
@@ -25,9 +25,13 @@ func queryNest(db *sql.DB, nest *Nest, list []map[string]interface{}) error {
         slice := make([]interface{}, 0)
         for _, child := range children {
             if val, ok := child["_parent"]; ok {
-                if val.(int64) == parent[nest.ParentKey].(int64) {
+                if to_uint64(val) == to_uint64(parent[nest.ParentKey]) {
                     delete(child, "_parent")
-                    slice = append(slice, child)
+                    if _, ok := child["_"]; ok {
+                        slice = append(slice, child["_"])
+                    } else {
+                        slice = append(slice, child)
+                    }
                 }
             }
         }
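
Note on the `nest.go` change: a child row that carries a `_` column now contributes only that value to the nested array, while other rows are nested as whole objects. A minimal standalone sketch of that behavior follows. `nestChildren` is a hypothetical helper, not a function in bigboy, and it assumes the keys have already been converted to `uint64`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nestChildren is a hypothetical helper (not part of bigboy) that collects
// the child rows belonging to one parent. parentID stands in for
// parent[nest.ParentKey] after conversion to uint64.
func nestChildren(parentID uint64, children []map[string]interface{}) []interface{} {
	slice := make([]interface{}, 0)
	for _, child := range children {
		val, ok := child["_parent"]
		if !ok {
			continue
		}
		id, ok := val.(uint64) // assumption: keys are already uint64 in this sketch
		if !ok || id != parentID {
			continue
		}
		delete(child, "_parent")
		if v, ok := child["_"]; ok {
			slice = append(slice, v) // value-only nesting: keep just the "_" column
		} else {
			slice = append(slice, child) // object nesting: keep the whole row
		}
	}
	return slice
}

func main() {
	tags := []map[string]interface{}{
		{"_parent": uint64(1), "_": "red"},
		{"_parent": uint64(1), "_": "blue"},
		{"_parent": uint64(2), "_": "green"},
	}
	out, _ := json.Marshal(nestChildren(1, tags))
	fmt.Println(string(out)) // ["red","blue"]
}
```

With value-only nesting the parent ends up with an array of scalars such as `["red","blue"]` rather than an array of single-field objects.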
29 changes: 28 additions & 1 deletion utilities.go
@@ -48,10 +48,37 @@ func jsonWriteln(file *os.File, obj map[string]interface{}) error {
     return nil
 }
 
+func to_uint64(value interface{}) uint64 {
+    switch value.(type) {
+    case int:
+        return uint64(value.(int))
+    case int8:
+        return uint64(value.(int8))
+    case int16:
+        return uint64(value.(int16))
+    case int32:
+        return uint64(value.(int32))
+    case int64:
+        return uint64(value.(int64))
+    case uint:
+        return uint64(value.(uint))
+    case uint8:
+        return uint64(value.(uint8))
+    case uint16:
+        return uint64(value.(uint16))
+    case uint32:
+        return uint64(value.(uint32))
+    case uint64:
+        return uint64(value.(uint64))
+    default:
+        panic("cannot convert value to uint64")
+    }
+}
+
 func take(list []map[string]interface{}, key string) []uint64 {
     result := make([]uint64, 0)
     for _, obj := range list {
-        result = append(result, obj[key].(uint64))
+        result = append(result, to_uint64(obj[key]))
     }
     return result
 }
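
Note on `to_uint64`: the committed version repeats the type assertion in every case, panics on unsupported types, and wraps negative values around when it converts them with `uint64(...)`. The sketch below is one possible alternative and is not what the commit does. It binds the value once in the type switch and returns an error instead of panicking. The name `toUint64` and the decision to reject negative keys are assumptions of this sketch.

```go
package main

import "fmt"

// toUint64 is a sketch of an alternative to the commit's to_uint64.
// Rejecting negative values is an assumption made here; the committed
// version converts them with uint64(...), which wraps around.
func toUint64(value interface{}) (uint64, error) {
	var n int64
	switch v := value.(type) {
	case uint:
		return uint64(v), nil
	case uint8:
		return uint64(v), nil
	case uint16:
		return uint64(v), nil
	case uint32:
		return uint64(v), nil
	case uint64:
		return v, nil
	case int:
		n = int64(v)
	case int8:
		n = int64(v)
	case int16:
		n = int64(v)
	case int32:
		n = int64(v)
	case int64:
		n = v
	default:
		return 0, fmt.Errorf("cannot convert %T to uint64", value)
	}
	if n < 0 {
		return 0, fmt.Errorf("cannot convert negative value %d to uint64", n)
	}
	return uint64(n), nil
}

func main() {
	fmt.Println(toUint64(int32(42)))
	fmt.Println(toUint64("42"))
}
```

Returning an error would let callers such as `take` and `queryNest` report which column held an unexpected type, at the cost of extra error handling at each call site.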