Release DIVE Update (2/1/16 - 2/9/16) · CenterForCollectiveLearning/DIVE-backend

DIVE is located at two password-protected sites, with the latest development version on staging (please use this to test) and the weekly stable release on the stable site:

Staging: staging.usedive.com. (password: macro)
Stable: usedive.com. (password: macro. Limited datasets)

For any bug reports or feature requests, please e-mail us personally or at dive@media.mit.edu.

[Feature] Conditionals / Filters

We implemented categorical and quantitative filters on single visualizations, as well as the ability to combine conditionals with AND/OR. Previously we only allowed single categorical conditionals (e.g. only sales in DIVISION = ASIA).
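As a rough illustration of combining conditionals, here is a minimal pandas sketch. The function name `apply_conditionals`, the `(field, operator, value)` tuple format, and the sample data are assumptions made for this example, not DIVE's actual API.

```python
import operator

import pandas as pd

# Hypothetical operator table; the keys are illustrative, not DIVE's API.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def apply_conditionals(df, conditionals, combinator="and"):
    """Return the rows of df that satisfy the conditionals.

    conditionals is a list of (field, operator, value) tuples, combined
    with AND when combinator == "and" and with OR otherwise.
    """
    if not conditionals:
        return df
    masks = [OPERATORS[op](df[field], value) for field, op, value in conditionals]
    combined = masks[0]
    for mask in masks[1:]:
        combined = (combined & mask) if combinator == "and" else (combined | mask)
    return df[combined]


# Made-up sample data for the example.
sales = pd.DataFrame({
    "DIVISION": ["ASIA", "ASIA", "EUROPE", "AMERICAS"],
    "SALES": [120, 40, 300, 75],
})
conditions = [("DIVISION", "==", "ASIA"), ("SALES", ">", 100)]
asia_and_large = apply_conditionals(sales, conditions, "and")
asia_or_large = apply_conditionals(sales, conditions, "or")
```

The first condition is categorical and the second quantitative; both reduce to boolean masks, so AND/OR combination is the same for either type.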

Testing link: http://staging.usedive.com/projects/6/datasets/17/visualize/builder/2243

[Feature] Binning

We implemented four more procedural binning rules in addition to the previous default, the Freedman-Diaconis rule. Different rules suit different distributions (e.g. Doane's formula for non-normal data), while others (like the square-root rule) are faster to compute.

We also allow users to manually specify the number of equal-sized bins to use.
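To get a feel for how the rules differ, the sketch below uses NumPy's built-in bin estimators on made-up skewed data. Using `np.histogram_bin_edges` here is an assumption for illustration; it is not necessarily how DIVE computes bins, and only the three rules named above are shown.

```python
import numpy as np

# Made-up, right-skewed sample data.
values = np.random.default_rng(0).exponential(scale=2.0, size=1000)

# NumPy's names for the Freedman-Diaconis, Doane, and square-root rules.
bin_counts = {}
for rule in ("fd", "doane", "sqrt"):
    edges = np.histogram_bin_edges(values, bins=rule)
    bin_counts[rule] = len(edges) - 1

# Manual override: a fixed number of equal-width bins.
manual_edges = np.histogram_bin_edges(values, bins=10)
counts, _ = np.histogram(values, bins=manual_edges)

print(bin_counts)
```

The square-root rule only needs the sample size and range, which is why it is cheap; Freedman-Diaconis needs the interquartile range and Doane's formula needs the skewness.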

Testing link: http://staging.usedive.com/projects/1/datasets/1/visualize/builder/1016

[Feature] Summary Tables + Marginal Values

We've formatted field summaries into a grid. Upon selecting a field, marginal value tables are shown.

Testing link: http://staging.usedive.com/projects/3/datasets/3/analyze/summary

Testing link: http://staging.usedive.com/projects/4/datasets/60/analyze/summary

[Feature] Error messaging

For every asynchronous task (dataset upload, transformation, visualization), if an error occurs we now return the server-side stack trace, which is logged to the console. This reduces hanging for the user and makes debugging easier for us.
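One way to picture this behavior is a wrapper that catches any exception in a task and returns the stack trace as data instead of letting the request hang. This is a minimal sketch; the `run_task` helper and the response keys (`state`, `error`, `traceback`) are hypothetical names, not DIVE's actual response format.

```python
import traceback


def run_task(func, *args, **kwargs):
    """Run func and return a JSON-serializable dict; never raise."""
    try:
        return {"state": "SUCCESS", "result": func(*args, **kwargs)}
    except Exception as exc:
        return {
            "state": "FAILURE",
            "error": str(exc),
            # Full server-side stack trace, ready to be sent to the client.
            "traceback": traceback.format_exc(),
        }


def broken_transformation():
    # Stand-in for a task that fails partway through.
    raise ValueError("column 'SALES' not found")


response = run_task(broken_transformation)
print(response["state"])
print(response["traceback"])
```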

[Performance] Asynchronous dataset transformation

We converted dataset transformations (pivoting, reducing, and joining) from synchronous to asynchronous processes. This allows us to offload computation onto worker processes and prevents request timeouts for the user.
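The submit-then-poll pattern behind this change can be sketched with the standard library. This example uses a thread pool as a stand-in for real worker processes, and every name in it (`submit_transformation`, `task_status`, `slow_transformation`) is hypothetical rather than taken from DIVE.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from uuid import uuid4

# Stand-in for a pool of worker processes.
executor = ThreadPoolExecutor(max_workers=2)
tasks = {}


def submit_transformation(func, *args):
    """Queue the work and return a task ID immediately."""
    task_id = uuid4().hex
    tasks[task_id] = executor.submit(func, *args)
    return task_id


def task_status(task_id):
    """Report the task state without blocking; the client polls this."""
    future = tasks[task_id]
    if not future.done():
        return {"state": "PENDING"}
    if future.exception() is not None:
        return {"state": "FAILURE", "error": str(future.exception())}
    return {"state": "SUCCESS", "result": future.result()}


def slow_transformation(rows):
    # Stand-in for an expensive transformation: sum sales per division.
    time.sleep(0.1)
    totals = {}
    for division, amount in rows:
        totals[division] = totals.get(division, 0) + amount
    return totals


task_id = submit_transformation(
    slow_transformation, [("ASIA", 120), ("ASIA", 40), ("EUROPE", 300)]
)
while task_status(task_id)["state"] == "PENDING":
    time.sleep(0.02)
final_status = task_status(task_id)
executor.shutdown()
```

Because the initial request returns as soon as the task is queued, no single request has to stay open for the duration of the transformation.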

[Performance] Optimization

Serialization

We've reduced serialization time (converting server-side returns into a browser-usable format) by up to 50% by using fewer type checks and casts.

Caching intermediate data frames

If multiple visualizations use the same intermediate grouped data frame, we cache that data frame. This eliminates the biggest bottleneck in mapping visualization specs to visualization data.

Bug fixes

Fixed decimal formatting, so no more ridiculously long numbers.
Fixed hanging on large dataset transformations.

Next Up

Visualization configuration
Minimum compose
Correlating ad spend and sales data
