DIVE Update (2/1/16 - 2/9/16)

Pre-release
@kevinzenghu kevinzenghu released this 09 Feb 22:09

DIVE runs on two password-protected sites: staging hosts the latest development version (please use this one for testing), and the stable site hosts the weekly stable release:

Staging: staging.usedive.com (password: macro)
Stable: usedive.com (password: macro; limited datasets)

For any bug reports or feature requests, please e-mail us personally or at dive@media.mit.edu.


[Feature] Conditionals / Filters

We implemented categorical and quantitative filters on single visualizations, as well as the ability to combine conditionals with AND/OR. Previously we only allowed single categorical conditionals (e.g. only sales in DIVISION = ASIA).
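
As a rough illustration of combining conditionals with AND/OR, here is a minimal sketch that filters rows held as plain dictionaries. The names (`OPERATORS`, `matches`, `apply_filters`) and the conditional keys (`field`, `operation`, `value`) are hypothetical and are not taken from DIVE's code.

```python
import operator

# Hypothetical operator table; the keys are illustrative, not DIVE's API.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(row, conditionals, combinator="and"):
    """Return True if `row` satisfies the conditionals joined by AND or OR."""
    if not conditionals:
        return True  # no filters means every row passes
    results = (
        OPERATORS[c["operation"]](row[c["field"]], c["value"])
        for c in conditionals
    )
    return all(results) if combinator == "and" else any(results)


def apply_filters(rows, conditionals, combinator="and"):
    """Keep only the rows that pass the combined conditionals."""
    return [row for row in rows if matches(row, conditionals, combinator)]


rows = [
    {"DIVISION": "ASIA", "SALES": 120},
    {"DIVISION": "ASIA", "SALES": 40},
    {"DIVISION": "EUROPE", "SALES": 300},
]
conditionals = [
    {"field": "DIVISION", "operation": "==", "value": "ASIA"},  # categorical
    {"field": "SALES", "operation": ">", "value": 100},         # quantitative
]
asia_and_large = apply_filters(rows, conditionals, "and")
asia_or_large = apply_filters(rows, conditionals, "or")
```

With AND, only rows that pass every conditional survive; with OR, a row survives if it passes at least one.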

Testing link: http://staging.usedive.com/projects/6/datasets/17/visualize/builder/2243


[Feature] Binning

We implemented four more procedural binning rules in addition to the previous default, the Freedman-Diaconis rule. Some rules suit particular distributions better (e.g. Doane's formula for non-normal data), while others (like the square-root rule) are faster to compute.

We also allow users to manually specify the number of equal-sized bins to use.
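
To make the difference between procedural rules and a manual bin count concrete, here is a minimal standard-library sketch. It shows the square-root and Freedman-Diaconis rules named above, plus Sturges' formula as an assumed example of another rule (these notes do not list all four added rules). The function names are hypothetical and not DIVE's code.

```python
import math
import statistics


def sqrt_bins(values):
    """Square-root rule: k = ceil(sqrt(n))."""
    return math.ceil(math.sqrt(len(values)))


def sturges_bins(values):
    """Sturges' formula: k = ceil(log2(n)) + 1 (assumed example rule)."""
    return math.ceil(math.log2(len(values))) + 1


def freedman_diaconis_bins(values):
    """Freedman-Diaconis rule: bin width h = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    width = 2 * (q3 - q1) / len(values) ** (1 / 3)
    if width == 0:
        return 1  # degenerate spread: fall back to a single bin
    return math.ceil((max(values) - min(values)) / width)


def bin_edges(values, num_bins):
    """Edges for `num_bins` equal-width bins spanning the data range."""
    low, high = min(values), max(values)
    step = (high - low) / num_bins
    return [low + i * step for i in range(num_bins + 1)]


values = list(range(1, 101))
procedural_edges = bin_edges(values, freedman_diaconis_bins(values))
manual_edges = bin_edges(values, 4)  # user-specified number of bins
```

If NumPy is available, `numpy.histogram_bin_edges` accepts rule names such as `'fd'`, `'doane'`, `'sqrt'`, and `'sturges'` and would likely be the more practical choice.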

Testing link: http://staging.usedive.com/projects/1/datasets/1/visualize/builder/1016


[Feature] Summary Tables + Marginal Values

We've formatted field summaries into a grid. Upon selecting a field, marginal value tables are shown.

Testing link: http://staging.usedive.com/projects/3/datasets/3/analyze/summary

Testing link: http://staging.usedive.com/projects/4/datasets/60/analyze/summary


[Feature] Error messaging

For every asynchronous task (dataset upload, transformation, visualization), if an error occurs we now return server-side stack traces that are logged to the console. This reduces hanging for the user, and makes debugging easier for us.
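
One way to picture this: wrap each task so that a failure comes back as data (including the stack trace) instead of leaving the client waiting. This is a sketch under assumptions; `run_task`, the `state` values, and the envelope keys are hypothetical and not DIVE's actual task interface.

```python
import traceback


def run_task(task, *args, **kwargs):
    """Run `task` and return a result envelope instead of raising."""
    try:
        return {"state": "SUCCESS", "result": task(*args, **kwargs)}
    except Exception as error:
        # Capture the server-side stack trace so the client can log it.
        return {
            "state": "FAILURE",
            "error": repr(error),
            "traceback": traceback.format_exc(),
        }


def failing_transform():
    """Stand-in for a dataset transformation that hits a bad column."""
    raise ValueError("column 'SALES' not found")


outcome = run_task(failing_transform)
```

A client polling for the task result can then check `outcome["state"]` and print `outcome["traceback"]` to the console rather than hanging.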



[Performance] Asynchronous dataset transformation

We converted dataset transformations (pivoting, reducing, and joining) from synchronous to asynchronous processes. This lets us offload computation onto worker processes and prevents request timeouts for the user.
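
The general pattern can be sketched with the standard library. Here a thread pool stands in for the worker processes (an assumption made to keep the example self-contained), and the toy `pivot` function and its arguments are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# A thread pool is used here only as a stand-in for real worker processes.
executor = ThreadPoolExecutor(max_workers=2)


def pivot(rows, index, column, value):
    """Toy pivot: one entry per `index` value, keyed by `column` value."""
    table = {}
    for row in rows:
        table.setdefault(row[index], {})[row[column]] = row[value]
    return table


rows = [
    {"DIVISION": "ASIA", "YEAR": 2014, "SALES": 10},
    {"DIVISION": "ASIA", "YEAR": 2015, "SALES": 12},
    {"DIVISION": "EUROPE", "YEAR": 2014, "SALES": 7},
]

# submit() returns immediately, so a request handler could respond with a
# task id and let the client poll instead of blocking on the computation.
future = executor.submit(pivot, rows, "DIVISION", "YEAR", "SALES")
pivoted = future.result(timeout=5)
executor.shutdown()
```

The key point is that the request returns before the transformation finishes, so a long pivot or join no longer runs into the request timeout.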


[Performance] Optimization

Serialization

We've reduced serialization time (converting server-side return values into a browser-usable format) by up to 50% by using fewer type checks and casts.

Caching intermediate data frames

If multiple visualizations use the same intermediate grouped data frames, we cache those data frames, which eliminates the biggest bottleneck in mapping visualization specs to visualization data.
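
The idea can be sketched as a cache keyed by dataset and grouping fields, so a second visualization over the same grouping skips the expensive group-by. Plain dictionaries stand in for data frames here, and `grouped_sums`, `_group_cache`, and `stats` are hypothetical names, not DIVE's code.

```python
from collections import defaultdict

_group_cache = {}
stats = {"misses": 0}  # counts how often the grouping is actually computed


def grouped_sums(dataset_id, rows, group_field, value_field):
    """Return {group: sum(value)}, reusing a cached result when available."""
    key = (dataset_id, group_field, value_field)
    if key not in _group_cache:
        stats["misses"] += 1
        totals = defaultdict(float)
        for row in rows:
            totals[row[group_field]] += row[value_field]
        _group_cache[key] = dict(totals)
    return _group_cache[key]


rows = [
    {"DIVISION": "ASIA", "SALES": 120},
    {"DIVISION": "ASIA", "SALES": 40},
    {"DIVISION": "EUROPE", "SALES": 300},
]

# Two visualizations (say, a bar chart and a pie chart) share one grouping.
bar_chart_data = grouped_sums(17, rows, "DIVISION", "SALES")
pie_chart_data = grouped_sums(17, rows, "DIVISION", "SALES")
```

A real cache would also need invalidation when a dataset changes; that is left out of this sketch.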


Bug fixes

  • Fixed decimal formatting, so no more ridiculously long numbers.
  • Fixed hanging on large dataset transformations.

Next Up

  • Visualization configuration
  • Minimum compose
  • Correlating ad spend and sales data