DIVE Update (2/1/16 - 2/9/16)
Pre-release DIVE is hosted at two password-protected sites, with the latest development version on staging (please use this to test) and the weekly stable release on the stable site:
Staging: staging.usedive.com. (password: macro)
Stable: usedive.com. (password: macro. Limited datasets)
For any bug reports or feature requests, please e-mail us personally or at dive@media.mit.edu.
[Feature] Conditionals / Filters
We implemented categorical and quantitative filters on single visualizations, as well as the ability to combine conditionals with AND/OR. Previously we only allowed single categorical conditionals (e.g. only sales in DIVISION = ASIA).
Testing link: http://staging.usedive.com/projects/6/datasets/17/visualize/builder/2243
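As a rough illustration of how categorical and quantitative conditionals can be combined with AND/OR, here is a minimal sketch over plain dictionaries. The conditional shape (field, operation, value), the operator table, and the function names are assumptions made for this example, not DIVE's actual API.

```python
import operator

# Hypothetical operator table; the keys are illustrative, not DIVE's API.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(row, conditionals, combinator="and"):
    """Return True if a row (a dict) satisfies the combined conditionals."""
    results = (
        OPERATORS[c["operation"]](row[c["field"]], c["value"])
        for c in conditionals
    )
    # AND requires every conditional to hold; OR requires at least one.
    return all(results) if combinator == "and" else any(results)


def filter_rows(rows, conditionals, combinator="and"):
    """Keep only the rows that satisfy the conditionals."""
    return [row for row in rows if matches(row, conditionals, combinator)]


rows = [
    {"DIVISION": "ASIA", "sales": 120},
    {"DIVISION": "ASIA", "sales": 40},
    {"DIVISION": "EMEA", "sales": 300},
]
conditionals = [
    {"field": "DIVISION", "operation": "==", "value": "ASIA"},  # categorical
    {"field": "sales", "operation": ">", "value": 100},         # quantitative
]
both = filter_rows(rows, conditionals, "and")
either = filter_rows(rows, conditionals, "or")
```

With AND, only the ASIA row with sales above 100 survives; with OR, every sample row passes because each one satisfies at least one conditional.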
[Feature] Binning
We implemented four more procedural binning rules in addition to the previous default, the Freedman-Diaconis rule. Some rules suit particular distributions better (e.g. Doane's formula for non-normal data), while others (like the square-root rule) are faster to compute.
We also allow users to manually specify the number of equal-sized bins to use.
Testing link: http://staging.usedive.com/projects/1/datasets/1/visualize/builder/1016
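For reference, here is a small sketch of three of the rules named above, written from their textbook definitions, plus the manual equal-width option. It is not DIVE's implementation; the function names are made up for this example, and the quartile method used by `statistics.quantiles` is an assumption that may differ from what the server uses.

```python
import math
import statistics


def square_root_bins(data):
    """Square-root rule: k = ceil(sqrt(n)). Cheap, ignores the data's shape."""
    return math.ceil(math.sqrt(len(data)))


def doane_bins(data):
    """Doane's formula: adds a skewness term for non-normal data."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    if n < 3 or m2 == 0:
        return 1  # degenerate input: fall back to a single bin
    skew = m3 / m2 ** 1.5
    sigma = math.sqrt(6.0 * (n - 2) / ((n + 1) * (n + 3)))
    return math.ceil(1 + math.log2(n) + math.log2(1 + abs(skew) / sigma))


def freedman_diaconis_bins(data):
    """Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)."""
    n = len(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    width = 2 * (q3 - q1) / n ** (1 / 3)
    if width == 0:
        return 1  # zero IQR: fall back to a single bin
    return math.ceil((max(data) - min(data)) / width)


def equal_width_edges(data, num_bins):
    """Manual option: edges of num_bins equal-sized bins over the data range."""
    lo, hi = min(data), max(data)
    return [lo + i * (hi - lo) / num_bins for i in range(num_bins + 1)]


data = list(range(1, 101))
bin_counts = {
    "square_root": square_root_bins(data),
    "doane": doane_bins(data),
    "freedman_diaconis": freedman_diaconis_bins(data),
}
```

The rules can disagree noticeably on the same data, which is why exposing the choice (and a manual override) is useful.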
[Feature] Summary Tables + Marginal Values
We've formatted field summaries into a grid. Upon selecting a field, marginal value tables are shown.
Testing link: http://staging.usedive.com/projects/3/datasets/3/analyze/summary
Testing link: http://staging.usedive.com/projects/4/datasets/60/analyze/summary
[Feature] Error messaging
For every asynchronous task (dataset upload, transformation, visualization), if an error occurs we now return server-side stack traces that are logged to the console. This reduces hanging for the user, and makes debugging easier for us.
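The general pattern looks something like the sketch below: wrap each task so that a failure produces a structured result carrying the stack trace, rather than leaving the client waiting. The wrapper name and the result fields are hypothetical, chosen only for illustration.

```python
import traceback


def run_task(func, *args, **kwargs):
    """Run a task and always return a result dict, even on failure.

    The 'state', 'error', and 'traceback' keys are illustrative names,
    not the fields DIVE actually returns.
    """
    try:
        return {"state": "SUCCESS", "result": func(*args, **kwargs)}
    except Exception as exc:
        return {
            "state": "FAILURE",
            "error": str(exc),
            # Full server-side stack trace, suitable for logging client-side.
            "traceback": traceback.format_exc(),
        }


def broken_transformation(rows):
    """Stand-in for a task that fails partway through."""
    return sum(rows) / 0


ok = run_task(sum, [1, 2, 3])
failed = run_task(broken_transformation, [1, 2, 3])
```

The client can then check the state field and print the traceback to the console when the task has failed.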
[Performance] Asynchronous dataset transformation
We converted dataset transformations (pivoting, reducing, and joining) from synchronous to asynchronous processes. This allows us to offload computation onto worker processes, and prevents request time-outs for the user.
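A minimal sketch of the submit-then-poll pattern follows. It uses a thread pool from the standard library as a stand-in for real worker processes or a task queue, and the `submit`, `poll`, and `pivot` helpers are hypothetical names for this example only.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a pool of worker processes; an assumption for this sketch.
executor = ThreadPoolExecutor(max_workers=2)
tasks = {}  # task_id -> Future


def pivot(rows, index, column, value):
    """Toy pivot: one output row per index value, one key per column value."""
    table = {}
    for row in rows:
        table.setdefault(row[index], {})[row[column]] = row[value]
    return table


def submit(func, *args):
    """Start the work in the background and return a task id immediately."""
    task_id = str(uuid.uuid4())
    tasks[task_id] = executor.submit(func, *args)
    return task_id


def poll(task_id):
    """Report task state without blocking the request."""
    future = tasks[task_id]
    if not future.done():
        return {"state": "PENDING"}
    if future.exception() is not None:
        return {"state": "FAILURE", "error": str(future.exception())}
    return {"state": "SUCCESS", "result": future.result()}


rows = [
    {"region": "ASIA", "year": 2015, "sales": 10},
    {"region": "ASIA", "year": 2016, "sales": 12},
    {"region": "EMEA", "year": 2015, "sales": 7},
]
task_id = submit(pivot, rows, "region", "year", "sales")
```

The request that starts the transformation returns as soon as the task id exists, and the browser polls until the state is no longer pending, so no single request has to outlive the computation.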
[Performance] Optimization
Serialization
We've reduced serialization time (converting server-side returns into a browser-usable format) by up to 50% by using fewer type checks and casts.
Caching intermediate data frames
If multiple visualizations use the same intermediate grouped data frames, we cache those data frames to eliminate the biggest bottleneck in mapping visualization specs to visualization data.
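The idea can be sketched as a cache keyed by dataset and grouping field, so that two specs sharing a grouping only pay for it once. This example uses plain dictionaries instead of data frames, and the cache key, helper names, and counter are assumptions made for illustration.

```python
from collections import defaultdict

_group_cache = {}        # hypothetical cache: (dataset_id, group_field) -> groups
group_computations = 0   # counts how often the expensive grouping actually runs


def grouped(dataset_id, rows, group_field):
    """Return rows grouped by a field, computing the grouping at most once."""
    global group_computations
    key = (dataset_id, group_field)
    if key not in _group_cache:
        group_computations += 1
        groups = defaultdict(list)
        for row in rows:
            groups[row[group_field]].append(row)
        _group_cache[key] = dict(groups)
    return _group_cache[key]


def sum_by_group(dataset_id, rows, group_field, value_field):
    """Data for one visualization spec: sum of a field per group."""
    groups = grouped(dataset_id, rows, group_field)
    return {k: sum(r[value_field] for r in g) for k, g in groups.items()}


def count_by_group(dataset_id, rows, group_field):
    """Data for a second spec that reuses the same grouping."""
    groups = grouped(dataset_id, rows, group_field)
    return {k: len(g) for k, g in groups.items()}


rows = [
    {"DIVISION": "ASIA", "sales": 120},
    {"DIVISION": "ASIA", "sales": 40},
    {"DIVISION": "EMEA", "sales": 300},
]
totals = sum_by_group(1, rows, "DIVISION", "sales")
counts = count_by_group(1, rows, "DIVISION")
```

A real cache would also need invalidation when a dataset changes, which this sketch leaves out.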
Bug fixes
- Fixed decimal formatting, so no more ridiculously long numbers.
- Fixed hanging on large dataset transformations.
Next Up
- Visualization configuration
- Minimum compose
- Correlating ad spend and sales data