Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when normalizing a BigInt variable #2154

Open
juba opened this issue Aug 28, 2024 · 4 comments · May be fixed by #2155
Open

Error when normalizing a BigInt variable #2154

juba opened this issue Aug 28, 2024 · 4 comments · May be fixed by #2155
Assignees
Labels
bug Something isn’t working

Comments

@juba
Copy link
Contributor

juba commented Aug 28, 2024

When data contains a BigInt variable, trying to apply normalize on this variable throws an error: TypeError: Cannot convert a BigInt value to a number.

I've created a small notebook to reproduce the issue:

https://observablehq.com/@juba/plot-normalize-bigint-error

It seems that the error may come from this line, but I'm far from an expert on this domain so I may be mistaken:

https://github.com/observablehq/plot/blob/0032c11160e6340ce44307675d93bed382d10446/src/transforms/normalize.js#L46C1-L47C1

@juba juba added the bug Something isn’t working label Aug 28, 2024
@Fil
Copy link
Contributor

Fil commented Aug 28, 2024

Yes you can fix this example by doing

 function normalizeBasis(basis) {
   return {
     mapIndex(I, S, T) {
-      const b = +basis(I, S);
+      const b = Number(basis(I, S));
       for (const i of I) {
-        T[i] = S[i] === null ? NaN : S[i] / b;
+        T[i] = S[i] === null ? NaN : Number(S[i]) / b;
       }
     }
   };

to try, just run yarn prepublishOnly in your repo, then upload the file to your notebook and add a cell that says:

Plot = require(await FileAttachment("plot.umd.min.js").url())

However I don't think this is enough, because we need to fix all the different bases.

In that case the patch is simpler:

 function normalizeBasis(basis) {
   return {
     mapIndex(I, S, T) {
+      S = S.map(Number);
       const b = +basis(I, S);
       for (const i of I) {
         T[i] = S[i] === null ? NaN : S[i] / b;

but we should avoid doing this if the values are already numbers (not big).

So maybe instead we do:

--- a/src/options.js
+++ b/src/options.js
@@ -54,7 +54,7 @@ function maybeTypedMap(data, f, type) {
   return map(data, isNumberType(type) ? (d, i) => coerceNumber(f(d, i)) : f, type); // allow conversion from BigInt
 }

-function maybeTypedArrayify(data, type) {
+export function maybeTypedArrayify(data, type) {
   return type === undefined
     ? arrayify(data) // preserve undefined type
     : isArrowVector(data)
diff --git a/src/transforms/normalize.js b/src/transforms/normalize.js
index f4d225cd..2f426020 100644
--- a/src/transforms/normalize.js
+++ b/src/transforms/normalize.js
@@ -1,6 +1,6 @@
 import {extent, deviation, max, mean, median, min, sum} from "d3";
 import {defined} from "../defined.js";
-import {percentile, taker} from "../options.js";
+import {maybeTypedArrayify, percentile, taker} from "../options.js";
 import {mapX, mapY} from "./map.js";

 export function normalizeX(basis, options) {
@@ -43,6 +43,7 @@ export function normalize(basis) {
 function normalizeBasis(basis) {
   return {
     mapIndex(I, S, T) {
+      S = maybeTypedArrayify(S, Float64Array);
       const b = +basis(I, S);
       for (const i of I) {
         T[i] = S[i] === null ? NaN : S[i] / b;

@Fil
Copy link
Contributor

Fil commented Aug 28, 2024

Also let's create a more meaningful example / test plot. We often get BigInt as the result of a sql count, so maybe simulate this like so:

classes = d3
  .bin()(olympians.map((d) => d.height))
  .map((bin) => ({ weightclass: bin.x0, count: BigInt(bin.length) }))

// default basis
Plot.barY(
  classes,
  Plot.normalizeY({ x: "weightclass", y: "count" })
).plot({ x: { interval: 0.05 } })

// test a different basis
Plot.barY(
  classes,
  Plot.normalizeY("median", { x: "weightclass", y: "count" })
).plot({ x: { interval: 0.05 } })

Fil added a commit that referenced this issue Aug 28, 2024
@Fil Fil self-assigned this Aug 28, 2024
@Fil Fil linked a pull request Aug 28, 2024 that will close this issue
@juba
Copy link
Contributor Author

juba commented Aug 28, 2024

I can confirm that the issue seems fixed with your PR at #2155.

Just as a side note, I get these BigInt values because I'm passing arrow data coming from pandas and polars, which themselves seem to now import CSV files with 64 bits integers by default...

Thanks!

@Fil
Copy link
Contributor

Fil commented Aug 28, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn’t working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants