Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grid_data can fail due to floating point approximations #8

Open
Sbozzolo opened this issue Jan 23, 2021 · 11 comments
Open

grid_data can fail due to floating point approximations #8

Sbozzolo opened this issue Jan 23, 2021 · 11 comments
Labels
bug Something isn't working

Comments

@Sbozzolo
Copy link
Owner

Sbozzolo commented Jan 23, 2021

In HierarchicalGridData, we often check if a point belongs to a grid. In practice, the check consists in:

if np.any(point < (self.lowest_vertex)) or np.any(point >= (self.highest_vertex)):
    return False
return True

where we have the two vertices of the grid computed as self.x0 - 0.5 * self.dx and self.x1 + 0.5 * self.dx. Floating-point arithmetic can lead to false positives for values on the edge of a cell. This can manifest itself in an error in the method merge_refinement_levels.

EDIT:
Here is a file that (probably) triggers the problem: https://pastebin.com/raw/A536eJmA

@Sbozzolo Sbozzolo added the bug Something isn't working label Feb 11, 2021
@Sbozzolo
Copy link
Owner Author

Sbozzolo commented May 1, 2021

With version 1.1.0 we introduced a new algorithm to find which component contains a point. The older mechanism is kept as fallback for when the grids do not have constant refinement factors, which is uncommon. This means that this bug has now a much narrower scope.

@AuroraDysis
Copy link

In the latest version, the error still appears. Maybe we should find a better solution?

Cell In[4], line 14, in read_hc_index(sim, i)
     12 var = sim.gf.z["hc"]
     13 it = var.iterations[i]
---> 14 merged = var[it].merge_refinement_levels().to_GridSeries()
     15 return merged

File ~/anaconda3/envs/default/lib/python3.10/site-packages/kuibit/grid_data.py:2716, in HierarchicalGridData.merge_refinement_levels(self, resample)
   2713 new_shape = ((self.x1 - self.x0) / new_dx + 1.5).astype(int)
   2714 new_shape = np.array([s if s > 0 else 1 for s in new_shape])
-> 2716 return self.to_UniformGridData(
   2717     new_shape,
   2718     self.x0,
   2719     x1=None,
   2720     dx=new_dx,
   2721     time=self.time,
   2722     iteration=self.iteration,
   2723     resample=resample,
   2724 )

File ~/anaconda3/envs/default/lib/python3.10/site-packages/kuibit/grid_data.py:2677, in HierarchicalGridData.to_UniformGridData(self, shape, x0, x1, dx, resample, **kwargs)
   2653 """Combine the refinement levels into a :py:class:`~.UniformGridData` specified
   2654 by the given ``shape``, ``x0``, and ``dx`` or ``x1``.
   2655 
   (...)
   2673 
   2674 """
   2675 grid = UniformGrid(shape, x0, x1, dx, **kwargs)
-> 2677 return self.to_UniformGridData_from_grid(grid, resample=resample)

File ~/anaconda3/envs/default/lib/python3.10/site-packages/kuibit/grid_data.py:2647, in HierarchicalGridData.to_UniformGridData_from_grid(self, grid, resample)
   2631 def to_UniformGridData_from_grid(self, grid, resample=False):
   2632     """Combine the refinement levels into a :py:class:`~.UniformGridData`
   2633     on the specified :py:class:`~.UniformGrid`.
   2634 
   (...)
   2643 
   2644     """
   2645     return UniformGridData(
   2646         grid,
-> 2647         self.evaluate_with_spline(grid, piecewise_constant=(not resample)),
   2648     )

File ~/anaconda3/envs/default/lib/python3.10/site-packages/kuibit/grid_data.py:2617, in HierarchicalGridData.evaluate_with_spline(self, x, ext, piecewise_constant)
   2615 for (ref_level, comp), points_indices in level_comps.items():
   2616     points = points_arr[level_comps[ref_level, comp]]
-> 2617     evaluated_points = level_comps_data[
   2618         (ref_level, comp)
   2619     ].evaluate_with_spline(
   2620         points, ext=ext, piecewise_constant=piecewise_constant
   2621     )
   2622     ret[points_indices] = evaluated_points
   2624 # Finally, we have to reshape the array to the correct form.

File ~/anaconda3/envs/default/lib/python3.10/site-packages/kuibit/grid_data.py:633, in UniformGridData.evaluate_with_spline(self, x, ext, piecewise_constant)
    627 x_arr = x_arr.reshape(-1, x_arr.shape[-1])
    629 if piecewise_constant or (
    630     self.num_dimensions != self.num_extended_dimensions
    631 ):
--> 633     ret = self._nearest_neighbor_interpolation(x_arr, ext=ext)
    635 else:
    636     # We are here only with method = linear
    638     if self.invalid_spline:

File ~/anaconda3/envs/default/lib/python3.10/site-packages/kuibit/grid_data.py:547, in UniformGridData._nearest_neighbor_interpolation(self, points, ext)
    543 if ext == 2:
    544     if np.any(outside_indices):
    545         # For ext = 2, we simply have the raise an error if we have
    546         # any outside_index
--> 547         raise ValueError("Point outside the grid")
    548     # NumPy fancy indexing consists in a list of N tuples each
    549     # representing a coordinate, so we have to reshape the indices.
    550     # Here we use this trick:
   (...)
    553     # at the time from each dimension. Finally, we convert this iterator
    554     # to a tuple
    555     take_indices = tuple(zip(*indices))

ValueError: Point outside the grid

@Sbozzolo
Copy link
Owner Author

Sbozzolo commented Jan 25, 2023 via email

@AuroraDysis
Copy link

Sure, you can download the file from this link
https://we.tl/t-sW0ryEkqX5

@Sbozzolo
Copy link
Owner Author

Sure, you can download the file from this link https://we.tl/t-sW0ryEkqX5

Thanks, I can reproduce the problem. It has to do with component 8, refinement level 3. Here are some points:

0	0	3 8 0	768 768 88544	0	0 0 -44.9000000000002	-1.49617547692393e-06
0	0	3 8 0	768 768 88576	0	0 0 -44.8000000000002	-1.19394220589493e-06
0	0	3 8 0	768 768 88608	0	0 0 -44.7000000000002	-9.09242417991389e-07

kuibit thinks that the point -44.65 belongs to this component, and it is off by a tiny fraction.

Unfortunately, I don't know how to deal with this. I do not want to introduce checks for floating point precision in dealing with coordinates because that's an hot path and it would tremendously slow down a lot of other code. I also cannot think of any easy workaround.

What I would recommend is: this bug exists but it should be rather uncommon. If you change any grid parameter in your simulation, it should go away. If you keep running into this problem, I'll try harder to think if we can somehow catch it without too much performance penalty. Please, keep report any occurrence of this bug you might run into.

@AuroraDysis
Copy link

AuroraDysis commented Jan 26, 2023

What I would recommend is: this bug exists but it should be rather uncommon. If you change any grid parameter in your simulation, it should go away. If you keep running into this problem, I'll try harder to think if we can somehow catch it without too much performance penalty. Please, keep report any occurrence of this bug you might run into.

I could indeed avoid this error by adjusting the grid parameter, but it's not the first time it's happened, so perhaps we can fix it by adding an additional validation method. In this way, one can avoid this mistake until a real solution is found.

@chcheng3
Copy link
Contributor

chcheng3 commented Feb 7, 2023

I would also like to report a related issue with floating point approximations when using rotation180_symmetry_undone. Specifically, when I'm using the method on a UniformGridData object with x0=[0. -255.] and x1=[255. 255.] I had the error

File ~/.local/lib/python3.10/site-packages/kuibit/grid_data.py:1072, in UniformGridData._roto_reflection_symmetry_undone(self, dimension, parity, second_reflect_dimension)
   1066 if second_reflect_dimension is not None:
   1067     # The grid has to be symmetric if we reflect along a second dimension
   1068     if (
   1069         self.x0[second_reflect_dimension]
   1070         != -self.x1[second_reflect_dimension]
   1071     ):
-> 1072         raise RuntimeError(
   1073             f"Grid is not symmetric in the {second_reflect_dimension} direction"
   1074         )
   1076 # See self.grid.coordinates_to_indices():
   1077 # We convert the coordinate 0.0 to the index, this will always overestimate, so
   1078 # it will always be the first positive
   1079 index_first_positive = int(
   1080     (0.0 - self.x0[dimension]) / self.dx[dimension] + 0.5
   1081 )

RuntimeError: Grid is not symmetric in the 1 direction

Indeed if I check the error between x0[1] and -x1[1] it's a small nonzero number 5.684341886080802e-14. At least for this instance, a quick fix is to change the if statement to check within a small tolerance like

            if (
                not np.isclose(self.x0[second_reflect_dimension],
                    -self.x1[second_reflect_dimension],
                    atol=1e-13)
            ):

@Sbozzolo
Copy link
Owner Author

Sbozzolo commented Feb 7, 2023

I would also like to report a related issue with floating point approximations when using rotation180_symmetry_undone. Specifically, when I'm using the method on a UniformGridData object with x0=[0. -255.] and x1=[255. 255.] I had the error

File ~/.local/lib/python3.10/site-packages/kuibit/grid_data.py:1072, in UniformGridData._roto_reflection_symmetry_undone(self, dimension, parity, second_reflect_dimension)
   1066 if second_reflect_dimension is not None:
   1067     # The grid has to be symmetric if we reflect along a second dimension
   1068     if (
   1069         self.x0[second_reflect_dimension]
   1070         != -self.x1[second_reflect_dimension]
   1071     ):
-> 1072         raise RuntimeError(
   1073             f"Grid is not symmetric in the {second_reflect_dimension} direction"
   1074         )
   1076 # See self.grid.coordinates_to_indices():
   1077 # We convert the coordinate 0.0 to the index, this will always overestimate, so
   1078 # it will always be the first positive
   1079 index_first_positive = int(
   1080     (0.0 - self.x0[dimension]) / self.dx[dimension] + 0.5
   1081 )

RuntimeError: Grid is not symmetric in the 1 direction

Indeed if I check the error between x0[1] and -x1[1] it's a small nonzero number 5.684341886080802e-14. At least for this instance, a quick fix is to change the if statement to check within a small tolerance like

            if (
                not np.isclose(self.x0[second_reflect_dimension],
                    -self.x1[second_reflect_dimension],
                    atol=1e-13)
            ):

Yes, this is definitively a bug: I shouldn't have compared floating points with a bare equal. However, the tolerance I've been using is 1e-14, is it enough for your case?

Do you want to open a PR?

If yes,

  • Fork the next branch
  • Add your change
  • Make sure to lint your code
  • Add a note in the NEWS.md
  • Open a PR against next

The test will probably still fail because of a incompatibility with Python 3.11 (which will be solved when the next version of numba is released)

@chcheng3
Copy link
Contributor

chcheng3 commented Feb 7, 2023

However, the tolerance I've been using is 1e-14, is it enough for your case?

In my case the tolerance had to be larger than 5.68e-14, and I imagine there can easily be cases where the error between the floats is even larger. Should we make the tolerance larger (at the expense of coding this value differently from the others)?

@Sbozzolo
Copy link
Owner Author

Sbozzolo commented Feb 7, 2023

However, the tolerance I've been using is 1e-14, is it enough for your case?

In my case the tolerance had to be larger than 5.68e-14, and I imagine there can easily be cases where the error between the floats is even larger. Should we make the tolerance larger (at the expense of coding this value differently from the others)?

From the documentation of np.isclose (https://numpy.org/doc/stable/reference/generated/numpy.isclose.html):

For finite values, isclose uses the following equation to test whether two floating point values are equivalent.

absolute(a - b) <= (atol + rtol * absolute(b))

So, in your case atol=1e-14, rtol=1e-05, and b=255, so a-b <= 2.55e-3 should be satisfied for your values.

(Unless I am not understanding how np.isclose works. This would be entirely possible because np.isclose is really counter-intuitive, as opposed to `math.isclose)

@chcheng3
Copy link
Contributor

chcheng3 commented Feb 7, 2023

So, in your case atol=1e-14, rtol=1e-05, and b=255, so a-b <= 2.55e-3 should be satisfied for your values.

Ahhh, I believe you are correct. I didn't consider the relative tolerance at all. I am making the pull request: #34

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants