Probably incorrect error message. #497

Closed
DennisYurichev opened this issue Jul 18, 2024 · 21 comments · Fixed by #513

Comments

@DennisYurichev

I tried (unsuccessfully) to find the next term of https://oeis.org/A348651

Using GMP:

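#include <stdio.h>
#include <stdint.h>
#include <gmp.h>

/* 13! */
#define FAC_13 6227020800

void f1 (uint64_t x)
{
        mpz_t tmp;

        mpz_init (tmp);
        mpz_fac_ui (tmp, x);    /* tmp = x! */
        /* mpz_popcount() returns an unsigned bit count (mp_bitcnt_t) */
        printf ("%lu\n", (unsigned long) mpz_popcount(tmp));
        mpz_clear (tmp);
}

int main (void)
{
        f1(FAC_13);
        return 0;
}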

On an instance with 300G of RAM, it consumed up to ~80-100G and stopped with the error: "gmp: overflow in mpz type".
Ouch!

I also tried gmpy2:

from gmpy2 import fac, popcount

def A348651(n):
    return popcount(fac(fac(n)))

print (A348651(13))

It also consumed that much RAM and stopped with the error "Floating point exception".

Maybe you should know about such behaviour.

@skirpichev
Contributor

Yeah, that's expected. See e.g. #280: the GMP library can't gracefully work around an out-of-memory condition; it just dies with an error message, which may vary with the GMP version. (Next time, please specify the versions of gmpy2 and the libraries used.)

We should probably at least add a warning to the docs.

Not sure if it's worth it, but as noted in the referenced issue, it's possible to implement big integers (only basic arithmetic, of course) on top of the low-level GMP API; that would not suffer from this problem.

@DennisYurichev
Author

> (Next time, please specify the versions of gmpy2 and the libraries used.)

Recent, of course.

> Not sure if it's worth it, but as noted in the referenced issue, it's possible to implement big integers (only basic arithmetic, of course) on top of the low-level GMP API; that would not suffer from this problem.

Using only limb functions and structures?

@skirpichev
Contributor

> Using only limb functions and structures?

Yes, sec 8 of the manual - Low-level functions.

@skirpichev
Contributor

Oh, I forgot. In fact, I did update the docs; you can see the warnings on my fork, which displays up-to-date docs: https://gmpy-skirpichev.readthedocs.io/en/latest/overview.html

@DennisYurichev, how does this look?

@casevh, https://gmpy2.readthedocs.io/ is out of sync again; it now displays pre-2.2.0 docs. Maybe we could finally solve #358?

After the update, this issue should probably be closed. @casevh, does a new integer type (based on the low-level GMP API) make sense to you?

@DennisYurichev
Author

Well, I'm not a gmpy expert, so I can't say what the right solution is.

@skirpichev
Contributor

@DennisYurichev, my question was about the docs. That's just a matter of user experience. Is this the right place? Will it prevent issues like this one, or is the wording in the docs too vague and missing something?

@DennisYurichev
Author

If I were the manual's maintainer, I would write something like: "In case of variable overflow, may print 'Floating point error' message, this is a known quirk, to be fixed in future versions".
IMHO!

@skirpichev
Contributor

Unfortunately, it's not something that can be fixed in gmpy2 alone. It seems to be a quirk of some OS (M$ Windows, I guess, because the message looks totally misleading: what kind of "Floating point error"?).

On my system I got:

$ sudo swapoff -a
$ ./configure --enable-fat --enable-shared --disable-static --with-pic -q && make -s
[...]
$ gcc a.c -lgmp
$ LD_LIBRARY_PATH=.libs ./a.out
GNU MP: Cannot allocate memory (size=2316966024)
Aborted
$ python3.12 -m venv xxx
$ . xxx/bin/activate
$ pip install gmpy2
[...]
$ python
Python 3.12.0+ (heads/3.12:2162512d71, Jan 13 2024, 11:25:20) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from gmpy2 import fac, popcount, mp_version, __version__
>>> __version__
'2.2.1'
>>> mp_version()
'GMP 6.3.0'
>>> popcount(fac(fac(13)))
GNU MP: Cannot allocate memory (size=2316966024)
Aborted

@DennisYurichev
Author

I no longer have access to that huge-RAM instance, but I remember that I installed gmpy via pip3. Of course, it could have been an older version.

@casevh
Collaborator

casevh commented Aug 5, 2024

GMP aborts when it encounters an error that it can't handle. Two such errors are out-of-memory and divide-by-zero. There may be others (e.g. an mpz maximum-size overflow, which may be handled differently from out-of-memory).

gmpy2 tries to catch all the divide-by-zero situations and raise the appropriate exception. To trap out-of-memory, gmpy2 would need to install custom memory handlers into GMP. This was done in the gmpy 1.10 to 1.13 era.

Unfortunately, no consistent error message appears across all errors and all platforms.

@skirpichev
Contributor

> gmpy2 tries to catch all the divide-by-zero situations and raise the appropriate exception.

Indeed, GMP just calls abort() in this case too, by default. But we can catch this:

$ cat a.c
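#include <signal.h>
#include <setjmp.h>
#include <stdio.h>
#include <gmp.h>
jmp_buf excep;  /* saved context to return to after SIGFPE */
/* SIGFPE handler: report gmp_errno, then jump back into main().
   Calling printf() and longjmp() from a handler is for demonstration only. */
void handler(int signum)
{
    printf("gmp_errno: %d\n", gmp_errno);
    longjmp(excep, 1);
}
int main(void)
{
    mpz_t z, d;
    mpz_init_set_si(z, 123);
    mpz_init_set_si(d, 0);
    signal(SIGFPE, handler);
    printf("before\n");
    if (setjmp(excep) == 0)
        mpz_cdiv_q(z, z, d);  /* division by zero: GMP raises SIGFPE */
    else
        printf("Exception caught\n");  /* reached via longjmp() from the handler */
    printf("after\n");
    return 0;
}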
$ gcc a.c -lgmp -lm && ./a.out ; echo $?
before
gmp_errno: 2
Exception caught
after
0

But divide-by-zero is not the only error that raises SIGFPE. Another example is MPZ_OVERFLOW: this exception happens on reallocation errors, e.g. in mpz/realloc.c. I would guess this is what happened in the OP's second example. It also signals an out-of-memory condition.

In principle, that case could be caught just as above. But I worry that using longjmp here is unsafe for the same reasons as with custom allocation functions (see below).

> To trap out-of-memory, gmpy2 would need to install custom memory handlers into GMP. This was done in the gmpy 1.10 to 1.13 era.

Are you sure that this is doable with recent GMP? Unlike the scenario above, here GMP just calls abort() in memory.c.

You can replace these functions with custom allocation functions, but the manual says: "There’s currently no defined way for the allocation functions to recover from an error such as out of memory, they must terminate program execution. A longjmp or throwing a C++ exception will have undefined results. This may change in the future."

@casevh
Collaborator

casevh commented Aug 5, 2024

> But divide-by-zero is not the only error that raises SIGFPE. Another example is MPZ_OVERFLOW: this exception happens on reallocation errors, e.g. in mpz/realloc.c. I would guess this is what happened in the OP's second example. It also signals an out-of-memory condition.

Apologies for not using precise terminology. I was using "error" to refer to the various scenarios that cause GMP to abort/crash/whatever.

> You can replace these functions with custom allocation functions, but the manual says: "There’s currently no defined way for the allocation functions to recover from an error such as out of memory, they must terminate program execution. A longjmp or throwing a C++ exception will have undefined results. This may change in the future."

The custom memory allocation functions can't prevent GMP from aborting, but by inspecting the return value, they can raise the MemoryError exception. The program still aborts but at least there usually is a consistent error message.

I'm more interested in modifying the memory allocation functions for debugging crashes (how much memory was actually requested) and for rounding up the size of memory requests to reduce reallocs. I'm also interested in using mimalloc when it is used by CPython. There are challenges with shared libraries. When I have time, I'll create an issue for custom memory allocators.

@skirpichev
Contributor

skirpichev commented Aug 5, 2024

> I was using "error" to refer to the various scenarios that cause GMP to abort/crash/whatever.

Looking at the code, I think there are several kinds of such errors:

  1. domain errors (GMP_ERROR_SQRT_OF_NEGATIVE, GMP_ERROR_DIVISION_BY_ZERO) - these should already be handled by gmpy2.
  2. GMP_ERROR_MPZ_OVERFLOW, coming from GMP's mpz_init2(), mpz_realloc2() and _mpz_realloc() (the OP's second example)
  3. various errors from GMP's default allocation functions (these look meaningful, like "GNU MP: Cannot allocate memory (size=2316966024)").

> The custom memory allocation functions can't prevent GMP from aborting

They can, but I doubt it's safe.

> The program still aborts but at least there usually is a consistent error message.

I don't think this is a solution. Custom allocation functions can customize the behaviour for scenario 3). But for 1) and 2), it seems the correct approach is to use a signal handler that tests gmp_errno and issues an appropriate error message. (Like the example above, just without longjmp/setjmp.)

I'll provide such a patch. The docs now seem up to date, and I think this will address the issue. The hard part is to figure out how to test this for all cases.

> I'm also interested in using mimalloc when it is used by CPython.

Not sure about debugging, but this does make sense to me.

@skirpichev
Contributor

skirpichev commented Sep 13, 2024

Here is a patch for review: #513

I think I know why the OP got "Floating point exception". Since GMP 6.3.0, some errors in mpz/realloc.c are handled using __gmp_exception() (which does raise(SIGFPE)). The proposed patch catches this signal and emits a helpful (I hope) message.

Previously, the error message "gmp: overflow in mpz type" was printed under such conditions.

I believe it's a GMP issue and should be reported upstream. Can someone do this? (I can't, see #465)

@DennisYurichev, you told us that you installed gmpy2 via pip in the second case. That might have been a binary wheel, which probably comes with a bundled GMP of the latest version (6.3+).

But in the first case, you were obviously not using an up-to-date GMP version (you can download the 6.3.0 tarball and verify that the "gmp: overflow in mpz type" message is now absent from the codebase).
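To illustrate why a process that calls raise(SIGFPE) is reported as "Floating point exception" even when no floating-point arithmetic is involved, here is a minimal sketch using only the Python standard library. It assumes a POSIX system; the `CHILD` snippet and the `run_child` helper are hypothetical names for this illustration and do not use GMP or gmpy2. The child simply sends SIGFPE to itself, and the parent decodes the exit status the way a shell would:

```python
import signal
import subprocess
import sys

# Child program: restore the default SIGFPE action, then send SIGFPE to itself.
# No arithmetic is performed at all.
CHILD = (
    "import os, signal\n"
    "signal.signal(signal.SIGFPE, signal.SIG_DFL)\n"
    "os.kill(os.getpid(), signal.SIGFPE)\n"
)

def run_child() -> int:
    """Run the child and return its exit status as seen by the parent."""
    return subprocess.run([sys.executable, "-c", CHILD]).returncode

status = run_child()
if status < 0:
    # On POSIX, a negative return code -N means "killed by signal N".
    sig = signal.Signals(-status)
    print(f"child killed by {sig.name}: {signal.strsignal(sig)}")
else:
    print(f"child exited with status {status}")
```

The text returned by `signal.strsignal()` is platform-dependent, so the exact wording of the message may differ between systems.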

@dimpase
Contributor

dimpase commented Sep 13, 2024

not me, that's @DennisYurichev ?

@skirpichev
Contributor

@casevh, do you think there is no upstream issue?

@dimpase
Contributor

dimpase commented Sep 14, 2024

> do you think there is no upstream issue?

@skirpichev I can help with reporting to the GMP list - please contact me by email.

@skirpichev
Contributor

JFR, here is the thread: https://gmplib.org/list-archives/gmp-discuss/2024-September/006967.html The general outcome seems to be "that's your problem". So we will probably use the #513 approach in the long term :(

@dimpase, thanks for the help!

@dimpase
Contributor

dimpase commented Sep 26, 2024

> JFR, here is the thread: https://gmplib.org/list-archives/gmp-discuss/2024-September/006967.html The general outcome seems to be "that's your problem". So we will probably use the #513 approach in the long term :(

By the way, the whole thing is compiler-specific too (something that was skipped there).
It would be interesting to get to the core of this: is it glibc-specific, gcc-specific, or both?

E.g. a gcc-compiled native integer division by zero does indeed raise SIGFPE on x86_64, but with clang you don't seem to get any exception at all.

@dimpase
Contributor

dimpase commented Sep 26, 2024

It's definitely platform-dependent; on OpenBSD with GMP 6.3.0 and gmpy2 2.2.1 I get

GNU MP: Cannot allocate memory (size=4296015888)

whenever I try running the C and the Python code in #497 (comment)
That's built with clang 16.0.6 (and not linked with glibc, it's a rather different runtime)

@skirpichev
Contributor

The reaction to SIGFPE is probably system-specific, though usually it is a message like "floating-point exception".

But in this case:

> I get GNU MP: Cannot allocate memory (size=4296015888)

  • no, it's a message from GMP. For certain memory conditions, it prints an error to stderr and then calls abort(). But for others (division by zero, square root of a negative number, or overflow in mpz), since 6.3 it just calls raise(SIGFPE).

I suspect your system has less memory than the OP's, so you got a memory error first, not an overflow in mpz type.
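For a rough idea of why the result can hit the mpz size limit once enough memory is available, here is a back-of-the-envelope sketch in plain Python. It estimates the bit length of (13!)! from log2(n!) = lgamma(n + 1) / ln 2. The limb width (64 bits) and the limit of 2**31 - 1 limbs per mpz are assumptions about a typical 64-bit GMP build, not figures taken from this thread, and `factorial_bits` is a hypothetical helper:

```python
import math

def factorial_bits(n: int) -> float:
    """Approximate number of bits in n!, via log2(n!) = lgamma(n + 1) / ln 2."""
    return math.lgamma(n + 1) / math.log(2)

# Assumptions (not from the thread): 64-bit limbs, and an mpz limb count
# that must fit in a C int, i.e. at most 2**31 - 1 limbs.
LIMB_BITS = 64
MAX_LIMBS = 2**31 - 1

n = math.factorial(13)               # 6227020800, the argument used in the report
bits = factorial_bits(n)             # estimated size of (13!)! in bits
limbs = math.ceil(bits / LIMB_BITS)  # limbs needed to hold the result
gib = bits / 8 / 2**30               # size of the result alone, in GiB

print(f"13! = {n}")
print(f"(13!)! needs about {bits:.3e} bits, i.e. {limbs:.3e} limbs or {gib:.1f} GiB")
print(f"exceeds assumed mpz limit: {limbs > MAX_LIMBS}")
```

Under these assumptions the final value alone would not fit into a single mpz, regardless of how much RAM is installed; a machine with less memory would be expected to fail earlier, in the allocator.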
