ppci-cc: alignment problem for vararg functions arguments #97

Open
tstreiff opened this issue Jun 25, 2020 · 1 comment

@tstreiff
Contributor

When calling a vararg function (like printf), the IR generator allocates a memory block on the stack and fills it with the variable arguments.
In doing so, it inserts padding where needed to satisfy the alignment constraints.

The callee receives the memory block and walks it with a pointer: it reads a value of the expected type, then advances the pointer by the size of that type. It never accounts for padding.

This fails whenever the caller inserts padding, because the callee ignores it.

Typical case that does not work (and crashes most of the time):

int i;
char *pc;
printf("%d %s", i, pc);

For x86_64, this creates a 16-byte block laid out as follows:

  • the 4-byte i is stored at offset 0
  • 4 bytes of padding follow at offset 4
  • the 8-byte pc is stored at offset 8

printf calls va_arg(int), then va_arg(char *), and does not skip any padding:

  • it reads the 4-byte i at offset 0 (OK)
  • it reads the 8-byte pc at offset 4 (wrong, pc is stored at offset 8)
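
The mismatch can be reproduced outside the compiler. This is a minimal sketch, assuming x86_64 sizes (modelled with int32_t for the int and uint64_t for the pointer); pack_padded and read_unpadded_pc are hypothetical names for illustration, not ppci code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round offset up to the next multiple of align (a power of two). */
static size_t align_up(size_t offset, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Caller side (assumed behaviour): each argument at its natural alignment. */
static size_t pack_padded(unsigned char *block, int32_t i, uint64_t pc) {
    size_t off = 0;
    memcpy(block + off, &i, sizeof i);    /* i at offset 0 */
    off = align_up(off + sizeof i, 8);    /* skips 4 bytes of padding */
    memcpy(block + off, &pc, sizeof pc);  /* pc at offset 8 */
    return off + sizeof pc;               /* total block size */
}

/* Callee side (assumed behaviour): advance by the type size only. */
static uint64_t read_unpadded_pc(const unsigned char *block) {
    size_t off = 0;
    int32_t i;
    uint64_t pc;
    memcpy(&i, block + off, sizeof i);
    off += sizeof i;                      /* now 4, although pc lives at 8 */
    memcpy(&pc, block + off, sizeof pc);  /* mixes padding with half of pc */
    return pc;
}
```

With a zeroed 16-byte block, pack_padded reports 16 bytes used, and read_unpadded_pc returns a value that differs from the pointer that was stored. Dereferencing that value as a char * is what makes printf crash.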

Two solutions:

  1. Either the caller never inserts any padding
  2. Or all arguments are aligned on the strongest alignment constraint (8 bytes on x86_64)

Solution 2 is the only one that complies with the alignment constraints.
x86 tolerates misaligned data, but other architectures are much more sensitive.
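
A rough sketch of solution 2, assuming an 8-byte slot as on x86_64: caller and callee share one stepping rule, so every argument starts on a slot boundary. SLOT, put_arg and get_arg are hypothetical names, not existing ppci functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SLOT = 8 };  /* assumed strongest alignment (x86_64) */

/* Bytes occupied by an argument of `size` bytes: always whole slots. */
static size_t slot_size(size_t size) {
    return (size + SLOT - 1) / SLOT * SLOT;
}

/* Caller: store one argument at `off`, return the offset of the next one. */
static size_t put_arg(unsigned char *block, size_t off,
                      const void *src, size_t size) {
    assert(off % SLOT == 0);
    memcpy(block + off, src, size);
    return off + slot_size(size);
}

/* Callee: load one argument from `off`, stepping by the same rule. */
static size_t get_arg(const unsigned char *block, size_t off,
                      void *dst, size_t size) {
    assert(off % SLOT == 0);
    memcpy(dst, block + off, size);
    return off + slot_size(size);
}
```

For the printf example, the int occupies offsets 0 to 7 and the pointer starts at offset 8 on both sides, so the callee reads back exactly what the caller stored.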

The strongest alignment constraint could be computed once (and stored in the context) by taking the strongest alignment among the int, long, and pointer types. Both the vararg caller and the vararg callee IR generation would then use this value.
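
As an illustration of that computation, here is a small C11 sketch that uses the host compiler's _Alignof. In ppci the value would presumably come from the target's type information instead; VARARG_SLOT_ALIGN and vararg_slot_align are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Larger of two alignments. */
#define MAX_ALIGN2(a, b) ((a) > (b) ? (a) : (b))

/* Strongest alignment among int, long and pointer types (host values). */
enum {
    VARARG_SLOT_ALIGN =
        MAX_ALIGN2(_Alignof(int),
                   MAX_ALIGN2(_Alignof(long), _Alignof(void *)))
};

/* Computed once, then shared by caller and callee code generation. */
static size_t vararg_slot_align(void) {
    /* Alignments are powers of two, so the maximum is one as well. */
    assert((VARARG_SLOT_ALIGN & (VARARG_SLOT_ALIGN - 1)) == 0);
    return VARARG_SLOT_ALIGN;
}
```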

@windelbouwman
Owner

Thanks for this detailed analysis!

I'm not very satisfied with how vararg is implemented at the moment. It is by no means compatible with the Linux x86_64 printf, for example.

I had a look at this document: https://web.archive.org/web/20160801075139/http://www.x86-64.org/documentation/abi.pdf

It seems that the va_list type is specific to each architecture.
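
For reference, the System V x86_64 ABI describes va_list as a one-element array of a four-field structure rather than a plain pointer into a memory block. The sketch below reproduces that layout under the names sysv_va_elem and sysv_va_start, which are made up here to avoid clashing with <stdarg.h>; the helper only models the offsets that va_start sets up.

```c
#include <assert.h>
#include <stddef.h>

/* va_list element as described by the System V x86_64 ABI. */
typedef struct {
    unsigned int gp_offset;   /* next general-purpose slot in reg_save_area */
    unsigned int fp_offset;   /* next SSE slot in reg_save_area */
    void *overflow_arg_area;  /* next argument passed on the stack */
    void *reg_save_area;      /* argument registers saved by the prologue */
} sysv_va_elem;

enum { GP_REGS = 6, GP_BYTES = 8, FP_REGS = 8, FP_BYTES = 16 };

/* Hypothetical model of va_start: skip the registers used by named args. */
static sysv_va_elem sysv_va_start(unsigned named_gp, unsigned named_fp,
                                  void *stack_args, void *reg_save_area) {
    sysv_va_elem ap;
    assert(named_gp <= GP_REGS && named_fp <= FP_REGS);
    ap.gp_offset = named_gp * GP_BYTES;
    ap.fp_offset = GP_REGS * GP_BYTES + named_fp * FP_BYTES;
    ap.overflow_arg_area = stack_args;
    ap.reg_save_area = reg_save_area;
    return ap;
}
```

For printf, whose only named argument is the format pointer, this gives gp_offset 8 and fp_offset 48, which shows how far this is from a single packed block.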

One way to handle this properly, I assume, would be to add dedicated IR instructions, with a sort of polyfill that falls back to the old method.
