Discussion:
Making x86_64 PLT elision configurable
Nick Clifton
2018-09-05 16:16:35 UTC
Hi H.J, Hi Jan,

I would like to add a run time option, with a configurable default,
which would stop the linker from optimizing away x86_64 PLT entries.
The reason being that certain tools like ltrace and rtld-audit rely
upon global symbols having PLT entries that they can intercept. When
these entries are optimized away, the tools stop working.

I get that having these PLT entries might slow the program down, but I
think that a user should be able to choose whether the speed of the
linked binary is more important than being able to use the tools on
it. And toolchain builders ought to be able to decide the default
behaviour for their linker.

So I have two questions for you:

* Do you have any objections to this idea ?

* Are you happy for me to create a patch, or would you rather
make one yourselves ?

Cheers
Nick

PS. Just to be clear, I am starting with the x86_64, but I would expect
that whatever option we create would be architecture neutral and
apply to all targets that want to optimise away PLT entries.

PPS. Specifically I am talking about changing this piece of code in
bfd/elfxx-x86.c:elf_x86_allocate_dynrelocs():

  if (htab->plt_got != NULL
      && h->type != STT_GNU_IFUNC
      && !h->pointer_equality_needed
      && h->plt.refcount > 0
      && h->got.refcount > 0)
    {
      /* Don't use the regular PLT if there are both GOT and GOTPLT
         relocations.  */
      h->plt.offset = (bfd_vma) -1;

      /* Use the GOT PLT.  */
      eh->plt_got.refcount = 1;
    }

to something like:

  if (info->allow_plt_elision
      && htab->plt_got != NULL
      && h->type != STT_GNU_IFUNC
      && !h->pointer_equality_needed
      && h->plt.refcount > 0
      && h->got.refcount > 0)
    {
      ...
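
To make the intended behaviour concrete, here is a minimal standalone
sketch of the gate, not the actual BFD code. The struct layouts, the
DEFAULT_ALLOW_PLT_ELISION macro and the elide_plt_entry() helper are all
hypothetical stand-ins; only the allow_plt_elision name and the tested
conditions come from the snippet above.

```c
#include <stdbool.h>

/* Hypothetical build-time default.  A toolchain builder could override
   it, e.g. by compiling with -DDEFAULT_ALLOW_PLT_ELISION=0.  */
#ifndef DEFAULT_ALLOW_PLT_ELISION
#define DEFAULT_ALLOW_PLT_ELISION 1
#endif

/* Simplified stand-in for the per-link state (assumption, not the
   real struct bfd_link_info).  */
struct link_info
{
  bool allow_plt_elision;	/* The proposed flag.  */
};

/* Simplified stand-in for the per-symbol state (assumption).  */
struct sym
{
  bool have_plt_got;		/* Models htab->plt_got != NULL.  */
  bool is_ifunc;		/* Models h->type == STT_GNU_IFUNC.  */
  bool pointer_equality_needed;
  int plt_refcount;
  int got_refcount;
};

/* Return true when the regular PLT entry would be dropped in favour
   of the GOT PLT.  With the flag cleared, the PLT entry is kept.  */
static bool
elide_plt_entry (const struct link_info *info, const struct sym *h)
{
  return (info->allow_plt_elision
	  && h->have_plt_got
	  && !h->is_ifunc
	  && !h->pointer_equality_needed
	  && h->plt_refcount > 0
	  && h->got_refcount > 0);
}
```

In this sketch a link would start with info.allow_plt_elision set to
DEFAULT_ALLOW_PLT_ELISION, and a command line option (whose name is still
to be decided) would set or clear it before symbols are processed.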
Florian Weimer
2018-09-05 19:18:43 UTC
Post by Nick Clifton
> I would like to add a run time option, with a configurable default,
> which would stop the linker from optimizing away x86_64 PLT entries.
> The reason being that certain tools like ltrace and rtld-audit rely
> upon global symbols having PLT entries that they can intercept. When
> these entries are optimized away, the tools stop working.
>
> I get that having these PLT entries might slow the program down, but I
> think that a user should be able to choose whether the speed of the
> linked binary is more important than being able to use the tools on
> it. And toolchain builders ought to be able to decide the default
> behaviour for their linker.

If we look at this problem differently (and had effectively unbounded
time), we could pick up the work on address significance tables and
teach the dynamic linker to synthesize a stub whenever a non-significant
address relocation is used (to record the who-calls-what information),
and use a generic stub which only has what-is-called information for
address-significant relocations.

Thanks,
Florian
