Reintroduce a tweaked version of the IP-based caching implementation.
Implementing a custom "new" operator has two desirable properties:
- make the code more standalone, not depending on "new" from libcxx.
- teach this allocator to return nullptr on memory shortage ("noexcept")
so it can fail gracefully. If we can't allocate an item, we just don't
cache it.
That should be more resilient to memory shortages and thus more usable
from libexecinfo.
ok rsadowski@ robert@
I've been overzealous when backing out some unrelated changes.
Re-apply requested by robert@
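
As a rough illustration of the non-throwing allocator described above, here
is a minimal sketch, not the committed code. The CacheNode type and the
try_make_node() helper are hypothetical names, and using malloc() as the
backing allocator is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical cache node; the name and fields are illustrative only.
struct CacheNode {
  std::uintptr_t start_ip;
  std::uintptr_t end_ip;

  CacheNode(std::uintptr_t s, std::uintptr_t e) : start_ip(s), end_ip(e) {}

  // Class-specific allocator backed by malloc() (an assumption), so it does
  // not rely on the global operator new from the C++ runtime. Because it is
  // declared noexcept, a failed allocation makes the new-expression yield
  // nullptr instead of throwing, and the constructor is not run.
  static void *operator new(std::size_t size) noexcept {
    return std::malloc(size);
  }
  static void operator delete(void *p) noexcept { std::free(p); }
};

// Try to allocate a node. On memory shortage this returns nullptr, and the
// caller can simply skip caching the item.
inline CacheNode *try_make_node(std::uintptr_t s, std::uintptr_t e) {
  return new CacheNode(s, e);
}
```

A caller would check the result for nullptr and carry on without caching
when the allocation fails.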
-------------------
Linux still doesn't actually implement IBT for userland. And by the pace
things are going, it will take another decade before it does. But OpenBSD
has it enabled *by default* already.
Drop the #ifdef __linux__. This shouldn't hurt other OSes when they finally
catch up with us.
ok robert@, tb@
-------------------
Initial IP-based caching implementation with O(log n) lookup.
Caching is implemented via red-black trees. This can be improved, and
further work is ongoing to bring it closer to the performance of GNU's
implementation, which uses an LRU-MRU 8-entry caching algorithm.
Prompted by robert@, who ran into a 5-minute runtime for an executed macro
in libreoffice. With this, the execution time is reduced to 58 seconds.
C++11 tips from espie@, rsadowski@
Tested by robert@
OK mortimer@, kettenis@.
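
To illustrate the O(log n) lookup by instruction pointer, here is a minimal
sketch, not the committed code. It assumes std::map as the ordered container
(commonly implemented as a red-black tree, though the standard does not
require it). IpCache, RangeEntry and the integer "info" payload are
hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical cached payload; a real unwinder would store unwind info here.
struct RangeEntry {
  std::uintptr_t end_ip; // one past the last address covered by the range
  int info;              // placeholder for the cached data
};

class IpCache {
public:
  // Remember that [start, end) maps to info.
  void insert(std::uintptr_t start, std::uintptr_t end, int info) {
    ranges_[start] = RangeEntry{end, info};
  }

  // Return the entry whose [start, end) range contains ip, or nullptr.
  const RangeEntry *lookup(std::uintptr_t ip) const {
    auto it = ranges_.upper_bound(ip); // first range starting after ip
    if (it == ranges_.begin())
      return nullptr;                  // every range starts after ip
    --it;                              // last range starting at or before ip
    return ip < it->second.end_ip ? &it->second : nullptr;
  }

private:
  std::map<std::uintptr_t, RangeEntry> ranges_; // keyed by range start
};
```

Keying the tree by range start lets a single upper_bound() call find the only
candidate range, so a miss costs the same O(log n) as a hit.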
--------------------
Make the unwind cache thread-safe by declaring it thread_local. Solves
segfaults seen on exception handling. ok kettenis@
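
The effect of thread_local can be sketched as follows. This is an
illustration only: tls_unwind_cache, cache_insert() and
size_seen_by_new_thread() are hypothetical names, and std::map stands in for
the real cache structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <thread>

// Each thread gets its own instance of this map, so concurrent unwinding
// in different threads never touches shared state and needs no locking.
thread_local std::map<std::uintptr_t, int> tls_unwind_cache;

// Insert into the calling thread's cache and return that cache's size.
inline std::size_t cache_insert(std::uintptr_t ip, int info) {
  tls_unwind_cache[ip] = info;
  return tls_unwind_cache.size();
}

// Perform one insert on a fresh thread and report the size seen there.
inline std::size_t size_seen_by_new_thread() {
  std::size_t seen = 0;
  std::thread t([&seen] { seen = cache_insert(0x2000, 2); });
  t.join();
  return seen;
}
```

An entry inserted by one thread is not visible to another, which costs some
duplicated work per thread but removes the data race.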
--------------------
code at the tentative return address in order to check for a possible
inter-space thunk ("export stub" as called by HP), and in this case, unfolds
a bit more to retrieve the stub return address.
When the code pages are not readable, which is the case by default under
OpenBSD/hppa, this causes an immediate segmentation fault. But there is no
use of multiple space registers under OpenBSD either, so such thunks are
never created by ld(1). We can therefore override the logic by simply
returning the tentative return address, which is correct and does not require
read permission on the code pages.
ok jca@
UBSan and execute-only are mutually exclusive: -fsanitize=function, which is
implied by UBSan, checks indirect function calls and only considers a
destination valid if it is preceded by a well-known sequence (0xc105cafe).
This of course requires being able to read the text segment, making it
incompatible with execute-only.
Running UBSan on OpenBSD now requires the options:
-fsanitize=undefined -fsanitize-minimal-runtime -Wl,--no-execute-only
ok sthen@, tb@
%L%. are wrong.
The fact that this never caused any assembler complaints hints that the two
occurrences of this wrong construct are never hit - and in fact, commenting
them out entirely does not appear to change anything in generated code.
set up in variadic functions with the "new" frame layout, make sure that
variadic functions always set up a frame pointer, even if they qualify for
omitting it (e.g. variadic leaf functions which do not call alloca or use
exceptions), as their save area is relative to %r30.
This buglet amazingly went unnoticed as it was apparently not blatant enough
to break the gcc testsuite - the only piece of code it broke horribly was
the mvme88k boot blocks.
The preserve_{most,all} attributes do not work with retguard so let's warn
the user about them while also ignoring them completely.
ok kettenis@, miod@
basic block reordering.
Basic block span computation has been fixed, hopefully for good, in m88k.md
rev 1.16.
The delayed branch optimization is unfortunately still corrupting register
values by moving instructions which shouldn't be moved, in complex enough code.
Even though the gcc testsuite passes, including the few tests which exercise
this, in gcc's own tree-cfg.c, the combination of the inlining of
update_modified_stmts() and delayed branching creates a code path where the
argument of one update_stmt_operands() call (via update_stmt_is_modified())
is overwritten with a load of the (declared noreturn, and it matters)
fancy_abort() message in the stmt_ann() diagnostic.
This can be reduced to a 127 line testcase, which will hopefully let me cut
my teeth further on this.
In the meantime, disabling this optimization allows gcc 4 to be self-hosting
again on m88k.
Also, revert override of TARGET_BUILTIN_SETJMP_FRAME_VALUE - this was done
while experimenting with sjlj exceptions support, to make them work better,
but now that unwinding works it is no longer useful.
Back when compiler-rt was missing __atomic_is_lock_free, we added it
for 64-bit atomic ops on powerpc. A later version of compiler-rt
added __atomic_is_lock_free to builtins/atomic.c; the linker has been
finding atomic.c and ignoring atomic_lock_free.c.
ok kettenis@
Create a single arch-independent list for them and allow for replacement
of the generic C implementation with arch-dependent assembly code.
ok rsadowski@
perform a setrlimit() call requiring the "proc" pledge if you invoke the
compiler with -dH in order to be able to get a core dump of cc1{,plus} if
it ICEs.
ok deraadt@
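
For reference, the kind of setrlimit() call involved can be sketched as
follows. This is an illustration, not the compiler's actual code:
enable_core_dumps() is a hypothetical helper, and raising the soft
RLIMIT_CORE limit to the hard limit is an assumption about what such a call
does.

```cpp
#include <cassert>
#include <sys/resource.h>

// Raise the soft core-size limit to the hard limit so that a crashing
// process can leave a core dump. Returns true on success.
// On OpenBSD, a pledged process needs the "proc" promise for setrlimit().
inline bool enable_core_dumps() {
  struct rlimit rl;
  if (getrlimit(RLIMIT_CORE, &rl) != 0)
    return false;
  rl.rlim_cur = rl.rlim_max; // raising soft up to hard needs no privilege
  return setrlimit(RLIMIT_CORE, &rl) == 0;
}
```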
and would create incomplete CFI information, concerning only the lowest register
of the pair. Hilarity^Wcoredumps would then occur while trying to handle
exceptions.
Attach a note with the equivalent pair of single store instructions, to let
it do the right thing.
caused internal compiler errors building libobjc. But libobjc is no longer
being built and I am quite sure the problem would have been fixed by the
other bugfixes of the last few weeks.
in the delay slot of a branch instruction, as output_call had a way to merge
them.
But the jump instruction itself may have another instruction in its own delay
slot. This would cause generated code to be:
	bsr.n	function
	br.n	label
	another instruction
which confuses the processor and has absolutely no chance to work as intended.
Apparently we had been lucky enough not to stumble upon this in the gcc 2 and
gcc 3 days, but gcc 4 is more daring at optimizing things and ends up
producing this invalid code, especially in exception-related code paths.
The only safe thing to do is to declare jump instructions as not allowed in
the delay slot of branch instructions. This loses the output_call optimization
but code correctness is preferred.