The current implementation creates a non-natural loop, and neither the SIL nor the LLVM loop passes can handle such loops. We have to find a way to fix this in SIL. Until then, rewrite the code so that we get a natural loop in SIL.
The Swift and LLVM function mergers were disabled when Swift VFE or WME
is enabled, because the mergers did not respect the metadata on
calls to `llvm.type.checked.load`. This is no longer the case,
so we can turn these passes back on.
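For context, a hedged sketch of such a call (the type-identifier string here is made up): the metadata operand identifies the vtable slot being checked, so two calls that differ only in that operand must not be treated as identical by a merger.

declare { i8*, i1 } @llvm.type.checked.load(i8*, i32, metadata)

define i8* @load_vfunc(i8* %vtable) {
  ; returns the loaded function pointer plus a flag saying whether the
  ; type check succeeded
  %pair = call { i8*, i1 } @llvm.type.checked.load(i8* %vtable, i32 0, metadata !"SomeClass.vtable.id")
  %fn = extractvalue { i8*, i1 } %pair, 0
  ret i8* %fn
}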
Tool selection is primarily done by checking the executable (= symlink) name.
But sometimes (e.g. if the tool symlink is not there) it's useful to have an option for selecting the tool.
The selection option (e.g. -sil-opt) must be the first argument of swift-frontend.
In the low-level LLVMARCOptimizer, during canonicalization we do not RAUW the result of an RT_Retain with its argument, as we do for RT_ObjCRetain and RT_BridgeRetain.
Yet performLocalReleaseMotion asserts that RT_Retain has been canonicalized.
In a release compiler, optimizing such an RT_Retain against an RT_Release can therefore crash the compiler.
For the same reason, not RAUW'ing can also cause a crash in performLocalRetainMotion.
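A minimal LLVM IR sketch of the canonicalization in question (types simplified, @use is a stand-in):

%swift.refcounted = type opaque
declare %swift.refcounted* @swift_retain(%swift.refcounted*)
declare void @use(%swift.refcounted*)

define void @not_canonicalized(%swift.refcounted* %obj) {
  ; downstream code still refers to the retain's result ...
  %r = call %swift.refcounted* @swift_retain(%swift.refcounted* %obj)
  call void @use(%swift.refcounted* %r)
  ret void
}

define void @canonicalized(%swift.refcounted* %obj) {
  ; ... whereas after RAUW'ing the result with the argument, the
  ; retain/release motion code sees the original SSA value everywhere,
  ; which is what it asserts on
  %r = call %swift.refcounted* @swift_retain(%swift.refcounted* %obj)
  call void @use(%swift.refcounted* %obj)
  ret void
}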
Fixes rdar://79238115
rdar://problem/48833545
From the LLVM Language Reference regarding tail/musttail: "Both markers imply that the callee does not access allocas from the caller."
Swift's LLVMARCContract marks every call it creates as a tail call, without any analysis or check that we are allowed to do so. This caused an interesting runtime crash that was a pain to debug. Story time:
I traced the crash back to Swift's LLVMARCContract, but could not grok why the transformation there was wrong: we had replaced two consecutive _swift_bridgeObjectRelease(x) calls with _swift_bridgeObjectRelease_n(x, 2), which is a perfectly valid thing to do.
I noticed that the new call was marked as a tail call; disabling that portion of the pass "solved" the runtime crash, but I wanted to understand *why*:
This code worked:
pushq $2
popq %rsi
movq -168(%rbp), %rdi
callq _swift_bridgeObjectRelease_n
leaq -40(%rbp), %rsp
popq %rbx
popq %r12
popq %r13
popq %r14
popq %r15
popq %rbp
retq
While this version crashed further on during the run:
movq -168(%rbp), %rdi
leaq -40(%rbp), %rsp
popq %rbx
popq %r12
popq %r13
popq %r14
popq %r15
popq %rbp
jmp _swift_bridgeObjectRelease_n
As you can see, the call is the last thing we do before returning, so nothing appeared out of the ordinary at first…
Dumping the heap object at the release basic block looked perfectly fine: the ref count was 2 and all the fields looked valid.
However, by the time we reached the callee the value had been modified: dumping it showed it had changed somewhere along the way, which did not make any sense.
Setting up a memory watchpoint on the heap object and/or its reference count did not get us anywhere: the watchpoint triggered on unrelated code at the entry to the callee.
I then realized what was going on. Here's an amusing reproducer that you can check out in LLDB:
Experiment 1:
Set up a breakpoint at leaq -40(%rbp), %rsp
Dump the heap object - it looks good
Experiment 2:
Rerun the same test with a small modification:
Set up a breakpoint at popq %rbx (the instruction after the leaq; do not set a breakpoint at the leaq itself)
Dump the heap object - it looks bad!
So what is going on there? The SIL Optimizer changed an alloc_ref instruction into an alloc_ref [stack], which is a perfectly valid thing to do.
However, this means we allocated the heap object on the stack, and then tail-called into the Swift runtime with said object, after the caller's epilogue had already adjusted the stack pointer.
So why does experiment 2 show garbage? We have already updated the stack pointer, and it just so happens that the object now sits below the new stack pointer, past the red zone. When the breakpoint is hit (the OS passes control back to LLDB), it is perfectly allowed to use the memory where the heap object used to reside.
Note: I then realized something even more concerning that we were lucky not to have hit so far. Not only did we fail to check whether we are allowed to mark a call as 'tail' in this situation (which could be considered a corner case, and which we could have gotten away with had the object not been promoted from heap to stack), we marked *ALL* the call instructions created in this pass as tail calls, even when they are not the last thing that happens in the calling function. Looking at the LLVMPasses/contract.ll test case, which is modified in this PR, we see some scary checks that are simply wrong: we check that a call in the middle of a function is marked 'tail', then check the rest of the function in CHECK-NEXT lines, knowing full well that the new 'tail call' is not the last thing that should execute in the caller.
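A minimal LLVM IR sketch of the hazard (names and types simplified; not the exact IR the compiler emits):

declare void @swift_bridgeObjectRelease_n(i8*, i32)

define void @caller() {
entry:
  ; a stack-promoted object: alloc_ref [stack] lowers to storage inside
  ; the caller's own frame (size made up for the sketch)
  %obj = alloca [64 x i8], align 16
  %p = getelementptr inbounds [64 x i8], [64 x i8]* %obj, i64 0, i64 0
  ; WRONG: 'tail' promises the callee never touches the caller's allocas,
  ; so the backend is free to tear down the frame (and move %rsp) before
  ; the jump, exactly as in the crashing assembly above
  tail call void @swift_bridgeObjectRelease_n(i8* %p, i32 2)
  ret void
}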
We used to represent these just as normal LLVM functions, e.g.:
declare %objc_object* @objc_retain(%objc_object*)
declare void @objc_release(%objc_object*)
Recently, special ObjC intrinsics were added to LLVM. This patch updates these
small (old) passes to use the new intrinsics.
This turned out to not be too difficult since we never create these
instructions. We only analyze them, move them, and delete them.
rdar://47852297
DTrace has special linker support to detect and patch probe call sites.
Each call to a DTrace probe must resolve to a unique patchpoint.
rdar://45738058
These functions don't accept local variable heap memory, although the names make it sound like they work on anything. When you try, they mistakenly identify such things as ObjC objects, call through to the equivalent objc_* function, and crash confusingly. This adds Object to the name of each one to make it more clear what they accept.
rdar://problem/37285743
In rare corner cases the pass merged two functions which contain incompatible call instructions.
See source comment in the change for details.
rdar://problem/43051718
Since we introduce the declaration for bridgeRetainN ourselves, its result type may be out
of sync with bridgeRetain's. This means that when we RAUW one with the
other, the types do not match and we get an LLVM error. Instead, just cast
bridgeRetainN's result to bridgeRetain's result type.
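A sketch of the fix in LLVM IR (types and signatures simplified; the exact runtime declarations may differ):

%swift.bridge = type opaque

declare %swift.bridge* @swift_bridgeObjectRetain(%swift.bridge*)
; the _n variant we declare ourselves might come back with a different
; result type, e.g. i8*
declare i8* @swift_bridgeObjectRetain_n(i8*, i32)

define %swift.bridge* @combine(%swift.bridge* %obj) {
  %arg = bitcast %swift.bridge* %obj to i8*
  %n = call i8* @swift_bridgeObjectRetain_n(i8* %arg, i32 2)
  ; cast the _n result back to bridgeRetain's result type before RAUW'ing
  ; the original retain calls with it
  %r = bitcast i8* %n to %swift.bridge*
  ret %swift.bridge* %r
}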
rdar://40507281
Sometimes when running ARCContract on LLVM IR, certain required declarations will
already have been deleted. This triggers an assert, which makes it difficult to run
test cases through the pass.
Instead, if we cannot find the runtime function we are looking for by name,
recreate the named type as an opaque struct. Since we cannot find the function
by name, we can assume that either this is a runtime function that only our
passes create, or that the declarations were dead. Thus there is
no earlier data, so we can just create a new opaque struct type and use pointers
to that struct type. If we later merge this with another module that did
not delete the type definition, LLVM will fill in the type's body. Since we
only use pointers to the type, there will be no codegen differences.
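A sketch of what the recovered declaration looks like (names are illustrative):

; the runtime function was not found by name, so the pass recreates the
; pointee as an opaque struct and declares the function in terms of
; pointers to it; if the module is later merged with one that still has
; the real definition, LLVM fills in the struct body
%swift.bridge = type opaque
declare void @swift_bridgeObjectRelease_n(%swift.bridge*, i32)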
rdar://40491584
* Remove RegisterPreservingCC. It was unused.
* Remove DefaultCC from the runtime. The distinction between C_CC and DefaultCC
was unused and inconsistently applied. Separate C_CC and DefaultCC are
still present in the compiler.
* Remove function pointer indirection from runtime functions except those
that are used by Instruments. The remaining Instruments interface is
expected to change later due to function pointer liability.
* Remove swift_rt_ wrappers. Function pointers are an ABI liability that we
don't want, and there are better ways to get nonlazy binding if we need it.
The fully custom wrappers were only needed for RegisterPreservingCC and
for optimizing the Instruments function pointers.
On architectures where the calling convention uses the same register for the
argument and the return value, this allows the argument register to stay live across the calls.
We use LLVM's 'returned' attribute on the parameter to facilitate this.
We used to perform this optimization via an optimization pass. This was ripped
out some time ago around commit 955e4ed652.
By using LLVM's 'returned' attribute on swift_*retain, we get the same
optimization from the LLVM backend.
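A sketch of the resulting declaration (types simplified):

%swift.refcounted = type opaque

; 'returned' tells LLVM that the call returns its first argument
; unchanged, so on targets where the argument and return registers
; coincide the value simply stays in that register across the call
declare %swift.refcounted* @swift_retain(%swift.refcounted* returned)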
First, it fixes a crash where the eliminated function is still referenced.
This shows up if two equivalent self-recursive functions are merged and those functions are internal.
Fixes SR-4514, rdar://problem/31479425
Second, it avoids creating an unneeded parameter for truly equivalent self-recursive functions.
Evidently this has been silently testing nothing for a while. Luckily, the
functionality seems not to be broken, just the test; IRGen is also emitting the
correct function name.
Swift uses rt_swift_* functions to call the Swift runtime without using dyld's stubs. These functions are renamed to swift_rt_* to reduce namespace pollution.
rdar://28706212
When constructing the body of the merged function, the merged function is
always given internal linkage; it is only accessible through the adjusting
thunks, which retain the original DLL storage. Ensure that we reset the DLL
storage that the merged function inherited from the first input function.
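A sketch of the intended shape after merging (names and signatures are illustrative):

; the adjusting thunk keeps the original linkage and DLL storage ...
define dllexport void @foo(i32 %x) {
  tail call void @foo_merged(i32 %x, i32 1)
  ret void
}

; ... while the shared body is internal and must not keep the dllexport
; storage it would otherwise inherit from the first input function
define internal void @foo_merged(i32 %x, i32 %c) {
  ret void
}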
It's like LLVM's MergeFunctions pass, except that it can also merge functions which differ by some constants.
The intention is to merge specialized functions which only differ by metadata lookups, but it can also merge other kinds of functions.
It gives a ~7% code size reduction for the stdlib.
There are still some open TODOs, e.g. sharing common code with LLVM's MergeFunctions pass (currently much code is just copied).
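A sketch of the idea in LLVM IR (simplified; the real candidates differ in metadata constants rather than small integers):

; two specializations that differ only in a single constant ...
define i32 @spec1(i32 %x) {
  %r = add i32 %x, 10
  ret i32 %r
}
define i32 @spec2(i32 %x) {
  %r = add i32 %x, 20
  ret i32 %r
}

; ... can share one body that takes the differing constant as an extra
; parameter, with the originals becoming thin thunks that pass it in
define internal i32 @spec_merged(i32 %x, i32 %c) {
  %r = add i32 %x, %c
  ret i32 %r
}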
This occurred if a stack-promoted object with a devirtualized final release was not actually allocated on the stack.
Now the ReleaseDevirtualizer models the procedure of a final release more accurately:
it inserts a set_deallocating instruction and calls the deallocator (instead of just the deinit).
This change also includes two peephole optimizations in IRGen and LLVMStackPromotion which get rid of
unused runtime calls in case the stack-promoted object really is allocated on the stack.
This fixes rdar://problem/25068118
This prevents the linker from trying to emit relative relocations to locally-defined public symbols into dynamic libraries, which gives ld.so heartache.
This lets us remove `swift_fixLifetime` as a real runtime entry point. Also, avoid generating the marker at all if the LLVM ARC optimizer won't be run, as in -Onone or -disable-llvm-arc-optimizer mode.
Assertion failed: (NumUsePointsToFind > 0 && "There must be at least one
releasing instruction for an alloc"), function canPromoteAlloc
Revert "Fix comment for StackPromotion pass in SIL Passes"
Revert "Reapply the StackPromotion commit
0dd045ca04dcc10a33abf57f7e1b08260c4e3de1."
This reverts commit 3f4b1496bd and commit
199cfca13b.