There are a couple of known race conditions that appear to stem from
the concurrency runtime and trigger failures in the TSan tests.
rdar://158355890
Disabling several TSan tests that are failing due to a bug in the thread
sanitizer. TSan appears to be setting the thread to ignored in the
`thr_exit` interceptor, and then immediately checking that the thread
isn't being ignored, and dying.
```
ThreadSanitizer: main thread finished with ignores enabled
One of the following ignores was not ended (in order of probability)
```
rdar://158450231
Fixing pthread usage in tsan and tsan-inout tests. pthreads are imported
as opaque pointers on FreeBSD, and thus need to be kept in an optional
pthread_t, like on Apple platforms.
Unlike on macOS, pthread_join is not annotated with nullability
annotations and thus takes an optional opaque pointer, so we don't need
to unwrap it.
Unified builds of compiler-rt together with LLVM failed for the Android SDKs. It got too complicated to redirect the way LLVM would configure the nested build-trees. Standalone builds slightly increase build time, but they turned out much simpler and we end up with less duplication of definitions.
* Use the longer name ThreadSanitizer rather than TSan for the new files.
* Don't implement `tsan::consume` at all for now.
* Do the `tsan::release` for `ulock_unlock()` at the head of the function,
not at the tail.
* Add a comment to test/Sanitizers/tsan/once.swift to explain the test a
little more clearly.
rdar://110665213
Move the TSan functionality from Concurrency into Threading. Use it
in the Linux `ulock` implementation so that TSan knows about `ulock`
and will tolerate the newer `swift_once` implementation that uses it.
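A minimal sketch of the annotation pattern, for illustration only. The
`sketch` namespace, `ToyLock`, and the recording hooks are hypothetical
stand-ins, not the code in Threading; a real build would forward the
hooks to the sanitizer instead of logging them. It assumes the lock word
is a plain atomic and shows the acquire after the lock is taken and the
release before the word is cleared.

```cpp
#include <atomic>
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-ins for the TSan hooks: they only record call order.
inline std::vector<std::string> &events() {
  static std::vector<std::string> log;
  return log;
}
inline void tsan_acquire(void *) { events().push_back("acquire"); }
inline void tsan_release(void *) { events().push_back("release"); }

// Hypothetical spin lock standing in for the futex-backed ulock.
struct ToyLock {
  std::atomic<std::uint32_t> word{0};

  void lock() {
    std::uint32_t expected = 0;
    while (!word.compare_exchange_weak(expected, 1,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
      expected = 0; // a failed exchange overwrites `expected`; reset it
    }
    // The lock is held: tell the sanitizer we observe the previous owner.
    tsan_acquire(&word);
  }

  void unlock() {
    // Annotate at the head, while this thread still owns the lock.
    tsan_release(&word);
    events().push_back("store");
    word.store(0, std::memory_order_release);
  }
};

} // namespace sketch
```

One plausible reading of "head, not tail": once the word is cleared,
another thread may already hold the lock, so an annotation emitted after
the store could come too late to order the critical section.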
rdar://110665213
The real fix here is to make sure the . operator actually gets its own source
location. Right now, it depends on what code is being inlined from the stdlib,
hence the failure with the debug stdlib only. We can relax the test further in
the short term.
rdar://108132971
This patch replaces the stateful generation of SILScope information in
SILGenFunction with data derived from the ASTScope hierarchy, which should be
100% in sync with the scopes needed for local variables. The goal is to
eliminate the surprising effects that the stack of cleanup operations can have
on the current state of SILBuilder leading to a fully deterministic (in the
sense of: predictable by a human) association of SILDebugScopes with
SILInstructions. The patch also eliminates the need for many workarounds. There
are still some accommodations for several Sema transformation passes such as
ResultBuilders, which don't correctly update the source locations when moving
around nodes. If these were implemented as macros, this problem would disappear.
The necessary rewrite of the macro scope handling included in this patch also
adds proper support for nested macro expansions.
This fixes
rdar://88274783
and either fixes or at least partially addresses the following:
rdar://89252827
rdar://105186946
rdar://105757810
rdar://105997826
rdar://105102288
This test does:
```
race()
print("Done!")
// CHECK: ThreadSanitizer: data race
// CHECK: Done!
```
We have recently seen cases where the output of the test binary on iOS
devices was:
```
Done!
==================
WARNING: ThreadSanitizer: data race
…
```
So apparently the TSan report output is not guaranteed to be printed
before "Done!". Maybe this is because we print "Done!" on stdout and
the sanitizer report on stderr?
The remaining question is: what changed that we are seeing this issue
now, but not previously?
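To make the ordering problem concrete, here is a small sketch that
models the two CHECK lines as substring matches. `matchesInOrder` and
`matchesAnyOrder` are hypothetical helpers written for this note, not
part of FileCheck or lit. The first requires each pattern to appear
after the previous match, the second only requires each to be present.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True if every needle occurs in `output`, each one after the previous
// match (a simplified model of consecutive CHECK lines).
bool matchesInOrder(const std::string &output,
                    const std::vector<std::string> &needles) {
  std::size_t pos = 0;
  for (const auto &needle : needles) {
    std::size_t found = output.find(needle, pos);
    if (found == std::string::npos)
      return false;
    pos = found + needle.size();
  }
  return true;
}

// True if every needle occurs somewhere in `output`, in any order.
bool matchesAnyOrder(const std::string &output,
                     const std::vector<std::string> &needles) {
  for (const auto &needle : needles) {
    if (output.find(needle) == std::string::npos)
      return false;
  }
  return true;
}
```

With the output seen on device ("Done!" first, report second), the
ordered match fails while the unordered one still passes.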
rdar://99713724
Co-authored-by: Julian Lettner <julian.lettner@apple.com>
Load task status with an acquire when canceling a task, to synchronize with the store-release that comes when updating a task's status.
Add explicit TSan calls in cancellation, as well as withStatusRecordLock and addStatusRecord, to avoid TSan complaining about data races when canceling a task.
Add a test that checks for TSan-reported data races when canceling a task.
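A minimal sketch of the acquire/release pairing described above, using
only the C++ standard library. `ToyTask` and `publishThenObserve` are
hypothetical names for illustration, and the single status bit is an
assumption; the real task status word is more involved.

```cpp
#include <atomic>
#include <thread>

// Hypothetical task: plain data that is published through a status word.
struct ToyTask {
  int record = 0;                  // non-atomic data guarded by `status`
  std::atomic<unsigned> status{0}; // bit 0: "record has been published"
};

// One thread publishes with a store-release; the calling thread (standing
// in for the canceling side) observes with a load-acquire and only then
// reads the plain field.
int publishThenObserve() {
  ToyTask task;
  std::thread updater([&task] {
    task.record = 42;
    task.status.store(1, std::memory_order_release);
  });

  while ((task.status.load(std::memory_order_acquire) & 1u) == 0) {
    std::this_thread::yield();
  }
  int seen = task.record; // ordered after the write by the acquire load

  updater.join();
  return seen;
}
```

If the load used `std::memory_order_relaxed` instead, the read of
`record` would no longer be ordered after the write, which is the kind
of access TSan reports.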
rdar://93892417
- #58975 switched many tests from XFAIL on linux to linux-gnu, so seven
fail on the Android CI and two natively. They are now explicitly excluded.
- #39605 added several C++ Interop tests, 11 of which fail on the Android CI,
so disable them for now.
- #42478 removed the @noescape attribute for the non-Android
SIL/clang-function-types tests, so I remove it for Android too.
- My pull #40779 moved the Swift pointer tags to the second byte, so
SILOptimizer/concat_string_literals.64 will need to be updated for that;
it is disabled for now.
- Compiler-rt moved the directory in which it places those libraries for
Android, llvm/llvm-project@a68ccba, so lit.cfg is updated for that.
lit.py currently allows any substring of `target_triple` to be used as a
feature in REQUIRES/UNSUPPORTED/XFAIL. This results in various forms of
the OS spread across the tests and is also somewhat confusing since they
aren't actually listed in the available features.
Modify all OS-related features to use the `OS=` version that Swift adds
instead. We can later remove `config.target_triple` so that the non-OS
versions don't work in the first place.
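A toy model of the difference, for illustration only. This is not lit's
implementation: `matchesTripleSubstring` and `matchesDeclaredFeature`
are hypothetical helpers, and the triple and feature strings are example
values.

```cpp
#include <set>
#include <string>

// Substring behaviour: any non-empty piece of the triple acts as a feature.
bool matchesTripleSubstring(const std::string &triple,
                            const std::string &feature) {
  return !feature.empty() && triple.find(feature) != std::string::npos;
}

// Explicit behaviour: only features that were actually declared match.
bool matchesDeclaredFeature(const std::set<std::string> &declared,
                            const std::string &feature) {
  return declared.count(feature) != 0;
}
```

For the example triple `x86_64-unknown-linux-gnu`, the substring rule
accepts `linux`, `linux-gnu`, and even `x86`. With a declared set that
holds only `OS=linux-gnu`, just that one spelling matches.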
There was a regression in atos, but a new-enough Xcode version that
includes the fixed atos is now available on the Swift CI bots, so we can
re-enable the tests.
Radar-Id: rdar://problem/85471075
Co-authored-by: Julian Lettner <julian.lettner@apple.com>
We won't look for async functions that can't be used due to availability
unless availability checking is disabled. We need to disable
availability checking because the minimum deployment target is too low
for concurrency.
The concurrency runtime now deploys back to macOS 10.15, iOS 13.0, watchOS 6.0, tvOS 13.0, which corresponds to the 5.1 release of the stdlib.
Adjust macro usages accordingly.
The two TSan versions of the
`test/Concurrency/Runtime/async_let_fibonacci.swift` were disabled for
different reasons:
[1] Swift Concurrency work broke the test and it was never re-enabled.
[2] Regression in atos required test to be disabled. Re-enablement was
blocked on Swift CI upgrading to an Xcode that contains the fixed
version of atos.
While the TSan versions of the test were not running, they fell out of
sync with the original test and then started failing for different
(trivial) reasons once re-enabled.
Please help us keep these tests running by:
* Not landing work that makes them fail (and deferring the fix to a
later point), if at all possible. Breaking sanitizers should be a
"blocker".
* Keeping them in sync with the original tests.
Radar-Id: rdar://83162880
[1] rdar://76446550 (Re-enable test: Sanitizers/tsan/async_let_fibonacci.swift)
[2] rdar://80274830 ([Swift CI] Sanitizer report symbolication fails due to regression in atos)
Co-authored-by: Julian Lettner <julian.lettner@apple.com>
We updated the Swift CI nodes to a version of Xcode that includes the
fix for a regression in atos that broke sanitizer report symbolication.
Regression: rdar://79151503 (If atos is handed a dSYM, it should find the binary rather than erring)
Fix: rdar://80345994 (Regression: atos -p <pid> exits immediately)
Radar-Id: rdar://80274830
Co-authored-by: Julian Lettner <julian.lettner@apple.com>
After upgrading the OS and Xcode on the CI nodes sanitizer report
symbolication fails because we fail to start atos. This might be a
sandboxing issue.
Radar-Id: rdar://80274830
* Synchronize both versions of actor_counters.swift test
* Synchronize on Job address
Make sure to synchronize on Job address (AsyncTasks are Jobs, but not
all Jobs are AsyncTasks).
* Add fprintf debug output for TSan acquire/release
* Add tsan_release edge on task creation
Without this, we get false data race reports between the creation of a
task and its immediate scheduling on a different thread.
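On the "Synchronize on Job address" item: a sketch of why the key should
be derived the same way everywhere. The types below are hypothetical and
are not the runtime's definitions. The sketch assumes a layout in which
the Job subobject does not sit at the start of the task, so the task
pointer and the Job pointer are different addresses.

```cpp
#include <cstdint>

// Hypothetical layout, for illustration only.
struct ToyHeapObject {
  std::uint64_t metadata = 0;
  std::uint64_t refCount = 0;
};
struct ToyJob {
  std::uint32_t flags = 0;
};
struct ToyTask : ToyHeapObject, ToyJob {
  int payload = 0;
};

// Every call site derives the synchronization key from the Job
// subobject, so tasks and non-task jobs are keyed consistently.
inline const void *syncKey(const ToyJob *job) { return job; }
```

Passing a `ToyTask *` to `syncKey` converts it to the `ToyJob` subobject,
the same address an executor that only sees a `ToyJob *` would use.
Keying on the raw task pointer instead would pair a release on one
address with an acquire on another.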
False positive for `Sanitizers/tsan/actor_counters.swift` test:
```
WARNING: ThreadSanitizer: data race (pid=81452)
Read of size 8 at 0x7b2000000560 by thread T5:
#0 Counter.next() <null>:2 (a.out:x86_64+0x1000047f8)
#1 (1) suspend resume partial function for worker(identity:counters:numIterations:) <null>:2 (a.out:x86_64+0x100005961)
#2 swift::runJobInEstablishedExecutorContext(swift::Job*) <null>:2 (libswift_Concurrency.dylib:x86_64+0x280ef)
Previous write of size 8 at 0x7b2000000560 by main thread:
#0 Counter.init(maxCount:) <null>:2 (a.out:x86_64+0x1000046af)
#1 Counter.__allocating_init(maxCount:) <null>:2 (a.out:x86_64+0x100004619)
#2 runTest(numCounters:numWorkers:numIterations:) <null>:2 (a.out:x86_64+0x100006d2e)
#3 swift::runJobInEstablishedExecutorContext(swift::Job*) <null>:2 (libswift_Concurrency.dylib:x86_64+0x280ef)
#4 main <null>:2 (a.out:x86_64+0x10000a175)
```
New edge with this change:
```
[4357150208] allocate task 0x7b3800000000, parent = 0x0
[4357150208] creating task 0x7b3800000000 with parent 0x0
[4357150208] tsan_release on 0x7b3800000000 <<< new release edge
[139088221442048] tsan_acquire on 0x7b3800000000
[139088221442048] trying to switch from executor 0x0 to 0x7ff85e2d9a00
[139088221442048] switch failed, task 0x7b3800000000 enqueued on executor 0x7ff85e2d9a00
[139088221442048] enqueue job 0x7b3800000000 on executor 0x7ff85e2d9a00
[139088221442048] tsan_release on 0x7b3800000000
[139088221442048] tsan_release on 0x7b3800000000
[4357150208] tsan_acquire on 0x7b3800000000
counters: 1, workers: 1, iterations: 1
[4357150208] allocate task 0x7b3c00000000, parent = 0x0
[4357150208] creating task 0x7b3c00000000 with parent 0x0
[4357150208] tsan_release on 0x7b3c00000000 <<< new release edge
[139088221442048] tsan_acquire on 0x7b3c00000000
[4357150208] task 0x7b3800000000 waiting on task 0x7b3c00000000, going to sleep
[4357150208] tsan_release on 0x7b3800000000
[4357150208] tsan_release on 0x7b3800000000
[139088221442048] getting current executor 0x0
[139088221442048] tsan_release on 0x7b3c00000000
...
```
rdar://78932849
* Add static_cast<Job *>()
* Move TSan release edge to swift_task_enqueueGlobal()
Move the TSan release edge from `swift_task_create_commonImpl()` to
`swift_task_enqueueGlobalImpl()`. Task creation itself is not an event
that needs synchronization; what matters is that task creation "happens
before" execution of that task on another thread.
This edge is usually added when the task is scheduled via
`swift_task_enqueue()` (which then usually calls
`swift_task_enqueueGlobal()`). However, not all task scheduling goes
through the `swift_task_enqueue()` funnel as some places call the more
specific `swift_task_enqueueGlobal()` directly. So let's annotate this
function (duplicate edges aren't harmful) to ensure we cover all
schedule events, including newly-created tasks (our original problem
here).
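A small single-threaded model of the edges described above. The
`sketch` namespace, `ToyScheduler`, and its hooks are hypothetical; they
record calls instead of talking to TSan, and the queue stands in for the
global executor. It shows that a job enqueued directly still gets a
release before its acquire, and that the funnel path simply records two.

```cpp
#include <deque>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

struct Job {
  int id;
};
using Event = std::pair<std::string, const Job *>;

struct ToyScheduler {
  std::vector<Event> trace; // recorded hook calls, in order
  std::deque<Job *> queue;  // stand-in for the global executor

  void tsanRelease(const Job *job) { trace.emplace_back("release", job); }
  void tsanAcquire(const Job *job) { trace.emplace_back("acquire", job); }

  // The specific entry point: annotated so direct callers are covered.
  void enqueueGlobal(Job *job) {
    tsanRelease(job);
    queue.push_back(job);
  }

  // The general funnel: releases, then usually forwards to enqueueGlobal.
  void enqueue(Job *job) {
    tsanRelease(job);
    enqueueGlobal(job);
  }

  // Running a job acquires on its address before executing it.
  bool runNext() {
    if (queue.empty())
      return false;
    Job *job = queue.front();
    queue.pop_front();
    tsanAcquire(job);
    trace.emplace_back("run", job);
    return true;
  }
};

} // namespace sketch
```

The duplicate release on the funnel path is redundant but harmless in
this model, matching the note above that duplicate edges aren't harmful.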
rdar://78932849
Co-authored-by: Julian Lettner <julian.lettner@apple.com>