7762 Commits

Author SHA1 Message Date
Karoy Lorentey
1326c43b0f [stdlib][NFC] Update some outdated comments 2022-03-24 21:00:00 -07:00
Karoy Lorentey
0523b67e1f [stdlib] String.index(_:offsetBy:limitedBy:): compare limit against original index
Whether the limit actually applies depends on how it’s ordered
relative to the original index `i`, not the one we round down to the
nearest Character.
2022-03-24 21:00:00 -07:00
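
The documented rule behind this fix can be illustrated with ordinary, Character-aligned indices. This is a minimal sketch of the public `index(_:offsetBy:limitedBy:)` contract only; the sample string and variable names are assumptions, and it does not reproduce the unaligned-index case the commit addresses.

```swift
let s = "abcdef"
let i = s.index(s.startIndex, offsetBy: 2)      // position of "c"
let limit = s.index(s.startIndex, offsetBy: 4)  // position of "e"

// Moving forward, a limit at or after `i` applies.
let blocked = s.index(i, offsetBy: 3, limitedBy: limit)  // would pass the limit
let reached = s.index(i, offsetBy: 2, limitedBy: limit)  // lands exactly on the limit

// Moving forward, a limit before `i` has no effect.
let ignored = s.index(i, offsetBy: 3, limitedBy: s.startIndex)

print(blocked == nil, reached == limit, ignored != nil)
```

Because the outcome depends on which side of `i` the limit sits, comparing the limit against a different (rounded) index could flip the result, which appears to be the situation the commit guards against.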
Karoy Lorentey
c436654b61 [stdlib] Substring._characterStride(startingAt:): Limit stride to the correct bounds 2022-03-24 21:00:00 -07:00
Karoy Lorentey
6d400c81a2 [stdlib] Substring: remove _encodedOffsetRange in favor of existing _offsetRange 2022-03-24 21:00:00 -07:00
Karoy Lorentey
2464aa681e [stdlib] String: Ensure indices are marked scalar aligned before rounding down to Character 2022-03-24 21:00:00 -07:00
Karoy Lorentey
5a22ceb72b [stdlib] _StringGutsSlice: Small adjustments 2022-03-24 21:00:00 -07:00
Karoy Lorentey
a3435704f0 [stdlib][NFC] String normalization: fix terminology (index ⟹ offset) 2022-03-24 21:00:00 -07:00
Karoy Lorentey
321284e9a9 [stdlib] Review & fix index validation during String index conversions
- Validate that the index has the same encoding as the string
- Validate that the index is within bounds
2022-03-24 21:00:00 -07:00
Karoy Lorentey
0c0cbe290d [stdlib] _StringGutsSlice: Don’t mark methods on non-@usableFromInline internal types @inlinable
(This really ought to be diagnosed by the compiler.)
2022-03-24 21:00:00 -07:00
Karoy Lorentey
6245da2457 [stdlib] Substring: Be consistent about how we refer to the underlying string
Prefer direct stored properties to computed ones — there is no reason
to risk inlining issues, esp. since things like `Slice.base` aren’t
even force-inlined.

Prefer using `_wholeGuts` to spelling out the full incantation.
2022-03-24 21:00:00 -07:00
Karoy Lorentey
836bf9ad73 [stdlib] Mark index encodings in String.UTF8View & UTF16View 2022-03-24 21:00:00 -07:00
Karoy Lorentey
8ab2379946 [stdlib] Round indices down to nearest Character in String’s index algorithms
To prevent unaligned indices from breaking well-defined index distance
and index offset calculations, round every index down to the nearest
whole Character.

For the horrific details, see the forum discussion below.

https://forums.swift.org/t/string-index-unification-vs-bidirectionalcollection-requirements/55946

To avoid rounding from regressing String performance in the regular
case (when indices aren’t being passed across string views), introduce
a new String.Index flag bit that indicates that the index is already
Character aligned.
2022-03-24 21:00:00 -07:00
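
As a rough illustration of what "rounding down to the nearest Character" means, the sketch below does it with public API only. The helper name `roundedDownToCharacter` is hypothetical, and the linear scan is an assumption made for clarity; the stdlib uses internal machinery and the new flag bit to avoid repeating this work.

```swift
// Hypothetical helper built on public API; not the stdlib's implementation.
func roundedDownToCharacter(_ i: String.Index, in s: String) -> String.Index {
    if i >= s.endIndex { return s.endIndex }
    var result = s.startIndex
    for j in s.indices {          // Character-aligned indices, in order
        if j > i { break }
        result = j
    }
    return result
}

let s = "e\u{301}x"  // "e" + combining acute accent form one Character, then "x"
let midCharacter = s.unicodeScalars.index(after: s.startIndex)  // points at U+0301
let aligned = roundedDownToCharacter(midCharacter, in: s)
print(aligned == s.startIndex)
```

The index obtained from the unicode scalar view sits inside the first Character, so rounding it down yields the start of that Character.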
Karoy Lorentey
87073f2af8 [stdlib] Substring.replaceSubrange: fix startIndex/endIndex adjustment
This used to forward to `Slice.replaceSubrange`, but that’s a generic algorithm that isn’t aware of the peculiarities of Unicode extended grapheme clusters, and it can be misled by unusual cases, like a substring or subrange whose bounds aren’t `Character`-aligned, or a replacement string that starts with a continuation scalar.
2022-03-24 21:00:00 -07:00
Karoy Lorentey
a44997eeea [stdlib] Factor scalar-aligned String index validation out into a set of common routines
There are three flavors, corresponding to i < endIndex, i <= endIndex, and range containment checks.
Additionally, we have separate variants for index validation in substrings.
2022-03-24 21:00:00 -07:00
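
The three flavors can be sketched as generic bounds checks. The function names below are hypothetical and the checks cover bounds only; the actual stdlib routines are internal and also deal with scalar alignment, which this sketch leaves out.

```swift
// Hypothetical names; the stdlib's internal routines are not shown in this log.
func isValidForSubscript<C: Collection>(_ i: C.Index, in c: C) -> Bool {
    return c.startIndex <= i && i < c.endIndex       // i < endIndex
}

func isValidInclusive<C: Collection>(_ i: C.Index, in c: C) -> Bool {
    return c.startIndex <= i && i <= c.endIndex      // i <= endIndex
}

func isValidRange<C: Collection>(_ r: Range<C.Index>, in c: C) -> Bool {
    return c.startIndex <= r.lowerBound && r.upperBound <= c.endIndex
}

let s = "abc"
let sub = s.dropFirst()  // Substring "bc"; its bounds are narrower than those of `s`

print(isValidForSubscript(s.startIndex, in: s))    // inside the whole string
print(isValidForSubscript(s.startIndex, in: sub))  // outside the substring
```

Passing the substring itself as the collection is what makes the substring variant stricter: the same index can be valid in the base string and invalid in a slice of it.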
Karoy Lorentey
15c7721caf [stdlib] Use the new index encoding flags when marking the encoding of indices
This removes an unnecessary opaque call from the inlinable path, but it preserves a runtime version check.
2022-03-24 20:59:59 -07:00
Karoy Lorentey
6e18955f90 [stdlib] Add bookkeeping to keep track of the encoding of strings and indices
Assign some previously reserved bits in String.Index and _StringObject to keep track of their associated storage encoding (either UTF-8 or UTF-16).

None of these bits will be reliably set in processes that load binaries compiled with older stdlib releases, but when they do end up getting set, we can use them opportunistically to more reliably detect cases where an index is applied on a string with a mismatching encoding.

As more and more code gets recompiled with 5.7+, the stdlib will gradually become able to detect such issues with complete accuracy.

Code that misuses indices this way was always considered broken; however, String wasn’t able to reliably detect these runtime errors before. I therefore expect there is a large amount of broken code out there that keeps using indices from bridged Cocoa strings (UTF-16) after a mutation turns those strings into native UTF-8 strings. So, instead of trapping, this commit silently corrects the issue, transcoding the offsets into the correct encoding.

It would probably be a good idea to also emit a runtime warning in addition to recovering from the error. This would generate some noise that would gently nudge folks to fix their code.

rdar://89369680
2022-03-24 20:59:59 -07:00
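
To make "transcoding the offsets" concrete, here is a minimal sketch using public String views. The helper name `utf8Offset(forUTF16Offset:in:)` is hypothetical; the stdlib performs the correction internally on the index bits, which this example does not model.

```swift
// Hypothetical helper: reinterpret a UTF-16 offset as the matching UTF-8 offset.
func utf8Offset(forUTF16Offset n: Int, in s: String) -> Int {
    let position = s.utf16.index(s.utf16.startIndex, offsetBy: n)
    return s.utf8.distance(from: s.utf8.startIndex, to: position)
}

let s = "h\u{E9}llo"  // U+00E9 takes 1 UTF-16 code unit but 2 UTF-8 code units
print(utf8Offset(forUTF16Offset: 2, in: s))
```

The two offsets diverge as soon as the string contains a non-ASCII scalar, which is why an offset carried over from the wrong encoding can point at the wrong position.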
Karoy Lorentey
683b9fa021 [stdlib] Adjust/fix String’s indexing operations to deal with the consequences of SE-0180 2022-03-24 20:59:59 -07:00
Alastair Houghton
71efd95052 [Demangler][Runtime] Give the demangler its own error handling.
The demangling library can't use the error handling from the main runtime
because it isn't always linked with it.  However, it's useful to have
some error handling, and in particular to be able to get data into the
crash logs.

This is complicated because of the way the demangling library gets used,
the upshot of which is that I've had to add a second object library just
for libswiftCore's use, so that the demangler will use the runtime's
error handling functions when present, and fall back on its own when
they aren't.

rdar://89139049
2022-03-24 13:05:13 +00:00
swift_jenkins
cfa14e3634 Merge remote-tracking branch 'origin/main' into next 2022-03-23 14:21:51 -07:00
David Smith
cffe105d98 Merge pull request #41964 from Catfish-Man/single-instruction-multiple-ifdef
Only use SIMD when stdlib vector types are available
2022-03-23 14:05:13 -07:00
swift_jenkins
1dd02c09e8 Merge remote-tracking branch 'origin/main' into next 2022-03-22 19:00:53 -07:00
David Smith
398de941b6 Merge pull request #41897 from Catfish-Man/we-control-the-horizontal
Stay in vectors longer before doing a horizontal sum
2022-03-22 18:54:40 -07:00
David Smith
c05e47dd60 Only use SIMD when stdlib vector types are available 2022-03-22 15:48:24 -07:00
swift_jenkins
76f91dbe18 Merge remote-tracking branch 'origin/main' into next 2022-03-18 16:41:36 -07:00
David Smith
36b41d4940 Merge pull request #41869 from Catfish-Man/stack-promotion-and-ar-raise
Use withUnsafeTemporaryAllocation instead of a temporary Array
2022-03-18 16:27:31 -07:00
David Smith
dbaada435c Stay in vectors longer before doing a horizontal sum 2022-03-18 15:27:40 -07:00
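
The general idea of deferring the horizontal sum can be sketched as follows. This is an illustrative reduction over an array, with a hypothetical function name and an assumed lane width of four; it is not the UTF-16 offset code the commit changes.

```swift
// Accumulate four lanes at a time; reduce across lanes only once at the end.
func vectorizedSum(_ values: [UInt32]) -> UInt32 {
    var lanes = SIMD4<UInt32>(repeating: 0)
    var i = 0
    while i + 4 <= values.count {
        let v = SIMD4<UInt32>(values[i], values[i + 1], values[i + 2], values[i + 3])
        lanes = lanes &+ v              // stays in vector form
        i += 4
    }
    var total = lanes.wrappedSum()      // single horizontal sum
    while i < values.count {            // scalar tail for leftover elements
        total &+= values[i]
        i += 1
    }
    return total
}

print(vectorizedSum([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Summing across lanes inside the loop would do the comparatively expensive cross-lane step on every iteration; keeping a vector accumulator moves it out of the loop.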
swift_jenkins
1e556247b1 Merge remote-tracking branch 'origin/main' into next 2022-03-18 10:01:20 -07:00
David Smith
cb082e185c Merge pull request #41866 from Catfish-Man/what-a-crumb-y-optimization
Vectorize UTF16 offset calculations
2022-03-18 09:53:55 -07:00
David Smith
9ad3c9a5db Use withUnsafeTemporaryAllocation instead of a temporary Array 2022-03-17 17:08:05 -07:00
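
For readers unfamiliar with the API named here, a minimal sketch of `withUnsafeTemporaryAllocation(of:capacity:_:)` (available from Swift 5.6) follows. The function `sumOfSquares` is a made-up example and is unrelated to the code the commit touches.

```swift
func sumOfSquares(below n: Int) -> Int {
    // The scratch buffer is valid only inside the closure and may live on the stack,
    // which avoids the heap allocation a temporary Array would need.
    return withUnsafeTemporaryAllocation(of: Int.self, capacity: n) { scratch -> Int in
        scratch.initialize(repeating: 0)
        for i in 0..<n {
            scratch[i] = i * i
        }
        return scratch.reduce(0, +)
    }
}

print(sumOfSquares(below: 4))
```

`Int` is a trivial type, so the sketch skips deinitialization; a buffer of class references or other non-trivial elements would need to be deinitialized before the closure returns.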
David Smith
eaf3f316ec Vectorize UTF16 offset calculations 2022-03-17 14:18:21 -07:00
swift_jenkins
2f51506809 Merge remote-tracking branch 'origin/main' into next 2022-03-16 15:40:24 -07:00
Guillaume Lessard
b17b1a9d04 Merge pull request #41836 from glessard/sr15994
[stdlib] tolerate empty source buffers in `UMRBP.copyMemory`
2022-03-16 16:24:44 -06:00
Guillaume Lessard
e40e0f7580 [stdlib] improve UnsafeMutableRawBufferPointer.copyMemory
- The previous implementation trapped when the source buffer was empty. That behaviour is both undocumented and unnecessary. If the source buffer is empty, its length is necessarily shorter than or equal to the length of the destination.
- The updated version simply returns when the source buffer is empty.
2022-03-16 10:05:21 -06:00
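
A caller-side sketch of the same behaviour is shown below. The wrapper `copyIfNonEmpty` and the `demo` function are hypothetical; the guard simply mirrors the early return described above, so the call behaves the same whether or not the toolchain includes this fix.

```swift
// Hypothetical wrapper: skip the call entirely when there is nothing to copy.
func copyIfNonEmpty(from source: UnsafeRawBufferPointer,
                    to destination: UnsafeMutableRawBufferPointer) {
    guard source.count > 0 else { return }
    destination.copyMemory(from: source)
}

func demo() -> [UInt8] {
    let destination = UnsafeMutableRawBufferPointer.allocate(byteCount: 4, alignment: 1)
    defer { destination.deallocate() }
    destination.copyBytes(from: [1, 2, 3, 4] as [UInt8])

    // An empty source leaves the destination untouched.
    copyIfNonEmpty(from: UnsafeRawBufferPointer(start: nil, count: 0), to: destination)
    return Array(destination)
}

print(demo())
```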
swift_jenkins
89c623f6de Merge remote-tracking branch 'origin/main' into next 2022-03-15 10:00:42 -07:00
Konrad `ktoso` Malawski
4fa0855907 [Distributed] RemoteCallTarget now pretty prints and hides mangled names 2022-03-15 17:34:04 +09:00
swift_jenkins
180b575cd5 Merge remote-tracking branch 'origin/main' into next 2022-03-14 12:20:40 -07:00
Alejandro Alonso
27e6241a41 Merge pull request #41389 from Azoy/fix-indic-sequences
[stdlib] Fix backwards count of Indic graphemes
2022-03-14 12:08:21 -07:00
swift_jenkins
c2946e7d62 Merge remote-tracking branch 'origin/main' into next 2022-03-10 20:20:50 -08:00
Doug Gregor
ea89407304 Extend the lifetime of the buffer while copying from a _CocoaArrayWrapper
Fixes rdar://90108215.
2022-03-10 18:03:07 -08:00
swift_jenkins
b50c587501 Merge remote-tracking branch 'origin/main' into next 2022-03-08 08:41:06 -08:00
André Mello
af461ec5a4 Improve description of the pi constant (#41585) 2022-03-08 11:21:14 -05:00
swift_jenkins
0e494ea5bd Merge remote-tracking branch 'origin/main' into next 2022-03-07 22:20:58 -08:00
Nate Cook
544d10f2cb Fix error in MutableCollection.shuffle() docs 2022-03-07 22:17:58 -06:00
swift_jenkins
5c4bac9d1f Merge remote-tracking branch 'origin/main' into next 2022-03-07 15:20:55 -08:00
Robert Widmann
7cd3541c62 Merge pull request #41656 from MillerTechnologyPeru/feature/ppc32
[stdlib] Added PowerPC 32-bit support
2022-03-07 15:08:26 -08:00
swift_jenkins
6626d91625 Merge remote-tracking branch 'origin/main' into next 2022-03-07 12:01:13 -08:00
Saleem Abdulrasool
9be5586688 Merge pull request #41705 from compnerd/arm64-long
core: map UnsignedLong to UInt32
2022-03-07 11:47:19 -08:00
swift_jenkins
fe8398658f Merge remote-tracking branch 'origin/main' into next 2022-03-07 11:02:05 -08:00
Alex Martini
6491da8b62 Merge pull request #41339 from amartini51/BidirectionalCollection_88709422
Add missing word & slightly clarify.
2022-03-07 10:52:53 -08:00
Saleem Abdulrasool
20ade9b602 core: map UnsignedLong to UInt32
Windows ARM64 is an LLP64 platform, which means that `unsigned long` is a
32-bit value.  This was already mapped properly for x86_64, but somehow
had missed ARM64.  This repairs that, which is required for building the
standard library.
2022-03-07 08:11:42 -08:00
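
The effect of the mapping can be observed through the public `CUnsignedLong` typealias. This is a small sketch; the variable names are assumptions, and it checks only the size of the type on the platform it is compiled for.

```swift
let size = MemoryLayout<CUnsignedLong>.size

#if os(Windows)
// LLP64: `unsigned long` stays 32 bits even on 64-bit targets.
let expected = MemoryLayout<UInt32>.size
#else
// LP64 and 32-bit targets: `unsigned long` is pointer-sized.
let expected = MemoryLayout<UInt>.size
#endif

print(size == expected)
```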