80 Commits

Author SHA1 Message Date
Guillaume Lessard
dd833569e0 [stdlib] rename Span.extracting functions 2025-07-09 17:25:27 -07:00
Guillaume Lessard
791dd4cb0a [stdlib] name your constants better
- computed properties have an ABI impact; a function is fine here
2025-06-25 22:01:28 -07:00
Guillaume Lessard
4a13c916d7 [stdlib] name your constants 2025-06-24 17:19:48 -07:00
Guillaume Lessard
e8f0d52fe2 [gardening] labels for conditional compilation markers 2025-06-24 17:10:37 -07:00
Guillaume Lessard
3aad241019 [stdlib] address 32-bit watchOS ABI issue 2025-06-23 18:31:16 -07:00
Guillaume Lessard
9e8019a0fe [stdlib] make annotation adjustments 2025-06-06 08:21:48 -07:00
Guillaume Lessard
524b717944 [stdlib] work around autoclosure issues
- The move-only checker has issues with the existence of autoclosures.
- These `borrowing` accessors are within the purview of the move-only checker.
2025-04-14 10:48:42 -07:00
Guillaume Lessard
2d50c6cce9 Revert "[temporary] disable small-string support"
This reverts commit ea44ff9fc9.
2025-04-14 10:48:42 -07:00
Allan Shortlidge
14e0eed88f stdlib: Address unreachable code warnings. 2025-04-03 10:19:46 -07:00
Allan Shortlidge
d32310fb76 stdlib: Address new #StrictMemorySafety warnings. 2025-04-03 10:18:39 -07:00
Guillaume Lessard
f3a01663ac [stdlib] document limitations for Substring’s span 2025-04-02 17:53:13 -07:00
Guillaume Lessard
bde7a1871c [stdlib] harden bounds checking in Substring’s span 2025-04-02 17:53:12 -07:00
Guillaume Lessard
ea44ff9fc9 [temporary] disable small-string support 2025-04-02 17:53:12 -07:00
Guillaume Lessard
6059be4bca [stdlib] simplify spans of bridged substrings 2025-03-31 12:05:50 -07:00
Guillaume Lessard
85a9be6a1b [stdlib] spans over bridged Substring instances 2025-03-31 12:05:50 -07:00
Guillaume Lessard
a061425aae [stdlib] add storage property to Substring.UTF8View 2025-03-31 12:05:50 -07:00
Guillaume Lessard
dfb2e2f12e [stdlib] annotate uses of Range.init(_uncheckedBounds:) 2025-03-05 18:52:11 -08:00
Doug Gregor
22eecacc35 Adopt unsafe annotations throughout the standard library 2025-02-26 14:28:01 -08:00
Guillaume Lessard
8912098c01 [doc] adjust to reflect the current implementation. 2023-07-12 14:13:19 -07:00
Valeriy Van
95d25d3ba5 Minor comment typo fix 2022-11-15 10:40:43 +02:00
Karoy Lorentey
cb2194c024 [stdlib] Fix ABI and portability issues 2022-04-13 19:15:30 -07:00
Karoy Lorentey
b33fefb71c [stdlib] String: be more consistent about when markEncoding is called 2022-04-13 18:38:41 -07:00
Karoy Lorentey
67adcabefc Apply notes from code review 2022-04-10 00:14:43 -07:00
Karoy Lorentey
3c9968945e [stdlib] String: Implement happy paths for index validation 2022-04-10 00:14:43 -07:00
Karoy Lorentey
d24ae9dfcd [stdlib] Remove Substring._endIsCharacterAligned
Now that the cached character stride in indices always means the
stride in the full string, we can stop looking at whether a substring
has a character-aligned end index.
2022-04-09 21:33:53 -07:00
Karoy Lorentey
83df814c63 [stdlib] _StringObject.isKnownUTF16 → isForeignUTF8
This fixes a compatibility issue with potential future UTF-8 encoded
foreign String forms, as well as simplifying the code a bit — we no
longer need to do an availability check on inlinable fast paths.

The isForeignUTF8 bit is never set by any past or current stdlib
version, but it allows us to introduce UTF-8 encoded foreign forms
without breaking inlinable index encoding validation introduced in
Swift 5.7.
2022-04-09 21:33:53 -07:00
Karoy Lorentey
73312fedd4 [stdlib] Grapheme breaking: Refactor to simplify logic
- Split forward and backward direction into separate code paths.
  This makes the code more readable and paves the way for future
  improvements. (E.g., switching to a linear-time algorithm for
  breaking backwards.)
- `Substring.index(after:)` now uses the same grapheme breaking paths
  as `String.index(after:)`.
- The cached stride value in string indices is now well-defined even
  on indices that aren’t character-aligned.
2022-04-05 20:47:42 -07:00
Karoy Lorentey
e0bd5f7a79 [stdlib] Fix Substring.UnicodeScalarView.replaceSubrange 2022-04-05 20:47:42 -07:00
Karoy Lorentey
2e9fd9eb6b [stdlib] Substring.UnicodeScalarView: Add _invariantCheck 2022-04-05 20:47:42 -07:00
Karoy Lorentey
f7c674ed55 [stdlib] Slice._bounds, Range<String.Index>._encodedOffsetRange: New helpers 2022-04-05 20:47:42 -07:00
Karoy Lorentey
ff58d54565 [stdlib][NFC] Substring adjustments 2022-03-29 20:10:40 -07:00
Karoy Lorentey
c9adf7aaea [stdlib] Substring: Review view creation/conversion code 2022-03-29 20:10:40 -07:00
Karoy Lorentey
b7c54ac41c [stdlib] Substring.init: Stop checking things twice 2022-03-29 20:00:08 -07:00
Karoy Lorentey
9714f97ad8 [stdlib] Substring: round indices down to nearest character in indexing operations
Distances between indices aren’t well-defined without this.
2022-03-29 20:00:08 -07:00
Karoy Lorentey
4eab8355ca [stdlib] String: prefer passing ranges to start+end argument pairs 2022-03-29 20:00:08 -07:00
Karoy Lorentey
4ad8b26ab3 [stdlib] String.UTF16View: Review/fix index validation
Also, in UTF-16 slices, forward collection methods to the base view
instead of `Slice`, to make behavior a bit easier to understand.

(There is no need to force readers to page in `Slice`
implementations _in addition to_ whatever the base view is doing.)
2022-03-29 20:00:08 -07:00
Karoy Lorentey
5f6c300adb [stdlib] String.UTF8View: Review/fix index validation
Also, in UTF-8 slices, forward collection methods to the base view
instead of `Slice`, to make behavior a bit easier to understand.

(There is no need to force readers to page in `Slice`
implementations _in addition to_ whatever the base view is doing.)
2022-03-29 18:40:25 -07:00
Karoy Lorentey
d58811262d [stdlib] String.UnicodeScalarView: Review index validation 2022-03-29 18:40:25 -07:00
Karoy Lorentey
0523b67e1f [stdlib] String.index(_:offsetBy:limitedBy:): compare limit against original index
Whether the limit actually applies depends on how it’s ordered
relative to the original index `i`, not the one we round down to the
nearest Character.
2022-03-24 21:00:00 -07:00
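The ordering rule this commit relies on can be seen through the public API. The following is a minimal sketch, assuming a plain ASCII string so that every index is already Character-aligned; the variable names are illustrative only, and the unaligned-index case the commit actually fixes is not exercised here.

```swift
let letters = "abcdef"
let origin = letters.index(letters.startIndex, offsetBy: 3)   // points at "d"

// Moving forward with a limit that lies *before* the original index:
// the limit cannot be crossed, so it has no effect.
let ignoredLimit = letters.index(origin, offsetBy: 2, limitedBy: letters.startIndex)

// Moving forward with a limit that lies between the original index and
// the target: the limit applies and the result is nil.
let nearLimit = letters.index(after: origin)                  // points at "e"
let blocked = letters.index(origin, offsetBy: 2, limitedBy: nearLimit)
```

Because the decision is made by comparing the limit against the starting index, rounding that index down before the comparison could change which branch is taken. That is the situation the commit message describes.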
Karoy Lorentey
c436654b61 [stdlib] Substring._characterStride(startingAt:): Limit stride to the correct bounds 2022-03-24 21:00:00 -07:00
Karoy Lorentey
6d400c81a2 [stdlib] Substring: remove _encodedOffsetRange in favor of existing _offsetRange 2022-03-24 21:00:00 -07:00
Karoy Lorentey
2464aa681e [stdlib] String: Ensure indices are marked scalar aligned before rounding down to Character 2022-03-24 21:00:00 -07:00
Karoy Lorentey
321284e9a9 [stdlib] Review & fix index validation during String index conversions
- Validate that the index has the same encoding as the string
- Validate that the index is within bounds
2022-03-24 21:00:00 -07:00
Karoy Lorentey
6245da2457 [stdlib] Substring: Be consistent about how we refer to the underlying string
Prefer direct stored properties to computed ones — there is no reason
to risk inlining issues, esp. since things like `Slice.base` aren’t
even force-inlined.

Prefer using `_wholeGuts` to spelling out the full incantation.
2022-03-24 21:00:00 -07:00
Karoy Lorentey
8ab2379946 [stdlib] Round indices down to nearest Character in String’s index algorithms
To prevent unaligned indices from breaking well-defined index distance
and index offset calculations, round every index down to the nearest
whole Character.

For the horrific details, see the forum discussion below.

https://forums.swift.org/t/string-index-unification-vs-bidirectionalcollection-requirements/55946

To avoid rounding from regressing String performance in the regular
case (when indices aren’t being passed across string views), introduce
a new String.Index flag bit that indicates that the index is already
Character aligned.
2022-03-24 21:00:00 -07:00
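What "rounding down to the nearest whole Character" means can be sketched outside the stdlib. The helper below is hypothetical (`roundedDownToCharacter` is not a stdlib API) and uses a linear scan over Character boundaries; the commit itself describes a flag bit on String.Index that avoids this work in the common case.

```swift
/// Hypothetical helper (not part of the stdlib): returns the greatest
/// Character-aligned index in `s` that does not lie after `i`.
func roundedDownToCharacter(_ i: String.Index, in s: String) -> String.Index {
    if i >= s.endIndex { return s.endIndex }
    // `s.indices` visits Character boundaries in ascending order.
    return s.indices.last(where: { $0 <= i }) ?? s.startIndex
}

// "e" followed by a combining acute accent forms a single Character.
let accented = "e\u{301}a"

// An index taken from the unicode scalar view that falls in the middle
// of the first Character (it addresses the combining accent).
let unaligned = accented.unicodeScalars.index(after: accented.startIndex)

let rounded = roundedDownToCharacter(unaligned, in: accented)
```

Here `rounded` compares equal to `accented.startIndex`, so distances and offsets computed from it count whole Characters.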
Karoy Lorentey
87073f2af8 [stdlib] Substring.replaceSubrange: fix startIndex/endIndex adjustment
This used to forward to `Slice.replaceSubrange`, but that’s a generic algorithm that isn’t aware of the peculiarities of Unicode extended grapheme clusters, and it can be misled by unusual cases, like a substring or subrange whose bounds aren’t `Character`-aligned, or a replacement string that starts with a continuation scalar.
2022-03-24 21:00:00 -07:00
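For orientation, this is the public operation the commit touches, shown in the ordinary Character-aligned case. It is a sketch with illustrative variable names; the unusual cases listed in the message (unaligned bounds, replacements starting with a continuation scalar) are not covered.

```swift
let whole = "Hello, world"
var greeting = whole[..<whole.firstIndex(of: ",")!]   // Substring "Hello"

// Replace the first Character of the substring in place.
let firstCharacter = greeting.startIndex..<greeting.index(after: greeting.startIndex)
greeting.replaceSubrange(firstCharacter, with: "J")

// The substring's own bounds must be adjusted so that it still covers
// exactly the edited text.
let result = String(greeting)
```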
Karoy Lorentey
a44997eeea [stdlib] Factor scalar-aligned String index validation out into a set of common routines
There are three flavors, corresponding to i < endIndex, i <= endIndex, and range containment checks.
Additionally, we have separate variants for index validation in substrings.
2022-03-24 21:00:00 -07:00
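The three flavors can be sketched as bounds predicates. These helpers are hypothetical: the names are invented for illustration, they return `Bool` where the stdlib's internal routines would trap, and they check bounds only, not scalar alignment. Making them generic over `StringProtocol` stands in for the separate substring variants.

```swift
// Hypothetical helpers, not the stdlib's internal routines.

/// Flavor 1: `i` must address an element (i < endIndex).
func isValidElementIndex<S: StringProtocol>(_ i: S.Index, in s: S) -> Bool {
    return i >= s.startIndex && i < s.endIndex
}

/// Flavor 2: `i` may also be the end position (i <= endIndex).
func isValidPositionIndex<S: StringProtocol>(_ i: S.Index, in s: S) -> Bool {
    return i >= s.startIndex && i <= s.endIndex
}

/// Flavor 3: the whole range must be contained in the string's bounds.
func isValidRange<S: StringProtocol>(_ r: Range<S.Index>, in s: S) -> Bool {
    return r.lowerBound >= s.startIndex && r.upperBound <= s.endIndex
}

let word = "swift"
let tail = word.dropFirst(2)   // Substring "ift", sharing indices with `word`

let endIsElement = isValidElementIndex(word.endIndex, in: word)      // false
let endIsPosition = isValidPositionIndex(word.endIndex, in: word)    // true
let baseStartInTail = isValidElementIndex(word.startIndex, in: tail) // false
```

The last line shows why substrings need their own checks: an index that is valid in the base string can lie outside the substring's bounds.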
Karoy Lorentey
15c7721caf [stdlib] Use the new index encoding flags when marking the encoding of indices
This removes an unnecessary opaque call from the inlinable path, but it preserves a runtime version check.
2022-03-24 20:59:59 -07:00
Karoy Lorentey
6e18955f90 [stdlib] Add bookkeeping to keep track of the encoding of strings and indices
Assign some previously reserved bits in String.Index and _StringObject to keep track of their associated storage encoding (either UTF-8 or UTF-16).

None of these bits will be reliably set in processes that load binaries compiled with older stdlib releases, but when they do end up getting set, we can use them opportunistically to more reliably detect cases where an index is applied on a string with a mismatching encoding.

As more and more code gets recompiled with 5.7+, the stdlib will gradually become able to detect such issues with complete accuracy.

Code that misuses indices this way was always considered broken; however, String wasn’t able to reliably detect these runtime errors before. I therefore expect there is a large amount of broken code out there that keeps using bridged Cocoa String indices (UTF-16) after a mutation turns them into native UTF-8 strings. So instead of trapping, this commit silently corrects the issue, transcoding the offsets into the correct encoding.

It would probably be a good idea to also emit a runtime warning in addition to recovering from the error. This would generate some noise that would gently nudge folks to fix their code.

rdar://89369680
2022-03-24 20:59:59 -07:00
Karoy Lorentey
683b9fa021 [stdlib] Adjust/fix String’s indexing operations to deal with the consequences of SE-0180 2022-03-24 20:59:59 -07:00