Speculatively fixing this to rule out potential miscompiles.
The compiler needs to know if a reference is being materialized out of
thin air. The proper way to do that is with the Unmanaged API.
Under the hood, this forces the reference into an "unowned(unsafe)"
variable which the reference must be reloaded from. That tells the
compiler that it can't optimize some seemingly unrelated object which
the reference may happen to refer to at runtime.
/// Warning: Casting from an integer or a pointer type to a reference type
/// is undefined behavior. It may result in incorrect code in any future
/// compiler release. To convert a bit pattern to a reference type:
/// 1. convert the bit pattern to an UnsafeRawPointer.
/// 2. create an unmanaged reference using Unmanaged.fromOpaque()
/// 3. obtain a managed reference using Unmanaged.takeUnretainedValue()
/// The programmer must ensure that the resulting reference has already been
/// manually retained.
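As an illustration of the three steps above, here is a minimal sketch. The `Node` class and the `roundTrip` helper are hypothetical names used only for this example. The sketch assumes the object was retained manually (here via `Unmanaged.passRetained`) before its address was turned into a bit pattern.

```swift
final class Node {
    let value: Int
    init(value: Int) { self.value = value }
}

/// Hypothetical helper: turns a reference into a bit pattern and back.
func roundTrip(_ node: Node) -> Node {
    // Manually retain the object and capture its address as an integer.
    let bits = UInt(bitPattern: Unmanaged.passRetained(node).toOpaque())

    // 1. Convert the bit pattern to an UnsafeRawPointer.
    guard let pointer = UnsafeRawPointer(bitPattern: bits) else {
        fatalError("a retained object never has a null address")
    }
    // 2. Create an unmanaged reference.
    let unmanaged = Unmanaged<Node>.fromOpaque(pointer)
    // 3. Obtain a managed reference without consuming the manual retain.
    let recovered = unmanaged.takeUnretainedValue()

    // Balance the manual retain from passRetained above.
    unmanaged.release()
    return recovered
}

let original = Node(value: 42)
let recovered = roundTrip(original)
print(recovered === original, recovered.value)
```

The explicit `release()` is a choice made for this sketch so that the manual retain does not leak. Real code may keep the retain alive for as long as the bit pattern is stored.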
This is a wild guess at what might be causing our persistent, random
String failures on the main branch:
```
Swift(macosx-x86_64) :: Prototypes/CollectionTransformers.swift
Swift(macosx-x86_64) :: stdlib/NSSlowString.swift
Swift(macosx-x86_64) :: stdlib/NSStringAPI.swift
Swift(macosx-x86_64) :: stdlib/StringIndex.swift
Swift-validation(macosx-x86_64) :: stdlib/String.swift
Swift-validation(macosx-x86_64) :: stdlib/StringBreadcrumbs.swift
Swift-validation(macosx-x86_64) :: stdlib/StringUTF8.swift
```
FWIW, it appears this is *not* caused by https://github.com/apple/swift/pull/62717:
that change has also landed on release/5.8, and I haven’t seen these
issues on that branch.
Our atomic breadcrumbs initialization vs its non-atomic loading
gives me an uneasy feeling that this may in fact be a long-standing
synchronization issue that is only now causing problems (for whatever
reason). I am unable to reproduce these issues locally, so this guess
may be (and probably is) wildly off the mark, but this PR is likely
to be a good idea anyway, if only to rule out this possibility.
rdar://104751936
Since values of generic type are currently assumed to always
support copying, we need to prevent move-only types from
being substituted for generic type parameters.
This approach leans on a `_Copyable` marker protocol to which
all generic type parameters implicitly must conform.
A few other changes in this initial implementation:
- Now every concrete type that can conform to Copyable will do so. This fixes issues with conforming to a protocol that requires Copyable.
- Narrowly ban writing a concrete type `[T]` when `T` is move-only.
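The restriction on generic parameters can be pictured with the sketch below. It assumes the `~Copyable` spelling that later Swift releases (5.9 and newer) use for move-only types, which is not the spelling this change introduces. `Token`, `identity`, `tokenID`, and `demo` are hypothetical names.

```swift
// Assumption: Swift 5.9+ syntax for a move-only (noncopyable) type.
struct Token: ~Copyable {
    let id: Int
}

// Generic parameters implicitly require copyability.
func identity<T>(_ value: T) -> T { value }

// A non-generic function can still borrow a move-only value.
func tokenID(_ token: borrowing Token) -> Int { token.id }

func demo() -> (Int, Int) {
    let token = Token(id: 7)
    let id = tokenID(token)
    // identity(token)   // rejected: Token cannot be substituted for T
    return (id, identity(7))
}

print(demo())
```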
It does not compile in this mode.
```
error: no exact matches in call to instance method 'appendInterpolation'
owner: \(repr._objectIdentifier!), \
```
Key paths can store an offset or a pointer in the same field. On 32-bit, the field is considered to be an offset when it's less than the 4kB zero page, and a pointer otherwise.
The check uses a signed comparison, so pointers in the top half of memory would look like negative offsets. Add a check that the offset is zero or positive to avoid this.
rdar://103886537
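The signed-comparison problem in the key path field check can be sketched as follows. This is not the runtime's actual code: the function names and the `zeroPageSize` constant are assumptions made for illustration, and `Int32` stands in for the 32-bit field.

```swift
// Assumed threshold: the 4 kB zero page mentioned above.
let zeroPageSize: Int32 = 4096

// Shape of the check before the fix: a signed "less than" only.
func looksLikeOffsetSignedOnly(_ field: Int32) -> Bool {
    field < zeroPageSize
}

// Shape of the check after the fix: also require zero or positive.
func looksLikeOffset(_ field: Int32) -> Bool {
    field >= 0 && field < zeroPageSize
}

// A 32-bit pointer in the top half of the address space is negative
// when reinterpreted as a signed value.
let highPointer = Int32(bitPattern: 0xC000_1000)

print(looksLikeOffsetSignedOnly(highPointer)) // true: misread as an offset
print(looksLikeOffset(highPointer))           // false: treated as a pointer
print(looksLikeOffset(16))                    // true: a genuine small offset
```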
Moves a `//` comment up above a `///` documentation comment, since the latter needs to be attached directly to the declaration or it won't be picked up as documentation.
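A small sketch of the resulting ordering, using a hypothetical `square` function: the plain `//` note sits above the `///` documentation comment, which in turn sits directly on the declaration.

```swift
// Implementation note: kept above the documentation comment.
/// Returns `x` multiplied by itself.
func square(_ x: Int) -> Int { x * x }

print(square(4)) // 16
```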
- Reduce the emphasis on the type theory that Never is an uninhabited
type, focusing more on its meaning and usage in code.
- Move the definition of uninhabited out of the abstract. Define
"nonreturning" more explicitly.
- Expand the favoriteNumber example's code comment into a brief
paragraph to walk through the code listing.
- Avoid italics in the abstract, future tense, and parenthetical asides.
- Use contractions.
`String.debugDescription` currently fails to protect the contents of
the string from combining with the opening or closing `"` characters
or one of the characters of a quoted scalar:
```swift
let s = "\u{301}A\n\u{302}B\u{70F}"
print(s.debugDescription)
// ⟹ “́A\n̂B” (characters: “́, A, \, n̂, B, ”)
```
This can make debug output difficult to read, as string contents are
allowed to spread over and pollute neighboring meta-characters.
This change fixes this by force-quoting the problematic scalars in
these cases:
```swift
let s = "\u{301}A\n\u{302}B\u{70F}"
print(s.debugDescription)
// ⟹ "\u{301}A\n\u{302}B\u{70F}"
```
Of course, Unicode scalars that don’t engage in such behavior are
still allowed to pass through unchanged:
```swift
let s = "Cafe\u{301}"
print(s.debugDescription)
// ⟹ "Café"
```
Rule GB11 in Unicode Annex 29 is:
GB11: Extended_Pictographic Extend* ZWJ × Extended_Pictographic
However, our forward grapheme breaking state machine implements it as:
GB11: Extended_Pictographic Extend* ZWJ+ × Extended_Pictographic
We implement the correct rule when going backward, so a String value can report a different character count depending on whether it is traversed forward or backward.
The rule as implemented would be fine (Unicode doesn’t care much about the placement of grapheme breaks in invalid sequences), but the directional inconsistency messes with String’s Collection conformance.
rdar://104279671
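The directional consistency that Collection requires can be checked with a sketch like the one below. The helper names are hypothetical. The test string places two ZWJ scalars between two Extended_Pictographic scalars, which is the case where the `ZWJ` and `ZWJ+` readings of GB11 disagree.

```swift
// Extended_Pictographic, ZWJ, ZWJ, Extended_Pictographic.
let s = "\u{1F468}\u{200D}\u{200D}\u{1F469}"

// Count characters by walking forward from startIndex.
func forwardCount(_ s: String) -> Int {
    var n = 0
    var i = s.startIndex
    while i != s.endIndex {
        i = s.index(after: i)
        n += 1
    }
    return n
}

// Count characters by walking backward from endIndex.
func backwardCount(_ s: String) -> Int {
    var n = 0
    var i = s.endIndex
    while i != s.startIndex {
        i = s.index(before: i)
        n += 1
    }
    return n
}

// With a consistent implementation both directions agree.
print(forwardCount(s) == backwardCount(s))
```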
Swift tends to emit unnecessary checks and traps when iterating unsafe
raw buffer pointers. These traps are confirming that the position
pointer isn't nil, but this check is redundant with the bounds check
that is already present. We can safely remove it.
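For context, the kind of loop affected looks like the sketch below. `byteSum` and `payload` are hypothetical names, and the sketch only shows the iteration pattern. It does not demonstrate the removed trap itself, which is visible only in the generated code.

```swift
// Iterating an UnsafeRawBufferPointer yields one UInt8 per byte.
func byteSum(_ bytes: UnsafeRawBufferPointer) -> UInt64 {
    var total: UInt64 = 0
    for byte in bytes {
        total &+= UInt64(byte)
    }
    return total
}

let payload: [UInt8] = [1, 2, 3, 4]
let sum = payload.withUnsafeBytes { byteSum($0) }
print(sum) // 10
```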
Equatability allows faster implementations for updating cached grapheme boundary state after a text mutation, because it enables quick detection of before/after state equality, without having to feed the recognizers until they produce a synchronized grapheme break.
The CustomStringConvertible conformance makes it orders of magnitude more pleasant to debug code that uses this.
Sendable is a baseline requirement for value types these days.
* Remove prefix+ from StaticBigInt
This operator causes source breakage in cases like:
```
let a: Int = 7
let b = +1
let c = a + b // Error: Cannot convert `b` from `StaticBigInt` to `Int`
```
[Bidirectional]Collection’s default index manipulation methods (as
well as _utf16Distance) do not expect to be given unreachable
indices, and they tend to fail when operating on them. Round indices
down to the nearest scalar boundary before calling these.
Align the grammar of macro declarations with SE-0382, so that macro
definitions are parsed as an expression. External macro definitions
are expressed as a use of the macro `#externalMacro`. Define
that macro in the standard library, and recognize uses of it as the
definition of a macro whose implementation is provided externally. For
example, this means that the "stringify" macro used in a lot of
examples is now defined as something like this:
    @expression macro stringify<T>(_ value: T) -> (T, String) =
        #externalMacro(module: "MyMacros", type: "StringifyMacro")
We still parse the old "A.B" syntax for two reasons. First, it's
helpful to anyone who has existing code using the prior syntax, so they
get a warning + Fix-It to rewrite to the new syntax. Second, we use it
to define builtin macros like `externalMacro` itself, which looks like this:
    @expression
    public macro externalMacro<T>(module: String, type: String) -> T =
        Builtin.ExternalMacro
This uses the same virtual `Builtin` module as other library builtins,
and we can expand it to handle other builtin macro implementations
(such as #line) over time.
These simply expose the preexisting internal
`_StringGuts.validate*Index` functions that indexing operations
use to implicitly round indices down to the nearest valid index. (Or, in the case of the encoding views, the nearest scalar boundary.)
Being able to do this as a standalone, explicit, efficient operation
is crucial when implementing some `String` algorithms that need to
work with arbitrary indices.
`Unicode._CharacterRecognizer` is a newly exported opaque type that
exposes the stdlib’s extended grapheme cluster breaking facility,
independent of `String`.
This essentially makes the underlying simple state machine public,
without exposing any of the (unstable) Unicode details.
The ability to perform grapheme breaking over, say, the scalars stored
in multiple `String` values can be extremely useful while building
custom text processing algorithms and data structures.
Ideally this would eventually become API, but before proposing this
to Swift Evolution, I’d like to prove the shape of the type in actual
use (and we’ll also need to find better names for its operations).
This turns _GraphemeBreakingState into a more proper state
machine, although it is only able to recognize breaks in the
forward direction.
The backward direction requires arbitrarily long lookback,
and it currently remains in _StringGuts.