swift-mirror

mirror of https://github.com/apple/swift.git synced 2025-12-14 20:36:38 +01:00

Author	SHA1	Message	Date
Karoy Lorentey	29cf262ef7	[stdlib] String.Index: conform to CustomDebugStringConvertible instead Apply the LSG’s modifications as detailed in their review notes.	2024-10-07 17:00:13 -07:00
Karoy Lorentey	28ed16b469	[stdlib] String.Index.description: Explicitly note that the returned format is not stable This tends to be generally true for most `description` properties, but explicitly calling it out should help prevent people from trying to parse the result in this case.	2024-10-07 16:02:18 -07:00
Karoy Lorentey	e39613b851	[stdlib] String.Index.description: Use @_aeic for now	2024-10-07 16:02:18 -07:00
Karoy Lorentey	01792372a9	[stdlib] String.Index: conform to CustomStringConvertible This better exposes the internals of string indices, demystifying their operation and radically simplifying working with them.	2024-10-07 16:02:16 -07:00
Karoy Lorentey	704b565d40	[stdlib] String.Index: Add _description & _debugDescription While we’re vacillating on how best to add CustomStringConvertible and CustomDebugStringConvertible conformances to String.Index, add the prospective implementations as underscored-but-public members, to make life a little more bearable for people who need to debug string index operations.	2022-10-11 14:37:20 -07:00
Karoy Lorentey	2574d78d40	Merge pull request #42442 from lorentey/better-index-conversions	2022-04-19 20:22:06 -07:00
Josh Soref	644c18ca9b	Spelling stdlib (#42444) * spelling: against * spelling: algorithmic * spelling: alignment * spelling: anything * spelling: architectural * spelling: architecture * spelling: are * spelling: artificial * spelling: aside * spelling: available * spelling: being * spelling: bidirectional * spelling: characters * spelling: circular * spelling: compatibility * spelling: compiled * spelling: correctly * spelling: covers * spelling: declaration * spelling: dependencies * spelling: descriptor * spelling: dictionaries * spelling: dynamic * spelling: greater * spelling: hierarchy * spelling: immortal * spelling: initialize * spelling: initializes * spelling: iterable * spelling: message * spelling: minimum * spelling: multiple * spelling: originally * spelling: simplified * spelling: sophisticated * spelling: trivia * spelling: wasn't Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> Co-authored-by: Josh Soref <jsoref@users.noreply.github.com>	2022-04-19 14:02:43 -07:00
Josh Soref	a0d2cabda6	Spelling stdlib/public/core (#42441) * spelling: available * spelling: components * spelling: conjunction * spelling: conversion * spelling: enforce * spelling: guarantee * spelling: interchangeable * spelling: satisfied * spelling: superfluous Signed-off-by: Josh Soref <jsoref@users.noreply.github.com> Co-authored-by: Josh Soref <jsoref@users.noreply.github.com>	2022-04-19 14:02:24 -07:00
Karoy Lorentey	847337efd7	[stdlib][cosmetics] Clean up unused/underused interfaces, update naming There is little point to having `isUTF16` properties when they simply return `!isUTF8`; remove them. Rename `String.Index._copyEncoding(from:)` to `_copyingEncoding(from:)`.	2022-04-18 21:06:20 -07:00
Karoy Lorentey	4d557b0b45	[stdlib] Make String.Index(_:within:) initializers more permissive In Swift 5.6 and below, (broken) code that acquired indices from a UTF-16-encoded string bridged from Cocoa and kept using them after a `makeContiguousUTF8` call (or other mutation) may have appeared to be working correctly as long as the string was ASCII. Since https://github.com/apple/swift/pull/41417, the `String.Index(_:within:)` initializers recognize miscoded indices and reject them by returning nil. This is technically correct, but it unfortunately may be a binary compatibility issue, as these used to return non-nil in previous versions. Mitigate this issue by accepting UTF-16 indices on a UTF-8 string, transcoding their offset as needed. (Attempting to use a UTF-8 index on a UTF-16 string is still rejected; we do not implicitly convert strings in that direction.) rdar://89369680	2022-04-18 21:02:14 -07:00
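The failable behavior of `String.Index(_:within:)` that this commit adjusts can be seen with public API alone. The following is a minimal sketch on a native string; it does not reproduce the bridged Cocoa (UTF-16 storage) scenario described in the commit message, which would need Foundation on an Apple platform. The variable names are illustrative.

```swift
let cafe = "Cafe\u{301}!"   // "Café!" spelled with a combining acute accent

// A UTF-16 view index that points at the combining accent (UTF-16 offset 4).
// It is valid in the UTF-16 view but sits in the middle of the Character "é".
let accent = cafe.utf16.index(cafe.utf16.startIndex, offsetBy: 4)
let rejected = String.Index(accent, within: cafe)

// A UTF-16 view index that points at "!" (UTF-16 offset 5), a Character boundary.
let bang = cafe.utf16.index(cafe.utf16.startIndex, offsetBy: 5)
let accepted = String.Index(bang, within: cafe)

print(rejected == nil)   // the initializer fails off a Character boundary
if let i = accepted {
    print(cafe[i])       // the Character at the accepted position
}
```

The initializer returns nil when the position is not a valid Character boundary in the target string, which is why callers are expected to handle the optional result rather than force-unwrap it.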
Karoy Lorentey	dcfc26cbc5	[stdlib][NFC] Doc adjustments	2022-04-10 16:57:44 -07:00
Karoy Lorentey	d3df05cb30	[stdlib] String.Index: Remove custom printing	2022-04-10 16:51:51 -07:00
Karoy Lorentey	3c9968945e	[stdlib] String: Implement happy paths for index validation	2022-04-10 00:14:43 -07:00
Karoy Lorentey	d18b5f573f	[stdlib] Branchless _StringGuts.hasMatchingEncoding	2022-04-09 21:33:53 -07:00
Karoy Lorentey	eadef7a204	[stdlib] String.Index: Use symbolic names rather than magic constants	2022-04-09 21:33:53 -07:00
Karoy Lorentey	83df814c63	[stdlib] _StringObject.isKnownUTF16 → isForeignUTF8 This fixes a compatibility issue with potential future UTF-8 encoded foreign String forms, as well as simplifying the code a bit — we no longer need to do an availability check on inlinable fast paths. The isForeignUTF8 bit is never set by any past or current stdlib version, but it allows us to introduce UTF-8 encoded foreign forms without breaking inlinable index encoding validation introduced in Swift 5.7.	2022-04-09 21:33:53 -07:00
Karoy Lorentey	73312fedd4	[stdlib] Grapheme breaking: Refactor to simplify logic - Split forward and backward direction into separate code paths. This makes the code more readable and paves the way for future improvements. (E.g., switching to a linear-time algorithm for breaking backwards.) - `Substring.index(after:)` now uses the same grapheme breaking paths as `String.index(after:)`. - The cached stride value in string indices is now well-defined even on indices that aren’t character-aligned.	2022-04-05 20:47:42 -07:00
Karoy Lorentey	dc6990370e	[stdlib] StringGuts.scalarAlign: Preserve encoding flags in returned index	2022-03-29 20:00:08 -07:00
Karoy Lorentey	98d5959478	[stdlib] String.Index: Adjust printing	2022-03-24 21:00:00 -07:00
Karoy Lorentey	8ab2379946	[stdlib] Round indices down to nearest Character in String’s index algorithms To prevent unaligned indices from breaking well-defined index distance and index offset calculations, round every index down to the nearest whole Character. For the horrific details, see the forum discussion below. https://forums.swift.org/t/string-index-unification-vs-bidirectionalcollection-requirements/55946 To avoid rounding from regressing String performance in the regular case (when indices aren’t being passed across string views), introduce a new String.Index flag bit that indicates that the index is already Character aligned.	2022-03-24 21:00:00 -07:00
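To illustrate what "rounding down to the nearest whole Character" means, here is a small sketch that performs the rounding by hand using only public API. The helper `roundedDownToCharacter(_:in:)` is hypothetical and linear-time; it is not the standard library's implementation, which relies on internal machinery and the flag bit mentioned above.

```swift
/// Hypothetical helper: returns the last Character boundary in `s`
/// that is not after `i`. Runs in linear time; for illustration only.
func roundedDownToCharacter(_ i: String.Index, in s: String) -> String.Index {
    if i >= s.endIndex { return s.endIndex }
    // `s.indices` enumerates exactly the Character boundaries before endIndex.
    return s.indices.last(where: { $0 <= i }) ?? s.startIndex
}

let s = "ae\u{301}b"   // Characters: "a", "é" (e + combining accent), "b"

// Index of the combining accent: fine in the scalar view, mid-Character in `s`.
let accent = s.unicodeScalars.index(s.unicodeScalars.startIndex, offsetBy: 2)

let rounded = roundedDownToCharacter(accent, in: s)
print(s[rounded] == "e\u{301}")                      // true
print(s.distance(from: s.startIndex, to: rounded))   // 1
```

Working with the rounded index keeps distance and offset calculations consistent, because both endpoints are then genuine Character boundaries.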
Karoy Lorentey	6e18955f90	[stdlib] Add bookkeeping to keep track of the encoding of strings and indices Assign some previously reserved bits in String.Index and _StringObject to keep track of their associated storage encoding (either UTF-8 or UTF-16). None of these bits will be reliably set in processes that load binaries compiled with older stdlib releases, but when they do end up getting set, we can use them opportunistically to more reliably detect cases where an index is applied on a string with a mismatching encoding. As more and more code gets recompiled with 5.7+, the stdlib will gradually become able to detect such issues with complete accuracy. Code that misuses indices this way was always considered broken; however, String wasn’t able to reliably detect these runtime errors before. Therefore, I expect there is a large amount of broken code out there that keeps using bridged Cocoa String indices (UTF-16) after a mutation turns them into native UTF-8 strings. Therefore, instead of trapping, this commit silently corrects the issue, transcoding the offsets into the correct encoding. It would probably be a good idea to also emit a runtime warning in addition to recovering from the error. This would generate some noise that would gently nudge folks to fix their code. rdar://89369680	2022-03-24 20:59:59 -07:00
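The "transcoding the offsets" idea can be sketched with public API: given a UTF-16 code unit offset, find the matching UTF-8 code unit offset in the same string. The helper `utf8Offset(forUTF16Offset:in:)` is hypothetical, and the sketch assumes the offset does not fall between the two halves of a surrogate pair. The standard library does this internally on index bits; this only shows the arithmetic it has to arrive at.

```swift
/// Hypothetical helper: converts a UTF-16 code unit offset into the
/// corresponding UTF-8 code unit offset within `s`.
/// Returns nil if the offset is negative or past the end of the string.
func utf8Offset(forUTF16Offset utf16Offset: Int, in s: String) -> Int? {
    guard utf16Offset >= 0,
          let i = s.utf16.index(s.utf16.startIndex,
                                offsetBy: utf16Offset,
                                limitedBy: s.utf16.endIndex)
    else { return nil }
    return s.utf8.distance(from: s.utf8.startIndex, to: i)
}

let s = "h\u{E9}llo"   // "héllo": "é" is 1 UTF-16 code unit but 2 UTF-8 code units

print(utf8Offset(forUTF16Offset: 1, in: s) as Any)   // Optional(1), start of "é"
print(utf8Offset(forUTF16Offset: 2, in: s) as Any)   // Optional(3), first "l"
print(utf8Offset(forUTF16Offset: 9, in: s) as Any)   // nil, out of range
```

For ASCII-only strings the two offsets coincide, which is consistent with the commit's remark that broken code could appear to work as long as the string was ASCII.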
Karoy Lorentey	683b9fa021	[stdlib] Adjust/fix String’s indexing operations to deal with the consequences of SE-0180	2022-03-24 20:59:59 -07:00
Doug Gregor	9579390024	[SE-0304] Rename ConcurrentValue to Sendable	2021-03-18 22:48:20 -07:00
Doug Gregor	1a1f79c0de	Introduce safety checking for ConcurrentValue conformance. Introduce checking of ConcurrentValue conformances: - For structs, check that each stored property conforms to ConcurrentValue - For enums, check that each associated value conforms to ConcurrentValue - For classes, check that each stored property is immutable and conforms to ConcurrentValue Because all of the stored properties / associated values need to be visible for this check to work, limit ConcurrentValue conformances to be in the same source file as the type definition. This checking can be disabled by conforming to a new marker protocol, UnsafeConcurrentValue, that refines ConcurrentValue. UnsafeConcurrentValue otherwise has no specific meaning. This allows both "I know what I'm doing" for types that manage concurrent access themselves as well as enabling retroactive conformance, both of which are fundamentally unsafe but also quite necessary. The bulk of this change ended up being to the standard library, because all conformances of standard library types to the ConcurrentValue protocol needed to be sunk down into the standard library so they would benefit from the checking above. There were numerous little mistakes in the initial pass through the standard library types that have now been corrected.	2021-02-04 03:45:09 -08:00
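The three checks listed in this commit can be sketched in present-day Swift, where the protocol is named `Sendable` (per the later rename in this log). The type names below are illustrative. Using `@unchecked Sendable` as the modern counterpart of the `UnsafeConcurrentValue` opt-out is an assumption on my part; the log itself does not say how that opt-out evolved.

```swift
import Foundation  // for NSLock

// Struct: every stored property must itself be Sendable.
struct Point: Sendable {
    var x: Int
    var y: Int
}

// Enum: every associated value must be Sendable.
enum Shape: Sendable {
    case circle(center: Point, radius: Double)
    case polygon([Point])
}

// Class: stored properties must be immutable and Sendable
// (current compilers also expect the class to be final).
final class Label: Sendable {
    let text: String
    init(text: String) { self.text = text }
}

// Opt-out (assumed modern spelling): the compiler skips the checks and
// the type takes responsibility for synchronizing its own mutable state.
final class Counter: @unchecked Sendable {
    private let lock = NSLock()
    private var value = 0

    func increment() -> Int {
        lock.lock()
        defer { lock.unlock() }
        value += 1
        return value
    }
}
```

Changing `let text` to `var text` in `Label` should make the compiler reject the conformance, which is the kind of mistake the commit says it caught in the standard library.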
Michael Ilseman	e01a294da6	[stdlib] Introduce _invariantCheck_5_1 for 5.1 and later assertions. Inlinable and non-inlinable code can cause 5.1 code to intermix with 5.0 code on older OSes. Some (weak) invariants for 5.1 should only be checked when the OS's code is 5.1 or later, which is the purpose of _invariantCheck_5_1. Applied to String.Index._isScalarAligned, which is a new bit introduced in 5.1 from one of the reserved bits from 5.0. The bit is set when the index is proven to be scalar aligned, and we want to assert on this liberally in contexts where we expect it to be so. However, older OSes might not set this bit when doing scalar aligning, depending on exactly what got inlined where/when.	2019-07-12 15:58:27 -07:00
Michael Ilseman	63a6794cf9	[String] Switch scalar-aligned bit to a reserved bit. Since scalar-alignment is set in inlinable code, switch the alignment bit to one of the previously-reserved bits rather than a grapheme cache bit. Setting a grapheme cache bit in inlinable would break backward deployment, as older versions would interpret it as a cached value. Also adjust the name to "scalar-aligned", which is clearer, and removed assertion (which should be a real precondition).	2019-07-02 16:25:04 -07:00
Michael Ilseman	bd5a40ff1b	[gardening] Add underscore to internal member	2019-06-27 11:11:44 -07:00
Michael Ilseman	4cd1e812b7	[String] Scalar-alignment bug fixes. Fixes a general category (pun intended) of scalar-alignment bugs surrounding exchanging non-scalar-aligned indices between views and for slicing. SE-0180 unifies the Index type of String and all its views and allows non-scalar-aligned indices to be used across views. In order to guarantee behavior, we often have to check and perform scalar alignment. To speed up these checks, we allocate a bit denoting known-to-be-aligned, so that the alignment check can skip the load. The below shows what views need to check for alignment before they can operate, and whether the indices they produce are aligned.

┌───────────────╥────────────────────┬──────────────────────────┐
│ View          ║ Requires Alignment │ Produces Aligned Indices │
╞═══════════════╬════════════════════╪══════════════════════════╡
│ Native UTF8   ║ no                 │ no                       │
├───────────────╫────────────────────┼──────────────────────────┤
│ Native UTF16  ║ yes                │ no                       │
╞═══════════════╬════════════════════╪══════════════════════════╡
│ Foreign UTF8  ║ yes                │ no                       │
├───────────────╫────────────────────┼──────────────────────────┤
│ Foreign UTF16 ║ no                 │ no                       │
╞═══════════════╬════════════════════╪══════════════════════════╡
│ UnicodeScalar ║ yes                │ yes                      │
├───────────────╫────────────────────┼──────────────────────────┤
│ Character     ║ yes                │ yes                      │
└───────────────╨────────────────────┴──────────────────────────┘

The "requires alignment" applies to any operation taking a String.Index that's not defined entirely in terms of other operations taking a String.Index. These include:
* index(after:)
* index(before:)
* subscript
* distance(from:to:) (since `to` is compared against directly)
* UTF16View._nativeGetOffset(for:)	2019-06-26 16:42:58 -07:00
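A non-scalar-aligned index is easy to produce from the UTF-8 view, which (per the table above) does not promise aligned indices. The sketch below uses the public `samePosition(in:)` method to test whether a UTF-8 view index falls on a Unicode scalar boundary. It only demonstrates the concept; it does not show the internal alignment bit.

```swift
let s = "\u{E9}!"   // "é!" with precomposed é; UTF-8 bytes: C3 A9 21

// The second UTF-8 code unit (A9) lies in the middle of the scalar U+00E9.
let mid = s.utf8.index(after: s.utf8.startIndex)
// The third UTF-8 code unit (21) is the start of the scalar "!".
let bang = s.utf8.index(after: mid)

// samePosition(in:) returns nil when the index is not on a scalar boundary.
print(mid.samePosition(in: s.unicodeScalars) == nil)   // true

if let aligned = bang.samePosition(in: s.unicodeScalars) {
    print(s.unicodeScalars[aligned])                   // the scalar "!"
}
```

Views that "require alignment" must handle indices like `mid` gracefully, which is the class of bugs this commit addresses.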
Ben Cohen	e9d4687e31	De-underscore @frozen, apply it to structs (#24185 ) * De-underscore @frozen for enums * Add @frozen for structs, deprecate @_fixed_layout for them * Switch usage from _fixed_layout to frozen	2019-05-30 17:55:37 -07:00
Michael Ilseman	f7cdda2720	[gardening] Clean up many String computed vars	2019-04-08 15:16:48 -07:00
Michael Ilseman	415cc8fb0c	[String.Index] Deprecate encodedOffset var/init String.Index has an encodedOffset-based initializer and computed property that exists for serialization purposes. It was documented as UTF-16 in the SE proposal introducing it, which was String's underlying encoding at the time, but the dream of String even then was to abstract away whatever encoding happened to be used. Serialization needs an explicit encoding for serialized indices to make sense: the offsets need to align with the view. With String utilizing UTF-8 encoding for native contents in Swift 5, serialization isn't necessarily the most efficient in UTF-16. Furthermore, the majority of usage of encodedOffset in the wild is buggy and operates under the assumption that a UTF-16 code unit was a Swift Character, which isn't even valid if the String is known to be all-ASCII (because CR-LF). This change introduces a pair of semantics-preserving alternatives to encodedOffset that explicitly call out the UTF-16 assumption. These serve as a gentle off-ramp for current mis-uses of encodedOffset.	2019-02-13 18:42:40 -08:00
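The CR-LF remark is worth seeing concretely. The sketch below uses `utf16Offset(in:)` and `String.Index(utf16Offset:in:)`, which I take to be the pair of alternatives this commit refers to, to round-trip an index through an explicit UTF-16 offset, and shows that the offset differs from the Character distance even in an all-ASCII string.

```swift
let s = "a\r\nb"   // all ASCII, but "\r\n" is a single Character

let b = s.firstIndex(of: "b")!

// Serialize the index as an explicit UTF-16 code unit offset...
let offset = b.utf16Offset(in: s)
// ...and restore it against the same string.
let restored = String.Index(utf16Offset: offset, in: s)

print(offset)                                   // 3 UTF-16 code units precede "b"
print(s.distance(from: s.startIndex, to: b))    // 2 Characters precede "b"
print(restored == b)                            // true
```

Code that treats the UTF-16 offset as a Character count would be off by one here, which is the kind of misuse the commit message describes.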
Ben Cohen	1673c12d78	[stdlib] Replace "sanityCheck" with "internalInvariant" (#20616 ) * Replace "sanityCheck" with "internalInvariant"	2018-11-15 20:50:22 -08:00
Michael Ilseman	63fe485758	[String] Audit and publish the rest of the ABI	2018-11-15 11:06:33 -08:00
Michael Ilseman	948655e850	[String] Cleanups, comments, documentation After rebasing on master and incorporating more 32-bit support, perform a bunch of cleanup, documentation updates, comments, move code back to String declaration, etc.	2018-11-04 10:42:42 -08:00
Karoy Lorentey	40aae6b235	[String] 32-bit platform support Add support for 32-bit platforms for UTF-8 backed String.	2018-11-04 10:42:41 -08:00
Michael Ilseman	7aea40680d	[String] NFC iterator fast-paths Refactor and rename _StringGutsSlice, apply NFC-aware fast paths to a new buffered iterator. Also, fix bug in _typeName which used to assume ASCIIness and better SIL optimizations on StringObject.	2018-11-04 10:42:41 -08:00
Michael Ilseman	a0e639eaf5	[String] Grapheme breaking fast-paths Add in our scalar-based fast-paths for UTF-8 and foreign strings, and update the grapheme cache.	2018-11-04 10:42:40 -08:00
Michael Ilseman	fe7c3ce2e4	[String] Refactorings and cleanup * Refactor out RRC implementation into dedicated file. * Change our `_invariantCheck` pattern to generate efficient code in asserts builds and make the optimizer's job easier. * Drop a few Bidi shims we no longer need. * Restore View decls to String, workaround no longer needed * Cleaner unicode helper facilities	2018-11-04 10:42:40 -08:00
Michael Ilseman	f23a3c19b8	[String] Bounds checking and Index cleanup	2018-11-04 10:42:40 -08:00
Michael Ilseman	4ab45dfe20	[String] Drop in initial UTF-8 String prototype This is a giant squashing of a lot of individual changes prototyping a switch of String in Swift 5 to be natively encoded as UTF-8. It includes what's necessary for a functional prototype, dropping some history, but still leaves plenty of history available for future commits. My apologies to anyone trying to do code archeology between this commit and the one prior. This was the lesser of evils.	2018-11-04 10:42:40 -08:00
Ben Cohen	a4230ab2ad	[stdlib] Update stdlib to 4.0 and reorganize compatibility shims (#17580 ) * Update stdlib to 4.0 and move all compatibility shims into a dedicated source file	2018-06-29 06:26:52 -07:00
Ben Cohen	a51cc89b11	Replace _CharacterView with a typealias (#17472 )	2018-06-25 13:22:09 -07:00
Karoy Lorentey	23c630ac92	[stdlib] Add @usableFromInline to internal typealiases that need it This fixes 3659 warnings in the standard library.	2018-06-18 16:34:19 +01:00
Karoy Lorentey	07c1b74cc4	[stdlib] Audit inlinability of Hashable implementations As a general rule, it is safe to mark the implementation of hash(into:) and _rawHashValue(seed:) for @_fixed_layout structs as inlinable. However, some structs (like String guts, Character, KeyPath-related types) have complicated enough hashing that it seems counterproductive to inline them. Mark these with @effects(releasenone) instead.	2018-05-31 18:24:59 -07:00
Michael Ilseman	3ee17102ed	[String.Index] Restore compound offsets. Move the shifts to index creation time rather than index comparison time. This seems to benefit micro benchmarks and cover up inefficiencies in our generic index distance calculations.	2018-05-25 09:54:35 -07:00
Michael Ilseman	614016fecd	[String.Index] Simplify and prepare for more resilience. Simplify String.Index by sinking transcoded offsets into the .utf8 variant. This is in preparation for a more resilient index type capable of supporting existential string indices.	2018-05-24 14:47:04 -07:00
Michael Ilseman	ba65638244	[string] Shrink String.Index to 2 words. String.Index is 3 words in size, which means that Range<String.Index> is 6, and Substring is 8 words total. This is pretty wasteful, so make a very minor adjustment to the index cache's UTF-8 buffer to bring it down to 2 words total. Do other simplifications too.	2018-05-24 14:47:04 -07:00
Nate Cook	7a4e0a32f6	[stdlib] Revise documentation This includes various revisions to the APIs landing in Swift 4.2, including: - Random and other randomness APIs - Hashable changes - MemoryLayout.offset(of:)	2018-05-18 11:31:54 -05:00
Karoy Lorentey	0342ce3b96	[SE-0206][stdlib] Remove obsolete hashValue implementations These are now synthesized by the compiler. (Inlinability will be different, but that seems fine.)	2018-04-30 10:17:09 +01:00
Karoy Lorentey	239c2c91dd	[SE-0206][stdlib] Add missing hash(into:) declarations	2018-04-30 10:17:09 +01:00
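For readers unfamiliar with SE-0206, the two SE-0206 commits above concern the `hash(into:)` requirement of `Hashable`. The following sketch contrasts a compiler-synthesized conformance with a hand-written one; the types are illustrative and not taken from the standard library.

```swift
// Synthesized: the compiler derives == and hash(into:) from all stored properties.
struct Synthesized: Hashable {
    var id: Int
    var name: String
}

// Manual: only `id` takes part in equality and hashing; `cache` is ignored.
struct Manual: Hashable {
    var id: Int
    var cache: String

    static func == (lhs: Manual, rhs: Manual) -> Bool {
        lhs.id == rhs.id
    }

    func hash(into hasher: inout Hasher) {
        // Feed exactly the components that == compares.
        hasher.combine(id)
    }
}

let a = Manual(id: 1, cache: "x")
let b = Manual(id: 1, cache: "y")
print(a == b)                            // true
print(Set([a, b]).count)                 // 1
print(Synthesized(id: 1, name: "n") == Synthesized(id: 1, name: "m"))   // false
```

With `hash(into:)` in place, `hashValue` is derived from it, which matches the commit's note that separate `hashValue` implementations became obsolete.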


69 Commits