178 Commits

Author SHA1 Message Date
Michael Ilseman
415cc8fb0c [String.Index] Deprecate encodedOffset var/init
String.Index has an encodedOffset-based initializer and computed
property that exists for serialization purposes. It was documented as
UTF-16 in the SE proposal introducing it, which was String's
underlying encoding at the time, but the dream of String even then was
to abstract away whatever encoding happened to be used.

Serialization needs an explicit encoding for serialized indices to
make sense: the offsets need to align with the view. With String
utilizing UTF-8 encoding for native contents in Swift 5, serialization
isn't necessarily the most efficient in UTF-16.

Furthermore, the majority of usage of encodedOffset in the wild is
buggy and operates under the assumption that a UTF-16 code unit was a
Swift Character, which isn't even valid if the String is known to be
all-ASCII (because CR-LF).

This change introduces a pair of semantics-preserving alternatives to
encodedOffset that explicitly call out the UTF-16 assumption. These
serve as a gentle off-ramp for current mis-uses of encodedOffset.
2019-02-13 18:42:40 -08:00
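
To make the CR-LF point concrete, here is a minimal sketch. It assumes the alternatives the message refers to are the Swift 5 `utf16Offset(in:)` method and `String.Index(utf16Offset:in:)` initializer; the sample string and variable names are illustrative only.

```swift
// CR-LF is a single Character (one grapheme cluster) but two UTF-16 code units,
// so a UTF-16 offset is not a Character offset, even for all-ASCII text.
let text = "a\r\nb"

let characterCount = text.count      // Characters: "a", "\r\n", "b"
let utf16Count = text.utf16.count    // code units: a, CR, LF, b

// Serialize an index as an explicit UTF-16 offset ...
let indexOfB = text.firstIndex(of: "b")!
let offset = indexOfB.utf16Offset(in: text)

// ... and rebuild an index from that offset later.
let restored = String.Index(utf16Offset: offset, in: text)
let restoredCharacter = text[restored]
```

Treating `offset` as a count of Characters would be off by one here, which is the kind of misuse the deprecation is meant to surface.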
Daniel Rodríguez Troitiño
d08b46c47e [tests] Standardize the checks for Darwin, Glibc and MSVCRT.
Different tests used different OS checks for importing Darwin, Glibc and
MSVCRT. This commit uses the same pattern for importing those libraries,
to keep the #else branches of the incorrect patterns from being applied
to the wrong platform. This was common on Android, which normally should
follow the Linux branches, but sometimes was trying to import Darwin or
not importing anything.

The standardized pattern imports Darwin for macOS, iOS, tvOS and watchOS.
It imports Glibc for Linux, FreeBSD, PS4, Android, Cygwin and Haiku; and
imports MSVCRT for Windows. If a new platform is introduced, the else
branch will report an error, so the new platform can be added to one of
the branches (or maybe add a new specific branch).

In some cases the standard pattern was modified because some tests
required it (importing extra modules, or extra type aliases), and in some
other cases some branches were removed because the tests would not have
used them (but this is not exhaustive, so there might be some unnecessary
branches).

This should, at least, fix three tests for Android (the three
dynamic_replacement*.swift ones).
2019-02-06 10:51:55 -08:00
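
The pattern described above can be sketched roughly as follows. This is a reconstruction from the message, not a copy of the test sources; the module names follow the commit, and newer toolchains may name the Windows C runtime module differently. The `strlen` call is only there to show that a C library symbol is available on whichever branch was taken.

```swift
#if os(macOS) || os(iOS) || os(tvOS) || os(watchOS)
  import Darwin
#elseif os(Linux) || os(FreeBSD) || os(PS4) || os(Android) || os(Cygwin) || os(Haiku)
  import Glibc
#elseif os(Windows)
  import MSVCRT
#else
  // A new platform fails loudly here instead of silently taking a wrong branch.
  #error("Unsupported platform")
#endif

// strlen comes from whichever C library was imported above.
let length = strlen("swift")
```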
Lance Parker
15aaa1e777 [stdlib] String normalization functions (#21026)
* fast/foreignNormalize functions
2019-01-08 13:55:29 -08:00
Michael Ilseman
c0c530aef8 [String] Speed up constant factors on comparison.
Include some tuning and tweaking to reduce the constant factors
involved in string comparison. This yields considerable improvement on
our micro-benchmarks, and allows us to make less inlinable code and
have a smaller ABI surface area.

Adds more extensive testing of corner cases in our existing
fast-paths.
2018-12-03 15:49:38 -08:00
Michael Ilseman
94942c5b3b [String] Fix corner case in comparison fast-path. (#20937)
When in a post-binary-prefix-scan fast-path, we need to make sure we
are comparing a full-segment scalar, otherwise we miss situations
where a combining end-of-segment scalar would be reordered with a
prior combining scalar in the same segment under normalization in one
string but not the other.

This was hidden by the fact that many combining scalars are not
NFC_QC=maybe, but those which are not present in any precomposed form
have NFC_QC=yes. Added tests.
2018-12-03 10:41:45 -08:00
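
As an illustration of the reordering involved (not the specific case the fix targets), the sketch below uses two combining marks with different canonical combining classes: U+0323 COMBINING DOT BELOW (class 220) and U+0307 COMBINING DOT ABOVE (class 230). The two strings differ scalar by scalar but are canonically equivalent, so a comparison that stops at the first differing scalar without looking at the whole segment would get this wrong.

```swift
// Same base letter, same two combining marks, opposite order.
let dotBelowFirst = "a\u{0323}\u{0307}"
let dotAboveFirst = "a\u{0307}\u{0323}"

// The raw scalar sequences differ ...
let scalarsDiffer = !dotBelowFirst.unicodeScalars
    .elementsEqual(dotAboveFirst.unicodeScalars)

// ... but String equality compares under canonical equivalence,
// which puts both mark sequences into the same canonical order.
let stringsEqual = dotBelowFirst == dotAboveFirst
```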
Lance Parker
17187344df Make the NormalizationCheck test compare code units 2018-11-16 14:43:04 -08:00
Lance Parker
12bf2978e3 Michael's feedback 2018-11-16 10:20:46 -08:00
Lance Parker
0009b21533 properly promote stack buffer to heap buffer when necessary 2018-11-16 10:19:48 -08:00
Michael Ilseman
75943350d2 [String] Give String a custom iterator
Gives us modest wins on complex grapheme strings, but up to 40% on
heavy-ASCII strings.
2018-11-08 18:25:01 -08:00
Michael Ilseman
ec6729a3a3 [String] Assertion logic and isASCII bug fix.
Fix bugs in assertion logic and properly update the isASCII bit on
RRC. RRC tests added.
2018-11-04 10:42:44 -08:00
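
RRC here is RangeReplaceableCollection. The isASCII bit itself is internal, so the sketch below only shows the observable side of the situation: a range replacement that turns an all-ASCII string into a non-ASCII one, after which any cached ASCII flag has to change. The sample string is illustrative.

```swift
var greeting = "cafe"
let wasASCII = greeting.utf8.allSatisfy { $0 < 0x80 }

// Replace the final "e" with U+00E9 (e with acute), which is not ASCII.
let last = greeting.index(before: greeting.endIndex)
greeting.replaceSubrange(last..<greeting.endIndex, with: "\u{E9}")

let isStillASCII = greeting.utf8.allSatisfy { $0 < 0x80 }
```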
Michael Ilseman
948655e850 [String] Cleanups, comments, documentation
After rebasing on master and incorporating more 32-bit support,
perform a bunch of cleanup, documentation updates, comments, move code
back to String declaration, etc.
2018-11-04 10:42:42 -08:00
Michael Ilseman
e6582c37ee [test] Adjust String tests for UTF-8 representation.
Adjust tests for the UTF-8 representation, in preparation for 32-bit
support. Includes UTF-8 literal update.
2018-11-04 10:42:41 -08:00
Michael Ilseman
b87bff4fac [test] Test the unique-native String RRC optimization path 2018-11-04 10:42:41 -08:00
Lance Parker
f1a35bd1c9 String comparison iterator for UTF8 strings 2018-11-04 10:42:41 -08:00
Ben Cohen
e338344bae Remove overloads that were needed pre-conditional conformance 2018-09-11 21:00:36 -07:00
Jordan Rose
01a0de27ec [test] Update for remote-run-ing tests on a different macOS (#18966)
Most of this is just "remember to specify the inputs and outputs on
the command line, so remote-run can see them". A bit is "prefix
environment variables with '%env-'". And the last few are "yeah,
this was never going to work in a remote environment".

In the few cases where I couldn't think of anything reasonable, I just
marked the test as "UNSUPPORTED: remote_run", a new "feature".
2018-08-27 14:50:40 -07:00
Arnold Schwaighofer
b62c6e64ff Codesign validation-test/stdlib 2018-08-10 09:39:09 -07:00
Michael Ilseman
8294c0003a [string] Drop _StringGuts subscript; NFC
_StringGuts shouldn't expose a subscript, implying efficient
access. Switch to the explicit code unit fetch method. Update tests
accordingly, and switch off of deprecated typealiases.
2018-08-02 16:34:22 -07:00
Michael Ilseman
ba6158d74e [test] Internalize _StringGuts; Add shared testing struct; NFC
Create a _StringRepresentation struct to standardize internal testing
on. Internalize much of _StringGuts, except for some SPI hacks, and
update tests to use _StringRepresentation.
2018-08-01 14:23:56 -07:00
Michael Ilseman
a7d3c7079b [test] Adjust String.swift tests for non-small strings on 32b; NFC 2018-07-31 11:23:51 -07:00
Michael Ilseman
534a17aebb [test] Migrate String.swift off of Swift 3; NFC 2018-07-30 17:38:25 -07:00
Michael Ilseman
c8ed8f9a2f [test] Update String tests for older iOS versions 2018-07-13 16:08:36 -07:00
Ben Cohen
685f31b0e2 [stdlib] Migrate stdlib tests of Swift 3 (#17427)
* First sweep of Swift 3 stdlib test upgrades

* Review feedback

* Remove a handful more #if >=4.0

* Fix up Dictionary tests
2018-07-08 09:37:01 -07:00
Ben Cohen
a51cc89b11 Replace _CharacterView with a typealias (#17472) 2018-06-25 13:22:09 -07:00
Slava Pestov
5d2752f7d2 Run tests with -swift-version 4 by default
Some tests now fail, so add an explicit -swift-version 3.
2018-06-19 23:24:19 -07:00
Michael Ilseman
1fe5fb717d [string] Skip allocation in reserveCapacity if smol
If the requested capacity is small enough to fit in our small string
representation, don't allocate a UTF-16 buffer, instead just return
early.
2018-05-18 21:26:59 -07:00
Michael Ilseman
459833725e [String] Streamline more String creation logic.
Streamline and de-genericize non-inlinable internal functions to
create a String from UTF-8 efficiently.
2018-05-13 07:38:55 -07:00
Michael Ilseman
715003c206 [gardening] Internalize many non-API String interfaces 2018-04-28 15:36:05 -07:00
Michael Ilseman
93d6130066 [string] Integrate small strings.
Switch StringObject and StringGuts from opaquely storing tagged cocoa
strings into storing small strings. Plumb small string support
throughout the standard library's routines.
2018-03-27 14:00:59 -07:00
Michael Ilseman
cdfeb88cfe [string] Simplify creation logic, especially for C strings.
Streamline internal String creation. Previously, everything funneled
into a single generic function, however, every single call of the
generic funnel had relevant specific information that could be used
for a more efficient algorithm.

In preparation for efficiently forming small strings, refactor this
logic into a handful of more specialized subroutines to preserve more
specific information from the callers.
2018-03-27 10:49:02 -07:00
Lance Parker
e0e50e9b3e Add failing test 2018-02-28 11:27:40 -08:00
Lance Parker
7cc222e271 Ditched the simple/complex test distinction as they all pass now (#20) 2018-02-19 10:09:24 -08:00
Lance Parker
0661de22a2 [stdlib] Un-revert string comparison (#14694)
Restore (un-revert) string comparison, with fixes

More exhaustive testing of opaque strings, which consistently reproduces prior sporadic failure. Shims fixups. Some test tweaking.
2018-02-18 10:50:33 -08:00
Lance Parker
abe6a6d177 Revert string comparison (#14657) 2018-02-15 14:37:43 -08:00
Lance Parker
49bc1459ae Update string comparison tests 2018-02-14 15:44:11 -08:00
Karoy Lorentey
b8d8949166 String & String views: Add bounds checking to range subscripts. 2018-01-21 12:40:21 -08:00
Karoy Lorentey
90e894729a [StringGuts] Linux support
Add support for compiling StringGuts without the Objective-C runtime.
2018-01-21 12:37:36 -08:00
Michael Ilseman
3be2faf5d3 [String] Initial implementation of 64-bit StringGuts.
Include the initial implementation of _StringGuts, a 2-word
replacement for _LegacyStringCore. 64-bit Darwin supported, 32-bit and
Linux support in subsequent commits.
2018-01-21 12:32:26 -08:00
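
The "2-word" claim can be checked from the outside with `MemoryLayout`. This is a small sketch that assumes a 64-bit target, where the message says the layout applies; other targets may differ.

```swift
// Size of a String value versus the size of one machine word.
let stringSize = MemoryLayout<String>.size
let wordSize = MemoryLayout<Int>.size

// On a 64-bit target this is expected to be two words.
let wordsPerString = stringSize / wordSize
```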
Michael Ilseman
75463e30f3 [stdlib] Rename _StringCore to _LegacyStringCore. NFC.
In grand LLVM tradition, the first step to redesigning _StringCore is
to first rename it to _LegacyStringCore. Subsequent commits will
introduce the replacement, and eventually all uses of the old one will
be moved to the new one.

NFC.
2018-01-21 12:28:56 -08:00
Ben Cohen
ca6c6b1d36 [stdlib] Cleanup DefaultIndices, delete dead code (#13952)
* Remove a bunch of Default(Bidirectional|RandomAccess)Indices usage from stdlib and test

* Remove some DefaultRandomAccessIndices and IndexDistance usage from Foundation

* Remove no-longer-used internal type in Existentials.swift

* Get rid of indicesForTraversal
2018-01-15 13:48:08 -08:00
Ben Cohen
4ddac3fbbd [stdlib] Eradicate IndexDistance associated type (#12641)
* Eradicate IndexDistance associated type, replacing with Int everywhere

* Consistently use Int for ExistentialCollection’s IndexDistance type.

* Fix test for IndexDistance removal

* Remove a handful of no-longer-needed explicit types

* Add compatibility shims for non-Int index distances

* Test compatibility shim

* Move IndexDistance typealias into the Collection protocol
2017-12-08 12:00:23 -08:00
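
One practical effect of replacing IndexDistance with Int is that generic code over any Collection can do plain Int arithmetic on counts and offsets without conversions. A minimal sketch, with a hypothetical helper named `middleIndex(of:)`:

```swift
// `count` and `offsetBy:` are plain Int for every Collection,
// so no numericCast or associated-type juggling is needed.
func middleIndex<C: Collection>(of collection: C) -> C.Index {
    let half: Int = collection.count / 2
    return collection.index(collection.startIndex, offsetBy: half)
}

let letters = ["a", "b", "c", "d", "e"]
let middleLetter = letters[middleIndex(of: letters)]

let word = "swift"
let middleCharacter = word[middleIndex(of: word)]
```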
Ben Cohen
dcab9493ae Removed some warnings (#12753) 2017-11-30 15:12:56 -08:00
Nate Cook
9dce40ca24 [stdlib] Add tests for index hashability 2017-11-28 13:29:55 -06:00
Lance Parker
77b0f39cc2 Max's feedback 2017-08-29 17:23:11 -07:00
Lance Parker
19d43c24d1 Some more test cleanup 2017-08-28 22:49:19 -07:00
Lance Parker
3f60f1c80f Remove commented out code 2017-08-28 22:44:41 -07:00
Lance Parker
c623881c78 Fix Linux tests 2017-08-28 21:24:11 -07:00
Lance Parker
0ebb2fecca Add COW test for String 2017-08-28 17:20:19 -07:00
Dave Abrahams
41c53ae729 [stdlib] Give Substring its own views
This is necessary for ensuring the property that String doesn't keep
inaccessible memory alive.  For example, before this change,

    String(s.dropFirst().unicodeScalars)

would compile and produce a String that owned inaccessible memory.
Now it no longer compiles.

The SubSequence of each String view is the corresponding Substring
view. E.g. String.UnicodeScalarView.SubSequence is
Substring.UnicodeScalarView.

New compatibility inits added, to work around the fact that many
previously failable initializers are now non-failable.
2017-07-26 15:59:51 -07:00
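
The type relationship stated above can be seen in a short sketch. The sample string and variable names are illustrative; the explicit type annotation only compiles because slicing a String.UnicodeScalarView yields a Substring.UnicodeScalarView. Copying the scalars into a fresh String is one way to end up with storage that holds only the visible contents.

```swift
let s = "h\u{E9}llo"

// Slicing String.UnicodeScalarView yields Substring.UnicodeScalarView.
let tail: Substring.UnicodeScalarView = s.unicodeScalars.dropFirst()

// Copy the sliced scalars into a new String with its own storage.
var copy = ""
copy.unicodeScalars.append(contentsOf: tail)
```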
Dave Abrahams
1f1f35a57b [stdlib] Squash some warnings in a test 2017-07-26 15:59:12 -07:00