Commit Graph

41 Commits

Author SHA1 Message Date
Guillaume Lessard
9ccf71e0b7 [stdlib] More MutableCollection fix to _SmallString 2021-07-28 02:13:45 -06:00
Guillaume Lessard
50e7c98583 [stdlib] MutableCollection fix to _SmallString 2021-07-20 13:25:55 -06:00
Robert Widmann
0149ccd0ca Add arm64_32 support for Swift
Commit the platform definition and build script work necessary to
cross-compile for arm64_32.

arm64_32 is a variant of AARCH64 that supports an ILP32 architecture.
2021-04-20 14:59:04 -07:00
Michael Ilseman
49eddbf318 [stdlib] Fix bug in small string uninitialized init
Fix a bug and enforce the invariant that all bits between the last code unit
and the descriminator in a small string should be unset.
2021-03-30 18:24:28 -06:00
Karoy Lorentey
1f92df093c [stdlib] Add an unsafe U[M]BP initializer to work around some inliner test failures 2020-12-09 19:31:28 -08:00
Andrew Trick
5eafc20cdd Fix undefined behavior in SmallString.withUTF8
withUTF8 currently vends a typed UInt8 pointer to the underlying
SmallString. That pointer type differs from SmallString's
representation. It should simply vend a raw pointer, which would be
both type safe and convenient for UTF8 data. However, since this
method is already @inlinable, I added calls to bindMemory to prevent
the optimizer from reasoning about access to the typed pointer that we
vend.

rdar://67983613 (Undefinied behavior in SmallString.withUTF8 is miscompiled)

Additional commentary:

SmallString creates a situation where there are two types, the
in-memory type, (UInt64, UInt64), vs. the element type,
UInt8. `UnsafePointer<T>` specifies the in-memory type of the pointee,
because that's how C works. If you want to specify an element type,
not the in-memory type, then you need to use something other than
UnsafePointer to view the memory. A trivial `BufferView<UInt8>` would
be fine, although, frankly, I think UnsafeRawPointer is a perfectly
good type on its own for UTF8 bytes.

Unfortunately, a lot of the UTF8 helper code is ABI-exposed, so to
work around this, we need to insert calls to bindMemory at strategic
points to avoid undefined behavior. This is high-risk and can
negatively affect performance. So far, I was able to resolve the
regressions in our microbenchmarks just by tweaking the inliner.
2020-09-24 18:36:42 -07:00
Max Desiatov
8e705f3413 [WebAssembly] Add wasm32 support to stdlib String 2020-02-17 12:51:34 +00:00
David Smith
b06137b283 Add a private implementation of a String initializer with access to uninitialized storage (https://github.com/apple/swift-evolution/pull/1022) and use it to speed up uppercased() and lowercased() 2019-07-09 15:05:00 -07:00
Ben Cohen
e9d4687e31 De-underscore @frozen, apply it to structs (#24185)
* De-underscore @frozen for enums

* Add @frozen for structs, deprecate @_fixed_layout for them

* Switch usage from _fixed_layout to frozen
2019-05-30 17:55:37 -07:00
Michael Ilseman
f7cdda2720 [gardening] Clean up many String computed vars 2019-04-08 15:16:48 -07:00
Michael Ilseman
877a20ead0 [String] Fix crash when given null UBP 2019-02-06 14:44:01 -08:00
Michael Munday
8d4608f8e4 [string] Improve some small string comments
Clarify the expected behavior of some small string helper functions.
2019-01-08 06:15:54 -05:00
Michael Munday
9587582c86 [string] Fix string implementation for big endian platforms
This commit supersedes 2866b4a which was overwritten by 4ab45df.

Store small string code units in little-endian byte order. This way
the code units are in the same order on all machines and can be
safely treated as an array of bytes.
2019-01-07 09:01:43 -05:00
Michael Ilseman
5a6d2dfa59 [String] Switch ABI to only use 4 discriminator bits.
In anticipation of potential future HW features, e.g. armv8.5 memory
tagging, only use the high 4 bytes as discriminator bits in
_BridgeObject rather than the top 8 bits. Utilize two perf flags to
cover this instead. This requires shifting around a fair amount of
internal complexity.
2018-12-19 13:54:50 -08:00
Michael Ilseman
5d67236bc0 [String] Refactor 32-bit StringObject.
Remove Discriminator, Flags, etc., abstractions from
StringObject. These cause code divergence between 32-bit and 64-bit
ABI, complicate ABI changes, and otherwise contribute to bloat.
2018-12-19 11:19:08 -08:00
Ben Cohen
1673c12d78 [stdlib] Replace "sanityCheck" with "internalInvariant" (#20616)
* Replace "sanityCheck" with "internalInvariant"
2018-11-15 20:50:22 -08:00
Maxim Moiseev
cbf83ac04f [NFC][stdlib] Add FIXME markers to simplify audit 2018-11-14 11:58:42 -08:00
Slava Pestov
f6c2caf64b stdlib: Add @inlinable to @inline(__always) declarations
These should be audited since some might not actually need to be
@inlinable, but for now:

- Anything public and @inline(__always) is now also @inlinable
- Anything @usableFromInline and @inline(__always) is now @inlinable
2018-11-13 15:15:07 -05:00
Michael Ilseman
948655e850 [String] Cleanups, comments, documentation
After rebasing on master and incorporating more 32-bit support,
perform a bunch of cleanup, documentation updates, comments, move code
back to String declaration, etc.
2018-11-04 10:42:42 -08:00
Karoy Lorentey
40aae6b235 [String] 32-bit platform support
Add support for 32-bit platforms for UTF-8 backed String.
2018-11-04 10:42:41 -08:00
Michael Ilseman
d5da6fdbfd [String] More comparison speedups and cleanup 2018-11-04 10:42:41 -08:00
Michael Ilseman
9135c07cac [String] Perform small string append in-register 2018-11-04 10:42:41 -08:00
Michael Ilseman
a37d110adf [String] Constant-fold small strings from literals.
Tweak and adjust code so that the SIL optimizer can constant-fold
small strings from literals. Also some cleanup.
2018-11-04 10:42:41 -08:00
Michael Ilseman
7aea40680d [String] NFC iterator fast-paths
Refactor and rename _StringGutsSlice, apply NFC-aware fast paths to a
new buffered iterator.

Also, fix bug in _typeName which used to assume ASCIIness and better
SIL optimizations on StringObject.
2018-11-04 10:42:41 -08:00
Michael Ilseman
8851bac1be [String] Inlining, NFC fast paths, and more.
Add inlinability annotations to restore performance parity with 4.2 String.

Take advantage of known NFC as a fast-path for comparison, and
overhaul comparison dispatch.

RRC improvements and optmizations.
2018-11-04 10:42:41 -08:00
Michael Ilseman
9d9f9005e3 [String] Define performance flags and plumb them throughout 2018-11-04 10:42:41 -08:00
Michael Ilseman
fe7c3ce2e4 [String] Refactorings and cleanup
* Refactor out RRC implementation into dedicated file.

* Change our `_invariantCheck` pattern to generate efficient code in
  asserts builds and make the optimizer job's easier.

* Drop a few Bidi shims we no longer need.

* Restore View decls to String, workaround no longer needed

* Cleaner unicode helper facilities
2018-11-04 10:42:40 -08:00
Michael Ilseman
9bf2c4d3d3 [String] Use small string at string creation 2018-11-04 10:42:40 -08:00
Michael Ilseman
4ab45dfe20 [String] Drop in initial UTF-8 String prototype
This is a giant squashing of a lot of individual changes prototyping a
switch of String in Swift 5 to be natively encoded as UTF-8. It
includes what's necessary for a functional prototype, dropping some
history, but still leaves plenty of history available for future
commits.

My apologies to anyone trying to do code archeology between this
commit and the one prior. This was the lesser of evils.
2018-11-04 10:42:40 -08:00
Michael Munday
2866b4a30d [string] Fix small string implementation for big endian platforms
Exclusively store small strings in little-endian byte order. This
will insert byte swaps when accessing small strings on big endian
platforms, however these are usually extremely cheap.

This approach means that the layout of the code points and count
in memory will be the same on both big and little endian machines
simplifying future development. Prior to this change this code
was broken on big endian machines because the memory layout was
different (the count ending up in the middle of the string).
2018-09-24 13:10:09 +01:00
Michael Ilseman
1859ae404a [test] Internalize SmallString; NFC 2018-08-01 14:23:56 -07:00
Slava Pestov
5d1f48e3ae stdlib: Update for stricter enforcement of @usableFromInline 2018-06-25 16:26:56 -07:00
Karoy Lorentey
23c630ac92 [stdlib] Add @usableFromInline to internal typealiases that need it
This fixes 3659 warnings in the standard library.
2018-06-18 16:34:19 +01:00
Erik Little
863f3a19ff Rename @effects to @_effects
@effects is too low a level, and not meant for general usage outside
the standard library. Therefore it deserves to be underscored like
other such attributes.
2018-06-06 12:53:03 -04:00
Erik Eckstein
47383e2f8e stdlib: let the construction of a small string literal compile down to 2 constant loads.
The main part of this is to rewrite the small string literal-constructor to work with values (= shifting bytes) instead of setting bytes in memory.
This allows the compiler to fold away everything and end up with the optimal code for small string literals.
2018-05-29 11:21:28 -07:00
Erik Eckstein
0868f2864b stdlib: make Sequence -> String inlinable
As this function is generic, it makes a big difference when it can be specialized for concrete sequences, like arrays or unsafe buffers.
This fixes a performance regression of String(decoding:as:), e.g. when constructing a String from a byte buffer.
2018-05-17 13:44:51 -07:00
Michael Ilseman
6b7d316f50 [string] Fix 32-bit small string compilation failure 2018-05-13 07:38:55 -07:00
Michael Ilseman
459833725e [String] Streamline more String creation logic.
Streamline and de-genericize non-inlinable internal functions to
create a String from UTF-8 efficiently.
2018-05-13 07:38:55 -07:00
Slava Pestov
2e5aef9c8d stdlib: Remove redundant @usableFromInline attributes 2018-04-06 00:02:30 -07:00
Slava Pestov
e1f50b2d36 SE-0193: Rename @_inlineable to @inlinable, @_versioned to @usableFromInline 2018-03-30 21:55:30 -07:00
Michael Ilseman
cb9dc84d72 [string] Initial small string implementation.
This adds a small string representation capable of holding up to 15
ASCII code units directly in registers. This is extendable to UTF-8 in
the future.

It is intended to be the preferred representation whenever possible
for String, and is intended to be a String fast-path. Future small
forms may be added in the future (likely off the fast-path).

Small strings are available on 64-bit, where they are most beneficial
and well accomodated by limited address spaces. They are unavailable
on 32-bit, where they are less of a win and would require much more
hackery due to full address spaces.
2018-03-27 14:00:55 -07:00