Add ICU shims for creating UTexts and performing grapheme breaking
using UTexts. This will allow us to explore multi-encoding grapheme
support throughout string.
- numericValue returns nil instead of .nan for non-numerics
- Remove small-string optimizations from _scalarName that failed on 32-bit archs
- Put case mappings back into U.S.Properties
- Added more sanity tests
* Add partial range subscripts to _UnmanagedOpaqueString
* Use SipHash13+_NormalizedCodeUnitIterator for String hashes on all platforms
* Remove unnecessary collation algorithm shims
* Pass the buffer to the SipHasher for ASCII
* Hash the ascii parts of UTF16 strings the same way we hash pure ascii strings
* De-dupe some code that can be shared between _UnmanagedOpaqueString and _UnmanagedString<UInt16>
* ASCII strings now hash consistently in hashASCII() and hashUTF16()
* Fix zalgo comparison regression
* Use hasher
* Fix crash when appending to an empty _FixedArray
* Compact ASCII characters into a single UInt64 for hashing
* String: Switch to _hash(into:)-based hashing
This should speed up String hashing quite a bit, as doing it through hashValue involves two rounds of SipHash nested in each other.
* Remove obsolete workaround for ARC traffic
* Ditch _FixedArray<UInt8> in favor of _UIntBuffer<UInt64, UInt8>
* Bad rebase remnants
* Fix failing benchmarks
* Address Michael's feedback
* clarify the comment about nul-terminated string hashes
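As an illustration of the "Compact ASCII characters into a single UInt64 for hashing" item above, here is a minimal C++ sketch of the packing idea. The helper name `sketch_packASCII`, the low-byte-first order, and the zero padding of the last word are assumptions made for this example; they are not taken from the stdlib implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the stdlib's code): packs ASCII bytes into
// 64-bit words, eight bytes per word, lowest byte first, so a hasher can
// consume one word at a time instead of one byte at a time.
std::vector<std::uint64_t> sketch_packASCII(const unsigned char *bytes,
                                            std::size_t count) {
  std::vector<std::uint64_t> words;
  std::uint64_t word = 0;
  unsigned filled = 0;  // number of bytes already stored in `word`
  for (std::size_t i = 0; i < count; ++i) {
    assert(bytes[i] < 0x80 && "input must be ASCII");
    word |= static_cast<std::uint64_t>(bytes[i]) << (8 * filled);
    if (++filled == 8) {
      words.push_back(word);
      word = 0;
      filled = 0;
    }
  }
  // A trailing partial word is zero padded in this sketch.
  if (filled != 0) words.push_back(word);
  return words;
}
```

Because the last word is zero padded, "abc" and "abc" followed by a nul byte pack to the same word here, so a real hasher built on this would presumably also have to mix in the length or a terminator.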
Adjust the signature to match the ICU declaration for
`unorm2_normalize`. This was adjusted to allow building against ICU
59.1. The shim type definition for the UChar ensures that the signature
is correct on all the targets. NFC.
Restore (un-revert) string comparison, with fixes
More exhaustive testing of opaque strings, which consistently reproduces prior sporadic failure. Shims fixups. Some test tweaking.
The default ICU build will change the underlying type of the UChar type,
with C++ using the builtin `char16_t` rather than `unsigned short`.
This adjusts the interface to account for that, since we cannot
guarantee that the ICU interfaces are built the same way on all targets,
especially when using the underlying system's ICU. I've verified that
Apple's implementation always uses `unsigned short` as the type for
`UChar`.
Adjust the stubs implementation declaration to match the ICU header's
declaration.
Force the autolinking on Windows and Darwin as both have mechanisms to
support this. ELFish targets are unfortunately not supported yet as
there is no portable mechanism to do this.
Remove the unnecessary handling of specific targets. Always perform the
cast as it adds no overhead and will always be correct (worst case is
that the type is cast to itself). This simplifies the logic.
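A small sketch of the "always cast" approach, under stated assumptions: `sketch_UChar` stands in for ICU's `UChar` (either `char16_t` or a 16-bit unsigned integer, depending on how ICU was built), `shim_UChar` stands in for the fixed type the shim header exposes, and `sketch_u_strlen` is a hypothetical stand-in for an ICU entry point. None of these names come from the actual sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for ICU's UChar. Define SKETCH_UCHAR_IS_UINT16 to model a
// build where UChar is a plain 16-bit unsigned integer.
#if defined(SKETCH_UCHAR_IS_UINT16)
typedef std::uint16_t sketch_UChar;
#else
typedef char16_t sketch_UChar;
#endif

// Stand-in for the fixed-width type exposed by the shim header.
typedef std::uint16_t shim_UChar;

static_assert(sizeof(shim_UChar) == sizeof(sketch_UChar) &&
                  alignof(shim_UChar) == alignof(sketch_UChar),
              "shim and ICU code unit types must share a layout");

// Hypothetical stand-in for an ICU function declared in terms of UChar.
std::size_t sketch_u_strlen(const sketch_UChar *s) {
  std::size_t n = 0;
  while (s[n] != 0) ++n;
  return n;
}

// The shim casts unconditionally. When both typedefs name the same type
// the cast is an identity conversion, so no per-target #if is needed.
// Note: reading through the cast pointer leans on the two types sharing
// a representation; it is a sketch, not a strict-aliasing-clean recipe.
std::size_t shim_strlen(const shim_UChar *s) {
  assert(s != nullptr);
  return sketch_u_strlen(reinterpret_cast<const sketch_UChar *>(s));
}
```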
Move the forward declarations that are used to avoid an inclusion to the
same location as the inclusion.
ICU headers prefer use of `char16_t` instead of `uint16_t` for `UChar` in C++, where it is treated as a distinct type. This fixes associated warnings and errors.
Programs using a statically linked build of the standard library need
to explicitly link against icucore. There are various potential
hacks^Wsolutions to this problem, and this is an attempt at a lesser
of evils approach.
Emit a linker directive to perform autolinking against icucore on
Darwin systems. This allows us to avoid hacking the compiler driver
and propagating that hack onto any build systems that don't go through
the driver.
Avoid a dependency on ICU headers on Apple platforms, rather than rely
on corelibs-foundation being checked out. This simplifies the
dependencies and unblocks build bots.
Introduce shims for using UBreakIterators from ICU. Also introduce
shims for using thread local storage via pthreads.
We will be relying on ICU and UBreakIterators for grapheme
breaking. But, UBreakIterators are very expensive to create,
especially for the way we do grapheme breaking, which is relatively
stateless. Thus, we will stash one or more into thread local storage
and reset it as needed.
Note: Currently, pthread_key_t is hard coded for a single platform
(Darwin), but I have a static_assert alongside directions on how to
adapt it to any future platforms that differ in key type.
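The caching scheme described above might look roughly like the following sketch. `SketchBreakIterator` is a stand-in for an expensive `UBreakIterator`, and every `sketch_` or `shim_` name is hypothetical; only the pthread calls are real. The hard-coded Darwin key type mirrors the note above and is guarded by a static_assert.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <type_traits>

// A shim header that cannot include <pthread.h> would hard-code the key
// type. Only Darwin is hard-coded in this sketch.
#if defined(__APPLE__)
typedef unsigned long shim_pthread_key_t;
#else
typedef pthread_key_t shim_pthread_key_t;
#endif
static_assert(std::is_same<shim_pthread_key_t, pthread_key_t>::value,
              "update shim_pthread_key_t for this platform");

// Stand-in for an object that is expensive to create but cheap to reset.
struct SketchBreakIterator { int resets = 0; };

static std::atomic<int> sketch_created{0};  // counts real constructions
static shim_pthread_key_t sketch_key;
static pthread_once_t sketch_key_once = PTHREAD_ONCE_INIT;

// Runs at thread exit for threads that stored an iterator.
static void sketch_destroy(void *p) {
  delete static_cast<SketchBreakIterator *>(p);
}

static void sketch_make_key() {
  int rc = pthread_key_create(&sketch_key, sketch_destroy);
  assert(rc == 0);
  (void)rc;
}

// Returns this thread's cached iterator, creating it on first use and
// "resetting" it (here: bumping a counter) on every call.
SketchBreakIterator *sketch_thread_iterator() {
  pthread_once(&sketch_key_once, sketch_make_key);
  auto *it = static_cast<SketchBreakIterator *>(pthread_getspecific(sketch_key));
  if (it == nullptr) {
    it = new SketchBreakIterator();
    ++sketch_created;
    pthread_setspecific(sketch_key, it);
  }
  ++it->resets;
  return it;
}
```

Repeated calls on one thread return the same object, so the creation cost is paid once per thread rather than once per grapheme-breaking request.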
The runtime and stubs are built for ALL targets, not specific ones. This allows
us to configure when cross-compiling to Windows again. Collapse the dual
addition of the swiftRuntime into a single build. This unifies the runtime
build for the Apple and non-Apple SDKs. The difference here was the ObjC
interop sources. In order to deal with that unification add a CPP macro to
indicate whether the interop sources should be included or not.
Update for SE-0107: UnsafeRawPointer
This adds a "mutating" initialize to UnsafePointer to make
Immutable -> Mutable conversions explicit.
These are quick fixes to stdlib, overlays, and test cases that are necessary
in order to remove arbitrary UnsafePointer conversions.
Many cases can be expressed better by reworking the surrounding
code, but we first need a working starting point.
To minimize code size and VM live set, we try to funnel all one-time initialization through swift_once instead of mixing it with the C++ runtime's support for lazy static initialization.
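To illustrate the pattern, here is a sketch that routes one-time initialization through a single "once" primitive instead of a function-local static with a non-trivial initializer. `std::call_once` is used purely as a stand-in for `swift_once`, whose actual signature is not shown in this log; the table and all `sketch_` names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Plain zero-initialized storage: no compiler-generated guard variable
// and no lazy-static runtime call is attached to it.
static int sketch_table[4];
static std::once_flag sketch_table_once;
static std::atomic<int> sketch_init_runs{0};  // for demonstration only

static void sketch_init_table() {
  for (int i = 0; i < 4; ++i) sketch_table[i] = i * i;
  ++sketch_init_runs;
}

// Every accessor goes through the same once primitive. The alternative
// this avoids would be a function-local static with a dynamic
// initializer, which uses the C++ runtime's own lazy-init machinery.
const int *sketch_get_table() {
  std::call_once(sketch_table_once, sketch_init_table);
  assert(sketch_init_runs.load() == 1);
  return sketch_table;
}
```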
This patch is for libswiftCore.lib, linking with the library set of Visual Studio 2015. Clang with the option -fms-extensions is used to build this port.
This is the approved subpatch of a large patch.
The general rule here is that something needs to be SWIFT_CC(swift)
if it's just declared in Swift code using _silgen_name, as opposed to
importing something via a header.
Of course, SWIFT_CC(swift) expands to nothing by default for now, and
I haven't made an effort yet to add the indirect-result / context
parameter ABI attributes. This is just a best-effort first pass.
I also took the opportunity to shift a few files to just implement
their shims header and to demote a few things to be private stdlib
interfaces.
...and explicitly mark symbols we export, either for use by executables or for runtime-stdlib interaction. Until the stdlib supports resilience we have to allow programs to link to these SPI symbols.