39 Commits

Author SHA1 Message Date
Azoy
b8fc8b333c Remove _interface 2018-07-29 10:41:22 -04:00
Michael Ilseman
95cbf45957 [stubs] Add nullability annotations; NFC 2018-07-20 13:28:33 -07:00
Michael Ilseman
30450671fa Merge pull request #15593 from allevato/unicode-properties
[SE-0211] Add Unicode properties to Unicode.Scalar
2018-07-11 13:27:31 -07:00
Michael Ilseman
5358401d35 [string] Add UText shims; NFC
Add ICU shims for creating UTexts and performing grapheme breaking
using UTexts. This will allow us to explore multi-encoding grapheme
support throughout string.
2018-07-09 10:47:37 -07:00
Tony Allevato
d0e93acb00 Various fixes to Unicode.Scalar.Properties.
- numericValue returns nil instead of .nan for non-numerics
- Remove small-string optimizations from _scalarName that failed on 32-bit archs
- Put case mappings back into U.S.Properties
- Added more sanity tests
2018-07-05 20:42:56 -07:00
Tony Allevato
fb9f7ecca1 Merge branch 'master' into unicode-properties 2018-03-31 09:54:18 -07:00
Tony Allevato
354f2ad66e [stdlib] Add "numeric{Type,Value}" to Unicode.Scalar.Properties 2018-03-27 21:07:27 -07:00
Tony Allevato
e7fa49984a [stdlib] Add "{lower,title,upper}caseMapping" to Unicode.Scalar.Properties 2018-03-26 06:38:55 -07:00
Tony Allevato
9858d4edfb [stdlib] Add "name", "nameAlias" to Unicode.Scalar.Properties 2018-03-26 06:38:55 -07:00
Tony Allevato
af798fa972 [stdlib] Add "generalCategory" to Unicode.Scalar.Properties 2018-03-26 06:38:55 -07:00
Tony Allevato
d6ee54f4b5 [stdlib] Add "age" to Unicode.Scalar.Properties 2018-03-26 06:38:54 -07:00
Lance Parker
cbf157f924 [stdlib]Unify String hashing implementation (#14921)
* Add partial range subscripts to _UnmanagedOpaqueString

* Use SipHash13+_NormalizedCodeUnitIterator for String hashes on all platforms

* Remove unnecessary collation algorithm shims

* Pass the buffer to the SipHasher for ASCII

* Hash the ascii parts of UTF16 strings the same way we hash pure ascii strings

* De-dupe some code that can be shared between _UnmanagedOpaqueString and _UnmanagedString<UInt16>

* ASCII strings now hash consistently in hashASCII() and hashUTF16()

* Fix zalgo comparison regression

* Use hasher

* Fix crash when appending to an empty _FixedArray

* Compact ASCII characters into a single UInt64 for hashing

* String: Switch to _hash(into:)-based hashing

This should speed up String hashing quite a bit, as doing it through hashValue involves two rounds of SipHash nested in each other.

* Remove obsolete workaround for ARC traffic

* Ditch _FixedArray<UInt8> in favor of _UIntBuffer<UInt64, UInt8>

* Bad rebase remnants

* Fix failing benchmarks

* Michael's feedback

* clarify the comment about nul-terminated string hashes
2018-03-17 22:13:37 -07:00
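
The "Compact ASCII characters into a single UInt64 for hashing" step above can be pictured with a minimal C sketch. This is not the standard library's code: the helper name `pack_ascii` and the choice to put the first byte in the lowest 8 bits are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack up to 8 ASCII bytes into one 64-bit word so a
   hasher can consume them in a single step instead of byte by byte.
   Assumption: the first byte lands in the lowest 8 bits of the word. */
static uint64_t pack_ascii(const unsigned char *bytes, size_t count) {
    assert(count <= 8);              /* a 64-bit word holds at most 8 bytes */
    uint64_t word = 0;
    for (size_t i = 0; i < count; i++) {
        assert(bytes[i] < 0x80);     /* ASCII only: high bit must be clear */
        word |= (uint64_t)bytes[i] << (8 * i);
    }
    return word;
}
```

A caller would walk the string in chunks of eight bytes and hand each packed word to the hasher, which means one hasher call per eight characters rather than one per character.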
Saleem Abdulrasool
73c04d1dd9 stubs: match ICU signature for unorm2_normalize (NFC)
Adjust the signature to match the ICU declaration for
`unorm2_normalize`.  This was adjusted to allow building against ICU
59.1.  The shim type definition for the UChar ensures that the signature
is correct on all the targets.  NFC.
2018-02-25 22:48:27 -08:00
Lance Parker
0661de22a2 [stdlib]Un-revert string comparison (#14694)
Restore (un-revert) string comparison, with fixes

More exhaustive testing of opaque strings, which consistently reproduces prior sporadic failure. Shims fixups. Some test tweaking.
2018-02-18 10:50:33 -08:00
Saleem Abdulrasool
cd58f5c1c9 Shims: adjust the ICU interface
The default ICU build will change the underlying type of the UChar type,
with C++ using the builtin `char16_t` rather than `unsigned short`.
This adjusts the interface to account for that.  I've verified across
Apple's implementation that they always use the `unsigned short` as the
type for `UChar`.  However, we cannot guarantee that the ICU interfaces are
built the same way on all targets, especially when using the underlying
system's ICU.

Adjust the stubs implementation declaration to match the ICU header's
declaration.
2018-02-15 21:11:06 -08:00
Lance Parker
abe6a6d177 Revert string comparison (#14657) 2018-02-15 14:37:43 -08:00
Lance Parker
db2de58dd6 Remove unnecessary unicode collation algorithm shims 2018-02-14 15:44:11 -08:00
Lance Parker
32d21c489f Add unicode normalization shims 2018-02-14 15:44:11 -08:00
Michael Ilseman
5eb5e34897 [stdlib] Shims for UBreakIterator and thread local storage.
Introduce shims for using UBreakIterators from ICU. Also introduce
shims for using thread local storage via pthreads.

We will be relying on ICU and UBreakIterators for grapheme
breaking. But, UBreakIterators are very expensive to create,
especially for the way we do grapheme breaking, which is relatively
stateless. Thus, we will stash one or more into thread local storage
and reset it as needed.

Note: Currently, pthread_key_t is hard coded for a single platform
(Darwin), but I have a static_assert alongside directions on how to
adapt it to any future platforms that differ in key type.
2017-05-16 20:28:31 -07:00
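
The caching idea in the commit message above (create an expensive object once per thread, keep it in thread local storage, and reset it for each use) can be sketched in C with the POSIX pthread API. The `expensive_iter` struct and `cached_iter` function are hypothetical stand-ins, not the actual shims; a real version would hold a UBreakIterator and reset it through ICU.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object that is costly to create,
   such as a UBreakIterator. */
typedef struct {
    const char *text;   /* text currently being scanned */
    size_t position;    /* scan position within text */
    unsigned resets;    /* how often this instance has been reused */
} expensive_iter;

static pthread_key_t iter_key;
static pthread_once_t iter_key_once = PTHREAD_ONCE_INIT;

/* Runs at thread exit for each thread that stored a non-NULL value. */
static void destroy_iter(void *p) { free(p); }

/* Runs exactly once per process to create the key. */
static void make_key(void) {
    int rc = pthread_key_create(&iter_key, destroy_iter);
    assert(rc == 0);
    (void)rc;
}

/* Return the calling thread's cached iterator, creating it on first use
   and resetting it to scan `text` from the start on every call. */
static expensive_iter *cached_iter(const char *text) {
    pthread_once(&iter_key_once, make_key);
    expensive_iter *it = pthread_getspecific(iter_key);
    if (it == NULL) {
        it = calloc(1, sizeof *it);   /* the costly step, once per thread */
        if (it == NULL || pthread_setspecific(iter_key, it) != 0) {
            free(it);
            return NULL;
        }
    }
    it->text = text;                  /* the cheap reset */
    it->position = 0;
    it->resets++;
    return it;
}
```

Two calls to `cached_iter` on the same thread return the same pointer, so only the first call pays the creation cost. Because each thread has its own instance, no locking is needed around the reset.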
Michael Ilseman
f0abff5539 Revert "Merge pull request #9265 from milseman/tls_ftw"
This reverts commit 26f7659efe, reversing
changes made to 7b927e55e8.
2017-05-11 10:39:58 -07:00
Michael Ilseman
4a17449d02 [stdlib] Shims for UBreakIterator and thread local storage.
Introduce shims for using UBreakIterators from ICU. Also introduce
shims for using thread local storage via pthreads.

We will be relying on ICU and UBreakIterators for grapheme
breaking. But, UBreakIterators are very expensive to create,
especially for the way we do grapheme breaking, which is relatively
stateless. Thus, we will stash one or more into thread local storage
and reset it as needed.

Note: Currently, pthread_key_t is hard coded for a single platform
(Darwin), but I have a static_assert alongside directions on how to
adapt it to any future platforms that differ in key type.
2017-05-10 15:21:07 -07:00
Hugh Bellamy
bb34e2a959 Fix attribute fallout from new refcount representation 2017-03-02 19:44:37 +07:00
Hugh Bellamy
05a50fd978 Remove extern "C" from uses of SWIFT_RUNTIME_STDLIB_INTERFACE 2017-01-22 18:32:17 +00:00
practicalswift
6d1ae2a39c [gardening] 2016 → 2017 2017-01-06 16:41:22 +01:00
practicalswift
797b80765f [gardening] Use the correct base URL (https://swift.org) in references to the Swift website
Remove all references to the old non-TLS enabled base URL (http://swift.org)
2016-11-20 17:36:03 +01:00
Dmitri Gribenko
e8e8b35610 stdlib: use SipHash-1-3 for string hashing on non-ObjC platforms
Part of rdar://problem/24109692
2016-09-06 20:41:03 -07:00
Dmitri Gribenko
4da587b388 stdlib: annotate Unicode shims with nullability 2016-09-03 11:51:42 -07:00
Andrew Trick
0b75ee975e Remove "illegal" UnsafePointer casts from the stdlib.
Update for SE-0107: UnsafeRawPointer

This adds a "mutating" initialize to UnsafePointer to make
Immutable -> Mutable conversions explicit.

These are quick fixes to stdlib, overlays, and test cases that are necessary
in order to remove arbitrary UnsafePointer conversions.

Many cases can be expressed better by reworking the surrounding
code, but we first need a working starting point.
2016-07-28 20:42:23 -07:00
Saleem Abdulrasool
81661fca61 stdlib: use the reserved attribute spellings
This is a purely mechanical change replacing the attributes with the reserved
spelling.  Compilers are to not error when they encounter a reserved spelling
for an attribute which they do not support.
2016-05-11 11:30:24 -07:00
John McCall
39107bdab9 Build fix. 2016-05-04 10:36:23 -07:00
John McCall
50d58b2732 Add a lot of calling-convention annotations to the standard library / runtime.
The general rule here is that something needs to be SWIFT_CC(swift)
if it's just declared in Swift code using _silgen_name, as opposed to
importing something via a header.

Of course, SWIFT_CC(swift) expands to nothing by default for now, and
I haven't made an effort yet to add the indirect-result / context
parameter ABI attributes.  This is just a best-effort first pass.

I also took the opportunity to shift a few files to just implement
their shims header and to demote a few things to be private stdlib
interfaces.
2016-05-04 10:31:23 -07:00
Joe Groff
f7291b21ec Runtime: Build with -fvisibility=hidden.
...and explicitly mark symbols we export, either for use by executables or for runtime-stdlib interaction. Until the stdlib supports resilience we have to allow programs to link to these SPI symbols.
2016-02-08 08:06:02 -08:00
Michael Gottesman
cc1c6f7077 [runtime] Mark some readonly unicode functions as being pure.
These come in two categories of functions:

1. Comparison.
  - swift_stdlib_unicode_compare_utf16_utf16
  - swift_stdlib_unicode_compare_utf8_utf16
  - swift_stdlib_unicode_compare_utf8_utf8

2. Hashing.
  - swift_stdlib_unicode_hash
  - swift_stdlib_unicode_hash_ascii
2016-02-02 15:39:15 -08:00
Zach Panzarino
e3a4147ac9 Update copyright date 2015-12-31 23:28:40 +00:00
Arnold Schwaighofer
eb3d5e4d4a Revert "stdlib: Move the darwin String implementation over to use the ICU library."
Revert "Add test cases to exercise the native String vs cocoa buffer String path."
Revert "stdlib: Add back a test I removed"
Revert "stdlib: Fix hasPrefix,hasSuffix tests"
Revert "stdlib: Add documentation for the cached ascii collation tables"

This reverts commit 31493, 31492, 31491, 31490, 31489.

There are linking errors in SwiftExternalProjects (we probably have to link
against libicucore somewhere).

Swift SVN r31543
2015-08-27 21:02:32 +00:00
Arnold Schwaighofer
502f1e3de1 stdlib: Move the darwin String implementation over to use the ICU library.
Reapply of 31474 with a fix in _compareCocoaBuffer to use the bufferSizeRhs
variable instead of bufferSizeLhs for the right hand side buffer.

We no longer create intermediate NSString copies to compare and hash swift
Strings. Instead we call directly into the ICU library.

I measured a 1.2 to 2x improvement on dictionary benchmarks as a result of this.
The SuperChars benchmark is also about 1.2x faster because of this.

Pure ASCII comparison has gotten a little bit slower (20% on a pure comparison
micro-benchmark) because we no longer do a memcmp. Doing a memcmp on ASCII is
not the same as the default unicode collation. Instead we have to do a string scan.
The default unicode collation does not order like ASCII does and ignores
characters (for example the \0 character).

rdar://18992510

Swift SVN r31489
2015-08-26 15:14:18 +00:00
Arnold Schwaighofer
2d8f29e710 Revert "stdlib: Fix hasPrefix,hasSuffix tests"
Revert "stdlib: Add back a test I removed"
Revert "Add test cases to exercise the native String vs cocoa buffer String path."
Revert "stdlib: Move the darwin String implementation over to use the ICU library."

This reverts commit r31477, r31476, r31475, r31474.

Commit r31474 broke the ASAN build.

Swift SVN r31488
2015-08-26 13:09:03 +00:00
Arnold Schwaighofer
5a25a00d1f stdlib: Move the darwin String implementation over to use the ICU library.
We no longer create intermediate NSString copies to compare and hash swift
Strings. Instead we call directly into the ICU library.

I measured a 1.2 to 2x improvement on dictionary benchmarks as a result of this.
The SuperChars benchmark is also about 1.2x faster because of this.

Pure ASCII comparison has gotten a little bit slower (20% on a pure comparison
micro-benchmark) because we no longer do a memcmp. Doing a memcmp on ASCII is
not the same as the default unicode collation. Instead we have to do a string scan.
The default unicode collation does not order like ASCII does and ignores
characters (for example the \0 character).

rdar://18992510

Swift SVN r31474
2015-08-26 03:36:59 +00:00
Dmitri Hrybenko
350248dae5 Reorganize the directory structure under 'stdlib'
The standard library has grown significantly, and we need a new
directory structure that clearly reflects the role of the APIs, and
allows future growth.

See stdlib/{public,internal,private}/README.txt for more information.

Swift SVN r25876
2015-03-09 05:26:05 +00:00