swift-mirror

mirror of https://github.com/apple/swift.git synced 2025-12-14 20:36:38 +01:00

Author	SHA1	Message	Date
Michael Ilseman	75463e30f3	[stdlib] Rename _StringCore to _LegacyStringCore. NFC. In grand LLVM tradition, the first step to redesigning _StringCore is to first rename it to _LegacyStringCore. Subsequent commits will introduce the replacement, and eventually all uses of the old one will be moved to the new one. NFC.	2018-01-21 12:28:56 -08:00
Ben Cohen	4ddac3fbbd	[stdlib] Eradicate IndexDistance associated type (#12641) * Eradicate IndexDistance associated type, replacing with Int everywhere * Consistently use Int for ExistentialCollection’s IndexDistance type. * Fix test for IndexDistance removal * Remove a handful of no-longer-needed explicit types * Add compatibility shims for non-Int index distances * Test compatibility shim * Move IndexDistance typealias into the Collection protocol	2017-12-08 12:00:23 -08:00
Ben Cohen	dcab9493ae	Removed some warnings (#12753)	2017-11-30 15:12:56 -08:00
Max Moiseev	a24998a5b1	[stdlib] Add missing @_fixed_layout attributes to fix resilience build	2017-10-02 15:19:06 -07:00
Max Moiseev	ef6b5c4795	Add missing @_inlineable attributes and deinits	2017-09-29 11:26:56 -07:00
Max Moiseev	53b8419279	[stdlib] Make all the stdlib APIs @_inlineable This change in theory should allow us to remove a special stdlib-only sil-serialize-all compilation mode. <rdar://problem/34138683>	2017-09-29 11:26:56 -07:00
swift-ci	79a3f9c415	Merge pull request #11670 from natecook1000/nc-rev-77-2	2017-09-19 10:15:59 -07:00
Nate Cook	050268d876	[stdlib] Documentation revisions - Update NSRange -> Range guidance - Fix example in Optional - Improve RangeExpression docs - Fix issue in UnsafeRawBufferPointer.initializeMemory - Code point -> scalar value most places - Reposition the dot above the scripty `i' - Fix ExpressibleByArrayLiteral code sample	2017-08-29 09:41:55 -05:00
Maxim Moiseev	ee5fb33656	[stdlib] Remove the Grand Renaming artifacts of Swift 3 era	2017-08-28 15:54:11 -07:00
Michael Ilseman	7c705c3a75	[stdlib] Deprecate String/Substring.CharacterView CharacterView is now entirely redundant in Swift 4. Deprecate its use. This also allows us to schedule the unbreaking of String.CharacterView leakiness without a hard source break.	2017-08-10 17:24:06 -07:00
Dave Abrahams	9159239995	Un-revert "[stdlib] String index interchange, etc." (#10812) I failed to merge the upstream changes to swift-corelibs-foundation at the same time as I merged that #9806, and it broke on linux. Going to get it right this time.	2017-07-07 12:13:25 -07:00
Xi Ge	d9fb110674	Revert "[stdlib] String index interchange, etc." (#10812) rdar://33186295	2017-07-07 12:03:16 -07:00
Dave Abrahams	e523c80339	[stdlib] Index interchange, part I	2017-07-07 00:59:04 -07:00
Michael Ilseman	5bc20cba08	[stdlib] Clean up non-contiguous string grapheme breaking code. Removes the legacy grapheme breaking code paths. Simplifies and clarifies the non-contiguous grapheme breaking code through consistent naming and handling of absolute positions vs relative offsets.	2017-06-28 15:46:44 -07:00
Michael Ilseman	b3b28e0c50	[gardening] 80 columns; NFC	2017-06-28 15:46:39 -07:00
Michael Ilseman	a37a823e6e	[stdlib] Update non-contiguous NSStrings to Unicode 9 This adds Unicode 9 grapheme breaking support for non-contiguous NSStrings. Non-contiguous NSStrings that don't hit our fast paths are very rare, but should still behave identically to contiguous strings. We first copy a fixed number of code units into a fixed size buffer (currently 16 in size) and try to grapheme break inside of that buffer. This is sufficient storage for all known non-pathological graphemes. Any graphemes larger than the buffer are handled by copying larger portions of the string into an Array. Test cases added, including pathological "zalgo" text that stresses extremely long graphemes.	2017-06-28 15:35:25 -07:00
Michael Ilseman	4c0ba61e53	[gardening] Remove done TODO comments	2017-06-27 20:37:16 -07:00
Michael Ilseman	bd5189c25a	[String] Grapheme fast paths for punctuation: 5-8x speedup. Many strings use non-sub-300 punctuation characters (e.g. unicode hyphen, CJK quotes, etc). This can cause switching between fast and slow paths for grapheme breaking. Add in fast-paths for general punctuation characters and CJK punctuation and symbol characters. This results in about a 5-8x speedup for heavily (unicode) punctuated Latiny and CJKy workloads.	2017-06-27 19:18:51 -07:00
Nate Cook	825e9d077d	[stdlib] More documentation revisions / consistency fixes.	2017-06-13 14:08:00 -05:00
Dave Abrahams	562fd79aa6	[stdlib] Encode small Characters as UTF-16 This takes care of the standard library portion, but we need a new BuiltinUTF16ExtendedGraphemeClusterLiteralConvertible protocol in order to fully recover the performance of character literals. Note that part of the character_literals.swift test is currently disabled. That will need to be fixed before we can merge this work.	2017-06-01 20:57:25 -07:00
Michael Ilseman	44cccba22d	[stdlib] Change dynamic check to sanity check. Double-checking for CR-LF is redundant in _internalExtraCheckGraphemeBreakBetween. Add in a sanity check and omit the overly conservative CR check.	2017-05-31 14:55:24 -07:00
Michael Ilseman	0a88de53d3	[stdlib] Grapheme break fast-paths for Cyrillic, Arabic, Hangul Add in more grapheme break fast paths for scripts based on Cyrillic, Arabic, or Hangul. Generates significant performance wins, similar to those for the unihan fast paths. While every extra check does slow down the runtime of _internalExtraCheckGraphemeBreakBetween as currently implemented, I've not found the performance cost to be relevant for workloads with occasional mixed emoji contents, nor for workloads that hit the earlier checks. A pure Korean workload (currently the last check) does pay a rather noticeable price for the previous checks, but this is only because the workload is now so greatly improved. Optimizing this implementation is interesting future work, but not urgent.	2017-05-31 11:09:43 -07:00
Dave Abrahams	801b9c5544	[stdlib] Move specialization from init to append Since init just calls append anyway, it's 2 birds/1 stone	2017-05-24 16:10:34 -07:00
Dave Abrahams	794a287c27	Kill a stray TAB How'd that get in there? Thanks, @moiseev	2017-05-24 04:10:25 -07:00
Dave Abrahams	3d789cff2d	Inlineable character fast paths	2017-05-23 01:42:28 -07:00
swift-ci	c0623c42ce	Merge pull request #9722 from apple/stringprotocol-interchange	2017-05-17 19:13:25 -07:00
Dave Abrahams	d6fee05375	[stdlib] Enable interchange among StringProtocol models	2017-05-17 17:21:43 -07:00
Michael Ilseman	97511d65bf	[stdlib] Unicode 9 here we come: use ICU for grapheme breaking Use UBreakIterators to perform grapheme breaking. This gives Unicode 9 grapheme breaking (e.g. family emoji) and provides a means to upgrade to future versions. It also serves as a model for how to serve up other advanced functionality in ICU to users. This has tricky performance implications. Some things are faster and a number of cases are slower. But, careful use of ICU can help mitigate and amortize these costs. In conjunction with more early detection of fast paths, overall grapheme breaking for the average user should be much faster than in Swift 3. NOTE: This is incomplete. It currently falls back on the legacy tries for some bridged strings. There are many potential directions for a general solution, but for now we'll be iteratively adding support for more and more special cases.	2017-05-16 20:29:21 -07:00
practicalswift	aae419ad30	[gardening] Fix word processing artefacts	2017-05-15 11:30:25 +02:00
Michael Ilseman	fb5734c24f	Merge pull request #9575 from milseman/unihan_fasterhan [stdlib] String: Walk Chinese/Japanese faster: 2x/4x forwards/backwards	2017-05-14 13:50:15 -07:00
Ben Cohen	ea2f64cad2	[stdlib] Add Sequence.Element, change ExpressibleByArrayLiteral.Element to ArrayLiteralElement (#8990) * Give Sequence a top-level Element, constrain Iterator to match * Remove many instances of Iterator. * Fixed various hard-coded tests * XFAIL a few tests that need further investigation * Change assoc type for arrayLiteralConvertible * Mop up remaining "better expressed as a where clause" warnings * Fix UnicodeDecoders prototype test * Fix UIntBuffer * Fix hard-coded Element identifier in CSDiag * Fix up more tests * Account for flatMap changes	2017-05-14 06:33:25 -07:00
Nate Cook	f650e0a7da	[stdlib] String and range expressions * finish string documentation revisions * revise examples throughout to use range expressions instead of e.g. prefix(upTo: _)	2017-05-13 10:06:12 -05:00
Michael Ilseman	f08ee0fd93	[stdlib] Walk Chinese/Japanese faster: 2x/4x forwards/backwards This adds more fast path checks for grapheme breaks between BMP scalars. Notably the rather vast range of 0x3400–0xA4CF which includes unified common Han ideographs as well as the first extension to unified Han ideographs. It also happens to pick up various Yijing and Yi symbols/radicals. Additionally, the narrow hiragana/katakana ranges 0x3041-0x3096 and 0x30A1-0x30FA (including pre-composed semi-voiced characters but excluding the combining semi-voice marks) have fast paths. The net effect is that the vast majority of modern Chinese and Japanese text should be fast-pathed. This is especially important, as adopting Unicode 9 might otherwise pessimize performance here relative to the tries.	2017-05-12 13:13:33 -07:00
Dave Abrahams	ddf7ad517f	UnicodeScalar => Unicode.Scalar	2017-05-11 15:23:25 -07:00
Michael Ilseman	f0abff5539	Revert "Merge pull request #9265 from milseman/tls_ftw" This reverts commit `26f7659efe`, reversing changes made to `7b927e55e8`.	2017-05-11 10:39:58 -07:00
Michael Ilseman	18104c616c	[stdlib] Unicode 9 here we come: use ICU for grapheme breaking Use UBreakIterators to perform grapheme breaking. This gives Unicode 9 grapheme breaking (e.g. family emoji) and provides a means to upgrade to future versions. It also serves as a model for how to serve up other advanced functionality in ICU to users. This has tricky performance implications. Some things are faster and a number of cases are slower. But, careful use of ICU can help mitigate and amortize these costs. In conjunction with more early detection of fast paths, overall grapheme breaking for the average user should be much faster than in Swift 3. NOTE: This is incomplete. It currently falls back on the legacy tries for some bridged strings. There are many potential directions for a general solution, but for now we'll be iteratively adding support for more and more special cases.	2017-05-10 15:21:08 -07:00
Max Moiseev	178b9f0b44	[stdlib] Adding bounds check in a.subscript(Index) fast path UnsafeBufferPointer subscript used in the fast path only checks bounds in Debug mode, therefore extra checks are needed. Addresses: <rdar://problem/31992473>	2017-05-05 15:26:24 -07:00
Michael Ilseman	47d0247476	[stdlib] Speed up Character construction from CharacterView.subscript (#9252) This adds a fast path for single-code-unit Character construction. Rather than use the general purpose String based initializer (which then repeats grapheme breaking to ensure a trap, amongst other inefficiencies), just make the Character from the single unicode scalar value directly. This also speeds up simple iteration of BMP strings when the optimizer is unable to eliminate the subscript. Around 2x for ASCII, and around 20% for BMP UTF16.	2017-05-04 06:59:30 -07:00
Ben Cohen	1163ea7c7a	[stdlib] swapAt method (#9119) * Add swapAt method * Migrate sorting to swapAt * Migrate further stdlib usage	2017-04-29 11:55:00 -07:00
Michael Ilseman	fa61a665c5	[stdlib] Fix relative-offset computing bug in grapheme breaking. Many of StringCore private APIs, when the StringCore is itself a substring, expect relative offsets rather than absolute offsets. This fixes a bug in the sub-0x300 fast path where we were using absolute offsets. Test cases added.	2017-04-25 16:51:35 -07:00
Michael Ilseman	2d8164e552	[stdlib] Parse my tweets faster! 2x forwards, 3x reverse Adds in a special case grapheme break detection between two values within scalar ranges where we have special knowledge. Any sub-0x300 scalars, except CR-LF, are guaranteed to have grapheme breaks between them. We're reasonably confident this will not change in future versions of Unicode. We might add more ranges in the future, but should do so conservatively, anticipating future Unicode changes. In these cases we can very quickly break, even for strings that have mixed latin and emoji characters. In an ASCII string with a single emoji in it, we traverse the string 2x faster forwards and 3x faster in reverse. (Reverse is 3x faster as it involves some forwards traversal inside of the index). For a string that's half Latin half non-Latin, we're about 1.5x faster forwards and backwards.	2017-04-24 15:17:25 -07:00
Michael Ilseman	87e0076272	[stdlib] un-_fastPath ASCII path, it implies coldness of other branch By removing the _fastPath inside of grapheme breaking, we avoid cooling off all the other blocks in the function, restoring performance parity with before the ASCII special case logic. This does not meaningfully impact the performance of the special case, so we are still about 2-3x faster iterating forwards and backwards over ASCII characters than before.	2017-04-20 16:53:53 -07:00
Michael Ilseman	8b5777fdd2	[stdlib] Bug fix in reverse ASCII grapheme breaking Fix a bug using wrong index calculations in the ASCII grapheme breaking fast path. Add new test case.	2017-04-20 16:22:18 -07:00
Michael Ilseman	8ff9bb602f	[stdlib] Speed up char iteration on ASCII strings. By doing a fast check for CR-LF, we can speed up forwards and backwards character iteration on ASCII strings by ~2-3x. This speedup can be seen in the stringTests suite (previous PR). It is done as a fast path inside of _measureExtendedGraphemeClusterForward/Backward, which is never inlined. Doing the fast path inline would yield even more dramatic improvements, and might be an area for future exploration, but would slightly pessimize non-ASCII strings due to code bloat. Additionally, the current implementation has not been micro-optimized yet. It's possible that more optimal code would be smaller and have less of an impact on the general case.	2017-04-20 16:22:18 -07:00
practicalswift	6d1ae2a39c	[gardening] 2016 → 2017	2017-01-06 16:41:22 +01:00
practicalswift	797b80765f	[gardening] Use the correct base URL (https://swift.org) in references to the Swift website Remove all references to the old non-TLS enabled base URL (http://swift.org)	2016-11-20 17:36:03 +01:00
Nate Cook	20ce99a255	[stdlib] Fix example in String.CharacterView discussion	2016-11-14 15:43:50 -06:00
Nate Cook	bd6025f463	[stdlib] Various documentation fixes - Fix incorrect type in Float(_:String) examples - Expand discussions for ExpressibleBy_Literal protocols - Add notes about non-escaping unsafe pointers from closures - Add note about isEmpty to Collection.count discussions - Describe imported `Bool` types - Clean up some floating point discussions - Provide some additional operator documentation - Revise documentation for CVarArg functions - Fix incorrect Set method parameter descriptions - Clarify array bridging behavior - Add collection subscript complexity notes	2016-11-11 11:23:49 -06:00
Nate Cook	c2bc72d9d6	[stdlib] Fix string index sharing (#4896) * [stdlib] Fix String.UTF16View index sharing * [stdlib] Fix String.UnicodeScalarView index sharing * [stdlib] Fix String.CharacterView index sharing * [stdlib] Test advancing string indices past their ends * [stdlib] Simplify CharacterView ranged subscript	2016-10-13 10:19:38 -07:00
airspeedswift	ed5231b47c	Numbered all FIXME(ABI) entries for tracking purposes. (#4868)	2016-09-19 16:41:41 -07:00


165 Commits