Add "single extended grapheme cluster" literals (SEGCL) -- a subset of
double-quoted string literals that contain a single extended grapheme cluster

By default a SEGCL is inferred as String, but it is inferred as Character
when that type is requested explicitly.

Single-quoted literals continue to be inferred as Character.

Actual extended grapheme cluster segmentation is not implemented yet; that
work is tracked by
<rdar://problem/16755123> Implement extended grapheme cluster
segmentation in libSwiftBasic

This is part of
<rdar://problem/16363872> Remove single quoted characters

Swift SVN r17034
Dmitri Hrybenko
2014-04-29 14:08:16 +00:00
parent b337d35e43
commit 669f633070
18 changed files with 218 additions and 61 deletions

lib/Basic/Unicode.cpp

@@ -0,0 +1,28 @@
//===--- Unicode.cpp - Unicode utilities ----------------------------------===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2014 - 2015 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See http://swift.org/LICENSE.txt for license information
// See http://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
//
//===----------------------------------------------------------------------===//

#include "swift/Basic/Unicode.h"
#include "llvm/Support/ConvertUTF.h"

using namespace swift;

StringRef swift::unicode::extractFirstExtendedGraphemeCluster(StringRef S) {
  // FIXME: implement as described in Unicode Standard Annex #29.
  if (S.empty())
    return StringRef();

  // FIXME: deal with broken code unit sequences.
  // For now, just extract the first code point.
  unsigned CodeUnitSeqLen = getNumBytesForUTF8(S[0]);
  return S.slice(0, CodeUnitSeqLen);
}
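
The function above currently returns only the first code point rather than a full extended grapheme cluster. The following is a minimal standalone sketch of that same "first code point" step, using only the C++ standard library so it can be tried outside the Swift tree. The names `utf8SequenceLength` and `extractFirstCodePoint` are hypothetical and not part of this commit; `utf8SequenceLength` is only assumed to play the role that `getNumBytesForUTF8` plays in the real code. It does not validate continuation bytes and does not implement UAX #29 segmentation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the lead-byte length lookup: returns the length
// of a UTF-8 sequence as announced by its first byte. Bytes that cannot
// start a sequence are treated as length 1 (an assumption of this sketch).
static std::size_t utf8SequenceLength(unsigned char Lead) {
  if (Lead < 0x80) return 1;            // 0xxxxxxx: ASCII
  if ((Lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
  if ((Lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
  if ((Lead & 0xF8) == 0xF0) return 4;  // 11110xxx
  return 1;                             // stray continuation or invalid byte
}

// Returns the bytes of the first code point in S. The length is clamped to
// the size of S, so a truncated sequence yields the bytes that are present.
std::string extractFirstCodePoint(const std::string &S) {
  if (S.empty())
    return std::string();
  std::size_t Len = utf8SequenceLength(static_cast<unsigned char>(S[0]));
  return S.substr(0, Len);  // substr clamps Len to the remaining size
}
```

Because the sketch works per code point, a cluster made of a base character followed by a combining mark would still be split after the base character, which is the limitation the FIXME comments in the commit describe.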