Mirror of https://github.com/apple/swift.git (synced 2025-12-14 20:36:38 +01:00)
Fix undefined behavior in SmallString.withUTF8
withUTF8 currently vends a typed UInt8 pointer to the underlying SmallString. That pointer type differs from SmallString's representation. It should simply vend a raw pointer, which would be both type safe and convenient for UTF8 data. However, since this method is already @inlinable, I added calls to bindMemory to prevent the optimizer from reasoning about access to the typed pointer that we vend.

rdar://67983613 (Undefined behavior in SmallString.withUTF8 is miscompiled)

Additional commentary:

SmallString creates a situation where there are two types: the in-memory type, (UInt64, UInt64), and the element type, UInt8. `UnsafePointer<T>` specifies the in-memory type of the pointee, because that's how C works. If you want to specify an element type rather than the in-memory type, you need to use something other than UnsafePointer to view the memory. A trivial `BufferView<UInt8>` would be fine, although, frankly, I think UnsafeRawPointer is a perfectly good type on its own for UTF8 bytes.

Unfortunately, a lot of the UTF8 helper code is ABI-exposed, so to work around this, we need to insert calls to bindMemory at strategic points to avoid undefined behavior. This is high-risk and can negatively affect performance. So far, I was able to resolve the regressions in our microbenchmarks just by tweaking the inliner.
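To make the in-memory type versus element type distinction concrete, here is a minimal sketch. It is not the standard library's implementation: `TinyString`, `withRawUTF8`, and `withTypedUTF8` are hypothetical names invented for illustration. The first accessor vends a raw buffer, which makes no claim about the pointee type. The second vends a typed `UInt8` buffer and, in the spirit of the workaround described above, binds the memory to `UInt8` for the duration of the closure and then restores the tuple binding.

```swift
// Hypothetical stand-in for a small-string type: up to 16 UTF-8 code units
// stored inline in two 64-bit words. Not the real SmallString.
struct TinyString {
  var storage: (UInt64, UInt64) = (0, 0)
  var count: Int = 0

  init(_ s: String) {
    let utf8 = Array(s.utf8)
    precondition(utf8.count <= 16, "does not fit in 16 bytes")
    // Write the code units byte-by-byte into the tuple's memory.
    Swift.withUnsafeMutableBytes(of: &storage) { $0.copyBytes(from: utf8) }
    count = utf8.count
  }

  // Raw view: the bytes are exposed without asserting an in-memory type.
  func withRawUTF8<R>(
    _ body: (UnsafeRawBufferPointer) throws -> R
  ) rethrows -> R {
    return try Swift.withUnsafeBytes(of: storage) { raw -> R in
      try body(UnsafeRawBufferPointer(rebasing: raw[0..<count]))
    }
  }

  // Typed view: the memory's type is (UInt64, UInt64), so it is bound to
  // UInt8 before a typed pointer is vended, and bound back afterwards.
  func withTypedUTF8<R>(
    _ body: (UnsafeBufferPointer<UInt8>) throws -> R
  ) rethrows -> R {
    var copy = storage
    let n = count
    return try Swift.withUnsafeMutableBytes(of: &copy) { raw -> R in
      let base = raw.baseAddress!  // non-nil: the tuple occupies 16 bytes
      let typed = base.bindMemory(to: UInt8.self, capacity: n)
      defer {
        // Restore the original binding before the tuple is used again.
        _ = base.bindMemory(to: (UInt64, UInt64).self, capacity: 1)
      }
      return try body(UnsafeBufferPointer(start: UnsafePointer(typed), count: n))
    }
  }
}

let tiny = TinyString("héllo")
let viaRaw = tiny.withRawUTF8 { Array($0) }
let viaTyped = tiny.withTypedUTF8 { Array($0) }
```

Both accessors yield the same bytes. The raw variant needs no rebinding at all, which is the design the message argues for; the typed variant only exists in this sketch to show where the `bindMemory` calls would sit when the typed signature cannot change.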
@@ -250,7 +250,9 @@ extension _StringGuts {
   ) -> Int? {
     #if _runtime(_ObjC)
     // Currently, foreign means NSString
-    if let res = _cocoaStringCopyUTF8(_object.cocoaObject, into: mbp) {
+    if let res = _cocoaStringCopyUTF8(_object.cocoaObject,
+      into: UnsafeMutableRawBufferPointer(start: mbp.baseAddress,
+                                          count: mbp.count)) {
       return res
     }
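The hunk above wraps a typed `UnsafeMutableBufferPointer<UInt8>` in an `UnsafeMutableRawBufferPointer` at the call site. A minimal sketch of that pattern follows, assuming a hypothetical callee `fillWithUTF8(_:into:)` that only needs untyped bytes; it stands in for the real copy routine and is not part of the standard library.

```swift
// Hypothetical callee: copies UTF-8 code units into a raw byte buffer and
// returns the number of bytes written, or nil if the buffer is too small.
func fillWithUTF8(
  _ text: String, into raw: UnsafeMutableRawBufferPointer
) -> Int? {
  let bytes = Array(text.utf8)
  guard bytes.count <= raw.count else { return nil }
  raw.copyBytes(from: bytes)
  return bytes.count
}

var scratch = [UInt8](repeating: 0, count: 8)
let written = scratch.withUnsafeMutableBufferPointer { mbp -> Int? in
  // Same shape as the diff: build a raw view from the typed buffer's base
  // address. For UInt8 elements the element count equals the byte count.
  fillWithUTF8(
    "swift",
    into: UnsafeMutableRawBufferPointer(start: mbp.baseAddress,
                                        count: mbp.count))
}
```

For element types wider than one byte, `count` would have to be a byte count rather than an element count; the `UnsafeMutableRawBufferPointer(_:)` initializer that takes a typed buffer performs that scaling itself.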