Remove ByteTree serialization format

It was originally designed for faster transmission of syntax trees from
C++ to SwiftSyntax, but was superseded by the CLibParseActions. There's no
deserializer for it anymore, so let's just remove it.
Alex Hoppen
2021-01-14 20:37:49 +01:00
parent fb4583f7e7
commit 8ec8516893
17 changed files with 67 additions and 853 deletions


@@ -1,32 +0,0 @@
# ByteTree
The ByteTree format is a binary format to efficiently serialize and deserialize trees. It was designed to serialize the syntax tree in `libSyntax` but the framework allows serialisation of arbitrary trees. It currently offers a serialiser written in C++ and a deserialiser written in Swift.
## Overview
The ByteTree format consists of two different constructs: *objects* and *scalars*. A scalar is a raw sequence of binary data. Scalars are untyped and the meaning of their binary data needs to be inferred by the client based on their position in the tree. An object consists of multiple *fields*, indexed by their position within the object, which again can be either objects or scalars.
## Serialization of scalars
A scalar is encoded as its size followed by the data. Size is a `uint32_t` that represents the size of the data in bytes, stored in little-endian order. It always has its most significant bit set to 0 (to distinguish objects from scalars, see *Forward compatibility*).
For example, the string "Hello World" would be encoded as `(uint32_t)11` `"Hello World"`, or in hex `0B 00 00 00 48 65 6C 6C 6F 20 57 6F 72 6C 64`.
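The scalar layout described above can be sketched in a few lines of C++. This is a minimal illustration, not the removed implementation: `appendUInt32LE` and `encodeScalar` are hypothetical helper names, and the payload is assumed to be a plain `std::string`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of the removed header): append a 32-bit
// value to `out` in little-endian byte order.
void appendUInt32LE(std::vector<uint8_t> &out, uint32_t value) {
  for (int shift = 0; shift < 32; shift += 8)
    out.push_back(static_cast<uint8_t>((value >> shift) & 0xFF));
}

// Encode `data` as a ByteTree scalar: a 4-byte size followed by the raw bytes.
std::vector<uint8_t> encodeScalar(const std::string &data) {
  // The most significant bit of the size must stay 0 for scalars.
  assert(data.size() < (static_cast<uint64_t>(1) << 31) && "scalar too large");
  std::vector<uint8_t> out;
  appendUInt32LE(out, static_cast<uint32_t>(data.size()));
  out.insert(out.end(), data.begin(), data.end());
  return out;
}
```

Calling `encodeScalar("Hello World")` should yield the 15 bytes listed in the hex dump above.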
## Serialization of objects
An object consists of its size, measured in the number of fields and represented as a `uint32_t` in little-endian order, followed by the direct concatenation of its fields. Because each field is again prefixed with its size, no delimiters are necessary between the fields.
To distinguish scalars and objects, the size of objects has its most-significant bit set to 1. It must be ignored to retrieve the number of fields in the object.
Arrays are modelled as objects whose fields are all of the same type and whose length is variable (and is indicated by the object's size).
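As a rough sketch of the object layout, the following C++ encodes an array of strings as an object whose fields are scalars. All function names here are hypothetical and chosen for illustration; only the byte layout follows the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Bit 31 of the size marks the construct as an object rather than a scalar.
constexpr uint32_t kObjectFlag = static_cast<uint32_t>(1) << 31;

// Hypothetical helper: append a 32-bit value in little-endian byte order.
void appendUInt32LE(std::vector<uint8_t> &out, uint32_t value) {
  for (int shift = 0; shift < 32; shift += 8)
    out.push_back(static_cast<uint8_t>((value >> shift) & 0xFF));
}

// Write an object header: the field count with the most significant bit set.
void appendObjectHeader(std::vector<uint8_t> &out, uint32_t numFields) {
  assert((numFields & kObjectFlag) == 0 && "too many fields");
  appendUInt32LE(out, numFields | kObjectFlag);
}

// Write a scalar: the byte size (most significant bit clear), then the data.
void appendScalar(std::vector<uint8_t> &out, const std::string &data) {
  assert(data.size() < kObjectFlag && "scalar too large");
  appendUInt32LE(out, static_cast<uint32_t>(data.size()));
  out.insert(out.end(), data.begin(), data.end());
}

// An array is an object whose fields all have the same type.
std::vector<uint8_t> encodeStringArray(const std::vector<std::string> &items) {
  std::vector<uint8_t> out;
  appendObjectHeader(out, static_cast<uint32_t>(items.size()));
  for (const std::string &item : items)
    appendScalar(out, item);
  return out;
}
```

Because every field carries its own size prefix, the fields are simply concatenated after the header with no separators.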
## Versioning
The ByteTree format is prepended by a 4-byte protocol version number that describes the version of the object tree that was serialized. Its exact semantics are up to each specific application, but it is encouraged to interpret it as a two-component number where the first component, consisting of the three most significant bytes, is incremented for breaking changes and the last byte is incremented for backwards-compatible changes.
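The suggested two-component interpretation of the version number could look like the sketch below. The names `VersionParts`, `packVersion`, `unpackVersion`, and `canRead` are assumptions made for this example, and the compatibility rule in `canRead` is one possible policy, since the format leaves the exact semantics to each application.

```cpp
#include <cassert>
#include <cstdint>

// The two components of a version number under the suggested interpretation.
struct VersionParts {
  uint32_t breaking;   // upper three bytes, bumped for breaking changes
  uint8_t compatible;  // lowest byte, bumped for backwards-compatible changes
};

// Combine both components into the 4-byte protocol version.
uint32_t packVersion(uint32_t breaking, uint8_t compatible) {
  assert(breaking < (static_cast<uint32_t>(1) << 24) &&
         "breaking component must fit in three bytes");
  return (breaking << 8) | compatible;
}

// Split a 4-byte protocol version back into its components.
VersionParts unpackVersion(uint32_t version) {
  return {version >> 8, static_cast<uint8_t>(version & 0xFF)};
}

// Assumed policy: a reader accepts data whose breaking component matches its
// own, regardless of the compatible component.
bool canRead(uint32_t readerVersion, uint32_t dataVersion) {
  return unpackVersion(readerVersion).breaking ==
         unpackVersion(dataVersion).breaking;
}
```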
## Forward compatibility
Fields may be added to the end of objects in a backwards compatible manner (older deserialisers will still be able to deserialise the format). An older deserialiser achieves this by skipping over all fields that it does not read during deserialisation. Newer versions of the deserialiser can detect if recently added fields are not present in the serialised data by inspecting the `numFields` property passed during deserialisation.
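Skipping unread fields relies only on the size prefixes and the object/scalar bit. The sketch below shows one way a reader could step over a construct of unknown shape; it is written in C++ for illustration and is not the Swift deserialiser. `readUInt32LE` and `skipConstruct` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: read a little-endian 32-bit value at `offset`.
uint32_t readUInt32LE(const std::vector<uint8_t> &bytes, size_t offset) {
  assert(offset + 4 <= bytes.size() && "truncated size prefix");
  uint32_t value = 0;
  for (int i = 0; i < 4; ++i)
    value |= static_cast<uint32_t>(bytes[offset + i]) << (8 * i);
  return value;
}

// Return the offset just past the construct (scalar or object) that starts
// at `offset`, without interpreting its contents.
size_t skipConstruct(const std::vector<uint8_t> &bytes, size_t offset) {
  const uint32_t header = readUInt32LE(bytes, offset);
  offset += 4;
  const uint32_t count = header & 0x7FFFFFFFu;
  if ((header & 0x80000000u) == 0) {
    // Scalar: `count` is the payload size in bytes.
    assert(offset + count <= bytes.size() && "truncated scalar");
    return offset + count;
  }
  // Object: `count` is the number of fields; skip each one in turn.
  for (uint32_t i = 0; i < count; ++i)
    offset = skipConstruct(bytes, offset);
  return offset;
}
```

A reader that knows only the first `k` fields of an object can read those and then call a routine like `skipConstruct` once for each remaining field.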
## Serialization safety
Since all fields in objects are accessed by their index, issues quickly arise if a new field is accidentally added at the beginning of an object. To prevent issues like this, the ByteTree serialiser and deserialiser require the explicit specification of each field's index within the object. These indices are never serialised. Their sole purpose is to check that all fields are read in the correct order in assertion builds.


@@ -86,9 +86,6 @@ documentation, please create a thread on the Swift forums under the
## Explanations
- [ByteTree.md](/docs/ByteTree.md):
Describes the ByteTree binary format used for serializing syntax trees
in `libSyntax`.
- [WebAssembly.md](/docs/WebAssembly.md):
Explains some decisions that were made while implementing the WebAssembly target.


@@ -1,363 +0,0 @@
//===--- ByteTreeSerialization.h - ByteTree serialization -------*- C++ -*-===//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2014 - 2018 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
//
//===----------------------------------------------------------------------===//
//
/// \file
/// Provides an interface for serializing an object tree to a custom
/// binary format called ByteTree.
///
//===----------------------------------------------------------------------===//
#ifndef SWIFT_BASIC_BYTETREESERIALIZATION_H
#define SWIFT_BASIC_BYTETREESERIALIZATION_H
#include "llvm/Support/BinaryStreamError.h"
#include "llvm/Support/BinaryStreamWriter.h"
#include "swift/Basic/ExponentialGrowthAppendingBinaryByteStream.h"
#include <map>
namespace {
// Only used by compiler if both template types are the same
template <typename T, T>
struct SameType;
} // anonymous namespace
namespace swift {
namespace byteTree {
class ByteTreeWriter;
using UserInfoMap = std::map<void *, void *>;
/// Add a template specialization of \c ObjectTraits for any type that
/// serializes as an object consisting of multiple fields.
template <class T>
struct ObjectTraits {
// Must provide:
/// Return the number of fields that will be written in \c write when
/// \p Object gets serialized. \p UserInfo can contain arbitrary values that
/// can modify the serialization behaviour and gets passed down from the
/// serialization invocation.
// static unsigned numFields(const T &Object, UserInfoMap &UserInfo);
/// Serialize \p Object by calling \c Writer.write for all the fields of
/// \p Object. \p UserInfo can contain arbitrary values that can modify the
/// serialization behaviour and gets passed down from the serialization
/// invocation.
// static void write(ByteTreeWriter &Writer, const T &Object,
// UserInfoMap &UserInfo);
};
/// Add a template specialization of \c ScalarTraits for any type that
/// serializes into a raw set of bytes.
template <class T>
struct ScalarTraits {
// Must provide:
/// Return the number of bytes the serialized format of \p Value will take up.
// static unsigned size(const T &Value);
/// Serialize \p Value by writing its binary format into \p Writer. Any errors
/// that may be returned by \p Writer can be returned by this function and
/// will be handled at the call site.
// static llvm::Error write(llvm::BinaryStreamWriter &Writer, const T &Value);
};
/// Add a template specialization of \c DirectlyEncodable for any type whose
/// serialized form is equal to its binary representation on the serializing
/// machine.
template <class T>
struct DirectlyEncodable {
// Must provide:
// static bool const value = true;
};
/// Add a template specialization of \c WrapperTypeTraits for any type that
/// serializes as a type that already has a specialization of \c ScalarTraits.
/// This will typically be useful for types like enums that have a 1-to-1
/// mapping to e.g. an integer.
template <class T>
struct WrapperTypeTraits {
// Must provide:
/// Write the serializable representation of \p Value to \p Writer. This will
/// typically take the form \c Writer.write(convertedValue(Value), Index)
/// where \c convertedValue has to be defined.
// static void write(ByteTreeWriter &Writer, const T &Value, unsigned Index);
};
// Test if ObjectTraits<T> is defined on type T.
template <class T>
struct has_ObjectTraits {
using Signature_numFields = unsigned (*)(const T &, UserInfoMap &UserInfo);
using Signature_write = void (*)(ByteTreeWriter &Writer, const T &Object,
UserInfoMap &UserInfo);
template <typename U>
static char test(SameType<Signature_numFields, &U::numFields> *,
SameType<Signature_write, &U::write> *);
template <typename U>
static double test(...);
public:
static bool const value =
(sizeof(test<ObjectTraits<T>>(nullptr, nullptr)) == 1);
};
// Test if ScalarTraits<T> is defined on type T.
template <class T>
struct has_ScalarTraits {
using Signature_size = unsigned (*)(const T &Object);
using Signature_write = llvm::Error (*)(llvm::BinaryStreamWriter &Writer,
const T &Object);
template <typename U>
static char test(SameType<Signature_size, &U::size> *,
SameType<Signature_write, &U::write> *);
template <typename U>
static double test(...);
public:
static bool const value =
(sizeof(test<ScalarTraits<T>>(nullptr, nullptr)) == 1);
};
// Test if WrapperTypeTraits<T> is defined on type T.
template <class T>
struct has_WrapperTypeTraits {
using Signature_write = void (*)(ByteTreeWriter &Writer, const T &Object,
unsigned Index);
template <typename U>
static char test(SameType<Signature_write, &U::write> *);
template <typename U>
static double test(...);
public:
static bool const value = (sizeof(test<WrapperTypeTraits<T>>(nullptr)) == 1);
};
class ByteTreeWriter {
private:
/// The writer to which the binary data is written.
llvm::BinaryStreamWriter &StreamWriter;
/// The underlying stream of the StreamWriter. We need this reference so that
/// we can call \c ExponentialGrowthAppendingBinaryByteStream.writeInteger
/// which is more efficient than the generic \c writeBytes of
/// \c llvm::BinaryStreamWriter since it avoids the arbitrary size memcopy.
ExponentialGrowthAppendingBinaryByteStream &Stream;
/// The number of fields this object contains. \c UINT_MAX if it has not been
/// set yet. No member may be written to the object if expected number of
/// fields has not been set yet.
unsigned NumFields = UINT_MAX;
/// The index of the next field to write. Used in assertion builds to keep
/// track that no indices are skipped and that the object contains the
/// expected number of fields.
unsigned CurrentFieldIndex = 0;
UserInfoMap &UserInfo;
/// The \c ByteTreeWriter can only be constructed internally. Use
/// \c ByteTreeWriter.write to serialize a new object.
/// \p Stream must be the underlying stream of \p StreamWriter.
ByteTreeWriter(ExponentialGrowthAppendingBinaryByteStream &Stream,
llvm::BinaryStreamWriter &StreamWriter, UserInfoMap &UserInfo)
: StreamWriter(StreamWriter), Stream(Stream), UserInfo(UserInfo) {}
/// Write the given value to the ByteTree in little-endian byte order.
template <typename T>
llvm::Error writeInteger(T Value) {
auto Error = Stream.writeInteger(StreamWriter.getOffset(), Value);
StreamWriter.setOffset(StreamWriter.getOffset() + sizeof(T));
return Error;
}
/// Set the number of fields the object written by this writer is expected
/// to have.
void setNumFields(uint32_t NumFields) {
assert(NumFields != UINT_MAX &&
"NumFields may not be reset since it has already been written to "
"the byte stream");
assert((this->NumFields == UINT_MAX) && "NumFields has already been set");
// Num fields cannot exceed (1 << 31) since it would otherwise interfere
// with the bitflag that indicates if the next construct in the tree is an
// object or a scalar.
assert((NumFields & ((uint32_t)1 << 31)) == 0 && "Field size too large");
// Set the most significant bit to indicate that the next construct is an
// object and not a scalar.
uint32_t ToWrite = NumFields | ((uint32_t)1 << 31);
auto Error = writeInteger(ToWrite);
(void)Error;
assert(!Error);
this->NumFields = NumFields;
}
/// Validate that \p Index is the next field that is expected to be written,
/// does not exceed the number of fields in this object and that
/// \c setNumFields has already been called.
void validateAndIncreaseFieldIndex(unsigned Index) {
assert((NumFields != UINT_MAX) &&
"setNumFields must be called before writing any value");
assert(Index == CurrentFieldIndex && "Writing index out of order");
assert(Index < NumFields &&
"Writing more fields than object is expected to have");
CurrentFieldIndex++;
}
~ByteTreeWriter() {
assert(CurrentFieldIndex == NumFields &&
"Object had more or less elements than specified");
}
public:
/// Write a binary serialization of \p Object to \p StreamWriter, prefixing
/// the stream by the specified ProtocolVersion.
template <typename T>
typename std::enable_if<has_ObjectTraits<T>::value, void>::type
static write(ExponentialGrowthAppendingBinaryByteStream &Stream,
uint32_t ProtocolVersion, const T &Object,
UserInfoMap &UserInfo) {
llvm::BinaryStreamWriter StreamWriter(Stream);
ByteTreeWriter Writer(Stream, StreamWriter, UserInfo);
auto Error = Writer.writeInteger(ProtocolVersion);
(void)Error;
assert(!Error);
// There always is one root. We need to set NumFields so that index
// validation succeeds, but we don't want to serialize this.
Writer.NumFields = 1;
Writer.write(Object, /*Index=*/0);
}
template <typename T>
typename std::enable_if<has_ObjectTraits<T>::value, void>::type
write(const T &Object, unsigned Index) {
validateAndIncreaseFieldIndex(Index);
auto ObjectWriter = ByteTreeWriter(Stream, StreamWriter, UserInfo);
ObjectWriter.setNumFields(ObjectTraits<T>::numFields(Object, UserInfo));
ObjectTraits<T>::write(ObjectWriter, Object, UserInfo);
}
template <typename T>
typename std::enable_if<has_ScalarTraits<T>::value, void>::type
write(const T &Value, unsigned Index) {
validateAndIncreaseFieldIndex(Index);
uint32_t ValueSize = ScalarTraits<T>::size(Value);
// Size cannot exceed (1 << 31) since it would otherwise interfere with the
// bitflag that indicates if the next construct in the tree is an object
// or a scalar.
assert((ValueSize & ((uint32_t)1 << 31)) == 0 && "Value size too large");
auto SizeError = writeInteger(ValueSize);
(void)SizeError;
assert(!SizeError);
auto StartOffset = StreamWriter.getOffset();
auto ContentError = ScalarTraits<T>::write(StreamWriter, Value);
(void)ContentError;
assert(!ContentError);
(void)StartOffset;
assert((StreamWriter.getOffset() - StartOffset == ValueSize) &&
"Number of written bytes does not match size returned by "
"ScalarTraits<T>::size");
}
template <typename T>
typename std::enable_if<DirectlyEncodable<T>::value, void>::type
write(const T &Value, unsigned Index) {
validateAndIncreaseFieldIndex(Index);
uint32_t ValueSize = sizeof(T);
auto SizeError = writeInteger(ValueSize);
(void)SizeError;
assert(!SizeError);
auto ContentError = writeInteger(Value);
(void)ContentError;
assert(!ContentError);
}
template <typename T>
typename std::enable_if<has_WrapperTypeTraits<T>::value, void>::type
write(const T &Value, unsigned Index) {
auto LengthBeforeWrite = CurrentFieldIndex;
WrapperTypeTraits<T>::write(*this, Value, Index);
(void)LengthBeforeWrite;
assert(CurrentFieldIndex == LengthBeforeWrite + 1 &&
"WrapperTypeTraits did not call BinaryWriter.write");
}
};
// Define serialization schemes for common types
template <>
struct DirectlyEncodable<uint8_t> {
static bool const value = true;
};
template <>
struct DirectlyEncodable<uint16_t> {
static bool const value = true;
};
template <>
struct DirectlyEncodable<uint32_t> {
static bool const value = true;
};
template <>
struct WrapperTypeTraits<bool> {
static void write(ByteTreeWriter &Writer, const bool &Value,
unsigned Index) {
Writer.write(static_cast<uint8_t>(Value), Index);
}
};
template <>
struct ScalarTraits<llvm::StringRef> {
static unsigned size(const llvm::StringRef &Str) { return Str.size(); }
static llvm::Error write(llvm::BinaryStreamWriter &Writer,
const llvm::StringRef &Str) {
return Writer.writeFixedString(Str);
}
};
template <>
struct ObjectTraits<llvm::NoneType> {
// Serialize llvm::None as an object without any elements
static unsigned numFields(const llvm::NoneType &Object,
UserInfoMap &UserInfo) {
return 0;
}
static void write(ByteTreeWriter &Writer, const llvm::NoneType &Object,
UserInfoMap &UserInfo) {
// Nothing to write
}
};
} // end namespace byteTree
} // end namespace swift
#endif


@@ -18,6 +18,7 @@
#include "swift/Parse/ParsedTrivia.h"
#include "swift/Parse/Token.h"
#include "swift/Syntax/SyntaxKind.h"
#include "llvm/Support/Debug.h"
namespace swift {


@@ -18,7 +18,6 @@
#ifndef SWIFT_SYNTAX_SERIALIZATION_SYNTAXSERIALIZATION_H
#define SWIFT_SYNTAX_SERIALIZATION_SYNTAXSERIALIZATION_H
#include "swift/Basic/ByteTreeSerialization.h"
#include "swift/Basic/JSONSerialization.h"
#include "swift/Basic/StringExtras.h"
#include "swift/Syntax/RawSyntax.h"
@@ -202,192 +201,13 @@ struct NullableTraits<RC<syntax::RawSyntax>> {
};
} // end namespace json
namespace byteTree {
namespace serialization {
/// Increase the major version for every change that is not just adding a new
/// field at the end of an object. Older swiftSyntax clients will no longer be
/// able to deserialize the format.
const uint16_t SYNTAX_TREE_VERSION_MAJOR = 1; // Last change: initial version
/// Increase the minor version if only a new field has been added at the end of
/// an object. Older swiftSyntax clients will still be able to deserialize the
/// format.
const uint8_t SYNTAX_TREE_VERSION_MINOR = 0; // Last change: initial version
uint16_t getNumericValue(syntax::SyntaxKind Kind);
uint8_t getNumericValue(syntax::TriviaKind Kind);
uint8_t getNumericValue(tok Value);
// Combine the major and minor version into one. The first three bytes
// represent the major version, the last byte the minor version.
const uint32_t SYNTAX_TREE_VERSION =
SYNTAX_TREE_VERSION_MAJOR << 8 | SYNTAX_TREE_VERSION_MINOR;
/// The key for a ByteTree serialization user info of type
/// `std::unordered_set<unsigned> *`. Specifies the IDs of syntax nodes that
/// shall be omitted when the syntax tree gets serialized.
static void *UserInfoKeyReusedNodeIds = &UserInfoKeyReusedNodeIds;
/// The key for a ByteTree serialization user info interpreted as `bool`.
/// If specified, additional fields will be added to objects in the ByteTree
/// to test forward compatibility.
static void *UserInfoKeyAddInvalidFields = &UserInfoKeyAddInvalidFields;
template <>
struct WrapperTypeTraits<tok> {
static uint8_t numericValue(const tok &Value);
static void write(ByteTreeWriter &Writer, const tok &Value, unsigned Index) {
Writer.write(numericValue(Value), Index);
}
};
template <>
struct WrapperTypeTraits<syntax::SourcePresence> {
static uint8_t numericValue(const syntax::SourcePresence &Presence) {
switch (Presence) {
case syntax::SourcePresence::Missing: return 0;
case syntax::SourcePresence::Present: return 1;
}
llvm_unreachable("unhandled presence");
}
static void write(ByteTreeWriter &Writer,
const syntax::SourcePresence &Presence, unsigned Index) {
Writer.write(numericValue(Presence), Index);
}
};
template <>
struct ObjectTraits<ArrayRef<syntax::TriviaPiece>> {
static unsigned numFields(const ArrayRef<syntax::TriviaPiece> &Trivia,
UserInfoMap &UserInfo) {
return Trivia.size();
}
static void write(ByteTreeWriter &Writer,
const ArrayRef<syntax::TriviaPiece> &Trivia,
UserInfoMap &UserInfo) {
for (unsigned I = 0, E = Trivia.size(); I < E; ++I) {
Writer.write(Trivia[I], /*Index=*/I);
}
}
};
template <>
struct ObjectTraits<ArrayRef<RC<syntax::RawSyntax>>> {
static unsigned numFields(const ArrayRef<RC<syntax::RawSyntax>> &Layout,
UserInfoMap &UserInfo) {
return Layout.size();
}
static void write(ByteTreeWriter &Writer,
const ArrayRef<RC<syntax::RawSyntax>> &Layout,
UserInfoMap &UserInfo);
};
template <>
struct ObjectTraits<std::pair<tok, StringRef>> {
static unsigned numFields(const std::pair<tok, StringRef> &Pair,
UserInfoMap &UserInfo) {
return 2;
}
static void write(ByteTreeWriter &Writer,
const std::pair<tok, StringRef> &Pair,
UserInfoMap &UserInfo) {
Writer.write(Pair.first, /*Index=*/0);
Writer.write(Pair.second, /*Index=*/1);
}
};
template <>
struct ObjectTraits<syntax::RawSyntax> {
enum NodeKind { Token = 0, Layout = 1, Omitted = 2 };
static bool shouldOmitNode(const syntax::RawSyntax &Syntax,
UserInfoMap &UserInfo) {
if (auto ReusedNodeIds = static_cast<std::unordered_set<unsigned> *>(
UserInfo[UserInfoKeyReusedNodeIds])) {
return ReusedNodeIds->count(Syntax.getId()) > 0;
} else {
return false;
}
}
static NodeKind nodeKind(const syntax::RawSyntax &Syntax,
UserInfoMap &UserInfo) {
if (shouldOmitNode(Syntax, UserInfo)) {
return Omitted;
} else if (Syntax.isToken()) {
return Token;
} else {
return Layout;
}
}
static unsigned numFields(const syntax::RawSyntax &Syntax,
UserInfoMap &UserInfo) {
// FIXME: We know this is never set in production builds. Should we
// disable this code altogether in that case
// (e.g. if assertions are not enabled?)
if (UserInfo[UserInfoKeyAddInvalidFields]) {
switch (nodeKind(Syntax, UserInfo)) {
case Token: return 7;
case Layout: return 6;
case Omitted: return 2;
}
llvm_unreachable("unhandled kind");
} else {
switch (nodeKind(Syntax, UserInfo)) {
case Token: return 6;
case Layout: return 5;
case Omitted: return 2;
}
llvm_unreachable("unhandled kind");
}
}
static void write(ByteTreeWriter &Writer, const syntax::RawSyntax &Syntax,
UserInfoMap &UserInfo) {
auto Kind = nodeKind(Syntax, UserInfo);
Writer.write(static_cast<uint8_t>(Kind), /*Index=*/0);
Writer.write(static_cast<uint32_t>(Syntax.getId()), /*Index=*/1);
switch (Kind) {
case Token:
Writer.write(Syntax.getPresence(), /*Index=*/2);
Writer.write(std::make_pair(Syntax.getTokenKind(), Syntax.getTokenText()),
/*Index=*/3);
Writer.write(Syntax.getLeadingTrivia(), /*Index=*/4);
Writer.write(Syntax.getTrailingTrivia(), /*Index=*/5);
// FIXME: We know this is never set in production builds. Should we
// disable this code altogether in that case
// (e.g. if assertions are not enabled?)
if (UserInfo[UserInfoKeyAddInvalidFields]) {
// Test adding a new scalar field
StringRef Str = "invalid forward compatible field";
Writer.write(Str, /*Index=*/6);
}
break;
case Layout:
Writer.write(Syntax.getPresence(), /*Index=*/2);
Writer.write(Syntax.getKind(), /*Index=*/3);
Writer.write(Syntax.getLayout(), /*Index=*/4);
// FIXME: We know this is never set in production builds. Should we
// disable this code altogether in that case
// (e.g. if assertions are not enabled?)
if (UserInfo[UserInfoKeyAddInvalidFields]) {
// Test adding a new object
auto Piece = syntax::TriviaPiece::spaces(2);
ArrayRef<syntax::TriviaPiece> SomeTrivia(Piece);
Writer.write(SomeTrivia, /*Index=*/5);
}
break;
case Omitted:
// Nothing more to write
break;
}
}
};
} // end namespace byteTree
} // end namespace serialization
} // end namespace swift


@@ -25,7 +25,6 @@
#define SWIFT_SYNTAX_KIND_H
#include "swift/Basic/InlineBitfield.h"
#include "swift/Basic/ByteTreeSerialization.h"
#include "swift/Basic/JSONSerialization.h"
#include "llvm/Support/YAMLTraits.h"
@@ -77,36 +76,6 @@ SyntaxKind getUnknownKind(SyntaxKind Kind);
bool parserShallOmitWhenNoChildren(syntax::SyntaxKind Kind);
namespace byteTree {
template <>
struct WrapperTypeTraits<syntax::SyntaxKind> {
// Explicitly spell out all SyntaxKinds to keep the serialized value stable
// even if its members get reordered or members get removed
static uint16_t numericValue(const syntax::SyntaxKind &Kind) {
switch (Kind) {
case syntax::SyntaxKind::Token:
return 0;
case syntax::SyntaxKind::Unknown:
return 1;
% for name, nodes in grouped_nodes.items():
% for node in nodes:
case syntax::SyntaxKind::${node.syntax_kind}:
return ${SYNTAX_NODE_SERIALIZATION_CODES[node.syntax_kind]};
% end
% end
}
llvm_unreachable("unhandled kind");
}
static void write(ByteTreeWriter &Writer, const syntax::SyntaxKind &Kind,
unsigned Index) {
Writer.write(numericValue(Kind), Index);
}
};
} // end namespace byteTree
namespace json {
/// Serialization traits for SyntaxKind.


@@ -90,7 +90,6 @@
#include "swift/Basic/Debug.h"
#include "swift/Basic/OwnedString.h"
#include "swift/Basic/JSONSerialization.h"
#include "swift/Basic/ByteTreeSerialization.h"
#include "llvm/ADT/FoldingSet.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Support/YAMLTraits.h"
@@ -382,52 +381,6 @@ private:
};
} // namespace syntax
namespace byteTree {
template <>
struct WrapperTypeTraits<syntax::TriviaKind> {
static uint8_t numericValue(const syntax::TriviaKind &Kind) {
switch (Kind) {
% for trivia in TRIVIAS:
case syntax::TriviaKind::${trivia.name}: return ${trivia.serialization_code};
% end
}
llvm_unreachable("unhandled kind");
}
static void write(ByteTreeWriter &Writer, const syntax::TriviaKind &Kind,
unsigned Index) {
Writer.write(numericValue(Kind), Index);
}
};
template <>
struct ObjectTraits<syntax::TriviaPiece> {
static unsigned numFields(const syntax::TriviaPiece &Trivia,
UserInfoMap &UserInfo) {
return 2;
}
static void write(ByteTreeWriter &Writer, const syntax::TriviaPiece &Trivia,
UserInfoMap &UserInfo) {
Writer.write(Trivia.getKind(), /*Index=*/0);
// Write the trivia's text or count depending on its kind
switch (Trivia.getKind()) {
% for trivia in TRIVIAS:
case syntax::TriviaKind::${trivia.name}:
% if trivia.is_collection():
Writer.write(static_cast<uint32_t>(Trivia.getCount()), /*Index=*/1);
% else:
Writer.write(Trivia.getText(), /*Index=*/1);
% end
break;
% end
}
}
};
} // end namespace byteTree
namespace json {
/// Serialization traits for TriviaPiece.
/// - All trivia pieces will have a "kind" key that contains the serialized


@@ -1,5 +1,9 @@
%{
from gyb_syntax_support import *
from gyb_syntax_support.kinds import SYNTAX_BASE_KINDS
grouped_nodes = { kind: [] for kind in SYNTAX_BASE_KINDS }
for node in SYNTAX_NODES:
grouped_nodes[node.base_kind].append(node)
# Ignore the following admonition; it applies to the resulting .cpp file only
}%
//// Automatically Generated From SyntaxSerialization.cpp.gyb.
@@ -19,27 +23,41 @@
#include "swift/Syntax/Serialization/SyntaxSerialization.h"
namespace swift {
namespace byteTree {
void ObjectTraits<ArrayRef<RC<syntax::RawSyntax>>>::write(
ByteTreeWriter &Writer, const ArrayRef<RC<syntax::RawSyntax>> &Layout,
UserInfoMap &UserInfo) {
for (unsigned I = 0, E = Layout.size(); I < E; ++I) {
if (Layout[I]) {
Writer.write(*Layout[I], /*Index=*/I);
} else {
Writer.write(llvm::None, /*Index=*/I);
}
namespace serialization {
uint16_t getNumericValue(syntax::SyntaxKind Kind) {
switch (Kind) {
case syntax::SyntaxKind::Token:
return 0;
case syntax::SyntaxKind::Unknown:
return 1;
% for name, nodes in grouped_nodes.items():
% for node in nodes:
case syntax::SyntaxKind::${node.syntax_kind}:
return ${SYNTAX_NODE_SERIALIZATION_CODES[node.syntax_kind]};
% end
% end
}
llvm_unreachable("unhandled kind");
}
uint8_t WrapperTypeTraits<tok>::numericValue(const tok &Value) {
uint8_t getNumericValue(syntax::TriviaKind Kind) {
switch (Kind) {
% for trivia in TRIVIAS:
case syntax::TriviaKind::${trivia.name}: return ${trivia.serialization_code};
% end
}
llvm_unreachable("unhandled kind");
}
uint8_t getNumericValue(tok Value) {
switch (Value) {
case tok::eof: return 0;
% for token in SYNTAX_TOKENS:
case tok::${token.kind}: return ${token.serialization_code};
% end
case tok::kw_undef:
case tok::kw_sil:
case tok::kw_sil_stage:
@@ -61,5 +79,6 @@ uint8_t WrapperTypeTraits<tok>::numericValue(const tok &Value) {
}
llvm_unreachable("unhandled token");
}
} // namespace byteTree
} // namespace swift
} // end namespace serialization
} // end namespace swift


@@ -232,8 +232,6 @@ enum class SyntaxTreeTransferMode {
Full
};
enum class SyntaxTreeSerializationFormat { JSON, ByteTree };
class EditorConsumer {
virtual void anchor();
public:


@@ -83,8 +83,6 @@ struct SKEditorConsumerOptions {
bool EnableStructure = false;
bool EnableDiagnostics = false;
SyntaxTreeTransferMode SyntaxTransferMode = SyntaxTreeTransferMode::Off;
SyntaxTreeSerializationFormat SyntaxSerializationFormat =
SyntaxTreeSerializationFormat::JSON;
bool SyntacticOnly = false;
};
@@ -333,20 +331,6 @@ static SyntaxTreeTransferMode syntaxTransferModeFromUID(sourcekitd_uid_t UID) {
}
}
static llvm::Optional<SyntaxTreeSerializationFormat>
syntaxSerializationFormatFromUID(sourcekitd_uid_t UID) {
if (UID == nullptr) {
// Default is JSON
return SyntaxTreeSerializationFormat::JSON;
} else if (UID == KindSyntaxTreeSerializationJSON) {
return SyntaxTreeSerializationFormat::JSON;
} else if (UID == KindSyntaxTreeSerializationByteTree) {
return SyntaxTreeSerializationFormat::ByteTree;
} else {
return llvm::None;
}
}
namespace {
class SKOptionsDictionary : public OptionsDictionary {
RequestDict Options;
@@ -640,7 +624,6 @@ void handleRequestImpl(sourcekitd_object_t ReqObj, ResponseReceiver Rec) {
int64_t EnableDiagnostics = true;
Req.getInt64(KeyEnableDiagnostics, EnableDiagnostics, /*isOptional=*/true);
auto TransferModeUID = Req.getUID(KeySyntaxTreeTransferMode);
auto SerializationFormatUID = Req.getUID(KeySyntaxTreeSerializationFormat);
int64_t SyntacticOnly = false;
Req.getInt64(KeySyntacticOnly, SyntacticOnly, /*isOptional=*/true);
@@ -649,11 +632,6 @@ void handleRequestImpl(sourcekitd_object_t ReqObj, ResponseReceiver Rec) {
Opts.EnableStructure = EnableStructure;
Opts.EnableDiagnostics = EnableDiagnostics;
Opts.SyntaxTransferMode = syntaxTransferModeFromUID(TransferModeUID);
auto SyntaxSerializationFormat =
syntaxSerializationFormatFromUID(SerializationFormatUID);
if (!SyntaxSerializationFormat)
return Rec(createErrorRequestFailed("Invalid serialization format"));
Opts.SyntaxSerializationFormat = SyntaxSerializationFormat.getValue();
Opts.SyntacticOnly = SyntacticOnly;
return Rec(editorOpen(*Name, InputBuf.get(), Opts, Args, std::move(vfsOptions)));
}
@@ -688,18 +666,12 @@ void handleRequestImpl(sourcekitd_object_t ReqObj, ResponseReceiver Rec) {
int64_t SyntacticOnly = false;
Req.getInt64(KeySyntacticOnly, SyntacticOnly, /*isOptional=*/true);
auto TransferModeUID = Req.getUID(KeySyntaxTreeTransferMode);
auto SerializationFormatUID = Req.getUID(KeySyntaxTreeSerializationFormat);
SKEditorConsumerOptions Opts;
Opts.EnableSyntaxMap = EnableSyntaxMap;
Opts.EnableStructure = EnableStructure;
Opts.EnableDiagnostics = EnableDiagnostics;
Opts.SyntaxTransferMode = syntaxTransferModeFromUID(TransferModeUID);
auto SyntaxSerializationFormat =
syntaxSerializationFormatFromUID(SerializationFormatUID);
if (!SyntaxSerializationFormat)
return Rec(createErrorRequestFailed("Invalid serialization format"));
Opts.SyntaxSerializationFormat = SyntaxSerializationFormat.getValue();
Opts.SyntacticOnly = SyntacticOnly;
return Rec(editorReplaceText(*Name, InputBuf.get(), Offset, Length, Opts));
@@ -2845,39 +2817,6 @@ void SKEditorConsumer::handleSourceText(StringRef Text) {
Dict.set(KeySourceText, Text);
}
void serializeSyntaxTreeAsByteTree(
const swift::syntax::SourceFileSyntax &SyntaxTree,
std::unordered_set<unsigned> &ReusedNodeIds,
ResponseBuilder::Dictionary &Dict) {
auto StartClock = clock();
// Serialize the syntax tree as a ByteTree
auto Stream = swift::ExponentialGrowthAppendingBinaryByteStream();
Stream.reserve(32 * 1024);
std::map<void *, void *> UserInfo;
UserInfo[swift::byteTree::UserInfoKeyReusedNodeIds] = &ReusedNodeIds;
swift::byteTree::ByteTreeWriter::write(Stream,
swift::byteTree::SYNTAX_TREE_VERSION,
*SyntaxTree.getRaw(), UserInfo);
std::unique_ptr<llvm::WritableMemoryBuffer> Buf =
llvm::WritableMemoryBuffer::getNewUninitMemBuffer(sizeof(uint64_t) + Stream.data().size());
*reinterpret_cast<uint64_t*>(Buf->getBufferStart()) =
(uint64_t)CustomBufferKind::RawData;
memcpy(Buf->getBufferStart() + sizeof(uint64_t),
Stream.data().data(), Stream.data().size());
Dict.setCustomBuffer(KeySerializedSyntaxTree, std::move(Buf));
auto EndClock = clock();
LOG_SECTION("incrParse Performance", InfoLowPrio) {
Log->getOS() << "Serialized " << Stream.data().size()
<< " bytes as ByteTree in ";
auto Seconds = (double)(EndClock - StartClock) * 1000 / CLOCKS_PER_SEC;
llvm::write_double(Log->getOS(), Seconds, llvm::FloatStyle::Fixed, 2);
Log->getOS() << "ms";
}
}
void serializeSyntaxTreeAsJson(
const swift::syntax::SourceFileSyntax &SyntaxTree,
std::unordered_set<unsigned> ReusedNodeIds,
@@ -2928,14 +2867,7 @@ void SKEditorConsumer::handleSyntaxTree(
break;
}
switch (Opts.SyntaxSerializationFormat) {
case SourceKit::SyntaxTreeSerializationFormat::JSON:
serializeSyntaxTreeAsJson(SyntaxTree, OmitNodes, Dict);
break;
case SourceKit::SyntaxTreeSerializationFormat::ByteTree:
serializeSyntaxTreeAsByteTree(SyntaxTree, OmitNodes, Dict);
break;
}
serializeSyntaxTreeAsJson(SyntaxTree, OmitNodes, Dict);
}
static sourcekitd_response_t


@@ -28,7 +28,6 @@
using namespace swift;
using namespace swift::syntax;
using namespace swift::byteTree;
typedef swiftparse_range_t CRange;
typedef swiftparse_client_node_t CClientNode;
@@ -122,7 +121,7 @@ private:
for (const auto &piece : trivia) {
CTriviaPiece c_piece;
auto numValue =
WrapperTypeTraits<TriviaKind>::numericValue(piece.getKind());
serialization::getNumericValue(piece.getKind());
c_piece.kind = numValue;
assert(c_piece.kind == numValue && "trivia kind value is too large");
c_piece.length = piece.getLength();
@@ -139,8 +138,8 @@ private:
ArrayRef<CTriviaPiece> leadingTrivia,
ArrayRef<CTriviaPiece> trailingTrivia,
CharSourceRange range) {
node.kind = WrapperTypeTraits<SyntaxKind>::numericValue(SyntaxKind::Token);
auto numValue = WrapperTypeTraits<swift::tok>::numericValue(kind);
node.kind = serialization::getNumericValue(SyntaxKind::Token);
auto numValue = serialization::getNumericValue(kind);
node.token_data.kind = numValue;
assert(node.token_data.kind == numValue && "token kind value is too large");
node.token_data.leading_trivia = leadingTrivia.data();
@@ -179,7 +178,7 @@ private:
ArrayRef<OpaqueSyntaxNode> elements,
CharSourceRange range) override {
CRawSyntaxNode node;
auto numValue = WrapperTypeTraits<SyntaxKind>::numericValue(kind);
auto numValue = serialization::getNumericValue(kind);
node.kind = numValue;
assert(node.kind == numValue && "syntax kind value is too large");
node.layout_data.nodes = elements.data();
@@ -205,7 +204,7 @@ private:
if (!NodeLookup) {
return {0, nullptr};
}
auto numValue = WrapperTypeTraits<SyntaxKind>::numericValue(kind);
auto numValue = serialization::getNumericValue(kind);
CSyntaxKind ckind = numValue;
assert(ckind == numValue && "syntax kind value is too large");
auto result = NodeLookup(lexerOffset, ckind);


@@ -155,21 +155,6 @@ OmitNodeIds("omit-node-ids",
llvm::cl::desc("If specified, the serialized syntax tree will not "
"include the IDs of the serialized nodes."));
static llvm::cl::opt<bool>
SerializeAsByteTree("serialize-byte-tree",
llvm::cl::desc("If specified the syntax tree will be "
"serialized in the ByteTree format instead "
"of JSON."));
static llvm::cl::opt<bool>
AddByteTreeFields("add-bytetree-fields",
llvm::cl::desc("If specified, further fields will be added "
"to the syntax tree if it is serialized as a "
"ByteTree. This is to test forward "
"compatibility with future versions of "
"SwiftSyntax that might add more fields to "
"syntax nodes."));
static llvm::cl::opt<bool>
IncrementalSerialization("incremental-serialization",
llvm::cl::desc("If specified, the serialized syntax "
@@ -752,56 +737,28 @@ int doSerializeRawTree(const char *MainExecutablePath,
ReusedNodeIds = SyntaxCache->getReusedNodeIds();
}
if (options::SerializeAsByteTree) {
if (options::OutputFilename.empty()) {
llvm::errs() << "Cannot serialize syntax tree as ByteTree to stdout\n";
return EXIT_FAILURE;
auto SerializeTree = [&ReusedNodeIds](llvm::raw_ostream &os,
RC<RawSyntax> Root,
SyntaxParsingCache *SyntaxCache) {
swift::json::Output::UserInfoMap JsonUserInfo;
JsonUserInfo[swift::json::OmitNodesUserInfoKey] = &ReusedNodeIds;
if (options::OmitNodeIds) {
JsonUserInfo[swift::json::DontSerializeNodeIdsUserInfoKey] =
(void *)true;
}
swift::json::Output out(os, JsonUserInfo);
out << *Root;
os << "\n";
};
auto Stream = ExponentialGrowthAppendingBinaryByteStream();
Stream.reserve(32 * 1024);
std::map<void *, void *> UserInfo;
UserInfo[swift::byteTree::UserInfoKeyReusedNodeIds] = &ReusedNodeIds;
if (options::AddByteTreeFields) {
UserInfo[swift::byteTree::UserInfoKeyAddInvalidFields] = (void *)true;
}
swift::byteTree::ByteTreeWriter::write(Stream,
byteTree::SYNTAX_TREE_VERSION,
*Root, UserInfo);
auto OutputBufferOrError = llvm::FileOutputBuffer::create(
options::OutputFilename, Stream.data().size());
assert(OutputBufferOrError && "Couldn't open output file");
auto &OutputBuffer = OutputBufferOrError.get();
memcpy(OutputBuffer->getBufferStart(), Stream.data().data(),
Stream.data().size());
auto Error = OutputBuffer->commit();
(void)Error;
assert(!Error && "Unable to write output file");
if (!options::OutputFilename.empty()) {
std::error_code errorCode;
llvm::raw_fd_ostream os(options::OutputFilename, errorCode,
llvm::sys::fs::F_None);
assert(!errorCode && "Couldn't open output file");
SerializeTree(os, Root, SyntaxCache);
} else {
// Serialize as JSON
auto SerializeTree = [&ReusedNodeIds](llvm::raw_ostream &os,
RC<RawSyntax> Root,
SyntaxParsingCache *SyntaxCache) {
swift::json::Output::UserInfoMap JsonUserInfo;
JsonUserInfo[swift::json::OmitNodesUserInfoKey] = &ReusedNodeIds;
if (options::OmitNodeIds) {
JsonUserInfo[swift::json::DontSerializeNodeIdsUserInfoKey] =
(void *)true;
}
swift::json::Output out(os, JsonUserInfo);
out << *Root;
os << "\n";
};
if (!options::OutputFilename.empty()) {
std::error_code errorCode;
llvm::raw_fd_ostream os(options::OutputFilename, errorCode,
llvm::sys::fs::F_None);
assert(!errorCode && "Couldn't open output file");
SerializeTree(os, Root, SyntaxCache);
} else {
SerializeTree(llvm::outs(), Root, SyntaxCache);
}
SerializeTree(llvm::outs(), Root, SyntaxCache);
}
if (!options::DiagsOutputFilename.empty()) {


@@ -52,8 +52,6 @@ UID_KEYS = [
KEY('SourceText', 'key.sourcetext'),
KEY('EnableSyntaxMap', 'key.enablesyntaxmap'),
KEY('SyntaxTreeTransferMode', 'key.syntaxtreetransfermode'),
KEY('SyntaxTreeSerializationFormat',
'key.syntax_tree_serialization_format'),
KEY('EnableStructure', 'key.enablesubstructure'),
KEY('Description', 'key.description'),
KEY('TypeName', 'key.typename'),
@@ -453,8 +451,4 @@ UID_KINDS = [
KIND('SyntaxTreeOff', 'source.syntaxtree.transfer.off'),
KIND('SyntaxTreeIncremental', 'source.syntaxtree.transfer.incremental'),
KIND('SyntaxTreeFull', 'source.syntaxtree.transfer.full'),
KIND('SyntaxTreeSerializationJSON',
'source.syntaxtree.serialization.format.json'),
KIND('SyntaxTreeSerializationByteTree',
'source.syntaxtree.serialization.format.bytetree'),
]


@@ -103,11 +103,6 @@ def main():
parser.add_argument(
'--swiftsyntax-lit-test-helper', required=True,
help='The path to the lit-test-helper binary of SwiftSyntax')
parser.add_argument(
'--serialization-format', choices=['json', 'byteTree'],
default='json', help='''
The format that shall be used to transfer the syntax tree
''')
args = parser.parse_args(sys.argv[1:])
@@ -117,7 +112,6 @@ def main():
temp_dir = args.temp_dir
swift_syntax_test = args.swift_syntax_test
swiftsyntax_lit_test_helper = args.swiftsyntax_lit_test_helper
serialization_format = args.serialization_format
if not os.path.exists(temp_dir):
os.makedirs(temp_dir)
@@ -141,12 +135,10 @@ def main():
after_roundtrip_file=after_roundtrip_file,
swiftsyntax_lit_test_helper=swiftsyntax_lit_test_helper)
treeFileExtension = serialization_format
pre_edit_tree_file = temp_dir + '/' + test_file_name + '.' \
+ test_case + '.pre.' + treeFileExtension
+ test_case + '.pre.json'
incremental_tree_file = temp_dir + '/' + test_file_name + '.' \
+ test_case + '.incr.' + treeFileExtension
+ test_case + '.incr.json'
post_edit_source_file = temp_dir + '/' + test_file_name + '.' \
+ test_case + '.post.swift'
after_roundtrip_source_file = temp_dir + '/' + test_file_name + '.' \
@@ -158,7 +150,6 @@ def main():
test_case=test_case,
mode='pre-edit',
serialization_mode='full',
serialization_format=serialization_format,
omit_node_ids=False,
output_file=pre_edit_tree_file,
diags_output_file=None,
@@ -170,7 +161,6 @@ def main():
test_case=test_case,
mode='incremental',
serialization_mode='incremental',
serialization_format=serialization_format,
omit_node_ids=False,
output_file=incremental_tree_file,
diags_output_file=None,
@@ -185,7 +175,6 @@ def main():
try:
run_command([swiftsyntax_lit_test_helper, '-deserialize-incremental'] +
['-serialization-format', serialization_format] +
['-pre-edit-tree', pre_edit_tree_file] +
['-incr-tree', incremental_tree_file] +
['-out', after_roundtrip_source_file])


@@ -69,7 +69,6 @@ def main():
test_case=test_case,
mode='incremental',
serialization_mode='incremental',
serialization_format='json',
omit_node_ids=False,
output_file=incremental_serialized_file,
diags_output_file=None,


@@ -145,7 +145,7 @@ def prepareForIncrParse(test_file, test_case, pre_edit_file, post_edit_file,
def serializeIncrParseMarkupFile(test_file, test_case, mode,
serialization_mode, serialization_format,
serialization_mode,
omit_node_ids, output_file, diags_output_file,
temp_dir, swift_syntax_test,
print_visual_reuse_info):
@@ -200,15 +200,6 @@ def serializeIncrParseMarkupFile(test_file, test_case, mode,
raise ValueError('Unknown serialization mode "%s"' %
serialization_mode)
if serialization_format == 'json':
# Nothing to do. This is the default behaviour of swift-syntax-test
pass
elif serialization_format == 'byteTree':
command.extend(['-serialize-byte-tree'])
else:
raise ValueError('Unknown serialization format "%s"' %
serialization_format)
if mode == 'pre-edit':
command.extend(['-input-source-filename', pre_edit_file])
elif mode == 'post-edit':
@@ -291,11 +282,6 @@ def main():
Only applicable if `--mode` is `incremental`. Whether to serialize the
entire tree or use the incremental transfer mode. Default is `full`.
''')
parser.add_argument(
'--serialization-format', choices=['json', 'byteTree'],
default='json', help='''
The format in which the syntax tree shall be serialized.
''')
parser.add_argument(
'--omit-node-ids', default=False, action='store_true',
help='Don\'t include the ids of the nodes in the serialized syntax \
@@ -322,7 +308,6 @@ def main():
test_case = args.test_case
mode = args.mode
serialization_mode = args.serialization_mode
serialization_format = args.serialization_format
omit_node_ids = args.omit_node_ids
output_file = args.output_file
temp_dir = args.temp_dir
@@ -334,7 +319,6 @@ def main():
test_case=test_case,
mode=mode,
serialization_mode=serialization_mode,
serialization_format=serialization_format,
omit_node_ids=omit_node_ids,
output_file=output_file,
diags_output_file=None,


@@ -79,7 +79,6 @@ def main():
test_case=test_case,
mode='incremental',
serialization_mode='full',
serialization_format='json',
omit_node_ids=True,
output_file=incremental_serialized_file,
diags_output_file=incremental_diags_file,
@@ -95,7 +94,6 @@ def main():
test_case=test_case,
mode='post-edit',
serialization_mode='full',
serialization_format='json',
omit_node_ids=True,
output_file=post_edit_serialized_file,
diags_output_file=post_edit_diags_file,