Merge branch 'ps/use-reftable-as-default-in-3.0'

The reftable ref backend has matured enough; Git 3.0 will make it
the default format in newly created repositories.

* ps/use-reftable-as-default-in-3.0:
  setup: use "reftable" format when experimental features are enabled
  BreakingChanges: announce switch to "reftable" format
Junio C Hamano
2025-07-14 11:19:24 -07:00
6 changed files with 120 additions and 0 deletions


@@ -118,6 +118,53 @@ Cf. <2f5de416-04ba-c23d-1e0b-83bb655829a7@zombino.com>,
<20170223155046.e7nxivfwqqoprsqj@LykOS.localdomain>,
<CA+EOSBncr=4a4d8n9xS4FNehyebpmX8JiUwCsXD47EQDE+DiUQ@mail.gmail.com>.
* The default storage format for references in newly created repositories will
be changed from "files" to "reftable". The "reftable" format provides
multiple advantages over the "files" format:
+
** It is impossible to store two references that only differ in casing on
case-insensitive filesystems with the "files" format. This issue is common
on Windows and macOS platforms. As the "reftable" backend does not use
filesystem paths to encode reference names, this problem goes away.
** Similarly, macOS normalizes path names that contain Unicode characters,
which has the consequence that you cannot store two names with Unicode
characters that are encoded differently with the "files" backend. Again,
this is not an issue with the "reftable" backend.
** Deleting references with the "files" backend requires Git to rewrite the
complete "packed-refs" file. In large repositories with many references
this file can easily be dozens of megabytes in size; in extreme cases it
may be gigabytes. The "reftable" backend uses tombstone markers for
deleted references and thus does not have to rewrite all of its data.
** Repository housekeeping with the "files" backend typically performs
all-into-one repacks of references. This can be quite expensive, and
consequently housekeeping is a tradeoff between the number of loose
references that accumulate and slow down operations that read references,
and compressing those loose references into the "packed-refs" file. The
"reftable" backend uses geometric compaction after every write, which
amortizes costs and ensures that the backend is always in a
well-maintained state.
** Operations that write multiple references at once are not atomic with the
"files" backend. Consequently, Git may see in-between states when it reads
references while a reference transaction is in the process of being
committed to disk.
** Writing many references at once is slow with the "files" backend because
every reference is created as a separate file. The "reftable" backend
outperforms the "files" backend by multiple orders of magnitude in this
case.
** The reftable backend uses a binary format with prefix compression for
reference names. As a result, the format uses less space compared to the
"packed-refs" file.
+
Users that get immediate benefit from the "reftable" backend could continue to
opt in to the "reftable" format manually by setting the "init.defaultRefFormat"
config. But defaults matter, and we think that overall users will have a better
experience with fewer platform-specific quirks when they use the new backend by
default.
+
A prerequisite for this change is that the ecosystem is ready to support the
"reftable" format. Most importantly, alternative implementations of Git like
JGit, libgit2 and Gitoxide need to support it.
=== Removals
* Support for grafting commits has long been superseded by git-replace(1).


@@ -24,6 +24,12 @@ reusing objects from multiple packs instead of just one.
* `pack.usePathWalk` may speed up packfile creation and make the packfiles be
significantly smaller in the presence of certain filename collisions with Git's
default name-hash.
+
* `init.defaultRefFormat=reftable` causes newly initialized repositories to use
the reftable format for storing references. This new format solves issues with
case-insensitive filesystems, compresses better and performs significantly
better in many use cases. Refer to Documentation/technical/reftable.adoc for
more information on this new storage format.
feature.manyFiles::
Enable config options that optimize for repos with many files in the

help.c

@@ -810,6 +810,8 @@ void get_version_info(struct strbuf *buf, int show_build_options)
SHA1_UNSAFE_BACKEND);
#endif
strbuf_addf(buf, "SHA-256: %s\n", SHA256_BACKEND);
strbuf_addf(buf, "default-ref-format: %s\n",
ref_storage_format_to_name(REF_STORAGE_FORMAT_DEFAULT));
}
}
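
The hunk above adds a `default-ref-format:` line to the output of `git version --build-options`. The following is a minimal sketch of how a script could read that value back, assuming a Git build that includes this change. `parse_default_ref_format` is a hypothetical helper name used for illustration only; it is not something Git ships.

```shell
# Hypothetical helper (not part of Git): print the value of the
# "default-ref-format:" line from `git version --build-options`
# output supplied on stdin. Prints nothing if the line is absent,
# for example when the Git build predates this change.
parse_default_ref_format() {
    sed -ne 's/^default-ref-format: //p'
}

# Typical use, assuming a sufficiently new Git is installed:
#   git version --build-options | parse_default_ref_format
```

Reading from stdin keeps the helper independent of the installed Git version, so it can be exercised with canned input as well as with real output.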


@@ -20,6 +20,12 @@ enum ref_storage_format {
REF_STORAGE_FORMAT_REFTABLE,
};
#ifdef WITH_BREAKING_CHANGES /* Git 3.0 */
# define REF_STORAGE_FORMAT_DEFAULT REF_STORAGE_FORMAT_REFTABLE
#else
# define REF_STORAGE_FORMAT_DEFAULT REF_STORAGE_FORMAT_FILES
#endif
struct repo_path_cache {
char *squash_msg;
char *merge_msg;

setup.c

@@ -2484,6 +2484,18 @@ static int read_default_format_config(const char *key, const char *value,
goto out;
}
/*
* Enable the reftable format when "feature.experimental" is enabled.
* "init.defaultRefFormat" takes precedence over this setting.
*/
if (!strcmp(key, "feature.experimental") &&
cfg->ref_format == REF_STORAGE_FORMAT_UNKNOWN &&
git_config_bool(key, value)) {
cfg->ref_format = REF_STORAGE_FORMAT_REFTABLE;
ret = 0;
goto out;
}
ret = 0;
out:
free(str);
@@ -2544,6 +2556,8 @@ static void repository_format_configure(struct repository_format *repo_fmt,
repo_fmt->ref_storage_format = ref_format;
} else if (cfg.ref_format != REF_STORAGE_FORMAT_UNKNOWN) {
repo_fmt->ref_storage_format = cfg.ref_format;
} else {
repo_fmt->ref_storage_format = REF_STORAGE_FORMAT_DEFAULT;
}
repo_set_ref_storage_format(the_repository, repo_fmt->ref_storage_format);
}
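
The two setup.c hunks, together with the tests in this commit, imply an order of precedence for picking the ref format of a new repository: `GIT_DEFAULT_REF_FORMAT`, then `init.defaultRefFormat`, then `feature.experimental`, then the compiled-in `REF_STORAGE_FORMAT_DEFAULT`. The sketch below models only that order. `resolve_ref_format` is a hypothetical function written for illustration, not Git's implementation, and it simplifies two things: an empty argument stands for "unset", and only the literal string `true` counts as an enabled boolean.

```shell
# Hypothetical model (not Git's code) of the precedence described above.
#   $1: value of GIT_DEFAULT_REF_FORMAT       (empty means unset)
#   $2: value of init.defaultRefFormat        (empty means unset)
#   $3: value of feature.experimental         (only "true" enables it here)
#   $4: compiled-in default: "files" today, "reftable" when Git is
#       built with WITH_BREAKING_CHANGES
resolve_ref_format() {
    if [ -n "$1" ]; then
        # The environment variable overrides both config settings.
        echo "$1"
    elif [ -n "$2" ]; then
        # An explicit init.defaultRefFormat beats feature.experimental.
        echo "$2"
    elif [ "$3" = "true" ]; then
        # feature.experimental opts in to reftable.
        echo reftable
    else
        # Nothing was requested: fall back to the build-time default.
        echo "$4"
    fi
}

# The three cases exercised by the new tests:
resolve_ref_format ""    ""    true files   # feature.experimental alone
resolve_ref_format ""    files true files   # init.defaultRefFormat set to files
resolve_ref_format files ""    true files   # GIT_DEFAULT_REF_FORMAT set to files
```

The real code reaches the same result for the two config keys regardless of their order in the config file, because `feature.experimental` only applies while `cfg->ref_format` is still `REF_STORAGE_FORMAT_UNKNOWN`, whereas `init.defaultRefFormat` always overwrites it.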


@@ -658,6 +658,17 @@ test_expect_success 'init warns about invalid init.defaultRefFormat' '
test_cmp expected actual
'
test_expect_success 'default ref format' '
test_when_finished "rm -rf refformat" &&
(
sane_unset GIT_DEFAULT_REF_FORMAT &&
git init refformat
) &&
git version --build-options | sed -ne "s/^default-ref-format: //p" >expect &&
git -C refformat rev-parse --show-ref-format >actual &&
test_cmp expect actual
'
backends="files reftable"
for format in $backends
do
@@ -738,6 +749,40 @@ test_expect_success "GIT_DEFAULT_REF_FORMAT= overrides init.defaultRefFormat" '
test_cmp expect actual
'
test_expect_success "init with feature.experimental=true" '
test_when_finished "rm -rf refformat" &&
test_config_global feature.experimental true &&
(
sane_unset GIT_DEFAULT_REF_FORMAT &&
git init refformat
) &&
echo reftable >expect &&
git -C refformat rev-parse --show-ref-format >actual &&
test_cmp expect actual
'
test_expect_success "init.defaultRefFormat overrides feature.experimental=true" '
test_when_finished "rm -rf refformat" &&
test_config_global feature.experimental true &&
test_config_global init.defaultRefFormat files &&
(
sane_unset GIT_DEFAULT_REF_FORMAT &&
git init refformat
) &&
echo files >expect &&
git -C refformat rev-parse --show-ref-format >actual &&
test_cmp expect actual
'
test_expect_success "GIT_DEFAULT_REF_FORMAT= overrides feature.experimental=true" '
test_when_finished "rm -rf refformat" &&
test_config_global feature.experimental true &&
GIT_DEFAULT_REF_FORMAT=files git init refformat &&
echo files >expect &&
git -C refformat rev-parse --show-ref-format >actual &&
test_cmp expect actual
'
for from_format in $backends
do
test_expect_success "re-init with same format ($from_format)" '