Merge branch 'ps/undecided-is-not-necessarily-sha1'

Before discovering the repository details, We used to assume SHA-1
as the "default" hash function, which has been corrected. Hopefully
this will smoke out codepaths that rely on such an unwarranted
assumptions.

* ps/undecided-is-not-necessarily-sha1:
  repository: stop setting SHA1 as the default object hash
  oss-fuzz/commit-graph: set up hash algorithm
  builtin/shortlog: don't set up revisions without repo
  builtin/diff: explicitly set hash algo when there is no repo
  builtin/bundle: abort "verify" early when there is no repository
  builtin/blame: don't access potentially unitialized `the_hash_algo`
  builtin/rev-parse: allow shortening to more than 40 hex characters
  remote-curl: fix parsing of detached SHA256 heads
  attr: fix BUG() when parsing attrs outside of repo
  attr: don't recompute default attribute source
  parse-options-cb: only abbreviate hashes when hash algo is known
  path: move `validate_headref()` to its only user
  path: harden validation of HEAD with non-standard hashes
This commit is contained in:
Junio C Hamano
2024-05-30 14:15:10 -07:00
17 changed files with 168 additions and 92 deletions

View File

@@ -266,12 +266,23 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
return list;
}
/*
* Try to detect the hash algorithm used by the remote repository when using
* the dumb HTTP transport. As dumb transports cannot tell us the object hash
* directly have to derive it from the advertised ref lengths.
*/
static const struct git_hash_algo *detect_hash_algo(struct discovery *heads)
{
const char *p = memchr(heads->buf, '\t', heads->len);
int algo;
/*
* In case the remote has no refs we have no way to reliably determine
* the object hash used by that repository. In that case we simply fall
* back to SHA1, which may or may not be correct.
*/
if (!p)
return the_hash_algo;
return &hash_algos[GIT_HASH_SHA1];
algo = hash_algo_by_length((p - heads->buf) / 2);
if (algo == GIT_HASH_UNKNOWN)
@@ -295,6 +306,12 @@ static struct ref *parse_info_refs(struct discovery *heads)
"is this a git repository?",
transport_anonymize_url(url.buf));
/*
* Set the repository's hash algo to whatever we have just detected.
* This ensures that we can correctly parse the remote references.
*/
repo_set_hash_algo(the_repository, hash_algo_by_ptr(options.hash_algo));
data = heads->buf;
start = NULL;
mid = data;