packfile: skip hash checks in add_promisor_object()

When is_promisor_object() is called for the first time, it lazily
initializes a set of all promisor objects by iterating through all
objects in promisor packs. For each object, add_promisor_object() calls
parse_object(), which decompresses and hashes the entire object.

For repositories with large pack files, this can take an extremely long
time. For example, on a production repository with a 176 GB promisor
pack:

 $ time ~/git/git/git-rev-list --objects --all --exclude-promisor-objects --quiet
 ________________________________________________________
 Executed in   76.10 mins    fish           external
    usr time   72.10 mins    1.83 millis   72.10 mins
    sys time    3.56 mins    0.17 millis    3.56 mins

add_promisor_object() just wants to construct the set of all promisor
objects, so it doesn't really need to verify the hash of every object.
Set PARSE_OBJECT_SKIP_HASH_CHECK to skip the hash check. This has the
side effect of skipping decompression of blob objects completely, saving
a significant amount of time:

 $ time ~/git/git/git-rev-list --objects --all --exclude-promisor-objects --quiet
 ________________________________________________________
 Executed in  124.70 secs    fish           external
    usr time   46.94 secs    0.00 millis   46.94 secs
    sys time   43.11 secs    1.03 millis   43.11 secs

Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Aaron Plattner
2025-12-08 17:48:57 -08:00
committed by Junio C Hamano
parent 3c7c41d6b7
commit 3f5d1749e7


@@ -2310,7 +2310,8 @@ static int add_promisor_object(const struct object_id *oid,
 		we_parsed_object = 0;
 	} else {
 		we_parsed_object = 1;
-		obj = parse_object(pack->repo, oid);
+		obj = parse_object_with_flags(pack->repo, oid,
+					      PARSE_OBJECT_SKIP_HASH_CHECK);
 	}
 	if (!obj)
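
The trade-off described in the commit message can be illustrated with a small standalone model. This is only a sketch and not Git's implementation: `toy_object`, `toy_hash()`, `toy_parse()`, `toy_collect()`, `toy_bytes_hashed` and `TOY_SKIP_HASH_CHECK` are hypothetical names invented for illustration, and FNV-1a stands in for the real object hash. It assumes that the caller only needs the set of object names, so when the skip flag is set the object contents are never read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag, standing in for PARSE_OBJECT_SKIP_HASH_CHECK. */
#define TOY_SKIP_HASH_CHECK 1u

/* Toy object: "id" plays the role of the object name recorded in a pack. */
struct toy_object {
	uint32_t id;
	const unsigned char *data;
	size_t len;
};

/* Counts how many bytes were hashed, as a proxy for the work done. */
static unsigned long toy_bytes_hashed;

/* FNV-1a, used here only as a cheap stand-in for a real object hash. */
static uint32_t toy_hash(const unsigned char *data, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= data[i];
		h *= 16777619u;
	}
	toy_bytes_hashed += len;
	return h;
}

/* Returns 1 if the object is accepted, 0 on a hash mismatch. */
static int toy_parse(const struct toy_object *obj, unsigned flags)
{
	if (flags & TOY_SKIP_HASH_CHECK)
		return 1; /* trust the recorded id; contents are not read */
	return toy_hash(obj->data, obj->len) == obj->id;
}

/* Collects the ids of all accepted objects into out; returns the count. */
static size_t toy_collect(const struct toy_object *objs, size_t n,
			  unsigned flags, uint32_t *out)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (toy_parse(&objs[i], flags))
			out[count++] = objs[i].id;
	return count;
}
```

Calling `toy_collect()` with `flags` set to 0 hashes every byte of every object, while passing `TOY_SKIP_HASH_CHECK` yields the same set of ids for intact objects without touching their contents. The cost of skipping is that a corrupted object would be added to the set unnoticed, which is acceptable when the goal is only to enumerate names.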