midx: support reading incremental MIDX chains

Now that the MIDX machinery's internals have been taught to understand
incremental MIDXs over the previous handful of commits, the MIDX
machinery itself can begin reading incremental MIDXs.

(Note that while the on-disk format for incremental MIDXs has been
defined, the writing end has not been implemented. This will take place
in the commit after next.)

The core of this change involves following the order specified in the
MIDX chain in reverse and opening up MIDXs in the chain one-by-one,
adding them to the previous layer's `->base_midx` pointer at each step.

To implement this, the `load_multi_pack_index()` function is taught to
fall back to a new `load_multi_pack_index_chain()` function when loading
a non-incremental MIDX via `load_multi_pack_index_one()` fails.

When loading a MIDX chain, `load_midx_chain_fd_st()` reads each line in
the file one-by-one and dispatches calls to
`load_multi_pack_index_one()` to read each layer of the MIDX chain. Once
a layer has been read successfully, it is added to the MIDX chain by
calling `add_midx_to_chain()`, which validates the contents of the `BASE`
chunk, performs some bounds checks on the number of combined packs and
objects, and attaches the new MIDX by pointing its `base_midx` at the
existing part of the chain.
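
As a rough illustration of the attach step described above, here is a
minimal, self-contained sketch. It is not the code from this commit:
`struct counted_layer`, `u32_add_overflows()`, `attach_layer()`, and the
`*_in_base` counters are hypothetical stand-ins, and the `BASE` chunk
validation is left out entirely. It only shows how the combined pack and
object counts could be bounds-checked before a new layer is pointed at
the existing chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layer; NOT the real struct multi_pack_index. */
struct counted_layer {
	uint32_t num_packs, num_objects;                 /* this layer only */
	uint32_t num_packs_in_base, num_objects_in_base; /* all older layers */
	struct counted_layer *base_midx;                 /* NULL at the base */
};

/* Returns nonzero if a + b would wrap around in 32 bits. */
static int u32_add_overflows(uint32_t a, uint32_t b)
{
	return a > UINT32_MAX - b;
}

/*
 * Attach `midx` on top of `chain` (NULL when `midx` is the first layer).
 * Returns 0 on success, or -1 without modifying `midx` if the combined
 * counts of the existing chain do not fit in 32 bits.
 */
static int attach_layer(struct counted_layer *midx, struct counted_layer *chain)
{
	assert(midx);
	if (chain) {
		if (u32_add_overflows(chain->num_packs, chain->num_packs_in_base) ||
		    u32_add_overflows(chain->num_objects, chain->num_objects_in_base))
			return -1;
		midx->num_packs_in_base =
			chain->num_packs + chain->num_packs_in_base;
		midx->num_objects_in_base =
			chain->num_objects + chain->num_objects_in_base;
	}
	midx->base_midx = chain;
	return 0;
}
```

A loader built along these lines would call `attach_layer()` once per
line of the chain file, passing the layer it just opened together with
the chain built so far, and stop at the first failure.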

As a supplement to this, introduce a new mode in the test-read-midx
test-tool which allows us to read the information for a specific MIDX in
the chain by specifying its trailing checksum via the command-line
arguments like so:

    $ test-tool read-midx .git/objects [checksum]
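
Selecting a layer by its checksum amounts to walking the chain from the
newest layer towards the base until a match is found. The sketch below
illustrates that idea only; `struct sum_layer`, `find_layer()`, and the
fixed 20-byte `HASH_LEN` are assumptions made for the example and are
not taken from the test-tool's source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20 /* assumption: SHA-1 sized checksums */

/* Hypothetical layer carrying only a checksum and a link to its base. */
struct sum_layer {
	unsigned char checksum[HASH_LEN];
	struct sum_layer *base_midx; /* next-older layer, NULL at the base */
};

/*
 * Walk from the newest layer towards the base and return the layer
 * whose checksum equals `want`, or NULL if no layer matches.
 */
static struct sum_layer *find_layer(struct sum_layer *tip,
				    const unsigned char *want)
{
	assert(want);
	for (; tip; tip = tip->base_midx)
		if (!memcmp(tip->checksum, want, HASH_LEN))
			return tip;
	return NULL;
}
```

A NULL result corresponds to the case where the requested checksum does
not name any layer in the chain, which a caller would report as an
error.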

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taylor Blau
2024-08-06 11:37:55 -04:00
committed by Junio C Hamano
parent 97fd770ea1
commit b80236d0e3
4 changed files with 201 additions and 19 deletions

midx.h (7 lines changed)

@@ -24,6 +24,7 @@ struct bitmapped_pack;
#define MIDX_CHUNKID_OBJECTOFFSETS 0x4f4f4646 /* "OOFF" */
#define MIDX_CHUNKID_LARGEOFFSETS 0x4c4f4646 /* "LOFF" */
#define MIDX_CHUNKID_REVINDEX 0x52494458 /* "RIDX" */
#define MIDX_CHUNKID_BASE 0x42415345 /* "BASE" */
#define MIDX_CHUNK_OFFSET_WIDTH (2 * sizeof(uint32_t))
#define MIDX_LARGE_OFFSET_NEEDED 0x80000000
@@ -50,6 +51,7 @@ struct multi_pack_index {
int preferred_pack_idx;
int local;
int has_chain;
const unsigned char *chunk_pack_names;
size_t chunk_pack_names_len;
@@ -80,11 +82,16 @@ struct multi_pack_index {
#define MIDX_EXT_REV "rev"
#define MIDX_EXT_BITMAP "bitmap"
#define MIDX_EXT_MIDX "midx"
const unsigned char *get_midx_checksum(struct multi_pack_index *m);
void get_midx_filename(struct strbuf *out, const char *object_dir);
void get_midx_filename_ext(struct strbuf *out, const char *object_dir,
const unsigned char *hash, const char *ext);
void get_midx_chain_dirname(struct strbuf *buf, const char *object_dir);
void get_midx_chain_filename(struct strbuf *buf, const char *object_dir);
void get_split_midx_filename_ext(struct strbuf *buf, const char *object_dir,
const unsigned char *hash, const char *ext);
struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);