git/commit-graph.h
Ævar Arnfjörð Bjarmason 43d3561805 commit-graph write: don't die if the existing graph is corrupt
When the commit-graph is written we end up calling
parse_commit(). This will in turn invoke code that'll consult the
existing commit-graph about the commit, if the graph is corrupted we
die.

We thus get into a state where a failing "commit-graph verify" can't
be followed-up with a "commit-graph write" if core.commitGraph=true is
set, the graph either needs to be manually removed to proceed, or
core.commitGraph needs to be set to "false".

Change the "commit-graph write" codepath to use a new
parse_commit_no_graph() helper instead of parse_commit() to avoid
this. The latter will call repo_parse_commit_internal() with
use_commit_graph=1 as seen in 177722b344 ("commit: integrate commit
graph with commit parsing", 2018-04-10).

Not using the old graph at all slows down the writing of the new graph
by some small amount, but is a sensible way to prevent an error in the
existing commit-graph from spreading.

Just fixing the current issue would be likely to result in code that's
inadvertently broken in the future. New code might use the
commit-graph at a distance. To detect such cases introduce a
"GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD" setting used when we do our
corruption tests, and test that a "write/verify" combo works after
every one of our current test cases where we now detect commit-graph
corruption.

Some of the code changes here might be strictly unnecessary, e.g. I
was unable to find cases where the parse_commit() called from
write_graph_chunk_data() didn't exit early due to
"item->object.parsed" being true in
repo_parse_commit_internal() (before the use_commit_graph=1 has any
effect). But let's also convert those cases for good measure, we do
not have exhaustive tests for all possible types of commit-graph
corruption.

This might need to be re-visited if we learn to write the commit-graph
incrementally, but probably not. Hopefully we'll just start by finding
out what commits we have in total, then read the old graph(s) to see
what they cover, and finally write a new graph file with everything
that's missing. In that case the new graph writing code just needs to
continue to use e.g. a parse_commit() that doesn't consult the
existing commit-graphs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-01 12:14:50 +09:00

81 lines
2.2 KiB
C

#ifndef COMMIT_GRAPH_H
#define COMMIT_GRAPH_H
#include "git-compat-util.h"
#include "repository.h"
#include "string-list.h"
#include "cache.h"
#define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH"
#define GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD "GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD"
struct commit;
char *get_commit_graph_filename(const char *obj_dir);
int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
/*
* Given a commit struct, try to fill the commit struct info, including:
* 1. tree object
* 2. date
* 3. parents.
*
* Returns 1 if and only if the commit was found in the packed graph.
*
* See parse_commit_buffer() for the fallback after this call.
*/
int parse_commit_in_graph(struct repository *r, struct commit *item);
/*
* It is possible that we loaded commit contents from the commit buffer,
* but we also want to ensure the commit-graph content is correctly
* checked and filled. Fill the graph_pos and generation members of
* the given commit.
*/
void load_commit_graph_info(struct repository *r, struct commit *item);
struct tree *get_commit_tree_in_graph(struct repository *r,
const struct commit *c);
struct commit_graph {
int graph_fd;
const unsigned char *data;
size_t data_len;
unsigned char hash_len;
unsigned char num_chunks;
uint32_t num_commits;
struct object_id oid;
const uint32_t *chunk_oid_fanout;
const unsigned char *chunk_oid_lookup;
const unsigned char *chunk_commit_data;
const unsigned char *chunk_extra_edges;
};
struct commit_graph *load_commit_graph_one_fd_st(int fd, struct stat *st);
struct commit_graph *parse_commit_graph(void *graph_map, int fd,
size_t graph_size);
/*
* Return 1 if and only if the repository has a commit-graph
* file and generation numbers are computed in that file.
*/
int generation_numbers_enabled(struct repository *r);
void write_commit_graph_reachable(const char *obj_dir, int append,
int report_progress);
void write_commit_graph(const char *obj_dir,
struct string_list *pack_indexes,
struct string_list *commit_hex,
int append, int report_progress);
int verify_commit_graph(struct repository *r, struct commit_graph *g);
void close_commit_graph(struct repository *);
void free_commit_graph(struct commit_graph *);
#endif