development/git - HydraGit

mirror of https://github.com/git/git synced 2024-10-30 14:03:28 +00:00

Author	SHA1	Message	Date
Stefan Beller	c1f5eb4962	commit: add repository argument to lookup_commit Add a repository argument to allow callers of lookup_commit to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:39 -07:00
Stefan Beller	f86bcc7b2c	tree: add repository argument to lookup_tree Add a repository argument to allow the callers of lookup_tree to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Stefan Beller	da14a7ff99	blob: add repository argument to lookup_blob Add a repository argument to allow the callers of lookup_blob to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Stefan Beller	1268dfac1e	object: add repository argument to object_as_type Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Stefan Beller	1ec5bfd24e	object: add repository argument to parse_object_buffer Add a repository argument to allow the callers of parse_object_buffer to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Stefan Beller	5abddd1eb7	object: add repository argument to lookup_object Add a repository argument to allow callers of lookup_object to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Stefan Beller	109cd76dd3	object: add repository argument to parse_object Add a repository argument to allow the callers of parse_object to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Junio C Hamano	b16b60f71b	Merge branch 'sb/object-store-grafts' into sb/object-store-lookup * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h	2018-06-29 10:43:28 -07:00
Junio C Hamano	110240588d	Merge branch 'sb/object-store-alloc' The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-alloc: alloc: allow arbitrary repositories for alloc functions object: allow create_object to handle arbitrary repositories object: allow grow_object_hash to handle arbitrary repositories alloc: add repository argument to alloc_commit_index alloc: add repository argument to alloc_report alloc: add repository argument to alloc_object_node alloc: add repository argument to alloc_tag_node alloc: add repository argument to alloc_commit_node alloc: add repository argument to alloc_tree_node alloc: add repository argument to alloc_blob_node object: add repository argument to grow_object_hash object: add repository argument to create_object repository: introduce parsed objects field	2018-06-25 13:22:38 -07:00
Junio C Hamano	a856e7d69f	Merge branch 'ds/commit-graph-lockfile-fix' Update to ds/generation-numbers topic. * ds/commit-graph-lockfile-fix: commit-graph: fix UX issue when .lock file exists commit-graph.txt: update design document merge: check config before loading commits commit: use generation number in remove_redundant() commit: add short-circuit to paint_down_to_common() commit: use generation numbers for in_merge_bases() ref-filter: use generation number for --contains commit-graph: always load commit-graph information commit: use generations in paint_down_to_common() commit-graph: compute generation numbers commit: add generation number to struct commit ref-filter: fix outdated comment on in_commit_list	2018-06-25 13:22:36 -07:00
Junio C Hamano	fcb6df3254	Merge branch 'sb/oid-object-info' The codepath around object-info API has been taught to take the repository object (which in turn tells the API which object store the objects are to be located). * sb/oid-object-info: cache.h: allow oid_object_info to handle arbitrary repositories packfile: add repository argument to cache_or_unpack_entry packfile: add repository argument to unpack_entry packfile: add repository argument to read_object packfile: add repository argument to packed_object_info packfile: add repository argument to packed_to_object_type packfile: add repository argument to retry_bad_packed_offset cache.h: add repository argument to oid_object_info cache.h: add repository argument to oid_object_info_extended	2018-05-23 14:38:16 +09:00
Derrick Stolee	e2838d85b6	commit-graph: always load commit-graph information Most code paths load commits using lookup_commit() and then parse_commit(). In some cases, including some branch lookups, the commit is parsed using parse_object_buffer() which side-steps parse_commit() in favor of parse_commit_buffer(). With generation numbers in the commit-graph, we need to ensure that any commit that exists in the commit-graph file has its generation number loaded. Create new load_commit_graph_info() method to fill in the information for a commit that exists only in the commit-graph file. Call it from parse_commit_buffer() after loading the other commit information from the given buffer. Only fill this information when specified by the 'check_graph' parameter. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-22 12:36:34 +09:00
Stefan Beller	eee4502baa	shallow: migrate shallow information into the object parser We need to convert the shallow functions all at the same time as we move the data structures they operate on into the repository. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-18 08:13:10 +09:00
Stefan Beller	7a1dc605af	object.c: clear replace map before freeing it Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-18 07:53:33 +09:00
Stefan Beller	cbd53a2193	object-store: move object access functions to object-store.h This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-16 11:42:03 +09:00
Stefan Beller	14ba97f81c	alloc: allow arbitrary repositories for alloc functions We have to convert all of the alloc functions at once, because alloc_report uses a funky macro for reporting. It is better for the sake of mechanical conversion to convert multiple functions at once rather than changing the structure of the reporting function. We record all memory allocation in alloc.c, and free them in clear_alloc_state, which is called for all repositories except the_repository. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-16 11:16:50 +09:00
Stefan Beller	d587307205	object.c: free replace map in raw_object_store_clear The replace map for objects was missed to free in the object store in the conversion of `174774cd51` (Merge branch 'sb/object-store-replace', 2018-05-08) Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-10 19:06:51 +09:00
Stefan Beller	341e45e46b	object: allow create_object to handle arbitrary repositories Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:37 +09:00
Stefan Beller	346a817a72	object: allow grow_object_hash to handle arbitrary repositories Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	dd5d9deb01	alloc: add repository argument to alloc_commit_index This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	13e3fdcb76	alloc: add repository argument to alloc_object_node This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Jonathan Nieder	c077a4526b	object: add repository argument to grow_object_hash Add a repository argument to allow the caller of grow_object_hash to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	68f95d382b	object: add repository argument to create_object Add a repository argument to allow the callers of create_object to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	99bf115c87	repository: introduce parsed objects field Convert the existing global cache for parsed objects (obj_hash) into repository-specific parsed object caches. Existing code that uses obj_hash are modified to use the parsed object cache of the_repository; future patches will use the parsed object caches of other repositories. Another future use case for a pool of objects is ease of memory management in revision walking: If we can free the rev-list related memory early in pack-objects (e.g. part of repack operation) then it could lower memory pressure significantly when running on large repos. While this has been discussed on the mailing list lately, this series doesn't implement this. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	0df8e96566	cache.h: add repository argument to oid_object_info Add a repository argument to allow the callers of oid_object_info to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Stefan Beller	1f2e7ceabc	replace-object: add repository argument to lookup_replace_object Add a repository argument to allow callers of lookup_replace_object to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-12 11:38:56 +09:00
Stefan Beller	47f351e9b3	object-store: move lookup_replace_object to replace-object.h lookup_replace_object is a low-level function that most users of the object store do not need to use directly. Move it to replace-object.h to avoid a dependency loop in an upcoming change to its inline definition that will make use of repository.h. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-12 11:38:56 +09:00
Junio C Hamano	cf0b1793ea	Merge branch 'sb/object-store' Refactoring the internal global data structure to make it possible to open multiple repositories, work with and then close them. Rerolled by Duy on top of a separate preliminary clean-up topic. The resulting structure of the topics looked very sensible. * sb/object-store: (27 commits) sha1_file: allow sha1_loose_object_info to handle arbitrary repositories sha1_file: allow map_sha1_file to handle arbitrary repositories sha1_file: allow map_sha1_file_1 to handle arbitrary repositories sha1_file: allow open_sha1_file to handle arbitrary repositories sha1_file: allow stat_sha1_file to handle arbitrary repositories sha1_file: allow sha1_file_name to handle arbitrary repositories sha1_file: add repository argument to sha1_loose_object_info sha1_file: add repository argument to map_sha1_file sha1_file: add repository argument to map_sha1_file_1 sha1_file: add repository argument to open_sha1_file sha1_file: add repository argument to stat_sha1_file sha1_file: add repository argument to sha1_file_name sha1_file: allow prepare_alt_odb to handle arbitrary repositories sha1_file: allow link_alt_odb_entries to handle arbitrary repositories sha1_file: add repository argument to prepare_alt_odb sha1_file: add repository argument to link_alt_odb_entries sha1_file: add repository argument to read_info_alternates sha1_file: add repository argument to link_alt_odb_entry sha1_file: add raw_object_store argument to alt_odb_usable pack: move approximate object count to object store ...	2018-04-11 13:09:55 +09:00
Junio C Hamano	2d5792f071	Merge branch 'bw/c-plus-plus' into ds/lazy-load-trees * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...	2018-04-11 10:46:32 +09:00
Stefan Beller	d0b5986622	object-store: close all packs upon clearing the object store Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:55 -07:00
Stefan Beller	a80d72db2a	object-store: move packed_git and packed_git_mru to object store In a process with multiple repositories open, packfile accessors should be associated to a single repository and not shared globally. Move packed_git and packed_git_mru into the_repository and adjust callers to reflect this. [nd: while at there, wrap access to these two fields in get_packed_git() and get_packed_git_mru(). This allows us to lazily initialize these fields without caller doing that explicitly] Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:46 -07:00
Stefan Beller	97501e933a	object-store: free alt_odb_list Free the memory and reset alt_odb_{list, tail} to NULL. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-23 11:06:01 -07:00
Stefan Beller	90c62155d6	repository: introduce raw object store field The raw object store field will contain any objects needed for access to objects in a given repository. This patch introduces the raw object store and populates it with the `objectdir`, which used to be part of the repository struct. As the struct gains members, we'll also populate the function to clear the memory for these members. In a later step, we'll introduce a struct object_parser, that will complement the object parsing in a repository struct: The raw object parser is the layer that will provide access to raw object content, while the higher level object parser code will parse raw objects and keeps track of parenthood and other object relationships using 'struct object'. For now only add the lower level to the repository struct. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-23 11:06:01 -07:00
brian m. carlson	b383a13cc0	Convert lookup_replace_object to struct object_id Convert both the argument and the return value to be pointers to struct object_id. Update the callers and their internals to deal with the new type. Remove several temporaries which are no longer needed. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:50 -07:00
brian m. carlson	b4f5aca40e	sha1_file: convert read_sha1_file to struct object_id Convert read_sha1_file to take a pointer to struct object_id and rename it read_object_file. Do the same for read_sha1_file_extended. Convert one use in grep.c to use the new function without any other code change, since the pointer being passed is a void pointer that is already initialized with a pointer to struct object_id. Update the declaration and definitions of the modified functions, and apply the following semantic patch to convert the remaining callers: @@ expression E1, E2, E3; @@ - read_sha1_file(E1.hash, E2, E3) + read_object_file(&E1, E2, E3) @@ expression E1, E2, E3; @@ - read_sha1_file(E1->hash, E2, E3) + read_object_file(E1, E2, E3) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1.hash, E2, E3, E4) + read_object_file_extended(&E1, E2, E3, E4) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1->hash, E2, E3, E4) + read_object_file_extended(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:50 -07:00
brian m. carlson	abef9020e3	sha1_file: convert sha1_object_info* to object_id Convert sha1_object_info and sha1_object_info_extended to take pointers to struct object_id and rename them to use "oid" instead of "sha1" in their names. Update the declaration and definition and apply the following semantic patch, plus the standard object_id transforms: @@ expression E1, E2; @@ - sha1_object_info(E1.hash, E2) + oid_object_info(&E1, E2) @@ expression E1, E2; @@ - sha1_object_info(E1->hash, E2) + oid_object_info(E1, E2) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1.hash, E2, E3) + oid_object_info_extended(&E1, E2, E3) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1->hash, E2, E3) + oid_object_info_extended(E1, E2, E3) Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:49 -07:00
brian m. carlson	17e65451e3	sha1_file: convert check_sha1_signature to struct object_id Convert this function to take a pointer to struct object_id and rename it check_object_signature. Introduce temporaries to convert the return values of lookup_replace_object and lookup_replace_object_extended into struct object_id. The temporaries are needed because in order to convert lookup_replace_object, open_istream needs to be converted, and open_istream needs check_sha1_signature to be converted, causing a loop of dependencies. The temporaries will be removed in a future patch. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:49 -07:00
Junio C Hamano	169c9c0169	Merge branch 'bw/c-plus-plus' Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that targets C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...	2018-03-06 14:54:07 -08:00
Brandon Williams	debca9d2fe	object: rename function 'typename' to 'type_name' Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-14 13:10:05 -08:00
Junio C Hamano	f3d618d2bf	Merge branch 'jh/fsck-promisors' In preparation for implementing narrow/partial clone, the machinery for checking object connectivity used by gc and fsck has been taught that a missing object is OK when it is referenced by a packfile specially marked as coming from trusted repository that promises to make them available on-demand and lazily. * jh/fsck-promisors: gc: do not repack promisor packfiles rev-list: support termination at promisor objects sha1_file: support lazily fetching missing objects introduce fetch-object: fetch one promisor object index-pack: refactor writing of .keep files fsck: support promisor objects as CLI argument fsck: support referenced promisor objects fsck: support refs pointing to promisor objects fsck: introduce partialclone extension extension.partialclone: introduce partial clone extension	2018-02-13 13:39:03 -08:00
René Scharfe	4ad315fc99	object: add clear_commit_marks_all() Add a function for clearing the commit marks of all in-core commit objects. It's similar to clear_object_flags(), but more precise, since it leaves the other object types alone. It still has to iterate through them, though. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-28 13:50:05 -08:00
Jonathan Tan	df11e19648	rev-list: support termination at promisor objects Teach rev-list to support termination of an object traversal at any object from a promisor remote (whether one that the local repo also has, or one that the local repo knows about because it has another promisor object that references it). This will be used subsequently in gc and in the connectivity check used by fetch. For efficiency, if an object is referenced by a promisor object, and is in the local repo only as a non-promisor object, object traversal will not stop there. This is to avoid building the list of promisor object references. (In list-objects.c, the case where obj is NULL in process_blob() and process_tree() do not need to be changed because those happen only when there is a conflict between the expected type and the existing object. If the object doesn't exist, an object will be synthesized, which is fine.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:52:42 -08:00
Martin Ågren	7199203937	object_array: add and use `object_array_pop()` In a couple of places, we pop objects off an object array `foo` by decreasing `foo.nr`. We access `foo.nr` in many places, but most if not all other times we do so read-only, e.g., as we iterate over the array. But when we change `foo.nr` behind the array's back, it feels a bit nasty and looks like it might leak memory. Leaks happen if the popped element has an allocated `name` or `path`. At the moment, that is not the case. Still, 1) the object array might gain more fields that want to be freed, 2) a code path where we pop might start using names or paths, 3) one of these code paths might be copied to somewhere where we do, and 4) using a dedicated function for popping is conceptually cleaner. Introduce and use `object_array_pop()` instead. Release memory in the new function. Document that popping an object leaves the associated elements in limbo. The converted places were identified by grepping for "\.nr\>" and looking for "--". Make the new function return NULL on an empty array. This is consistent with `pop_commit()` and allows the following: while ((o = object_array_pop(&foo)) != NULL) { // do something } But as noted above, we don't need to go out of our way to avoid reading `foo.nr`. This is probably more readable: while (foo.nr) { ... o = object_array_pop(&foo); // do something } The name of `object_array_pop()` does not quite align with `add_object_array()`. That is unfortunate. On the other hand, it matches `object_array_clear()`. Arguably it's `add_...` that is the odd one out, since it reads like it's used to "add" an "object array". For that reason, side with `object_array_clear()`. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-24 10:06:04 +09:00
Jonathan Tan	092c55d094	object: remove "used" field from struct object The "used" field in struct object is only used by builtin/fsck. Remove that field and modify builtin/fsck to use a flag instead. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-20 14:54:08 -07:00
Ævar Arnfjörð Bjarmason	6a83d90207	coccinelle: make use of the "type" FREE_AND_NULL() rule Apply the result of the just-added coccinelle rule. This manually excludes a few occurrences, mostly things that resulted in many FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-16 12:44:03 -07:00
brian m. carlson	c251c83df2	object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	d3101b533d	Convert lookup_tag to struct object_id Convert lookup_tag to take a pointer to struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	740ee055c6	Convert lookup_tree to struct object_id Convert the lookup_tree function to take a pointer to struct object_id. The commit was created with manual changes to tree.c, tree.h, and object.c, plus the following semantic patch: @@ @@ - lookup_tree(EMPTY_TREE_SHA1_BIN) + lookup_tree(&empty_tree_oid) @@ expression E1; @@ - lookup_tree(E1.hash) + lookup_tree(&E1) @@ expression E1; @@ - lookup_tree(E1->hash) + lookup_tree(E1) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	3aca1fc6c9	Convert lookup_blob to struct object_id Convert lookup_blob to take a pointer to struct object_id. The commit was created with manual changes to blob.c and blob.h, plus the following semantic patch: @@ expression E1; @@ - lookup_blob(E1.hash) + lookup_blob(&E1) @@ expression E1; @@ - lookup_blob(E1->hash) + lookup_blob(E1) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	3e9309815d	Convert remaining callers of lookup_blob to object_id All but a few callers of lookup_blob have been converted to struct object_id. Introduce a temporary, which will be removed later, into parse_object to ease the transition, and convert the remaining callers so that we can update lookup_blob to take struct object_id *. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	bc83266abe	Convert lookup_commit* to struct object_id Convert lookup_commit, lookup_commit_or_die, lookup_commit_reference, and lookup_commit_reference_gently to take struct object_id arguments. Introduce a temporary in parse_object buffer in order to convert this function. This is required since in order to convert parse_object and parse_object_buffer, lookup_commit_reference_gently and lookup_commit_or_die would need to be converted. Not introducing a temporary would therefore require that lookup_commit_or_die take a struct object_id , but lookup_commit would take unsigned char , leaving a confusing and hard-to-use interface. parse_object_buffer will lose this temporary in a later patch. This commit was created with manual changes to commit.c, commit.h, and object.c, plus the following semantic patch: @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1.hash, E2) + lookup_commit_reference_gently(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1->hash, E2) + lookup_commit_reference_gently(E1, E2) @@ expression E1; @@ - lookup_commit_reference(E1.hash) + lookup_commit_reference(&E1) @@ expression E1; @@ - lookup_commit_reference(E1->hash) + lookup_commit_reference(E1) @@ expression E1; @@ - lookup_commit(E1.hash) + lookup_commit(&E1) @@ expression E1; @@ - lookup_commit(E1->hash) + lookup_commit(E1) @@ expression E1, E2; @@ - lookup_commit_or_die(E1.hash, E2) + lookup_commit_or_die(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_or_die(E1->hash, E2) + lookup_commit_or_die(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
René Scharfe	35d803bc9a	use SWAP macro Apply the semantic patch swap.cocci to convert hand-rolled swaps to use the macro SWAP. The resulting code is shorter and easier to read, the object code is effectively unchanged. The patch for object.c had to be hand-edited in order to preserve the comment before the change; Coccinelle tried to eat it for some reason. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-30 14:17:00 -08:00
brian m. carlson	ed1c9977cb	Remove get_object_hash. Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	f2fd0760f6	Convert struct object to object_id struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	7999b2cf77	Add several uses of get_object_hash. Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
Junio C Hamano	1156097296	Merge branch 'jk/type-from-string-gently' "git cat-file bl $blob" failed to barf even though there is no object type that is "bl". * jk/type-from-string-gently: type_from_string_gently: make sure length matches	2015-05-05 21:00:29 -07:00
Jeff King	b7994af0f9	type_from_string_gently: make sure length matches When commit `fe8e3b7` refactored type_from_string to allow input that was not NUL-terminated, it switched to using strncmp instead of strcmp. But this means we check only the first "len" bytes of the strings, and ignore any remaining bytes in the object_type_string. We should make sure that it is also "len" bytes, or else we would accept "comm" as "commit", and so forth. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-04-17 13:54:39 -07:00
Jeff King	189a122249	drop add_object_array_with_mode This is a thin compatibility wrapper around add_pending_object_with_path. But the only caller is add_object_array, which is itself just a thin compatibility wrapper. There are no external callers, so we can just remove this middle wrapper. Noticed-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-19 15:28:30 -07:00
Jeff King	9e0c3c4fcd	make add_object_array_with_context interface more sane When you resolve a sha1, you can optionally keep any context found during the resolution, including the path and mode of a tree entry (e.g., when looking up "HEAD:subdir/file.c"). The add_object_array_with_context function lets you then attach that context to an entry in a list. Unfortunately, the interface for doing so is horrible. The object_context structure is large and most object_array users do not use it. Therefore we keep a pointer to the structure to avoid burdening other users too much. But that means when we do use it that we must allocate the struct ourselves. And the struct contains a fixed PATH_MAX-sized buffer, which makes this wholly unsuitable for any large arrays. We can observe that there is only a single user of the "with_context" variant: builtin/grep.c. And in that use case, the only element we care about is the path. We can therefore store only the path as a pointer (the context's mode field was redundant with the object_array_entry itself, and nobody actually cared about the surrounding tree). This still requires a strdup of the pathname, but at least we are only consuming the minimum amount of memory for each string. We can also handle the copying ourselves in add_object_array_*, and free it as appropriate in object_array_release_entry. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:44 -07:00
Jeff King	46be823124	object_array: add a "clear" function There's currently no easy way to free the memory associated with an object_array (and in most cases, we simply leak the memory in a rev_info's pending array). Let's provide a helper to make this easier to handle. We can make use of it in list-objects.c, which does the same thing by hand (but fails to free the "name" field of each entry, potentially leaking memory). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:37 -07:00
Jeff King	68f492359e	object_array: factor out slopbuf-freeing logic This is not a lot of code, but it's a logical construct that should not need to be repeated (and we are about to add a third repetition). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-16 10:10:36 -07:00
Junio C Hamano	1c2ea2cdc0	Merge branch 'rs/realloc-array' Code cleanup. * rs/realloc-array: use REALLOC_ARRAY for changing the allocation size of arrays add macro REALLOC_ARRAY	2014-09-26 14:39:45 -07:00
René Scharfe	2756ca4347	use REALLOC_ARRAY for changing the allocation size of arrays Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-09-18 09:13:42 -07:00
Johannes Schindelin	fe8e3b7180	Refactor type_from_string() to allow continuing after detecting an error In the next commits, we will enhance the fsck_tag() function to check tag objects more thoroughly. To this end, we need a function to verify that a given string is a valid object type, but that does not die() in the negative case. While at it, prepare type_from_string() for counted strings, i.e. strings with an explicitly specified length rather than a NUL termination. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-09-10 13:52:13 -07:00
Junio C Hamano	10b944b37b	Merge branch 'jk/alloc-commit-id' Make sure all in-core commit objects are assigned a unique number so that they can be annotated using the commit-slab API. * jk/alloc-commit-id: diff-tree: avoid lookup_unknown_object object_as_type: set commit index alloc: factor out commit index add object_as_type helper for casting objects parse_object_buffer: do not set object type move setting of object->type to alloc_* functions alloc: write out allocator definitions alloc.c: remove the alloc_raw_commit_node() function	2014-07-22 10:59:25 -07:00
Jeff King	d66bebcbcf	object_as_type: set commit index The point of the "index" field of struct commit is that every allocated commit would have one. It is supposed to be an invariant that whenever object->type is set to OBJ_COMMIT, we have a unique index. Commit `969eba6` (commit: push commit_index update into alloc_commit_node, 2014-06-10) covered this case for newly-allocated commits. However, we may also allocate an "unknown" object via lookup_unknown_object, and only later convert it to a commit. We must make sure that we set the commit index when we switch the type field. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-13 18:59:05 -07:00
Jeff King	8ff226a9d5	add object_as_type helper for casting objects When we call lookup_commit, lookup_tree, etc, the logic goes something like: 1. Look for an existing object struct. If we don't have one, allocate and return a new one. 2. Double check that any object we have is the expected type (and complain and return NULL otherwise). 3. Convert an object with type OBJ_NONE (from a prior call to lookup_unknown_object) to the expected type. We can encapsulate steps 2 and 3 in a helper function which checks whether we have the expected object type, converts OBJ_NONE as appropriate, and returns the object. Not only does this shorten the code, but it also provides one central location for converting OBJ_NONE objects into objects of other types. Future patches will use that to enforce type-specific invariants. Since this is a refactoring, we would want it to behave exactly as the current code. It takes a little reasoning to see that this is the case: - for lookup_{commit,tree,etc} functions, we are just pulling steps 2 and 3 into a function that does the same thing. - for the call in peel_object, we currently only do step 3 (but we want to consolidate it with the others, as mentioned above). However, step 2 is a noop here, as the surrounding conditional makes sure we have OBJ_NONE (which we want to keep to avoid an extraneous call to sha1_object_info). - for the call in lookup_commit_reference_gently, we are currently doing step 2 but not step 3. However, step 3 is a noop here. The object we got will have just come from deref_tag, which must have figured out the type for each object in order to know when to stop peeling. Therefore the type will never be OBJ_NONE. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-13 18:59:05 -07:00
Jeff King	5af01caa08	parse_object_buffer: do not set object type The only way that "obj" can be non-NULL is if it came from one of the lookup_* functions. These functions always ensure that the object has the expected type (and return NULL otherwise), so there is no need for us to set the type. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-13 18:59:05 -07:00
Jeff King	d36f51c13b	move setting of object->type to alloc_* functions The "struct object" type implements basic object polymorphism. Individual instances are allocated as concrete types (or as a union type that can store any object), and a "struct object *" can be cast into its real type after examining its "type" enum. This means it is dangerous to have a type field that does not match the allocation (e.g., setting the type field of a "struct blob" to "OBJ_COMMIT" would mean that a reader might read past the allocated memory). In most of the current code this is not a problem; the first thing we do after allocating an object is usually to set its type field by passing it to create_object. However, the virtual commits we create in merge-recursive.c do not ever get their type set. This does not seem to have caused problems in practice, though (presumably because we always pass around a "struct commit" pointer and never even look at the type). We can fix this oversight and also make it harder for future code to get it wrong by setting the type directly in the object allocation functions. This will also make it easier to fix problems with commit index allocation, as we know that any object allocated by alloc_commit_node will meet the invariant that an object with an OBJ_COMMIT type field will have a unique index number. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-13 18:59:05 -07:00
Karsten Blees	039dc71a7c	hashmap: factor out getting a hash code from a SHA1 Copying the first bytes of a SHA1 is duplicated in six places, however, the implications (the actual value would depend on the endianness of the platform) is documented only once. Add a properly documented API for this. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-07 13:56:24 -07:00
Jeff King	8597ea3afe	commit: record buffer length in cache Most callsites which use the commit buffer try to use the cached version attached to the commit, rather than re-reading from disk. Unfortunately, that interface provides only a pointer to the NUL-terminated buffer, with no indication of the original length. For the most part, this doesn't matter. People do not put NULs in their commit messages, and the log code is happy to treat it all as a NUL-terminated string. However, some code paths do care. For example, when checking signatures, we want to be very careful that we verify all the bytes to avoid malicious trickery. This patch just adds an optional "size" out-pointer to get_commit_buffer and friends. The existing callers all pass NULL (there did not seem to be any obvious sites where we could avoid an immediate strlen() call, though perhaps with some further refactoring we could). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 12:09:38 -07:00
Jeff King	a97934d820	use get_cached_commit_buffer where appropriate Some call sites check commit->buffer to see whether we have a cached buffer, and if so, do some work with it. In the long run we may want to switch these code paths to make their decision on a different boolean flag (because checking the cache may get a little more expensive in the future). But for now, we can easily support them by converting the calls to use get_cached_commit_buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 12:08:17 -07:00
Jeff King	66c2827ea4	provide a helper to set the commit buffer Right now this is just a one-liner, but abstracting it will make it easier to change later. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 12:08:17 -07:00
Michael Haggerty	33bef7ea25	Document some functions defined in object.c Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-28 13:18:09 -08:00
Junio C Hamano	4197361e39	Merge branch 'mg/more-textconv' Make "git grep" and "git show" pay attention to --textconv when dealing with blob objects. * mg/more-textconv: grep: honor --textconv for the case rev:path grep: allow to use textconv filters t7008: demonstrate behavior of grep with textconv cat-file: do not die on --textconv without textconv filters show: honor --textconv for blobs diff_opt: track whether flags have been set explicitly t4030: demonstrate behavior of show with textconv	2013-10-23 13:21:31 -07:00
Nicolas Pitre	9f36c9b7f7	lookup_object: remove hashtable_index() and optimize hash_obj() hashtable_index() appears to be a close duplicate of hash_obj(). Keep only the later and make it usable for all cases. Also remove the modulus as this is an expensive operation. The size argument is always a power of 2 anyway, so a simple mask operation provides the same result. On a 'git rev-list --all --objects' run this decreased the time spent in lookup_object from 27.5% to 24.1%. [jc: with a few comments on "modulus turned into mask" by Peff] Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-11 12:25:33 -07:00
Junio C Hamano	5701c3d701	Merge branch 'sb/parse-object-buffer-eaten' * sb/parse-object-buffer-eaten: parse_object_buffer: correct freeing the buffer	2013-07-22 11:23:33 -07:00
Stefan Beller	8e92e8f242	parse_object_buffer: correct freeing the buffer If we exit early in the function parse_object_buffer, we did not write to eaten_p. Then the calling function parse_object, which looks like the following with respect to the eaten variable, cannot rely on a proper value set in eaten, hence the freeing of the buffer depends on random values in memory. struct object parse_object(const unsigned char *sha1) { int eaten; ... obj = parse_object_buffer(sha1, type, size, buffer, &eaten); if (!eaten) free(buffer); } This change makes sure, the buffer freeing condition is deterministic. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-17 18:10:51 -07:00
Junio C Hamano	ede63a195c	Merge branch 'mh/reflife' Define memory ownership and lifetime rules for what for-each-ref feeds to its callbacks (in short, "you do not own it, so make a copy if you want to keep it"). * mh/reflife: (25 commits) refs: document the lifetime of the args passed to each_ref_fn register_ref(): make a copy of the bad reference SHA-1 exclude_existing(): set existing_refs.strdup_strings string_list_add_refs_by_glob(): add a comment about memory management string_list_add_one_ref(): rename first parameter to "refname" show_head_ref(): rename first parameter to "refname" show_head_ref(): do not shadow name of argument add_existing(): do not retain a reference to sha1 do_fetch(): clean up existing_refs before exiting do_fetch(): reduce scope of peer_item object_array_entry: fix memory handling of the name field find_first_merges(): remove unnecessary code find_first_merges(): initialize merges variable using initializer fsck: don't put a void-shaped peg in a char-shaped hole object_array_remove_duplicates(): rewrite to reduce copying revision: use object_array_filter() in implementation of gc_boundary() object_array: add function object_array_filter() revision: split some overly-long lines cmd_diff(): make it obvious which cases are exclusive of each other cmd_diff(): rename local variable "list" -> "entry" ...	2013-06-14 08:46:14 -07:00
Michael Haggerty	31faeb2088	object_array_entry: fix memory handling of the name field Previously, the memory management of the object_array_entry::name field was inconsistent and undocumented. object_array_entries are ultimately created by a single function, add_object_array_with_mode(), which has an argument "const char name". This function used to simply set the name field to reference the string pointed to by the name parameter, and nobody on the object_array side ever freed the memory. Thus, it assumed that the memory for the name field would be managed by the caller, and that the lifetime of that string would be at least as long as the lifetime of the object_array_entry. But callers were inconsistent: Some passed pointers to constant strings or argv entries, which was OK. * Some passed pointers to newly-allocated memory, but didn't arrange for the memory ever to be freed. * Some passed the return value of sha1_to_hex(), which is a pointer to a statically-allocated buffer that can be overwritten at any time. * Some passed pointers to refnames that they received from a for_each_ref()-type iteration, but the lifetimes of such refnames is not guaranteed by the refs API. Bring consistency to this mess by changing object_array to make its own copy for the object_array_entry::name field and free this memory when an object_array_entry is deleted from the array. Many callers were passing the empty string as the name parameter, so as a performance optimization, treat the empty string specially. Instead of making a copy, store a pointer to a statically-allocated empty string to object_array_entry::name. When deleting such an entry, skip the free(). Change the callers that were already passing copies to add_object_array_with_mode() to either skip the copy, or (if the memory needed to be allocated anyway) freeing the memory itself. A part of this commit effectively reverts `70d26c6e76` read_revisions_from_stdin: make copies for handle_revision_arg because the copying introduced by that commit (which is still necessary) is now done at a deeper level. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-02 15:28:46 -07:00
Junio C Hamano	4818cfcdcc	Merge branch 'jk/lookup-object-prefer-latest' Optimizes object lookup when the object hashtable starts to become crowded. * jk/lookup-object-prefer-latest: lookup_object: prioritize recently found objects	2013-05-29 14:29:59 -07:00
Michael Haggerty	1506510c17	object_array_remove_duplicates(): rewrite to reduce copying The old version copied one entry to its destination position, then deleted any matching entries from the tail of the array. This required the tail of the array to be copied multiple times. It didn't affect the complexity of the algorithm because the whole tail has to be searched through anyway. But all the copying was unnecessary. Instead, check for the existence of an entry with the same name in the head of the list before copying an entry to its final position. This way each entry has to be copied at most one time. Extract a helper function contains_name() to do a bit of the work. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-28 09:25:01 -07:00
Michael Haggerty	aeb4a51ef8	object_array: add function object_array_filter() Add a function that allows unwanted entries in an object_array to be removed. This encapsulation is a step towards giving object_array ownership of its entries' name memory. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-28 09:25:01 -07:00
Michael J Gruber	afa15f3cd8	grep: honor --textconv for the case rev:path Make "grep" honor the "--textconv" option also for the object case, i.e. when used with an argument "rev:path". Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-10 10:27:34 -07:00
Jeff King	9a414486d9	lookup_object: prioritize recently found objects The lookup_object function is backed by a hash table of all objects we have seen in the program. We manage collisions with a linear walk over the colliding entries, checking each with hashcmp(). The main cost of lookup is in these hashcmp() calls; finding our item in the first slot is cheaper than finding it in the second slot, which is cheaper than the third, and so on. If we assume that there is some locality to the object lookups (e.g., if X and Y collide, and we have just looked up X, the next lookup is more likely to be for X than for Y), then we can improve our average lookup speed by checking X before Y. This patch does so by swapping a found item to the front of the collision chain. The p0001 perf test reveals that this does indeed exploit locality in the case of "rev-list --all --objects": Test origin this tree ------------------------------------------------------------------------- 0001.1: rev-list --all 0.40(0.38+0.02) 0.40(0.36+0.03) +0.0% 0001.2: rev-list --all --objects 2.24(2.17+0.05) 1.86(1.79+0.05) -17.0% This is not surprising, as the full object traversal will hit the same tree entries over and over (e.g., for every commit that doesn't change "Documentation/", we will have to look up the same sha1 just to find out that we already processed it). The reason why this technique works (and does not violate any properties of the hash table) is subtle and bears some explanation. Let's imagine we get a lookup for sha1 `X`, and it hashes to bucket `i` in our table. That stretch of the table may look like: index \| i-1 \| i \| i+1 \| i+2 \| ----------------------------------- entry ... \| A \| B \| C \| X \| ... ----------------------------------- We start our probe at i, see that B does not match, nor does C, and finally find X. There may be multiple C's in the middle, but we know that there are no empty slots (or else we would not find X at all). We do not know the original index of B; it may be `i`, or it may be less than i (e.g., if it were `i-1`, it would collide with A and spill over into the `i` bucket). So it is acceptable for us to move it to the right of a contiguous stretch of entries (because we will find it from a linear walk starting anywhere at `i` or before), but never to the left (if we moved it to `i-1`, we would miss it when starting our walk at `i`). We do know the original index of X; it is `i`, so it is safe to place it anywhere in the contiguous stretch between `i` and where we found it (`i+2` in the this case). This patch does a pure swap; after finding X in the situation above, we would end with: index \| i-1 \| i \| i+1 \| i+2 \| ----------------------------------- entry ... \| A \| X \| C \| B \| ... ----------------------------------- We could instead bump X into the `i` slot, and then shift the whole contiguous chain down by one, resulting in: index \| i-1 \| i \| i+1 \| i+2 \| ----------------------------------- entry ... \| A \| X \| B \| C \| ... ----------------------------------- That puts our chain in true most-recently-used order. However, experiments show that it is not any faster (and in fact, is slightly slower due to the extra manipulation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-05-02 08:36:50 -07:00
Jeff King	75a9549047	avoid segfaults on parse_object failure Many call-sites of parse_object assume that they will get a non-NULL return value; this is not the case if we encounter an error while parsing the object. This patch adds a wrapper function around parse_object that handles dying automatically, and uses it anywhere we immediately try to access the return value as a non-NULL pointer (i.e., anywhere that we would currently segfault). This wrapper may also be useful in other places. The most obvious one is code like: o = parse_object(sha1); if (!o) die(...); However, these should not be mechanically converted to parse_object_or_die, as the die message is sometimes customized. Later patches can address these sites on a case-by-case basis. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-03-17 12:49:03 -07:00
Pete Wyckoff	82247e9bd5	remove superfluous newlines in error messages The error handling routines add a newline. Remove the duplicate ones in error messages. Signed-off-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-30 15:45:51 -07:00
Junio C Hamano	419f2ecf78	Merge branch 'hv/submodule-recurse-push' "git push --recurse-submodules" learns to optionally look into the histories of submodules bound to the superproject and push them out. By Heiko Voigt * hv/submodule-recurse-push: push: teach --recurse-submodules the on-demand option Refactor submodule push check to use string list instead of integer Teach revision walking machinery to walk multiple times sequencially	2012-04-24 14:40:20 -07:00
Heiko Voigt	bcc0a3ea38	Teach revision walking machinery to walk multiple times sequencially Previously it was not possible to iterate revisions twice using the revision walking api. We add a reset_revision_walk() which clears the used flags. This allows us to do multiple sequencial revision walks. We add the appropriate calls to the existing submodule machinery doing revision walks. This is done to avoid surprises if future code wants to call these functions more than once during the processes lifetime. Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-30 08:57:49 -07:00
Nguyễn Thái Ngọc Duy	090ea12671	parse_object: avoid putting whole blob in core Traditionally, all the callers of check_sha1_signature() first called read_sha1_file() to prepare the whole object data in core, and called this function. The function is used to revalidate what we read from the object database actually matches the object name we used to ask for the data from the object database. Update the API to allow callers to pass NULL as the object data, and have the function read and hash the object data using streaming API to recompute the object name, without having to hold everything in core at the same time. This is most useful in parse_object() that parses a blob object, because this caller does not have to keep the actual blob data around in memory after a "struct blob" is returned. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-07 09:07:38 -08:00
Jeff King	ccdc6037fe	parse_object: try internal cache before reading object db When parse_object is called, we do the following: 1. read the object data into a buffer via read_sha1_file 2. call parse_object_buffer, which then: a. calls the appropriate lookup_{commit,tree,blob,tag} to either create a new "struct object", or to find an existing one. We know the appropriate type from the lookup in step 1. b. calls the appropriate parse_{commit,tree,blob,tag} to parse the buffer for the new (or existing) object In step 2b, all of the called functions are no-ops for object "X" if "X->object.parsed" is set. I.e., when we have already parsed an object, we end up going to a lot of work just to find out at a low level that there is nothing left for us to do (and we throw away the data from read_sha1_file unread). We can optimize this by moving the check for "do we have an in-memory object" from 2a before the expensive call to read_sha1_file in step 1. This might seem circular, since step 2a uses the type information determined in step 1 to call the appropriate lookup function. However, we can notice that all of the lookup_* functions are backed by lookup_object. In other words, all of the objects are kept in a master hash table, and we don't actually need the type to do the "do we have it" part of the lookup, only to do the "and create it if it doesn't exist" part. This can save time whenever we call parse_object on the same sha1 twice in a single program. Some code paths already perform this optimization manually, with either: if (!obj->parsed) obj = parse_object(obj->sha1); if you already have a "struct object", or: struct object *obj = lookup_unknown_object(sha1); if (!obj \|\| !obj->parsed) obj = parse_object(sha1); if you don't. This patch moves the optimization into parse_object itself. Most git operations won't notice any impact. Either they don't parse a lot of duplicate sha1s, or the calling code takes special care not to re-parse objects. I timed two code paths that do benefit (there may be more, but these two were immediately obvious and easy to time). The first is fast-export, which calls parse_object on each object it outputs, like this: object = parse_object(sha1); if (!object) die(...); if (object->flags & SHOWN) return; which means that just to realize we have already shown an object, we will read the whole object from disk! With this patch, my best-of-five time for "fast-export --all" on git.git dropped from 26.3s to 21.3s. The second case is upload-pack, which will call parse_object for each advertised ref (because it needs to peel tags to show "^{}" entries). This doesn't matter for most repositories, because they don't have a lot of refs pointing to the same objects. However, if you have a big alternates repository with a shared object db for a number of child repositories, then the alternates repository will have duplicated refs representing each of its children. For example, GitHub's alternates repository for git.git has ~120,000 refs, of which only ~3200 are unique. The time for upload-pack to print its list of advertised refs dropped from 3.4s to 0.76s. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-01-05 13:30:54 -08:00
Junio C Hamano	68be2fea50	receive-pack, fetch-pack: reject bogus pack that records objects twice When receive-pack & fetch-pack are run and store the pack obtained over the wire to a local repository, they internally run the index-pack command with the --strict option. Make sure that we reject incoming packfile that records objects twice to avoid spreading such a damage. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-11-16 22:05:21 -08:00
Junio C Hamano	4bbf5a2615	read_sha1_file(): get rid of read_sha1_file_repl() madness Most callers want to silently get a replacement object, and they do not care what the real name of the replacement object is. Worse yet, no sane interface to return the underlying object without replacement is provided. Remove the function and make only the few callers that want the name of the replacement object find it themselves. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-15 15:23:33 -07:00
Junio C Hamano	4682693e9c	Merge branch 'maint' * maint: tag.c: whitespace breakages fix Fix whitespace issue in object.c t5505: add missing &&	2010-09-06 00:12:04 -07:00
Junio C Hamano	af24059fa2	Merge branch 'xx/trivial' into maint * xx/trivial: tag.c: whitespace breakages fix Fix whitespace issue in object.c t5505: add missing &&	2010-09-06 00:11:59 -07:00
Jared Hance	55b4e9e432	Fix whitespace issue in object.c Change some expanded tabs (spaces) to tabs in object.c. Signed-off-by: Jared Hance <jaredhance@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-09-05 22:12:29 -07:00
Junio C Hamano	f92d62ec4e	Merge branch 'nd/maint-fix-replace' * nd/maint-fix-replace: parse_object: pass on the original sha1, not the replaced one	2010-09-03 22:23:13 -07:00
Nguyễn Thái Ngọc Duy	2e3400c052	parse_object: pass on the original sha1, not the replaced one Commit `0e87c36` (object: call "check_sha1_signature" with the replacement sha1) changed the first argument passed to parse_object_buffer() from "sha1" to "repl". With that change, the returned obj pointer has the replacement SHA1 in obj->sha1, not the original one. But when using lookup_commit() and then parse_commit() on a commit, we get an object pointer with the original sha1, but the commit content comes from the replacement commit. So the result we get from using parse_object() is different from the we get from using lookup_commit() followed by parse_commit(). It looks much simpler and safer to fix this inconsistency by passing "sha1" to parse_object_bufer() instead of "repl". The commit comment should be used to tell the the replacement commit is replacing another commit and why. So it should be easy to see that we have a replacement commit instead of an original one. And it is not a problem if the content of the commit is not consistent with the sha1 as cat-file piped to hash-object can be used to see the difference. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-09-03 22:13:08 -07:00
Jonathan Nieder	97a20eea19	fix "bundle --stdin" segfault When passed an empty list, objects_array_remove_duplicates() corrupts it by changing the number of entries from 0 to 1. The problem lies in the condition of its main loop: for (ref = 0; ref < array->nr - 1; ref++) { The loop body manipulates the supplied object array. In the case of an empty array, it should not be doing anything at all. But array->nr is an unsigned quantity, so the code enters the loop, in particular increasing array->nr. Fix this by comparing (ref + 1 < array->nr) instead. This bug can be triggered by git bundle --stdin: $ echo HEAD \| git bundle create some.bundle --stdin’ Segmentation fault (core dumped) The list of commits to bundle appears to be empty because of another bug: by the time the revision-walking machinery gets to look at it, standard input has already been consumed by rev-list, so this function gets an empty list of revisions. After this patch, git bundle --stdin still does not work; it just doesn’t segfault any more. Reported-by: Joey Hess <joey@kitenet.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-04-19 22:16:35 -07:00
Junio C Hamano	c76189875b	object.c: remove unused functions object_list_append() and object_list_length}() are not used anywhere. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-17 22:49:36 -08:00

1 2 3 4

200 commits