development/git - HydraGit

mirror of https://github.com/git/git synced 2024-09-17 23:41:33 +00:00

Author	SHA1	Message	Date
Junio C Hamano	61061abba7	Merge branch 'jh/object-filtering' In preparation for implementing narrow/partial clone, the object walking machinery has been taught a way to tell it to "filter" some objects from enumeration. * jh/object-filtering: rev-list: support --no-filter argument list-objects-filter-options: support --no-filter list-objects-filter-options: fix 'keword' typo in comment pack-objects: add list-objects filtering rev-list: add list-objects filtering support list-objects: filter objects in traverse_commit_list oidset: add iterator methods to oidset oidmap: add oidmap iterator methods dir: allow exclusions from blob in addition to file	2017-12-27 11:16:21 -08:00
Jeff Hostetler	f4371a883f	rev-list: support --no-filter argument Teach rev-list to support --no-filter to override a previous --filter=<filter_spec> argument. This is to be consistent with commands that use OPT_PARSE macros. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-05 09:44:37 -08:00
Jeff Hostetler	caf3827e2f	rev-list: add list-objects filtering support Teach rev-list to use the filtering provided by the traverse_commit_list_filtered() interface to omit unwanted objects from the result. Object filtering is only allowed when one of the "--objects*" options are used. When the "--filter-print-omitted" option is used, the omitted objects are printed at the end. These are marked with a "~". This option can be combined with "--quiet" to get a list of just the omitted objects. Add t6112 test. In the future, we will introduce a "partial clone" mechanism wherein an object in a repo, obtained from a remote, may reference a missing object that can be dynamically fetched from that remote once needed. This "partial clone" mechanism will have a way, sometimes slow, of determining if a missing link is one of the links expected to be produced by this mechanism. This patch introduces handling of missing objects to help debugging and development of the "partial clone" mechanism, and once the mechanism is implemented, for a power user to perform operations that are missing-object aware without incurring the cost of checking if a missing link is expected. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-22 14:11:57 +09:00
Junio C Hamano	f116163171	Merge branch 'ma/bisect-leakfix' Leak fixes. * ma/bisect-leakfix: bisect: fix memory leak when returning best element bisect: fix off-by-one error in `best_bisection_sorted()` bisect: fix memory leak in `find_bisection()` bisect: change calling-convention of `find_bisection()`	2017-11-15 12:14:28 +09:00
Junio C Hamano	8cc633286a	Merge branch 'bw/diff-opt-impl-to-bitfields' A single-word "unsigned flags" in the diff options is being split into a structure with many bitfields. * bw/diff-opt-impl-to-bitfields: diff: make struct diff_flags members lowercase diff: remove DIFF_OPT_CLR macro diff: remove DIFF_OPT_SET macro diff: remove DIFF_OPT_TST macro diff: remove touched flags diff: add flag to indicate textconv was set via cmdline diff: convert flags to be stored in bitfields add, reset: use DIFF_OPT_SET macro to set a diff flag	2017-11-09 14:31:27 +09:00
Martin Ågren	24d707f636	bisect: change calling-convention of `find_bisection()` This function takes a commit list and returns a commit list. The returned list is built by modifying the original list. Thus the caller should not use the original list again (and after the next commit fixes a memory leak, it must not). Change the function signature so that it takes a **list and has void return type. That should make it harder to misuse this function. While we're here, document this function. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-06 10:15:29 +09:00
Brandon Williams	0d1e0e7801	diff: make struct diff_flags members lowercase Now that the flags stored in struct diff_flags are being accessed directly and not through macros, change all struct members from being uppercase to lowercase. This conversion is done using the following semantic patch: @@ expression E; @@ - E.RECURSIVE + E.recursive @@ expression E; @@ - E.TREE_IN_RECURSIVE + E.tree_in_recursive @@ expression E; @@ - E.BINARY + E.binary @@ expression E; @@ - E.TEXT + E.text @@ expression E; @@ - E.FULL_INDEX + E.full_index @@ expression E; @@ - E.SILENT_ON_REMOVE + E.silent_on_remove @@ expression E; @@ - E.FIND_COPIES_HARDER + E.find_copies_harder @@ expression E; @@ - E.FOLLOW_RENAMES + E.follow_renames @@ expression E; @@ - E.RENAME_EMPTY + E.rename_empty @@ expression E; @@ - E.HAS_CHANGES + E.has_changes @@ expression E; @@ - E.QUICK + E.quick @@ expression E; @@ - E.NO_INDEX + E.no_index @@ expression E; @@ - E.ALLOW_EXTERNAL + E.allow_external @@ expression E; @@ - E.EXIT_WITH_STATUS + E.exit_with_status @@ expression E; @@ - E.REVERSE_DIFF + E.reverse_diff @@ expression E; @@ - E.CHECK_FAILED + E.check_failed @@ expression E; @@ - E.RELATIVE_NAME + E.relative_name @@ expression E; @@ - E.IGNORE_SUBMODULES + E.ignore_submodules @@ expression E; @@ - E.DIRSTAT_CUMULATIVE + E.dirstat_cumulative @@ expression E; @@ - E.DIRSTAT_BY_FILE + E.dirstat_by_file @@ expression E; @@ - E.ALLOW_TEXTCONV + E.allow_textconv @@ expression E; @@ - E.TEXTCONV_SET_VIA_CMDLINE + E.textconv_set_via_cmdline @@ expression E; @@ - E.DIFF_FROM_CONTENTS + E.diff_from_contents @@ expression E; @@ - E.DIRTY_SUBMODULES + E.dirty_submodules @@ expression E; @@ - E.IGNORE_UNTRACKED_IN_SUBMODULES + E.ignore_untracked_in_submodules @@ expression E; @@ - E.IGNORE_DIRTY_SUBMODULES + E.ignore_dirty_submodules @@ expression E; @@ - E.OVERRIDE_SUBMODULE_CONFIG + E.override_submodule_config @@ expression E; @@ - E.DIRSTAT_BY_LINE + E.dirstat_by_line @@ expression E; @@ - E.FUNCCONTEXT + E.funccontext @@ expression E; @@ - E.PICKAXE_IGNORE_CASE + E.pickaxe_ignore_case @@ expression E; @@ - E.DEFAULT_FOLLOW_RENAMES + E.default_follow_renames Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:51:40 +09:00
Brandon Williams	3b69daed86	diff: remove DIFF_OPT_TST macro Remove the `DIFF_OPT_TST` macro and instead access the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_TST(&E, fld) + E.flags.fld @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_TST(ptr, fld) + ptr->flags.fld Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:03 +09:00
brian m. carlson	206649672e	pack-bitmap: convert traverse_bitmap_commit_list to object_id Convert traverse_bitmap_commit_list and the callbacks it takes to use a pointer to struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-16 11:05:51 +09:00
Junio C Hamano	d33a433236	Merge branch 'jc/simplify-progress' The API to start showing progress meter after a short delay has been simplified. * jc/simplify-progress: progress: simplify "delayed" progress API	2017-08-24 10:20:02 -07:00
Junio C Hamano	8aade107dd	progress: simplify "delayed" progress API We used to expose the full power of the delayed progress API to the callers, so that they can specify, not just the message to show and expected total amount of work that is used to compute the percentage of work performed so far, the percent-threshold parameter P and the delay-seconds parameter N. The progress meter starts to show at N seconds into the operation only if we have not yet completed P per-cent of the total work. Most callers used either (0%, 2s) or (50%, 1s) as (P, N), but there are oddballs that chose more random-looking values like 95%. For a smoother workload, (50%, 1s) would allow us to start showing the progress meter earlier than (0%, 2s), while keeping the chance of not showing progress meter for long running operation the same as the latter. For a task that would take 2s or more to complete, it is likely that less than half of it would complete within the first second, if the workload is smooth. But for a spiky workload whose earlier part is easier, such a setting is likely to fail to show the progress meter entirely and (0%, 2s) is more appropriate. But that is merely a theory. Realistically, it is of dubious value to ask each codepath to carefully consider smoothness of their workload and specify their own setting by passing two extra parameters. Let's simplify the API by dropping both parameters and have everybody use (0%, 2s). Oh, by the way, the percent-threshold parameter and the structure member were consistently misspelled, which also is now fixed ;-) Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-19 14:01:34 -07:00
Junio C Hamano	8fbaf0b13b	Merge branch 'jk/rev-list-empty-input' "git log --tag=no-such-tag" showed log starting from HEAD, which has been fixed---it now shows nothing. * jk/rev-list-empty-input: revision: do not fallback to default when rev_input_given is set rev-list: don't show usage when we see empty ref patterns revision: add rev_input_given flag t6018: flesh out empty input/output rev-list tests	2017-08-11 13:27:07 -07:00
Junio C Hamano	3ab01ac3f7	Merge branch 'jk/reflog-walk' Numerous bugs in walking of reflogs via "log -g" and friends have been fixed. * jk/reflog-walk: reflog-walk: apply --since/--until to reflog dates reflog-walk: stop using fake parents rev-list: check reflog_info before showing usage get_revision_1(): replace do-while with an early return log: do not free parents when walking reflog log: clarify comment about reflog cycles revision: disallow reflog walking with revs->limited t1414: document some reflog-walk oddities	2017-08-11 13:27:01 -07:00
Jeff King	0159ba3226	rev-list: don't show usage when we see empty ref patterns If the user gives us no starting point for a traversal, we want to complain with our normal usage message. But if they tried to do so with "--all" or "--glob", but that happened not to match any refs, the usage message isn't helpful. We should just give them the empty output they asked for instead. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-02 15:45:21 -07:00
Jeff King	d75dfb1089	rev-list: pass diffopt->use_colors through to pretty-print When rev-list pretty-prints a commit, it creates a new pretty_print_context and copies items from the rev_info struct. We don't currently copy the "use_color" field, though. Nobody seems to have noticed because the only part of pretty.c that cares is the %C(auto,...) placeholder, and presumably not many people use that with the rev-list plumbing (as opposed to with git-log). It will become more noticeable in a future patch, though, when we start treating all user-format colors as auto-colors (in which case it would become impossible to format colors with rev-list, even with --color=always). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-13 12:42:51 -07:00
Jeff King	7f97de5ee1	rev-list: check reflog_info before showing usage When git-rev-list sees no pending commits, it shows a usage message. This works even when reflog-walking is requested, because the reflog-walk code currently puts the reflog tips into the pending queue. In preparation for refactoring the reflog-walk code, let's explicitly check whether we have any reflogs to walk. For now this is a noop, but the existing reflog tests will make sure that it kicks in after the refactoring. Likewise, we'll add a test that "rev-list -g" without specifying any reflogs continues to fail (so that we know our check does not kick in too aggressively). Note that the implementation needs to go into its own sub-function, as the walk code does not expose its innards outside of reflog-walk.c. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-09 10:00:48 -07:00
Junio C Hamano	f31d23a399	Merge branch 'bw/config-h' Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h	2017-06-24 14:28:41 -07:00
Junio C Hamano	50ad8561de	Merge branch 'jk/consistent-h' "git $cmd -h" for builtin commands calls the implementation of the command (i.e. cmd_$cmd() function) without doing any repository set-up, and the commands that expect RUN_SETUP is done by the Git potty needs to be prepared to show the help text without barfing. * jk/consistent-h: t0012: test "-h" with builtins git: add hidden --list-builtins option version: convert to parse-options diff- and log- family: handle "git cmd -h" early submodule--helper: show usage for "-h" remote-{ext,fd}: print usage message on invalid arguments upload-archive: handle "-h" option early credential: handle invalid arguments earlier	2017-06-19 12:38:45 -07:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	5a88f97cff	diff- and log- family: handle "git cmd -h" early "git $builtin -h" bypasses the usual repository setup and calls the cmd_$builtin() function, expecting it to show the help text. Unfortunately the commands in the log- and the diff- family want to call into the revisions machinery, which by definition needs to have a repository already discovered. Strictly speaking, they may not need a repository only for parsing "-h", but it is a good discipline to future-proof codepath to ensure that setup_revisions() is called after we know that a repository is there. Handle the "git $builtin -h" special case very early in these commands to work around potential issues. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-05 11:39:59 +09:00
Junio C Hamano	6b526ced6f	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...	2017-05-29 12:34:43 +09:00
brian m. carlson	c251c83df2	object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
Johannes Schindelin	cb71f8bdb5	PRItime: introduce a new "printf format" for timestamps Currently, Git's source code treats all timestamps as if they were unsigned longs. Therefore, it is okay to write "%lu" when printing them. There is a substantial problem with that, though: at least on Windows, time_t is larger than unsigned long, and hence we will want to switch away from the ill-specified `unsigned long` data type. So let's introduce the pseudo format "PRItime" (currently simply being defined to "lu") to make it easier to change the data type used for timestamps. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-23 20:19:15 -07:00
brian m. carlson	dc01505f7f	Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ Since we will likely be introducing a new hash function at some point, and that hash function might be longer than 40 hex characters, use the constant GIT_MAX_HEXSZ, which is designed to be suitable for allocations, instead of GIT_SHA1_HEXSZ. This will ease the transition down the line by distinguishing between places where we need to allocate memory suitable for the largest hash from those where we need to handle the current hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-26 22:08:21 -07:00
René Scharfe	2490574d15	use oid_to_hex_r() for converting struct object_id hashes to hex strings Patch generated by Coccinelle and contrib/coccinelle/object_id.cocci. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-30 14:23:40 -08:00
Jacob Keller	98985c6911	rev-list: use hdr_termination instead of a always using a newline When adding support for prefixing output of log and other commands using --line-prefix, commit `660e113ce1` ("graph: add support for --line-prefix on all graph-aware output", 2016-08-31) accidentally broke rev-list --header output. In order to make the output appear with a line-prefix, the flow was changed to always use the graph subsystem for display. Unfortunately the graph flow in rev-list did not use info->hdr_termination as it was assumed that graph output would never need to putput NULs. Since we now always use the graph code in order to handle the case of line-prefix, simply replace putchar('\n') with putchar(info->hdr_termination) which will correct this issue. Add a test for the --header case to make sure we don't break it in the future. Reported-by: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-20 14:44:37 -07:00
Jacob Keller	660e113ce1	graph: add support for --line-prefix on all graph-aware output Add an extension to git-diff and git-log (and any other graph-aware displayable output) such that "--line-prefix=<string>" will print the additional line-prefix on every line of output. To make this work, we have to fix a few bugs in the graph API that force graph_show_commit_msg to be used only when you have a valid graph. Additionally, we extend the default_diff_output_prefix handler to work even when no graph is enabled. This is somewhat of a hack on top of the graph API, but I think it should be acceptable here. This will be used by a future extension of submodule display which displays the submodule diff as the actual diff between the pre and post commit in the submodule project. Add some tests for both git-log and git-diff to ensure that the prefix is honored correctly. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-31 18:07:09 -07:00
Jeff King	434ea3cdad	rev-list: add optional progress reporting It's easy to ask rev-list to do a traversal that may takes many seconds (e.g., by calling "--objects --all"). In theory you can monitor its progress by the output you get to stdout, but this isn't always easy. Some operations, like "--count", don't make any output until the end. And some callers, like check_everything_connected(), are using it just for the error-checking of the traversal, and throw away stdout entirely. This patch adds a "--progress" option which can be used to give some eye-candy for a user waiting for a long traversal. This is just a rev-list option and not a regular traversal option, because it needs cooperation from the callbacks in builtin/rev-list.c to do the actual count. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-20 12:10:44 -07:00
Junio C Hamano	6d8c5454b6	Merge branch 'jk/rev-list-count-with-bitmap' "git rev-list --count" whose walk-length is limited with "-n" option did not work well with the counting optimized to look at the bitmap index. * jk/rev-list-count-with-bitmap: rev-list: disable bitmaps when "-n" is used with listing objects rev-list: "adjust" results of "--count --use-bitmap-index -n"	2016-06-20 11:01:03 -07:00
Jeff King	fb85db84dc	rev-list: disable bitmaps when "-n" is used with listing objects You can ask rev-list to use bitmaps to speed up an --objects traversal, which should generally give you your answers much faster. Likewise, you can ask rev-list to limit such a traversal with `-n`, in which case we'll show only a limited set of commits (and only the tree and commit objects directly reachable from those commits). But if you do both together, the results are nonsensical. We end up limiting any fallback traversal we do to _find_ the bitmaps, but the actual set of objects we output will be picked arbitrarily from the union of any bitmaps we do find, and will involve the objects of many more commits. It's possible that somebody might want this as a "show me what you can, but limit the amount of work you do" flag. But as with the prior commit clamping "--count", the results are basically non-deterministic; you'll get the values from some commits between `n` and the total number, and you can't tell which. And unlike the `--count` case, we can't easily generate the "real" value from the bitmap values (you can't just walk back `-n` commits and subtract out the reachable objects from the boundary commits; the bitmaps for `X` record its total reachability, so you don't know which objects are directly from `X` itself, which from `X^`, and so on). So let's just fallback to the non-bitmap code path in this case, so we always give a sane answer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-03 09:01:02 -07:00
Jeff King	5c9f9bf313	rev-list: "adjust" results of "--count --use-bitmap-index -n" If you ask rev-list for: git rev-list --count --use-bitmap-index HEAD we optimize out the actual traversal and just give you the number of bits set in the commit bitmap. This is faster, which is good. But if you ask to limit the size of the traversal, like: git rev-list --count --use-bitmap-index -n 100 HEAD we'll still output the full bitmapped number we found. On the surface, that might even seem OK. You explicitly asked to use the bitmap index, and it was cheap to compute the real answer, so we gave it to you. But there's something much more complicated going on under the hood. If we don't have a bitmap directly for HEAD, then we have to actually traverse backwards, looking for a bitmapped commit. And _that_ traversal is bounded by our `-n` count. This is a good thing, because it bounds the work we have to do, which is probably what the user wanted by asking for `-n`. But now it makes the output quite confusing. You might get many values: - your `-n` value, if we walked back and never found a bitmap (or fewer if there weren't that many commits) - the actual full count, if we found a bitmap root for every path of our traversal with in the `-n` limit - any number in between! We might have walked back and found _some_ bitmaps, but then cut off the traversal early with some commits not accounted for in the result. So you cannot even see a value higher than your `-n` and say "OK, bitmaps kicked in, this must be the real full count". The only sane thing is for git to just clamp the value to a maximum of the `-n` value, which means we should output the exact same results whether bitmaps are in use or not. The test in t5310 demonstrates this by using `-n 1`. Without this patch we fail in the full-bitmap case (where we do not have to traverse at all) but _not_ in the partial-bitmap case (where we have to walk down to find an actual bitmap). With this patch, both cases just work. I didn't implement the crazy in-between case, just because it's complicated to set up, and is really a subset of the full-count case, which we do cover. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-03 09:00:59 -07:00
Jeff King	2824e1841b	list-objects: pass full pathname to callbacks When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:04 -07:00
Jeff King	dc06dc8800	list-objects: drop name_path entirely In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-03-16 10:41:03 -07:00
Jeff King	de1e67d070	list-objects: pass full pathname to callbacks When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:17 -08:00
Jeff King	bd64516aca	list-objects: drop name_path entirely In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 12:51:15 -08:00
brian m. carlson	ed1c9977cb	Remove get_object_hash. Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	f2fd0760f6	Convert struct object to object_id struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	7999b2cf77	Add several uses of get_object_hash. Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
Jeff King	d59f765ac9	use sha1_to_hex_r() instead of strcpy Before sha1_to_hex_r() existed, a simple way to get hex sha1 into a buffer was with: strcpy(buf, sha1_to_hex(sha1)); This isn't wrong (assuming the buf is 41 characters), but it makes auditing the code base for bad strcpy() calls harder, as these become false positives. Let's convert them to sha1_to_hex_r(), and likewise for some calls to find_unique_abbrev(). While we're here, we'll double-check that all of the buffers are correctly sized, and use the more obvious GIT_SHA1_HEXSZ constant. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-10-05 11:08:05 -07:00
Jeff King	2aea7a51a1	rev-list: make it obvious that we do not support notes The rev-list command does not have the internal infrastructure to display notes. Running: git rev-list --notes HEAD will silently ignore the "--notes" option. Running: git rev-list --notes --grep=. HEAD will crash on an assert. Running: git rev-list --format=%N HEAD will place a literal "%N" in the output (it does not even expand to an empty string). Let's have rev-list tell the user that it cannot fill the user's request, rather than silently producing wrong data. Likewise, let's remove mention of the notes options from the rev-list documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-24 10:33:15 -07:00
Junio C Hamano	3afcec9057	Merge branch 'ls/hint-rev-list-count' into maint * ls/hint-rev-list-count: rev-list: add --count to usage guide	2015-07-27 12:21:47 -07:00
Junio C Hamano	b3a30f6e0c	Merge branch 'ls/hint-rev-list-count' * ls/hint-rev-list-count: rev-list: add --count to usage guide	2015-07-10 14:26:13 -07:00
Jeff King	c8a70d3509	rev-list: disable --use-bitmap-index when pruning commits The reachability bitmaps do not have enough information to tell us which commits might have changed path "foo", so the current code produces wrong answers for: git rev-list --use-bitmap-index --count HEAD -- foo (it silently ignores the "foo" limiter). Instead, we should fall back to doing a normal traversal (it is OK to fall back rather than complain, because --use-bitmap-index is a pure optimization, and might not kick in for other reasons, such as there being no bitmaps in the repository). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-07-01 12:00:50 -07:00
Lawrence Siebert	75d2e5a7b0	rev-list: add --count to usage guide --count should be mentioned in the usage guide, this updates code and documentation. Signed-off-by: Lawrence Siebert <lawrencesiebert@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-07-01 09:29:11 -07:00
Jeff King	8597ea3afe	commit: record buffer length in cache Most callsites which use the commit buffer try to use the cached version attached to the commit, rather than re-reading from disk. Unfortunately, that interface provides only a pointer to the NUL-terminated buffer, with no indication of the original length. For the most part, this doesn't matter. People do not put NULs in their commit messages, and the log code is happy to treat it all as a NUL-terminated string. However, some code paths do care. For example, when checking signatures, we want to be very careful that we verify all the bytes to avoid malicious trickery. This patch just adds an optional "size" out-pointer to get_commit_buffer and friends. The existing callers all pass NULL (there did not seem to be any obvious sites where we could avoid an immediate strlen() call, though perhaps with some further refactoring we could). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 12:09:38 -07:00
Jeff King	a97934d820	use get_cached_commit_buffer where appropriate Some call sites check commit->buffer to see whether we have a cached buffer, and if so, do some work with it. In the long run we may want to switch these code paths to make their decision on a different boolean flag (because checking the cache may get a little more expensive in the future). But for now, we can easily support them by converting the calls to use get_cached_commit_buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 12:08:17 -07:00
Jeff King	0fb370da9c	provide a helper to free commit buffer This converts two lines into one at each caller. But more importantly, it abstracts the concept of freeing the buffer, which will make it easier to change later. Note that we also need to provide a "detach" mechanism for a tricky case in index-pack. We are passed a buffer for the object generated by processing the incoming pack. If we are not using --strict, we just calculate the sha1 on that buffer and return, leaving the caller to free it. But if we are using --strict, we actually attach that buffer to an object, pass the object to the fsck functions, and then detach the buffer from the object again (so that the caller can free it as usual). In this case, we don't want to free the buffer ourselves, but just make sure it is no longer associated with the commit. Note that we are making the assumption here that the attach/detach process does not impact the buffer at all (e.g., it is never reallocated or modified). That holds true now, and we have no plans to change that. However, as we abstract the commit_buffer code, this dependency becomes less obvious. So when we detach, let's also make sure that we get back the same buffer that we gave to the commit_buffer code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 12:07:47 -07:00
Junio C Hamano	0f9e62e084	Merge branch 'jk/pack-bitmap' Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history. * jk/pack-bitmap: (26 commits) ewah: unconditionally ntohll ewah data ewah: support platforms that require aligned reads read-cache: use get_be32 instead of hand-rolled ntoh_l block-sha1: factor out get_be and put_be wrappers do not discard revindex when re-preparing packfiles pack-bitmap: implement optional name_hash cache t/perf: add tests for pack bitmaps t: add basic bitmap functionality tests count-objects: recognize .bitmap in garbage-checking repack: consider bitmaps when performing repacks repack: handle optional files created by pack-objects repack: turn exts array into array-of-struct repack: stop using magic number for ARRAY_SIZE(exts) pack-objects: implement bitmap writing rev-list: add bitmap mode to speed up object lists pack-objects: use bitmaps when packing objects pack-objects: split add_object_entry pack-bitmap: add support for bitmap indexes documentation: add documentation for the bitmap format ewah: compressed bitmap implementation ...	2014-02-27 14:01:48 -08:00
Vicent Marti	aa32939fea	rev-list: add bitmap mode to speed up object lists The bitmap reachability index used to speed up the counting objects phase during `pack-objects` can also be used to optimize a normal rev-list if the only thing required are the SHA1s of the objects during the list (i.e., not the path names at which trees and blobs were found). Calling `git rev-list --objects --use-bitmap-index [committish]` will perform an object iteration based on a bitmap result instead of actually walking the object graph. These are some example timings for `torvalds/linux` (warm cache, best-of-five): $ time git rev-list --objects master > /dev/null real 0m34.191s user 0m33.904s sys 0m0.268s $ time git rev-list --objects --use-bitmap-index master > /dev/null real 0m1.041s user 0m0.976s sys 0m0.064s Likewise, using `git rev-list --count --use-bitmap-index` will speed up the counting operation by building the resulting bitmap and performing a fast popcount (number of bits set on the bitmap) on the result. Here are some sample timings of different ways to count commits in `torvalds/linux`: $ time git rev-list master \| wc -l 399882 real 0m6.524s user 0m6.060s sys 0m3.284s $ time git rev-list --count master 399882 real 0m4.318s user 0m4.236s sys 0m0.076s $ time git rev-list --use-bitmap-index --count master 399882 real 0m0.217s user 0m0.176s sys 0m0.040s This also respects negative refs, so you can use it to count a slice of history: $ time git rev-list --count v3.0..master 144843 real 0m1.971s user 0m1.932s sys 0m0.036s $ time git rev-list --use-bitmap-index --count v3.0..master real 0m0.280s user 0m0.220s sys 0m0.056s Though note that the closer the endpoints, the less it helps. In the traversal case, we have fewer commits to cross, so we take less time. But the bitmap time is dominated by generating the pack revindex, which is constant with respect to the refs given. Note that you cannot yet get a fast --left-right count of a symmetric difference (e.g., "--count --left-right master...topic"). The slow part of that walk actually happens during the merge-base determination when we parse "master...topic". Even though a count does not actually need to know the real merge base (it only needs to take the symmetric difference of the bitmaps), the revision code would require some refactoring to handle this case. Additionally, a `--test-bitmap` flag has been added that will perform the same rev-list manually (i.e. using a normal revwalk) and using bitmaps, and verify that the results are the same. This can be used to exercise the bitmap code, and also to verify that the contents of the .bitmap file are sane. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-30 12:19:22 -08:00
Junio C Hamano	c01499ef69	C: have space around && and \|\| operators Correct all hits from git grep -e '$&&\\|\|\|$[^ ]' -e '[^ ]$&&\\|\|\|$' -- '*.c' i.e. && or \|\| operators that are followed by anything but a SP, or that follow something other than a SP or a HT, so that these operators have a SP around it when necessary. We usually refrain from making this kind of a tree-wide change in order to avoid unnecessary conflicts with other "real work" patches, but in this case, the end result does not have a potentially cumbersome tree-wide impact, while this is a tree-wide cleanup. Fixes to compat/regex/regcomp.c and xdiff/xemit.c are to replace a HT immediately after && with a SP. This is based on Felipe's patch to bultin/symbolic-ref.c; I did all the finding out what other files in the whole tree need to be fixed and did the fix and also the log message while reviewing that single liner, so any screw-ups in this version are mine. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-16 10:26:39 -07:00

1 2

76 commits