development/git - HydraGit

mirror of https://github.com/git/git synced 2024-11-05 18:59:29 +00:00

Author	SHA1	Message	Date
Junio C Hamano	c0d75f0e2e	Merge branch 'sb/diff-blobfind-pickaxe' "diff" family of commands learned "--find-object=<object-id>" option to limit the findings to changes that involve the named object. * sb/diff-blobfind-pickaxe: diff: use HAS_MULTI_BITS instead of counting bits manually diff: properly error out when combining multiple pickaxe options diffcore: add a pickaxe option to find a specific blob diff: introduce DIFF_PICKAXE_KINDS_MASK diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit diff.h: make pickaxe_opts an unsigned bit field	2018-01-23 13:16:37 -08:00
Junio C Hamano	bc3dca07f4	Merge branch 'nd/ita-wt-renames-in-status' "git status" after moving a path in the working tree (hence making it appear "removed") and then adding with the -N option (hence making that appear "added") detected it as a rename, but did not report the old and new pathnames correctly. * nd/ita-wt-renames-in-status: wt-status.c: handle worktree renames wt-status.c: rename rename-related fields in wt_status_change_data wt-status.c: catch unhandled diff status codes wt-status.c: coding style fix Use DIFF_DETECT_RENAME for detect_rename assignments t2203: test status output with porcelain v2 format	2018-01-23 13:16:28 -08:00
Nguyễn Thái Ngọc Duy	4e056c989f	diff.c: flush stdout before printing rename warnings The diff output is buffered in a FILE object and could still be partially buffered when we print these warnings (directly to fd 2). The output is messed up like this worktree.c \| 138 +- worktree.h warning: inexact rename detection was skipped due to too many files. \| 12 +- wrapper.c \| 83 +- It gets worse if the warning is printed after color codes for the graph part are already printed. You'll get a warning in green or red. Flush stdout first, so we can get something like this instead: xdiff/xutils.c \| 42 +- xdiff/xutils.h \| 4 +- 1033 files changed, 150824 insertions(+), 69395 deletions(-) warning: inexact rename detection was skipped due to too many files. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-16 14:34:20 -08:00
Torsten Bögershausen	8462ff43e4	convert_to_git(): safe_crlf/checksafe becomes int conv_flags When calling convert_to_git(), the checksafe parameter defined what should happen if the EOL conversion (CRLF --> LF --> CRLF) does not roundtrip cleanly. In addition, it also defined if line endings should be renormalized (CRLF --> LF) or kept as they are. checksafe was an safe_crlf enum with these values: SAFE_CRLF_FALSE: do nothing in case of EOL roundtrip errors SAFE_CRLF_FAIL: die in case of EOL roundtrip errors SAFE_CRLF_WARN: print a warning in case of EOL roundtrip errors SAFE_CRLF_RENORMALIZE: change CRLF to LF SAFE_CRLF_KEEP_CRLF: keep all line endings as they are In some cases the integer value 0 was passed as checksafe parameter instead of the correct enum value SAFE_CRLF_FALSE. That was no problem because SAFE_CRLF_FALSE is defined as 0. FALSE/FAIL/WARN are different from RENORMALIZE and KEEP_CRLF. Therefore, an enum is not ideal. Let's use a integer bit pattern instead and rename the parameter to conv_flags to make it more generically usable. This allows us to extend the bit pattern in a subsequent commit. Reported-By: Randall S. Becker <rsbecker@nexbridge.com> Helped-By: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-16 12:35:56 -08:00
Stefan Beller	4d8c51aa19	diff: use HAS_MULTI_BITS instead of counting bits manually This aligns the style to the previous patch. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Stefan Beller	5e505257f2	diff: properly error out when combining multiple pickaxe options In `f506b8e8b5` (git log/diff: add -G<regexp> that greps in the patch text, 2010-08-23) we were hesitant to check if the user requests both -S and -G at the same time. Now that the pickaxe family also offers --find-object, which looks slightly more different than the former two, let's add a check that those are not used at the same time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Stefan Beller	15af58c1ad	diffcore: add a pickaxe option to find a specific blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) One might be tempted to extend git-describe to also work with blobs, such that `git describe <blob-id>` gives a description as '<commit-ish>:<path>'. This was implemented at [2]; as seen by the sheer number of responses (>110), it turns out this is tricky to get right. The hard part to get right is picking the correct 'commit-ish' as that could be the commit that (re-)introduced the blob or the blob that removed the blob; the blob could exist in different branches. Junio hinted at a different approach of solving this problem, which this patch implements. Teach the diff machinery another flag for restricting the information to what is shown. For example: $ ./git log --oneline --find-object=v2.0.0:Makefile `b2feb64309` Revert the whole "ask curl-config" topic for now `47fbfded53` i18n: only extract comments marked with "TRANSLATORS:" we observe that the Makefile as shipped with 2.0 was appeared in v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b. The reason why these commits both occur prior to v2.0.0 are evil merges that are not found using this new mechanism. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob [2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Stefan Beller	cf63051ada	diff: introduce DIFF_PICKAXE_KINDS_MASK Currently the check whether to perform pickaxing is done via checking `diffopt->pickaxe`, which contains the command line argument that we want to pickaxe for. Soon we'll introduce a new type of pickaxing, that will not store anything in the `.pickaxe` field, so let's migrate the check to be dependent on pickaxe_opts. It is not enough to just replace the check for pickaxe by pickaxe_opts, because flags might be set, but pickaxing was not requested ('-i'). To cope with that, introduce a mask to check only for the bits indicating the modes of operation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-04 15:02:40 -08:00
Junio C Hamano	f427b94985	Merge branch 'cc/skip-to-optional-val' Introduce a helper to simplify code to parse a common pattern that expects either "--key" or "--key=<something>". * cc/skip-to-optional-val: t4045: reindent to make helpers readable diff: add tests for --relative without optional prefix value diff: use skip_to_optional_arg_default() in parsing --relative diff: use skip_to_optional_arg_default() diff: use skip_to_optional_arg() index-pack: use skip_to_optional_arg() git-compat-util: introduce skip_to_optional_arg()	2017-12-28 14:08:46 -08:00
Nguyễn Thái Ngọc Duy	06dba2b023	Use DIFF_DETECT_RENAME for detect_rename assignments This field can have two values (2 for copy). Use this name instead for clarity. Many places have already used this constant. Note, the detect_rename assignments in merge-recursive.c remain unchanged because it's actually a boolean there. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-27 12:38:35 -08:00
Junio C Hamano	8d7fefaac4	Merge branch 'ar/unconfuse-three-dots' Ancient part of codebase still shows dots after an abbreviated object name just to show that it is not a full object name, but these ellipses are confusing to people who newly discovered Git who are used to seeing abbreviated object names and find them confusing with the range syntax. * ar/unconfuse-three-dots: t2020: test variations that matter t4013: test new output from diff --abbrev --raw diff: diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value t4013: prepare for upcoming "diff --raw --abbrev" output format change checkout: describe_detached_head: remove ellipsis after committish print_sha1_ellipsis: introduce helper Documentation: user-manual: limit usage of ellipsis Documentation: revisions: fix typo: "three dot" ---> "three-dot" (in line with "two-dot").	2017-12-19 11:33:58 -08:00
Junio C Hamano	d7c6c2369a	Merge branch 'jt/diff-anchored-patience' "git diff" learned a variant of the "--patience" algorithm, to which the user can specify which 'unique' line to be used as anchoring points. * jt/diff-anchored-patience: diff: support anchoring line(s)	2017-12-19 11:33:56 -08:00
Junio C Hamano	646685460c	Merge branch 'en/rename-progress' Historically, the diff machinery for rename detection had a hardcoded limit of 32k paths; this is being lifted to allow users trade cycles with a (possibly) easier to read result. * en/rename-progress: diffcore-rename: make diff-tree -l0 mean -l<large> sequencer: show rename progress during cherry picks diff: remove silent clamp of renameLimit progress: fix progress meters when dealing with lots of work sequencer: warn when internal merge may be suboptimal due to renameLimit	2017-12-19 11:33:55 -08:00
Junio C Hamano	1efad51197	diff: use skip_to_optional_arg_default() in parsing --relative Helped-by: Jacob Keller <jacob.keller@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-11 16:10:12 -08:00
Christian Couder	cf81f94da4	diff: use skip_to_optional_arg_default() Let's simplify diff option parsing using skip_to_optional_arg_default(). Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-11 16:10:12 -08:00
Christian Couder	948cbe6703	diff: use skip_to_optional_arg() Let's simplify diff option parsing using skip_to_optional_arg(). Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-11 16:10:12 -08:00
Ann T Ropea	7cb6ac1e4b	diff: diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value Neither Git nor the user are in need of this (visual) aid anymore, but we must offer a transition period. A follow-up patch (series) will rectify the situation by covering the new output format as well as the backward compatible one. Also, fix a typo: "abbbreviated" ---> "abbreviated". Signed-off-by: Ann T Ropea <bedhanger@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-06 07:32:59 -08:00
Jonathan Tan	2477ab2ea8	diff: support anchoring line(s) Teach diff a new algorithm, one that attempts to prevent user-specified lines from appearing as a deletion or addition in the end result. The end user can use this by specifying "--anchored=<text>" one or more times when using Git commands like "diff" and "show". Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-28 10:40:04 +09:00
Junio C Hamano	10f65c239a	Merge branch 'jc/ignore-cr-at-eol' The "diff" family of commands learned to ignore differences in carriage return at the end of line. * jc/ignore-cr-at-eol: diff: --ignore-cr-at-eol xdiff: reassign xpparm_t.flags bits	2017-11-27 11:06:31 +09:00
Elijah Newren	9f7e4bfa3b	diff: remove silent clamp of renameLimit In commit `0024a5492` (Fix the rename detection limit checking; 2007-09-14), the renameLimit was clamped to 32767. This appears to have been to simply avoid integer overflow in the following computation: num_create * num_src <= rename_limit * rename_limit although it also could be viewed as a hardcoded bound on the amount of CPU time we're willing to allow users to tell git to spend on handling renames. An upper bound may make sense, but unfortunately this upper bound was neither communicated to the users, nor documented anywhere. Although large limits can make things slow, we have users who would be ecstatic to have a small five file change be correctly cherry picked even if they have to manually specify a large limit and wait ten minutes for the renames to be detected. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-15 13:11:25 +09:00
Junio C Hamano	8cc633286a	Merge branch 'bw/diff-opt-impl-to-bitfields' A single-word "unsigned flags" in the diff options is being split into a structure with many bitfields. * bw/diff-opt-impl-to-bitfields: diff: make struct diff_flags members lowercase diff: remove DIFF_OPT_CLR macro diff: remove DIFF_OPT_SET macro diff: remove DIFF_OPT_TST macro diff: remove touched flags diff: add flag to indicate textconv was set via cmdline diff: convert flags to be stored in bitfields add, reset: use DIFF_OPT_SET macro to set a diff flag	2017-11-09 14:31:27 +09:00
Junio C Hamano	4e9762ed47	Merge branch 'ao/diff-populate-filespec-lstat-errorpath-fix' After an error from lstat(), diff_populate_filespec() function sometimes still went ahead and used invalid data in struct stat, which has been fixed. * ao/diff-populate-filespec-lstat-errorpath-fix: diff: fix lstat() error handling in diff_populate_filespec()	2017-11-09 14:31:26 +09:00
Junio C Hamano	e9282f02b2	diff: --ignore-cr-at-eol A new option --ignore-cr-at-eol tells the diff machinery to treat a carriage-return at the end of a (complete) line as if it does not exist. Just like other "--ignore-*" options to ignore various kinds of whitespace differences, this will help reviewing the real changes you made without getting distracted by spurious CRLF<->LF conversion made by your editor program. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> [jch: squashed in command line completion by Dscho] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-08 10:05:27 +09:00
Brandon Williams	0d1e0e7801	diff: make struct diff_flags members lowercase Now that the flags stored in struct diff_flags are being accessed directly and not through macros, change all struct members from being uppercase to lowercase. This conversion is done using the following semantic patch: @@ expression E; @@ - E.RECURSIVE + E.recursive @@ expression E; @@ - E.TREE_IN_RECURSIVE + E.tree_in_recursive @@ expression E; @@ - E.BINARY + E.binary @@ expression E; @@ - E.TEXT + E.text @@ expression E; @@ - E.FULL_INDEX + E.full_index @@ expression E; @@ - E.SILENT_ON_REMOVE + E.silent_on_remove @@ expression E; @@ - E.FIND_COPIES_HARDER + E.find_copies_harder @@ expression E; @@ - E.FOLLOW_RENAMES + E.follow_renames @@ expression E; @@ - E.RENAME_EMPTY + E.rename_empty @@ expression E; @@ - E.HAS_CHANGES + E.has_changes @@ expression E; @@ - E.QUICK + E.quick @@ expression E; @@ - E.NO_INDEX + E.no_index @@ expression E; @@ - E.ALLOW_EXTERNAL + E.allow_external @@ expression E; @@ - E.EXIT_WITH_STATUS + E.exit_with_status @@ expression E; @@ - E.REVERSE_DIFF + E.reverse_diff @@ expression E; @@ - E.CHECK_FAILED + E.check_failed @@ expression E; @@ - E.RELATIVE_NAME + E.relative_name @@ expression E; @@ - E.IGNORE_SUBMODULES + E.ignore_submodules @@ expression E; @@ - E.DIRSTAT_CUMULATIVE + E.dirstat_cumulative @@ expression E; @@ - E.DIRSTAT_BY_FILE + E.dirstat_by_file @@ expression E; @@ - E.ALLOW_TEXTCONV + E.allow_textconv @@ expression E; @@ - E.TEXTCONV_SET_VIA_CMDLINE + E.textconv_set_via_cmdline @@ expression E; @@ - E.DIFF_FROM_CONTENTS + E.diff_from_contents @@ expression E; @@ - E.DIRTY_SUBMODULES + E.dirty_submodules @@ expression E; @@ - E.IGNORE_UNTRACKED_IN_SUBMODULES + E.ignore_untracked_in_submodules @@ expression E; @@ - E.IGNORE_DIRTY_SUBMODULES + E.ignore_dirty_submodules @@ expression E; @@ - E.OVERRIDE_SUBMODULE_CONFIG + E.override_submodule_config @@ expression E; @@ - E.DIRSTAT_BY_LINE + E.dirstat_by_line @@ expression E; @@ - E.FUNCCONTEXT + E.funccontext @@ expression E; @@ - E.PICKAXE_IGNORE_CASE + E.pickaxe_ignore_case @@ expression E; @@ - E.DEFAULT_FOLLOW_RENAMES + E.default_follow_renames Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:51:40 +09:00
Brandon Williams	b2100e5291	diff: remove DIFF_OPT_CLR macro Remove the `DIFF_OPT_CLR` macro and instead set the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_CLR(&E, fld) + E.flags.fld = 0 @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_CLR(ptr, fld) + ptr->flags.fld = 0 Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:51:30 +09:00
Brandon Williams	23dcf77f48	diff: remove DIFF_OPT_SET macro Remove the `DIFF_OPT_SET` macro and instead set the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_SET(&E, fld) + E.flags.fld = 1 @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_SET(ptr, fld) + ptr->flags.fld = 1 Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:03 +09:00
Brandon Williams	3b69daed86	diff: remove DIFF_OPT_TST macro Remove the `DIFF_OPT_TST` macro and instead access the flags directly. This conversion is done using the following semantic patch: @@ expression E; identifier fld; @@ - DIFF_OPT_TST(&E, fld) + E.flags.fld @@ type T; T *ptr; identifier fld; @@ - DIFF_OPT_TST(ptr, fld) + ptr->flags.fld Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:03 +09:00
Brandon Williams	afa73c5384	diff: add flag to indicate textconv was set via cmdline git-show is unique in that it wants to use textconv by default except for when it is showing blobs. When asked to show a blob, show doesn't want to use textconv unless the user explicitly requested that it be used by providing the command line flag '--textconv'. Currently this is done by using a parallel set of 'touched' flags which get set every time a particular flag is set or cleared. In a future patch we want to eliminate this parallel set of flags so instead of relying on if the textconv flag has been touched, add a new flag 'TEXTCONV_SET_VIA_CMDLINE' which is only set if textconv is set to true via the command line. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:02 +09:00
Brandon Williams	02f2f56bc3	diff: convert flags to be stored in bitfields We cannot add many more flags to the diff machinery due to the limitations of the number of flags that can be stored in a single unsigned int. In order to allow for more flags to be added to the diff machinery in the future this patch converts the flags to be stored in bitfields in 'struct diff_flags'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 11:50:02 +09:00
Andrey Okoshkin	10e0ca843d	diff: fix lstat() error handling in diff_populate_filespec() Add lstat() error handling not only for ENOENT case. Otherwise uninitialised 'struct stat st' variable is used later in case of lstat() non-ENOENT failure which leads to processing of rubbish values of file mode ('S_ISLNK(st.st_mode)' check) or size ('xsize_t(st.st_size)'). Signed-off-by: Andrey Okoshkin <a.okoshkin@samsung.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-29 10:16:36 +09:00
Junio C Hamano	446d12cb3f	xdiff: reassign xpparm_t.flags bits We have packed the bits too tightly in such a way that it is not easy to add a new type of whitespace ignoring option, a new type of LCS algorithm, or a new type of post-cleanup heuristics. Reorder bits a bit to give room for these three classes of options to grow. Also make use of XDF_WHITESPACE_FLAGS macro where we check any of these bits are on, instead of using DIFF_XDL_TST() macro on individual possibilities. That way, the "is any of the bits on?" code does not have to change when we add more ways to ignore whitespaces. While at it, add a comment in front of the bit definitions to clarify in which structure these defined bits may appear. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-27 15:57:30 +09:00
Stefan Beller	01be97c2b2	diff.c: get rid of duplicate implementation The implementations in diff.c to detect moved lines needs to compare strings and hash strings, which is implemented in that file, as well as in the xdiff library. Remove the rather recent implementation in diff.c and rely on the well exercised code in the xdiff lib. With this change the hash used for bucketing the strings for the moved line detection changes from FNV32 (that is provided via the hashmaps memhash) to DJB2 (which is used internally in xdiff). Benchmarks found on the web[1] do not indicate that these hashes are different in performance for readable strings. [1] https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-26 11:23:32 +09:00
Jeff King	b66b507292	diff: handle NULs in get_string_hash() For computing moved lines, we feed the characters of each line into a hash. When we've been asked to ignore whitespace, then we pick each character using next_byte(), which returns -1 on end-of-string, which it determines using the start/end pointers we feed it. However our check of its return value treats "0" the same as "-1", meaning we'd quit if the string has an embedded NUL. This is unlikely to ever come up in practice since our line boundaries generally come from calling strlen() in the first place. But it was a bit surprising to me as a reader of the next_byte() code. And it's possible that we may one day feed this function with more exotic input, which otherwise works with arbitrary ptr/len pairs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-21 21:12:53 +09:00
Jeff King	da58318e76	diff: fix whitespace-skipping with --color-moved The code for handling whitespace with --color-moved represents partial strings as a pair of pointers. There are two possible conventions for the end pointer: 1. It points to the byte right after the end of the string. 2. It points to the final byte of the string. But we seem to use both conventions in the code: a. we assign the initial pointers from the NUL-terminated string using (1) b. we eat trailing whitespace by checking the second pointer for isspace(), which needs (2) c. the next_byte() function checks for end-of-string with "if (cp > endp)", which is (2) d. in next_byte() we skip past internal whitespace with "while (cp < end)", which is (1) This creates fewer bugs than you might think, because there are some subtle interactions. Because of (a) and (c), we always return the NUL-terminator from next_byte(). But all of the callers of next_byte() happen to handle that gracefully. Because of the mismatch between (d) and (c), next_byte() could accidentally return a whitespace character right at endp. But because of the interaction of (a) and (b), we fail to actually chomp trailing whitespace, meaning our endp _always_ points to a NUL, canceling out the problem. But that does leave (b) as a real bug: when ignoring whitespace only at the end-of-line, we don't correctly trim it, and fail to match up lines. We can fix the whole thing by moving consistently to one convention. Since convention (1) is idiomatic in our code base, we'll pick that one. The existing "-w" and "-b" tests continue to pass, and a new "--ignore-space-at-eol" shows off the breakage we're fixing. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-21 21:12:35 +09:00
Junio C Hamano	1c0b983a77	Merge branch 'jk/ref-filter-colors-fix' This is the "theoretically more correct" approach of simply stepping back to the state before plumbing commands started paying attention to "color.ui" configuration variable. Let's run with this one. * jk/ref-filter-colors-fix: tag: respect color.ui config Revert "color: check color.ui in git_default_config()" Revert "t6006: drop "always" color config tests" Revert "color: make "always" the same as "auto" in config"	2017-10-18 10:19:08 +09:00
Jeff King	33c643bb08	Revert "color: check color.ui in git_default_config()" This reverts commit `136c8c8b8f`. That commit was trying to address a bug caused by `4c7f1819b3` (make color.ui default to 'auto', 2013-06-10), in which plumbing like diff-tree defaulted to "auto" color, but did not respect a "color.ui" directive to disable it. But it also meant that we started respecting "color.ui" set to "always". This was a known problem, but `4c7f1819b3` argued that nobody ought to be doing that. However, that turned out to be wrong, and we got a number of bug reports related to "add -p" regressing in v2.14.2. Let's revert `136c8c8b8`, fixing the regression to "add -p". This leaves the problem from `4c7f1819b3` unfixed, but: 1. It's a pretty obscure problem in the first place. I only noticed it while working on the color code, and we haven't got a single bug report or complaint about it. 2. We can make a more moderate fix on top by respecting "never" but not "always" for plumbing commands. This is just the minimal fix to go back to the working state we had before v2.14.2. Note that this isn't a pure revert. We now have a test in t3701 which shows off the "add -p" regression. This can be flipped to success. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-17 15:09:52 +09:00
Junio C Hamano	91ccfb8517	Merge branch 'sb/diff-color-move' A recently added "--color-moved" feature of "diff" fell into infinite loop when ignoring whitespace changes, which has been fixed. * sb/diff-color-move: diff: fix infinite loop with --color-moved --ignore-space-change	2017-10-17 13:29:19 +09:00
Jeff King	fa5ba2c1dd	diff: fix infinite loop with --color-moved --ignore-space-change The --color-moved code uses next_byte() to advance through the blob contents. When the user has asked to ignore whitespace changes, we try to collapse any whitespace change down to a single space. However, we enter the conditional block whenever we see the IGNORE_WHITESPACE_CHANGE flag, even if the next byte isn't whitespace. This means that the combination of "--color-moved and --ignore-space-change" was completely broken. Worse, because we return from next_byte() without having advanced our pointer, the function makes no forward progress in the buffer and loops infinitely. Fix this by entering the conditional only when we actually see whitespace. We can apply this also to the IGNORE_WHITESPACE change. That code path isn't buggy (because it falls through to returning the next non-whitespace byte), but it makes the logic more clear if we only bother to look at whitespace flags after seeing that the next byte is whitespace. Reported-by: Orgad Shaneh <orgads@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-16 11:57:45 +09:00
Junio C Hamano	98c57ea6f0	Merge branch 'sb/diff-color-move' The output from "git diff --summary" was broken in a recent topic that has been merged to 'master' and lost a LF after reporting of mode change. This has been fixed. * sb/diff-color-move: diff: correct newline in summary for renamed files	2017-10-03 15:42:49 +09:00
Junio C Hamano	14a8168e2f	Merge branch 'rj/no-sign-compare' Many codepaths have been updated to squelch -Wsign-compare warnings. * rj/no-sign-compare: ALLOC_GROW: avoid -Wsign-compare warnings cache.h: hex2chr() - avoid -Wsign-compare warnings commit-slab.h: avoid -Wsign-compare warnings git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings	2017-09-29 11:23:42 +09:00
Stefan Beller	58aaced444	diff: correct newline in summary for renamed files In `146fdb0dfe` (diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY, 2017-06-29), the conversion from direct printing to the symbol emission dropped the new line character for renamed, copied and rewritten files. Add the emission of a newline, add a test for this case. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-28 13:15:59 +09:00
Junio C Hamano	c50424a6f0	Merge branch 'jk/write-in-full-fix' Many codepaths did not diagnose write failures correctly when disks go full, due to their misuse of write_in_full() helper function, which have been corrected. * jk/write-in-full-fix: read_pack_header: handle signed/unsigned comparison in read result config: flip return value of store_write_*() notes-merge: use ssize_t for write_in_full() return value pkt-line: check write_in_full() errors against "< 0" convert less-trivial versions of "write_in_full() != len" avoid "write_in_full(fd, buf, len) != len" pattern get-tar-commit-id: check write_in_full() return against 0 config: avoid "write_in_full(fd, buf, len) < len" pattern	2017-09-25 15:24:06 +09:00
Ramsay Jones	071bcaab64	ALLOC_GROW: avoid -Wsign-compare warnings Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-22 13:21:11 +09:00
Junio C Hamano	d811ba1897	Merge branch 'rs/strbuf-leakfix' Many leaks of strbuf have been fixed. * rs/strbuf-leakfix: (34 commits) wt-status: release strbuf after use in wt_longstatus_print_tracking() wt-status: release strbuf after use in read_rebase_todolist() vcs-svn: release strbuf after use in end_revision() utf8: release strbuf on error return in strbuf_utf8_replace() userdiff: release strbuf after use in userdiff_get_textconv() transport-helper: release strbuf after use in process_connect_service() sequencer: release strbuf after use in save_head() shortlog: release strbuf after use in insert_one_record() sha1_file: release strbuf on error return in index_path() send-pack: release strbuf on error return in send_pack() remote: release strbuf after use in set_url() remote: release strbuf after use in migrate_file() remote: release strbuf after use in read_remote_branches() refs: release strbuf on error return in write_pseudoref() notes: release strbuf after use in notes_copy_from_stdin() merge: release strbuf after use in write_merge_heads() merge: release strbuf after use in save_state() mailinfo: release strbuf on error return in handle_boundary() mailinfo: release strbuf after use in handle_from() help: release strbuf on error return in exec_woman_emacs() ...	2017-09-19 10:47:57 +09:00
Jeff King	06f46f237a	avoid "write_in_full(fd, buf, len) != len" pattern The return value of write_in_full() is either "-1", or the requested number of bytes[1]. If we make a partial write before seeing an error, we still return -1, not a partial value. This goes back to `f6aa66cb95` (write_in_full: really write in full or return error on disk full., 2007-01-11). So checking anything except "was the return value negative" is pointless. And there are a couple of reasons not to do so: 1. It can do a funny signed/unsigned comparison. If your "len" is signed (e.g., a size_t) then the compiler will promote the "-1" to its unsigned variant. This works out for "!= len" (unless you really were trying to write the maximum size_t bytes), but is a bug if you check "< len" (an example of which was fixed recently in config.c). We should avoid promoting the mental model that you need to check the length at all, so that new sites are not tempted to copy us. 2. Checking for a negative value is shorter to type, especially when the length is an expression. 3. Linus says so. In `d34cf19b89` (Clean up write_in_full() users, 2007-01-11), right after the write_in_full() semantics were changed, he wrote: I really wish every "write_in_full()" user would just check against "<0" now, but this fixes the nasty and stupid ones. Appeals to authority aside, this makes it clear that writing it this way does not have an intentional benefit. It's a historical curiosity that we never bothered to clean up (and which was undoubtedly cargo-culted into new sites). So let's convert these obviously-correct cases (this includes write_str_in_full(), which is just a wrapper for write_in_full()). [1] A careful reader may notice there is one way that write_in_full() can return a different value. If we ask write() to write N bytes and get a return value that is _larger_ than N, we could return a larger total. But besides the fact that this would imply a totally broken version of write(), it would already invoke undefined behavior. Our internal remaining counter is an unsigned size_t, which means that subtracting too many byte will wrap it around to a very large number. So we'll instantly begin reading off the end of the buffer, trying to write gigabytes (or petabytes) of data. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-14 15:17:59 +09:00
Rene Scharfe	5a612017eb	diff: release strbuf after use in show_stats() Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-07 08:49:27 +09:00
Rene Scharfe	348eda249e	diff: release strbuf after use in show_rename_copy() Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-07 08:49:27 +09:00
Rene Scharfe	fa842d843d	diff: release strbuf after use in diff_summary() Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-07 08:49:27 +09:00
Jeff King	076aa2cbda	tempfile: auto-allocate tempfiles on heap The previous commit taught the tempfile code to give up ownership over tempfiles that have been renamed or deleted. That makes it possible to use a stack variable like this: struct tempfile t; create_tempfile(&t, ...); ... if (!err) rename_tempfile(&t, ...); else delete_tempfile(&t); But doing it this way has a high potential for creating memory errors. The tempfile we pass to create_tempfile() ends up on a global linked list, and it's not safe for it to go out of scope until we've called one of those two deactivation functions. Imagine that we add an early return from the function that forgets to call delete_tempfile(). With a static or heap tempfile variable, the worst case is that the tempfile hangs around until the program exits (and some functions like setup_shallow_temporary rely on this intentionally, creating a tempfile and then leaving it for later cleanup). But with a stack variable as above, this is a serious memory error: the variable goes out of scope and may be filled with garbage by the time the tempfile code looks at it. Let's see if we can make it harder to get this wrong. Since many callers need to allocate arbitrary numbers of tempfiles, we can't rely on static storage as a general solution. So we need to turn to the heap. We could just ask all callers to pass us a heap variable, but that puts the burden on them to call free() at the right time. Instead, let's have the tempfile code handle the heap allocation _and_ the deallocation (when the tempfile is deactivated and removed from the list). This changes the return value of all of the creation functions. For the cleanup functions (delete and rename), we'll add one extra bit of safety: instead of taking a tempfile pointer, we'll take a pointer-to-pointer and set it to NULL after freeing the object. This makes it safe to double-call functions like delete_tempfile(), as the second call treats the NULL input as a noop. Several callsites follow this pattern. The resulting patch does have a fair bit of noise, as each caller needs to be converted to handle: 1. Storing a pointer instead of the struct itself. 2. Passing the pointer instead of taking the struct address. 3. Handling a "struct tempfile " return instead of a file descriptor. We could play games to make this less noisy. For example, by defining the tempfile like this: struct tempfile { struct heap_allocated_part_of_tempfile { int fd; ...etc } actual_data; } Callers would continue to have a "struct tempfile", and it would be "active" only when the inner pointer was non-NULL. But that just makes things more awkward in the long run. There aren't that many callers, so we can simply bite the bullet and adjust all of them. And the compiler makes it easy for us to find them all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-06 17:19:54 +09:00
Jeff King	49bd0fc222	tempfile: do not delete tempfile on failed close When close_tempfile() fails, we delete the tempfile and reset the fields of the tempfile struct. This makes it easier for callers to return without cleaning up, but it also makes this common pattern: if (close_tempfile(tempfile)) return error_errno("error closing %s", tempfile->filename.buf); wrong, because the "filename" field has been reset after the failed close. And it's not easy to fix, as in many cases we don't have another copy of the filename (e.g., if it was created via one of the mks_tempfile functions, and we just have the original template string). Let's drop the feature that a failed close automatically deletes the file. This puts the burden on the caller to do the deletion themselves, but this isn't that big a deal. Callers which do: if (write(...) \|\| close_tempfile(...)) { delete_tempfile(...); return -1; } already had to call delete when the write() failed, and so aren't affected. Likewise, any caller which just calls die() in the error path is OK; we'll delete the tempfile during the atexit handler. Because this patch changes the semantics of close_tempfile() without changing its signature, all callers need to be manually checked and converted to the new scheme. This patch covers all in-tree callers, but there may be others for not-yet-merged topics. To catch these, we rename the function to close_tempfile_gently(), which will attract compile-time attention to new callers. (Technically the original could be considered "gentle" already in that it didn't die() on errors, but this one is even more so). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-06 17:19:53 +09:00
Jeff King	45c6b1ed24	always check return value of close_tempfile If close_tempfile() encounters an error, then it deletes the tempfile and resets the "struct tempfile". But many code paths ignore the return value and continue to use the tempfile. Instead, we should generally treat this the same as a write() error. Note that in the postimage of some of these cases our error message will be bogus after a failed close because we look at tempfile->filename (either directly or via get_tempfile_path). But after the failed close resets the tempfile object, this is guaranteed to be the empty string. That will be addressed in a future patch (because there are many more cases of the same problem than just these instances). Note also in the hunk in gpg-interface.c that it's fine to call delete_tempfile() in the error path, even if close_tempfile() failed and already deleted the file. The tempfile code is smart enough to know the second deletion is a noop. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-06 17:19:53 +09:00
Junio C Hamano	eabdcd4ab4	Merge branch 'jt/packmigrate' Code movement to make it easier to hack later. * jt/packmigrate: (23 commits) pack: move for_each_packed_object() pack: move has_pack_index() pack: move has_sha1_pack() pack: move find_pack_entry() and make it global pack: move find_sha1_pack() pack: move find_pack_entry_one(), is_pack_valid() pack: move check_pack_index_ptr(), nth_packed_object_offset() pack: move nth_packed_object_{sha1,oid} pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() pack: move unpack_object_header() pack: move get_size_from_delta() pack: move unpack_object_header_buffer() pack: move {,re}prepare_packed_git and approximate_object_count pack: move install_packed_git() pack: move add_packed_git() pack: move unuse_pack() pack: move use_pack() pack: move pack-closing functions pack: move release_pack_memory() pack: move open_pack_index(), parse_pack_index() ...	2017-08-26 22:55:09 -07:00
Junio C Hamano	614ea03a71	Merge branch 'bw/submodule-config-cleanup' Code clean-up to avoid mixing values read from the .gitmodules file and values read from the .git/config file. * bw/submodule-config-cleanup: submodule: remove gitmodules_config unpack-trees: improve loading of .gitmodules submodule-config: lazy-load a repository's .gitmodules file submodule-config: move submodule-config functions to submodule-config.c submodule-config: remove support for overlaying repository config diff: stop allowing diff to have submodules configured in .git/config submodule: remove submodule_config callback routine unpack-trees: don't respect submodule.update submodule: don't rely on overlayed config when setting diffopts fetch: don't overlay config with submodule-config submodule--helper: don't overlay config in update-clone submodule--helper: don't overlay config in remote_submodule_branch add, reset: ensure submodules can be added or reset submodule: don't use submodule_from_name t7411: check configuration parsing errors	2017-08-26 22:55:08 -07:00
Junio C Hamano	6b8aa3294e	Merge branch 'po/object-id' * po/object-id: sha1_file: convert index_stream to struct object_id sha1_file: convert hash_sha1_file_literally to struct object_id sha1_file: convert index_fd to struct object_id sha1_file: convert index_path to struct object_id read-cache: convert to struct object_id builtin/hash-object: convert to struct object_id	2017-08-26 22:55:07 -07:00
Junio C Hamano	0b96358479	Merge branch 'jt/diff-color-move-fix' A handful of bugfixes and an improvement to "diff --color-moved". * jt/diff-color-move-fix: diff: define block by number of alphanumeric chars diff: respect MIN_BLOCK_LENGTH for last block diff: avoid redundantly clearing a flag	2017-08-26 22:55:04 -07:00
Junio C Hamano	b6c4058f97	Merge branch 'sb/diff-color-move' "git diff" has been taught to optionally paint new lines that are the same as deleted lines elsewhere differently from genuinely new lines. * sb/diff-color-move: (25 commits) diff: document the new --color-moved setting diff.c: add dimming to moved line detection diff.c: color moved lines differently, plain mode diff.c: color moved lines differently diff.c: buffer all output if asked to diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP diff.c: convert word diffing to use emit_diff_symbol diff.c: convert show_stats to use emit_diff_symbol diff.c: convert emit_binary_diff_body to use emit_diff_symbol submodule.c: migrate diff output to use emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS} diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN] diff.c: migrate emit_line_checked to use emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO ...	2017-08-26 22:55:03 -07:00
Jonathan Tan	150e3001d0	pack: move has_sha1_pack() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Patryk Obara	98e019b067	sha1_file: convert index_path to struct object_id Convert all remaining callers as well. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-20 21:51:38 -07:00
Junio C Hamano	08a8509e50	diff: retire sane_truncate_fn Long time ago, `23707811` ("diff: do not chomp hunk-header in the middle of a character", 2008-01-02) introduced sane_truncate_line() helper function to trim the "function header" line that is shown at the end of the hunk header line, in order to avoid chomping it in the middle of a single UTF-8 character. It also added a facility to define a custom callback function to make it possible to extend it to non UTF-8 encodings. During the following 8 1/2 years, nobody found need for this custom callback facility. A custom callback function is a wrong design to use here anyway---if your contents need support for non UTF-8 encoding, you shouldn't have to write a custom function and recompile Git to plumb it in. A better approach would be to extend sane_truncate_line() function and have a new member in emit_callback to conditionally trigger it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-17 14:38:22 -07:00
Jonathan Tan	f0b8fb6e59	diff: define block by number of alphanumeric chars The existing behavior of diff --color-moved=zebra does not define the minimum size of a block at all, instead relying on a heuristic applied later to filter out sets of adjacent moved lines that are shorter than 3 lines long. This can be confusing, because a block could thus be colored as moved at the source but not at the destination (or vice versa), depending on its neighbors. Instead, teach diff that the minimum size of a block is 20 alphanumeric characters, the same heuristic used by "git blame". This allows diff to still exclude uninteresting lines appearing on their own (such as those solely consisting of one or a few closing braces), as was the intention of the adjacent-moved-line heuristic. This requires a change in some tests in that some of their lines are no longer considered to be part of a block, because they are too short. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-16 11:44:00 -07:00
Jonathan Tan	09153277f8	diff: respect MIN_BLOCK_LENGTH for last block Currently, MIN_BLOCK_LENGTH is only checked when diff encounters a line that does not belong to the current block. In particular, this means that MIN_BLOCK_LENGTH is not checked after all lines are encountered. Perform that check. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-16 11:44:00 -07:00
Jonathan Tan	23b65f9528	diff: avoid redundantly clearing a flag No code in diff.c sets DIFF_SYMBOL_MOVED_LINE except in mark_color_as_moved(), so it is redundant to clear it for the current line. Therefore, clear it only for previous lines. This makes a refactoring in a subsequent patch easier. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-14 12:28:36 -07:00
Brandon Williams	078b75e99b	diff: stop allowing diff to have submodules configured in .git/config Traditionally a submodule is comprised of a gitlink as well as a corresponding entry in the .gitmodules file. Diff doesn't follow this paradigm as its config callback routine falls back to populating the submodule-config if a config entry starts with 'submodule.'. Remove this behavior in order to be consistent with how the submodule-config is populated, via calling 'gitmodules_config()' or 'repo_read_gitmodules()'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-03 13:11:01 -07:00
Jeff King	136c8c8b8f	color: check color.ui in git_default_config() Back in prehistoric times, our decision on whether or not to show color by default relied on using a config callback that either did or didn't load color config like color.diff. When we introduced color.ui, we put it in the same boat: commands had to manually respect it by using git_color_config() or its git_color_default_config() convenience wrapper. But in `4c7f1819b` (make color.ui default to 'auto', 2013-06-10), that changed. Since then, we default color.ui to auto in all programs, meaning that even plumbing commands like "git diff-tree --pretty" might colorize the output. Nobody seems to have complained in the intervening years, presumably because the "is stdout a tty" check does a good job of catching the right cases. But that leaves an interesting curiosity: color.ui defaults to auto even in plumbing, but you can't actually _disable_ the color via config. So if you really hate color and set "color.ui" to false, diff-tree will still show color (but porcelain like git-diff won't). Nobody noticed that either, probably because very few people disable color. One could argue that the plumbing should _always_ disable color unless an explicit --color option is given on the command line. But in practice, this creates a lot of complications for scripts which do want plumbing to show user-visible output. They can't just pass "--color" blindly; they need to check the user's config and decide what to send. Given that nobody has complained about the current behavior, let's assume it's a good path, and follow it to its conclusion: supporting color.ui everywhere. Note that you can create havoc by setting color.ui=always in your config, but that's more or less already the case. We could disallow it entirely, but it is handy for one-offs like: git -c color.ui=always foo >not-a-tty when "foo" does not take a --color option itself. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-13 12:42:51 -07:00
Junio C Hamano	f056cde60e	Merge branch 'rs/use-div-round-up' Code cleanup. * rs/use-div-round-up: use DIV_ROUND_UP	2017-07-12 15:18:23 -07:00
René Scharfe	42c78a216e	use DIV_ROUND_UP Convert code that divides and rounds up to use DIV_ROUND_UP to make the intent clearer and reduce the number of magic constants. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-10 14:24:36 -07:00
Stefan Beller	86b452e276	diff.c: add dimming to moved line detection Any lines inside a moved block of code are not interesting. Boundaries of blocks are only interesting if they are next to another block of moved code. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:59:42 -07:00
Stefan Beller	176841f0c9	diff.c: color moved lines differently, plain mode Add the 'plain' mode for move detection of code. This omits the checking for adjacent blocks, so it is not as useful. If you have a lot of the same blocks moved in the same patch, the 'Zebra' would end up slow as it is O(n^2) (n is number of same blocks). So this may be useful there and is generally easy to add. Instead be very literal at the move detection, do not skip over short blocks here. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:59:42 -07:00
Stefan Beller	2e2d5ac184	diff.c: color moved lines differently When a patch consists mostly of moving blocks of code around, it can be quite tedious to ensure that the blocks are moved verbatim, and not undesirably modified in the move. To that end, color blocks that are moved within the same patch differently. For example (OM, del, add, and NM are different colors): [OM] -void sensitive_stuff(void) [OM] -{ [OM] - if (!is_authorized_user()) [OM] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OM] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NM] + sensitive_stuff(spanning, [NM] + multiple, [NM] + lines); [NM] +} However adjacent blocks may be problematic. For example, in this potentially malicious patch, the swapping of blocks can be spotted: [OM] -void sensitive_stuff(void) [OM] -{ [OMA] - if (!is_authorized_user()) [OMA] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OMA] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NMA] + sensitive_stuff(spanning, [NMA] + multiple, [NMA] + lines); [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NMA] +} If the moved code is larger, it is easier to hide some permutation in the code, which is why some alternative coloring is needed. This patch implements the first mode: * basic alternating 'Zebra' mode This conveys all information needed to the user. Defer customization to later patches. First I implemented an alternative design, which would try to fingerprint a line by its neighbors to detect if we are in a block or at the boundary. This idea iss error prone as it inspected each line and its neighboring lines to determine if the line was (a) moved and (b) if was deep inside a hunk by having matching neighboring lines. This is unreliable as the we can construct hunks which have equal neighbors that just exceed the number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that is permutated to AXYZCXYZBXYZD..'). Instead this provides a dynamic programming greedy algorithm that finds the largest moved hunk and then has several modes on highlighting bounds. A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. It is useful to have moved lines colored, but there are annoying corner cases, such as a single line moved, that is very common. For example in a typical patch of C code, we have closing braces that end statement blocks or functions. While it is technically true that these lines are moved as they show up elsewhere, it is harmful for the review as the reviewers attention is drawn to such a minor side annoyance. For now let's have a simple solution of hardcoding the number of moved lines to be at least 3 before coloring them. Note, that the length is applied across all blocks to find the 'lonely' blocks that pollute new code, but do not interfere with a permutated block where each permutation has less lines than 3. Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:59:42 -07:00
Stefan Beller	e6e045f803	diff.c: buffer all output if asked to Introduce a new option 'emitted_symbols' in the struct diff_options which controls whether all output is buffered up until all output is available. It is set internally in diff.c when necessary. We'll have a new struct 'emitted_string' in diff.c which will be used to buffer each line. The emitted_string will duplicate the memory of the line to buffer as that is easiest to reason about for now. In a future patch we may want to decrease the memory usage by not duplicating all output for buffering but rather we may want to store offsets into the file or in case of hunk descriptions such as the similarity score, we could just store the relevant number and reproduce the text later on. This approach was chosen as a first step because it is quite simple compared to the alternative with less memory footprint. emit_diff_symbol factors out the emission part and depending on the diff_options->emitted_symbols the emission will be performed directly when calling emit_diff_symbol or after the whole process is done, i.e. by buffering we have add the possibility for a second pass over the whole output before doing the actual output. In `6440d34` (2012-03-14, diff: tweak a _copy_ of diff_options with word-diff) we introduced a duplicate diff options struct for word emissions as we may have different regex settings in there. When buffering the output, we need to operate on just one buffer, so we have to copy back the emissions of the word buffer into the main buffer. Unconditionally enable output via buffer in this patch as it yields a great opportunity for testing, i.e. all the diff tests from the test suite pass without having reordering issues (i.e. only parts of the output got buffered, and we forgot to buffer other parts). The test suite passes, which gives confidence that we converted all functions to use emit_string for output. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	146fdb0dfe	diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	30b7e1e7ef	diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	bd033291d5	diff.c: convert word diffing to use emit_diff_symbol The word diffing is not line oriented and would need some serious effort to be transformed into a line oriented approach, so just go with a symbol DIFF_SYMBOL_WORD_DIFF that is a partial line. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	0911c475c8	diff.c: convert show_stats to use emit_diff_symbol We call print_stat_summary from builtin/apply, so we still need the version with a file pointer, so introduce print_stat_summary_0 that uses emit_string machinery and keep print_stat_summary with the same arguments around. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	4eed0ebd4d	diff.c: convert emit_binary_diff_body to use emit_diff_symbol Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	f3597138df	submodule.c: migrate diff output to use emit_diff_symbol As the submodule process is no longer attached to the same file pointer 'o->file' as the superprojects process, there is a different result in color.c::check_auto_color. That is why we need to pass coloring explicitly, such that the submodule coloring decision will be made by the child process processing the submodule. Only DIFF_SYMBOL_SUBMODULE_PIPETHROUGH contains color, the other symbols are for embedding the submodule output into the superprojects output. Remove the colors from the function signatures, as all the coloring decisions will be made either inside the child process or the final emit_diff_symbol, but not in the functions driving the submodule diff. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	5af6ea957c	diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:02 -07:00
Stefan Beller	4acaaa7af6	diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES we could save a little bit of memory when buffering in a later mode by just passing the inner part ("%s and %s", file1, file 2), but those a just a few bytes, so instead let's reuse the implementation from DIFF_SYMBOL_HEADER and keep the whole line around. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	a29b0a13bd	diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER The header is constructed lazily including line breaks, so just emit the raw string as is. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	3ee8b7bfe4	diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS} We have to use fprintf instead of emit_line, because we want to emit the tab after the color. This is important for ancient versions of gnu patch AFAICT, although we probably do not want to feed colored output to the patch utility, such that it would not matter if the trailing tab is colored. Keep the corner case as-is though. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	f2bb1218f1	diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE The context marker use the exact same output pattern, so reuse it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	ff958679cd	diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN] Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	091f8e28b4	diff.c: migrate emit_line_checked to use emit_diff_symbol Add a new flags field to emit_diff_symbol, that will be used by context lines for: * white space rules that are applicable (The first 12 bits) Take a note in cahe.c as well, when this ws rules are extended we have to fix the bits in the flags field. * how the rules are evaluated (actually this double encodes the sign of the line, but the code is easier to keep this way, bits 13,14,15) * if the line a blank line at EOF (bit 16) The check if new lines need to be marked up as extra lines at the end of file, is now done unconditionally. That should be ok, as 'new_blank_line_at_eof' has a quick early return. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	b9cbfde6b1	diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	68abc6f1c7	diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	c64b420b4c	diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_MARKER Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	36a4cefdf4	diff.c: introduce emit_diff_symbol In a later patch we want to buffer all output before emitting it as a new feature ("markup moved lines") conceptually cannot be implemented in a single pass over the output. There are different approaches to buffer all output such as: * Buffering on the char level, i.e. we'd have a char[] which would grow at approximately 80 characters a line. This would keep the output completely unstructured, but might be very easy to implement, such as redirecting all output to a temporary file and working off that. The later passes over the buffer are quite complicated though, because we have to parse back any output and then decide if it should be modified. * Buffer on a line level. As the output is mostly line oriented already, this would make sense, but it still is a bit awkward as we'd have to make sense of it again by looking at the first characters of a line to decide what part of a diff a line is. * Buffer semantically. Imagine there is a formal grammar for the diff output and we'd keep the symbols of this grammar around. This keeps the highest level of structure in the buffered data, such that the actual memory requirements are less than say the first option. Instead of buffering the characters of the line, we'll buffer what we intend to do plus additional information for the specifics. An output of diff --git a/new.txt b/new.txt index fa69b07..412428c 100644 Binary files a/new.txt and b/new.txt differ could be buffered as DIFF_SYMBOL_DIFF_START + new.txt DIFF_SYMBOL_INDEX_MODE + fa69b07 412428c "non-executable" flag DIFF_SYMBOL_BINARY_FILES + new.txt This and the following patches introduce the third option of buffering by first moving any output to emit_diff_symbol, and then introducing the buffering in this function. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	ec33150671	diff.c: factor out diff_flush_patch_all_file_pairs In a later patch we want to do more things before and after all filepairs are flushed. So factor flushing out all file pairs into its own function that the new code can be plugged in easily. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	dfb7728f63	diff.c: move line ending check into emit_hunk_header The emit_hunk_header() function is responsible for assembling a hunk header and calling emit_line() to send the hunk header to the output file. Its only caller fn_out_consume() needs to prepare for a case where the function emits an incomplete line and add the terminating LF. Instead make sure emit_hunk_header() to always send a completed line to emit_line(). Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Stefan Beller	f2d2a5def0	diff.c: readability fix We already have dereferenced 'p->two' into a local variable 'two'. Use that. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:13:01 -07:00
Junio C Hamano	50f03c6676	Merge branch 'ab/free-and-null' A common pattern to free a piece of memory and assign NULL to the pointer that used to point at it has been replaced with a new FREE_AND_NULL() macro. * ab/free-and-null: *.[ch] refactoring: make use of the FREE_AND_NULL() macro coccinelle: make use of the "expression" FREE_AND_NULL() rule coccinelle: add a rule to make "expression" code use FREE_AND_NULL() coccinelle: make use of the "type" FREE_AND_NULL() rule coccinelle: add a rule to make "type" code use FREE_AND_NULL() git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL	2017-06-24 14:28:41 -07:00
Junio C Hamano	f31d23a399	Merge branch 'bw/config-h' Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h	2017-06-24 14:28:41 -07:00
Junio C Hamano	5812b3f73b	Merge branch 'bw/ls-files-sans-the-index' Code clean-up. * bw/ls-files-sans-the-index: ls-files: factor out tag calculation ls-files: factor out debug info into a function ls-files: convert show_files to take an index ls-files: convert show_ce_entry to take an index ls-files: convert prune_cache to take an index ls-files: convert ce_excluded to take an index ls-files: convert show_ru_info to take an index ls-files: convert show_other_files to take an index ls-files: convert show_killed_files to take an index ls-files: convert write_eolinfo to take an index ls-files: convert overlay_tree_on_cache to take an index tree: convert read_tree to take an index parameter convert: convert renormalize_buffer to take an index convert: convert convert_to_git to take an index convert: convert convert_to_git_filter_fd to take an index convert: convert crlf_to_git to take an index convert: convert get_cached_convert_stats_ascii to take an index	2017-06-24 14:28:40 -07:00
Junio C Hamano	a6f38c109b	Merge branch 'bw/object-id' Conversion from uchar[20] to struct object_id continues. * bw/object-id: (33 commits) diff: rename diff_fill_sha1_info to diff_fill_oid_info diffcore-rename: use is_empty_blob_oid tree-diff: convert path_appendnew to object_id tree-diff: convert diff_tree_paths to struct object_id tree-diff: convert try_to_follow_renames to struct object_id builtin/diff-tree: cleanup references to sha1 diff-tree: convert diff_tree_sha1 to struct object_id notes-merge: convert write_note_to_worktree to struct object_id notes-merge: convert verify_notes_filepair to struct object_id notes-merge: convert find_notes_merge_pair_ps to struct object_id notes-merge: convert merge_from_diffs to struct object_id notes-merge: convert notes_merge* to struct object_id tree-diff: convert diff_root_tree_sha1 to struct object_id combine-diff: convert find_paths_* to struct object_id combine-diff: convert diff_tree_combined to struct object_id diff: convert diff_flush_patch_id to struct object_id patch-ids: convert to struct object_id diff: finish conversion for prepare_temp_file to struct object_id diff: convert reuse_worktree_file to struct object_id diff: convert fill_filespec to struct object_id ...	2017-06-19 12:38:44 -07:00
Ævar Arnfjörð Bjarmason	6a83d90207	coccinelle: make use of the "type" FREE_AND_NULL() rule Apply the result of the just-added coccinelle rule. This manually excludes a few occurrences, mostly things that resulted in many FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-16 12:44:03 -07:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	b9a7d55d93	Merge branch 'nd/fopen-errors' We often try to open a file for reading whose existence is optional, and silently ignore errors from open/fopen; report such errors if they are not due to missing files. * nd/fopen-errors: mingw_fopen: report ENOENT for invalid file names mingw: verify that paths are not mistaken for remote nicknames log: fix memory leak in open_next_file() rerere.c: move error_errno() closer to the source system call print errno when reporting a system call error wrapper.c: make warn_on_inaccessible() static wrapper.c: add and use fopen_or_warn() wrapper.c: add and use warn_on_fopen_errors() config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD clone: use xfopen() instead of fopen() use xfopen() in more places git_fopen: fix a sparse 'not declared' warning	2017-06-13 13:47:09 -07:00
Brandon Williams	82b474e025	convert: convert convert_to_git to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-13 11:40:51 -07:00
Brandon Williams	94e327e973	diff: rename diff_fill_sha1_info to diff_fill_oid_info Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-05 11:23:58 +09:00
Junio C Hamano	583c6a2295	Merge branch 'js/blame-lib' The internal logic used in "git blame" has been libified to make it easier to use by cgit. * js/blame-lib: (29 commits) blame: move entry prepend to libgit blame: move scoreboard setup to libgit blame: move scoreboard-related methods to libgit blame: move fake-commit-related methods to libgit blame: move origin-related methods to libgit blame: move core structures to header blame: create entry prepend function blame: create scoreboard setup function blame: create scoreboard init function blame: rework methods that determine 'final' commit blame: wrap blame_sort and compare_blame_final blame: move progress updates to a scoreboard callback blame: make sanity_check use a callback in scoreboard blame: move no_whole_file_rename flag to scoreboard blame: move xdl_opts flags to scoreboard blame: move show_root flag to scoreboard blame: move reverse flag to scoreboard blame: move contents_from to scoreboard blame: move copy/move thresholds to scoreboard blame: move stat counters to scoreboard ...	2017-06-05 09:18:12 +09:00

1 2 3 4 5 ...

1200 commits