development/git - HydraGit

mirror of https://github.com/git/git synced 2024-11-05 18:59:29 +00:00

Author	SHA1	Message	Date
Junio C Hamano	9da9fff14d	Merge branch 'nd/clone-case-smashing-warning' Recently added check for case smashing filesystems did not correctly utilize the cached stat information, leading to false breakage detected by our test suite, which has been corrected. * nd/clone-case-smashing-warning: clone: fix colliding file detection on APFS	2018-11-21 20:39:03 +09:00
Nguyễn Thái Ngọc Duy	e66ceca94b	clone: fix colliding file detection on APFS Commit `b878579ae7` (clone: report duplicate entries on case-insensitive filesystems - 2018-08-17) adds a warning to user when cloning a repo with case-sensitive file names on a case-insensitive file system. The "find duplicate file" check was doing by comparing inode number (and only fall back to fspathcmp() when inode is known to be unreliable because fspathcmp() can't cover all case folding cases). The inode check is very simple, and wrong. It compares between a 32-bit number (sd_ino) and potentially a 64-bit number (st_ino). When an inode is larger than 2^32 (which seems to be the case for APFS), it will be truncated and stored in sd_ino, but comparing with itself will fail. As a result, instead of showing a pair of files that have the same name, we show just one file (marked before the beginning of the loop). We fail to find the original one. The fix could be just a simple type cast () dup->ce_stat_data.sd_ino == (unsigned int)st->st_ino but this is no longer a reliable test, there are 4G possible inodes that can match sd_ino because we only match the lower 32 bits instead of full 64 bits. There are two options to go. Either we ignore inode and go with fspathcmp() on Apple platform. This means we can't do accurate inode check on HFS anymore, or even on APFS when inode numbers are still below 2^32. Or we just to to reduce the odds of matching a wrong file by checking more attributes, counting mostly on st_size because st_xtime is likely the same. This patch goes with this direction, hoping that false positive chances are too small to be seen in practice. While at there, enable the test on Cygwin (verified working by Ramsay Jones) () this is also already done inside match_stat_data() Reported-by: Carlo Arenas <carenas@gmail.com> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-21 14:05:13 +09:00
Junio C Hamano	c2407322b6	Merge branch 'nd/clone-case-smashing-warning' Running "git clone" against a project that contain two files with pathnames that differ only in cases on a case insensitive filesystem would result in one of the files lost because the underlying filesystem is incapable of holding both at the same time. An attempt is made to detect such a case and warn. * nd/clone-case-smashing-warning: clone: report duplicate entries on case-insensitive filesystems	2018-09-17 13:53:47 -07:00
Duy Nguyen	b878579ae7	clone: report duplicate entries on case-insensitive filesystems Paths that only differ in case work fine in a case-sensitive filesystems, but if those repos are cloned in a case-insensitive one, you'll get problems. The first thing to notice is "git status" will never be clean with no indication what exactly is "dirty". This patch helps the situation a bit by pointing out the problem at clone time. Even though this patch talks about case sensitivity, the patch makes no assumption about folding rules by the filesystem. It simply observes that if an entry has been already checked out at clone time when we're about to write a new path, some folding rules are behind this. In the case that we can't rely on filesystem (via inode number) to do this check, fall back to fspathcmp() which is not perfect but should not give false positives. This patch is tested with vim-colorschemes and Sublime-Gitignore repositories on a JFS partition with case insensitive support on Linux. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-17 12:10:37 -07:00
Nguyễn Thái Ngọc Duy	74cfc0ee1d	entry.c: use the right index instead of the_index checkout-index.c needs update because if checkout->istate is NULL, ie_match_stat() will crash. Previously this is ie_match_stat(&the_index, ..) so it will not crash, but it is not technically correct either. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy	7f944e264e	convert.c: remove an implicit dependency on the_index Make the convert API take an index_state instead of assuming the_index in convert.c. All external call sites are converted blindly to keep the patch simple and retain current behavior. Individual call sites may receive further updates to use the right index instead of the_index. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 14:14:42 -07:00
Stefan Beller	cbd53a2193	object-store: move object access functions to object-store.h This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-16 11:42:03 +09:00
brian m. carlson	1a750441a7	convert: convert to struct object_id Convert convert.c to struct object_id. Add a use of the_hash_algo to replace hard-coded constants and change a strbuf_add to a strbuf_addstr to avoid another hard-coded constant. Note that a strict conversion using the hexsz constant would cause problems in the future if the internal and user-visible hash algorithms differed, as anticipated by the hash function transition plan. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:50 -07:00
brian m. carlson	b4f5aca40e	sha1_file: convert read_sha1_file to struct object_id Convert read_sha1_file to take a pointer to struct object_id and rename it read_object_file. Do the same for read_sha1_file_extended. Convert one use in grep.c to use the new function without any other code change, since the pointer being passed is a void pointer that is already initialized with a pointer to struct object_id. Update the declaration and definitions of the modified functions, and apply the following semantic patch to convert the remaining callers: @@ expression E1, E2, E3; @@ - read_sha1_file(E1.hash, E2, E3) + read_object_file(&E1, E2, E3) @@ expression E1, E2, E3; @@ - read_sha1_file(E1->hash, E2, E3) + read_object_file(E1, E2, E3) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1.hash, E2, E3, E4) + read_object_file_extended(&E1, E2, E3, E4) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1->hash, E2, E3, E4) + read_object_file_extended(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:50 -07:00
Brandon Williams	d8f71807c1	entry: rename 'new' variables Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-22 10:08:05 -08:00
Junio C Hamano	e05336bdda	Merge branch 'bp/fsmonitor' We learned to talk to watchman to speed up "git status" and other operations that need to see which paths have been modified. * bp/fsmonitor: fsmonitor: preserve utf8 filenames in fsmonitor-watchman log fsmonitor: read entirety of watchman output fsmonitor: MINGW support for watchman integration fsmonitor: add a performance test fsmonitor: add a sample integration script for Watchman fsmonitor: add test cases for fsmonitor extension split-index: disable the fsmonitor extension when running the split index test fsmonitor: add a test tool to dump the index extension update-index: add fsmonitor support to update-index ls-files: Add support in ls-files to display the fsmonitor valid bit fsmonitor: add documentation for the fsmonitor extension. fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. update-index: add a new --force-write-index option preload-index: add override to enable testing preload-index bswap: add 64 bit endianness helper get_be64	2017-11-21 14:07:50 +09:00
Junio C Hamano	6909bf6bd9	Merge branch 'ls/filter-process-delayed' Bugfixes to an already graduated series. * ls/filter-process-delayed: write_entry: untangle symlink and regular-file cases write_entry: avoid reading blobs in CE_RETRY case write_entry: fix leak when retrying delayed filter entry.c: check if file exists after checkout entry.c: update cache entry only for existing files	2017-10-11 14:52:24 +09:00
Jeff King	7cbbf9d6a2	write_entry: untangle symlink and regular-file cases The write_entry() function switches on the mode of the entry we're going to write out. The cases for S_IFLNK and S_IFREG are lumped together. In earlier versions of the code, this made some sense. They have a shared preamble (which reads the blob content), a short type-specific body, and a shared conclusion (which writes out the file contents; always for S_IFREG and only sometimes for S_IFLNK). But over time this has grown to make less sense. The preamble now has conditional bits for each type, and the S_IFREG body has grown a lot more complicated. It's hard to follow the logic of which code is running for which mode. Let's give each mode its own case arm. We will still share the conclusion code, which means we now jump to it with a goto. Ideally we'd pull that shared code into its own function, but it touches so much internal state in the write_entry() function that the end result is actually harder to follow than the goto. While we're here, we'll touch up a few bits of whitespace to make the beginning and endings of the cases easier to read. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-10 09:03:07 +09:00
Jeff King	c602d3a989	write_entry: avoid reading blobs in CE_RETRY case When retrying a delayed filter-process request, we don't need to send the blob to the filter a second time. However, we read it unconditionally into a buffer, only to later throw away that buffer. We can make this more efficient by skipping the read in the first place when it isn't necessary. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-10 08:59:57 +09:00
Jeff King	b2401586fc	write_entry: fix leak when retrying delayed filter When write_entry() retries a delayed filter request, we don't need to send the blob content to the filter again, and set the pointer to NULL. But doing so means we leak the contents we read earlier from read_blob_entry(). Let's make sure to free it before dropping the pointer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-10 08:59:02 +09:00
Lars Schneider	11179eb311	entry.c: check if file exists after checkout If we are checking out a file and somebody else racily deletes our file, then we would write garbage to the cache entry. Fix that by checking the result of the lstat() call on that file. Print an error to the user if the file does not exist. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-06 14:59:16 +09:00
Lars Schneider	03b95333db	entry.c: update cache entry only for existing files In `2841e8f` ("convert: add "status=delayed" to filter process protocol", 2017-06-30) we taught the filter process protocol to delay responses. That means an external filter might answer in the first write_entry() call on a file that requires filtering "I got your request, but I can't answer right now. Ask again later!". As Git got no answer, we do not write anything to the filesystem. Consequently, the lstat() call in the finish block of the function writes garbage to the cache entry. The garbage is eventually overwritten when the filter answers with the final file content in a subsequent write_entry() call. Fix the brief time window of garbage in the cache entry by adding a special finish block that does nothing for delayed responses. The cache entry is written properly in a subsequent write_entry() call where the filter responds with the final file content. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-05 20:01:07 +09:00
Ben Peart	883e248b8a	fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. When the index is read from disk, the fsmonitor index extension is used to flag the last known potentially dirty index entries. The registered core.fsmonitor command is called with the time the index was last updated and returns the list of files changed since that time. This list is used to flag any additional dirty cache entries and untracked cache directories. We can then use this valid state to speed up preload_index(), ie_match_stat(), and refresh_cache_ent() as they do not need to lstat() files to detect potential changes for those entries marked CE_FSMONITOR_VALID. In addition, if the untracked cache is turned on valid_cached_dir() can skip checking directories for new or changed files as fsmonitor will invalidate the cache only for those directories that have been identified as having potential changes. To keep the CE_FSMONITOR_VALID state accurate during git operations; when git updates a cache entry to match the current state on disk, it will now set the CE_FSMONITOR_VALID bit. Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit is cleared and the corresponding untracked cache directory is marked invalid. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-01 17:23:01 +09:00
Junio C Hamano	c50424a6f0	Merge branch 'jk/write-in-full-fix' Many codepaths did not diagnose write failures correctly when disks go full, due to their misuse of write_in_full() helper function, which have been corrected. * jk/write-in-full-fix: read_pack_header: handle signed/unsigned comparison in read result config: flip return value of store_write_*() notes-merge: use ssize_t for write_in_full() return value pkt-line: check write_in_full() errors against "< 0" convert less-trivial versions of "write_in_full() != len" avoid "write_in_full(fd, buf, len) != len" pattern get-tar-commit-id: check write_in_full() return against 0 config: avoid "write_in_full(fd, buf, len) < len" pattern	2017-09-25 15:24:06 +09:00
Jeff King	564bde9ae6	convert less-trivial versions of "write_in_full() != len" The prior commit converted many sites to check the return value of write_in_full() for negativity, rather than a mismatch with the input length. This patch covers similar cases, but where the return value is stored in an intermediate variable. These should get the same treatment, but they need to be reviewed more carefully since it would be a bug if the return value is stored in an unsigned type (which indeed, it is in one of the cases). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-14 15:17:59 +09:00
Junio C Hamano	7fbbd3ec0f	Merge branch 'ls/convert-filter-progress' The codepath to call external process filter for smudge/clean operation learned to show the progress meter. * ls/convert-filter-progress: convert: display progress for filtered objects that have been delayed	2017-09-10 17:08:22 +09:00
Lars Schneider	52f1d62eb4	convert: display progress for filtered objects that have been delayed In `2841e8f` ("convert: add "status=delayed" to filter process protocol", 2017-06-30) we taught the filter process protocol to delayed responses. These responses are processed after the "Checking out files" phase. If the processing takes noticeable time, then the user might think Git is stuck. Display the progress of the delayed responses to let the user know that Git is still processing objects. This works very well for objects that can be filtered quickly. If filtering of an individual object takes noticeable time, then the user might still think that Git is stuck. However, in that case the user would at least know what Git is doing. It would be technical more correct to display "Checking out files whose content filtering has been delayed". For brevity we only print "Filtering content". The finish_delayed_checkout() call was moved below the stop_progress() call in unpack-trees.c to ensure that the "Checking out files" progress is properly stopped before the "Filtering content" progress starts in finish_delayed_checkout(). Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Suggested-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-24 12:41:20 -07:00
Lars Schneider	2841e8f81c	convert: add "status=delayed" to filter process protocol Some `clean` / `smudge` filters may require a significant amount of time to process a single blob (e.g. the Git LFS smudge filter might perform network requests). During this process the Git checkout operation is blocked and Git needs to wait until the filter is done to continue with the checkout. Teach the filter process protocol, introduced in `edcc8581` ("convert: add filter.<driver>.process option", 2016-10-16), to accept the status "delayed" as response to a filter request. Upon this response Git continues with the checkout operation. After the checkout operation Git calls "finish_delayed_checkout" which queries the filter for remaining blobs. If the filter is still working on the completion, then the filter is expected to block. If the filter has completed all remaining blobs then an empty response is expected. Git has a multiple code paths that checkout a blob. Support delayed checkouts only in `clone` (in unpack-trees.c) and `checkout` operations for now. The optimization is most effective in these code paths as all files of the tree are processed. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:50:41 -07:00
Stefan Beller	cd279e2e1b	entry.c: submodule recursing: respect force flag correctly In case of a non-forced worktree update, the submodule movement is tested in a dry run first, such that it doesn't matter if the actual update is done via the force flag. However for correctness, we want to give the flag as specified by the user. All callers have been inspected and updated if needed. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-18 21:18:29 -07:00
Stefan Beller	6d14eac3ec	entry.c: create submodules when interesting When a submodule is introduced with a new revision we need to create the submodule in the worktree as well. As 'submodule_move_head' handles edge cases, all we have to do is call it from within the function that creates new files in the working tree for workingtree operations. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-16 14:07:16 -07:00
brian m. carlson	7eda0e4fbb	streaming: make stream_blob_to_fd take struct object_id Since all of its callers have been updated, modify stream_blob_to_fd to take a struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 12:59:42 -07:00
brian m. carlson	99d1a9861a	cache: convert struct cache_entry to use struct object_id Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 12:59:42 -07:00
Nguyễn Thái Ngọc Duy	e1ebb3c25b	entry.c: use error_errno() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-05-09 12:29:08 -07:00
Jeff King	3733e69464	use xmallocz to avoid size arithmetic We frequently allocate strings as xmalloc(len + 1), where the extra 1 is for the NUL terminator. This can be done more simply with xmallocz, which also checks for integer overflow. There's no case where switching xmalloc(n+1) to xmallocz(n) is wrong; the result is the same length, and malloc made no guarantees about what was in the buffer anyway. But in some cases, we can stop manually placing NUL at the end of the allocated buffer. But that's only safe if it's clear that the contents will always fill the buffer. In each case where this patch does so, I manually examined the control flow, and I tried to err on the side of caution. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:51:09 -08:00
Jeff King	330c8e2670	entry.c: convert strcpy to xsnprintf This particular conversion is non-obvious, because nobody has passed our function the length of the destination buffer. However, the interface to checkout_entry specifies that the buffer must be at least TEMPORARY_FILENAME_LENGTH bytes long, so we can check that (meaning the existing code was not buggy, but merely worrisome to somebody reading it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-09-25 10:18:18 -07:00
Nguyễn Thái Ngọc Duy	078a58e825	read-cache: mark updated entries for split index The large part of this patch just follows CE_ENTRY_CHANGED marks. replace_index_entry() is updated to update split_index->base->cache[] as well so base->cache[] does not reference to a freed entry. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:40 -07:00
Nguyễn Thái Ngọc Duy	d4a2024aef	entry.c: update cache_changed if refresh_cache is set in checkout_entry() Other fill_stat_cache_info() is on new entries, which should set CE_ENTRY_ADDED in cache_changed, so we're safe. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-13 11:49:39 -07:00
Junio C Hamano	ec8cd4fc11	Merge branch 'mh/remove-subtree-long-pathname-fix' * mh/remove-subtree-long-pathname-fix: entry.c: fix possible buffer overflow in remove_subtree() checkout_entry(): use the strbuf throughout the function	2014-03-25 11:07:09 -07:00
Michael Haggerty	2f29e0c6fa	entry.c: fix possible buffer overflow in remove_subtree() remove_subtree() manipulated path in a fixed-size buffer even though the length of the input, let alone the length of entries within the directory, were not known in advance. Change the function to take a strbuf argument and use that object as its scratch space. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-13 10:57:48 -07:00
Michael Haggerty	f63272a35e	checkout_entry(): use the strbuf throughout the function There is no need to break out the "buf" and "len" members into separate temporary variables. Rename path_buf to path and use path.buf and path.len directly. This makes it easier to reason about the data flow in the function. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-13 10:56:50 -07:00
Junio C Hamano	af2a651d2e	checkout_entry(): clarify the use of topath[] parameter The said function has this signature: extern int checkout_entry(struct cache_entry ce, const struct checkout state, char *topath); At first glance, it might appear that the caller of checkout_entry() can specify to which path the contents are written out by the last parameter, and it is tempting to add "const" in front of its type. In reality, however, topath[] is to point at a buffer to store the temporary path generated by the callchain originating from this function, and the temporary path is always short, much shorter than the buffer prepared by its only caller in builtin/checkout-index.c. Document the code a bit to clarify so that future callers know how to use the function better. Noticed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-24 14:59:39 -07:00
Nguyễn Thái Ngọc Duy	fd356f6aa8	entry.c: convert checkout_entry to use strbuf The old code does not do boundary check so any paths longer than PATH_MAX can cause buffer overflow. Replace it with strbuf to handle paths of arbitrary length. The OS may reject if the path is too long though. But in that case we report the cause (e.g. name too long) and usually move on to checking out the next entry. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-24 14:58:37 -07:00
Junio C Hamano	d3aeb31dc4	Merge branch 'nd/const-struct-cache-entry' * nd/const-struct-cache-entry: Convert "struct cache_entry *" to "const ..." wherever possible	2013-07-22 11:24:01 -07:00
Thomas Rast	42063f95a0	apply, entry: speak of submodules instead of subprojects There are only four (with some generous rounding) instances in the current source code where we speak of "subproject" instead of "submodule". They are as follows: * one error message in git-apply and two in entry.c * the patch format for submodule changes The latter was introduced in `0478675` (Expose subprojects as special files to "git diff" machinery, 2007-04-15), apparently before the terminology was settled. We can of course not change the patch format. Let's at least change the error messages to consistently call them "submodule". Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-18 10:56:06 -07:00
Nguyễn Thái Ngọc Duy	9c5e6c802c	Convert "struct cache_entry " to "const ..." wherever possible I attempted to make index_state->cache[] a "const struct cache_entry " to find out how existing entries in index are modified and where. The question I have is what do we do if we really need to keep track of on-disk changes in the index. The result is - diff-lib.c: setting CE_UPTODATE - name-hash.c: setting CE_HASHED - preload-index.c, read-cache.c, unpack-trees.c and builtin/update-index: obvious - entry.c: write_entry() may refresh the checked out entry via fill_stat_cache_info(). This causes "non-const struct cache_entry " in builtin/apply.c, builtin/checkout-index.c and builtin/checkout.c - builtin/ls-files.c: --with-tree changes stagemask and may set CE_UPDATE Of these, write_entry() and its call sites are probably most interesting because it modifies on-disk info. But this is stat info and can be retrieved via refresh, at least for porcelain commands. Other just uses ce_flags for local purposes. So, keeping track of "dirty" entries is just a matter of setting a flag in index modification functions exposed by read-cache.c. Except unpack-trees, the rest of the code base does not do anything funny behind read-cache's back. The actual patch is less valueable than the summary above. But if anyone wants to re-identify the above sites. Applying this patch, then this: diff --git a/cache.h b/cache.h index 430d021..1692891 100644 --- a/cache.h +++ b/cache.h @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode) #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1) struct index_state { - struct cache_entry cache; + const struct cache_entry cache; unsigned int version; unsigned int cache_nr, cache_alloc, cache_changed; struct string_list *resolve_undo; will help quickly identify them without bogus warnings. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-09 09:12:48 -07:00
Junio C Hamano	b9c78e9723	Merge branch 'jk/check-corrupt-objects-carefully' Have the streaming interface and other codepaths more carefully examine for corrupt objects. * jk/check-corrupt-objects-carefully: clone: leave repo in place after checkout errors clone: run check_everything_connected clone: die on errors from unpack_trees add tests for cloning corrupted repositories streaming_write_entry: propagate streaming errors add test for streaming corrupt blobs avoid infinite loop in read_istream_loose read_istream_filtered: propagate read error from upstream check_sha1_signature: check return value from read_istream stream_blob_to_fd: detect errors reading from stream	2013-04-03 09:34:29 -07:00
Junio C Hamano	39c5835dd6	Merge branch 'jk/checkout-attribute-lookup' Codepath to stream blob object contents directly from the object store to filesystem did not use the correct path to find conversion filters when writing to temporary files. * jk/checkout-attribute-lookup: t2003: work around path mangling issue on Windows entry: fix filter lookup t2003: modernize style	2013-03-28 14:37:46 -07:00
Jeff King	d9c31e14d0	streaming_write_entry: propagate streaming errors When we are streaming an index blob to disk, we store the error from stream_blob_to_fd in the "result" variable, and then immediately overwrite that with the return value of "close". That means we catch errors on close (e.g., problems committing the file to disk), but miss anything which happened before then. We can fix this by using bitwise-OR to accumulate errors in our result variable. While we're here, we can also simplify the error handling with an early return, which makes it easier to see under which circumstances we need to clean up. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-03-27 13:47:09 -07:00
John Keeping	7297a44012	entry: fix filter lookup When looking up the stream filter, write_entry() should be passing the path of the file in the repository, not the path to which the content is going to be written. This allows the file to be correctly looked up against the .gitattributes files in the working tree. This change makes the streaming case match the non-streaming case which passes ce->name to convert_to_working_tree later in the same function. The two tests added here test the different paths through write_entry since the CRLF filter is a streaming filter but the user-defined smudge filter is not streamed. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-03-14 14:49:48 -07:00
Junio C Hamano	47a02ff2ca	streaming: make streaming-write-entry to be more reusable The static function in entry.c takes a cache entry and streams its blob contents to a file in the working tree. Refactor the logic to a new API function stream_blob_to_fd() that takes an object name and an open file descriptor, so that it can be reused by other callers. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-07 09:07:37 -08:00
Junio C Hamano	b6691092d7	Add streaming filter API This introduces an API to plug custom filters to an input stream. The caller gets get_stream_filter("path") to obtain an appropriate filter for the path, and then uses it when opening an input stream via open_istream(). After that, the caller can read from the stream with read_istream(), and close it with close_istream(), just like an unfiltered stream. This only adds a "null" filter that is a pass-thru filter, but later changes can add LF-to-CRLF and other filters, and the callers of the streaming API do not have to change. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-26 16:47:15 -07:00
Junio C Hamano	de6182db67	streaming_write_entry(): support files with holes One typical use of a large binary file is to hold a sparse on-disk hash table with a lot of holes. Help preserving the holes with lseek(). Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-20 23:16:53 -07:00
Junio C Hamano	dd8e912190	streaming_write_entry(): use streaming API in write_entry() When the output to a path does not have to be converted, we can read from the object database from the streaming API and write to the file in the working tree, without having to hold everything in the memory. The ident, auto- and safe- crlf conversions inherently require you to read the whole thing before deciding what to do, so while it is technically possible to support them by using a buffer of an unbound size or rewinding and reading the stream twice, it is less practical than the traditional "read the whole thing in core and convert" approach. Adding streaming filters for the other conversions on top of this should be doable by tweaking the can_bypass_conversion() function (it should be renamed to can_filter_stream() when it happens). Then the streaming API can be extended to wrap the git_istream streaming_write_entry() opens on the underlying object in another git_istream that reads from it, filters what is read, and let the streaming_write_entry() read the filtered result. But that is outside the scope of this series. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-20 18:46:58 -07:00
Junio C Hamano	fd5db55d8b	write_entry(): separate two helper functions out In the write-out codepath, a block of code determines what file in the working tree to write to, and opens an output file descriptor to it. After writing the contents out to the file, another block of code runs fstat() on the file descriptor when appropriate. Separate these blocks out to open_output_fd() and fstat_output() helper functions. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-20 18:38:54 -07:00
Nguyễn Thái Ngọc Duy	d43e90732b	entry.c: remove "checkout-index" from error messages Back then when entry.c was part of checkout-index (or checkout-cache at that time [1]). It makes sense to print the command name in error messages. Nowadays entry.c is in libgit and can be used by any commands, printing "git checkout-index: blah" does no more than confusion. The error messages without it still give enough information. [1] `12dccc1` (Make fiel checkout function available to the git library - 2005-06-05) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-11-29 14:03:07 -08:00

1 2 3

104 commits