Commit graph

153 commits

Author SHA1 Message Date
Junio C Hamano ff2be2610a Merge branch 'jk/blame-first-parent'
"git blame --first-parent v1.0..v2.0" was not rejected but did not
limit the blame to commits on the first parent chain.

* jk/blame-first-parent:
  blame: handle --first-parent
2015-10-05 12:30:24 -07:00
Junio C Hamano 7b09c459d3 Merge branch 'jk/date-local'
"git log --date=local" used to only show the normal (default)
format in the local timezone.  The command learned to take 'local'
as an instruction to use the local timezone with other formats,
e.g. "git show --date=rfc-local".

* jk/date-local:
  t6300: add tests for "-local" date formats
  t6300: make UTC and local dates different
  date: make "local" orthogonal to date format
  date: check for "local" before anything else
  t6300: add test for "raw" date format
  t6300: introduce test_date() helper
  fast-import: switch crash-report date to iso8601
  Documentation/rev-list: don't list date formats
  Documentation/git-for-each-ref: don't list date formats
  Documentation/config: don't list date formats
  Documentation/blame-options: don't list date formats
2015-10-05 12:30:13 -07:00
Jeff King 95a4fb0eac blame: handle --first-parent
The revision.c options-parser will parse "--first-parent"
for us, but the blame code does not actually respect it, as
we simply iterate over the whole list returned by
first_scapegoat(). We can fix this by returning a
truncated parent list.

Note that we could technically also do so by limiting the
return value of num_scapegoats(), but that is less robust.
We would rely on nobody ever looking at the "next" pointer
from the returned list.

Combining "--reverse" with "--first-parent" is more
complicated, and will probably involve cooperation from
revision.c. Since the desired semantics are not even clear,
let's punt on this for now, but explicitly disallow it to
avoid confusing users (this is not really a regression,
since it did something nonsensical before).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-16 09:59:05 -07:00
Jeff King add00ba2de date: make "local" orthogonal to date format
Most of our "--date" modes are about the format of the date:
which items we show and in what order. But "--date=local" is
a bit of an oddball. It means "show the date in the normal
format, but using the local timezone". The timezone we use
is orthogonal to the actual format, and there is no reason
we could not have "localized iso8601", etc.

This patch adds a "local" boolean field to "struct
date_mode", and drops the DATE_LOCAL element from the
date_mode_type enum (it's now just DATE_NORMAL plus
local=1). The new feature is accessible to users by adding
"-local" to any date mode (e.g., "iso-local"), and we retain
"local" as an alias for "default-local" for backwards
compatibility.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-03 15:45:26 -07:00
Jeff King f932729cc7 memoize common git-path "constant" files
One of the most common uses of git_path() is to pass a
constant, like git_path("MERGE_MSG"). This has two
drawbacks:

  1. The return value is a static buffer, and the lifetime
     is dependent on other calls to git_path, etc.

  2. There's no compile-time checking of the pathname. This
     is OK for a one-off (after all, we have to spell it
     correctly at least once), but many of these constant
     strings appear throughout the code.

This patch introduces a series of functions to "memoize"
these strings, which are essentially globals for the
lifetime of the program. We compute the value once, take
ownership of the buffer, and return the cached value for
subsequent calls.  cache.h provides a helper macro for
defining these functions as one-liners, and defines a few
common ones for global use.

Using a macro is a little bit gross, but it does nicely
document the purpose of the functions. If we need to touch
them all later (e.g., because we learned how to change the
git_dir variable at runtime, and need to invalidate all of
the stored values), it will be much easier to have the
complete list.

Note that the shared-global functions have separate, manual
declarations. We could do something clever with the macros
(e.g., expand it to a declaration in some places, and a
declaration _and_ a definition in path.c). But there aren't
that many, and it's probably better to stay away from
too-magical macros.

Likewise, if we abandon the C preprocessor in favor of
generating these with a script, we could get much fancier.
E.g., normalizing "FOO/BAR-BAZ" into "git_path_foo_bar_baz".
But the small amount of saved typing is probably not worth
the resulting confusion to readers who want to grep for the
function's definition.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:14 -07:00
Junio C Hamano d939af12bd Merge branch 'jk/date-mode-format'
Teach "git log" and friends a new "--date=format:..." option to
format timestamps using system's strftime(3).

* jk/date-mode-format:
  strbuf: make strbuf_addftime more robust
  introduce "format" date-mode
  convert "enum date_mode" into a struct
  show-branch: use DATE_RELATIVE instead of magic number
2015-08-03 11:01:27 -07:00
Junio C Hamano be9cb560e3 Merge branch 'mh/init-delete-refs-api'
Clean up refs API and make "git clone" less intimate with the
implementation detail.

* mh/init-delete-refs-api:
  delete_ref(): use the usual convention for old_sha1
  cmd_update_ref(): make logic more straightforward
  update_ref(): don't read old reference value before delete
  check_branch_commit(): make first parameter const
  refs.h: add some parameter names to function declarations
  refs: move the remaining ref module declarations to refs.h
  initial_ref_transaction_commit(): check for ref D/F conflicts
  initial_ref_transaction_commit(): check for duplicate refs
  refs: remove some functions from the module's public interface
  initial_ref_transaction_commit(): function for initial ref creation
  repack_without_refs(): make function private
  prune_refs(): use delete_refs()
  prune_remote(): use delete_refs()
  delete_refs(): bail early if the packed-refs file cannot be rewritten
  delete_refs(): make error message more generic
  delete_refs(): new function for the refs API
  delete_ref(): handle special case more explicitly
  remove_branches(): remove temporary
  delete_ref(): move declaration to refs.h
2015-08-03 11:01:17 -07:00
Jeff King aa1462cc3d introduce "format" date-mode
This feeds the format directly to strftime. Besides being a
little more flexible, the main advantage is that your system
strftime may know more about your locale's preferred format
(e.g., how to spell the days of the week).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-29 11:39:10 -07:00
Jeff King a5481a6c94 convert "enum date_mode" into a struct
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.

Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor.  However, the tricky case is where we use the
enum labels as constants, like:

  show_date(t, tz, DATE_NORMAL);

Ideally we could say:

  show_date(t, tz, &{ DATE_NORMAL });

but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:

  1. Manually add a "struct date_mode d = { DATE_NORMAL }"
     definition to each caller, and pass "&d". This makes
     the callers uglier, because they sometimes do not even
     have their own scope (e.g., they are inside a switch
     statement).

  2. Provide a pre-made global "date_normal" struct that can
     be passed by address. We'd also need "date_rfc2822",
     "date_iso8601", and so forth. But at least the ugliness
     is defined in one place.

  3. Provide a wrapper that generates the correct struct on
     the fly. The big downside is that we end up pointing to
     a single global, which makes our wrapper non-reentrant.
     But show_date is already not reentrant, so it does not
     matter.

This patch implements 3, along with a minor macro to keep
the size of the callers sane.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-29 11:39:07 -07:00
Junio C Hamano 20d16da5ca Merge branch 'qn/blame-show-email'
"git blame" learned blame.showEmail configuration variable.

* qn/blame-show-email:
  blame: add blame.showEmail configuration
2015-06-24 12:21:41 -07:00
Michael Haggerty fb58c8d507 refs: move the remaining ref module declarations to refs.h
Some functions from the refs module were still declared in cache.h.
Move them to refs.h.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:12 -07:00
Quentin Neill 8b504db309 blame: add blame.showEmail configuration
Complement existing --show-email option with fallback
configuration variable, with tests.

Signed-off-by: Quentin Neill <quentin.neill@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-01 15:50:43 -07:00
Junio C Hamano 4ba5bb5531 Merge branch 'rs/janitorial'
Code clean-up.

* rs/janitorial:
  dir: remove unused variable sb
  clean: remove unused variable buf
  use file_exists() to check if a file exists in the worktree
2015-06-01 12:45:15 -07:00
Junio C Hamano 6e0ac8e45f Merge branch 'ah/usage-strings'
A few usage string updates.

* ah/usage-strings:
  blame, log: format usage strings similarly to those in documentation
2015-06-01 12:45:10 -07:00
René Scharfe dbe44faadb use file_exists() to check if a file exists in the worktree
Call file_exists() instead of open-coding it.  That's shorter, simpler
and the intent becomes clearer.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 13:49:10 -07:00
Junio C Hamano 5fa9e4c4f1 Merge branch 'tb/blame-resurrect-convert-to-git'
Some time ago, "git blame" (incorrectly) lost the convert_to_git()
call when synthesizing a fake "tip" commit that represents the
state in the working tree, which broke folks who record the history
with LF line ending to make their project portabile across
platforms while terminating lines in their working tree files with
CRLF for their platform.

* tb/blame-resurrect-convert-to-git:
  blame: CRLF in the working tree and LF in the repo
2015-05-11 14:23:52 -07:00
Alex Henrie ce41720cad blame, log: format usage strings similarly to those in documentation
Earlier, 9c9b4f2f (standardize usage info string format, 2015-01-13)
tried to make usage-string in line with the documentation by

    - Placing angle brackets around fill-in-the-blank parameters
    - Putting dashes in multiword parameter names
    - Adding spaces to [-f|--foobar] to make [-f | --foobar]
    - Replacing <foobar>* with [<foobar>...]

but it missed a few places.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-03 16:55:26 -07:00
Torsten Bögershausen 4bf256d67a blame: CRLF in the working tree and LF in the repo
A typical setup under Windows is to set core.eol to CRLF, and text
files are marked as "text" in .gitattributes, or core.autocrlf is
set to true.

After 4d4813a5 "git blame" no longer works as expected for such a
set-up.  Every line is annotated as "Not Committed Yet", even though
the working directory is clean.  This is because the commit removed
the conversion in blame.c for all files, with or without CRLF in the
repo.

Having files with CRLF in the repo and core.autocrlf=input is a
temporary situation, and the files, if committed as is, will be
normalized in the repo, which _will_ be a notable change.  Blaming
them with "Not Committed Yet" is the right result.  Revert commit
4d4813a5 which was a misguided attempt to "solve" a non-problem.

Add two test cases in t8003 to verify the correct CRLF conversion.

Suggested-By: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-03 11:00:10 -07:00
Junio C Hamano 073bb8ebb8 Merge branch 'es/blame-commit-info-fix'
"git blame" died, trying to free an uninitialized piece of memory.

* es/blame-commit-info-fix:
  builtin/blame: destroy initialized commit_info only
2015-02-22 12:28:24 -08:00
Junio C Hamano bb831db677 Merge branch 'ah/usage-strings'
* ah/usage-strings:
  standardize usage info string format
2015-02-11 13:44:20 -08:00
Junio C Hamano 092c4be7f5 Merge branch 'jk/blame-commit-label'
"git blame HEAD -- missing" failed to correctly say "HEAD" when it
tried to say "No such path 'missing' in HEAD".

* jk/blame-commit-label:
  blame.c: fix garbled error message
  use xstrdup_or_null to replace ternary conditionals
  builtin/commit.c: use xstrdup_or_null instead of envdup
  builtin/apply.c: use xstrdup_or_null instead of null_strdup
  git-compat-util: add xstrdup_or_null helper
2015-02-11 13:39:50 -08:00
Eric Sunshine e60059276b builtin/blame: destroy initialized commit_info only
Since ea02ffa3 (mailmap: simplify map_user() interface, 2013-01-05),
find_alignment() has been invoking commit_info_destroy() on an
uninitialized auto 'struct commit_info' (when METAINFO_SHOWN is not
set). commit_info_destroy() calls strbuf_release() for each
'commit_info' strbuf member, which randomly invokes free() on
whatever random stack value happens to reside in strbuf.buf, thus
leading to periodic crashes.

Reported-by: Dilyan Palauzov <dilyan.palauzov@aegee.org>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-10 10:31:48 -08:00
Alex Henrie 9c9b4f2f8b standardize usage info string format
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:

- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-14 09:32:04 -08:00
Lukas Fleischer a46442f167 blame.c: fix garbled error message
The helper functions prepare_final() and prepare_initial() return a
pointer to a string that is a member of an object in the revs->pending
array. This array is later rebuilt when running prepare_revision_walk()
which potentially transforms the pointer target into a bogus string. Fix
this by maintaining a copy of the original string.

Signed-off-by: Lukas Fleischer <git@cryptocrack.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-13 10:05:53 -08:00
Ronnie Sahlberg 7695d118e5 refs.c: change resolve_ref_unsafe reading argument to be a flags field
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref
resolves successfully for writing but not for reading).  Change this to be
a flags field instead, and pass the new constant RESOLVE_REF_READING when
we want this behaviour.

While at it, swap two of the arguments in the function to put output
arguments at the end.  As a nice side effect, this ensures that we can
catch callers that were unaware of the new API so they can be audited.

Give the wrapper functions resolve_refdup and read_ref_full the same
treatment for consistency.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:24 -07:00
Junio C Hamano ceeacc501b Merge branch 'bb/date-iso-strict'
"log --date=iso" uses a slight variant of ISO 8601 format that is
made more human readable.  A new "--date=iso-strict" option gives
datetime output that is more strictly conformant.

* bb/date-iso-strict:
  pretty: provide a strict ISO 8601 date format
2014-09-19 11:38:32 -07:00
Junio C Hamano 929df991c2 Merge branch 'sb/blame-msg-i18n'
* sb/blame-msg-i18n:
  builtin/blame.c: add translation to warning about failed revision walk
2014-09-09 12:54:03 -07:00
Beat Bolli 466fb6742d pretty: provide a strict ISO 8601 date format
Git's "ISO" date format does not really conform to the ISO 8601
standard due to small differences, and it cannot be parsed by ISO
8601-only parsers, e.g. those of XML toolchains.

The output from "--date=iso" deviates from ISO 8601 in these ways:

  - a space instead of the `T` date/time delimiter
  - a space between time and time zone
  - no colon between hours and minutes of the time zone

Add a strict ISO 8601 date format for displaying committer and
author dates.  Use the '%aI' and '%cI' format specifiers and add
'--date=iso-strict' or '--date=iso8601-strict' date format names.

See http://thread.gmane.org/gmane.comp.version-control.git/255879 and
http://thread.gmane.org/gmane.comp.version-control.git/52414/focus=52585
for discussion.

Signed-off-by: Beat Bolli <bbolli@ewanet.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-29 12:37:02 -07:00
Stefan Beller 201087422d builtin/blame.c: add translation to warning about failed revision walk
Signed-off-by: Stefan Beller <stefanbeller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-12 11:01:44 -07:00
Jeff King fe24d396e1 move setting of object->type to alloc_* functions
The "struct object" type implements basic object
polymorphism.  Individual instances are allocated as
concrete types (or as a union type that can store any
object), and a "struct object *" can be cast into its real
type after examining its "type" enum.  This means it is
dangerous to have a type field that does not match the
allocation (e.g., setting the type field of a "struct blob"
to "OBJ_COMMIT" would mean that a reader might read past the
allocated memory).

In most of the current code this is not a problem; the first
thing we do after allocating an object is usually to set its
type field by passing it to create_object. However, the
virtual commits we create in merge-recursive.c do not ever
get their type set. This does not seem to have caused
problems in practice, though (presumably because we always
pass around a "struct commit" pointer and never even look at
the type).

We can fix this oversight and also make it harder for future
code to get it wrong by setting the type directly in the
object allocation functions.

This will also make it easier to fix problems with commit
index allocation, as we know that any object allocated by
alloc_commit_node will meet the invariant that an object
with an OBJ_COMMIT type field will have a unique index
number.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-28 10:14:33 -07:00
Junio C Hamano 12621cb222 Merge branch 'rs/code-cleaning'
* rs/code-cleaning:
  remote-testsvn: use internal argv_array of struct child_process in cmd_import()
  bundle: use internal argv_array of struct child_process in create_bundle()
  fast-import: use hashcmp() for SHA1 hash comparison
  transport: simplify fetch_objs_via_rsync() using argv_array
  run-command: use internal argv_array of struct child_process in run_hook_ve()
  use commit_list_count() to count the members of commit_lists
  strbuf: use strbuf_addstr() for adding C strings
2014-07-22 10:59:37 -07:00
Junio C Hamano 10b944b37b Merge branch 'jk/alloc-commit-id'
Make sure all in-core commit objects are assigned a unique number
so that they can be annotated using the commit-slab API.

* jk/alloc-commit-id:
  diff-tree: avoid lookup_unknown_object
  object_as_type: set commit index
  alloc: factor out commit index
  add object_as_type helper for casting objects
  parse_object_buffer: do not set object type
  move setting of object->type to alloc_* functions
  alloc: write out allocator definitions
  alloc.c: remove the alloc_raw_commit_node() function
2014-07-22 10:59:25 -07:00
Junio C Hamano 9ab0882255 Merge branch 'maint'
* maint:
  use xmemdupz() to allocate copies of strings given by start and length
  use xcalloc() to allocate zero-initialized memory
2014-07-21 12:35:39 -07:00
René Scharfe 5c0b13f85a use xmemdupz() to allocate copies of strings given by start and length
Use xmemdupz() to allocate the memory, copy the data and make sure to
NUL-terminate the result, all in one step.  The resulting code is
shorter, doesn't contain the constants 1 and '\0', and avoids
duplicating function parameters.

For blame, the last copied byte (o->file.ptr[o->file.size]) is always
set to NUL by fake_working_tree_commit() or read_sha1_file(), so no
information is lost by the conversion to using xmemdupz().

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-21 10:37:02 -07:00
René Scharfe 4bbaa1eb6f use commit_list_count() to count the members of commit_lists
Call commit_list_count() instead of open-coding it repeatedly.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-17 13:36:25 -07:00
Junio C Hamano 788cef81d4 Merge branch 'nd/split-index'
An experiment to use two files (the base file and incremental
changes relative to it) to represent the index to reduce I/O cost
of rewriting a large index when only small part of the working tree
changes.

* nd/split-index: (32 commits)
  t1700: new tests for split-index mode
  t2104: make sure split index mode is off for the version test
  read-cache: force split index mode with GIT_TEST_SPLIT_INDEX
  read-tree: note about dropping split-index mode or index version
  read-tree: force split-index mode off on --index-output
  rev-parse: add --shared-index-path to get shared index path
  update-index --split-index: do not split if $GIT_DIR is read only
  update-index: new options to enable/disable split index mode
  split-index: strip pathname of on-disk replaced entries
  split-index: do not invalidate cache-tree at read time
  split-index: the reading part
  split-index: the writing part
  read-cache: mark updated entries for split index
  read-cache: save deleted entries in split index
  read-cache: mark new entries for split index
  read-cache: split-index mode
  read-cache: save index SHA-1 after reading
  entry.c: update cache_changed if refresh_cache is set in checkout_entry()
  cache-tree: mark istate->cache_changed on prime_cache_tree()
  cache-tree: mark istate->cache_changed on cache tree update
  ...
2014-07-16 11:25:40 -07:00
Junio C Hamano 5c18fde0d9 Merge branch 'jk/commit-buffer-length' into maint
A handful of code paths had to read the commit object more than
once when showing header fields that are usually not parsed.  The
internal data structure to keep track of the contents of the commit
object has been updated to reduce the need for this double-reading,
and to allow the caller find the length of the object.

* jk/commit-buffer-length:
  reuse cached commit buffer when parsing signatures
  commit: record buffer length in cache
  commit: convert commit->buffer to a slab
  commit-slab: provide a static initializer
  use get_commit_buffer everywhere
  convert logmsg_reencode to get_commit_buffer
  use get_commit_buffer to avoid duplicate code
  use get_cached_commit_buffer where appropriate
  provide helpers to access the commit buffer
  provide a helper to set the commit buffer
  provide a helper to free commit buffer
  sequencer: use logmsg_reencode in get_message
  logmsg_reencode: return const buffer
  do not create "struct commit" with xcalloc
  commit: push commit_index update into alloc_commit_node
  alloc: include any-object allocations in alloc_report
  replace dangerous uses of strbuf_attach
  commit_tree: take a pointer/len pair rather than a const strbuf
2014-07-16 11:16:38 -07:00
Jeff King d36f51c13b move setting of object->type to alloc_* functions
The "struct object" type implements basic object
polymorphism.  Individual instances are allocated as
concrete types (or as a union type that can store any
object), and a "struct object *" can be cast into its real
type after examining its "type" enum.  This means it is
dangerous to have a type field that does not match the
allocation (e.g., setting the type field of a "struct blob"
to "OBJ_COMMIT" would mean that a reader might read past the
allocated memory).

In most of the current code this is not a problem; the first
thing we do after allocating an object is usually to set its
type field by passing it to create_object. However, the
virtual commits we create in merge-recursive.c do not ever
get their type set. This does not seem to have caused
problems in practice, though (presumably because we always
pass around a "struct commit" pointer and never even look at
the type).

We can fix this oversight and also make it harder for future
code to get it wrong by setting the type directly in the
object allocation functions.

This will also make it easier to fix problems with commit
index allocation, as we know that any object allocated by
alloc_commit_node will meet the invariant that an object
with an OBJ_COMMIT type field will have a unique index
number.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-13 18:59:05 -07:00
Junio C Hamano 8061ae8b46 Merge branch 'jk/commit-buffer-length'
Move "commit->buffer" out of the in-core commit object and keep
track of their lengths.  Use this to optimize the code paths to
validate GPG signatures in commit objects.

* jk/commit-buffer-length:
  reuse cached commit buffer when parsing signatures
  commit: record buffer length in cache
  commit: convert commit->buffer to a slab
  commit-slab: provide a static initializer
  use get_commit_buffer everywhere
  convert logmsg_reencode to get_commit_buffer
  use get_commit_buffer to avoid duplicate code
  use get_cached_commit_buffer where appropriate
  provide helpers to access the commit buffer
  provide a helper to set the commit buffer
  provide a helper to free commit buffer
  sequencer: use logmsg_reencode in get_message
  logmsg_reencode: return const buffer
  do not create "struct commit" with xcalloc
  commit: push commit_index update into alloc_commit_node
  alloc: include any-object allocations in alloc_report
  replace dangerous uses of strbuf_attach
  commit_tree: take a pointer/len pair rather than a const strbuf
2014-07-02 12:53:02 -07:00
Junio C Hamano 8d87e35bab Merge branch 'rs/blame-refactor'
* rs/blame-refactor:
  blame: simplify prepare_lines()
  blame: factor out get_next_line()
2014-06-25 12:23:36 -07:00
Junio C Hamano 4d27d8cbc4 Merge branch 'bc/blame-crlf-test' into maint
"git blame" assigned the blame to the copy in the working-tree if
the repository is set to core.autocrlf=input and the file used CRLF
line endings.

* bc/blame-crlf-test:
  blame: correctly handle files regardless of autocrlf
2014-06-25 11:46:45 -07:00
René Scharfe 60d85e110b blame: simplify prepare_lines()
Changing get_next_line() to return the end pointer instead of NULL in
case no newline character is found treats allows us to treat complete
and incomplete lines the same, simplifying the code.  Switching to
counting lines instead of EOLs allows us to start counting at the
first character, instead of having to call get_next_line() first.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 14:52:50 -07:00
René Scharfe 29aa0b2061 blame: factor out get_next_line()
Move the code for finding the start of the next line into a helper
function in order to reduce duplication.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 14:52:16 -07:00
Jeff King 8597ea3afe commit: record buffer length in cache
Most callsites which use the commit buffer try to use the
cached version attached to the commit, rather than
re-reading from disk. Unfortunately, that interface provides
only a pointer to the NUL-terminated buffer, with no
indication of the original length.

For the most part, this doesn't matter. People do not put
NULs in their commit messages, and the log code is happy to
treat it all as a NUL-terminated string. However, some code
paths do care. For example, when checking signatures, we
want to be very careful that we verify all the bytes to
avoid malicious trickery.

This patch just adds an optional "size" out-pointer to
get_commit_buffer and friends. The existing callers all pass
NULL (there did not seem to be any obvious sites where we
could avoid an immediate strlen() call, though perhaps with
some further refactoring we could).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 12:09:38 -07:00
Jeff King b66103c3ba convert logmsg_reencode to get_commit_buffer
Like the callsites in the previous commit, logmsg_reencode
already falls back to read_sha1_file when necessary.
However, I split its conversion out into its own commit
because it's a bit more complex.

We return either:

  1. The original commit->buffer

  2. A newly allocated buffer from read_sha1_file

  3. A reencoded buffer (based on either 1 or 2 above).

while trying to do as few extra reads/allocations as
possible. Callers currently free the result with
logmsg_free, but we can simplify this by pointing them
straight to unuse_commit_buffer. This is a slight layering
violation, in that we may be passing a buffer from (3).
However, since the end result is to free() anything except
(1), which is unlikely to change, and because this makes the
interface much simpler, it's a reasonable bending of the
rules.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 12:08:17 -07:00
Jeff King 66c2827ea4 provide a helper to set the commit buffer
Right now this is just a one-liner, but abstracting it will
make it easier to change later.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 12:08:17 -07:00
Nguyễn Thái Ngọc Duy a5400efe29 cache-tree: mark istate->cache_changed on cache tree invalidation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Jeff King b000c59b0c logmsg_reencode: return const buffer
The return value from logmsg_reencode may be either a newly
allocated buffer or a pointer to the existing commit->buffer.
We would not want the caller to accidentally free() or
modify the latter, so let's mark it as const.  We can cast
away the constness in logmsg_free, but only once we have
determined that it is a free-able buffer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-12 10:29:43 -07:00
Jeff King 10322a0aaf do not create "struct commit" with xcalloc
In both blame and merge-recursive, we sometimes create a
"fake" commit struct for convenience (e.g., to represent the
HEAD state as if we would commit it). By allocating
ourselves rather than using alloc_commit_node, we do not
properly set the "index" field of the commit. This can
produce subtle bugs if we then use commit-slab on the
resulting commit, as we will share the "0" index with
another commit.

We can fix this by using alloc_commit_node() to allocate.
Note that we cannot free the result, as it is part of our
commit allocator. However, both cases were already leaking
the allocated commit anyway, so there's nothing to fix up.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-12 10:29:42 -07:00
Junio C Hamano e934c67b66 Merge branch 'bc/blame-crlf-test'
If a file contained CRLF line endings in a repository with
core.autocrlf=input, then blame always marked lines as "Not
Committed Yet", even if they were unmodified.

* bc/blame-crlf-test:
  blame: correctly handle files regardless of autocrlf
2014-06-06 11:26:50 -07:00