Commit graph

63 commits

Author SHA1 Message Date
Đoàn Trần Công Danh f1aa299443 mailinfo: allow squelching quoted CRLF warning
In previous change, Git starts to warn for quoted CRLF in decoded
base64/QP email. Despite those warnings are usually helpful,
quoted CRLF could be part of some users' workflow.

Let's give them an option to turn off the warning completely.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
Đoàn Trần Công Danh dd9323b7fb mailinfo: stop parsing options manually
In a later change, mailinfo will learn more options, let's switch to our
robust parse_options framework before that step.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:40:26 +09:00
Đoàn Trần Công Danh d582992e80 mailinfo: load default metainfo_charset lazily
In a later change, we will use parse_option to parse mailinfo's options.
In mailinfo, both "-u", "-n", and "--encoding" try to set the same
field, with "-u" reset that field to some default value from
configuration variable "i18n.commitEncoding".

Let's delay the setting of that field until we finish processing all
options. By doing that, "i18n.commitEncoding" can be parsed on demand.
More importantly, it cleans the way for using parse_option.

This change introduces some inconsistent brackets "{}" in "if/else if"
construct, however, we will rewrite them in the next few changes.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:40:25 +09:00
Jeff King e4da43b1f0 prefix_filename: return newly allocated string
The prefix_filename() function returns a pointer to static
storage, which makes it easy to use dangerously. We already
fixed one buggy caller in hash-object recently, and the
calls in apply.c are suspicious (I didn't dig in enough to
confirm that there is a bug, but we call the function once
in apply_all_patches() and then again indirectly from
parse_chunk()).

Let's make it harder to get wrong by allocating the return
value. For simplicity, we'll do this even when the prefix is
empty (and we could just return the original file pointer).
That will cause us to allocate sometimes when we wouldn't
otherwise need to, but this function isn't called in
performance critical code-paths (and it already _might_
allocate on any given call, so a caller that cares about
performance is questionable anyway).

The downside is that the callers need to remember to free()
the result to avoid leaking. Most of them already used
xstrdup() on the result, so we know they are OK. The
remainder have been converted to use free() as appropriate.

I considered retaining a prefix_filename_unsafe() for cases
where we know the static lifetime is OK (and handling the
cleanup is awkward). This is only a handful of cases,
though, and it's not worth the mental energy in worrying
about whether the "unsafe" variant is OK to use in any
situation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-21 11:18:41 -07:00
Jeff King 116fb64e43 prefix_filename: drop length parameter
This function takes the prefix as a ptr/len pair, but in
every caller the length is exactly strlen(ptr). Let's
simplify the interface and just take the string. This saves
callers specifying it (and in some cases handling a NULL
prefix).

In a handful of cases we had the length already without
calling strlen, so this is technically slower. But it's not
likely to matter (after all, if the prefix is non-empty
we'll allocate and copy it into a buffer anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-21 11:12:53 -07:00
Junio C Hamano 3f0ec0687d mailinfo: read local configuration
Since b9605bc4f2 ("config: only read .git/config from configured
repos", 2016-09-12), we do not read from ".git/config" unless we
know we are in a repository.  "git mailinfo" however didn't do the
repository discovery and instead relied on the old behaviour.  This
was mostly OK because it was merely run as a helper program by other
porcelain scripts that first chdir's up to the root of the working
tree.

Teach the command to run a "gentle" version of repository discovery
so that local configuration variables like mailinfo.scissors are
honoured.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-22 13:13:16 -08:00
Junio C Hamano c6905e45f0 mailinfo: libify
Move the bulk of the code from builtin/mailinfo.c to mailinfo.c
so that new callers can start calling mailinfo() directly.

Note that a few calls to exit() and die() need to be cleaned up
for the API to be truly useful, which will come in later steps.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:59:34 -07:00
Junio C Hamano 05e625e5bf mailinfo: keep the parsed log message in a strbuf
When mailinfo() is eventually libified, the calling "git am" still
will have to write out the log message in the "msg" file for hooks
and other users of the information, but it does not have to reopen
and reread what it wrote earlier if the function kept it in a strbuf.

This also removes the need for seeking and truncating the output
file when we see a scissors mark in the input, which in turn allows
us to lose two callsites of die_errno().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:57:17 -07:00
Junio C Hamano 4933910ab7 mailinfo: handle_commit_msg() shouldn't be called after finding patchbreak
There is a strange "if (!mi->cmitmsg) return 0" at the very beginning
of handle_commit_msg(), but the condition should never trigger, because:

 * The only place cmitmsg is set to NULL is after this function sees
   a patch break, closes the FILE * to write the commit log message
   and returns 1.  This function returns non-zero only from that
   codepath.

 * The caller of this function, upon seeing a non-zero return,
   increments filter_stage, starts treating the input as patch text
   and will never call handle_commit_msg() again.

Replace it with an assert(!mi->filter_stage) to ensure the above
observation will stay to be true.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:57:17 -07:00
Junio C Hamano 8e919277e0 mailinfo: move content/content_top to struct mailinfo
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:57:17 -07:00
Junio C Hamano d895bf0f57 mailinfo: move [ps]_hdr_data to struct mailinfo
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:56:17 -07:00
Junio C Hamano 8f63588a6e mailinfo: move cmitmsg and patchfile to struct mailinfo
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:55:01 -07:00
Junio C Hamano f1e037b9af mailinfo: move charset to struct mailinfo
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:55:01 -07:00
Junio C Hamano ab50e38b5d mailinfo: move transfer_encoding to struct mailinfo
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:53:25 -07:00
Junio C Hamano 28c6bfe94c mailinfo: move check for metainfo_charset to convert_to_utf8()
All callers of this function refrain from calling it when
mi->metainfo_charset is NULL; move the check to the callee,
as it already has a few conditions at its beginning to turn
it into a no-op.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:50:17 -07:00
Junio C Hamano 28be2d083c mailinfo: move metainfo_charset to struct mailinfo
This requires us to pass the struct down to decode_header() and
convert_to_utf8() callchain.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:50:17 -07:00
Junio C Hamano ad57ef9da9 mailinfo: move use_scissors and use_inbody_headers to struct mailinfo
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:42:57 -07:00
Junio C Hamano 6200b751bb mailinfo: move add_message_id and message_id to struct mailinfo
This requires us to pass the structure into check_header() codepath.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:42:57 -07:00
Junio C Hamano 43550efa71 mailinfo: move patch_lines to struct mailinfo
This one is trivial thanks to previous steps that started passing
the structure throughout the input codepaths.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:39:01 -07:00
Junio C Hamano 13c6df2642 mailinfo: move filter/header stage to struct mailinfo
Earlier we got rid of two function-scope static variables that kept
track of the states of helper functions by making them extra arguments
that are passed throughout the callchain.  Now we have a convenient
place to store and pass them around in the form of "struct mailinfo",
change them into two fields in the struct.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:39:01 -07:00
Junio C Hamano 173aef7c2e mailinfo: move global "FILE *fin, *fout" to struct mailinfo
This requires us to pass "struct mailinfo" to more functions
throughout the codepath that read input lines.  Incidentally,
later steps are helped by this patch passing the struct to
more callchains.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:39:01 -07:00
Junio C Hamano 849106d511 mailinfo: move keep_subject & keep_non_patch_bracket to struct mailinfo
These two are the only easy ones that do not require passing the
structure around to deep corners of the callchain.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:37:53 -07:00
Junio C Hamano c69f2395ba mailinfo: introduce "struct mailinfo" to hold globals
In this first step, move only 'email' and 'name' fields in there and
remove the corresponding globals.  In subsequent patches, more
globals will be moved to this and the structure will be passed
around as a new parameter to more functions.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:37:52 -07:00
Junio C Hamano 6e21b5089f mailinfo: move global "line" into mailinfo() function
With the previous steps, it becomes clear that the mailinfo()
function is the only one that wants the "line" to be directly
touchable.  Move it to the function scope of this function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:37:52 -07:00
Junio C Hamano fbbcafd060 mailinfo: do not let find_boundary() touch global "line" directly
With the previous two commits, we established that the local
variable "line" in handle_body() and handle_boundary() functions
always refer to the global "line" that is used as the common and
shared "current line from the input".  They are the only callers of
the last function that refers to the global line directly, i.e.
find_boundary().  Pass "line" as a parameter to this leaf function
to complete the clean-up.  Now the only function that directly refers
to the global "line" is the caller of handle_body() at the very
beginning of this whole callchain.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:37:50 -07:00
Junio C Hamano 69e24defd6 mailinfo: do not let handle_boundary() touch global "line" directly
This function has a single caller, and called with the global "line"
holding the multi-part boundary line the caller saw while processing
the e-mail body.  The function then goes into a loop to process each
line of the input, and fills the same global "line" variable from
the input as it needs to read more lines to process the multi-part
headers.

Let the caller explicitly pass a pointer to this global "line"
variable as an argument, and have the function itself use that
strbuf throughout, instead of referring to the global "line" itself.

There still is a helper function that this function calls that still
touches the global directly; it will be updated as the series progresses.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:36:37 -07:00
Junio C Hamano fde00d50f6 mailinfo: do not let handle_body() touch global "line" directly
This function has a single caller, and called with the global "line"
holding the first line of the e-mail body after the caller finished
processing the e-mail headers.  The function then goes into a loop
to process each line of the input, starting from what was given by
its caller, and fills the same global "line" variable from the input
as it needs to process more lines.

Let the caller explicitly pass a pointer to this global "line"
variable as an argument, and have the function itself use that
strbuf throughout, instead of referring to the global "line" itself.

There are helper functions that this function calls that still touch
the global directly; they will be updated as the series progresses.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:36:37 -07:00
Junio C Hamano 269e239c48 mailinfo: get rid of function-local static states
Two helper functions use "static int" in their scope to keep track
of the state while repeatedly getting called once for each input
line.  Move these state variables to their ultimate caller and pass
down pointers to them along the callchain, as a small step in
preparation for making this entire callchain more reentrant.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:36:37 -07:00
Junio C Hamano c1b40bd7b6 mailinfo: move definition of MAX_HDR_PARSED closer to its use
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:34:49 -07:00
Junio C Hamano 30f50c3426 mailinfo: move cleanup_space() before its users
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:33:39 -07:00
Junio C Hamano 4f0f9d46c7 mailinfo: move check_header() after the helpers it uses
This way, we can lose a forward decl for decode_header().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:32:43 -07:00
Junio C Hamano 9cc243f7a9 mailinfo: move read_one_header_line() closer to its callers
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:30:15 -07:00
Junio C Hamano 39afcd3819 mailinfo: move handle_boundary() lower
This function wants to call find_boundary() and is called only from
one place without any recursing, so it becomes easier to read if it
appears after the called function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:20:49 -07:00
Junio C Hamano 12d19e80b0 mailinfo: plug strbuf leak during continuation line handling
Whether this loop is left via EOF/break or upon finding a
non-continuation line, the storage used for the contination line
handling is left behind.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-21 15:18:50 -07:00
Junio C Hamano e38ee06e99 mailinfo: explicitly close file handle to the patch output
This does not make a difference within the context of "git mailinfo"
that runs once and exits, as flushing and closing would happen upon
process termination.  It however will matter when we eventually make
it callable as an API function.

Besides, cleaning after yourself once you are done is a good hygiene.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-18 22:13:27 -07:00
Junio C Hamano b6af8ed13a mailinfo: fix an off-by-one error in the boundary stack
We pre-increment the pointer that we will use to store something at,
so the pointer is already beyond the end of the array if it points
at content[MAX_BOUNDARIES].

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-18 22:13:27 -07:00
Junio C Hamano 3a8fcdaf84 mailinfo: fold decode_header_bq() into decode_header()
In olden days we might have wanted to behave differently in
decode_header() if the header line was encoded with RFC2047, but we
apparently do not do so, hence this helper function can go, together
with its return value.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-18 22:13:27 -07:00
Junio C Hamano 2a5ce7cf0d mailinfo: remove a no-op call convert_to_utf8(it, "")
The called function checks if the second parameter is either a NULL
or an empty string at the very beginning and returns without doing
anything.  Remove the useless call.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-18 22:13:27 -07:00
Alex Henrie 9c9b4f2f8b standardize usage info string format
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:

- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-14 09:32:04 -08:00
Paolo Bonzini 452dfbed1a git-mailinfo: add --message-id
This option adds the content of the Message-Id header at the end of the
commit message prepared by git-mailinfo.  This is useful in order to
associate commit messages automatically with mailing list discussions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-25 15:24:55 -08:00
Eric Sunshine 85de86a16b mailinfo: work around -Wstring-plus-int warning
The just-released Apple Xcode 6.0.1 has -Wstring-plus-int enabled by
default which complains about pointer arithmetic applied to a string
literal:

    builtin/mailinfo.c:303:24: warning:
        adding 'long' to a string does not append to the string
            return !memcmp(SAMPLE + (cp - line), cp, strlen(SAMPLE) ...
                           ~~~~~~~^~~~~~~~~~~~~

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-22 13:46:43 -07:00
Jeff King 2da1f36671 mailinfo: make ">From" in-body header check more robust
Since commit 81c5cf7 (mailinfo: skip bogus UNIX From line inside
body, 2006-05-21), we have treated lines like ">From" in the body as
headers. This makes "git am" work for people who erroneously paste
the whole output from format-patch:

  From 12345abcd...fedcba543210 Mon Sep 17 00:00:00 2001
  From: them
  Subject: [PATCH] whatever

into their email body (assuming that an mbox writer then quotes
"From" as ">From", as otherwise we would actually mailsplit on the
in-body line).

However, this has false positives if somebody actually has a commit
body that starts with "From "; in this case we erroneously remove
the line entirely from the commit message. We can make this check
more robust by making sure the line actually looks like a real mbox
"From" line.

Inspect the line that begins with ">From " a more carefully to only
skip lines that match the expected pattern (note that the datestamp
part of the format-patch output is designed to be kept constant to
help those who write magic(5) entries).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-16 11:05:46 -07:00
Junio C Hamano d37e8c54a6 Merge branch 'rs/mailinfo-header-cmp'
Avoid running over the end of header string while parsing an
incoming e-mail message to extract the patch.

* rs/mailinfo-header-cmp:
  mailinfo: use strcmp() for string comparison
2014-06-09 11:27:53 -07:00
René Scharfe b1a013dd6a mailinfo: use strcmp() for string comparison
The array header is defined as:

	static const char *header[MAX_HDR_PARSED] = {
	     "From","Subject","Date",
	};

When looking for the index of a specfic string in that array, simply
use strcmp() instead of memcmp().  This avoids running over the end of
the string (e.g. with memcmp("Subject", "From", 7)) and gets rid of
magic string length constants.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-02 13:30:18 -07:00
Christian Couder 5955654823 replace {pre,suf}fixcmp() with {starts,ends}_with()
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:13:21 -08:00
Junio C Hamano 6b8731258d Merge branch 'jc/same-encoding'
Various codepaths checked if two encoding names are the same using
ad-hoc code and some of them ended up asking iconv() to convert
between "utf8" and "UTF-8".  The former is not a valid way to spell
the encoding name, but often people use it by mistake, and we
equated them in some but not all codepaths. Introduce a new helper
function to make these codepaths consistent.

* jc/same-encoding:
  reencode_string(): introduce and use same_encoding()

Conflicts:
	builtin/mailinfo.c
2012-11-15 10:24:05 -08:00
Junio C Hamano 0e18bcd5e9 reencode_string(): introduce and use same_encoding()
Callers of reencode_string() that re-encodes a string from one
encoding to another all used ad-hoc way to bypass the case where the
input and the output encodings are the same.  Some did strcmp(),
some did strcasecmp(), yet some others when converting to UTF-8 used
is_encoding_utf8().

Introduce same_encoding() helper function to make these callers use
the same logic.  Notably, is_encoding_utf8() has a work-around for
common misconfiguration to use "utf8" to name UTF-8 encoding, which
does not match "UTF-8" hence strcasecmp() would not consider the
same.  Make use of it in this helper function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-04 08:10:33 -05:00
Junio C Hamano 6e2035715e Merge branch 'lt/mailinfo-handle-attachment-more-sanely' into maint
A patch attached as application/octet-stream (e.g. not text/*) were
mishandled, not correctly honoring Content-Transfer-Encoding
(e.g. base64).

* lt/mailinfo-handle-attachment-more-sanely:
  mailinfo: don't require "text" mime type for attachments
2012-10-08 11:33:00 -07:00
Junio C Hamano 5ce993a812 Merge branch 'lt/mailinfo-handle-attachment-more-sanely'
A patch attached as application/octet-stream (e.g. not text/*) were
mishandled, not correctly honoring Content-Transfer-Encoding
(e.g. base64).

* lt/mailinfo-handle-attachment-more-sanely:
  mailinfo: don't require "text" mime type for attachments
2012-10-02 21:13:35 -07:00
Linus Torvalds 9d55b2e12f mailinfo: don't require "text" mime type for attachments
Currently "git am" does insane things if the mbox it is given contains
attachments with a MIME type that aren't "text/*".

In particular, it will still decode them, and pass them "one line at a
time" to the mail body filter, but because it has determined that they
aren't text (without actually looking at the contents, just at the mime
type) the "line" will be the encoding line (eg 'base64') rather than a
line of *content*.

Which then will cause the text filtering to fail, because we won't
correctly notice when the attachment text switches from the commit message
to the actual patch. Resulting in a patch failure, even if patch may be a
perfectly well-formed attachment, it's just that the message type may be
(for example) "application/octet-stream" instead of "text/plain".

Just remove all the bogus games with the message_type. The only difference
that code creates is how the data is passed to the filter function
(chunked per-pred-code line or per post-decode line), and that difference
is *wrong*, since chunking things per pre-decode line can never be a
sensible operation, and cannot possibly matter for binary data anyway.

This code goes all the way back to March of 2007, in commit 87ab799234
("builtin-mailinfo.c infrastrcture changes"), and apparently Don used to
pass random mbox contents to git. However, the pre-decode vs post-decode
logic really shouldn't matter even for that case, and more importantly, "I
fed git am crap" is not a valid reason to break *real* patch attachments.

If somebody really cares, and determines that some attachment is binary
data (by looking at the data, not the MIME-type), the whole attachment
should be dismissed, rather than fed in random-sized chunks to
"handle_filter()".

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Don Zickus <dzickus@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-30 17:29:27 -07:00