Commit graph

68 commits

Author SHA1 Message Date
Junio C Hamano 6b8731258d Merge branch 'jc/same-encoding'
Various codepaths checked if two encoding names are the same using
ad-hoc code and some of them ended up asking iconv() to convert
between "utf8" and "UTF-8".  The former is not a valid way to spell
the encoding name, but often people use it by mistake, and we
equated them in some but not all codepaths. Introduce a new helper
function to make these codepaths consistent.

* jc/same-encoding:
  reencode_string(): introduce and use same_encoding()

Conflicts:
	builtin/mailinfo.c
2012-11-15 10:24:05 -08:00
Junio C Hamano 0e18bcd5e9 reencode_string(): introduce and use same_encoding()
Callers of reencode_string() that re-encodes a string from one
encoding to another all used ad-hoc way to bypass the case where the
input and the output encodings are the same.  Some did strcmp(),
some did strcasecmp(), yet some others when converting to UTF-8 used
is_encoding_utf8().

Introduce same_encoding() helper function to make these callers use
the same logic.  Notably, is_encoding_utf8() has a work-around for
common misconfiguration to use "utf8" to name UTF-8 encoding, which
does not match "UTF-8" hence strcasecmp() would not consider the
same.  Make use of it in this helper function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-04 08:10:33 -05:00
Junio C Hamano 6e2035715e Merge branch 'lt/mailinfo-handle-attachment-more-sanely' into maint
A patch attached as application/octet-stream (e.g. not text/*) were
mishandled, not correctly honoring Content-Transfer-Encoding
(e.g. base64).

* lt/mailinfo-handle-attachment-more-sanely:
  mailinfo: don't require "text" mime type for attachments
2012-10-08 11:33:00 -07:00
Junio C Hamano 5ce993a812 Merge branch 'lt/mailinfo-handle-attachment-more-sanely'
A patch attached as application/octet-stream (e.g. not text/*) were
mishandled, not correctly honoring Content-Transfer-Encoding
(e.g. base64).

* lt/mailinfo-handle-attachment-more-sanely:
  mailinfo: don't require "text" mime type for attachments
2012-10-02 21:13:35 -07:00
Linus Torvalds 9d55b2e12f mailinfo: don't require "text" mime type for attachments
Currently "git am" does insane things if the mbox it is given contains
attachments with a MIME type that aren't "text/*".

In particular, it will still decode them, and pass them "one line at a
time" to the mail body filter, but because it has determined that they
aren't text (without actually looking at the contents, just at the mime
type) the "line" will be the encoding line (eg 'base64') rather than a
line of *content*.

Which then will cause the text filtering to fail, because we won't
correctly notice when the attachment text switches from the commit message
to the actual patch. Resulting in a patch failure, even if patch may be a
perfectly well-formed attachment, it's just that the message type may be
(for example) "application/octet-stream" instead of "text/plain".

Just remove all the bogus games with the message_type. The only difference
that code creates is how the data is passed to the filter function
(chunked per-pred-code line or per post-decode line), and that difference
is *wrong*, since chunking things per pre-decode line can never be a
sensible operation, and cannot possibly matter for binary data anyway.

This code goes all the way back to March of 2007, in commit 87ab799234
("builtin-mailinfo.c infrastrcture changes"), and apparently Don used to
pass random mbox contents to git. However, the pre-decode vs post-decode
logic really shouldn't matter even for that case, and more importantly, "I
fed git am crap" is not a valid reason to break *real* patch attachments.

If somebody really cares, and determines that some attachment is binary
data (by looking at the data, not the MIME-type), the whole attachment
should be dismissed, rather than fed in random-sized chunks to
"handle_filter()".

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Don Zickus <dzickus@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-30 17:29:27 -07:00
Junio C Hamano 52938b113b Merge branch 'jc/maint-mailinfo-mime-attr' into maint
* jc/maint-mailinfo-mime-attr:
  mailinfo: do not concatenate charset= attribute values from mime headers
2012-09-29 22:30:48 -07:00
Junio C Hamano b1bb02dede Merge branch 'jc/maint-mailinfo-mime-attr'
When "git am" is fed an input that has multiple "Content-type: ..."
header, it did not grok charset= attribute correctly.

* jc/maint-mailinfo-mime-attr:
  mailinfo: do not concatenate charset= attribute values from mime headers
2012-09-25 10:39:56 -07:00
Junio C Hamano 176943b965 mailinfo: do not concatenate charset= attribute values from mime headers
"Content-type: text/plain; charset=UTF-8" header should not appear
twice in the input, but it is always better to gracefully deal with
such a case.  The current code concatenates the value to the values
we have seen previously, producing nonsense such as "utf8UTF-8".

Instead of concatenating, forget the previous value and use the last
value we see.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-17 15:24:52 -07:00
Junio C Hamano cd14f3e17c Merge branch 'jc/mailinfo-RE'
We strip the prefix from "Re: subject" and also from a less common
"re: subject", but left even less common "RE: subject" intact.

* jc/mailinfo-RE:
  mailinfo: strip "RE: " prefix
2012-09-14 21:39:27 -07:00
Junio C Hamano d5b4d80d1c mailinfo: strip "RE: " prefix
We already strip the more common Re: and re:, and we do not often
see RE: from saner MUA, but this prefix does exist and gets used
from time to time.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-09 02:59:58 -07:00
Linus Torvalds 08a94a145c commit/commit-tree: correct latin1 to utf-8
When a line in the message is not a valid utf-8, "git mailinfo"
attempts to convert it to utf-8 assuming the input is latin1 (and
punt if it does not convert cleanly).  Using the same heuristics in
"git commit" and "git commit-tree" lets the editor output be in
latin1 to make the overall system more consistent.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-21 16:10:53 -07:00
Thomas Rast ee2d1cb402 mailinfo: with -b, keep space after [foo]
The logic for the -b mode, where [PATCH] is dropped but [foo] is not,
silently ate all spaces after the ].

Fix this by keeping the next isspace() character, if there is any.
Being more thorough is pointless, as the later cleanup_space() call
will normalize any sequence of whitespace to a single ' '.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-16 16:06:57 -08:00
Jeff King 5b38456ec7 mailinfo: always clean up rfc822 header folding
Without the "-k" option, mailinfo will convert a folded
subject header like:

  Subject: this is a
    subject that doesn't
    fit on one line

into a single line. With "-k", however, we assumed that
these newlines were significant and represented something
that the sending side would want us to preserve.

For messages created by format-patch, this assumption was
broken by a1f6baa (format-patch: wrap long header lines,
2011-02-23).  For messages sent by arbitrary MUAs, this was
probably never a good assumption to make, as they may have
been folding subjects in accordance with rfc822's line
length recommendations all along.

This patch now joins folded lines with a single whitespace
character. This treats header folding purely as a syntactic
feature of the transport mechanism, not as something that
format-patch is trying to tell us about the original
subject.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-26 14:13:38 -07:00
Pat Notz a6fa59924d commit: helper methods to reduce redundant blocks of code
* builtin/commit.c: Replace block of code with a one-liner call to
  logmsg_reencode().

* commit.c: new function for looking up a comit by name

* pretty.c: helper methods for getting output encodings

  Add helpers get_log_output_encoding() and
  get_commit_output_encoding() that eliminate some messy and duplicate
  if-blocks.

Signed-off-by: Pat Notz <patnotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-04 13:53:34 -07:00
Gary V. Vaughan 4b05548fc0 enums: omit trailing comma for portability
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX
5.1 fails to compile git.

enum style is inconsistent already, with some enums declared on one
line, some over 3 lines with the enum values all on the middle line,
sometimes with 1 enum value per line... and independently of that the
trailing comma is sometimes present and other times absent, often
mixing with/without trailing comma styles in a single file, and
sometimes in consecutive enum declarations.

Clearly, omitting the comma is the more portable style, and this patch
changes all enum declarations to use the portable omitted dangling
comma style consistently.

Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 16:59:27 -07:00
Jonathan Nieder 9974e290e7 Teach mailinfo %< as an alternative scissors mark
Handle perforations found “in the wild” more robustly by recognizing
“%<” as an alternative scissors mark.

This feature is only meant to support old habits.  Discourage new use
of the percent-based version by only documenting the 8< symbol so new
users’ perforations can still be recognized by old versions of Git.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-04 10:17:55 -07:00
Junio C Hamano 2e0e8b68e3 Merge branch 'lt/deepen-builtin-source'
* lt/deepen-builtin-source:
  Move 'builtin-*' into a 'builtin/' subdirectory

Conflicts:
	Makefile
2010-03-10 15:25:18 -08:00
Linus Torvalds 81b50f3ce4 Move 'builtin-*' into a 'builtin/' subdirectory
This shrinks the top-level directory a bit, and makes it much more
pleasant to use auto-completion on the thing. Instead of

	[torvalds@nehalem git]$ em buil<tab>
	Display all 180 possibilities? (y or n)
	[torvalds@nehalem git]$ em builtin-sh
	builtin-shortlog.c     builtin-show-branch.c  builtin-show-ref.c
	builtin-shortlog.o     builtin-show-branch.o  builtin-show-ref.o
	[torvalds@nehalem git]$ em builtin-shor<tab>
	builtin-shortlog.c  builtin-shortlog.o
	[torvalds@nehalem git]$ em builtin-shortlog.c

you get

	[torvalds@nehalem git]$ em buil<tab>		[type]
	builtin/   builtin.h
	[torvalds@nehalem git]$ em builtin		[auto-completes to]
	[torvalds@nehalem git]$ em builtin/sh<tab>	[type]
	shortlog.c     shortlog.o     show-branch.c  show-branch.o  show-ref.c     show-ref.o
	[torvalds@nehalem git]$ em builtin/sho		[auto-completes to]
	[torvalds@nehalem git]$ em builtin/shor<tab>	[type]
	shortlog.c  shortlog.o
	[torvalds@nehalem git]$ em builtin/shortlog.c

which doesn't seem all that different, but not having that annoying
break in "Display all 180 possibilities?" is quite a relief.

NOTE! If you do this in a clean tree (no object files etc), or using an
editor that has auto-completion rules that ignores '*.o' files, you
won't see that annoying 'Display all 180 possibilities?' message - it
will just show the choices instead.  I think bash has some cut-off
around 100 choices or something.

So the reason I see this is that I'm using an odd editory, and thus
don't have the rules to cut down on auto-completion.  But you can
simulate that by using 'ls' instead, or something similar.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-22 14:29:41 -08:00
Renamed from builtin-mailinfo.c (Browse further)