Commit graph

21 commits

Author SHA1 Message Date
Johannes Schindelin c839fcff65 import-tars: ignore the global PAX header
The tar importer in `contrib/fast-import/import-tars.perl` has a very
convenient feature: if _all_ paths stored in the imported `.tar` start
with a common prefix, e.g. `git-2.26.0/` in the tar at
https://github.com/git/git/archive/v2.26.0.tar.gz, then this prefix is
stripped.

This feature makes a ton of sense because it is relatively common to
import two or more revisions of the same project into Git, and obviously
we don't want all files to live in a tree whose name changes from
revision to revision.

Now, the problem with that feature is that it breaks down if there is a
`pax_global_header` "file" located outside of said prefix, at the top of
the tree. This is the case for `.tar` files generated by Git's very own
`git archive` command: it inserts that header, and `git archive` allows
specifying a common prefix (that the header does _not_ share with the
other files contained in the archive) via `--prefix=my-project-1.0.0/`.

Let's just skip any global header when importing `.tar` files into Git.

Note: this global header might contain useful information. For example,
in the output of `git archive`, it lists the original commit, which _is_
useful information. A future improvement to the `import-tars.perl`
script might be to include that information in the commit message, or do
other things with the information (e.g. use `mtime` information
contained in the global header as date of the commit). This patch does
not prevent any future patch from making that happen, it only prevents
the header from being treated as if it was a regular file.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-24 14:39:47 -07:00
Pedro Alvarez Piedehierro 12ecea46e3 import-tars: read overlong names from pax extended header
Importing gcc tarballs[1] with import-tars script (in contrib) fails
when hitting a pax extended header.

Make sure we always read the extended attributes from the pax entries,
and store the 'path' value if found to be used in the next ustar entry.

The code to parse pax extended headers was written consulting the Pax
Pax Interchange Format documentation [2].

[1] http://ftp.gnu.org/gnu/gcc/gcc-7.3.0/gcc-7.3.0.tar.xz
[2] https://www.freebsd.org/cgi/man.cgi?manpath=FreeBSD+8-current&query=tar&sektion=5

Signed-off-by: Pedro Alvarez <palvarez89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-24 08:35:51 +09:00
Johannes Schindelin 04e0869876 import-tars: support hard links
Previously, we simply treated hard links as if they were plain files
with size 0, ignoring the link type "1" and hence the link target.

What we should do instead, of course, is to use the link target to get
at the import mark for the contents, even if we cannot recreate the hard
link per se, as Git has no concept of hard links.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-03 09:46:11 -07:00
Ingmar Vanhassel 2a94552887 import-tars: Add support for tarballs compressed with lzma, xz
Also handle the extensions .tlz and .txz, aliases for .tar.lzma and
.tar.xz respectively.

Signed-off-by: Ingmar Vanhassel <ingmar@exherbo.org>
Liked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 17:15:43 -07:00
Ingmar Vanhassel b6aaaa4470 import-tars: Add missing closing bracket
This fixes an obvious syntax error that snuck in commit 7e787953:

  syntax error at /home/ingmar/bin//git-import-tars line 143, near "/^$/ { "
  syntax error at /home/ingmar/bin//git-import-tars line 145, near "} else"
  syntax error at /home/ingmar/bin//git-import-tars line 152, near "}"

Signed-off-by: Ingmar Vanhassel <ingmar@exherbo.org>
Acked-and-Tested-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-09 14:58:13 -07:00
Peter Krefting 7e787953fb import-tars: Allow per-tar author and commit message.
If the "--metainfo=<ext>" option is given on the command line, a file
called "<filename.tar>.<ext>" will be used to create the commit message
for "<filename.tar>", instead of using "Imported from filename.tar".

The author and committer of the tar ball can also be overridden by
embedding an "Author:" or "Committer:" header in the metainfo file.

Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-03 09:44:41 -07:00
Johannes Schindelin 6fb37f86bc import-tars: support symlinks
Without this patch, symbolic links are turned into empty files.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-18 09:53:53 -07:00
Giuseppe Bilotta 6872f606d9 import-tars: separate author from committer
The import-tars script is typically employed to (re)create the past
history of a project from stored tars. Although assigning authorship in
these cases can be a somewhat arbitrary process, it makes sense to set
the author to whoever created the tars in the first place (if it's
known), and (s)he can in general be different from the committer
(whoever is running the script).

Implement this by having separate author and committer data, making them
settable from the usual GIT_* environment variables.

Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-20 09:33:28 -07:00
Junio C Hamano fdcb769916 Merge branch 'maint'
* maint:
  format-patch: add MIME-Version header when we add content-type.
  Fixed link in user-manual
  import-tars: Use the "Link indicator" to identify directories
  git name-rev writes beyond the end of malloc() with large generations
  Documentation/branch: fix small typo in -D example
2007-05-16 12:43:05 -07:00
Johannes Schindelin df8cfac815 import-tars: Use the "Link indicator" to identify directories
Earlier, we used the mode to determine if a name was associated with
a directory. This fails, since some tar programs do not set the mode
correctly. However, the link indicator _has_ to be set correctly.

Noticed by Chris Riddoch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-16 14:54:22 -04:00
Junio C Hamano ffcc952b33 Merge branch 'maint'
* maint:
  Fix documentation of tag in git-fast-import.txt
  Properly handle '0' filenames in import-tars
2007-05-10 14:48:04 -07:00
Shawn O. Pearce d966e6aa66 Properly handle '0' filenames in import-tars
Randal L. Schwartz pointed out multiple times that we should be
testing the length of the name string here, not if it is "true".
The problem is the string '0' is actually false in Perl when we
try to evaluate it in this context, as '0' is 0 numerically and
the number 0 is treated as a false value.  This would cause us
to break out of the import loop early if anyone had a file or
directory named "0".

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-07 21:13:40 -04:00
Shawn O. Pearce db81e67a7d Merge branch 'gfi-maint' into gfi-master
* gfi-maint:
  Teach import-tars about GNU tar's @LongLink extension.
2007-05-02 13:24:10 -04:00
Johannes Schindelin 775477aa1d Teach import-tars about GNU tar's @LongLink extension.
This extension allows GNU tar to process file names in excess of the 100
characters defined by the original tar standard. It does this by faking a
file, named '././@LongLink' containing the true file name, and then adding
the file with a truncated name. The idea is that tar without this
extension will write out a file with the long file name, and write the
contents into a file with truncated name.

Unfortunately, GNU tar does a lousy job at times. When truncating results
in a _directory_ name, it will happily use _that_ as a truncated name for
the file.

An example where this actually happens is gcc-4.1.2, where the full path
of the file WeThrowThisExceptionHelper.java truncates _exactly_ before the
basename. So, we have to support that ad-hoc extension.

This bug was noticed by Chris Riddoch on IRC.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-05-02 13:22:34 -04:00
Junio C Hamano 39231b1c32 Merge branch 'maint'
* maint:
  http.c: Fix problem with repeated calls of http_init
  Add missing reference to GIT_COMMITTER_DATE in git-commit-tree documentation
  Fix import-tars fix.
  Update .mailmap with "Michael"
  Do not barf on too long action description
  Catch empty pathnames in trees during fsck
  Don't allow empty pathnames in fast-import
  import-tars: be nice to wrong directory modes
  git-svn: Added 'find-rev' command
  git shortlog documentation: add long options and fix a typo
2007-04-29 01:52:43 -07:00
Junio C Hamano d0c32b6339 Fix import-tars fix.
This heeds advice from our resident Perl expert to make sure
the script is not confused with a string that ends with /\n

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-29 01:34:59 -07:00
Johannes Schindelin 87859f3443 import-tars: be nice to wrong directory modes
Some tars seem to have modes 0755 for directories, not 01000755. Do
not generate an empty object for them, but ignore them.

Noticed by riddochc on IRC.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-04-28 20:01:36 -04:00
Uwe Kleine-König 46f6178a3f fix importing of subversion tars
add a / between the prefix and name fields of the tar archive if prefix
is non-empty.

Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-04-24 12:14:40 -04:00
Michael Loeffler c750da256a Use gunzip -c over gzcat in import-tars example.
Not everyone has gzcat or bzcat installed on their system, but
gunzip -c and bunzip2 -c perform the same task and are available
if the user has installed gzip support or bzip2 support.

Signed-off-by: Michael Loeffler <zvpunry@zvpunry.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-21 11:09:57 -05:00
Michael Loeffler d63ea11594 import-tars: brown paper bag fix for file mode.
There is a bug with this $git_mode variable which should be 0644
or 0755, but nothing else I think.

Signed-off-by: Michael Loeffler <zvpunry@zvpunry.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-12 12:19:45 -05:00
Shawn O. Pearce 590dd4bfd2 tar archive frontend for fast-import.
This is an example fast-import frontend, in less than 100 lines
of Perl.  It accepts one or more tar archives on the command line,
passes them through gzcat/bzcat/zcat if necessary, parses out the
individual file headers and feeds all contained data to fast-import.
No temporary files are involved.

Each tar is treated as one commit, with the commit timestamp coming
from the oldest file modification date found within the tar.

Each tar is also tagged with an annotated tag, using the basename
of the tar file as the name of the tag.

Currently symbolic links and hard links are not handled by the
importer.  The file checksums are also not verified.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-02-08 15:37:53 -05:00