Commit graph

40 commits

Author SHA1 Message Date
Tim Kientzle 54d2369731 Use a simpler memory-management strategy for the file objects.
Instead of trying to reference-count them and free them as soon
as they are no longer needed, we now just keep them around and free
them all when we release the archive object.  This fixes a number
of minor memory leaks, especially when reading damaged archives.
2010-01-23 07:55:53 +00:00
Tim Kientzle f124826e0d Merge Michihiro NAKAJIMA's significant work on the ISO9660 reader
from googlecode:
 * Support for zisofs compressed entries
 * Support for relocated deep directories
 * Direct calculation of link counts for accurate nlink values
   even on images that lack Rockridge extensions
 * Faster handling of the internal file lists.
 * Better detection of ISO variants
2009-12-30 05:30:35 +00:00
Tim Kientzle 26baaf0169 Update tests to match r195873, which corrected how hardlinked files
on iso9660 images were returned.  While I'm poking around, update
some comments around this area to try to clarify what's going on and
what still remains to be improved.
2009-09-08 04:52:12 +00:00
Tim Kientzle 33fe28bbcd The parser for Rockridge symlinks tended to insert
extra slashes at the beginning of absolute targets.

Thanks to Jung-uk Kim for pointing this out to me.

Approved by:	re (kib)
2009-07-26 18:11:44 +00:00
Tim Kientzle e2f1f1fb00 Libarchive recognizes hardlinked files on ISO images,
but returned them incorrectly, causing tar to actually
erase the resulting file while trying to restore the
link.  This one-line fix corrects the hardlink descriptions
to avoid this problem.

Thanks to Jung-uk Kim for pointing this out.

Approved by:	re (kib)
2009-07-25 18:11:55 +00:00
Tim Kientzle 6388433b62 Exit with ARCHIVE_FATAL if the ISO image is truncated. 2009-04-26 18:43:49 +00:00
Tim Kientzle cdad0e17a1 Merge r558,567,569,571,581,582,583,598 from libarchive.googlecode.com:
Support Joliet extensions.  This currently ignores Rockridge extensions
if both exist on the same disk unless the '!joliet' option is provided.
e.g.: tar -xvf example.iso --options '!joliet'
Thanks to: Andreas Henriksson
2009-03-07 02:24:32 +00:00
Tim Kientzle 634fb9dd48 Merge r491,493,500,507,510,530,543 from libarchive.googlecode.com:
This implements the new generic options framework that provides a way
to override format- and compression-specific parameters.
2009-03-06 05:58:56 +00:00
Tim Kientzle 90e7486664 Merge r389 from libarchive.googlecode.com: Fix a memory
leak in ISO9660 handler structure whenever a file entry
has a nonsensical CE offset.
2009-03-05 18:38:36 +00:00
Tim Kientzle 3342e45402 "The first part is just to give more info, the latter part fixes
an error to read files past the 32bit byte offset, for instance
on DVDs."

Submitted by:	phk@
MFC after:	10 days
2009-01-13 04:56:41 +00:00
Tim Kientzle dc8cb157dd Strip ";1" and trailing "." from ISO9660 entries.
This seems a better match for people's expectations.
2008-12-06 06:57:45 +00:00
Tim Kientzle 08da6f539c General improvements to Rockridge parsing and ISO9660 format detection. 2008-12-06 06:55:07 +00:00
Tim Kientzle ebcb29a003 Conditionalize a bunch of debugging messages; this also
eliminates what should be the only remaining stdio dependency.
2008-12-06 06:50:09 +00:00
Tim Kientzle b1ff9c25b8 MfP4: Big read filter refactoring.
This is an attempt to eliminate a lot of redundant
code from the read ("decompression") filters by
changing them to juggle arbitrary-sized blocks
and consolidate reblocking code at a single point
in archive_read.c.

Along the way, I've changed the internal read/consume
API used by the format handlers to a slightly
different style originally suggested by des@.  It
does seem to simplify a lot of common cases.

The most dramatic change is, of course, to
archive_read_support_compression_none(), which
has just evaporated into a no-op as the blocking
code this used to hold has all been moved up
a level.

There's at least one more big round of refactoring
yet to come before the individual filters are as
straightforward as I think they should be...
2008-12-06 06:45:15 +00:00
Tim Kientzle fa07de5eeb MFp4: libarchive 2.5.4b. (Still 'b' until I get a bit more
feedback, but the 2.5 branch is shaping up nicely.)

In addition to many small bug fixes and code improvements:
 * Another iteration of versioning; I think I've got it right now.
 * Portability:  A lot of progress on Windows support (though I'm
   not committing all of the Windows support files to FreeBSD CVS)
 * Explicit tracking of MBS, WCS, and UTF-8 versions of strings
   in archive_entry; the archive_entry routines now correctly return
   NULL only when something is unset, setting NULL properly clears
   string values.  Most charset conversions have been pushed down to
   archive_string.
 * Better handling of charset conversion failure when writing or
   reading UTF-8 headers in pax archives
 * archive_entry_linkify() provides multiple strategies for
   hardlink matching to suit different format expectations
 * More accurate bzip2 format detection
 * Joerg Sonnenberger's extensive improvements to mtree support
 * Rough support for self-extracting ZIP archives.  Not an ideal
   approach, but it works for the archives I've tried.
 * New "sparsify" option in archive_write_disk converts blocks of nulls
   into seeks.
 * Better default behavior for the test harness; it now reports
   all failures by default instead of coredumping at the first one.
2008-05-26 17:00:24 +00:00
Tim Kientzle 54c845efb9 Someday I might forgive the standards bodies for omitting timegm().
Maybe.  In the meantime, my workarounds for trying to coax UTC without
timegm() are getting uglier and uglier.  Apparently, some systems
don't support setenv()/unsetenv(), so you can't set the TZ env var and
hope thereby to coax mktime() into generating UTC.  Without that, I
don't see a really good alternative to just giving up and converting to
localtime with mktime().  (I suppose I should research the Perl library
approach for computing an inverse function to gmtime(); that might
actually be simpler than this growing list of hacks.)
2008-02-19 06:02:01 +00:00
Tim Kientzle 9dd49f960f Update libarchive to 2.4.10. This includes a number of improvements
that I've been working on but put off committing until after the
RELENG_7 branch, including:

* New manpages: cpio.5 mtree.5
* New archive_entry_strmode()
* New archive_entry_link_resolver()
* New read support: mtree format
* Internal API change:  read format auction only runs once
* Running the auction only once allowed simplifying a lot of bid logic.
* Cpio robustness:  search for next header after a sync error
* Support device nodes on ISO9660 images
* Eliminate a lot of unnecessary copies for uncompressed archives
* Corrected handling of new GNU --sparse --posix formats
* Correctly handle a zero-byte write to a compressed archive
* Fixed memory leaks

Many of these improvements were motivated by the upcoming bsdcpio
front-end.

There have also been extensive improvements to the libarchive_test
test harness, which I'll commit separately.
2007-12-30 04:58:22 +00:00
Tim Kientzle b48b40f1f8 libarchive 2.2.3
* "compression_program" support uses an external program
  * Portability: no longer uses "struct stat" as a primary
    data interchange structure internally
  * Part of the above: refactor archive_entry to separate
    out copy_stat() and stat() functions
  * More complete tests for archive_entry
  * Finish archive_entry_clone()
  * Isolate major()/minor()/makedev() in archive_entry; remove
    these from everywhere else.
  * Bug fix: properly handle decompression look-ahead at end-of-data
  * Bug fixes to 'ar' support
  * Fix memory leak in ZIP reader
  * Portability: better timegm() emulation in iso9660 reader
  * New write_disk flags to suppress auto dir creation and not
    overwrite newer files (for future cpio front-end)
  * Simplify trailing-'/' fixup when writing tar and pax
  * Test enhancements:  fix various compiler warnings, improve
    portability, add lots of new tests.
  * Documentation: document new functions, first draft of
    libarchive_internals.3

MFC after: 14 days
Thanks to: Joerg Sonnenberger (compression_program)
Thanks to: Kai Wang (ar)
Thanks to: Colin Percival (many small fixes)
Thanks to: Many others who sent me various patches and problem reports.
2007-05-29 01:00:21 +00:00
Tim Kientzle 0acc551509 Don't compare a signed char to 0xFF.
Thanks to: Joerg Sonnenberger
2007-04-02 00:29:52 +00:00
Colin Percival 5998aba99e Provide a dummy compression-layer skip function which just reads data and
discards it, for use when the compression layer code doesn't know how to
skip data (e.g., everything other than the "none" compressor).  This makes
format level code simpler because that code can now assume that the
compression layer always knows how to skip and will always skip exactly
the requested number of bytes.

Discussed with:	kientzle (3 months ago)
2007-03-31 22:59:43 +00:00
Tim Kientzle f81da3e584 libarchive 2.0
* libarchive_test program exercises many of the core features
  * Refactored old "read_extract" into new "archive_write_disk", which
    uses archive_write methods to put entries onto disk.  In particular,
    you can now use archive_write_disk to create objects on disk
    without having an archive available.
  * Pushed some security checks from bsdtar down into libarchive, where
    they can be better optimized.
  * Rearchitected the logic for creating objects on disk to reduce
    the number of system calls.  Several common cases now use a
    minimum number of system calls.
  * Virtualized some internal interfaces to provide a clearer separation
    of read and write handling and make it simpler to override key
    methods.
  * New "empty" format reader.
  * Corrected return types (this ABI breakage required the "2.0" version bump)
  * Many bug fixes.
2007-03-03 07:37:37 +00:00
Tim Kientzle 6fccc5ecd4 Because the buffer gets released immediately, I need to
copy the symlink target name, not just copy the reference.
This problem sometimes caused crashes when extracting
symlinks from ISO9660 images.

Thanks to: Diego "Flameeyes" Pettenò
2007-03-01 06:22:34 +00:00
Tim Kientzle 63165a380d Fix the copyright notice; it was always intended to be
a vanilla 2-clause BSD license, but somehow some confusing
extra verbage get copied from somewhere.

Also, update the copyright dates to 2007 for all of the files.

Prompted by: several questions about what those extra words really mean
2007-01-09 08:05:56 +00:00
Colin Percival 3c3619cdad Convert compression_skip from taking a size_t skip length request and
returning the length skipped in a ssize_t to using off_t for both.  This
does not break any A[BP]Is, since compression_skip is entirely internal
to libarchive.

If a skip request is > SSIZE_MAX, don't pass it down to the client layer
skip function, since those still uses size_t / ssize_t.  Instead, just
read the data and throw it away.

With this commit, libarchive/bsdtar should now successfully skip archive
entries of >2GB on 32-bit systems, but does so slower than necessary.
The performance will improve with a future A[BP]I breaking commit which
makes client layer skip functions use off_t.

Discussed with:	kientzle
MFC after:	1 week
2007-01-04 12:45:00 +00:00
Tim Kientzle a6fac34b43 Improve support for large ISOs:
* Correct a signed/unsigned problem that broke handling of files >2G.
   * Implement "skip" support for much faster "tar -t".

Thanks to: Robert Sciuk for sending me a DVD that illustrated the first problem
2006-11-27 16:30:32 +00:00
Tim Kientzle 43baed1581 Unbreak libarchive on arm. Two parts of libarchive relied on a
traditional shortcut of defining on-disk layouts using structures of
character arrays. Unfortunately, as recently discussed on cvs-all@,
this usage is not actually sanctioned by the standards and
specifically fails on GCC/arm (unless your data structures happen to
be "naturally aligned").

The new code defines offsets/sizes for data fields and accesses
them using explicit pointer arithmetic, instead of casting to
a structure and accessing structure fields.  In particular,
the new code is now clean with WARNS=6 on arm.

MFC after: 14 days
2006-11-26 05:39:28 +00:00
Tim Kientzle aa1eeda578 Portability and style fixes:
* Actually use the HAVE_<header>_H macros to conditionally include
    system headers.  They've been defined for a long time, but only
    used in a few places.  Now they're used pretty consistently
    throughout.
  * Fill in a lot of missing casts for conversions from void*.
    Although Standard C doesn't require this, some people have been
    trying to use C++ compilers with this code, and they do require it.

Bit-for-bit, the compiled object files are identical, except for
one assert() whose line number changed, so I'm pretty confident I
didn't break anything.  ;-)
2006-11-10 06:39:46 +00:00
Tim Kientzle 2228e32755 POSIX.1e-style Extended Attribute support
This commit implements storing/reading POSIX.1e-style extended
attribute information in "pax" format archives.  An outline of the
storage format is in the tar.5 manpage.  The archive_read_extract()
function has code to restore those archives to disk for Linux; FreeBSD
implementation is forthcoming.

Many thanks to Jaakko Heinonen for finding flaws in earlier
proposals and doing the bulk of the coding in this work.
2006-03-21 16:55:46 +00:00
Tim Kientzle a46c33df05 Fine-tune the format detection for CPIO and ISO9660 sub-types.
This has no impact on the actual operation, it just fixes some
inaccuracies in the format code and description reported back to the caller.
2005-11-08 07:41:03 +00:00
Tim Kientzle a4fd64c861 Portability: timegm() isn't standard, so check for timegm() in
the configure script and substitute mktime() when necessary.

Thanks to:  Darin Broady
2005-11-06 23:38:01 +00:00
Tim Kientzle c4e21983bc signed/unsigned fixes (thanks to GCC4) and a few related minor style corrections. 2005-09-24 21:15:00 +00:00
Tim Kientzle 8aaa8fe733 Add a lot of error checks, based on the patches provided by Dan Lukes.
Also fixes a memory leak reported by Andrew Turner.

PR: bin/83476
Thanks to: Dan Lukes, Andrew Turner
2005-09-21 04:25:06 +00:00
Tim Kientzle 81a4ac6ddb A number of improvements to ZIP support.
* Handles entries with compressed size >2GB (signed/unsigned cleanup)
  * Handles entries with compressed size >4GB ("ZIP64" extension)
  * Handles Unix extensions (ctime, atime, mtime, mode, uid, etc)
  * Format-specific "skip data" override allows ZIP reader to skip
    entries without decompressing them, which makes "tar -t"
    a lot faster.
  * Handles "length-at-end" entries generated by, e.g., "zip -r - foo"

Many thanks to: Dan Nelson, who contributed the code and test files for
   the first three items above and suggested the fourth.
2005-04-06 04:19:30 +00:00
Tim Kientzle e3485a974c Fill in some more Rockridge details in ISO9660 support: Ignore PD
(padding) entries, extract inode value from PX entry, recognize SP and
ST (start/end of SUSP extensions).

I don't enforce SP yet, as I've seen CDROMs which use Rockridge
extensions but don't have the SP record (which is officially
required).

The ISO9660 support is now mature enough to extract FreeBSD
distribution CDROMs created with mkisofs.
2005-02-12 22:48:38 +00:00
Tim Kientzle ab4999c061 Set the format code and name correctly for:
* ISO9660 CDROM images
  * ISO9660 images with Rockridge extensions
2005-01-23 03:02:14 +00:00
Tim Kientzle e01b596372 Support 'CE' records in Rockridge extensions
(specifies that record is extended elsewhere on
the disk).
2005-01-20 04:16:55 +00:00
Tim Kientzle d576b4c796 Recognize and parse symlinks in ISO9660 CDROM images with Rockridge extensions. 2005-01-08 19:56:07 +00:00
Tim Kientzle 8544432b98 First cut at RockRidge support.
Large thanks to the easy-to-read and well-documented
sys/isofs/cd9660 source code, which provided many of the
details I needed for this exercise.
2005-01-03 05:51:33 +00:00
Tim Kientzle 483e82b86c Next round of work on ISO9660 support:
* Reference-count the directory data so that
    we don't leak memory.
  * Correctly step through the directory records
    (skipping unrecognized extensions)
  * Use better defaults for file modes
  * Sort directory entries by offset of the end of the file
    rather than the beginning of the file.  This fixes a
    lot of "out-of-order" problems with zero-length files,
    in particular.
  * Style fixes, remove some debug code, add some error messages.
2005-01-03 01:24:13 +00:00
Tim Kientzle 5d9e84da87 First cut support for extracting from ISO9660 disk images.
This seems to be able to extract a TOC and extract files from
the couple of ISO images I've tested it with.

Treat this as experimental proof-of-concept code for the
moment.  There are still a bunch of debug messages (there
are a few oddities in ISO9660 that I haven't yet figured
out how to handle), a lot of bugs to be addressed (this
code leaks memory very badly), and a lot of missing features (no
Rockridge support, in particular).  I'd appreciate
feedback from anyone who understands ISO9660 format
better than I do. ;-)

Suggested by: Robert Watson
2005-01-02 05:21:15 +00:00