tar-0.6.4.0/Codec/Archive/Tar.hs
----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar -- Copyright : (c) 2007 Bjorn Bringert, -- 2008 Andrea Vezzosi, -- 2008-2012 Duncan Coutts -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- Portability : portable -- -- Reading, writing and manipulating \"@.tar@\" archive files. -- -- This module uses common names and so is designed to be imported qualified: -- -- > import qualified Codec.Archive.Tar as Tar -- ----------------------------------------------------------------------------- module Codec.Archive.Tar ( -- | Tar archive files are used to store a collection of other files in a -- single file. They consist of a sequence of entries. Each entry describes -- a file or directory (or some other special kind of file). The entry stores -- a little bit of meta-data, in particular the file or directory name. -- -- Unlike some other archive formats, a tar file contains no index. The -- information about each entry is stored next to the entry. Because of this, -- tar files are almost always processed linearly rather than in a -- random-access fashion. -- -- The functions in this package are designed for working on tar files -- linearly and lazily. This makes it possible to do many operations in -- constant space rather than having to load the entire archive into memory. -- -- It can read and write standard POSIX tar files and also the GNU and old -- Unix V7 tar formats. The convenience functions that are provided in the -- "Codec.Archive.Tar.Entry" module for creating archive entries are -- primarily designed for standard portable archives. 
If you need to -- construct GNU format archives or exactly preserve file ownership and -- permissions then you will need to write some extra helper functions. -- -- This module contains just the simple high level operations without -- exposing all the details of tar files. If you need to inspect tar -- entries in more detail or construct them directly then you also need -- the module "Codec.Archive.Tar.Entry". -- * High level \"all in one\" operations create, extract, append, -- * Notes -- ** Compressed tar archives -- | Tar files are commonly used in conjunction with compression, as in -- @.tar.gz@ or @.tar.bz2@ files. This module does not directly -- handle compressed tar files; however, they can be handled easily by -- composing functions from this module and the modules -- [@Codec.Compression.GZip@](https://hackage.haskell.org/package/zlib/docs/Codec-Compression-Zlib.html) -- or -- [@Codec.Compression.BZip@](https://hackage.haskell.org/package/bzlib-0.5.0.5/docs/Codec-Compression-BZip.html). -- -- Creating a compressed @.tar.gz@ file is just a minor variation on the -- 'create' function, but where we add compression into the pipeline: -- -- > import qualified Data.ByteString.Lazy as BL -- > import qualified Codec.Compression.GZip as GZip -- > -- > BL.writeFile tar . GZip.compress . Tar.write =<< Tar.pack base dir -- -- Similarly, extracting a compressed @.tar.gz@ is just a minor variation -- on the 'extract' function where we use decompression in the pipeline: -- -- > import qualified Data.ByteString.Lazy as BL -- > import qualified Codec.Compression.GZip as GZip -- > -- > Tar.unpack dir . Tar.read . GZip.decompress =<< BL.readFile tar -- -- ** Security -- | This is pretty important. A maliciously constructed tar archive could -- contain entries that specify bad file names. It could specify absolute -- file names like @\/etc\/passwd@ or relative files outside of the -- archive like @..\/..\/..\/something@. 
This security problem is commonly -- called a \"directory traversal vulnerability\". Historically, such -- vulnerabilities have been common in packages handling tar archives. -- -- The 'extract' and 'Codec.Archive.Tar.unpack' functions check for bad file names. See the -- 'Codec.Archive.Tar.Check.checkSecurity' function for more details. -- If you need to do any custom -- unpacking then you should use it. -- ** Tarbombs -- | A \"tarbomb\" is a @.tar@ file where not all entries are in a -- subdirectory but instead files extract into the top level directory. The -- 'extract' function does not check for these; however, if you want to do -- that you can use the 'checkTarbomb' function like so: -- -- > import Control.Exception (SomeException(..)) -- > import Control.Applicative ((<|>)) -- > import qualified Data.ByteString.Lazy as BL -- > -- > Tar.unpackAndCheck (\x -> SomeException <$> checkEntryTarbomb expectedDir x -- > <|> SomeException <$> checkEntrySecurity x) dir . -- > Tar.read =<< BL.readFile tar -- -- In this case extraction will fail if any file is outside of @expectedDir@. -- * Converting between internal and external representation -- | Note, you cannot expect @write . read@ to give exactly the same output -- as input. You can, however, expect the information to be preserved exactly. -- This is because 'read' accepts common format variations while 'write' -- produces the standard format. read, write, -- * Packing and unpacking files to\/from internal representation -- | These functions are for packing and unpacking portable archives. They -- are not suitable in cases where it is important to preserve file ownership -- and permissions or to archive special files like named pipes and Unix -- device files. pack, packAndCheck, unpack, unpackAndCheck, -- * Types -- ** Tar entry type -- | This module provides only very simple and limited read-only access to -- the 'GenEntry' type. 
If you need access to the details or if you need to -- construct your own entries then also import "Codec.Archive.Tar.Entry". GenEntry, Entry, entryPath, entryContent, GenEntryContent(..), EntryContent, -- ** Sequences of tar entries GenEntries(..), Entries, mapEntries, mapEntriesNoFail, foldEntries, foldlEntries, unfoldEntries, -- ** Long file names encodeLongNames, decodeLongNames, DecodeLongNamesError(..), -- * Error handling -- | Reading tar files can fail if the data does not match the tar file -- format correctly. -- -- The style of error handling is to return structured errors. The pure -- functions in the library do not throw exceptions; they return the errors -- as data. The IO actions in the library can throw exceptions, in particular -- the 'Codec.Archive.Tar.unpack' action does this. All the error types used are instances of -- the standard 'Exception' class so it is possible to 'throw' and 'catch' -- them. -- ** Errors from reading tar files FormatError(..), ) where import Codec.Archive.Tar.Check import Codec.Archive.Tar.Entry import Codec.Archive.Tar.Index (hSeekEndEntryOffset) import Codec.Archive.Tar.LongNames (decodeLongNames, encodeLongNames, DecodeLongNamesError(..)) import Codec.Archive.Tar.Pack (pack, packAndCheck) import Codec.Archive.Tar.Read (read, FormatError(..)) import Codec.Archive.Tar.Types (unfoldEntries, foldlEntries, foldEntries, mapEntriesNoFail, mapEntries, Entries, GenEntries(..)) import Codec.Archive.Tar.Unpack (unpack, unpackAndCheck) import Codec.Archive.Tar.Write (write) import Control.Applicative ((<|>)) import Control.Exception (Exception, throw, catch, SomeException(..)) import qualified Data.ByteString.Lazy as BL import System.IO (withFile, IOMode(..)) import Prelude hiding (read) -- | Create a new @\".tar\"@ file from a directory of files. 
-- -- It is equivalent to calling the standard @tar@ program like so: -- -- @$ tar -f tarball.tar -C base -c dir@ -- -- This assumes a directory @.\/base\/dir@ with files inside, e.g. -- @.\/base\/dir\/foo.txt@. The file names inside the resulting tar file will be -- relative to @dir@, e.g. @dir\/foo.txt@. -- -- This is a high level \"all in one\" operation. Since you may need variations -- on this function it is instructive to see how it is written. It is just: -- -- > import qualified Data.ByteString.Lazy as BL -- > -- > BL.writeFile tar . Tar.write =<< Tar.pack base paths -- -- Notes: -- -- The files and directories must not change during this operation or the -- result is not well defined. -- -- The intention of this function is to create tarballs that are portable -- between systems. It is /not/ suitable for doing file system backups because -- file ownership and permissions are not fully preserved. File ownership is -- not preserved at all. File permissions are set to simple portable values: -- -- * @rw-r--r--@ for normal files -- -- * @rwxr-xr-x@ for executable files -- -- * @rwxr-xr-x@ for directories -- create :: FilePath -- ^ Path of the \".tar\" file to write. -> FilePath -- ^ Base directory -> [FilePath] -- ^ Files and directories to archive, relative to base dir -> IO () create tar base paths = BL.writeFile tar . write =<< pack base paths -- | Extract all the files contained in a @\".tar\"@ file. -- -- It is equivalent to calling the standard @tar@ program like so: -- -- @$ tar -x -f tarball.tar -C dir@ -- -- So for example if the @tarball.tar@ file contains @foo\/bar.txt@ then this -- will extract it to @dir\/foo\/bar.txt@. -- -- This is a high level \"all in one\" operation. Since you may need variations -- on this function it is instructive to see how it is written. It is just: -- -- > import qualified Data.ByteString.Lazy as BL -- > -- > Tar.unpack dir . Tar.read =<< BL.readFile tar -- -- Notes: -- -- Extracting can fail for a number of reasons. 
The tarball may be incorrectly -- formatted. There may be IO or permission errors. In such cases an exception -- will be thrown and extraction will not continue. -- -- Since the extraction may fail part way through it is not atomic. For this -- reason you may want to extract into an empty directory and, if the -- extraction fails, recursively delete the directory. -- -- Security: only files inside the target directory will be written. Tarballs -- containing entries that point outside of the tarball (either absolute paths -- or relative paths) will be caught and an exception will be thrown. -- extract :: FilePath -- ^ Destination directory -> FilePath -- ^ Tarball -> IO () extract dir tar = unpack dir . read =<< BL.readFile tar -- | Append new entries to a @\".tar\"@ file from a directory of files. -- -- This is much like 'create', except that all the entries are added to the -- end of an existing tar file. If the file does not already exist then -- it behaves the same as 'create'. -- append :: FilePath -- ^ Path of the \".tar\" file to write. -> FilePath -- ^ Base directory -> [FilePath] -- ^ Files and directories to archive, relative to base dir -> IO () append tar base paths = withFile tar ReadWriteMode $ \hnd -> do _ <- hSeekEndEntryOffset hnd Nothing BL.hPut hnd . write =<< pack base paths
tar-0.6.4.0/Codec/Archive/Tar/Check.hs
----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar.Check -- Copyright : (c) 2008-2012 Duncan Coutts -- 2011 Max Bolingbroke -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- Portability : portable -- -- Perform various checks on tar file entries. 
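Taken together, the high level operations above compose into a complete round trip. A minimal sketch (the archive, base directory and sub-directory names here are hypothetical):

```haskell
import qualified Codec.Archive.Tar as Tar

main :: IO ()
main = do
  -- Pack ./base/dir into dist.tar; entry names are relative to "base".
  Tar.create "dist.tar" "base" ["dir"]
  -- Append further files; behaves like 'create' if dist.tar does not exist.
  Tar.append "dist.tar" "base" ["dir2"]
  -- Extract into ./out, with the built-in file name security checks.
  Tar.extract "out" "dist.tar"
```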
-- ----------------------------------------------------------------------------- module Codec.Archive.Tar.Check ( -- * Security checkSecurity, checkEntrySecurity, FileNameError(..), -- * Tarbombs checkTarbomb, checkEntryTarbomb, TarBombError(..), -- * Portability checkPortability, checkEntryPortability, PortabilityError(..), PortabilityPlatform, ) where import Codec.Archive.Tar.Check.Internal
tar-0.6.4.0/Codec/Archive/Tar/Check/Internal.hs
{-# LANGUAGE BangPatterns #-} {-# LANGUAGE DeriveDataTypeable #-} {-# LANGUAGE ViewPatterns #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} {-# OPTIONS_GHC -Wno-orphans #-} {-# OPTIONS_HADDOCK hide #-} ----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar.Check.Internal -- Copyright : (c) 2008-2012 Duncan Coutts -- 2011 Max Bolingbroke -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- Portability : portable -- -- Perform various checks on tar file entries. 
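The checks exported above can also be run standalone over a parsed archive, for example to scan a tarball for unsafe entries without extracting anything. A sketch, assuming 'Tar.foldlEntries' returns @Either (e, a) a@ as in current releases (the helper name @scan@ and the scanned file name are hypothetical):

```haskell
import qualified Codec.Archive.Tar as Tar
import qualified Codec.Archive.Tar.Check as Check
import qualified Data.ByteString.Lazy as BL

-- Count the entries of a tarball, stopping at the first unsafe
-- or malformed entry.
scan :: FilePath -> IO ()
scan tarFile = do
  entries <- Check.checkSecurity . Tar.read <$> BL.readFile tarFile
  case Tar.foldlEntries (\n _ -> n + 1) (0 :: Int) entries of
    Right n       -> putStrLn (show n ++ " entries, no security problems")
    Left (err, n) -> putStrLn ("rejected after " ++ show n ++ " entries: " ++ show err)
```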
-- ----------------------------------------------------------------------------- module Codec.Archive.Tar.Check.Internal ( -- * Security checkSecurity, checkEntrySecurity, FileNameError(..), -- * Tarbombs checkTarbomb, checkEntryTarbomb, TarBombError(..), -- * Portability checkPortability, checkEntryPortability, PortabilityError(..), PortabilityPlatform, ) where import Codec.Archive.Tar.LongNames import Codec.Archive.Tar.Types import Control.Applicative ((<|>)) import qualified Data.ByteString.Lazy.Char8 as Char8 import Data.Maybe (fromMaybe) import Data.Typeable (Typeable) import Control.Exception (Exception(..)) import qualified System.FilePath as FilePath.Native ( splitDirectories, isAbsolute, isValid, (</>), takeDirectory, hasDrive ) import qualified System.FilePath.Windows as FilePath.Windows import qualified System.FilePath.Posix as FilePath.Posix -------------------------- -- Security -- -- | This function checks a sequence of tar entries for file name security -- problems. It checks that: -- -- * file paths are not absolute -- -- * file paths do not refer outside of the archive -- -- * file names are valid -- -- These checks are from the perspective of the current OS. That means we check -- for \"@C:\blah@\" files on Windows and \"\/blah\" files on Unix. For archive -- entry types 'HardLink' and 'SymbolicLink' the same checks are done for the -- link target. A failure in any entry terminates the sequence of entries with -- an error. -- -- Whenever possible, consider fusing 'Codec.Archive.Tar.Check.checkSecurity' -- with packing / unpacking by using -- 'Codec.Archive.Tar.packAndCheck' / 'Codec.Archive.Tar.unpackAndCheck' -- with 'Codec.Archive.Tar.Check.checkEntrySecurity'. -- Not only is it faster, but it also alleviates issues with lazy I/O -- such as exhaustion of file handles. checkSecurity :: Entries e -> GenEntries FilePath FilePath (Either (Either e DecodeLongNamesError) FileNameError) checkSecurity = checkEntries checkEntrySecurity . 
decodeLongNames -- | Worker of 'Codec.Archive.Tar.Check.checkSecurity'. -- -- @since 0.6.0.0 checkEntrySecurity :: GenEntry FilePath FilePath -> Maybe FileNameError checkEntrySecurity e = check (entryTarPath e) <|> case entryContent e of HardLink link -> check link SymbolicLink link -> check (FilePath.Posix.takeDirectory (entryTarPath e) FilePath.Posix.</> link) _ -> Nothing where checkPosix name | FilePath.Posix.isAbsolute name = Just $ AbsoluteFileName name | not (FilePath.Posix.isValid name) = Just $ InvalidFileName name | not (isInsideBaseDir (FilePath.Posix.splitDirectories name)) = Just $ UnsafeLinkTarget name | otherwise = Nothing checkNative (fromFilePathToNative -> name) | FilePath.Native.isAbsolute name || FilePath.Native.hasDrive name = Just $ AbsoluteFileName name | not (FilePath.Native.isValid name) = Just $ InvalidFileName name | not (isInsideBaseDir (FilePath.Native.splitDirectories name)) = Just $ UnsafeLinkTarget name | otherwise = Nothing check name = checkPosix name <|> checkNative (fromFilePathToNative name) isInsideBaseDir :: [FilePath] -> Bool isInsideBaseDir = go 0 where go :: Word -> [FilePath] -> Bool go !_ [] = True go 0 (".." : _) = False go lvl (".." : xs) = go (lvl - 1) xs go lvl ("." 
: xs) = go lvl xs go lvl (_ : xs) = go (lvl + 1) xs -- | Errors arising from tar file names being in some way invalid or dangerous data FileNameError = InvalidFileName FilePath | AbsoluteFileName FilePath | UnsafeLinkTarget FilePath -- ^ @since 0.6.0.0 deriving (Typeable) instance Show FileNameError where show = showFileNameError Nothing instance Exception FileNameError showFileNameError :: Maybe PortabilityPlatform -> FileNameError -> String showFileNameError mb_plat err = case err of InvalidFileName path -> "Invalid" ++ plat ++ " file name in tar archive: " ++ show path AbsoluteFileName path -> "Absolute" ++ plat ++ " file name in tar archive: " ++ show path UnsafeLinkTarget path -> "Unsafe" ++ plat ++ " link target in tar archive: " ++ show path where plat = maybe "" (' ':) mb_plat -------------------------- -- Tarbombs -- -- | This function checks a sequence of tar entries for being a \"tar bomb\". -- This means that the tar file does not follow the standard convention that -- all entries are within a single subdirectory, e.g. a file \"foo.tar\" would -- usually have all entries within the \"foo/\" subdirectory. -- -- Given the expected subdirectory, this function checks all entries are within -- that subdirectory. -- -- Note: This check must be used in conjunction with 'Codec.Archive.Tar.Check.checkSecurity' -- (or 'Codec.Archive.Tar.Check.checkPortability'). -- -- Whenever possible, consider fusing 'Codec.Archive.Tar.Check.checkTarbomb' -- with packing / unpacking by using -- 'Codec.Archive.Tar.packAndCheck' / 'Codec.Archive.Tar.unpackAndCheck' -- with 'Codec.Archive.Tar.Check.checkEntryTarbomb'. -- Not only is it faster, but it also alleviates issues with lazy I/O -- such as exhaustion of file handles. checkTarbomb :: FilePath -> Entries e -> GenEntries FilePath FilePath (Either (Either e DecodeLongNamesError) TarBombError) checkTarbomb expectedTopDir = checkEntries (checkEntryTarbomb expectedTopDir) . decodeLongNames -- | Worker of 'checkTarbomb'. 
-- -- @since 0.6.0.0 checkEntryTarbomb :: FilePath -> GenEntry FilePath linkTarget -> Maybe TarBombError checkEntryTarbomb expectedTopDir entry = do case entryContent entry of -- Global extended header aka XGLTYPE aka pax_global_header -- https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_02 OtherEntryType 'g' _ _ -> Nothing -- Extended header referring to the next file in the archive aka XHDTYPE OtherEntryType 'x' _ _ -> Nothing _ -> case FilePath.Posix.splitDirectories (entryTarPath entry) of (topDir:_) | topDir == expectedTopDir -> Nothing _ -> Just $ TarBombError expectedTopDir (entryTarPath entry) -- | An error that occurs if a tar file is a \"tar bomb\" that would extract -- files outside of the intended directory. data TarBombError = TarBombError FilePath -- ^ Expected top directory. -- -- @since 0.6.0.0 FilePath -- ^ Path inside archive. deriving (Typeable) instance Exception TarBombError instance Show TarBombError where show (TarBombError expectedTopDir tarBombPath) = "File in tar archive, " ++ show tarBombPath ++ ", is not in the expected directory " ++ show expectedTopDir -------------------------- -- Portability -- -- | This function checks a sequence of tar entries for a number of portability -- issues. It will complain if: -- -- * The old \"Unix V7\" or \"gnu\" formats are used. For maximum portability -- only the POSIX standard \"ustar\" format should be used. -- -- * A non-portable entry type is used. Only ordinary files, hard links, -- symlinks and directories are portable. Device files, pipes and others are -- not portable between all common operating systems. -- -- * Non-ASCII characters are used in file names. There is no agreed portable -- convention for Unicode or other extended character sets in file names in -- tar archives. -- -- * File names that would not be portable to both Unix and Windows. 
This check -- covers which characters are valid in both systems and the \'/\' vs \'\\\' -- directory separator conventions. -- -- Whenever possible, consider fusing 'checkPortability' with packing / unpacking by using -- 'Codec.Archive.Tar.packAndCheck' / 'Codec.Archive.Tar.unpackAndCheck' -- with 'checkEntryPortability'. -- Not only is it faster, but it also alleviates issues with lazy I/O -- such as exhaustion of file handles. checkPortability :: Entries e -> GenEntries FilePath FilePath (Either (Either e DecodeLongNamesError) PortabilityError) checkPortability = checkEntries checkEntryPortability . decodeLongNames -- | Worker of 'checkPortability'. -- -- @since 0.6.0.0 checkEntryPortability :: GenEntry FilePath linkTarget -> Maybe PortabilityError checkEntryPortability entry | entryFormat entry `elem` [V7Format, GnuFormat] = Just $ NonPortableFormat (entryFormat entry) | not (portableFileType (entryContent entry)) = Just NonPortableFileType | not (all portableChar posixPath) = Just $ NonPortableEntryNameChar posixPath | not (FilePath.Posix.isValid posixPath) = Just $ NonPortableFileName "unix" (InvalidFileName posixPath) | not (FilePath.Windows.isValid windowsPath) = Just $ NonPortableFileName "windows" (InvalidFileName windowsPath) | FilePath.Posix.isAbsolute posixPath = Just $ NonPortableFileName "unix" (AbsoluteFileName posixPath) | FilePath.Windows.isAbsolute windowsPath = Just $ NonPortableFileName "windows" (AbsoluteFileName windowsPath) | any (=="..") (FilePath.Posix.splitDirectories posixPath) = Just $ NonPortableFileName "unix" (InvalidFileName posixPath) | any (=="..") (FilePath.Windows.splitDirectories windowsPath) = Just $ NonPortableFileName "windows" (InvalidFileName windowsPath) | otherwise = Nothing where posixPath = entryTarPath entry windowsPath = fromFilePathToWindowsPath posixPath portableFileType ftype = case ftype of NormalFile {} -> True HardLink {} -> True SymbolicLink {} -> True Directory -> True _ -> False portableChar c = c <= '\127' -- 
| Portability problems in a tar archive data PortabilityError = NonPortableFormat Format | NonPortableFileType | NonPortableEntryNameChar FilePath | NonPortableFileName PortabilityPlatform FileNameError deriving (Typeable) -- | The name of a platform that portability issues arise from type PortabilityPlatform = String instance Exception PortabilityError instance Show PortabilityError where show (NonPortableFormat format) = "Archive is in the " ++ fmt ++ " format" where fmt = case format of V7Format -> "old Unix V7 tar" UstarFormat -> "ustar" -- I never generate this but a user might GnuFormat -> "GNU tar" show NonPortableFileType = "Non-portable file type in archive" show (NonPortableEntryNameChar posixPath) = "Non-portable character in archive entry name: " ++ show posixPath show (NonPortableFileName platform err) = showFileNameError (Just platform) err -------------------------- -- Utils checkEntries :: (GenEntry tarPath linkTarget -> Maybe e') -> GenEntries tarPath linkTarget e -> GenEntries tarPath linkTarget (Either e e') checkEntries checkEntry = mapEntries (\entry -> maybe (Right entry) Left (checkEntry entry))
tar-0.6.4.0/Codec/Archive/Tar/Entry.hs
----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar.Entry -- Copyright : (c) 2007 Bjorn Bringert, -- 2008 Andrea Vezzosi, -- 2008-2009 Duncan Coutts -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- Portability : portable -- -- Types and functions to manipulate tar entries. -- -- While the "Codec.Archive.Tar" module provides only the simple high level -- API, this module provides full access to the details of tar entries. This -- lets you inspect all the meta-data, construct entries and handle error cases -- more precisely. 
-- -- This module uses common names and so is designed to be imported qualified: -- -- > import qualified Codec.Archive.Tar as Tar -- > import qualified Codec.Archive.Tar.Entry as Tar -- ----------------------------------------------------------------------------- {-# LANGUAGE CPP #-} module Codec.Archive.Tar.Entry ( -- * Tar entry and associated types GenEntry(..), Entry, entryPath, GenEntryContent(..), EntryContent, Ownership(..), FileSize, Permissions, EpochTime, DevMajor, DevMinor, TypeCode, Format(..), -- * Constructing simple entry values simpleEntry, fileEntry, directoryEntry, longLinkEntry, longSymLinkEntry, -- * Standard file permissions -- | For maximum portability when constructing archives use only these file -- permissions. ordinaryFilePermissions, executableFilePermissions, directoryPermissions, -- * Constructing entries from disk files packFileEntry, packDirectoryEntry, packSymlinkEntry, #if __GLASGOW_HASKELL__ >= 908 {-# DEPRECATED "The re-export will be removed in future releases of tar, use directory-ospath-streaming package directly " #-} #endif getDirectoryContentsRecursive, -- * TarPath type TarPath, toTarPath, fromTarPath, fromTarPathToPosixPath, fromTarPathToWindowsPath, -- * LinkTarget type LinkTarget, toLinkTarget, fromLinkTarget, fromLinkTargetToPosixPath, fromLinkTargetToWindowsPath, ) where import Codec.Archive.Tar.Types import Codec.Archive.Tar.Pack import System.Directory.OsPath.Streaming (getDirectoryContentsRecursive)
tar-0.6.4.0/Codec/Archive/Tar/Index.hs
----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar.Index -- Copyright : (c) 2010-2015 Duncan Coutts -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- Portability : portable -- -- Random access to the content of a @.tar@ archive. 
-- -- This module uses common names and so is designed to be imported qualified: -- -- > import qualified Codec.Archive.Tar.Index as TarIndex -- ----------------------------------------------------------------------------- module Codec.Archive.Tar.Index ( -- | The @tar@ format does not contain an index of files within the -- archive. Normally, @tar@ files have to be processed linearly. It is -- sometimes useful, however, to be able to get random access to files -- within the archive. -- -- This module provides an index of a @tar@ file. A linear pass of the -- @tar@ file is needed to 'build' the t'TarIndex', but thereafter you can -- 'lookup' paths in the @tar@ file, and then use 'hReadEntry' to -- seek to the right part of the file and read the entry. -- -- An index cannot be used to look up 'Codec.Archive.Tar.Directory' entries in a tar file; -- instead, you will get a 'TarDir' entry listing all the entries in the -- directory. -- * Index type TarIndex, -- * Index lookup lookup, TarIndexEntry(..), toList, -- ** I\/O operations TarEntryOffset, hReadEntry, hReadEntryHeader, -- * Index construction build, -- ** Incremental construction -- $incremental-construction IndexBuilder, empty, addNextEntry, skipNextEntry, finalise, unfinalise, -- * Serialising indexes serialise, deserialise, -- * Lower level operations with offsets and I\/O on tar files hReadEntryHeaderOrEof, hSeekEntryOffset, hSeekEntryContentOffset, hSeekEndEntryOffset, nextEntryOffset, indexEndEntryOffset, indexNextEntryOffset, ) where import Prelude hiding (lookup) import Codec.Archive.Tar.Index.Internal -- $incremental-construction -- If you need more control than 'build' then you can construct the index -- in an accumulator style using the t'IndexBuilder' and its operations. -- -- Start with 'empty' and use 'addNextEntry' (or 'skipNextEntry') for -- each 'Codec.Archive.Tar.Entry.Entry' in the tar file in order. 
Every entry must be added or skipped in -- order, otherwise the resulting t'TarIndex' will report the wrong -- 'TarEntryOffset's. At the end use 'finalise' to get the t'TarIndex'. -- -- For example, 'build' is simply: -- -- > build = go empty -- > where -- > go !builder (Next e es) = go (addNextEntry e builder) es -- > go !builder Done = Right $! finalise builder -- > go !_ (Fail err) = Left err
tar-0.6.4.0/Codec/Archive/Tar/Index/IntTrie.hs
{-# LANGUAGE CPP, BangPatterns, PatternGuards #-} {-# LANGUAGE DeriveDataTypeable, ScopedTypeVariables #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE UnboxedTuples #-} {-# OPTIONS_HADDOCK hide #-} module Codec.Archive.Tar.Index.IntTrie ( IntTrie(..), construct, toList, IntTrieBuilder(..), empty, insert, finalise, unfinalise, lookup, TrieLookup(..), serialise, serialiseSize, deserialise, TrieNode(..), Completions, inserts, completionsFrom, flattenTrie, tagLeaf, tagNode, Key(..), Value(..), ) where import Prelude hiding (lookup) import Data.Typeable (Typeable) import qualified Data.Array.Unboxed as A import Data.Array.IArray ((!)) import qualified Data.Bits as Bits import Data.Word (Word32) import Data.Bits import Data.Monoid (Monoid(..)) import Data.Monoid ((<>)) import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as LBS import qualified Data.ByteString.Unsafe as BS import Data.ByteString.Builder as BS import Control.Exception (assert) import qualified Data.Map.Strict as Map import qualified Data.IntMap.Strict as IntMap import Data.IntMap.Strict (IntMap) import Data.List hiding (lookup, insert) import Data.Function (on) import GHC.IO import Codec.Archive.Tar.Index.Utils -- | A compact mapping from sequences of nats to nats. 
--
-- NOTE: The tries in this module have values /only/ at the leaves (which
-- correspond to files), they do not have values at the branch points (which
-- correspond to directories).

newtype IntTrie = IntTrie (A.UArray Word32 Word32)
  deriving (Eq, Show, Typeable)

-- | The most significant bit is used for tagging,
-- see 'tagLeaf' / 'tagNode' below, so morally it's Word31 only.
newtype Key = Key { unKey :: Word32 }
  deriving (Eq, Ord, Show)

newtype Value = Value { unValue :: Word32 }
  deriving (Eq, Ord, Show)

-- Compact, read-only implementation of a trie. It's intended for use with file
-- paths, but we do that via string ids.
--
-- Each node has a size and a sequence of keys followed by an equal length
-- sequence of corresponding entries. Since we're going to flatten this into
-- a single array then we will need to replace the trie structure with pointers
-- represented as array offsets.
--
-- Each node is a pair of arrays, one of keys and one of entries, where each
-- entry is either a value or a pointer to another node.
-- We need to distinguish values from internal pointers. We use a tag bit:
--
tagLeaf, tagNode, untag :: Word32 -> Word32
tagLeaf = id
tagNode = flip Bits.setBit   31
untag   = flip Bits.clearBit 31

isNode :: Word32 -> Bool
isNode = flip Bits.testBit 31

-------------------------------------
-- Decoding the trie array form
--

completionsFrom :: IntTrie -> Word32 -> Completions
completionsFrom trie@(IntTrie arr) nodeOff =
    [ (Key (untag key), next)
    | keyOff <- [keysStart..keysEnd]
    , let key   = arr ! keyOff
          entry = arr ! (keyOff + nodeSize)
          next | isNode key = Completions (completionsFrom trie entry)
               | otherwise  = Entry (Value entry)
    ]
  where
    nodeSize  = arr ! nodeOff
    keysStart = nodeOff + 1
    keysEnd   = nodeOff + nodeSize

-- | Convert the trie to a list
--
-- This is the left inverse to 'construct' (modulo ordering).
toList :: IntTrie -> [([Key], Value)]
toList = concatMap (aux []) . (`completionsFrom` 0)
  where
    aux :: [Key] -> (Key, TrieLookup) -> [([Key], Value)]
    aux ks (k, Entry v)        = [(reverse (k:ks), v)]
    aux ks (k, Completions cs) = concatMap (aux (k:ks)) cs

-------------------------------------
-- Toplevel trie array construction
--

-- So constructing the t'IntTrie' as a whole is just a matter of stringing
-- together all the bits

-- | Build an t'IntTrie' from a bunch of (key, value) pairs, where the keys
-- are sequences.
--
construct :: [([Key], Value)] -> IntTrie
construct = finalise . flip inserts empty

---------------------------------
-- Looking up in the trie array
--

data TrieLookup = Entry !Value | Completions Completions
  deriving (Eq, Ord, Show)

type Completions = [(Key, TrieLookup)]

lookup :: IntTrie -> [Key] -> Maybe TrieLookup
lookup trie@(IntTrie arr) = go 0
  where
    go :: Word32 -> [Key] -> Maybe TrieLookup
    go nodeOff []     = Just (completions nodeOff)
    go nodeOff (k:ks) = case search nodeOff (tagLeaf k') of
        Just entryOff
          | null ks   -> Just (entry entryOff)
          | otherwise -> Nothing
        Nothing       -> case search nodeOff (tagNode k') of
          Nothing       -> Nothing
          Just entryOff -> go (arr ! entryOff) ks
      where
        k' = unKey k

    entry       entryOff = Entry (Value (arr ! entryOff))
    completions nodeOff  = Completions (completionsFrom trie nodeOff)

    search :: Word32 -> Word32 -> Maybe Word32
    search nodeOff key = fmap (+nodeSize) (bsearch keysStart keysEnd key)
      where
        nodeSize  = arr ! nodeOff
        keysStart = nodeOff + 1
        keysEnd   = nodeOff + nodeSize

    bsearch :: Word32 -> Word32 -> Word32 -> Maybe Word32
    bsearch a b key
      | a > b     = Nothing
      | otherwise = case compare key (arr ! mid) of
          LT -> bsearch a (mid-1) key
          EQ -> Just mid
          GT -> bsearch (mid+1) b key
      where
        mid = (a + b) `div` 2

-------------------------
-- Building Tries
--

newtype IntTrieBuilder = IntTrieBuilder (IntMap TrieNode)
  deriving (Show, Eq)

data TrieNode = TrieLeaf {-# UNPACK #-} !Word32
              | TrieNode !IntTrieBuilder
  deriving (Show, Eq)

empty :: IntTrieBuilder
empty = IntTrieBuilder IntMap.empty

insert :: [Key] -> Value -> IntTrieBuilder -> IntTrieBuilder
insert []     _v t = t
insert (k:ks) v  t = insertTrie (fromIntegral (unKey k) :: Int)
                                (map (fromIntegral . unKey) ks :: [Int])
                                (unValue v) t

insertTrie :: Int -> [Int] -> Word32 -> IntTrieBuilder -> IntTrieBuilder
insertTrie k ks v (IntTrieBuilder t) =
    IntTrieBuilder $
      IntMap.alter (\t' -> Just $! maybe (freshTrieNode  ks v)
                                         (insertTrieNode ks v) t')
                   k t

insertTrieNode :: [Int] -> Word32 -> TrieNode -> TrieNode
insertTrieNode []     v  _           = TrieLeaf v
insertTrieNode (k:ks) v (TrieLeaf _) = TrieNode (freshTrie  k ks v)
insertTrieNode (k:ks) v (TrieNode t) = TrieNode (insertTrie k ks v t)

freshTrie :: Int -> [Int] -> Word32 -> IntTrieBuilder
freshTrie k []      v = IntTrieBuilder (IntMap.singleton k (TrieLeaf v))
freshTrie k (k':ks) v = IntTrieBuilder (IntMap.singleton k (TrieNode (freshTrie k' ks v)))

freshTrieNode :: [Int] -> Word32 -> TrieNode
freshTrieNode []     v = TrieLeaf v
freshTrieNode (k:ks) v = TrieNode (freshTrie k ks v)

inserts :: [([Key], Value)] -> IntTrieBuilder -> IntTrieBuilder
inserts kvs t = foldl' (\t' (ks, v) -> insert ks v t') t kvs

finalise :: IntTrieBuilder -> IntTrie
finalise trie =
    IntTrie $
      A.listArray (0, fromIntegral (flatTrieLength trie) - 1)
                  (flattenTrie trie)

unfinalise :: IntTrie -> IntTrieBuilder
unfinalise trie =
    go (completionsFrom trie 0)
  where
    go kns =
      IntTrieBuilder $
        IntMap.fromList
          [ (fromIntegral (unKey k) :: Int, t)
          | (k, n) <- kns
          , let t = case n of
                      Entry       v    -> TrieLeaf (unValue v)
                      Completions kns' -> TrieNode (go kns')
          ]

---------------------------------
-- Flattening Tries
--

type Offset = Int
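-- The flattened array below relies on the tag bit introduced earlier to tell
-- leaf values apart from node pointers once everything lives in one flat
-- Word32 array. As a standalone sanity check (a sketch that merely mirrors the
-- tagLeaf / tagNode / untag / isNode definitions above, not part of the
-- library itself), the scheme round-trips like this:

```haskell
import Data.Bits (setBit, clearBit, testBit)
import Data.Word (Word32)

-- Standalone copies of the module's tag-bit helpers: a set bit 31 marks a
-- node pointer, a clear bit 31 marks a leaf value.
tagLeaf, tagNode, untag :: Word32 -> Word32
tagLeaf = id
tagNode = flip setBit 31
untag   = flip clearBit 31

isNode :: Word32 -> Bool
isNode = flip testBit 31
```

-- Since only bit 31 is sacrificed, keys are effectively 31-bit, which is why
-- the comment above describes 'Key' as morally Word31.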
flatTrieLength :: IntTrieBuilder -> Int
flatTrieLength (IntTrieBuilder tns) =
    1
  + 2 * IntMap.size tns
  + sum [ flatTrieLength n | TrieNode n <- IntMap.elems tns ]

-- This is a breadth-first traversal. We keep a list of the tries that we are
-- to write out next. Each of these has an offset allocated to it at the
-- time we put it into the list. We keep a running offset so we know where
-- to allocate next.
--
flattenTrie :: IntTrieBuilder -> [Word32]
flattenTrie trie = go (queue [trie]) (size trie)
  where
    size (IntTrieBuilder tns) = 1 + 2 * IntMap.size tns

    go :: Q IntTrieBuilder -> Offset -> [Word32]
    go todo !offset =
      case dequeue todo of
        Nothing -> []
        Just (IntTrieBuilder tnodes, tries) ->
            flat ++ go tries' offset'
          where
            !count = IntMap.size tnodes
            flat   = fromIntegral count
                   : Map.keys  keysValues
                  ++ Map.elems keysValues
            (!offset', !keysValues, !tries') =
              IntMap.foldlWithKey' accumNodes (offset, Map.empty, tries) tnodes

    accumNodes :: (Offset, Map.Map Word32 Word32, Q IntTrieBuilder)
               -> Int -> TrieNode
               -> (Offset, Map.Map Word32 Word32, Q IntTrieBuilder)
    accumNodes (!off, !kvs, !tries) !k (TrieLeaf v) =
        (off, kvs', tries)
      where
        kvs' = Map.insert (tagLeaf (int2Word32 k)) v kvs

    accumNodes (!off, !kvs, !tries) !k (TrieNode t) =
        (off + size t, kvs', tries')
      where
        kvs'   = Map.insert (tagNode (int2Word32 k)) (int2Word32 off) kvs
        tries' = enqueue tries t

data Q a = Q [a] [a]

queue :: [a] -> Q a
queue xs = Q xs []

enqueue :: Q a -> a -> Q a
enqueue (Q front back) x = Q front (x : back)

dequeue :: Q a -> Maybe (a, Q a)
dequeue (Q (x:xs) back) = Just (x, Q xs back)
dequeue (Q []     back) = case reverse back of
                            x:xs -> Just (x, Q xs [])
                            []   -> Nothing

int2Word32 :: Int -> Word32
int2Word32 = fromIntegral

-------------------------
-- (de)serialisation
--

serialise :: IntTrie -> BS.Builder
serialise (IntTrie arr) =
    let (_, !ixEnd) = A.bounds arr in
    BS.word32BE (ixEnd+1)
 <> foldr (\n r -> BS.word32BE n <> r) mempty (A.elems arr)

serialiseSize :: IntTrie -> Int
serialiseSize (IntTrie arr) =
    let (_, ixEnd) = A.bounds arr in
    4 + 4 * (fromIntegral ixEnd + 1)

deserialise :: BS.ByteString -> Maybe (IntTrie, BS.ByteString)
deserialise bs
  | BS.length bs >= 4
  , let lenArr   = readWord32BE bs 0
        lenTotal = 4 + 4 * fromIntegral lenArr
  , BS.length bs >= 4 + 4 * fromIntegral lenArr
  , let !bs_without_len = BS.unsafeDrop 4 bs
        !bs_remaining   = BS.unsafeDrop lenTotal bs
        !arr = unsafePerformIO $ beToLe lenArr bs_without_len
  = Just (IntTrie arr, bs_remaining)

  | otherwise
  = Nothing

tar-0.6.4.0/Codec/Archive/Tar/Index/Internal.hs

{-# LANGUAGE CPP, BangPatterns, PatternGuards #-}
{-# LANGUAGE GeneralizedNewtypeDeriving, DeriveDataTypeable #-}
{-# OPTIONS_HADDOCK hide #-}
-----------------------------------------------------------------------------
-- |
-- Module      : Codec.Archive.Tar.Index.Internal
-- Copyright   : (c) 2010-2015 Duncan Coutts
-- License     : BSD3
--
-- Maintainer  : duncan@community.haskell.org
-- Portability : portable
--
-----------------------------------------------------------------------------
module Codec.Archive.Tar.Index.Internal (

    -- * Index type
    TarIndex(..),

    -- * Index lookup
    lookup,
    TarIndexEntry(..),
    toList,
    PathComponentId(..),

    -- ** I\/O operations
    TarEntryOffset,
    hReadEntry,
    hReadEntryHeader,

    -- * Index construction
    build,

    -- ** Incremental construction
    IndexBuilder,
    empty,
    addNextEntry,
    skipNextEntry,
    finalise,
    unfinalise,

    -- * Serialising indexes
    serialise,
    deserialise,

    -- * Lower level operations with offsets and I\/O on tar files
    hReadEntryHeaderOrEof,
    hSeekEntryOffset,
    hSeekEntryContentOffset,
    hSeekEndEntryOffset,
    nextEntryOffset,
    indexEndEntryOffset,
    indexNextEntryOffset,

    toComponentIds,
    serialiseLBS,
    serialiseSize,
  ) where

import Data.Typeable (Typeable)

import Codec.Archive.Tar.Types as Tar
import Codec.Archive.Tar.Read  as Tar
import qualified Codec.Archive.Tar.Index.StringTable as StringTable
import Codec.Archive.Tar.Index.StringTable (StringTable,
                                            StringTableBuilder)
import qualified Codec.Archive.Tar.Index.IntTrie as IntTrie
import Codec.Archive.Tar.Index.Utils (readWord32BE)
import Codec.Archive.Tar.Index.IntTrie (IntTrie, IntTrieBuilder)
import Codec.Archive.Tar.PackAscii

import qualified System.FilePath.Posix as FilePath
import Data.Monoid (Monoid(..))
import Data.Monoid ((<>))
import Data.Word
import Data.Int
import Data.Bits
import qualified Data.Array.Unboxed as A
import Prelude hiding (lookup)
import System.IO
import Control.Exception (assert, throwIO)
import Control.DeepSeq

import qualified Data.ByteString        as BS
import qualified Data.ByteString.Char8  as BS.Char8
import qualified Data.ByteString.Lazy   as LBS
import qualified Data.ByteString.Unsafe as BS
import Data.ByteString.Builder          as BS
import Data.ByteString.Builder.Extra    as BS (toLazyByteStringWith,
                                               untrimmedStrategy)

-- | An index of the entries in a tar file.
--
-- This index type is designed to be quite compact and suitable to store either
-- on disk or in memory.
--
data TarIndex = TarIndex

  -- As an example of how the mapping works, consider these example files:
  --   "foo/bar.hs" at offset 0
  --   "foo/baz.hs" at offset 1024
  --
  -- We split the paths into components and enumerate them.
  --   { "foo" -> TokenId 0, "bar.hs" -> TokenId 1, "baz.hs" -> TokenId 2 }
  --
  -- We convert paths into sequences of 'TokenId's, i.e.
  --   "foo/bar.hs" becomes [PathComponentId 0, PathComponentId 1]
  --   "foo/baz.hs" becomes [PathComponentId 0, PathComponentId 2]
  --
  -- We use a trie mapping sequences of 'PathComponentId's to the entry offset:
  --  { [PathComponentId 0, PathComponentId 1] -> offset 0
  --  , [PathComponentId 0, PathComponentId 2] -> offset 1024 }

  -- The mapping of filepath components as strings to ids.
  {-# UNPACK #-} !(StringTable PathComponentId)

  -- Mapping of sequences of filepath component ids to tar entry offsets.
  {-# UNPACK #-} !IntTrie -- key = PathComponentId, value = TarEntryOffset

  -- The offset immediately after the last entry, where we would append any
  -- additional entries.
  {-# UNPACK #-} !TarEntryOffset

  deriving (Eq, Show, Typeable)

instance NFData TarIndex where
  rnf (TarIndex _ _ _) = () -- fully strict by construction

-- | The result of 'Codec.Archive.Tar.Index.lookup' in a t'TarIndex'. It can
-- either be a file directly,
-- or a directory entry containing further entries (and all subdirectories
-- recursively). Note that the subtrees are constructed lazily, so it's
-- cheaper if you don't look at them.
--
data TarIndexEntry = TarFileEntry {-# UNPACK #-} !TarEntryOffset
                   | TarDir [(FilePath, TarIndexEntry)]
  deriving (Show, Typeable)

newtype PathComponentId = PathComponentId Int
  deriving (Eq, Ord, Enum, Show, Typeable)

-- | An offset within a tar file. Use 'hReadEntry', 'hReadEntryHeader' or
-- 'hSeekEntryOffset'.
--
-- This is actually a tar \"record\" number, not a byte offset.
--
type TarEntryOffset = Word32

-- | Look up a given filepath in the t'TarIndex'. It may return a 'TarFileEntry'
-- containing the 'TarEntryOffset' of the file within the tar file, or if
-- the filepath identifies a directory then it returns a 'TarDir' containing
-- the list of files within that directory.
--
-- Given the 'TarEntryOffset' you can then use one of the I\/O operations:
--
-- * 'hReadEntry' to read the whole entry;
--
-- * 'hReadEntryHeader' to read just the file metadata (e.g. its length);
--
lookup :: TarIndex -> FilePath -> Maybe TarIndexEntry
lookup (TarIndex pathTable pathTrie _) path = do
    fpath  <- toComponentIds pathTable path
    tentry <- IntTrie.lookup pathTrie $ map pathComponentIdToKey fpath
    return (mkIndexEntry tentry)
  where
    mkIndexEntry (IntTrie.Entry offset)        = TarFileEntry $ IntTrie.unValue offset
    mkIndexEntry (IntTrie.Completions entries) =
      TarDir
        [ (fromComponentId pathTable $ keyToPathComponentId key, mkIndexEntry entry)
        | (key, entry) <- entries ]

toComponentIds :: StringTable PathComponentId -> FilePath -> Maybe [PathComponentId]
toComponentIds table =
    lookupComponents []
  . filter (/= BS.Char8.singleton '.')
  . splitDirectories
  . posixToByteString
  . toPosixString
  where
    lookupComponents cs' []     = Just (reverse cs')
    lookupComponents cs' (c:cs) = case StringTable.lookup table c of
      Nothing  -> Nothing
      Just cid -> lookupComponents (cid:cs') cs

fromComponentId :: StringTable PathComponentId -> PathComponentId -> FilePath
fromComponentId table = fromPosixString . byteToPosixString . StringTable.index table

-- | All the files in the index with their corresponding 'TarEntryOffset's.
--
-- Note that the files are in no special order. If you intend to read all or
-- most files then it is recommended to sort by the 'TarEntryOffset'.
--
toList :: TarIndex -> [(FilePath, TarEntryOffset)]
toList (TarIndex pathTable pathTrie _) =
    [ (path, IntTrie.unValue off)
    | (cids, off) <- IntTrie.toList pathTrie
    , let path = FilePath.joinPath
                   (map (fromComponentId pathTable . keyToPathComponentId) cids) ]

-- | Build a t'TarIndex' from a sequence of tar 'Entries'. The 'Entries' are
-- assumed to start at offset @0@ within a file.
--
build :: Entries e -> Either e TarIndex
build = go empty
  where
    go !builder (Next e es) = go (addNextEntry e builder) es
    go !builder  Done       = Right $! finalise builder
    go !_       (Fail err)  = Left err

-- | The intermediate type used for incremental construction of a t'TarIndex'.
--
data IndexBuilder
   = IndexBuilder !(StringTableBuilder PathComponentId)
                  !IntTrieBuilder -- key = PathComponentId, value = TarEntryOffset
   {-# UNPACK #-} !TarEntryOffset
  deriving (Eq, Show)

instance NFData IndexBuilder where
  rnf IndexBuilder{} = () -- fully strict by construction

-- | The initial empty t'IndexBuilder'.
--
empty :: IndexBuilder
empty = IndexBuilder StringTable.empty IntTrie.empty 0

-- | Add the next t'Entry' into the t'IndexBuilder'.
--
addNextEntry :: Entry -> IndexBuilder -> IndexBuilder
addNextEntry entry (IndexBuilder stbl itrie nextOffset) =
    IndexBuilder stbl' itrie'
                 (nextEntryOffset entry nextOffset)
  where
    !entrypath    = splitTarPath (entryTarPath entry)
    (stbl', cids) = StringTable.inserts entrypath stbl
    itrie'        = IntTrie.insert (map pathComponentIdToKey cids)
                                   (IntTrie.Value nextOffset) itrie

-- | Use this function if you want to skip some entries and not add them to the
-- final t'TarIndex'.
--
skipNextEntry :: Entry -> IndexBuilder -> IndexBuilder
skipNextEntry entry (IndexBuilder stbl itrie nextOffset) =
    IndexBuilder stbl itrie (nextEntryOffset entry nextOffset)

-- | Finish accumulating t'Entry' information and build the compact t'TarIndex'
-- lookup structure.
--
finalise :: IndexBuilder -> TarIndex
finalise (IndexBuilder stbl itrie finalOffset) =
    TarIndex pathTable pathTrie finalOffset
  where
    pathTable = StringTable.finalise stbl
    pathTrie  = IntTrie.finalise itrie

-- | This is the offset immediately following the entry most recently added
-- to the t'IndexBuilder'. You might use this if you need to know the offsets
-- but don't want to use the t'TarIndex' lookup structure.
-- Use with 'hSeekEntryOffset'. See also 'nextEntryOffset'.
--
indexNextEntryOffset :: IndexBuilder -> TarEntryOffset
indexNextEntryOffset (IndexBuilder _ _ off) = off

-- | This is the offset immediately following the last entry in the tar file.
-- This can be useful to append further entries into the tar file.
-- Use with 'hSeekEntryOffset', or just use 'hSeekEndEntryOffset' directly.
--
indexEndEntryOffset :: TarIndex -> TarEntryOffset
indexEndEntryOffset (TarIndex _ _ off) = off

-- | Calculate the 'TarEntryOffset' of the next entry, given the size and
-- offset of the current entry.
--
-- This is much like using 'skipNextEntry' and 'indexNextEntryOffset', but
-- without using an t'IndexBuilder'.
--
nextEntryOffset :: Entry -> TarEntryOffset -> TarEntryOffset
nextEntryOffset entry offset =
    offset
  + 1
  + case entryContent entry of
      NormalFile     _   size -> blocks size
      OtherEntryType _ _ size -> blocks size
      _                       -> 0
  where
    -- NOTE: to avoid underflow, do the (fromIntegral :: Int64 -> Word32) last
    blocks :: Int64 -> TarEntryOffset
    blocks size = fromIntegral (1 + (size - 1) `div` 512)

type FilePathBS = BS.ByteString

splitTarPath :: TarPath -> [FilePathBS]
splitTarPath (TarPath name prefix) =
    splitDirectories (posixToByteString prefix)
 ++ splitDirectories (posixToByteString name)

splitDirectories :: FilePathBS -> [FilePathBS]
splitDirectories bs =
    case BS.Char8.split '/' bs of
      c:cs | BS.null c -> BS.Char8.singleton '/' : filter (not . BS.null) cs
      cs               ->                          filter (not . BS.null) cs

-------------------------
-- Resume building an existing index
--

-- | Resume building an existing index
--
-- A t'TarIndex' is optimized for a highly compact and efficient in-memory
-- representation. This, however, makes it read-only. If you have an existing
-- t'TarIndex' for a large file, and want to add to it, you can translate the
-- t'TarIndex' back to an t'IndexBuilder'. Be aware that this is a relatively
-- costly operation (linear in the size of the t'TarIndex'), though still
-- faster than starting again from scratch.
--
-- This is the left inverse to 'Codec.Archive.Tar.Index.finalise' (modulo ordering).
--
unfinalise :: TarIndex -> IndexBuilder
unfinalise (TarIndex pathTable pathTrie finalOffset) =
    IndexBuilder (StringTable.unfinalise pathTable)
                 (IntTrie.unfinalise pathTrie)
                 finalOffset

-------------------------
-- I/O operations
--

-- | Reads an entire t'Entry' at the given 'TarEntryOffset' in the tar file.
-- The 'Handle' must be open for reading and be seekable.
--
-- This reads the whole entry into memory strictly, not incrementally. For more
-- control, use 'hReadEntryHeader' and then read the entry content manually.
--
hReadEntry :: Handle -> TarEntryOffset -> IO Entry
hReadEntry hnd off = do
    entry <- hReadEntryHeader hnd off
    case entryContent entry of
      NormalFile       _ size -> do body <- LBS.hGet hnd (fromIntegral size)
                                    return entry {
                                      entryContent = NormalFile body size
                                    }
      OtherEntryType c _ size -> do body <- LBS.hGet hnd (fromIntegral size)
                                    return entry {
                                      entryContent = OtherEntryType c body size
                                    }
      _                       -> return entry

-- | Read the header for a t'Entry' at the given 'TarEntryOffset' in the tar
-- file. The 'entryContent' will contain the correct metadata but an empty file
-- content. The 'Handle' must be open for reading and be seekable.
--
-- The 'Handle' position is advanced to the beginning of the entry content (if
-- any). You must check the 'entryContent' to see if the entry is of type
-- 'NormalFile'. If it is, the 'NormalFile' gives the content length and you
-- are free to read this much data from the 'Handle'.
--
-- > entry <- Tar.hReadEntryHeader hnd
-- > case Tar.entryContent entry of
-- >   Tar.NormalFile _ size -> do content <- BS.hGet hnd size
-- >                               ...
--
-- Of course you don't have to read it all in one go (as 'hReadEntry' does),
-- you can use any appropriate method to read it incrementally.
--
-- In addition to I\/O errors, this can throw a 'FormatError' if the offset is
-- wrong, or if the file is not valid tar format.
--
-- There is also the lower level operation 'hSeekEntryOffset'.
--
hReadEntryHeader :: Handle -> TarEntryOffset -> IO Entry
hReadEntryHeader hnd blockOff = do
    hSeekEntryOffset hnd blockOff
    header <- LBS.hGet hnd 512
    case Tar.read header of
      Tar.Next entry _ -> return entry
      Tar.Fail e       -> throwIO e
      Tar.Done         -> fail "hReadEntryHeader: impossible"

-- | Set the 'Handle' position to the position corresponding to the given
-- 'TarEntryOffset'.
--
-- This position is where the entry metadata can be read. If you already know
-- the entry has a body (and perhaps know its length), you may wish to seek to
-- the body content directly using 'hSeekEntryContentOffset'.
--
hSeekEntryOffset :: Handle -> TarEntryOffset -> IO ()
hSeekEntryOffset hnd blockOff =
    hSeek hnd AbsoluteSeek (fromIntegral blockOff * 512)

-- | Set the 'Handle' position to the entry content position corresponding to
-- the given 'TarEntryOffset'.
--
-- This position is where the entry content can be read using ordinary I\/O
-- operations (though you have to know in advance how big the entry content
-- is). This is /only valid/ if you /already know/ the entry has a body (i.e.
-- is a normal file).
--
hSeekEntryContentOffset :: Handle -> TarEntryOffset -> IO ()
hSeekEntryContentOffset hnd blockOff =
    hSeekEntryOffset hnd (blockOff + 1)

-- | This is a low level variant on 'hReadEntryHeader', that can be used to
-- iterate through a tar file, entry by entry.
--
-- It has a few differences compared to 'hReadEntryHeader':
--
-- * It returns an indication when the end of the tar file is reached.
--
-- * It /does not/ move the 'Handle' position to the beginning of the entry
--   content.
--
-- * It returns the 'TarEntryOffset' of the next entry.
--
-- After this action, the 'Handle' position is not in any useful place. If
-- you want to skip to the next entry, take the 'TarEntryOffset' returned and
-- use 'hReadEntryHeaderOrEof' again. Or if having inspected the t'Entry'
-- header you want to read the entry content (if it has one) then use
-- 'hSeekEntryContentOffset' on the original input 'TarEntryOffset'.
--
hReadEntryHeaderOrEof :: Handle -> TarEntryOffset
                      -> IO (Maybe (Entry, TarEntryOffset))
hReadEntryHeaderOrEof hnd blockOff = do
    hSeekEntryOffset hnd blockOff
    header <- LBS.hGet hnd 1024
    case Tar.read header of
      Tar.Next entry _ -> let !blockOff' = nextEntryOffset entry blockOff
                           in return (Just (entry, blockOff'))
      Tar.Done         -> return Nothing
      Tar.Fail e       -> throwIO e

-- | Seek to the end of a tar file, to the position where new entries can
-- be appended, and return that 'TarEntryOffset'.
--
-- If you have a valid t'TarIndex' for this tar file then you should supply it
-- because it allows seeking directly to the correct location.
--
-- If you do not have an index, then this becomes an expensive linear
-- operation because we have to read each tar entry header from the beginning
-- to find the location immediately after the last entry (this is because tar
-- files have a variable length trailer and we cannot reliably find that by
-- starting at the end). In this mode, it will fail with an exception if the
-- file is not in fact in the tar format.
--
hSeekEndEntryOffset :: Handle -> Maybe TarIndex -> IO TarEntryOffset
hSeekEndEntryOffset hnd (Just index) = do
    let offset = indexEndEntryOffset index
    hSeekEntryOffset hnd offset
    return offset

hSeekEndEntryOffset hnd Nothing = do
    size <- hFileSize hnd
    if size == 0
      then return 0
      else seekToEnd 0
  where
    seekToEnd offset = do
      mbe <- hReadEntryHeaderOrEof hnd offset
      case mbe of
        Nothing -> do hSeekEntryOffset hnd offset
                      return offset
        Just (_, offset') -> seekToEnd offset'

-------------------------
-- (de)serialisation
--

-- | The t'TarIndex' is compact in memory, and it has a similarly compact
-- external representation.
--
serialise :: TarIndex -> BS.ByteString
serialise = toStrict . serialiseLBS

-- we keep this version around just so we can check we got the size right.
serialiseLBS :: TarIndex -> LBS.ByteString
serialiseLBS index =
    BS.toLazyByteStringWith
      (BS.untrimmedStrategy (serialiseSize index) 512) LBS.empty
      (serialiseBuilder index)

serialiseSize :: TarIndex -> Int
serialiseSize (TarIndex stringTable intTrie _) =
    StringTable.serialiseSize stringTable
  + IntTrie.serialiseSize intTrie
  + 8

serialiseBuilder :: TarIndex -> BS.Builder
serialiseBuilder (TarIndex stringTable intTrie finalOffset) =
     BS.word32BE 2 -- format version
  <> BS.word32BE finalOffset
  <> StringTable.serialise stringTable
  <> IntTrie.serialise intTrie

-- | Read the external representation back into a t'TarIndex'.
--
deserialise :: BS.ByteString -> Maybe (TarIndex, BS.ByteString)
deserialise bs
  | BS.length bs < 8
  = Nothing

  | let ver = readWord32BE bs 0
  , ver == 1
  = do let !finalOffset = readWord32BE bs 1
       (stringTable, bs')  <- StringTable.deserialiseV1 (BS.unsafeDrop 8 bs)
       (intTrie,     bs'') <- IntTrie.deserialise bs'
       return (TarIndex stringTable intTrie finalOffset, bs'')

  | let ver = readWord32BE bs 0
  , ver == 2
  = do let !finalOffset = readWord32BE bs 1
       (stringTable, bs')  <- StringTable.deserialiseV2 (BS.unsafeDrop 8 bs)
       (intTrie,     bs'') <- IntTrie.deserialise bs'
       return (TarIndex stringTable intTrie finalOffset, bs'')

  | otherwise = Nothing

toStrict :: LBS.ByteString -> BS.ByteString
toStrict = LBS.toStrict

-- 'fromIntegral' is safe even on 32-bit machines, but 'fromEnum' / 'toEnum' is not,
-- because 'fromEnum' on 'Word32' near 'maxBound' fails, as well as
-- 'toEnum :: Int -> Word32' on negative arguments.
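-- To illustrate the conversion caveat in the note above (a standalone sketch,
-- not library code): 'fromIntegral' between 'Int' and 'Word32' is total and
-- simply wraps modulo 2^32, whereas 'toEnum :: Int -> Word32' throws on
-- negative arguments, and 'fromEnum :: Word32 -> Int' can overflow on
-- machines where 'Int' is 32 bits.

```haskell
import Data.Word (Word32)

-- Total conversion: wraps modulo 2^32, never throws.
wrapToWord32 :: Int -> Word32
wrapToWord32 = fromIntegral

-- Partial conversion: 'toEnum' throws on negative Ints, which is why the
-- helper functions below use 'fromIntegral' instead.
enumToWord32 :: Int -> Word32
enumToWord32 = toEnum
```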
pathComponentIdToKey :: PathComponentId -> IntTrie.Key
pathComponentIdToKey (PathComponentId n) = IntTrie.Key (fromIntegral n)

keyToPathComponentId :: IntTrie.Key -> PathComponentId
keyToPathComponentId (IntTrie.Key n) = PathComponentId (fromIntegral n)

tar-0.6.4.0/Codec/Archive/Tar/Index/StringTable.hs

{-# LANGUAGE CPP, BangPatterns, PatternGuards, DeriveDataTypeable #-}
{-# OPTIONS_GHC -Wno-incomplete-uni-patterns #-}
{-# OPTIONS_HADDOCK hide #-}
module Codec.Archive.Tar.Index.StringTable (

    StringTable(..),
    lookup,
    index,
    construct,

    StringTableBuilder,
    empty,
    insert,
    inserts,
    finalise,
    unfinalise,

    serialise,
    serialiseSize,
    deserialiseV1,
    deserialiseV2,

    index'
  ) where

import Data.Typeable (Typeable)

import Prelude   hiding (lookup, id)
import Data.List hiding (lookup, insert)
import Data.Function (on)
import Data.Word (Word32)
import Data.Int  (Int32)
import Data.Bits
import Data.Monoid (Monoid(..))
import Data.Monoid ((<>))
import Control.Exception (assert)

import qualified Data.Array.Unboxed as A
import qualified Data.Array.Base    as A
import Data.Array.Unboxed ((!))
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)
import qualified Data.ByteString        as BS
import qualified Data.ByteString.Unsafe as BS
import qualified Data.ByteString.Lazy   as LBS
import Data.ByteString.Builder       as BS
import Data.ByteString.Builder.Extra as BS (byteStringCopy)
import GHC.IO (unsafePerformIO)
import Unsafe.Coerce (unsafeCoerce)
import Codec.Archive.Tar.Index.Utils

-- | An efficient mapping from strings to a dense set of integers.
--
data StringTable id
   = StringTable
       {-# UNPACK #-} !BS.ByteString           -- all strings concatenated
       {-# UNPACK #-} !(A.UArray Int32 Word32) -- string offset table
       {-# UNPACK #-} !(A.UArray Int32 Int32)  -- string index to id table
       {-# UNPACK #-} !(A.UArray Int32 Int32)  -- string id to index table
  deriving (Show, Typeable)

instance (Eq id, Enum id) => Eq (StringTable id) where
  tbl1 == tbl2 = unfinalise tbl1 == unfinalise tbl2

-- | Look up a string in the token table. If the string is present, return
-- its corresponding index.
--
lookup :: Enum id => StringTable id -> BS.ByteString -> Maybe id
lookup (StringTable bs offsets ids _ixs) str =
    binarySearch 0 (topBound-1) str
  where
    (0, topBound) = A.bounds offsets

    binarySearch !a !b !key
      | a > b     = Nothing
      | otherwise = case compare key (index' bs offsets mid) of
          LT -> binarySearch a (mid-1) key
          EQ -> Just $! toEnum (fromIntegral (ids ! mid))
          GT -> binarySearch (mid+1) b key
      where mid = (a + b) `div` 2

index' :: BS.ByteString -> A.UArray Int32 Word32 -> Int32 -> BS.ByteString
index' bs offsets i = BS.unsafeTake len . BS.unsafeDrop start $ bs
  where
    start, end, len :: Int
    start = fromIntegral (offsets ! i)
    end   = fromIntegral (offsets ! (i+1))
    len   = end - start

-- | Given the index of a string in the table, return the string.
--
index :: Enum id => StringTable id -> id -> BS.ByteString
index (StringTable bs offsets _ids ixs) =
    index' bs offsets . (ixs !) . fromIntegral . fromEnum

-- | Given a list of strings, construct a t'StringTable' mapping those strings
-- to a dense set of integers. Also return the ids for all the strings used
-- in the construction.
--
construct :: Enum id => [BS.ByteString] -> StringTable id
construct = finalise . foldl' (\tbl s -> fst (insert s tbl)) empty

data StringTableBuilder id
   = StringTableBuilder
                      !(Map BS.ByteString id)
       {-# UNPACK #-} !Word32
  deriving (Eq, Show, Typeable)

empty :: StringTableBuilder id
empty = StringTableBuilder Map.empty 0

insert :: Enum id => BS.ByteString -> StringTableBuilder id -> (StringTableBuilder id, id)
insert str builder@(StringTableBuilder smap nextid) =
    case Map.lookup str smap of
      Just id -> (builder, id)
      Nothing -> let !id    = toEnum (fromIntegral nextid)
                     !smap' = Map.insert str id smap
                  in (StringTableBuilder smap' (nextid+1), id)

inserts :: Enum id => [BS.ByteString] -> StringTableBuilder id -> (StringTableBuilder id, [id])
inserts bss builder = mapAccumL (flip insert) builder bss

finalise :: Enum id => StringTableBuilder id -> StringTable id
finalise (StringTableBuilder smap _) =
    (StringTable strs offsets ids ixs)
  where
    strs    = BS.concat (Map.keys smap)
    offsets = A.listArray (0, fromIntegral (Map.size smap))
            . scanl (\off str -> off + fromIntegral (BS.length str)) 0
            $ Map.keys smap
    ids     = A.listArray (0, fromIntegral (Map.size smap) - 1)
            . map (fromIntegral . fromEnum)
            $ Map.elems smap
    ixs     = A.array (A.bounds ids) [ (id,ix) | (ix,id) <- A.assocs ids ]

unfinalise :: Enum id => StringTable id -> StringTableBuilder id
unfinalise (StringTable strs offsets ids _) =
    StringTableBuilder smap nextid
  where
    smap   = Map.fromAscList
               [ (index' strs offsets ix, toEnum (fromIntegral (ids !
                  ix))) | ix <- [0..h] ]
    (0,h)  = A.bounds ids
    nextid = fromIntegral (h+1)

-------------------------
-- (de)serialisation
--

serialise :: StringTable id -> BS.Builder
serialise (StringTable strs offs ids ixs) =
    let (_, !ixEnd) = A.bounds offs in
    BS.word32BE (fromIntegral (BS.length strs))
 <> BS.word32BE (fromIntegral ixEnd + 1)
 <> BS.byteStringCopy strs
 <> foldr (\n r -> BS.word32BE n <> r) mempty (A.elems offs)
 <> foldr (\n r -> BS.int32BE  n <> r) mempty (A.elems ids)
 <> foldr (\n r -> BS.int32BE  n <> r) mempty (A.elems ixs)

serialiseSize :: StringTable id -> Int
serialiseSize (StringTable strs offs _ids _ixs) =
    let (_, !ixEnd) = A.bounds offs in
    4 * 2
  + BS.length strs
  + 4 * (fromIntegral ixEnd + 1)
  + 8 * fromIntegral ixEnd

deserialiseV1 :: BS.ByteString -> Maybe (StringTable id, BS.ByteString)
deserialiseV1 bs
  | BS.length bs >= 8
  , let lenStrs = fromIntegral (readWord32BE bs 0)
        lenArr  = fromIntegral (readWord32BE bs 1)
        lenTotal= 8 + lenStrs + 4 * lenArr
  , BS.length bs >= lenTotal
  , let strs = BS.unsafeTake lenStrs (BS.unsafeDrop 8 bs)
        arr  = A.array (0, fromIntegral lenArr - 1)
                       [ (i, readWord32BE bs off)
                       | (i, off) <- zip [0 .. fromIntegral lenArr - 1]
                                         [offArrS,offArrS+4 .. offArrE] ]
        ids  = A.array (0, fromIntegral lenArr - 1)
                       [ (i,i) | i <- [0 .. fromIntegral lenArr - 1] ]
        ixs  = ids -- two identity mappings
        offArrS = 8 + lenStrs
        offArrE = offArrS + 4 * lenArr - 1
        !stringTable = StringTable strs arr ids ixs
        !bs'         = BS.drop lenTotal bs
  = Just (stringTable, bs')

  | otherwise
  = Nothing

deserialiseV2 :: BS.ByteString -> Maybe (StringTable id, BS.ByteString)
deserialiseV2 bs
  | BS.length bs >= 8
  , let lenStrs = fromIntegral (readWord32BE bs 0)
        lenArr  = fromIntegral (readWord32BE bs 1)
        lenTotal= 8 -- the two length prefixes
                + lenStrs
                + 4 * lenArr
                +(4 * (lenArr - 1)) * 2 -- offsets array is 1 longer
  , BS.length bs >= lenTotal
  , let strs    = BS.unsafeTake lenStrs (BS.unsafeDrop 8 bs)
        offs_bs = BS.unsafeDrop (8 + lenStrs) bs
        ids_bs  = BS.unsafeDrop (lenArr * 4) offs_bs
        ixs_bs  = BS.unsafeDrop ((lenArr - 1) * 4) ids_bs

        castArray :: A.UArray i Word32 -> A.UArray i Int32
        castArray (A.UArray a b c d) = (A.UArray a b c d)

        -- Bangs are crucial for this to work in spite of unsafePerformIO!
        (offs, ids, ixs) = unsafePerformIO $ do
          !r1 <- beToLe (fromIntegral lenArr) offs_bs
          !r2 <- castArray <$> beToLe (fromIntegral lenArr - 1) ids_bs
          !r3 <- castArray <$> beToLe (fromIntegral lenArr - 1) ixs_bs
          return (r1, r2, r3)

        !stringTable = StringTable strs offs ids ixs
        !bs_left     = BS.drop lenTotal bs
  = Just (stringTable, bs_left)

  | otherwise
  = Nothing

tar-0.6.4.0/Codec/Archive/Tar/Index/Utils.hs

{-# LANGUAGE MagicHash, UnboxedTuples, BangPatterns, CPP #-}
{-# OPTIONS_HADDOCK hide #-}
module Codec.Archive.Tar.Index.Utils where

import Data.ByteString as BS
import Control.Exception (assert)
import Data.ByteString.Internal (ByteString(..), unsafeWithForeignPtr,
                                 accursedUnutterablePerformIO)
import GHC.Int (Int(..), Int32)
import GHC.Word (Word32(..), byteSwap32)
import Foreign.Storable (peek)
import GHC.Ptr (castPtr, plusPtr, Ptr)
import GHC.Exts
import GHC.IO (IO(..), unsafePerformIO)
import Data.Array.Base
import Data.Array.IO.Internals (unsafeFreezeIOUArray)
import Control.DeepSeq (NFData(..))
import GHC.Storable
import GHC.ByteOrder

#include "MachDeps.h"

-- | Construct a `UArray Word32 Word32` from a ByteString of 32bit big endian
-- words.
--
-- Note: If using `unsafePerformIO`, be sure to force the result of running the
-- IO action right away... (e.g. see calls to beToLe in StringTable)
beToLe :: (Integral i, Num i)
       => i -- ^ The total array length (the number of 32bit words in the array)
       -> BS.ByteString
       -- ^ The bytestring from which the UArray is constructed.
       -- The content must start in the first byte! (i.e. the meta-data words
       -- that shouldn't be part of the array, must have been dropped already)
       -> IO (UArray i Word32)
beToLe lenArr (BS fptr _) = do
    unsafeWithForeignPtr fptr $ \ptr -> do
      let ptr' = castPtr ptr :: Ptr Word32
          !(I# lenBytes#) = fromIntegral (lenArr * 4)
      -- In spirit, the following does this, but we can't use `newGenArray`
      -- because it only has been introduced in later versions of array:
      -- @@
      -- unsafeFreezeIOUArray =<<
      --   newGenArray (0, lenArr - 1) (\offset ->
      --     byteSwap32 <$> peek (ptr' `plusPtr` (fromIntegral offset * 4)))
      -- @@
      IO $ \rw0 ->
        case newByteArray# lenBytes# rw0 of
          (# rw1, mba# #) ->
            let loop :: Int -> State# RealWorld -> State# RealWorld
                loop !offset st
                  | offset < fromIntegral lenArr =
                      let IO getV  = readWord32OffPtrBE ptr' offset
                          !(I# o#) = offset
                       in case getV st of
                            (# st', W32# v# #) ->
                              loop (offset + 1) (writeWord32Array# mba# o# v# st')
                  | otherwise = st
             in case unsafeFreezeByteArray# mba# (loop 0 rw1) of
                  (# rw2, ba# #) ->
                    (# rw2, UArray 0 (lenArr - 1) (fromIntegral lenArr) ba# #)
{-# SPECIALISE beToLe :: Word32 -> BS.ByteString -> IO (UArray Word32 Word32) #-}
{-# SPECIALISE beToLe :: Int32  -> BS.ByteString -> IO (UArray Int32  Word32) #-}

readInt32BE :: BS.ByteString -> Int -> Int32
readInt32BE bs i = fromIntegral (readWord32BE bs i)
{-# INLINE readInt32BE #-}

readWord32OffPtrBE :: Ptr Word32 -> Int -> IO Word32
readWord32OffPtrBE ptr i = do
#if defined(WORDS_BIGENDIAN)
    readWord32OffPtr
ptr i #else byteSwap32 <$> readWord32OffPtr ptr i #endif readWord32BE :: BS.ByteString -> Int -> Word32 readWord32BE (BS fptr len) i = assert (i >= 0 && i+3 <= len - 1) $ accursedUnutterablePerformIO $ unsafeWithForeignPtr fptr $ \ptr -> do readWord32OffPtrBE (castPtr ptr) i {-# INLINE readWord32BE #-} tar-0.6.4.0/Codec/Archive/Tar/LongNames.hs0000644000000000000000000001423407346545000016205 0ustar0000000000000000{-# LANGUAGE LambdaCase #-} {-# LANGUAGE PackageImports #-} {-# OPTIONS_HADDOCK hide #-} module Codec.Archive.Tar.LongNames ( encodeLongNames , decodeLongNames , DecodeLongNamesError(..) ) where import Codec.Archive.Tar.PackAscii import Codec.Archive.Tar.Types import Control.Exception import qualified Data.ByteString.Char8 as B import qualified Data.ByteString.Lazy.Char8 as BL import "os-string" System.OsString.Posix (PosixString, PosixChar) import qualified "os-string" System.OsString.Posix as PS -- | Errors raised by 'decodeLongNames'. -- -- @since 0.6.0.0 data DecodeLongNamesError = TwoTypeKEntries -- ^ Two adjacent 'OtherEntryType' @\'K\'@ nodes. | TwoTypeLEntries -- ^ Two adjacent 'OtherEntryType' @\'L\'@ nodes. | NoLinkEntryAfterTypeKEntry -- ^ 'OtherEntryType' @\'K\'@ node is not followed by a 'SymbolicLink' / 'HardLink'. deriving (Eq, Ord, Show) instance Exception DecodeLongNamesError -- | Translate high-level entries with POSIX 'FilePath's for files and symlinks -- into entries suitable for serialization by emitting additional -- 'OtherEntryType' @\'K\'@ and 'OtherEntryType' @\'L\'@ nodes. -- -- Input 'FilePath's must be POSIX file names, not native ones. 
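The big-endian decoding performed by readWord32OffPtrBE and readWord32BE above can be pictured with a small, pointer-free sketch (the helper name word32BE is hypothetical and not part of this module): fold four bytes, most significant first, into a Word32.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word32)

-- Fold four bytes (most significant first) into a Word32; this is the
-- same arithmetic the byteSwap32-based pointer reads above compute.
word32BE :: [Word8] -> Word32
word32BE = foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- word32BE [0x00, 0x00, 0x01, 0x00] == 256
```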
-- -- @since 0.6.0.0 encodeLongNames :: GenEntry FilePath FilePath -> [Entry] encodeLongNames e = maybe id (:) mEntry $ maybe id (:) mEntry' [e''] where (mEntry, e') = encodeLinkTarget e (mEntry', e'') = encodeTarPath e' encodeTarPath :: GenEntry FilePath linkTarget -> (Maybe (GenEntry TarPath whatever), GenEntry TarPath linkTarget) -- ^ (LongLink entry, actual entry) encodeTarPath e = case toTarPath' (entryTarPath e) of FileNameEmpty -> (Nothing, e { entryTarPath = TarPath mempty mempty }) FileNameOK tarPath -> (Nothing, e { entryTarPath = tarPath }) FileNameTooLong tarPath -> (Just $ longLinkEntry $ entryTarPath e, e { entryTarPath = tarPath }) encodeLinkTarget :: GenEntry tarPath FilePath -> (Maybe (GenEntry TarPath LinkTarget), GenEntry tarPath LinkTarget) -- ^ (LongLink symlink entry, actual entry) encodeLinkTarget e = case entryContent e of NormalFile x y -> (Nothing, e { entryContent = NormalFile x y }) Directory -> (Nothing, e { entryContent = Directory }) SymbolicLink lnk -> let (mEntry, lnk') = encodeLinkPath lnk in (mEntry, e { entryContent = SymbolicLink lnk' }) HardLink lnk -> let (mEntry, lnk') = encodeLinkPath lnk in (mEntry, e { entryContent = HardLink lnk' }) CharacterDevice x y -> (Nothing, e { entryContent = CharacterDevice x y }) BlockDevice x y -> (Nothing, e { entryContent = BlockDevice x y }) NamedPipe -> (Nothing, e { entryContent = NamedPipe }) OtherEntryType x y z -> (Nothing, e { entryContent = OtherEntryType x y z }) encodeLinkPath :: FilePath -> (Maybe (GenEntry TarPath LinkTarget), LinkTarget) encodeLinkPath lnk = case toTarPath' lnk of FileNameEmpty -> (Nothing, LinkTarget mempty) FileNameOK (TarPath name prefix) | PS.null prefix -> (Nothing, LinkTarget name) | otherwise -> (Just $ longSymLinkEntry lnk, LinkTarget name) FileNameTooLong (TarPath name _) -> (Just $ longSymLinkEntry lnk, LinkTarget name) -- | Translate low-level entries (usually freshly deserialized) into -- high-level entries with POSIX 'FilePath's for files and 
symlinks -- by parsing and eliminating -- 'OtherEntryType' @\'K\'@ and 'OtherEntryType' @\'L\'@ nodes. -- -- Resolved 'FilePath's are still POSIX file names, not native ones. -- -- @since 0.6.0.0 decodeLongNames :: Entries e -> GenEntries FilePath FilePath (Either e DecodeLongNamesError) decodeLongNames = go Nothing Nothing where go :: Maybe FilePath -> Maybe FilePath -> Entries e -> GenEntries FilePath FilePath (Either e DecodeLongNamesError) go _ _ (Fail err) = Fail (Left err) go _ _ Done = Done go Nothing Nothing (Next e rest) = case entryContent e of OtherEntryType 'K' fn _ -> go (Just (otherEntryPayloadToFilePath fn)) Nothing rest OtherEntryType 'L' fn _ -> go Nothing (Just (otherEntryPayloadToFilePath fn)) rest _ -> Next (castEntry e) (go Nothing Nothing rest) go Nothing (Just path) (Next e rest) = case entryContent e of OtherEntryType 'K' fn _ -> go (Just (otherEntryPayloadToFilePath fn)) (Just path) rest OtherEntryType 'L' _ _ -> Fail $ Right TwoTypeLEntries _ -> Next ((castEntry e) { entryTarPath = path }) (go Nothing Nothing rest) go (Just link) Nothing (Next e rest) = case entryContent e of OtherEntryType 'K' _ _ -> Fail $ Right TwoTypeKEntries OtherEntryType 'L' fn _ -> go (Just link) (Just (otherEntryPayloadToFilePath fn)) rest SymbolicLink{} -> Next ((castEntry e) { entryContent = SymbolicLink link }) (go Nothing Nothing rest) HardLink{} -> Next ((castEntry e) { entryContent = HardLink link }) (go Nothing Nothing rest) _ -> Fail $ Right NoLinkEntryAfterTypeKEntry go (Just link) (Just path) (Next e rest) = case entryContent e of OtherEntryType 'K' _ _ -> Fail $ Right TwoTypeKEntries OtherEntryType 'L' _ _ -> Fail $ Right TwoTypeLEntries SymbolicLink{} -> Next ((castEntry e) { entryTarPath = path, entryContent = SymbolicLink link }) (go Nothing Nothing rest) HardLink{} -> Next ((castEntry e) { entryTarPath = path, entryContent = HardLink link }) (go Nothing Nothing rest) _ -> Fail $ Right NoLinkEntryAfterTypeKEntry otherEntryPayloadToFilePath :: 
BL.ByteString -> FilePath otherEntryPayloadToFilePath = fromPosixString . byteToPosixString . B.takeWhile (/= '\0') . BL.toStrict castEntry :: Entry -> GenEntry FilePath FilePath castEntry e = e { entryTarPath = fromTarPathToPosixPath (entryTarPath e) , entryContent = castEntryContent (entryContent e) } castEntryContent :: EntryContent -> GenEntryContent FilePath castEntryContent = \case NormalFile x y -> NormalFile x y Directory -> Directory SymbolicLink linkTarget -> SymbolicLink $ fromLinkTargetToPosixPath linkTarget HardLink linkTarget -> HardLink $ fromLinkTargetToPosixPath linkTarget CharacterDevice x y -> CharacterDevice x y BlockDevice x y -> BlockDevice x y NamedPipe -> NamedPipe OtherEntryType x y z -> OtherEntryType x y z tar-0.6.4.0/Codec/Archive/Tar/Pack.hs0000644000000000000000000002062007346545000015174 0ustar0000000000000000{-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE ViewPatterns #-} {-# OPTIONS_HADDOCK hide #-} {-# OPTIONS_GHC -Wno-unrecognised-pragmas #-} {-# HLINT ignore "Avoid restricted function" #-} ----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar -- Copyright : (c) 2007 Bjorn Bringert, -- 2008 Andrea Vezzosi, -- 2008-2009, 2012, 2016 Duncan Coutts -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- Portability : portable -- ----------------------------------------------------------------------------- module Codec.Archive.Tar.Pack ( pack, packAndCheck, packFileEntry, packDirectoryEntry, packSymlinkEntry, longLinkEntry, ) where import Codec.Archive.Tar.LongNames import Codec.Archive.Tar.PackAscii (filePathToOsPath, osPathToFilePath) import Codec.Archive.Tar.Types import Data.Bifunctor (bimap) import qualified Data.ByteString as B import qualified Data.ByteString.Lazy as BL import Data.Foldable import System.File.OsPath import System.OsPath ( OsPath, () ) import qualified System.OsPath as FilePath.Native ( addTrailingPathSeparator, 
hasTrailingPathSeparator ) import System.Directory.OsPath ( doesDirectoryExist, getModificationTime , pathIsSymbolicLink, getSymbolicLinkTarget , Permissions(..), getPermissions, getFileSize ) import qualified System.Directory.OsPath.Types as FT import System.Directory.OsPath.Streaming (getDirectoryContentsRecursive) import Data.Time.Clock.POSIX ( utcTimeToPOSIXSeconds ) import System.IO ( IOMode(ReadMode), hFileSize ) import System.IO.Unsafe (unsafeInterleaveIO) import Control.Exception (throwIO, SomeException) -- | Creates a tar archive from a list of directories or files. Any directories -- specified will have their contents included recursively. Paths in the -- archive will be relative to the given base directory. -- -- This is a portable implementation of packing suitable for portable archives. -- In particular it only constructs 'NormalFile', 'Directory' and 'SymbolicLink' -- entries. Hard links are treated like ordinary files. Special files like -- FIFOs (named pipes), sockets or device files will cause problems. -- -- * This function returns results lazily. Subdirectories are scanned -- and files are read one by one as the list of entries is consumed. -- Do not change their contents before the output of 'Codec.Archive.Tar.pack' has been consumed in full. -- pack :: FilePath -- ^ Base directory -> [FilePath] -- ^ Files and directories to pack, relative to the base dir -> IO [Entry] pack = packAndCheck (const Nothing) -- | Like 'Codec.Archive.Tar.pack', but allows specifying additional sanity/security -- checks on the input filenames. This is useful if you know which -- check will be used on the client side -- in 'Codec.Archive.Tar.unpack' / 'Codec.Archive.Tar.unpackAndCheck'.
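The portable packing described above composes directly with 'Codec.Archive.Tar.write'; a minimal sketch, assuming a directory @src@ exists (the paths are hypothetical):

```haskell
import qualified Codec.Archive.Tar as Tar
import qualified Data.ByteString.Lazy as BL

-- Archive the directory "src" (relative to ".") into backup.tar.
-- Entries are produced lazily, so the archive is written in constant space.
main :: IO ()
main = do
  entries <- Tar.pack "." ["src"]
  BL.writeFile "backup.tar" (Tar.write entries)
```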
-- -- @since 0.6.0.0 packAndCheck :: (GenEntry FilePath FilePath -> Maybe SomeException) -> FilePath -- ^ Base directory -> [FilePath] -- ^ Files and directories to pack, relative to the base dir -> IO [Entry] packAndCheck secCB (filePathToOsPath -> baseDir) (map filePathToOsPath -> relpaths) = do paths <- preparePaths baseDir relpaths entries' <- packPaths baseDir paths let entries = map (bimap osPathToFilePath osPathToFilePath) entries' traverse_ (maybe (pure ()) throwIO . secCB) entries pure $ concatMap encodeLongNames entries preparePaths :: OsPath -> [OsPath] -> IO [OsPath] preparePaths baseDir = fmap concat . interleave . map go where go :: OsPath -> IO [OsPath] go relpath = do let abspath = baseDir relpath isDir <- doesDirectoryExist abspath isSymlink <- pathIsSymbolicLink abspath if isDir && not isSymlink then do entries <- getDirectoryContentsRecursive abspath let entries' = map ((relpath ) . addSeparatorIfDir) entries return $ if relpath == mempty then entries' else FilePath.Native.addTrailingPathSeparator relpath : entries' else return [relpath] addSeparatorIfDir (fn, ty) = case ty of FT.Directory{} -> FilePath.Native.addTrailingPathSeparator fn _ -> fn -- | Pack paths while accounting for overlong filepaths. packPaths :: OsPath -> [OsPath] -> IO [GenEntry OsPath OsPath] packPaths baseDir paths = interleave $ flip map paths $ \relpath -> do let isDir = FilePath.Native.hasTrailingPathSeparator abspath abspath = baseDir relpath isSymlink <- pathIsSymbolicLink abspath let mkEntry | isSymlink = packSymlinkEntry' | isDir = packDirectoryEntry' | otherwise = packFileEntry' mkEntry abspath relpath interleave :: [IO a] -> IO [a] interleave = unsafeInterleaveIO . go where go [] = return [] go (x:xs) = do x' <- x xs' <- interleave xs return (x':xs') -- | Construct a tar entry based on a local file. -- -- This sets the entry size, the data contained in the file and the file's -- modification time. If the file is executable then that information is also -- preserved. 
File ownership and detailed permissions are not preserved. -- -- * The file contents are read lazily. -- packFileEntry :: FilePath -- ^ Full path to find the file on the local disk -> tarPath -- ^ Path to use for the tar 'GenEntry' in the archive -> IO (GenEntry tarPath linkTarget) packFileEntry = packFileEntry' . filePathToOsPath packFileEntry' :: OsPath -- ^ Full path to find the file on the local disk -> tarPath -- ^ Path to use for the tar 'GenEntry' in the archive -> IO (GenEntry tarPath linkTarget) packFileEntry' filepath tarpath = do mtime <- getModTime filepath perms <- getPermissions filepath -- Get file size without opening it. approxSize <- getFileSize filepath (content, size) <- if approxSize < 131072 -- If file is short enough, just read it strictly -- so that no file handle dangles around indefinitely. then do cnt <- readFile' filepath pure (BL.fromStrict cnt, fromIntegral $ B.length cnt) else do hndl <- openBinaryFile filepath ReadMode -- File size could have changed between measuring approxSize -- and here. Measuring again. sz <- hFileSize hndl -- Lazy I/O at its best: once cnt is forced in full, -- BL.hGetContents will close the handle. cnt <- BL.hGetContents hndl -- It would be wrong to return (cnt, BL.length cnt): -- NormalFile constructor below forces size which in turn -- allocates entire cnt in memory at once. pure (cnt, fromInteger sz) pure (simpleEntry tarpath (NormalFile content size)) { entryPermissions = if executable perms then executableFilePermissions else ordinaryFilePermissions , entryTime = mtime } -- | Construct a tar entry based on a local directory (but not its contents). -- -- The only attribute of the directory that is used is its modification time. -- Directory ownership and detailed permissions are not preserved.
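Since packFileEntry is polymorphic in its tarPath argument, a caller can build an entry with a plain 'FilePath' and inspect the metadata it fills in; a sketch with hypothetical file names:

```haskell
import qualified Codec.Archive.Tar.Entry as Tar

-- Pack one local file under a chosen in-archive name and inspect
-- the metadata that packFileEntry fills in (mtime, permissions).
main :: IO ()
main = do
  e <- Tar.packFileEntry "README.md" "docs/readme.txt"
         :: IO (Tar.GenEntry FilePath FilePath)
  print (Tar.entryTime e, Tar.entryPermissions e)
```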
-- packDirectoryEntry :: FilePath -- ^ Full path to find the file on the local disk -> tarPath -- ^ Path to use for the tar 'GenEntry' in the archive -> IO (GenEntry tarPath linkTarget) packDirectoryEntry = packDirectoryEntry' . filePathToOsPath packDirectoryEntry' :: OsPath -- ^ Full path to find the file on the local disk -> tarPath -- ^ Path to use for the tar 'GenEntry' in the archive -> IO (GenEntry tarPath linkTarget) packDirectoryEntry' filepath tarpath = do mtime <- getModTime filepath return (directoryEntry tarpath) { entryTime = mtime } -- | Construct a tar entry based on a local symlink. -- -- @since 0.6.0.0 packSymlinkEntry :: FilePath -- ^ Full path to find the file on the local disk -> tarPath -- ^ Path to use for the tar 'GenEntry' in the archive -> IO (GenEntry tarPath FilePath) packSymlinkEntry = ((fmap (fmap osPathToFilePath) .) . packSymlinkEntry') . filePathToOsPath packSymlinkEntry' :: OsPath -- ^ Full path to find the file on the local disk -> tarPath -- ^ Path to use for the tar 'GenEntry' in the archive -> IO (GenEntry tarPath OsPath) packSymlinkEntry' filepath tarpath = do linkTarget <- getSymbolicLinkTarget filepath pure $ symlinkEntry tarpath linkTarget getModTime :: OsPath -> IO EpochTime getModTime path = do -- The directory package switched to the new time package t <- getModificationTime path return . floor . 
utcTimeToPOSIXSeconds $ t tar-0.6.4.0/Codec/Archive/Tar/PackAscii.hs0000644000000000000000000000304507346545000016147 0ustar0000000000000000{-# LANGUAGE PackageImports #-} {-# OPTIONS_HADDOCK hide #-} {-# OPTIONS_GHC -Wno-deprecations #-} {-# OPTIONS_GHC -Wno-unrecognised-pragmas #-} {-# HLINT ignore "Avoid restricted function" #-} module Codec.Archive.Tar.PackAscii ( toPosixString , fromPosixString , posixToByteString , byteToPosixString , packAscii , filePathToOsPath , osPathToFilePath ) where import Data.ByteString (ByteString) import qualified Data.ByteString.Char8 as BS.Char8 import qualified Data.ByteString.Short as Sh import Data.Char import GHC.Stack import System.IO.Unsafe (unsafePerformIO) import "os-string" System.OsString.Posix (PosixString) import qualified "filepath" System.OsPath as OS import qualified "os-string" System.OsString.Posix as PS import qualified "os-string" System.OsString.Internal.Types as PS toPosixString :: FilePath -> PosixString toPosixString = unsafePerformIO . PS.encodeFS fromPosixString :: PosixString -> FilePath fromPosixString = unsafePerformIO . PS.decodeFS posixToByteString :: PosixString -> ByteString posixToByteString = Sh.fromShort . PS.getPosixString byteToPosixString :: ByteString -> PosixString byteToPosixString = PS.PosixString . Sh.toShort packAscii :: HasCallStack => FilePath -> BS.Char8.ByteString packAscii xs | all isAscii xs = BS.Char8.pack xs | otherwise = error $ "packAscii: only ASCII inputs are supported, but got " ++ xs filePathToOsPath :: FilePath -> OS.OsPath filePathToOsPath = unsafePerformIO . OS.encodeFS osPathToFilePath :: OS.OsPath -> FilePath osPathToFilePath = unsafePerformIO . 
OS.decodeFS tar-0.6.4.0/Codec/Archive/Tar/Read.hs0000644000000000000000000002533407346545000015200 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE PackageImports #-} {-# OPTIONS_HADDOCK hide #-} ----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar.Read -- Copyright : (c) 2007 Bjorn Bringert, -- 2008 Andrea Vezzosi, -- 2008-2009 Duncan Coutts, -- 2011 Max Bolingbroke -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- Portability : portable -- ----------------------------------------------------------------------------- module Codec.Archive.Tar.Read ( read , FormatError(..) ) where import Codec.Archive.Tar.PackAscii import Codec.Archive.Tar.Types import Data.Char (ord) import Data.Int (Int64) import Data.Bits (Bits(shiftL, (.&.), complement)) import Control.Exception (Exception(..)) import Data.Typeable (Typeable) import Control.Applicative import Control.Monad import Control.DeepSeq import Control.Monad.Trans.State.Lazy import qualified Data.ByteString as BS import qualified Data.ByteString.Char8 as BS.Char8 import qualified Data.ByteString.Unsafe as BS import qualified Data.ByteString.Lazy as LBS import System.IO.Unsafe (unsafePerformIO) import "os-string" System.OsString.Posix (PosixString, PosixChar) import qualified "os-string" System.OsString.Posix as PS import Prelude hiding (read) -- | Errors that can be encountered when parsing a Tar archive. 
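On the consuming side, decoding errors surface through the 'Fail' constructor rather than as exceptions, so callers typically dispatch on them while folding over the stream; a sketch (the archive name is hypothetical):

```haskell
import qualified Codec.Archive.Tar as Tar
import qualified Data.ByteString.Lazy as BL
import Control.Exception (throwIO)

-- Count the entries of an archive, rethrowing any FormatError in IO.
countEntries :: FilePath -> IO Int
countEntries path = do
  bs <- BL.readFile path
  Tar.foldEntries (\_ acc -> succ <$> acc) (pure 0) throwIO (Tar.read bs)
```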
data FormatError = TruncatedArchive | ShortTrailer | BadTrailer | TrailingJunk | ChecksumIncorrect | NotTarFormat | UnrecognisedTarFormat | HeaderBadNumericEncoding deriving (Eq, Show, Typeable) instance Exception FormatError where displayException TruncatedArchive = "truncated tar archive" displayException ShortTrailer = "short tar trailer" displayException BadTrailer = "bad tar trailer" displayException TrailingJunk = "tar file has trailing junk" displayException ChecksumIncorrect = "tar checksum error" displayException NotTarFormat = "data is not in tar format" displayException UnrecognisedTarFormat = "tar entry not in a recognised format" displayException HeaderBadNumericEncoding = "tar header is malformed (bad numeric encoding)" instance NFData FormatError where rnf !_ = () -- enumerations are fully strict by construction -- | Convert a data stream in the tar file format into an internal data -- structure. Decoding errors are reported by the 'Fail' constructor of the -- 'Entries' type. -- -- * The conversion is done lazily. -- read :: LBS.ByteString -> Entries FormatError read = evalState (readStreaming getN get) where getN :: Int64 -> State LBS.ByteString LBS.ByteString getN n = do (pref, st) <- LBS.splitAt n <$> get put st pure pref readStreaming :: Monad m => (Int64 -> m LBS.ByteString) -> m LBS.ByteString -> m (Entries FormatError) readStreaming = (unfoldEntriesM id .) . getEntryStreaming getEntryStreaming :: Monad m => (Int64 -> m LBS.ByteString) -> m LBS.ByteString -> m (Either FormatError (Maybe Entry)) getEntryStreaming getN getAll = do header <- getN 512 if LBS.length header < 512 then pure (Left TruncatedArchive) else do -- Tar files end with at least two blocks of all '0'. Checking this serves -- two purposes. It checks the format but also forces the tail of the data -- which is necessary to close the file if it came from a lazily read file. 
-- -- It's tempting to fall into trailer parsing as soon as LBS.head bs == '\0', -- because, if interpreted as an 'Entry', it means that 'entryTarPath' is an empty -- string. Yet it's not a concern of this function: parse it as an 'Entry' -- and let further pipeline such as 'checkEntrySecurity' deal with it. After all, -- it might be a format extension with unknown semantics. Such somewhat malformed -- archives do exist in the wild, see https://github.com/haskell/tar/issues/73. -- -- Only if an entire block is null, we assume that we are parsing a trailer. if LBS.all (== 0) header then do nextBlock <- getN 512 if LBS.length nextBlock < 512 then pure (Left ShortTrailer) else if LBS.all (== 0) nextBlock then do remainder <- getAll pure $ if LBS.all (== 0) remainder then Right Nothing else Left TrailingJunk else pure (Left BadTrailer) else case parseHeader header of Left err -> pure $ Left err Right (name, mode, uid, gid, size, mtime, typecode, linkname, format, uname, gname, devmajor, devminor, prefix) -> do -- It is crucial to get (size + padding) in one monadic operation -- and drop padding in a pure computation. If you get size bytes first, -- then skip padding, unpacking in constant memory will become impossible. let paddedSize = (size + 511) .&. 
complement 511 paddedContent <- getN paddedSize let content = LBS.take size paddedContent pure $ Right $ Just $ Entry { entryTarPath = TarPath (byteToPosixString name) (byteToPosixString prefix), entryContent = case typecode of '\0' -> NormalFile content size '0' -> NormalFile content size '1' -> HardLink (LinkTarget $ byteToPosixString linkname) '2' -> SymbolicLink (LinkTarget $ byteToPosixString linkname) _ | format == V7Format -> OtherEntryType typecode content size '3' -> CharacterDevice devmajor devminor '4' -> BlockDevice devmajor devminor '5' -> Directory '6' -> NamedPipe '7' -> NormalFile content size _ -> OtherEntryType typecode content size, entryPermissions = mode, entryOwnership = Ownership (BS.Char8.unpack uname) (BS.Char8.unpack gname) uid gid, entryTime = mtime, entryFormat = format } parseHeader :: LBS.ByteString -> Either FormatError (BS.ByteString, Permissions, Int, Int, Int64, EpochTime, Char, BS.ByteString, Format, BS.ByteString, BS.ByteString, DevMajor, DevMinor, BS.ByteString) parseHeader header' = do case (chksum_, format_ magic) of (Right chksum, _ ) | correctChecksum header chksum -> return () (Right _, Right _) -> Left ChecksumIncorrect _ -> Left NotTarFormat mode <- mode_ uid <- uid_ gid <- gid_ size <- size_ mtime <- mtime_ format <- format_ magic devmajor <- devmajor_ devminor <- devminor_ pure (name, mode, uid, gid, size, mtime, typecode, linkname, format, uname, gname, devmajor, devminor, prefix) where header = LBS.toStrict header' name = getString 0 100 header mode_ = getOct 100 8 header uid_ = getOct 108 8 header gid_ = getOct 116 8 header size_ = getOct 124 12 header mtime_ = getOct 136 12 header chksum_ = getOct 148 8 header typecode = getByte 156 header linkname = getString 157 100 header magic = getChars 257 8 header uname = getString 265 32 header gname = getString 297 32 header devmajor_ = getOct 329 8 header devminor_ = getOct 337 8 header prefix = getString 345 155 header -- trailing = getBytes 500 12 header format_ :: 
BS.ByteString -> Either FormatError Format format_ magic | magic == ustarMagic = return UstarFormat | magic == gnuMagic = return GnuFormat | magic == v7Magic = return V7Format | otherwise = Left UnrecognisedTarFormat v7Magic, ustarMagic, gnuMagic :: BS.ByteString v7Magic = BS.Char8.pack "\0\0\0\0\0\0\0\0" ustarMagic = BS.Char8.pack "ustar\NUL00" gnuMagic = BS.Char8.pack "ustar \NUL" correctChecksum :: BS.ByteString -> Int -> Bool correctChecksum header checksum = checksum == checksum' where -- sum of all 512 bytes in the header block, -- treating each byte as an 8-bit unsigned value sumchars = BS.foldl' (\x y -> x + fromIntegral y) 0 -- treating the 8 bytes of chksum as blank characters. checksum' = sumchars (BS.take 148 header) + 256 -- 256 = sumchars (BS.Char8.replicate 8 ' ') + sumchars (BS.drop 156 header) -- * TAR format primitive input {-# SPECIALISE getOct :: Int -> Int -> BS.ByteString -> Either FormatError Int #-} {-# SPECIALISE getOct :: Int -> Int -> BS.ByteString -> Either FormatError Int64 #-} getOct :: (Integral a, Bits a) => Int -> Int -> BS.ByteString -> Either FormatError a getOct off len = parseOct . getBytes off len where -- As a star extension, octal fields can hold a base-256 value if the high -- bit of the initial character is set. The initial character can be: -- 0x80 ==> trailing characters hold a positive base-256 value -- 0xFF ==> trailing characters hold a negative base-256 value -- -- In both cases, there won't be a trailing NUL/space. -- -- GNU tar seems to contain a half-implementation of code that deals with -- extra bits in the first character, but I don't think it works and the -- docs I can find on star seem to suggest that these will always be 0, -- which is what I will assume. 
parseOct s | BS.head s == 128 = return (readBytes (BS.tail s)) | BS.head s == 255 = return (negate (readBytes (BS.tail s))) parseOct s | BS.null stripped = return 0 | otherwise = case readOct stripped of Just x -> return x Nothing -> Left HeaderBadNumericEncoding where stripped = BS.Char8.takeWhile (\c -> c /= '\NUL' && c /= ' ') $ BS.Char8.dropWhile (== ' ') s readBytes :: (Integral a, Bits a) => BS.ByteString -> a readBytes = BS.foldl' (\acc x -> acc `shiftL` 8 + fromIntegral x) 0 getBytes :: Int -> Int -> BS.ByteString -> BS.ByteString getBytes off len = BS.take len . BS.drop off getByte :: Int -> BS.ByteString -> Char getByte off bs = BS.Char8.index bs off getChars :: Int -> Int -> BS.ByteString -> BS.ByteString getChars = getBytes getString :: Int -> Int -> BS.ByteString -> BS.ByteString getString off len = BS.copy . BS.Char8.takeWhile (/='\0') . getBytes off len {-# SPECIALISE readOct :: BS.ByteString -> Maybe Int #-} {-# SPECIALISE readOct :: BS.ByteString -> Maybe Int64 #-} readOct :: Integral n => BS.ByteString -> Maybe n readOct = go 0 0 where go :: Integral n => Int -> n -> BS.ByteString -> Maybe n go !i !n !bs = case BS.uncons bs of Nothing -> if i == 0 then Nothing else Just n Just (w, tl) | w >= 0x30 && w <= 0x39 -> go (i+1) (n * 8 + (fromIntegral w - 0x30)) tl | otherwise -> Nothing tar-0.6.4.0/Codec/Archive/Tar/Types.hs0000644000000000000000000006304007346545000015425 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE DeriveTraversable #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE PackageImports #-} {-# LANGUAGE QuasiQuotes #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE ScopedTypeVariables #-} {-# OPTIONS_HADDOCK hide #-} ----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar.Types -- Copyright : (c) 2007 Bjorn Bringert, -- 2008 Andrea Vezzosi, -- 2008-2009 Duncan Coutts -- 2011 Max Bolingbroke -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- 
Portability : portable -- -- Types to represent the content of @.tar@ archives. -- ----------------------------------------------------------------------------- module Codec.Archive.Tar.Types ( GenEntry(..), Entry, entryPath, GenEntryContent(..), EntryContent, FileSize, Permissions, Ownership(..), EpochTime, TypeCode, DevMajor, DevMinor, Format(..), simpleEntry, longLinkEntry, longSymLinkEntry, fileEntry, symlinkEntry, directoryEntry, ordinaryFilePermissions, symbolicLinkPermission, executableFilePermissions, directoryPermissions, TarPath(..), toTarPath, toTarPath', ToTarPathResult(..), fromTarPath, fromTarPathToPosixPath, fromTarPathToWindowsPath, fromFilePathToNative, LinkTarget(..), toLinkTarget, fromLinkTarget, fromLinkTargetToPosixPath, fromLinkTargetToWindowsPath, fromFilePathToWindowsPath, GenEntries(..), Entries, mapEntries, mapEntriesNoFail, foldEntries, foldlEntries, unfoldEntries, unfoldEntriesM, ) where import Data.Bifunctor (Bifunctor, bimap) import Data.Int (Int64) import Data.List.NonEmpty (NonEmpty(..)) import Data.Monoid (Monoid(..)) import Data.Semigroup as Sem import Data.Typeable import qualified Data.ByteString as BS import qualified Data.ByteString.Char8 as BS.Char8 import qualified Data.ByteString.Lazy as LBS import Control.DeepSeq import Control.Exception (Exception, displayException) import qualified System.FilePath as FilePath.Native ( joinPath, splitDirectories, addTrailingPathSeparator, hasTrailingPathSeparator, pathSeparator, isAbsolute, hasTrailingPathSeparator ) import qualified System.FilePath.Posix as FilePath.Posix ( joinPath, splitPath, splitDirectories, hasTrailingPathSeparator , addTrailingPathSeparator, pathSeparator ) import qualified System.FilePath.Windows as FilePath.Windows ( joinPath, addTrailingPathSeparator, pathSeparator ) import System.Posix.Types ( FileMode ) import "os-string" System.OsString.Posix (PosixString, PosixChar) import qualified "os-string" System.OsString.Posix as PS import Codec.Archive.Tar.PackAscii -- 
| File size in bytes. type FileSize = Int64 -- | The number of seconds since the UNIX epoch. type EpochTime = Int64 -- | Major device number. type DevMajor = Int -- | Minor device number. type DevMinor = Int -- | User-defined tar format expansion. type TypeCode = Char -- | Permissions information for 'GenEntry'. type Permissions = FileMode -- | Polymorphic tar archive entry. High-level interfaces -- commonly work with 'GenEntry' 'FilePath' 'FilePath', -- while low-level ones use 'GenEntry' t'TarPath' t'LinkTarget'. -- -- @since 0.6.0.0 data GenEntry tarPath linkTarget = Entry { -- | The path of the file or directory within the archive. entryTarPath :: !tarPath, -- | The real content of the entry. For 'NormalFile' this includes the -- file data. An entry usually contains a 'NormalFile' or a 'Directory'. entryContent :: !(GenEntryContent linkTarget), -- | File permissions (Unix style file mode). entryPermissions :: {-# UNPACK #-} !Permissions, -- | The user and group to which this file belongs. entryOwnership :: {-# UNPACK #-} !Ownership, -- | The time the file was last modified. entryTime :: {-# UNPACK #-} !EpochTime, -- | The tar format the archive is using. entryFormat :: !Format } deriving ( Eq , Show , Functor -- ^ @since 0.6.4.0 ) -- | @since 0.6.4.0 instance Bifunctor GenEntry where bimap f g e = e { entryTarPath = f (entryTarPath e) , entryContent = fmap g (entryContent e) } -- | Monomorphic tar archive entry, ready for serialization / deserialization. -- type Entry = GenEntry TarPath LinkTarget -- | Low-level function to get a native 'FilePath' of the file or directory -- within the archive, not accounting for long names. It's likely -- that you want to apply 'Codec.Archive.Tar.decodeLongNames' -- and use 'Codec.Archive.Tar.Entry.entryTarPath' afterwards instead of 'entryPath'. -- entryPath :: GenEntry TarPath linkTarget -> FilePath entryPath = fromTarPath . entryTarPath -- | Polymorphic content of a tar archive entry. 
High-level interfaces -- commonly work with 'GenEntryContent' 'FilePath', -- while low-level ones use 'GenEntryContent' t'LinkTarget'. -- -- Portable archives should contain only 'NormalFile' and 'Directory'. -- -- @since 0.6.0.0 data GenEntryContent linkTarget = NormalFile LBS.ByteString {-# UNPACK #-} !FileSize | Directory | SymbolicLink !linkTarget | HardLink !linkTarget | CharacterDevice {-# UNPACK #-} !DevMajor {-# UNPACK #-} !DevMinor | BlockDevice {-# UNPACK #-} !DevMajor {-# UNPACK #-} !DevMinor | NamedPipe | OtherEntryType {-# UNPACK #-} !TypeCode LBS.ByteString {-# UNPACK #-} !FileSize deriving ( Eq , Ord , Show , Functor -- ^ @since 0.6.4.0 ) -- | Monomorphic content of a tar archive entry, -- ready for serialization / deserialization. type EntryContent = GenEntryContent LinkTarget -- | Ownership information for 'GenEntry'. data Ownership = Ownership { -- | The owner user name. Should be set to @\"\"@ if unknown. -- Must not contain non-ASCII characters. ownerName :: String, -- | The owner group name. Should be set to @\"\"@ if unknown. -- Must not contain non-ASCII characters. groupName :: String, -- | Numeric owner user id. Should be set to @0@ if unknown. ownerId :: {-# UNPACK #-} !Int, -- | Numeric owner group id. Should be set to @0@ if unknown. groupId :: {-# UNPACK #-} !Int } deriving (Eq, Ord, Show) -- | There have been a number of extensions to the tar file format over the -- years. They all share the basic entry fields and put more meta-data in -- different extended headers. -- data Format = -- | This is the classic Unix V7 tar format. It does not support owner and -- group names, just numeric Ids. It also does not support device numbers. V7Format -- | The \"USTAR\" format is an extension of the classic V7 format. It was -- later standardised by POSIX. It has some restrictions but is the most -- portable format. 
   | UstarFormat

     -- | The GNU tar implementation also extends the classic V7 format, though
     -- in a slightly different way from the USTAR format. This is the only format
     -- supporting long file names.
   | GnuFormat
  deriving (Eq, Ord, Show)

instance (NFData tarPath, NFData linkTarget) => NFData (GenEntry tarPath linkTarget) where
  rnf (Entry p c _ _ _ _) = rnf p `seq` rnf c

instance NFData linkTarget => NFData (GenEntryContent linkTarget) where
  rnf x = case x of
    NormalFile c _       -> rnf c
    SymbolicLink lnk     -> rnf lnk
    HardLink lnk         -> rnf lnk
    OtherEntryType _ c _ -> rnf c
    _                    -> seq x ()

instance NFData Ownership where
  rnf (Ownership o g _ _) = rnf o `seq` rnf g

-- | @rw-r--r--@ for normal files
ordinaryFilePermissions :: Permissions
ordinaryFilePermissions = 0o0644

-- | @rwxrwxrwx@ for symbolic links
--
-- @since 0.6.0.0
symbolicLinkPermission :: Permissions
symbolicLinkPermission = 0o0777

-- | @rwxr-xr-x@ for executable files
executableFilePermissions :: Permissions
executableFilePermissions = 0o0755

-- | @rwxr-xr-x@ for directories
directoryPermissions :: Permissions
directoryPermissions = 0o0755

-- | An entry with all default values except for the file name and type. It
-- uses the portable USTAR/POSIX format (see 'UstarFormat').
--
-- You can use this as a basis and override specific fields, eg:
--
-- > (simpleEntry name (HardLink target)) { entryTime = mtime }
--
simpleEntry :: tarPath -> GenEntryContent linkTarget -> GenEntry tarPath linkTarget
simpleEntry tarpath content = Entry {
    entryTarPath = tarpath,
    entryContent = content,
    entryPermissions = case content of
      Directory      -> directoryPermissions
      SymbolicLink _ -> symbolicLinkPermission
      _              -> ordinaryFilePermissions,
    entryOwnership = Ownership "" "" 0 0,
    entryTime = 0,
    entryFormat = UstarFormat
  }

-- | A tar entry for a file.
--
-- Entry fields such as file permissions and ownership have default values.
--
-- You can use this as a basis and override specific fields. For example if you
-- need an executable file you could use:
--
-- > (fileEntry name content) { entryPermissions = executableFilePermissions }
--
fileEntry :: tarPath -> LBS.ByteString -> GenEntry tarPath linkTarget
fileEntry name fileContent =
  simpleEntry name (NormalFile fileContent (LBS.length fileContent))

-- | A tar entry for a symbolic link.
symlinkEntry :: tarPath -> linkTarget -> GenEntry tarPath linkTarget
symlinkEntry name targetLink = simpleEntry name (SymbolicLink targetLink)

-- | [GNU extension](https://www.gnu.org/software/tar/manual/html_node/Standard.html)
-- to store a filepath too long to fit into 'Codec.Archive.Tar.Entry.entryTarPath'
-- as 'OtherEntryType' @\'L\'@ with the full filepath as 'entryContent'.
-- The next entry must contain the actual
-- data with truncated 'Codec.Archive.Tar.Entry.entryTarPath'.
--
-- See [What exactly is the GNU tar ././@LongLink "trick"?](https://stackoverflow.com/questions/2078778/what-exactly-is-the-gnu-tar-longlink-trick)
--
-- @since 0.6.0.0
longLinkEntry :: FilePath -> GenEntry TarPath linkTarget
longLinkEntry tarpath = Entry {
    entryTarPath = TarPath [PS.pstr|././@LongLink|] mempty,
    entryContent = OtherEntryType 'L' (LBS.fromStrict $ posixToByteString $ toPosixString tarpath) (fromIntegral $ length tarpath),
    entryPermissions = ordinaryFilePermissions,
    entryOwnership = Ownership "" "" 0 0,
    entryTime = 0,
    entryFormat = GnuFormat
  }

-- | [GNU extension](https://www.gnu.org/software/tar/manual/html_node/Standard.html)
-- to store a link target too long to fit into 'Codec.Archive.Tar.Entry.entryTarPath'
-- as 'OtherEntryType' @\'K\'@ with the full link target as 'entryContent'.
-- The next entry must contain the actual
-- data with truncated 'Codec.Archive.Tar.Entry.entryTarPath'.
-- -- @since 0.6.0.0 longSymLinkEntry :: FilePath -> GenEntry TarPath linkTarget longSymLinkEntry linkTarget = Entry { entryTarPath = TarPath [PS.pstr|././@LongLink|] mempty, entryContent = OtherEntryType 'K' (LBS.fromStrict $ posixToByteString $ toPosixString $ linkTarget) (fromIntegral $ length linkTarget), entryPermissions = ordinaryFilePermissions, entryOwnership = Ownership "" "" 0 0, entryTime = 0, entryFormat = GnuFormat } -- | A tar entry for a directory. -- -- Entry fields such as file permissions and ownership have default values. -- directoryEntry :: tarPath -> GenEntry tarPath linkTarget directoryEntry name = simpleEntry name Directory -- -- * Tar paths -- -- | The classic tar format allowed just 100 characters for the file name. The -- USTAR format extended this with an extra 155 characters, however it uses a -- complex method of splitting the name between the two sections. -- -- Instead of just putting any overflow into the extended area, it uses the -- extended area as a prefix. The aggravating insane bit however is that the -- prefix (if any) must only contain a directory prefix. That is the split -- between the two areas must be on a directory separator boundary. So there is -- no simple calculation to work out if a file name is too long. Instead we -- have to try to find a valid split that makes the name fit in the two areas. -- -- The rationale presumably was to make it a bit more compatible with old tar -- programs that only understand the classic format. A classic tar would be -- able to extract the file name and possibly some dir prefix, but not the -- full dir prefix. So the files would end up in the wrong place, but that's -- probably better than ending up with the wrong names too. -- -- So it's understandable but rather annoying. -- -- * Tar paths use Posix format (ie @\'/\'@ directory separators), irrespective -- of the local path conventions. -- -- * The directory separator between the prefix and name is /not/ stored. 
-- data TarPath = TarPath {-# UNPACK #-} !PosixString -- ^ path name, 100 characters max. {-# UNPACK #-} !PosixString -- ^ path prefix, 155 characters max. deriving (Eq, Ord) instance NFData TarPath where rnf (TarPath _ _) = () -- fully strict by construction instance Show TarPath where show = show . fromTarPath -- | Convert a t'TarPath' to a native 'FilePath'. -- -- The native 'FilePath' will use the native directory separator but it is not -- otherwise checked for validity or sanity. In particular: -- -- * The tar path may be invalid as a native path, eg the file name @\"nul\"@ -- is not valid on Windows. -- -- * The tar path may be an absolute path or may contain @\"..\"@ components. -- For security reasons this should not usually be allowed, but it is your -- responsibility to check for these conditions -- (e.g., using 'Codec.Archive.Tar.Check.checkEntrySecurity'). -- fromTarPath :: TarPath -> FilePath fromTarPath = fromPosixString . fromTarPathInternal (PS.unsafeFromChar FilePath.Native.pathSeparator) -- | Convert a t'TarPath' to a Unix\/Posix 'FilePath'. -- -- The difference compared to 'fromTarPath' is that it always returns a Unix -- style path irrespective of the current operating system. -- -- This is useful to check how a t'TarPath' would be interpreted on a specific -- operating system, eg to perform portability checks. -- fromTarPathToPosixPath :: TarPath -> FilePath fromTarPathToPosixPath = fromPosixString . fromTarPathInternal (PS.unsafeFromChar FilePath.Posix.pathSeparator) -- | Convert a t'TarPath' to a Windows 'FilePath'. -- -- The only difference compared to 'fromTarPath' is that it always returns a -- Windows style path irrespective of the current operating system. -- -- This is useful to check how a t'TarPath' would be interpreted on a specific -- operating system, eg to perform portability checks. -- fromTarPathToWindowsPath :: TarPath -> FilePath fromTarPathToWindowsPath = fromPosixString . 
fromTarPathInternal (PS.unsafeFromChar FilePath.Windows.pathSeparator) fromTarPathInternal :: PosixChar -> TarPath -> PosixString fromTarPathInternal sep = go where posixSep = PS.unsafeFromChar FilePath.Posix.pathSeparator adjustSeps = if sep == posixSep then id else PS.map $ \c -> if c == posixSep then sep else c go (TarPath name prefix) | PS.null prefix = adjustSeps name | PS.null name = adjustSeps prefix | otherwise = adjustSeps prefix <> PS.cons sep (adjustSeps name) {-# INLINE fromTarPathInternal #-} -- | Convert a native 'FilePath' to a t'TarPath'. -- -- The conversion may fail if the 'FilePath' is empty or too long. toTarPath :: Bool -- ^ Is the path for a directory? This is needed because for -- directories a t'TarPath' must always use a trailing @\/@. -> FilePath -> Either String TarPath toTarPath isDir path = case toTarPath' path' of FileNameEmpty -> Left "File name empty" FileNameOK tarPath -> Right tarPath FileNameTooLong{} -> Left $ "File name too long: " ++ path' where path' = if isDir && not (FilePath.Native.hasTrailingPathSeparator path) then path <> [FilePath.Native.pathSeparator] else path -- | Convert a native 'FilePath' to a t'TarPath'. -- Directory paths must always have a trailing @\/@, this is not checked. -- -- @since 0.6.0.0 toTarPath' :: FilePath -> ToTarPathResult toTarPath' = splitLongPath . (if nativeSep == posixSep then id else adjustSeps) where nativeSep = FilePath.Native.pathSeparator posixSep = FilePath.Posix.pathSeparator adjustSeps = map $ \c -> if c == nativeSep then posixSep else c -- | Return type of 'toTarPath''. -- -- @since 0.6.0.0 data ToTarPathResult = FileNameEmpty -- ^ 'FilePath' was empty, but t'TarPath' must be non-empty. | FileNameOK TarPath -- ^ All good, this is just a normal t'TarPath'. | FileNameTooLong TarPath -- ^ 'FilePath' was longer than 255 characters, t'TarPath' contains -- a truncated part only. An actual entry must be preceded by -- 'longLinkEntry'. 
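The 255-character limit in 'FileNameTooLong' comes from the USTAR name/prefix layout described below. A self-contained sketch of the splitting strategy may make it concrete: it counts 'Char's rather than bytes and ignores trailing separators, and `ustarSplit`/`splitSlash` are hypothetical helper names for illustration only, not this package's API (the real implementation is 'splitLongPath').

```haskell
import Data.List (intercalate)

-- Break a POSIX-style path on '/' (a simple stand-in for splitPath).
splitSlash :: String -> [String]
splitSlash s = case break (== '/') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : splitSlash rest

-- Greedily pack trailing path components into the 100-byte name field,
-- then check that the remaining components fit the 155-byte prefix field.
-- The '/' between prefix and name is implied by the format, so not counted.
ustarSplit :: String -> Maybe (String, String) -- Just (prefix, name)
ustarSplit path = go [] (reverse (splitSlash path))
  where
    joined = intercalate "/"
    go name (c : cs)
      | length (joined (c : name)) <= 100 = go (c : name) cs
    go name rest
      | null name            = Nothing  -- even the last component is too long
      | length prefix <= 155 = Just (prefix, joined name)
      | otherwise            = Nothing  -- no split on a '/' boundary fits
      where
        prefix = joined (reverse rest)

main :: IO ()
main = do
  print (ustarSplit "a/b/c.txt")                  -- Just ("","a/b/c.txt")
  print (ustarSplit (replicate 120 'x' ++ "/f"))  -- long dir lands in the prefix
  print (ustarSplit (replicate 101 'y'))          -- Nothing
```

Note how a long leading directory ends up whole in the prefix field: there is no way to split mid-component, which is why no simple length check suffices.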
-- | Take a sanitised path, split on directory separators and try to pack it
-- into the 155 + 100 tar file name format.
--
-- The strategy is this: take the name-directory components in reverse order
-- and try to fit as many components into the 100 long name area as possible.
-- If all the remaining components fit in the 155 long prefix area then we win.
splitLongPath :: FilePath -> ToTarPathResult
splitLongPath path = case reverse (FilePath.Posix.splitPath path) of
    [] -> FileNameEmpty
    c : cs -> case packName nameMax (c :| cs) of
      Nothing ->
        FileNameTooLong $ TarPath (toPosixString $ take 100 path) mempty
      Just (name, []) ->
        FileNameOK $! TarPath (toPosixString name) mempty
      Just (name, first:rest) -> case packName prefixMax remainder of
          Nothing ->
            FileNameTooLong $ TarPath (toPosixString $ take 100 path) mempty
          Just (_ , _:_) ->
            FileNameTooLong $ TarPath (toPosixString $ take 100 path) mempty
          Just (prefix, []) ->
            FileNameOK $! TarPath (toPosixString name) (toPosixString prefix)
        where
          -- drop the '/' between the name and prefix:
          remainder = init first :| rest
  where
    nameMax, prefixMax :: Int
    nameMax = 100
    prefixMax = 155

    packName :: Int -> NonEmpty FilePath -> Maybe (FilePath, [FilePath])
    packName maxLen (c :| cs)
      | n > maxLen = Nothing
      | otherwise  = Just (packName' maxLen n [c] cs)
      where
        n = length c

    packName' :: Int -> Int -> [FilePath] -> [FilePath] -> (FilePath, [FilePath])
    packName' maxLen n ok (c:cs)
      | n' <= maxLen = packName' maxLen n' (c:ok) cs
      where
        n' = n + length c
    packName' _ _ ok cs = (FilePath.Posix.joinPath ok, cs)

-- | The tar format allows just 100 ASCII characters for the 'SymbolicLink' and
-- 'HardLink' entry types.
--
newtype LinkTarget = LinkTarget PosixString
  deriving (Eq, Ord, Show)

instance NFData LinkTarget where
  rnf (LinkTarget bs) = rnf bs

-- | Convert a native 'FilePath' to a tar t'LinkTarget'.
-- The conversion may fail if the
-- string is longer than 100 characters or if it contains non-portable
-- characters.
toLinkTarget :: FilePath -> Maybe LinkTarget toLinkTarget path | length path <= 100 = do target <- toLinkTarget' path Just $! LinkTarget (toPosixString target) | otherwise = Nothing data LinkTargetException = IsAbsolute FilePath | TooLong FilePath deriving (Show,Typeable) instance Exception LinkTargetException where displayException (IsAbsolute fp) = "Link target \"" <> fp <> "\" is unexpectedly absolute" displayException (TooLong _) = "The link target is too long" -- | Convert a native 'FilePath' to a unix filepath suitable for -- using as t'LinkTarget'. Does not error if longer than 100 characters. toLinkTarget' :: FilePath -> Maybe FilePath toLinkTarget' path | FilePath.Native.isAbsolute path = Nothing | otherwise = Just $ adjustDirectory $ FilePath.Posix.joinPath $ FilePath.Native.splitDirectories path where adjustDirectory | FilePath.Native.hasTrailingPathSeparator path = FilePath.Posix.addTrailingPathSeparator | otherwise = id -- | Convert a tar t'LinkTarget' to a native 'FilePath'. fromLinkTarget :: LinkTarget -> FilePath fromLinkTarget (LinkTarget pathbs) = fromFilePathToNative $ fromPosixString pathbs -- | Convert a tar t'LinkTarget' to a Unix\/POSIX 'FilePath' (@\'/\'@ path separators). fromLinkTargetToPosixPath :: LinkTarget -> FilePath fromLinkTargetToPosixPath (LinkTarget pathbs) = fromPosixString pathbs -- | Convert a tar t'LinkTarget' to a Windows 'FilePath' (@\'\\\\\'@ path separators). fromLinkTargetToWindowsPath :: LinkTarget -> FilePath fromLinkTargetToWindowsPath (LinkTarget pathbs) = fromFilePathToWindowsPath $ fromPosixString pathbs -- | Convert a unix FilePath to a native 'FilePath'. fromFilePathToNative :: FilePath -> FilePath fromFilePathToNative = fromFilePathInternal FilePath.Posix.pathSeparator FilePath.Native.pathSeparator -- | Convert a unix FilePath to a Windows 'FilePath'. 
fromFilePathToWindowsPath :: FilePath -> FilePath fromFilePathToWindowsPath = fromFilePathInternal FilePath.Posix.pathSeparator FilePath.Windows.pathSeparator fromFilePathInternal :: Char -> Char -> FilePath -> FilePath fromFilePathInternal fromSep toSep = adjustSeps where adjustSeps = if fromSep == toSep then id else map $ \c -> if c == fromSep then toSep else c {-# INLINE fromFilePathInternal #-} -- -- * Entries type -- -- | Polymorphic sequence of archive entries. -- High-level interfaces -- commonly work with 'GenEntries' 'FilePath' 'FilePath', -- while low-level ones use 'GenEntries' t'TarPath' t'LinkTarget'. -- -- The point of this type as opposed to just using a list is that it makes the -- failure case explicit. We need this because the sequence of entries we get -- from reading a tarball can include errors. -- -- Converting from a list can be done with just @foldr Next Done@. Converting -- back into a list can be done with 'foldEntries' however in that case you -- must be prepared to handle the 'Fail' case inherent in the 'Entries' type. -- -- The 'Monoid' instance lets you concatenate archives or append entries to an -- archive. -- -- @since 0.6.0.0 data GenEntries tarPath linkTarget e = Next (GenEntry tarPath linkTarget) (GenEntries tarPath linkTarget e) | Done | Fail e deriving ( Eq , Show , Functor , Foldable -- ^ @since 0.6.0.0 , Traversable -- ^ @since 0.6.0.0 ) infixr 5 `Next` -- | Monomorphic sequence of archive entries, -- ready for serialization / deserialization. type Entries e = GenEntries TarPath LinkTarget e -- | This is like the standard 'Data.List.unfoldr' function on lists, but for 'Entries'. -- It includes failure as an extra possibility that the stepper function may -- return. -- -- It can be used to generate 'Entries' from some other type. For example it is -- used internally to lazily unfold entries from a 'LBS.ByteString'. 
--
unfoldEntries
  :: (a -> Either e (Maybe (GenEntry tarPath linkTarget, a)))
  -> a
  -> GenEntries tarPath linkTarget e
unfoldEntries f = unfold
  where
    unfold x = case f x of
      Left err             -> Fail err
      Right Nothing        -> Done
      Right (Just (e, x')) -> Next e (unfold x')

unfoldEntriesM
  :: Monad m
  => (forall a. m a -> m a) -- ^ id or unsafeInterleaveIO
  -> m (Either e (Maybe (GenEntry tarPath linkTarget)))
  -> m (GenEntries tarPath linkTarget e)
unfoldEntriesM interleave f = unfold
  where
    unfold = do
      f' <- f
      case f' of
        Left err       -> pure $ Fail err
        Right Nothing  -> pure Done
        Right (Just e) -> Next e <$> interleave unfold

-- | This is like the standard 'Data.List.foldr' function on lists, but for 'Entries'.
-- Compared to 'Data.List.foldr' it takes an extra function to account for the
-- possibility of failure.
--
-- This is used to consume a sequence of entries. For example it could be used
-- to scan a tarball for problems or to collect an index of the contents.
--
foldEntries
  :: (GenEntry tarPath linkTarget -> a -> a)
  -> a
  -> (e -> a)
  -> GenEntries tarPath linkTarget e
  -> a
foldEntries next done fail' = fold
  where
    fold (Next e es) = next e (fold es)
    fold Done        = done
    fold (Fail err)  = fail' err

-- | A 'Data.List.foldl'-like function on Entries. It either returns the final
-- accumulator result, or the failure along with the intermediate accumulator
-- value.
--
foldlEntries
  :: (a -> GenEntry tarPath linkTarget -> a)
  -> a
  -> GenEntries tarPath linkTarget e
  -> Either (e, a) a
foldlEntries f = go
  where
    go !acc (Next e es) = go (f acc e) es
    go !acc Done        = Right acc
    go !acc (Fail err)  = Left (err, acc)

-- | This is like the standard 'Data.List.map' function on lists, but for 'Entries'. It
-- includes failure as an extra possible outcome of the mapping function.
-- -- If your mapping function cannot fail it may be more convenient to use -- 'mapEntriesNoFail' mapEntries :: (GenEntry tarPath linkTarget -> Either e' (GenEntry tarPath linkTarget)) -- ^ Function to apply to each entry -> GenEntries tarPath linkTarget e -- ^ Input sequence -> GenEntries tarPath linkTarget (Either e e') mapEntries f = foldEntries (\entry rest -> either (Fail . Right) (`Next` rest) (f entry)) Done (Fail . Left) -- | Like 'mapEntries' but the mapping function itself cannot fail. -- mapEntriesNoFail :: (GenEntry tarPath linkTarget -> GenEntry tarPath linkTarget) -> GenEntries tarPath linkTarget e -> GenEntries tarPath linkTarget e mapEntriesNoFail f = foldEntries (Next . f) Done Fail -- | @since 0.5.1.0 instance Sem.Semigroup (GenEntries tarPath linkTarget e) where a <> b = foldEntries Next b Fail a instance Monoid (GenEntries tarPath linkTarget e) where mempty = Done mappend = (Sem.<>) instance (NFData tarPath, NFData linkTarget, NFData e) => NFData (GenEntries tarPath linkTarget e) where rnf (Next e es) = rnf e `seq` rnf es rnf Done = () rnf (Fail e) = rnf e tar-0.6.4.0/Codec/Archive/Tar/Unpack.hs0000644000000000000000000003022207346545000015536 0ustar0000000000000000{-# LANGUAGE LambdaCase #-} {-# LANGUAGE ViewPatterns #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE RankNTypes #-} {-# OPTIONS_HADDOCK hide #-} {-# OPTIONS_GHC -Wno-unrecognised-pragmas #-} {-# HLINT ignore "Use for_" #-} {-# HLINT ignore "Avoid restricted function" #-} ----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar -- Copyright : (c) 2007 Bjorn Bringert, -- 2008 Andrea Vezzosi, -- 2008-2009, 2012, 2016 Duncan Coutts -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- Portability : portable -- ----------------------------------------------------------------------------- module Codec.Archive.Tar.Unpack ( unpack, unpackAndCheck, ) where import Codec.Archive.Tar.Types import Codec.Archive.Tar.Check 
import Codec.Archive.Tar.LongNames
import Codec.Archive.Tar.PackAscii (filePathToOsPath)
import Data.Bits ( testBit )
import Data.List (partition, nub)
import Data.Maybe ( fromMaybe )
import qualified Data.ByteString.Char8 as Char8
import qualified Data.ByteString.Lazy as BS
import Prelude hiding (writeFile)
import System.File.OsPath
import System.OsPath ( OsPath, (</>) )
import qualified System.OsPath as FilePath.Native ( takeDirectory )
import System.Directory.OsPath
         ( createDirectoryIfMissing, copyFile, setPermissions, listDirectory,
           doesDirectoryExist, createDirectoryLink, createFileLink,
           setModificationTime, emptyPermissions, setOwnerReadable,
           setOwnerWritable, setOwnerExecutable, setOwnerSearchable )
import Control.Exception ( Exception, throwIO, handle )
import System.IO ( stderr, hPutStr )
import System.IO.Error ( ioeGetErrorType, isPermissionError )
import GHC.IO (unsafeInterleaveIO)
import Data.Foldable (traverse_)
import GHC.IO.Exception (IOErrorType(InappropriateType, IllegalOperation, PermissionDenied, InvalidArgument))
import Data.Time.Clock.POSIX ( posixSecondsToUTCTime )
import Control.Exception as Exception ( catch, SomeException(..) )

-- | Create local files and directories based on the entries of a tar archive.
--
-- This is a portable implementation of unpacking suitable for portable
-- archives. It handles 'NormalFile' and 'Directory' entries and has simulated
-- support for 'SymbolicLink' and 'HardLink' entries: hard links are unpacked
-- by copying the target file, and symbolic links are recreated where the
-- platform allows, falling back to copying. This therefore works on Windows
-- as well as Unix. All other entry types are ignored, that is they are not
-- unpacked and no exception is raised.
--
-- If the 'Entries' ends in an error then it is raised as an exception. Any
-- files or directories that have been unpacked before the error was
-- encountered will not be deleted. For this reason you may want to unpack
-- into an empty directory so that you can easily clean up if unpacking fails
-- part-way.
--
-- On its own, this function only checks for security (using 'checkEntrySecurity').
-- Use 'unpackAndCheck' if you need more checks.
--
unpack :: Exception e
       => FilePath  -- ^ Base directory
       -> Entries e -- ^ Entries to unpack
       -> IO ()
unpack = unpackAndCheck (fmap SomeException . checkEntrySecurity)

-- | Like 'Codec.Archive.Tar.unpack', but run custom sanity/security checks
-- instead of 'checkEntrySecurity'. For example,
--
-- > import Control.Exception (SomeException(..))
-- > import Control.Applicative ((<|>))
-- >
-- > unpackAndCheck (\x -> SomeException <$> checkEntryPortability x
-- >                   <|> SomeException <$> checkEntrySecurity x) dir entries
--
-- @since 0.6.0.0
unpackAndCheck :: Exception e
               => (GenEntry FilePath FilePath -> Maybe SomeException)
                  -- ^ Checks to run on each entry before unpacking
               -> FilePath  -- ^ Base directory
               -> Entries e -- ^ Entries to unpack
               -> IO ()
unpackAndCheck secCB (filePathToOsPath -> baseDir) entries = do
  let resolvedEntries = decodeLongNames entries
  uEntries <- unpackEntries [] resolvedEntries
  let (hardlinks, symlinks) = partition (\(_, _, x) -> x) uEntries
  -- handle hardlinks first, in case a symlink points to one
  handleHardLinks hardlinks
  handleSymlinks symlinks

  where
    -- We're relying here on 'secCB' to make sure we're not scribbling
    -- files all over the place.

    unpackEntries :: Exception e
                  => [(OsPath, OsPath, Bool)]
                     -- ^ links (path, link, isHardLink)
                  -> GenEntries FilePath FilePath (Either e DecodeLongNamesError)
                     -- ^ entries
                  -> IO [(OsPath, OsPath, Bool)]
    unpackEntries _     (Fail err)      = either throwIO throwIO err
    unpackEntries links Done            = return links
    unpackEntries links (Next entry es) = do
      case secCB entry of
        Nothing -> pure ()
        Just e  -> throwIO e

      case entryContent entry of
        NormalFile file _ -> do
          extractFile (entryPermissions entry) (entryTarPath entry) file (entryTime entry)
          unpackEntries links es
        Directory -> do
          extractDir (entryTarPath entry) (entryTime entry)
          unpackEntries links es
        HardLink link ->
          (unpackEntries $! saveLink True (entryTarPath entry) link links) es
        SymbolicLink link ->
          (unpackEntries $! saveLink False (entryTarPath entry) link links) es
        OtherEntryType{} ->
          -- the spec demands that we attempt to extract as normal file on unknown typecode,
          -- but we just skip it
          unpackEntries links es
        CharacterDevice{} -> unpackEntries links es
        BlockDevice{}     -> unpackEntries links es
        NamedPipe         -> unpackEntries links es

    extractFile :: Permissions -> FilePath -> BS.ByteString -> EpochTime -> IO ()
    extractFile permissions (filePathToNativeOsPath -> path) content mtime = do
      -- Note that tar archives do not make sure each directory is created
      -- before files they contain, indeed we may have to create several
      -- levels of directory.
      createDirectoryIfMissing True absDir
      writeFile absPath content
      setOwnerPermissions absPath permissions
      setModTime absPath mtime
      where
        absDir  = baseDir </> FilePath.Native.takeDirectory path
        absPath = baseDir </> path

    extractDir :: FilePath -> EpochTime -> IO ()
    extractDir (filePathToNativeOsPath -> path) mtime = do
      createDirectoryIfMissing True absPath
      setModTime absPath mtime
      where
        absPath = baseDir </> path

    saveLink :: t -> FilePath -> FilePath -> [(OsPath, OsPath, t)] -> [(OsPath, OsPath, t)]
    saveLink isHardLink (filePathToNativeOsPath -> path) (filePathToNativeOsPath -> link) =
      path `seq` link `seq` ((path, link, isHardLink) :)

    -- for hardlinks, we just copy
    handleHardLinks :: [(OsPath, OsPath, t)] -> IO ()
    handleHardLinks = mapM_ $ \(relPath, relLinkTarget, _) ->
      let absPath   = baseDir </> relPath
          -- hard link targets are always "absolute" paths in
          -- the context of the tar root
          absTarget = baseDir </> relLinkTarget
      -- we don't expect races here, since we should be the
      -- only process unpacking the tar archive and writing to
      -- the destination
      in doesDirectoryExist absTarget >>= \case
           True  -> copyDirectoryRecursive absTarget absPath
           False -> copyFile absTarget absPath

    -- For symlinks, we first try to recreate them and if that fails
    -- with 'IllegalOperation', 'PermissionDenied' or 'InvalidArgument',
    -- we fall back to copying.
    -- This error handling isn't too fine grained and maybe should be
    -- platform specific, but this way it might catch errors on Unix even on
    -- FAT32 fuse mounted volumes.
    handleSymlinks :: [(OsPath, OsPath, c)] -> IO ()
    handleSymlinks = mapM_ $ \(relPath, relLinkTarget, _) ->
      let absPath   = baseDir </> relPath
          -- symlink targets are resolved relative to the directory
          -- containing the link
          absTarget = FilePath.Native.takeDirectory absPath </> relLinkTarget
      -- we don't expect races here, since we should be the
      -- only process unpacking the tar archive and writing to
      -- the destination
      in doesDirectoryExist absTarget >>= \case
           True  -> handleSymlinkError (copyDirectoryRecursive absTarget absPath)
                      $ createDirectoryLink relLinkTarget absPath
           False -> handleSymlinkError (copyFile absTarget absPath)
                      $ createFileLink relLinkTarget absPath

      where
        handleSymlinkError action =
          handle (\e -> if ioeGetErrorType e `elem` [ IllegalOperation
                                                    , PermissionDenied
                                                    , InvalidArgument ]
                        then action
                        else throwIO e)

filePathToNativeOsPath :: FilePath -> OsPath
filePathToNativeOsPath = filePathToOsPath . fromFilePathToNative

-- | Recursively copy the contents of one directory to another path.
--
-- This is a rip-off of Cabal library.
copyDirectoryRecursive :: OsPath -> OsPath -> IO ()
copyDirectoryRecursive srcDir destDir = do
  srcFiles <- getDirectoryContentsRecursive srcDir
  copyFilesWith copyFile destDir [ (srcDir, f) | f <- srcFiles ]
  where
    -- | Common implementation of 'copyFiles', 'installOrdinaryFiles',
    -- 'installExecutableFiles' and 'installMaybeExecutableFiles'.
    copyFilesWith :: (OsPath -> OsPath -> IO ())
                  -> OsPath -> [(OsPath, OsPath)] -> IO ()
    copyFilesWith doCopy targetDir srcFiles = do
      -- Create parent directories for everything
      let dirs = map (targetDir </>) . nub . map (FilePath.Native.takeDirectory . snd) $ srcFiles
      traverse_ (createDirectoryIfMissing True) dirs

      -- Copy all the files
      sequence_ [ let src  = srcBase   </> srcFile
                      dest = targetDir </> srcFile
                  in doCopy src dest
                | (srcBase, srcFile) <- srcFiles ]

-- | List all the files in a directory and all subdirectories.
--
-- The order places files in sub-directories after all the files in their
-- parent directories. The list is generated lazily so is not well defined if
-- the source directory structure changes before the list is used.
--
getDirectoryContentsRecursive :: OsPath -> IO [OsPath]
getDirectoryContentsRecursive topdir = recurseDirectories [mempty]
  where
    recurseDirectories :: [OsPath] -> IO [OsPath]
    recurseDirectories []           = return []
    recurseDirectories (dir : dirs) = unsafeInterleaveIO $ do
      (files, dirs') <- collect [] [] =<< listDirectory (topdir </> dir)
      files' <- recurseDirectories (dirs' ++ dirs)
      return (files ++ files')
      where
        collect files dirs' []                = return (reverse files, reverse dirs')
        collect files dirs' (entry : entries) = do
          let dirEntry = dir </> entry
          isDirectory <- doesDirectoryExist (topdir </> dirEntry)
          if isDirectory
            then collect files (dirEntry : dirs') entries
            else collect (dirEntry : files) dirs' entries

setModTime :: OsPath -> EpochTime -> IO ()
setModTime path t =
  setModificationTime path (posixSecondsToUTCTime (fromIntegral t))
    `Exception.catch` \e -> case ioeGetErrorType e of
      PermissionDenied -> return ()
      -- On FAT32 file system setting time prior to DOS Epoch (1980-01-01)
      -- throws InvalidArgument, https://github.com/haskell/tar/issues/37
      InvalidArgument -> return ()
      _ -> throwIO e

setOwnerPermissions :: OsPath -> Permissions -> IO ()
setOwnerPermissions path permissions =
  setPermissions path ownerPermissions
  where
    -- | Info on Permission bits can be found here:
    -- https://www.gnu.org/software/libc/manual/html_node/Permission-Bits.html
    ownerPermissions =
        setOwnerReadable   (testBit permissions 8)
      $ setOwnerWritable   (testBit permissions 7)
      $ setOwnerExecutable (testBit permissions 6)
      $ setOwnerSearchable (testBit permissions 6)
        emptyPermissions

{-# LANGUAGE PackageImports #-}
{-# OPTIONS_HADDOCK hide #-}
-----------------------------------------------------------------------------
-- |
-- Module      :  Codec.Archive.Tar.Write
-- Copyright   :  (c) 2007 Bjorn Bringert,
--                    2008 Andrea Vezzosi,
--                    2008-2009 Duncan Coutts
-- License     :  BSD3
--
-- Maintainer  :  duncan@community.haskell.org
-- Portability :  portable
--
-----------------------------------------------------------------------------
module Codec.Archive.Tar.Write (write) where

import Codec.Archive.Tar.PackAscii
import Codec.Archive.Tar.Types

import Data.Bits
import Data.Char (chr,ord)
import Data.Int
import Data.List (foldl')
import Data.Monoid (mempty)
import Numeric (showOct)

import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BS.Char8
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Lazy.Char8 as LBS.Char8
import "os-string" System.OsString.Posix (PosixString)
import qualified "os-string" System.OsString.Posix as PS

-- | Create the external representation of a tar archive by serialising a list
-- of tar entries.
--
-- * The conversion is done lazily.
-- write :: [Entry] -> LBS.ByteString write es = LBS.concat $ map putEntry es ++ [LBS.replicate (512*2) 0] putEntry :: Entry -> LBS.ByteString putEntry entry = case entryContent entry of NormalFile content size -- size field is 12 bytes long, so in octal format (see 'putOct') -- it can hold numbers up to 8Gb | size >= 1 `shiftL` (3 * (12 -1)) , entryFormat entry == V7Format -> error "putEntry: support for files over 8Gb is a Ustar extension" | otherwise -> LBS.concat [ header, content, padding size ] OtherEntryType 'K' _ _ | entryFormat entry /= GnuFormat -> error "putEntry: long symlink support is a GNU extension" OtherEntryType 'L' _ _ | entryFormat entry /= GnuFormat -> error "putEntry: long filename support is a GNU extension" OtherEntryType _ content size -> LBS.concat [ header, content, padding size ] _ -> header where header = putHeader entry padding size = LBS.replicate paddingSize 0 where paddingSize = fromIntegral (negate size `mod` 512) putHeader :: Entry -> LBS.ByteString putHeader entry = LBS.fromStrict $ BS.take 148 block <> putOct 7 checksum <> BS.cons 0x20 (BS.drop 156 block) where block = putHeaderNoChkSum entry checksum :: Int checksum = BS.foldl' (\x y -> x + fromIntegral y) 0 block putHeaderNoChkSum :: Entry -> BS.ByteString putHeaderNoChkSum Entry { entryTarPath = TarPath name prefix, entryContent = content, entryPermissions = permissions, entryOwnership = ownership, entryTime = modTime, entryFormat = format } = BS.concat [ putPosixString 100 name , putOct 8 permissions , putOct 8 $ ownerId ownership , putOct 8 $ groupId ownership , numField 12 contentSize , putOct 12 modTime , BS.replicate 8 0x20 -- dummy checksum , putChar8 typeCode , putPosixString 100 linkTarget ] <> case format of V7Format -> BS.replicate 255 0x00 UstarFormat -> BS.concat [ putBString 8 ustarMagic , putString 32 $ ownerName ownership , putString 32 $ groupName ownership , putOct 8 deviceMajor , putOct 8 deviceMinor , putPosixString 155 prefix , BS.replicate 12 0x00 ] 
GnuFormat -> BS.concat [ putBString 8 gnuMagic , putString 32 $ ownerName ownership , putString 32 $ groupName ownership , putGnuDev 8 deviceMajor , putGnuDev 8 deviceMinor , putPosixString 155 prefix , BS.replicate 12 0x00 ] where numField :: FieldWidth -> Int64 -> BS.ByteString numField w n | n >= 0 && n < 1 `shiftL` (3 * (w - 1)) = putOct w n | otherwise = putLarge w n (typeCode, contentSize, linkTarget, deviceMajor, deviceMinor) = case content of NormalFile _ size -> ('0' , size, mempty, 0, 0) Directory -> ('5' , 0, mempty, 0, 0) SymbolicLink (LinkTarget link) -> ('2' , 0, link, 0, 0) HardLink (LinkTarget link) -> ('1' , 0, link, 0, 0) CharacterDevice major minor -> ('3' , 0, mempty, major, minor) BlockDevice major minor -> ('4' , 0, mempty, major, minor) NamedPipe -> ('6' , 0, mempty, 0, 0) OtherEntryType code _ size -> (code, size, mempty, 0, 0) putGnuDev w n = case content of CharacterDevice _ _ -> putOct w n BlockDevice _ _ -> putOct w n _ -> BS.replicate w 0x00 ustarMagic, gnuMagic :: BS.ByteString ustarMagic = BS.pack [0x75, 0x73, 0x74, 0x61, 0x72, 0x00, 0x30, 0x30] -- ustar\NUL00 gnuMagic = BS.pack [0x75, 0x73, 0x74, 0x61, 0x72, 0x20, 0x20, 0x00] -- ustar \NUL -- * TAR format primitive output type FieldWidth = Int putBString :: FieldWidth -> BS.ByteString -> BS.ByteString putBString n s = BS.take n s <> BS.replicate (n - BS.length s) 0x00 putPosixString :: FieldWidth -> PosixString -> BS.ByteString putPosixString n s = posixToByteString (PS.take n s) <> BS.replicate (n - PS.length s) 0x00 putString :: FieldWidth -> String -> BS.ByteString putString n s = BS.take n (packAscii s) <> BS.replicate (n - length s) 0x00 {-# SPECIALISE putLarge :: FieldWidth -> Int64 -> BS.ByteString #-} putLarge :: (Bits a, Integral a) => FieldWidth -> a -> BS.ByteString putLarge n0 x0 = BS.Char8.pack $ '\x80' : reverse (go (n0-1) x0) where go 0 _ = [] go n x = chr (fromIntegral (x .&. 
0xff)) : go (n-1) (x `shiftR` 8) putOct :: (Integral a, Show a) => FieldWidth -> a -> BS.ByteString putOct n x = let octStr = BS.take (n-1) $ BS.Char8.pack $ showOct x "" in BS.replicate (n - BS.length octStr - 1) 0x30 <> octStr <> BS.singleton 0x00 putChar8 :: Char -> BS.ByteString putChar8 = BS.Char8.singleton tar-0.6.4.0/LICENSE0000644000000000000000000000305507346545000011646 0ustar0000000000000000Copyright (c) 2007 Björn Bringert, 2008-2015 Duncan Coutts, 2011 Max Bolingbroke All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: - Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. - Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. - Neither the names of the copyright owners nor the names of the contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
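The `putOct` encoder in `Codec.Archive.Tar.Write` above is easy to exercise on its own: a value is rendered as at most (width - 1) octal digits, left-padded with ASCII '0' (0x30) and terminated by a NUL byte. The sketch below reproduces it as a self-contained program (the `main` driver and the sample field widths are illustrative, not part of the library):

```haskell
-- Standalone sketch of the tar octal field encoding from putOct above.
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BS.Char8
import Numeric (showOct)

putOct :: (Integral a, Show a) => Int -> a -> ByteString
putOct n x =
  let octStr = BS.take (n - 1) $ BS.Char8.pack $ showOct x ""
  in BS.replicate (n - BS.length octStr - 1) 0x30  -- pad with ASCII '0'
     <> octStr
     <> BS.singleton 0x00                          -- NUL terminator

main :: IO ()
main = do
  -- the 8-byte mode field for permissions 0o777 (decimal 511)
  print (putOct 8 (511 :: Int))
  -- the 12-byte size field for a 1 KiB file
  print (putOct 12 (1024 :: Int))
```

Because the 12-byte size field holds at most 11 octal digits, this scheme caps out at 8 GiB, which is why larger values fall back to the base-256 representation in `putLarge` (tagged with a leading 0x80 byte).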
tar-0.6.4.0/README.md0000644000000000000000000000126407346545000012120 0ustar0000000000000000# tar [![Hackage](https://img.shields.io/hackage/v/tar.svg)](https://hackage.haskell.org/package/tar) This library is for working with `.tar` archive files. It can read and write a range of common variations of archive format including V7, POSIX USTAR and GNU formats. It provides support for packing and unpacking portable archives and features for random access to archive content using an index. For a quick start with the API look at `htar/htar.hs`, which implements a very basic `tar` command-line tool. To run benchmarks download [`01-index.tar`](https://hackage.haskell.org/01-index.tar) into the package folder: ```sh wget https://hackage.haskell.org/01-index.tar cabal bench ``` tar-0.6.4.0/Setup.lhs0000644000000000000000000000010307346545000012440 0ustar0000000000000000> import Distribution.Simple > main :: IO () > main = defaultMain tar-0.6.4.0/bench/0000755000000000000000000000000007346545000011715 5ustar0000000000000000tar-0.6.4.0/bench/Main.hs0000644000000000000000000000341107346545000013134 0ustar0000000000000000module Main where import qualified Codec.Archive.Tar as Tar import qualified Codec.Archive.Tar.Index as TarIndex import qualified Data.ByteString.Lazy as BS import Data.Maybe import Control.Exception import System.Directory import System.Environment import System.IO.Temp import Test.Tasty.Bench main = defaultMain benchmarks benchmarks :: [Benchmark] benchmarks = [ env loadTarFile $ \tarfile -> bench "read" (nf Tar.read tarfile) , env loadTarEntriesList $ \entries -> bench "write" (nf Tar.write entries) , env loadTarEntries $ \entries -> bench "index build" (nf TarIndex.build entries) , env loadTarIndex $ \entries -> bench "index rebuild" (nf (TarIndex.finalise . 
TarIndex.unfinalise) entries) , env loadTarEntries $ \entries -> bench "unpack" (nfIO $ withSystemTempDirectory "tar-bench" $ \baseDir -> Tar.unpack baseDir entries) , env (fmap TarIndex.serialise loadTarIndex) $ \tarfile -> bench "deserialise index" (nf TarIndex.deserialise tarfile) ] loadTarFile :: IO BS.ByteString loadTarFile = do mTarFile <- lookupEnv "TAR_TEST_FILE" let tarFile = fromMaybe "01-index.tar" mTarFile exists <- doesFileExist tarFile if exists then BS.readFile tarFile else case mTarFile of Just _ -> error $ tarFile <> " does not exist" Nothing -> error "01-index.tar does not exist, copy it from ~/.cabal/packages/hackage.haskell.org/01-index.tar" loadTarEntries :: IO (Tar.Entries Tar.FormatError) loadTarEntries = fmap Tar.read loadTarFile loadTarEntriesList :: IO [Tar.Entry] loadTarEntriesList = fmap (Tar.foldEntries (:) [] throw) loadTarEntries loadTarIndex :: IO TarIndex.TarIndex loadTarIndex = fmap (either throw id . TarIndex.build) loadTarEntries tar-0.6.4.0/changelog.md0000644000000000000000000001460307346545000013113 0ustar0000000000000000## 0.6.4.0 Bodigrim January 2025 * Migrate internals of packing / unpacking to `OsPath`. * Use `getDirectoryContentsRecursive` from `directory-ospath-streaming`. ## 0.6.3.0 Bodigrim June 2024 * [Speed up `deserialize`](https://github.com/haskell/tar/pull/95). ## 0.6.2.0 Bodigrim March 2024 * Fix issues with Unicode support in filenames. ## 0.6.1.0 Bodigrim January 2024 * Support Unicode in filenames (encoded as UTF-8). * Reduce peak memory consumption when unpacking large files. ## 0.6.0.0 Bodigrim December 2023 This release features support for long file paths and symlinks (thanks to Julian Ospald) and a variety of changes and improvements across the entire package, fixing multiple causes of silent data corruption. Breaking changes: * Generalize `Entries`, `Entry` and `EntryContent` to `GenEntries`, `GenEntry` and `GenEntryContent`. 
* Functions working on entries have been generalized to more polymorphic types, where possible. * Modules which used to `import Codec.Archive.Tar (Entry(..))` should now `import Codec.Archive.Tar (Entry, pattern Entry)` and similar for other `Gen`-types. Another option is to import the entire module qualified. * Redesign `Codec.Archive.Tar.Check`. * Change types of `checkSecurity`, `checkTarbomb`, `checkPortability`. * Add offending path as new field to `TarBombError` constructor. * Extend `FileNameError` with `UnsafeLinkTarget` constructor. * Drop deprecated `emptyIndex` and `finaliseIndex`. Examples of migration: * [`hackage-security`](https://github.com/haskell/hackage-security/commit/24693ce115c9769fe3c6ec9ca1d137d14d0d27ff) * [`archive-backpack`](https://github.com/vmchale/archive-backpack/commit/4b3d1bdff15fcf044d6171ca649a930c775d491b) * [`keter`](https://github.com/snoyberg/keter/commit/20a33d9276d5781ca6993b857d8d097085983ede) * [`libarchive`](https://github.com/vmchale/libarchive/commit/c0e101fede924a6e12f1d726587626c48444e65d) * [`cabal-install`](https://github.com/haskell/cabal/commit/51e6483f95ecb4f395dce36e47af296902a75143) * [`ghcup`](https://github.com/haskell/ghcup-hs/commit/6ae312c1f9dd054546e4afe4c969c37cd54b09a9) * [`hackage-server`](https://github.com/haskell/hackage-server/commit/6b71d1659500aba50b6a1e48aa53039046720af8) * [`hedgehog-extras`](https://github.com/input-output-hk/hedgehog-extras/commit/1d4468ce4e74e7a4b3c1fec5c1b21360051a3e72) Bug fixes: * Add support for over-long filepaths via GNU extension. * Now `entryPath` corresponds to an internal, low-level path, limited to 255 characters. To list filenames properly use `decodeLongNames`, followed by `entryTarPath`. * Fix handling of hardlinks and symlinks. * Handle > 8 GB files instead of silent corruption. * Prohibit non-ASCII file names instead of silent corruption. * Set permissions on extracted files. * Ignore FAT32 errors when setting modification time. 
* Switch to trailer parsing mode only after a full block of `NUL`. New API: * Add `Traversable Entries` instance. * Add `toTarPath'`, `ToTarPathResult`, `longLinkEntry`, `longSymLinkEntry`. * Add `packSymlinkEntry` and `symbolicLinkPermission`. * Add `packAndCheck` and `unpackAndCheck`. * Add `checkEntrySecurity`, `checkEntryTarbomb` and `checkEntryPortability`. * Add `encodeLongNames`, `decodeLongNames`, `DecodeLongNamesError`. Improvements: * Speed up `fromTarPath`, `fromTarPathToPosixPath` and `fromTarPathToWindowsPath`. * Alleviate leakage of file handles in `packFileEntry`. * Fix tests on 32-bit architectures. ## 0.5.1.1 Herbert Valerio Riedel August 2019 * Add support for GHC 8.8.1 / base-4.13 ## 0.5.1.0 Herbert Valerio Riedel March 2018 * Add support for GHC 8.4.1 / base-4.11 * Add `Semigroup` instance for `Entries` ## 0.5.0.3 Duncan Coutts May 2016 * Fix tarbomb logic to ignore special PAX entries. Was breaking many valid tarballs. https://github.com/haskell/cabal/issues/3390 ## 0.5.0.2 Duncan Coutts April 2016 * Fix compatibility when using ghc-7.4.x and directory >= 1.2.3 ## 0.5.0.1 Duncan Coutts January 2016 * Fix compatibility with directory-1.2.3+ ## 0.5.0.0 Duncan Coutts January 2016 * Work with old version of bytestring (using bytestring-builder package). * Builds with GHC 6.10 -- 8.0. * Change type of Index.serialise to be simply strict bytestring. * Preserve file timestamps on unpack (with directory-1.2.3+) ## 0.4.5.0 Duncan Coutts January 2016 * Revert accidental minor API change in 0.4.x series (the type of the owner and group name strings). The 0.4.3.0 and 0.4.4.0 releases contained the accidental API change. 
* Add a handy foldlEntries function ## 0.4.4.0 Duncan Coutts January 2016 * Build and warning fixes for GHC 7.10 and 8.0 * New Index module function `toList` to get all index entries ## 0.4.3.0 Duncan Coutts January 2016 * New Index function `unfinalise` to extend existing index * 9x faster reading * 9x faster index construction * 24x faster index extension * More compact entry types, using ByteStrings * More Eq and Show instances * Greater QC test coverage * Fix minor bug in reading non-standard v7 format entries ## 0.4.2.2 Edsko de Vries October 2015 * Fix bug in Index ## 0.4.2.1 Duncan Coutts July 2015 * Fix tests for the Index modules (the code was right) ## 0.4.2.0 Duncan Coutts July 2015 * New Index module for random access to tar file contents * New lower level tar file I/O actions * New tarball file 'append' action ## 0.4.1.0 Duncan Coutts January 2015 * Build with GHC 7.10 * Switch from old-time to time package * Added more instance for Entries type ## 0.4.0.1 Duncan Coutts October 2012 * fixes to work with directory 1.2 * More Eq/Ord instances ## 0.4.0.0 Duncan Coutts February 2012 * More explicit error types and error handling * Support star base-256 number format * Improved API documentation tar-0.6.4.0/tar.cabal0000644000000000000000000001130607346545000012411 0ustar0000000000000000cabal-version: 2.2 name: tar version: 0.6.4.0 license: BSD-3-Clause license-file: LICENSE author: Duncan Coutts Bjorn Bringert maintainer: Bodigrim bug-reports: https://github.com/haskell/tar/issues copyright: 2007 Bjorn Bringert 2008-2016 Duncan Coutts category: Codec synopsis: Reading, writing and manipulating ".tar" archive files. description: This library is for working with \"@.tar@\" archive files. It can read and write a range of common variations of archive format including V7, POSIX USTAR and GNU formats. . It provides support for packing and unpacking portable archives. 
This makes it suitable for distribution but not backup because details like file ownership and exact permissions are not preserved. . It also provides features for random access to archive content using an index. build-type: Simple extra-source-files: test/data/long-filepath.tar test/data/long-symlink.tar test/data/symlink.tar extra-doc-files: changelog.md README.md tested-with: GHC==9.12.1, GHC==9.10.1, GHC==9.8.4, GHC==9.6.5, GHC==9.4.8, GHC==9.2.8, GHC==9.0.2, GHC==8.10.7, GHC==8.8.4, GHC==8.6.5 source-repository head type: git location: https://github.com/haskell/tar.git library default-language: Haskell2010 build-depends: tar-internal reexported-modules: Codec.Archive.Tar, Codec.Archive.Tar.Entry, Codec.Archive.Tar.Check, Codec.Archive.Tar.Index library tar-internal default-language: Haskell2010 build-depends: base >= 4.12 && < 5, array < 0.6, bytestring >= 0.10 && < 0.13, containers >= 0.2 && < 0.8, deepseq >= 1.1 && < 1.6, directory >= 1.3.1 && < 1.4, directory-ospath-streaming >= 0.2.1 && < 0.3, file-io < 0.2, filepath >= 1.4.100 && < 1.6, os-string >= 2.0 && < 2.1, time < 1.15, transformers < 0.7, exposed-modules: Codec.Archive.Tar Codec.Archive.Tar.Entry Codec.Archive.Tar.Check Codec.Archive.Tar.Check.Internal Codec.Archive.Tar.Index Codec.Archive.Tar.LongNames Codec.Archive.Tar.Types Codec.Archive.Tar.Read Codec.Archive.Tar.Write Codec.Archive.Tar.Pack Codec.Archive.Tar.PackAscii Codec.Archive.Tar.Unpack Codec.Archive.Tar.Index.StringTable Codec.Archive.Tar.Index.IntTrie Codec.Archive.Tar.Index.Internal Codec.Archive.Tar.Index.Utils other-extensions: BangPatterns CPP DeriveDataTypeable GeneralizedNewtypeDeriving PatternGuards ScopedTypeVariables ghc-options: -Wall -fno-warn-unused-imports test-suite properties type: exitcode-stdio-1.0 default-language: Haskell2010 build-depends: base < 5, array, bytestring >= 0.10, containers, deepseq, directory >= 1.2, directory-ospath-streaming, file-embed, filepath, QuickCheck == 2.*, tar-internal, tasty >= 0.10 && 
<1.6, tasty-quickcheck >= 0.8 && <1, temporary < 1.4, time if impl(ghc < 9.0) build-depends: bytestring-handle < 0.2 hs-source-dirs: test main-is: Properties.hs other-modules: Codec.Archive.Tar.Tests Codec.Archive.Tar.Index.Tests Codec.Archive.Tar.Index.IntTrie.Tests Codec.Archive.Tar.Index.StringTable.Tests Codec.Archive.Tar.Pack.Tests Codec.Archive.Tar.Types.Tests Codec.Archive.Tar.Unpack.Tests other-extensions: CPP BangPatterns, DeriveDataTypeable ScopedTypeVariables ghc-options: -fno-ignore-asserts benchmark bench type: exitcode-stdio-1.0 default-language: Haskell2010 hs-source-dirs: bench main-is: Main.hs build-depends: base < 5, tar, bytestring >= 0.10, filepath, directory >= 1.2, array, containers, deepseq, temporary < 1.4, time, tasty-bench >= 0.4 && < 0.5 tar-0.6.4.0/test/Codec/Archive/Tar/Index/IntTrie/0000755000000000000000000000000007346545000017366 5ustar0000000000000000tar-0.6.4.0/test/Codec/Archive/Tar/Index/IntTrie/Tests.hs0000644000000000000000000001651007346545000021027 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE TypeApplications #-} module Codec.Archive.Tar.Index.IntTrie.Tests ( test1, test2, test3, ValidPaths(..), prop_lookup, prop_completions, prop_lookup_mono, prop_completions_mono, prop_construct_toList, prop_finalise_unfinalise, prop_serialise_deserialise, prop_serialiseSize, ) where import Prelude hiding (lookup) import Codec.Archive.Tar.Index.IntTrie import qualified Data.Array.Unboxed as A import Data.Char import Data.Function (on) import Data.List hiding (lookup, insert) import Data.Word (Word32) import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as LBS #if MIN_VERSION_bytestring(0,10,2) || defined(MIN_VERSION_bytestring_builder) import Data.ByteString.Builder as BS #else import Data.ByteString.Lazy.Builder as BS #endif #if MIN_VERSION_containers(0,5,0) import qualified Data.IntMap.Strict as IntMap import Data.IntMap.Strict (IntMap) #else import qualified 
Data.IntMap as IntMap import Data.IntMap (IntMap) #endif import Test.QuickCheck import Control.Applicative ((<$>), (<*>)) import Data.Bits import Data.Int -- Example mapping: -- example0 :: [(FilePath, Int)] example0 = [("foo-1.0/foo-1.0.cabal", 512) -- tar block 1 ,("foo-1.0/LICENSE", 2048) -- tar block 4 ,("foo-1.0/Data/Foo.hs", 4096)] -- tar block 8 -- After converting path components to integers this becomes: -- example1 :: [([Key], Value)] example1 = [([Key 1, Key 2], Value 512) ,([Key 1, Key 3], Value 2048) ,([Key 1, Key 4, Key 5], Value 4096)] -- As a trie this looks like: -- [ (1, *) ] -- | -- [ (2, 512), (3, 1024), (4, *) ] -- | -- [ (5, 4096) ] -- We use an intermediate trie representation mktrie :: [(Int, TrieNode)] -> IntTrieBuilder mkleaf :: Key -> Value -> (Int, TrieNode) mknode :: Key -> IntTrieBuilder -> (Int, TrieNode) mktrie = IntTrieBuilder . IntMap.fromList mkleaf k v = (fromIntegral $ unKey k, TrieLeaf (unValue v)) mknode k t = (fromIntegral $ unKey k, TrieNode t) example2 :: IntTrieBuilder example2 = mktrie [ mknode (Key 1) t1 ] where t1 = mktrie [ mkleaf (Key 2) (Value 512), mkleaf (Key 3) (Value 2048), mknode (Key 4) t2 ] t2 = mktrie [ mkleaf (Key 5) (Value 4096) ] example2' :: IntTrieBuilder example2' = mktrie [ mknode (Key 0) t1 ] where t1 = mktrie [ mknode (Key 3) t2 ] t2 = mktrie [ mknode (Key 1) t3, mknode (Key 2) t4 ] t3 = mktrie [ mkleaf (Key 4) (Value 10608) ] t4 = mktrie [ mkleaf (Key 4) (Value 10612) ] {- 0: [1,N0,3] 3: [1,N3,6] 6: [2,N1,N2,11,12] 11: [1,4,10608] 14: [1,4,10612] -} example2'' :: IntTrieBuilder example2'' = mktrie [ mknode (Key 1) t1, mknode (Key 2) t2 ] where t1 = mktrie [ mkleaf (Key 4) (Value 10608) ] t2 = mktrie [ mkleaf (Key 4) (Value 10612) ] example2''' :: IntTrieBuilder example2''' = mktrie [ mknode (Key 0) t3 ] where t3 = mktrie [ mknode (Key 4) t8, mknode (Key 6) t11 ] t8 = mktrie [ mknode (Key 1) t14 ] t11 = mktrie [ mkleaf (Key 5) (Value 10605) ] t14 = mktrie [ mknode (Key 2) t19, mknode (Key 3) t22 ] 
t19 = mktrie [ mkleaf (Key 7) (Value 10608) ] t22 = mktrie [ mkleaf (Key 7) (Value 10612) ] {- 0: [1,N0,3] 3: [2,N4,N6,8,11] 8: [1,N1,11] 11: [1,5,10605] 14: [2,N2,N3,16,19] 19: [1,7,10608] 22: [1,7,10612] -} -- We convert from the 'Paths' to the 'IntTrieBuilder' using 'inserts': -- test1 = example2 === inserts example1 empty -- So the overall array form of the above trie is: -- -- offset: 0 1 2 3 4 5 6 7 8 9 10 11 12 -- array: [ 1 | N1 | 3 ][ 3 | 2, 3, N4 | 512, 2048, 10 ][ 1 | 5 | 4096 ] -- \__/ \___/ example3 :: [Word32] example3 = [1, tagNode 1, 3, 3, tagLeaf 2, tagLeaf 3, tagNode 4, 512, 2048, 10, 1, tagLeaf 5, 4096 ] -- We get the array form by using flattenTrie: test2 = example3 === flattenTrie example2 example4 :: IntTrie example4 = IntTrie (mkArray example3) mkArray :: [Word32] -> A.UArray Word32 Word32 mkArray xs = A.listArray (0, fromIntegral (length xs) - 1) xs test3 = case lookup example4 [Key 1] of Just (Completions [(Key 2,_),(Key 3,_),(Key 4,_)]) -> True _ -> False test1 :: Property prop_lookup :: [([Key], Value)] -> Property prop_lookup paths = conjoin $ flip map paths $ \(key, value) -> counterexample (show (trie, key)) $ lookup trie key === Just (Entry value) where trie = construct paths prop_completions :: [([Key], Value)] -> Property prop_completions paths = inserts paths empty === convertCompletions (completionsFrom (construct paths) 0) where convertCompletions :: Completions -> IntTrieBuilder convertCompletions kls = IntTrieBuilder $ IntMap.fromList [ case l of Entry v -> mkleaf k v Completions kls' -> mknode k (convertCompletions kls') | (k, l) <- sortBy (compare `on` fst) kls ] prop_lookup_mono :: ValidPaths -> Property prop_lookup_mono (ValidPaths paths) = prop_lookup paths prop_completions_mono :: ValidPaths -> Property prop_completions_mono (ValidPaths paths) = prop_completions paths prop_construct_toList :: ValidPaths -> Property prop_construct_toList (ValidPaths paths) = sortBy (compare `on` fst) (toList (construct paths)) === sortBy 
(compare `on` fst) paths prop_finalise_unfinalise :: ValidPaths -> Property prop_finalise_unfinalise (ValidPaths paths) = builder === unfinalise (finalise builder) where builder :: IntTrieBuilder builder = inserts paths empty prop_serialise_deserialise :: ValidPaths -> Property prop_serialise_deserialise (ValidPaths paths) = Just (trie, BS.empty) === (deserialise . LBS.toStrict . BS.toLazyByteString . serialise) trie where trie :: IntTrie trie = construct paths prop_serialiseSize :: ValidPaths -> Property prop_serialiseSize (ValidPaths paths) = (fromIntegral . LBS.length . BS.toLazyByteString . serialise) trie === serialiseSize trie where trie :: IntTrie trie = construct paths newtype ValidPaths = ValidPaths [([Key], Value)] deriving Show instance Arbitrary ValidPaths where arbitrary = ValidPaths . makeNoPrefix <$> listOf ((,) -- Key is actually Word31 <$> listOf1 (fmap (Key . fromIntegral @Int32 . getNonNegative) arbitrary) <*> fmap Value arbitrary) where makeNoPrefix :: [([Key], Value)] -> [([Key], Value)] makeNoPrefix [] = [] makeNoPrefix ((ks, v) : ksvs) | all (\(ks', _) -> not (isPrefixOfOther ks ks')) ksvs = (ks, v) : makeNoPrefix ksvs | otherwise = makeNoPrefix ksvs shrink (ValidPaths kvs) = map ValidPaths . filter noPrefix . filter nonEmpty . map (map (\(ks, v) -> (map Key ks, Value v))) . shrink . map (\(ks, v) -> (map unKey ks, unValue v)) $ kvs where noPrefix :: [([Key], Value)] -> Bool noPrefix [] = True noPrefix ((k,_):kvs') = all (\(k', _) -> not (isPrefixOfOther k k')) kvs' && noPrefix kvs' nonEmpty = all (not . null . 
fst) isPrefixOfOther :: [Key] -> [Key] -> Bool isPrefixOfOther a b = a `isPrefixOf` b || b `isPrefixOf` a tar-0.6.4.0/test/Codec/Archive/Tar/Index/StringTable/0000755000000000000000000000000007346545000020226 5ustar0000000000000000tar-0.6.4.0/test/Codec/Archive/Tar/Index/StringTable/Tests.hs0000644000000000000000000000535507346545000021674 0ustar0000000000000000{-# LANGUAGE CPP #-} module Codec.Archive.Tar.Index.StringTable.Tests ( prop_valid, prop_sorted, prop_finalise_unfinalise, prop_serialise_deserialise, prop_serialiseSize, ) where import Prelude hiding (lookup) import Codec.Archive.Tar.Index.StringTable import Test.Tasty.QuickCheck import Data.List hiding (lookup, insert) import qualified Data.Array.Unboxed as A import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as LBS #if MIN_VERSION_bytestring(0,10,2) || defined(MIN_VERSION_bytestring_builder) import Data.ByteString.Builder as BS import Data.ByteString.Builder.Extra as BS (byteStringCopy) #else import Data.ByteString.Lazy.Builder as BS import Data.ByteString.Lazy.Builder.Extras as BS (byteStringCopy) #endif prop_valid :: [BS.ByteString] -> Property prop_valid strs = conjoin (map lookupIndex (enumStrings tbl)) .&&. 
conjoin (map indexLookup (enumIds tbl)) where tbl :: StringTable Int tbl = construct strs lookupIndex :: BS.ByteString -> Property lookupIndex str = index tbl ident === str where Just ident = lookup tbl str indexLookup :: Int -> Property indexLookup ident = lookup tbl str === Just ident where str = index tbl ident -- this is important so we can use Map.fromAscList prop_sorted :: [BS.ByteString] -> Property prop_sorted strings = property $ isSorted [ index' strs offsets ix | ix <- A.range (A.bounds ids) ] where _tbl :: StringTable Int _tbl@(StringTable strs offsets ids _ixs) = construct strings isSorted xs = and (zipWith (<) xs (drop 1 xs)) prop_finalise_unfinalise :: [BS.ByteString] -> Property prop_finalise_unfinalise strs = builder === unfinalise (finalise builder) where builder :: StringTableBuilder Int builder = foldl' (\tbl s -> fst (insert s tbl)) empty strs prop_serialise_deserialise :: [BS.ByteString] -> Property prop_serialise_deserialise strs = Just (strtable, BS.empty) === (deserialiseV2 . LBS.toStrict . BS.toLazyByteString . serialise) strtable where strtable :: StringTable Int strtable = construct strs prop_serialiseSize :: [BS.ByteString] -> Property prop_serialiseSize strs = (fromIntegral . LBS.length . BS.toLazyByteString . serialise) strtable === serialiseSize strtable where strtable :: StringTable Int strtable = construct strs enumStrings :: Enum id => StringTable id -> [BS.ByteString] enumStrings (StringTable bs offsets _ _) = map (index' bs offsets) [0..h-1] where (0,h) = A.bounds offsets enumIds :: Enum id => StringTable id -> [id] enumIds (StringTable _ offsets _ _) = [toEnum 0 .. 
toEnum (fromIntegral (h-1))] where (0,h) = A.bounds offsets tar-0.6.4.0/test/Codec/Archive/Tar/Index/0000755000000000000000000000000007346545000016010 5ustar0000000000000000tar-0.6.4.0/test/Codec/Archive/Tar/Index/Tests.hs0000644000000000000000000002316307346545000017453 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE CPP #-} ----------------------------------------------------------------------------- -- | -- Module : Codec.Archive.Tar.Index.Tests -- Copyright : (c) 2010-2015 Duncan Coutts -- License : BSD3 -- -- Maintainer : duncan@community.haskell.org -- Portability : portable -- ----------------------------------------------------------------------------- module Codec.Archive.Tar.Index.Tests ( prop_lookup, prop_toList, prop_valid, prop_serialise_deserialise, prop_serialiseSize, #ifdef MIN_VERSION_bytestring_handle prop_index_matches_tar, #endif prop_finalise_unfinalise, ) where import Codec.Archive.Tar (GenEntries(..), Entries, GenEntry, Entry, GenEntryContent(..)) import Codec.Archive.Tar.Index.Internal (TarIndexEntry(..), TarIndex(..), IndexBuilder, TarEntryOffset(..)) import qualified Codec.Archive.Tar.Index.Internal as Tar import qualified Codec.Archive.Tar.Index.IntTrie as IntTrie import qualified Codec.Archive.Tar.Index.IntTrie.Tests as IntTrie import qualified Codec.Archive.Tar.Index.StringTable as StringTable import qualified Codec.Archive.Tar.Index.StringTable.Tests as StringTable import qualified Codec.Archive.Tar.Types as Tar import qualified Codec.Archive.Tar.Write as Tar import qualified Data.ByteString as BS import qualified Data.ByteString.Char8 as BS.Char8 import qualified Data.ByteString.Lazy as LBS import Data.Int #if (MIN_VERSION_base(4,5,0)) import Data.Monoid ((<>)) #endif import qualified System.FilePath.Posix as FilePath import System.IO import Prelude hiding (lookup) import qualified Prelude import Test.QuickCheck import Test.QuickCheck.Property (ioProperty) import Control.Applicative ((<$>), (<*>)) import Control.Monad 
(unless) import Data.List (nub, sort, sortBy, stripPrefix, isPrefixOf) import Data.Maybe import Data.Function (on) import Control.Exception (SomeException, try, throwIO) #ifdef MIN_VERSION_bytestring_handle import qualified Data.ByteString.Handle as HBS #endif -- Not quite the properties of a finite mapping because we also have lookups -- that result in completions. prop_lookup :: ValidPaths -> NonEmptyFilePath -> Property prop_lookup (ValidPaths paths) (NonEmptyFilePath p) = case (Tar.lookup index p, Prelude.lookup p paths) of (Nothing, Nothing) -> property True (Just (TarFileEntry offset), Just (_,offset')) -> offset === offset' (Just (TarDir entries), Nothing) -> sort (nub (map fst entries)) === sort (nub completions) _ -> property False where index = construct paths completions = [ hd | (path,_) <- paths , completion <- maybeToList $ stripPrefix (p ++ "/") path , let hd : _ = FilePath.splitDirectories completion ] prop_toList :: ValidPaths -> Property prop_toList (ValidPaths paths) = sort (Tar.toList index) === sort [ (path, off) | (path, (_sz, off)) <- paths ] where index = construct paths prop_valid :: ValidPaths -> Property prop_valid (ValidPaths paths) = StringTable.prop_valid pathbits .&&. IntTrie.prop_lookup intpaths .&&. IntTrie.prop_completions intpaths .&&. prop' where index@(TarIndex pathTable _ _) = construct paths pathbits = concatMap (map BS.Char8.pack . FilePath.splitDirectories . fst) paths intpaths :: [([IntTrie.Key], IntTrie.Value)] intpaths = [ (map (\(Tar.PathComponentId n) -> IntTrie.Key (fromIntegral n)) cids, IntTrie.Value offset) | (path, (_size, offset)) <- paths , let Just cids = Tar.toComponentIds pathTable path ] prop' = conjoin $ flip map paths $ \(file, (_size, offset)) -> case Tar.lookup index file of Just (TarFileEntry offset') -> offset' === offset _ -> property False prop_serialise_deserialise :: ValidPaths -> Property prop_serialise_deserialise (ValidPaths paths) = Just (index, BS.empty) === (Tar.deserialise . 
Tar.serialise) index where index = construct paths prop_serialiseSize :: ValidPaths -> Property prop_serialiseSize (ValidPaths paths) = case (LBS.toChunks . Tar.serialiseLBS) index of [c1] -> BS.length c1 === Tar.serialiseSize index _ -> property False where index = construct paths newtype NonEmptyFilePath = NonEmptyFilePath FilePath deriving Show instance Arbitrary NonEmptyFilePath where arbitrary = NonEmptyFilePath . FilePath.joinPath <$> listOf1 (elements ["a", "b", "c", "d"]) newtype ValidPaths = ValidPaths [(FilePath, (Int64, TarEntryOffset))] deriving Show instance Arbitrary ValidPaths where arbitrary = do paths <- makeNoPrefix <$> listOf arbitraryPath sizes <- vectorOf (length paths) (getNonNegative <$> arbitrary) let offsets = scanl (\o sz -> o + 1 + blocks sz) 0 sizes return (ValidPaths (zip paths (zip sizes offsets))) where arbitraryPath = FilePath.joinPath <$> listOf1 (elements ["a", "b", "c", "d"]) makeNoPrefix [] = [] makeNoPrefix (k:ks) | all (not . isPrefixOfOther k) ks = k : makeNoPrefix ks | otherwise = makeNoPrefix ks isPrefixOfOther a b = a `isPrefixOf` b || b `isPrefixOf` a blocks :: Int64 -> TarEntryOffset blocks size = fromIntegral (1 + ((size - 1) `div` 512)) -- Helper for bulk construction. construct :: [(FilePath, (Int64, TarEntryOffset))] -> TarIndex construct = either (const undefined) id . Tar.build . 
foldr (\(path, (size, _off)) es -> Next (testEntry path size) es) Done example0 :: Entries () example0 = testEntry "foo-1.0/foo-1.0.cabal" 1500 -- at block 0 `Next` testEntry "foo-1.0/LICENSE" 2000 -- at block 4 `Next` testEntry "foo-1.0/Data/Foo.hs" 1000 -- at block 9 `Next` Done example1 :: Entries () example1 = Next (testEntry "./" 1500) Done <> example0 testEntry :: FilePath -> Int64 -> Entry testEntry name size = Tar.simpleEntry path (NormalFile mempty size) where Right path = Tar.toTarPath False name -- | Simple tar archive containing regular files only data SimpleTarArchive = SimpleTarArchive { simpleTarEntries :: Tar.Entries () , simpleTarRaw :: [(FilePath, LBS.ByteString)] , simpleTarBS :: LBS.ByteString } instance Show SimpleTarArchive where show = show . simpleTarRaw #ifdef MIN_VERSION_bytestring_handle prop_index_matches_tar :: SimpleTarArchive -> Property prop_index_matches_tar sta = ioProperty (try go >>= either (\e -> throwIO (e :: SomeException)) (\_ -> return True)) where go :: IO () go = do h <- HBS.readHandle True (simpleTarBS sta) goEntries h 0 (simpleTarEntries sta) goEntries :: Handle -> TarEntryOffset -> Entries () -> IO () goEntries _ _ Done = return () goEntries _ _ (Fail _) = throwIO (userError "Fail entry in SimpleTarArchive") goEntries h offset (Tar.Next e es) = do goEntry h offset e goEntries h (Tar.nextEntryOffset e offset) es goEntry :: Handle -> TarEntryOffset -> Tar.Entry -> IO () goEntry h offset e = do e' <- Tar.hReadEntry h offset case (Tar.entryContent e, Tar.entryContent e') of (Tar.NormalFile bs sz, Tar.NormalFile bs' sz') -> unless (sz == sz' && bs == bs') $ throwIO $ userError "Entry mismatch" _otherwise -> throwIO $ userError "unexpected entry types" #endif instance Arbitrary SimpleTarArchive where arbitrary = do numEntries <- sized $ \n -> choose (0, n) rawEntries <- mkRaw numEntries let entries = mkList rawEntries return SimpleTarArchive { simpleTarEntries = mkEntries entries , simpleTarRaw = rawEntries , simpleTarBS = 
Tar.write entries } where mkRaw :: Int -> Gen [(FilePath, LBS.ByteString)] mkRaw 0 = return [] mkRaw n = do -- Pick a size around 0, 1, or 2 block boundaries sz <- sized $ \n -> elements (take n fileSizes) bs <- LBS.pack `fmap` vectorOf sz arbitrary es <- mkRaw (n - 1) return $ ("file" ++ show n, bs) : es mkList :: [(FilePath, LBS.ByteString)] -> [Tar.Entry] mkList [] = [] mkList ((fp, bs):es) = entry : mkList es where Right path = Tar.toTarPath False fp entry = Tar.simpleEntry path content content = NormalFile bs (LBS.length bs) mkEntries :: [Tar.Entry] -> Tar.Entries () mkEntries [] = Tar.Done mkEntries (e:es) = Tar.Next e (mkEntries es) -- Sizes around 0, 1, and 2 block boundaries fileSizes :: [Int] fileSizes = [ 0 , 1 , 2 , 510 , 511 , 512 , 513 , 514 , 1022 , 1023 , 1024 , 1025 , 1026 ] -- | t'IndexBuilder' constructed from a 'SimpleIndex' newtype SimpleIndexBuilder = SimpleIndexBuilder IndexBuilder deriving Show instance Arbitrary SimpleIndexBuilder where arbitrary = SimpleIndexBuilder . build' . 
simpleTarEntries <$> arbitrary where -- like 'build', but don't finalize build' :: Show e => Entries e -> IndexBuilder build' = go Tar.empty where go !builder (Next e es) = go (Tar.addNextEntry e builder) es go !builder Done = builder go !_ (Fail err) = error (show err) prop_finalise_unfinalise :: SimpleIndexBuilder -> Property prop_finalise_unfinalise (SimpleIndexBuilder index) = Tar.unfinalise (Tar.finalise index) === index #if !(MIN_VERSION_base(4,5,0)) (<>) :: Monoid m => m -> m -> m (<>) = mappend #endif tar-0.6.4.0/test/Codec/Archive/Tar/Pack/0000755000000000000000000000000007346545000015617 5ustar0000000000000000tar-0.6.4.0/test/Codec/Archive/Tar/Pack/Tests.hs0000644000000000000000000001572107346545000017263 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE TemplateHaskell #-} {-# LANGUAGE ScopedTypeVariables #-} {-# OPTIONS_GHC -Wno-unrecognised-pragmas #-} {-# HLINT ignore "Avoid restricted function" #-} module Codec.Archive.Tar.Pack.Tests ( prop_roundtrip , unit_roundtrip_unicode , unit_roundtrip_symlink , unit_roundtrip_long_symlink , unit_roundtrip_long_filepath ) where import Control.DeepSeq import qualified Data.ByteString as B import qualified Data.ByteString.Lazy as BL import Data.Char import Data.FileEmbed import qualified Codec.Archive.Tar as Tar import qualified Codec.Archive.Tar.Pack as Pack import Codec.Archive.Tar.PackAscii (filePathToOsPath) import qualified Codec.Archive.Tar.Read as Read import Codec.Archive.Tar.Types (GenEntries(..), Entries, simpleEntry, toTarPath, GenEntry (entryTarPath)) import qualified Codec.Archive.Tar.Unpack as Unpack import qualified Codec.Archive.Tar.Write as Write import Control.Exception import qualified Data.List as L import Data.List.NonEmpty (NonEmpty(..)) import GHC.IO.Encoding import System.Directory import System.Directory.OsPath.Streaming (getDirectoryContentsRecursive) import System.FilePath import qualified System.FilePath.Posix as Posix import qualified 
System.Info
import System.IO.Temp
import System.IO.Unsafe
import Test.Tasty.QuickCheck

supportsUnicode :: Bool
supportsUnicode = unsafePerformIO $ do
  -- Normally getFileSystemEncoding returns a Unicode encoding,
  -- but if it is ASCII, we should not generate Unicode filenames.
  enc <- getFileSystemEncoding
  pure $ case textEncodingName enc of
    "ASCII"          -> False
    "ANSI_X3.4-1968" -> False
    _                -> True
{-# NOINLINE supportsUnicode #-}

-- | Write a single file, deeply buried within nested folders;
-- pack and unpack; read back and compare results.
prop_roundtrip :: Int -> [String] -> String -> Property
prop_roundtrip n' xss cnt
  | x : xs <- filter (not . null) $ map mkFilePath xss
  = ioProperty $ withSystemTempDirectory "tar-test" $ \baseDir -> do
      file : dirs <- pure $ trimUpToMaxPathLength baseDir (x : xs)
      let relDir  = joinPath dirs
          absDir  = baseDir </> relDir
          relFile = relDir </> file
          absFile = absDir </> file
          n = n' `mod` (length dirs + 1)
          (target, expectedFileNames) = case n of
            0 -> (relFile, [relFile])
            _ -> ( joinPath $ take (n - 1) dirs
                 , map (addTrailingPathSeparator . joinPath)
                       (drop (max 1 (n - 1)) $ L.inits dirs) ++ [relFile])
          errMsg = "relDir = '" ++ relDir ++ "'"
            ++ "\nabsDir = '" ++ absDir ++ "'"
            ++ "\nrelFile = '" ++ relFile ++ "'"
            ++ "\nabsFile = '" ++ absFile ++ "'"
            ++ "\ntarget = '" ++ target ++ "'"
      -- Not all filesystems allow paths to contain arbitrary Unicode.
      -- E. g., at the moment of writing Apple FS does not support characters
      -- introduced in Unicode 15.0.
      canCreateDirectory <- try (createDirectoryIfMissing True absDir)
      case canCreateDirectory of
        Left (e :: IOException) -> discard
        Right () -> do
          canWriteFile <- try (writeFile absFile cnt)
          case canWriteFile of
            Left (e :: IOException) -> discard
            Right () -> counterexample errMsg <$> do
              -- Forcing the result, otherwise lazy IO misbehaves.
              !entries <- Pack.pack baseDir [target] >>= evaluate . force
              let fileNames
                    = map (map (\c -> if c == Posix.pathSeparator then pathSeparator else c))
                    $ Tar.foldEntries ((:) . 
entryTarPath) [] undefined
                    -- decodeLongNames produces FilePath with POSIX path separators
                    $ Tar.decodeLongNames
                    $ foldr Next Done entries
              if expectedFileNames /= fileNames
                then pure (expectedFileNames === fileNames)
                else do
                  -- Try hard to clean up
                  removeFile absFile
                  writeFile absFile ""
                  case dirs of
                    []    -> pure ()
                    d : _ -> removeDirectoryRecursive (baseDir </> d)
                  -- Unpack back
                  Unpack.unpack baseDir (foldr Next Done entries :: Entries IOException)
                  exist <- doesFileExist absFile
                  if exist
                    then do
                      cnt' <- readFile absFile >>= evaluate . force
                      pure $ cnt === cnt'
                    else do
                      -- Forcing the result, otherwise lazy IO misbehaves.
                      recFiles <- getDirectoryContentsRecursive (filePathToOsPath baseDir)
                        >>= evaluate . force
                      pure $ counterexample
                        ("File " ++ absFile ++ " does not exist; instead found\n"
                          ++ unlines (map show recFiles)) False
  | otherwise = discard

mkFilePath :: String -> FilePath
mkFilePath xs = makeValid
  $ filter isGood
  $ map (if supportsUnicode then id else chr . (`mod` 128) . ord) xs
  where
    isGood c = not (isPathSeparator c)
      && c `notElem` [' ', '\n', '\r', '.', ':']
      && generalCategory c /= Surrogate
      && (supportsUnicode || isAscii c)

trimUpToMaxPathLength :: FilePath -> [FilePath] -> [FilePath]
trimUpToMaxPathLength baseDir = go (maxPathLength - utf8Length baseDir - 1)
  where
    go :: Int -> [FilePath] -> [FilePath]
    go cnt []       = []
    go cnt (x : xs)
      | cnt < 4             = []
      | cnt <= utf8Length x = [take (cnt `quot` 4) x]
      | otherwise           = x : go (cnt - utf8Length x - 1) xs

utf8Length :: String -> Int
utf8Length = sum . 
map charLength
  where
    charLength c
      | c < chr 0x80    = 1
      | c < chr 0x800   = 2
      | c < chr 0x10000 = 3
      | otherwise       = 4

maxPathLength :: Int
maxPathLength = case System.Info.os of
  "mingw32" -> 255
  _         -> 1023 -- macOS does not like longer names

unit_roundtrip_symlink :: Property
unit_roundtrip_symlink =
  let tar :: BL.ByteString = BL.fromStrict $(embedFile "test/data/symlink.tar")
      entries = Tar.foldEntries (:) [] (const []) (Tar.read tar)
  in Tar.write entries === tar

unit_roundtrip_long_filepath :: Property
unit_roundtrip_long_filepath =
  let tar :: BL.ByteString = BL.fromStrict $(embedFile "test/data/long-filepath.tar")
      entries = Tar.foldEntries (:) [] (const []) (Tar.read tar)
  in Tar.write entries === tar

unit_roundtrip_long_symlink :: Property
unit_roundtrip_long_symlink =
  let tar :: BL.ByteString = BL.fromStrict $(embedFile "test/data/long-symlink.tar")
      entries = Tar.foldEntries (:) [] (const []) (Tar.read tar)
  in Tar.write entries === tar

unit_roundtrip_unicode :: Property
unit_roundtrip_unicode = do
  ioProperty $ withSystemTempDirectory "tar-test" $ \baseDir -> do
    let relFile = "TModula𐐀bA.hs"
    canWriteFile <- try (writeFile (baseDir </> relFile) "foo")
    case canWriteFile of
      Left (e :: IOException) -> pure $ property True
      Right () -> do
        entries <- Pack.pack baseDir [relFile]
        pure $ case Tar.foldlEntries (flip seq) () (Read.read (Write.write entries)) of
          Right{}       -> property True
          Left (err, _) -> counterexample (show err) $ property False
tar-0.6.4.0/test/Codec/Archive/Tar/0000755000000000000000000000000007346545000014741 5ustar0000000000000000tar-0.6.4.0/test/Codec/Archive/Tar/Tests.hs0000644000000000000000000000342507346545000016403 0ustar0000000000000000{-# LANGUAGE CPP #-}
-----------------------------------------------------------------------------
-- |
-- Module    : Codec.Archive.Tar.Tests
-- Copyright : (c) 2007 Bjorn Bringert,
--                 2008 Andrea Vezzosi,
--                 2008-2012 Duncan Coutts
-- License   : BSD3
--
-- Maintainer  : duncan@community.haskell.org
-- Portability : portable
--
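The path trimming in `prop_roundtrip` works in UTF-8 bytes rather than characters, because filesystem path limits are byte limits. The helper can be exercised on its own; this standalone sketch duplicates the test-suite definition for illustration:

```haskell
import Data.Char (chr)

-- Byte count of a String under UTF-8, as used when trimming generated
-- paths to the platform limit (255 bytes on Windows, 1023 elsewhere).
utf8Length :: String -> Int
utf8Length = sum . map charLength
  where
    charLength c
      | c < chr 0x80    = 1  -- ASCII
      | c < chr 0x800   = 2
      | c < chr 0x10000 = 3
      | otherwise       = 4  -- astral plane, e.g. the '𐐀' used in the Unicode test
```

For example, `utf8Length "TModula𐐀bA.hs"` counts the astral-plane character as 4 bytes even though it is a single `Char`.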
-----------------------------------------------------------------------------
module Codec.Archive.Tar.Tests
  ( prop_write_read_ustar,
    prop_write_read_gnu,
    prop_write_read_v7,
    prop_large_filesize,
  ) where

import Codec.Archive.Tar
import Codec.Archive.Tar.Types
import Codec.Archive.Tar.Types.Tests
import qualified Data.ByteString.Lazy as BL
import Prelude hiding (read)
import Test.Tasty.QuickCheck

prop_write_read_ustar :: [Entry] -> Property
prop_write_read_ustar entries =
    foldr Next Done entries' === read (write entries')
  where
    entries' = filter ((== UstarFormat) . entryFormat) entries

prop_write_read_gnu :: [Entry] -> Property
prop_write_read_gnu entries =
    foldr Next Done entries' === read (write entries')
  where
    entries' = filter ((== GnuFormat) . entryFormat) entries

prop_write_read_v7 :: [Entry] -> Property
prop_write_read_v7 entries =
    foldr Next Done entries' === read (write entries')
  where
    entries' = map limitToV7FormatCompat
             $ filter ((== V7Format) . entryFormat) entries

prop_large_filesize :: Word -> Property
prop_large_filesize n = sz === sz'
  where
    sz = fromIntegral $ n * 1024 * 1024 * 128
    Right fn = toTarPath False "Large.file"
    entry = simpleEntry fn (NormalFile (BL.replicate sz 42) sz)
    -- Trim the tail so it does not blow up RAM
    tar = BL.take 2048 $ write [entry]
    Next entry' _ = read tar
    NormalFile _ sz' = entryContent entry'
tar-0.6.4.0/test/Codec/Archive/Tar/Types/0000755000000000000000000000000007346545000016045 5ustar0000000000000000tar-0.6.4.0/test/Codec/Archive/Tar/Types/Tests.hs0000644000000000000000000002076307346545000017513 0ustar0000000000000000-----------------------------------------------------------------------------
-- |
-- Module    : Codec.Archive.Tar.Types.Tests
-- Copyright : (c) 2007 Bjorn Bringert,
--                 2008 Andrea Vezzosi,
--                 2008-2009 Duncan Coutts
--                 2011 Max Bolingbroke
-- License   : BSD3
--
-----------------------------------------------------------------------------
{-# LANGUAGE RecordWildCards #-}
module Codec.Archive.Tar.Types.Tests
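`prop_large_filesize` generates sizes in 128 MiB steps because the interesting boundary is what a ustar header can encode: the size field is conventionally 11 octal digits, so anything larger needs a format extension (GNU tar uses a base-256 encoding). A quick check of that bound — `maxOctalField` is an illustrative name, not a library function:

```haskell
-- Largest value an n-octal-digit tar header field can represent.
maxOctalField :: Int -> Integer
maxOctalField n = 8 ^ n - 1

main :: IO ()
main = do
  print (maxOctalField 11)  -- 8589934591 bytes, just under 8 GiB
  print (maxOctalField 7)   -- 2097151, the bound used for device numbers
```

The 7-digit bound is the same one the `arbitraryOctal 7` generator below uses for ownership ids and device major/minor numbers.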
( limitToV7FormatCompat , prop_fromTarPath , prop_fromTarPathToPosixPath , prop_fromTarPathToWindowsPath ) where import Codec.Archive.Tar.PackAscii import Codec.Archive.Tar.Types import qualified Data.ByteString as BS import qualified Data.ByteString.Char8 as BS.Char8 import qualified Data.ByteString.Lazy as LBS import qualified System.FilePath as FilePath.Native ( joinPath, splitDirectories, addTrailingPathSeparator ) import qualified System.FilePath.Posix as FilePath.Posix ( joinPath, splitPath, splitDirectories, hasTrailingPathSeparator , addTrailingPathSeparator ) import qualified System.FilePath.Windows as FilePath.Windows ( joinPath, splitDirectories, addTrailingPathSeparator ) import Test.QuickCheck import Control.Applicative ((<$>), (<*>), pure) import Data.Word (Word16) prop_fromTarPath :: TarPath -> Property prop_fromTarPath tp = fromTarPath tp === fromTarPathRef tp prop_fromTarPathToPosixPath :: TarPath -> Property prop_fromTarPathToPosixPath tp = fromTarPathToPosixPath tp === fromTarPathToPosixPathRef tp prop_fromTarPathToWindowsPath :: TarPath -> Property prop_fromTarPathToWindowsPath tp = fromTarPathToWindowsPath tp === fromTarPathToWindowsPathRef tp fromTarPathRef :: TarPath -> FilePath fromTarPathRef (TarPath namebs prefixbs) = adjustDirectory $ FilePath.Native.joinPath $ FilePath.Posix.splitDirectories prefix ++ FilePath.Posix.splitDirectories name where name = BS.Char8.unpack $ posixToByteString namebs prefix = BS.Char8.unpack $ posixToByteString prefixbs adjustDirectory | FilePath.Posix.hasTrailingPathSeparator name = FilePath.Native.addTrailingPathSeparator | otherwise = id fromTarPathToPosixPathRef :: TarPath -> FilePath fromTarPathToPosixPathRef (TarPath namebs prefixbs) = adjustDirectory $ FilePath.Posix.joinPath $ FilePath.Posix.splitDirectories prefix ++ FilePath.Posix.splitDirectories name where name = BS.Char8.unpack $ posixToByteString namebs prefix = BS.Char8.unpack $ posixToByteString prefixbs adjustDirectory | 
FilePath.Posix.hasTrailingPathSeparator name = FilePath.Posix.addTrailingPathSeparator | otherwise = id fromTarPathToWindowsPathRef :: TarPath -> FilePath fromTarPathToWindowsPathRef (TarPath namebs prefixbs) = adjustDirectory $ FilePath.Windows.joinPath $ FilePath.Posix.splitDirectories prefix ++ FilePath.Posix.splitDirectories name where name = BS.Char8.unpack $ posixToByteString namebs prefix = BS.Char8.unpack $ posixToByteString prefixbs adjustDirectory | FilePath.Posix.hasTrailingPathSeparator name = FilePath.Windows.addTrailingPathSeparator | otherwise = id instance (Arbitrary tarPath, Arbitrary linkTarget) => Arbitrary (GenEntry tarPath linkTarget) where arbitrary = do entryTarPath <- arbitrary entryContent <- arbitrary entryPermissions <- fromIntegral <$> (arbitrary :: Gen Word16) entryOwnership <- arbitrary entryTime <- arbitraryOctal 11 entryFormat <- case entryContent of OtherEntryType 'K' _ _ -> pure GnuFormat OtherEntryType 'L' _ _ -> pure GnuFormat _ -> arbitrary pure Entry{..} shrink (Entry path content perms author time format) = [ Entry path' content' perms author' time' format | (path', content', author', time') <- shrink (path, content, author, time) ] ++ [ Entry path content perms' author time format | perms' <- shrinkIntegral perms ] instance Arbitrary TarPath where arbitrary = either error id . toTarPath False . FilePath.Posix.joinPath <$> listOf1ToN (255 `div` 5) (elements (map (replicate 4) "abcd")) shrink = map (either error id . toTarPath False) . map FilePath.Posix.joinPath . filter (not . null) . shrinkList shrinkNothing . FilePath.Posix.splitPath . fromTarPathToPosixPath instance Arbitrary LinkTarget where arbitrary = maybe (error "link target too large") id . toLinkTarget . FilePath.Native.joinPath <$> listOf1ToN (100 `div` 5) (elements (map (replicate 4) "abcd")) shrink = map (maybe (error "link target too large") id . toLinkTarget) . map FilePath.Posix.joinPath . filter (not . null) . shrinkList shrinkNothing . 
FilePath.Posix.splitPath . fromLinkTargetToPosixPath listOf1ToN :: Int -> Gen a -> Gen [a] listOf1ToN n g = sized $ \sz -> do n <- choose (1, min n (max 1 sz)) vectorOf n g listOf0ToN :: Int -> Gen a -> Gen [a] listOf0ToN n g = sized $ \sz -> do n <- choose (0, min n sz) vectorOf n g instance Arbitrary linkTarget => Arbitrary (GenEntryContent linkTarget) where arbitrary = frequency [ (16, do bs <- arbitrary; return (NormalFile bs (LBS.length bs))) , (2, pure Directory) , (1, SymbolicLink <$> arbitrary) , (1, HardLink <$> arbitrary) , (1, CharacterDevice <$> arbitraryOctal 7 <*> arbitraryOctal 7) , (1, BlockDevice <$> arbitraryOctal 7 <*> arbitraryOctal 7) , (1, pure NamedPipe) , (1, do c <- elements (['A'..'Z']++['a'..'z']) bs <- arbitrary; return (OtherEntryType c bs (LBS.length bs))) ] shrink (NormalFile bs _) = [ NormalFile bs' (LBS.length bs') | bs' <- shrink bs ] shrink Directory = [] shrink (SymbolicLink link) = [ SymbolicLink link' | link' <- shrink link ] shrink (HardLink link) = [ HardLink link' | link' <- shrink link ] shrink (CharacterDevice ma mi) = [ CharacterDevice ma' mi' | (ma', mi') <- shrink (ma, mi) ] shrink (BlockDevice ma mi) = [ BlockDevice ma' mi' | (ma', mi') <- shrink (ma, mi) ] shrink NamedPipe = [] shrink (OtherEntryType c bs _) = [ OtherEntryType c bs' (LBS.length bs') | bs' <- shrink bs ] instance Arbitrary LBS.ByteString where arbitrary = fmap LBS.pack arbitrary shrink = map LBS.pack . shrink . LBS.unpack instance Arbitrary BS.ByteString where arbitrary = fmap BS.pack arbitrary shrink = map BS.pack . shrink . 
BS.unpack instance Arbitrary Ownership where arbitrary = Ownership <$> name <*> name <*> idno <*> idno where -- restrict user/group to posix ^[a-z][-a-z0-9]{0,30}$ name = do first <- choose ('a', 'z') rest <- listOf0ToN 30 (oneof [choose ('a', 'z'), choose ('0', '9'), pure '-']) return $ first : rest idno = arbitraryOctal 7 shrink (Ownership oname gname oid gid) = [ Ownership oname' gname' oid' gid' | (oname', gname', oid', gid') <- shrink (oname, gname, oid, gid) ] instance Arbitrary Format where arbitrary = elements [V7Format, UstarFormat, GnuFormat] shrink GnuFormat = [] shrink _ = [GnuFormat] --arbitraryOctal :: (Integral n, Random n) => Int -> Gen n arbitraryOctal n = oneof [ pure 0 , choose (0, upperBound) , pure upperBound ] where upperBound = 8^n-1 -- For QC tests it's useful to have a way to limit the info to that which can -- be expressed in the old V7 format limitToV7FormatCompat :: Entry -> Entry limitToV7FormatCompat entry@Entry { entryFormat = V7Format } = entry { entryContent = case entryContent entry of CharacterDevice _ _ -> OtherEntryType '3' LBS.empty 0 BlockDevice _ _ -> OtherEntryType '4' LBS.empty 0 Directory -> OtherEntryType '5' LBS.empty 0 NamedPipe -> OtherEntryType '6' LBS.empty 0 other -> other, entryOwnership = (entryOwnership entry) { groupName = "", ownerName = "" }, entryTarPath = let TarPath name _prefix = entryTarPath entry in TarPath name mempty } limitToV7FormatCompat entry = entry tar-0.6.4.0/test/Codec/Archive/Tar/Unpack/0000755000000000000000000000000007346545000016162 5ustar0000000000000000tar-0.6.4.0/test/Codec/Archive/Tar/Unpack/Tests.hs0000644000000000000000000000165207346545000017624 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} module Codec.Archive.Tar.Unpack.Tests ( case_modtime_1970 ) where import qualified Codec.Archive.Tar as Tar import qualified Codec.Archive.Tar.Types as Tar import Codec.Archive.Tar.Types (GenEntries(..), Entries, GenEntry(..)) import qualified Codec.Archive.Tar.Unpack as Unpack import 
Control.Exception
import Data.Time.Clock
import Data.Time.Clock.System
import System.Directory
import System.FilePath
import System.IO.Temp
import Test.Tasty.QuickCheck

case_modtime_1970 :: Property
case_modtime_1970 =
  ioProperty $ withSystemTempDirectory "tar-test" $ \baseDir -> do
    let filename = "foo"
    Right tarPath <- pure $ Tar.toTarPath False filename
    let entry = (Tar.fileEntry tarPath "bar") { entryTime = 0 }
        entries = Next entry Done :: Entries IOException
    Tar.unpack baseDir entries
    modTime <- getModificationTime (baseDir </> filename)
    pure $ modTime === UTCTime systemEpochDay 0
tar-0.6.4.0/test/0000755000000000000000000000000007346545000011615 5ustar0000000000000000tar-0.6.4.0/test/Properties.hs0000644000000000000000000000605407346545000014312 0ustar0000000000000000{-# LANGUAGE CPP #-}
module Main where

import qualified Codec.Archive.Tar.Index.Tests as Index
import qualified Codec.Archive.Tar.Index.IntTrie.Tests as IntTrie
import qualified Codec.Archive.Tar.Index.StringTable.Tests as StringTable
import qualified Codec.Archive.Tar.Pack.Tests as Pack
import qualified Codec.Archive.Tar.Tests as Tar
import qualified Codec.Archive.Tar.Types.Tests as Types
import qualified Codec.Archive.Tar.Unpack.Tests as Unpack
import Test.Tasty
import Test.Tasty.QuickCheck

main :: IO ()
main = defaultMain $ testGroup "tar tests"
  [ testGroup "fromTarPath"
    [ testProperty "fromTarPath" Types.prop_fromTarPath,
      testProperty "fromTarPathToPosixPath" Types.prop_fromTarPathToPosixPath,
      testProperty "fromTarPathToWindowsPath" Types.prop_fromTarPathToWindowsPath
    ]
  , testGroup "write/read"
    [ testProperty "ustar format" Tar.prop_write_read_ustar,
      testProperty "gnu format" Tar.prop_write_read_gnu,
      testProperty "v7 format" Tar.prop_write_read_v7,
      testProperty "large filesize" Tar.prop_large_filesize
    ]
  , testGroup "string table"
    [ testProperty "construction" StringTable.prop_valid,
      testProperty "sorted" StringTable.prop_sorted,
      testProperty "serialise" StringTable.prop_serialise_deserialise, 
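`case_modtime_1970` pins `entryTime 0` to the Unix epoch: a tar entry's mtime is a count of seconds since 1970-01-01 00:00:00 UTC. The conversion the test relies on can be sketched with the standard `time` API (already a dependency of this test suite); `mtimeToUTC` is an illustrative name:

```haskell
import Data.Time.Clock (UTCTime)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime)

-- A tar entry's mtime is seconds since the Unix epoch, so entryTime 0
-- should unpack as 1970-01-01 00:00:00 UTC.
mtimeToUTC :: Integer -> UTCTime
mtimeToUTC = posixSecondsToUTCTime . fromIntegral

main :: IO ()
main = print (mtimeToUTC 0)  -- 1970-01-01 00:00:00 UTC
```

This is the same value the test expresses as `UTCTime systemEpochDay 0`.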
testProperty "size" StringTable.prop_serialiseSize, testProperty "unfinalise" StringTable.prop_finalise_unfinalise ] , testGroup "int trie" [ testProperty "unit 1" IntTrie.test1, testProperty "unit 2" IntTrie.test2, testProperty "unit 3" IntTrie.test3, testProperty "lookups" IntTrie.prop_lookup_mono, testProperty "completions" IntTrie.prop_completions_mono, testProperty "toList" IntTrie.prop_construct_toList, testProperty "serialise" IntTrie.prop_serialise_deserialise, testProperty "size" IntTrie.prop_serialiseSize, testProperty "unfinalise" IntTrie.prop_finalise_unfinalise ] , testGroup "index" [ testProperty "lookup" Index.prop_lookup, testProperty "valid" Index.prop_valid, testProperty "toList" Index.prop_toList, testProperty "serialise" Index.prop_serialise_deserialise, testProperty "size" Index.prop_serialiseSize, #ifdef MIN_VERSION_bytestring_handle testProperty "matches tar" Index.prop_index_matches_tar, #endif testProperty "unfinalise" Index.prop_finalise_unfinalise ] , testGroup "pack" [ adjustOption (\(QuickCheckMaxRatio n) -> QuickCheckMaxRatio (max n 100)) $ testProperty "roundtrip" Pack.prop_roundtrip, testProperty "unicode" Pack.unit_roundtrip_unicode, testProperty "symlink" Pack.unit_roundtrip_symlink, testProperty "long filepath" Pack.unit_roundtrip_long_filepath, testProperty "long symlink" Pack.unit_roundtrip_long_symlink ] , testGroup "unpack" [ testProperty "modtime 1970-01-01" Unpack.case_modtime_1970 ] ] tar-0.6.4.0/test/data/0000755000000000000000000000000007346545000012526 5ustar0000000000000000tar-0.6.4.0/test/data/long-filepath.tar0000644000000000000000000005000007346545000015762 0ustar0000000000000000ghc-8.6.5/0000755000000000000000000000000014535421547010473 5ustar0000000000000000ghc-8.6.5/libraries/0000755000000000000000000000000014535421547012447 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/0000755000000000000000000000000014535421547014641 
5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/0000755000000000000000000000000014535421547017250 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/0000755000000000000000000000000014535421547020347 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/0000755000000000000000000000000014535421547021220 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/0000755000000000000000000000000014535421547023312 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/0000755000000000000000000000000014535421547024700 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/0000755000000000000000000000000014535421547025607 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/0000755000000000000000000000000014535421547027363 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/0000755000000000000000000000000014535421547030600 5ustar0000000000000000libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/0000755000000000000000000000000014535421547032475 5ustar0000000000000000ghc-8.6.5bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/0000755000000000000000000000000014535421547034667 5ustar0000000000000000ghc-8.6.5/librariesbuild/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/0000755000000000000000000000000014535421547037276 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-installbuild/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/0000755000000000000000000000000014535421547040375 
5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-installbuild/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/0000755000000000000000000000000014535421547041246 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-installByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/0000755000000000000000000000000014535421547043340 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/DataBuilder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/0000755000000000000000000000000014535421547044726 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteStringPrim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/0000755000000000000000000000000014535421547045635 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/BuilderInternal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/0000755000000000000000000000000014535421547047411 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Primghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/0000755000000000000000000000000014535421547050626 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internallibraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/0000755000000000000000000000000014535421547052602 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/0000755000000000000000000000000014535421547054774 
5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/librariesbuild/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/0000755000000000000000000000000014535421547057403 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-installbuild/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/0000755000000000000000000000000014535421547060502 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-installbuild/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/0000755000000000000000000000000014535421547061353 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-installByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/0000755000000000000000000000000014535421547063445 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/DataBuilder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/0000755000000000000000000000000014535421547065033 5ustar0000000000000000ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString././@LongLink0000644000000000000000000000037700000000000010002 Lustar 
ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/li0000755000000000000000000000000014535421547031125 5ustar0000000000000000././@LongLink0000644000000000000000000000041000000000000007766 Lustar ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/li0000755000000000000000000000000014535421547031125 5ustar0000000000000000././@LongLink0000644000000000000000000000043200000000000007772 Lustar ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/UncheckedShifts.highc-8.6.5/libraries/bytestring/dist-install/build/Data/ByteString/Builder/Prim/Internal/ghc-8.6.5/li0000644000000000000000000000000414526427343031121 0ustar0000000000000000foo tar-0.6.4.0/test/data/long-symlink.tar0000644000000000000000000001300007346545000015653 0ustar0000000000000000long/0000755000000000000000000000000014535421507010211 5ustar0000000000000000././@LongLink0000644000000000000000000000037700000000000010001 Kustar 
ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssse2././@LongLink0000644000000000000000000000040400000000000007771 Lustar long/ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssslinklong/ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt0000777000000000000000000000000000000000000064301 2ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttustar0000000000000000././@LongLink0000644000000000000000000000040400000000000007771 Lustar long/ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssse2long/ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt0000644000000000000000000000000014526166767035633 0ustar0000000000000000tar-0.6.4.0/test/data/symlink.tar0000644000000000000000000000600007346545000014720 0ustar0000000000000000input/0000755000000000000000000000000014535421736010415 5ustar0000000000000000input/foo0000644000000000000000000000000414526431634011113 0ustar0000000000000000foo input/bar0000777000000000000000000000000000000000000011536 2fooustar0000000000000000