Fast and full-featured Tar for Node.js
The API is designed to mimic the behavior of tar(1) on unix systems.
If you are familiar with how tar works, most of this will hopefully be
straightforward for you. If not, then hopefully this module can teach
you useful unix skills that may come in handy someday :)
A "tar file" or "tarball" is an archive of file system entries
(directories, files, links, etc.). The name comes from "tape archive".
If you run man tar on almost any Unix command line, you'll learn
quite a bit about what it can do, and its history.
Tar has 5 main top-level commands:
c Create an archive
r Replace entries within an archive
u Update entries within an archive (i.e., replace if they're newer)
t List out the contents of an archive
x Extract an archive to disk

The other flags and options modify how this top-level function works.
These 5 functions are the high-level API. All of them have a
single-character name (for unix nerds familiar with tar(1)) as well
as a long name (for everyone else).
All the high-level functions take the following arguments, all three
of which are optional and may be omitted.
options - An optional object specifying various options
paths - An array of paths to add or extract
callback - Called when the command is completed, if async. (If sync or no file is specified, providing a callback throws a TypeError.)

If the command is sync (i.e., if options.sync=true), then the
callback is not allowed, since the action will be completed immediately.
If a file argument is specified, and the command is async, then a
Promise is returned. In this case, a callback may also be provided,
which is called when the command is completed.
If a file option is not specified, then a stream is returned. For
create, this is a readable stream of the generated archive. For
list and extract this is a writable stream that an archive should
be written into. If a file is not specified, then a callback is not
allowed, because you're already getting a stream to work with.
replace and update only work on existing archives, and so require
a file argument.
Sync commands without a file argument return a stream that acts on its
input immediately in the same tick. For readable streams, this means
that all of the data is immediately available by calling
stream.read(). For writable streams, it will be acted upon as soon
as it is provided, but this can be at any time.
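The rules above can be condensed into a small decision function. This is only a sketch: describeCall is a hypothetical helper that is not part of tar, and the strings it returns are just labels for the cases described in the text.

```javascript
// Hypothetical helper (not part of the tar API): summarizes what a
// high-level command hands back for a given combination of options.
function describeCall (command, options = {}) {
  const { file, sync } = options
  // replace and update only work on existing archives
  const needsFile = command === 'replace' || command === 'update'
  if (needsFile && !file) {
    throw new TypeError(`${command} requires a file option`)
  }
  if (!file) {
    // no file: you get a stream, and a callback is not allowed
    const direction = command === 'create' ? 'readable' : 'writable'
    return {
      returns: `${sync ? 'sync ' : ''}${direction} stream`,
      callbackAllowed: false
    }
  }
  if (sync) {
    // sync with a file: the work is finished when the call returns
    return { returns: 'nothing', callbackAllowed: false }
  }
  // async with a file: a Promise, and optionally a callback
  return { returns: 'Promise', callbackAllowed: true }
}

console.log(describeCall('create', { file: 'my-tarball.tgz' }))
console.log(describeCall('extract', { sync: true }))
```

The thrown TypeError for a missing file is an assumption made for the sketch; the text only says that a file argument is required.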
Some things cause tar to emit a warning, but should usually not cause
the entire operation to fail. There are three ways to handle
warnings:
Add an onwarn function to the options, or listen to the 'warn' event on any tar stream. The function will get called as onwarn(message, data). Handle as appropriate.
Set strict: true in the options object, and warn messages will be emitted as 'error' events instead. If there is no error handler, this causes the program to crash.

The API mimics the tar(1) command line functionality, with aliases
for more human-readable option and function names. The goal is that
if you know how to use tar(1) in Unix, then you know how to use
require('tar') in JavaScript.
To replicate tar czf my-tarball.tgz files and folders, you'd do:
const tar = require('tar')

tar.c(
  {
    gzip: true, // or an object of zlib gzip options
    file: 'my-tarball.tgz'
  },
  ['some', 'files', 'and', 'folders']
).then(() => { /* tarball has been created */ })
To replicate tar cz files and folders > my-tarball.tgz, you'd do:
const fs = require('fs')

tar.c( // or tar.create
  {
    gzip: true // or an object of zlib gzip options
  },
  ['some', 'files', 'and', 'folders']
).pipe(fs.createWriteStream('my-tarball.tgz'))
To replicate tar xf my-tarball.tgz you'd do:
tar.x( // or tar.extract(
  {
    file: 'my-tarball.tgz'
  }
).then(() => { /* tarball has been extracted into cwd */ })
To replicate cat my-tarball.tgz | tar x -C some-dir --strip=1:
fs.createReadStream('my-tarball.tgz').pipe(
  tar.x({
    strip: 1,
    C: 'some-dir' // alias for cwd:'some-dir', also ok
  })
)
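To make the effect of strip: 1 concrete, here is a rough sketch of what stripping leading path elements does to an entry path. stripComponents is a hypothetical helper, not something exported by tar, and returning null for paths that have nothing left is an assumption modelled on how tar(1) --strip-components skips such entries.

```javascript
// Hypothetical helper (not part of the tar API): drop `count` leading
// path elements, or return null if nothing would be left of the path.
function stripComponents (entryPath, count) {
  const parts = entryPath.split('/').filter(Boolean)
  if (parts.length <= count) return null
  return parts.slice(count).join('/')
}

console.log(stripComponents('package/lib/index.js', 1)) // lib/index.js
console.log(stripComponents('package/', 1)) // null
```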
To replicate tar tf my-tarball.tgz, do this:
tar.t({
  file: 'my-tarball.tgz',
  onentry: entry => { /* do whatever with it */ }
})
To replicate cat my-tarball.tgz | tar t do:
fs.createReadStream('my-tarball.tgz')
  .pipe(tar.t())
  .on('entry', entry => { /* do whatever with it */ })
To do anything synchronous, add sync: true to the options. Note
that sync functions don't take a callback and don't return a promise.
When the function returns, it's already done. Sync methods without a
file argument return a sync stream, which flushes immediately. But,
of course, it still won't be done until you .end() it.
To filter entries, add filter: <function> to the options.
Tar-creating methods call the filter with filter(path, stat).
Tar-reading methods (including extraction) call the filter with
filter(path, entry). The filter is called in the this-context of
the Pack or Unpack stream object.
The arguments list to tar t and tar x specify a list of filenames
to extract or list, so they're equivalent to a filter that tests if
the file is in the list.
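As a sketch of that equivalence, a list of paths can be turned into a filter function like this. filterFromList is a hypothetical name, and the sketch assumes exact matches only (ignoring trailing slashes); the real list handling may be more generous, for example by also matching entries inside a listed directory.

```javascript
// Hypothetical helper: build a filter(path) from a list of paths.
// Assumption: only exact matches count, modulo trailing slashes.
function filterFromList (paths) {
  const trim = p => p.replace(/\/+$/, '')
  const wanted = new Set(paths.map(trim))
  return path => wanted.has(trim(path))
}

const listFilter = filterFromList(['package/README.md', 'package/lib/'])
console.log(listFilter('package/README.md')) // true
console.log(listFilter('package/index.js')) // false
```

A function built this way could be passed as the filter option instead of passing the list itself.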
For those who aren't fans of tar's single-character command names:
tar.c === tar.create
tar.r === tar.replace (appends to archive, file is required)
tar.u === tar.update (appends if newer, file is required)
tar.x === tar.extract
tar.t === tar.list
Keep reading for all the command descriptions and options, as well as
the low-level API that they are built on.
Create a tarball archive.
The fileList is an array of paths to add to the tarball. Adding a
directory also adds its children recursively.
An entry in fileList that starts with an @ symbol is a tar archive
whose entries will be added. To add a file that starts with @,
prepend it with ./.
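The @ rule can be sketched as a tiny classifier. classifyEntry is a hypothetical helper written for illustration; it only shows how the leading @ and the ./ escape are told apart, not how tar reads the referenced archive.

```javascript
// Hypothetical helper: decide how a fileList entry should be treated.
function classifyEntry (entry) {
  if (entry.startsWith('@')) {
    // '@other.tar' means "add the entries of the archive other.tar"
    return { kind: 'archive', path: entry.slice(1) }
  }
  // './@scope' does not start with '@', so it is an ordinary path
  return { kind: 'path', path: entry }
}

console.log(classifyEntry('@other.tar'))
console.log(classifyEntry('./@scope'))
```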
The following options are supported:
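file Write the tarball archive to the specified filename. If this is specified, then the callback will be fired when the file has been written, and a promise will be returned that resolves when the file is written. If a filename is not specified, then a Readable Stream will be returned which will emit the file data. [Alias: f]
sync Act synchronously. If this is set, then any provided file will be fully written after the call to tar.c. If this is set, and a file is not provided, then the resulting stream will already have the data ready to read or emit('data') as soon as you request it.
onwarn A function that will get called with (message, data) for any warnings encountered.
strict Treat warnings as crash-worthy errors. Default false.
cwd The current working directory for creating the archive. Defaults to process.cwd(). [Alias: C]
prefix A path portion to prefix onto the entries in the archive.
gzip Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip() [Alias: z]
filter A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations.
preservePaths Allow absolute paths. By default, / is stripped from absolute paths. [Alias: P]
mode The mode to set on the created file archive.
noDirRecurse Do not recursively archive the contents of directories. [Alias: n]
follow Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such. [Alias: L, h]
noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive. [Alias: m, no-mtime]
mtime Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.

The following options are mostly internal, but can be modified in some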
advanced use cases, such as re-using caches between runs.
linkCache A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
statCache A Map object that caches calls to lstat.
readdirCache A Map object that caches calls to readdir.
jobs A number specifying how many concurrent jobs to run.
maxReadSize The maximum buffer size for fs.read() operations.

Extract a tarball archive.
The fileList is an array of paths to extract from the tarball. If
no paths are provided, then all the entries are extracted.
If the archive is gzipped, then tar will detect this and unzip it.
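One common way to detect gzip data is to look at the first two bytes, which are always 0x1f 0x8b for a gzip stream. The sketch below shows that check using only Node's zlib module. looksGzipped is a hypothetical helper, and whether tar's own detection does exactly this is an assumption.

```javascript
const zlib = require('zlib')

// Hypothetical helper: true if the chunk starts with the gzip magic bytes.
function looksGzipped (chunk) {
  return chunk.length >= 2 && chunk[0] === 0x1f && chunk[1] === 0x8b
}

const plainBlock = Buffer.alloc(512) // stand-in for raw tar data
const gzippedBlock = zlib.gzipSync(plainBlock)

console.log(looksGzipped(plainBlock)) // false
console.log(looksGzipped(gzippedBlock)) // true
```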
Note that all directories that are created will be forced to be
writable, readable, and listable by their owner, to avoid cases where
a directory prevents extraction of child entries by virtue of its
mode.
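In terms of mode bits, "writable, readable, and listable by their owner" means the owner's rwx bits (0o700) are always switched on. A minimal sketch, where directoryMode is a hypothetical helper and not tar's actual code:

```javascript
// Hypothetical helper: force the owner rwx bits on for a directory mode.
function directoryMode (mode) {
  return (mode | 0o700) & 0o7777
}

console.log(directoryMode(0o555).toString(8)) // 755
console.log(directoryMode(0o000).toString(8)) // 700
```

Without this, an archived directory with mode 0o555 would be created read-only, and writing its children into it would fail.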
Most extraction errors will cause a warn event to be emitted. If
the cwd is missing, or not a directory, then the extraction will
fail completely.
The following options are supported:
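cwd Extract files relative to the specified directory. Defaults to process.cwd(). If provided, this must exist and must be a directory. [Alias: C]
file The archive file to extract. If not specified, then a Writable stream is returned where the archive data should be written. [Alias: f]
sync Create files and directories synchronously.
strict Treat warnings as crash-worthy errors. Default false.
filter A function that gets called with (path, entry) for each entry being unpacked. Return true to unpack the entry from the archive, or false to skip it.
newer Set to true to keep the existing file on disk if it's newer than the file in the archive. [Alias: keep-newer, keep-newer-files]
keep Do not overwrite existing files. In particular, if a file appears more than once in an archive, later copies will not overwrite earlier copies. [Alias: k, keep-existing]
preservePaths Allow absolute paths, paths containing .., and extracting through symbolic links. By default, / is stripped from absolute paths, .. paths are not extracted, and any file whose location would be modified by a symbolic link is not extracted. [Alias: P]
unlink Unlink files before creating them. Without this option, tar overwrites existing files, which preserves existing hardlinks. With this option, existing hardlinks will be broken, as will any symlink that would affect the location of an extracted file. [Alias: U]
strip Remove the specified number of leading path elements. [Alias: strip-components, stripComponents]
onwarn A function that will get called with (message, data) for any warnings encountered.
preserveOwner If true, tar will set the uid and gid of extracted entries to the uid and gid fields in the archive. This is similar to -p in tar(1), but ACLs and other system-specific data is never unpacked in this implementation. [Alias: p]
uid Set to a number to force ownership of all extracted files and folders to the specified user id, regardless of the uid field in the archive. Cannot be used along with preserveOwner. Requires also setting a gid option.
gid Set to a number to force ownership of all extracted files and folders to the specified group id, regardless of the gid field in the archive. Cannot be used along with preserveOwner. Requires also setting a uid option.
noMtime Set to true to omit writing mtime value for extracted entries. [Alias: m, no-mtime]
transform Provide a function that takes an entry object, and returns a stream, or any falsey value. If a stream is provided, then that stream's data will be written instead of the contents of the archive entry. If a falsey value is provided, then the entry is written to disk as normal. (To exclude items from extraction, use the filter option described above.)
onentry A function that gets called with (entry) for each entry that passes the filter.

The following options are mostly internal, but can be modified in some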
advanced use cases, such as re-using caches between runs.
maxReadSize The maximum buffer size for fs.read() operations.
umask Filter the modes of entries like process.umask().
dmode Default mode for directories
fmode Default mode for files
dirCache A Map object of which directories exist.
maxMetaEntrySize The maximum size of meta entries that is supported.

Note that using an asynchronous stream type with the transform
option will cause undefined behavior in sync extractions.
MiniPass-based streams are designed for this
use case.
List the contents of a tarball archive.
The fileList is an array of paths to list from the tarball. If
no paths are provided, then all the entries are listed.
If the archive is gzipped, then tar will detect this and unzip it.
Returns an event emitter that emits entry events with
tar.ReadEntry objects. However, they don't emit 'data' or 'end'
events. (If you want to get actual readable entries, use the
tar.Parse class instead.)
The following options are supported:
cwd Extract files relative to the specified directory. Defaults to process.cwd(). [Alias: C]
file The archive file to list. If not specified, then a Writable stream is returned where the archive data should be written. [Alias: f]
sync Read the specified file synchronously. (This has no effect when a file option isn't specified, because entries are emitted as fast as they are parsed from the stream anyway.)
strict Treat warnings as crash-worthy errors. Default false.
filter A function that gets called with (path, entry) for each entry being listed. Return true to emit the entry from the archive, or false to skip it.
onentry A function that gets called with (entry) for each entry that passes the filter. This is important for when both file and sync are set, because it will be called synchronously.
maxReadSize The maximum buffer size for fs.read() operations.
noResume By default, entry streams are resumed immediately after the call to onentry. Set noResume: true to suppress this behavior.

Add files to an archive if they are newer than the entry already in
the tarball archive.
The fileList is an array of paths to add to the tarball. Adding a
directory also adds its children recursively.
An entry in fileList that starts with an @ symbol is a tar archive
whose entries will be added. To add a file that starts with @,
prepend it with ./.
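The "add if newer" rule boils down to an mtime comparison. The sketch below shows that decision in isolation; needsUpdate and the archiveMtimes map are hypothetical names for illustration, and the real update command works on the archive file itself rather than on a prebuilt map.

```javascript
// Hypothetical helper: should this file be appended to the archive?
// archiveMtimes maps entry paths to the mtime already in the archive.
function needsUpdate (archiveMtimes, path, fileMtime) {
  const archived = archiveMtimes.get(path)
  // not in the archive yet, or newer on disk than in the archive
  return archived === undefined || fileMtime > archived
}

const archiveMtimes = new Map([
  ['lib/index.js', new Date('2020-01-01T00:00:00Z')]
])

console.log(needsUpdate(archiveMtimes, 'lib/index.js', new Date('2021-01-01T00:00:00Z'))) // true
console.log(needsUpdate(archiveMtimes, 'lib/index.js', new Date('2019-01-01T00:00:00Z'))) // false
console.log(needsUpdate(archiveMtimes, 'lib/new.js', new Date('2019-01-01T00:00:00Z'))) // true
```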
The following options are supported:
file Required. Write the tarball archive to the specified filename. [Alias: f]
sync Act synchronously. If this is set, then any provided file will be fully written after the call to tar.c.
onwarn A function that will get called with (message, data) for any warnings encountered.
strict Treat warnings as crash-worthy errors. Default false.
cwd The current working directory for adding entries to the archive. Defaults to process.cwd(). [Alias: C]
prefix A path portion to prefix onto the entries in the archive.
gzip Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip() [Alias: z]
filter A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations.
preservePaths Allow absolute paths. By default, / is stripped from absolute paths. [Alias: P]
maxReadSize The maximum buffer size for fs.read() operations.
noDirRecurse Do not recursively archive the contents of directories. [Alias: n]
follow Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such. [Alias: L, h]
noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive. [Alias: m, no-mtime]
mtime Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.

Add files to an existing archive. Because later entries override
earlier entries, this effectively replaces any existing entries.
The fileList is an array of paths to add to the tarball. Adding a
directory also adds its children recursively.
An entry in fileList that starts with an @ symbol is a tar archive
whose entries will be added. To add a file that starts with @,
prepend it with ./.
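The "later entries override earlier entries" behavior is easy to picture with a Map: read the entries in archive order and let each path keep only its last occurrence. This is a sketch of the idea with made-up entry objects, not how tar stores anything internally.

```javascript
// Hypothetical helper: given entries in archive order, return what an
// extraction would effectively end up with (last occurrence wins).
function effectiveEntries (entries) {
  const latest = new Map()
  for (const entry of entries) latest.set(entry.path, entry)
  return [...latest.values()]
}

const appended = [
  { path: 'a.txt', body: 'old' },
  { path: 'b.txt', body: 'unchanged' },
  { path: 'a.txt', body: 'new' } // appended later by a replace
]

console.log(effectiveEntries(appended))
```

The old copy of a.txt is still physically present in the archive; it is simply shadowed by the later one.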
The following options are supported:
file Required. Write the tarball archive to the specified filename. [Alias: f]
sync Act synchronously. If this is set, then any provided file will be fully written after the call to tar.c.
onwarn A function that will get called with (message, data) for any warnings encountered.
strict Treat warnings as crash-worthy errors. Default false.
cwd The current working directory for adding entries to the archive. Defaults to process.cwd(). [Alias: C]
prefix A path portion to prefix onto the entries in the archive.
gzip Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip() [Alias: z]
filter A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations.
preservePaths Allow absolute paths. By default, / is stripped from absolute paths. [Alias: P]
maxReadSize The maximum buffer size for fs.read() operations.
noDirRecurse Do not recursively archive the contents of directories. [Alias: n]
follow Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such. [Alias: L, h]
noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive. [Alias: m, no-mtime]
mtime Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.

A readable tar stream.
Has all the standard readable stream interface stuff. 'data' and
'end' events, read() method, pause() and resume(), etc.
The following options are supported:
onwarn A function that will get called with (message, data) for any warnings encountered.
strict Treat warnings as crash-worthy errors. Default false.
cwd The current working directory for creating the archive. Defaults to process.cwd().
prefix A path portion to prefix onto the entries in the archive.
gzip Set to any truthy value to create a gzipped archive, or an object with settings for zlib.Gzip()
filter A function that gets called with (path, stat) for each entry being added. Return true to add the entry to the archive, or false to omit it.
portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations.
preservePaths Allow absolute paths. By default, / is stripped from absolute paths.
linkCache A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
statCache A Map object that caches calls to lstat.
readdirCache A Map object that caches calls to readdir.
jobs A number specifying how many concurrent jobs to run.
maxReadSize The maximum buffer size for fs.read() operations.
noDirRecurse Do not recursively archive the contents of directories.
follow Set to true to pack the targets of symbolic links. Without this option, symbolic links are archived as such.
noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.
mtime Set to a Date object to force a specific mtime for everything added to the archive. Overridden by noMtime.

Adds an entry to the archive. Returns the Pack stream.
Adds an entry to the archive. Returns true if flushed.
Finishes the archive.
Synchronous version of tar.Pack.
A writable stream that unpacks a tar archive onto the file system.
All the normal writable stream stuff is supported. write() and
end() methods, 'drain' events, etc.
Note that all directories that are created will be forced to be
writable, readable, and listable by their owner, to avoid cases where
a directory prevents extraction of child entries by virtue of its
mode.
'close' is emitted when it's done writing stuff to the file system.
Most unpack errors will cause a warn event to be emitted. If the
cwd is missing, or not a directory, then an error will be emitted.
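cwd Extract files relative to the specified directory. Defaults to process.cwd(). If provided, this must exist and must be a directory.
filter A function that gets called with (path, entry) for each entry being unpacked. Return true to unpack the entry from the archive, or false to skip it.
newer Set to true to keep the existing file on disk if it's newer than the file in the archive.
keep Do not overwrite existing files. In particular, if a file appears more than once in an archive, later copies will not overwrite earlier copies.
preservePaths Allow absolute paths, paths containing .., and extracting through symbolic links. By default, / is stripped from absolute paths, .. paths are not extracted, and any file whose location would be modified by a symbolic link is not extracted.
unlink Unlink files before creating them. Without this option, tar overwrites existing files, which preserves existing hardlinks. With this option, existing hardlinks will be broken, as will any symlink that would affect the location of an extracted file.
strip Remove the specified number of leading path elements.
onwarn A function that will get called with (message, data) for any warnings encountered.
umask Filter the modes of entries like process.umask().
dmode Default mode for directories
fmode Default mode for files
dirCache A Map object of which directories exist.
maxMetaEntrySize The maximum size of meta entries that is supported.
preserveOwner If true, tar will set the uid and gid of extracted entries to the uid and gid fields in the archive. This is similar to -p in tar(1), but ACLs and other system-specific data is never unpacked in this implementation.
win32 True if on a windows platform. Causes behavior where filenames containing <|>? chars are converted to windows-compatible values while being unpacked.
uid Set to a number to force ownership of all extracted files and folders to the specified user id, regardless of the uid field in the archive. Cannot be used along with preserveOwner. Requires also setting a gid option.
gid Set to a number to force ownership of all extracted files and folders to the specified group id, regardless of the gid field in the archive. Cannot be used along with preserveOwner. Requires also setting a uid option.
noMtime Set to true to omit writing mtime value for extracted entries.
transform Provide a function that takes an entry object, and returns a stream, or any falsey value. If a stream is provided, then that stream's data will be written instead of the contents of the archive entry. If a falsey value is provided, then the entry is written to disk as normal. (To exclude items from extraction, use the filter option described above.)
strict Treat warnings as crash-worthy errors. Default false.
onentry A function that gets called with (entry) for each entry that passes the filter.

Synchronous version of tar.Unpack.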
Note that using an asynchronous stream type with the transform
option will cause undefined behavior in sync unpack streams.
MiniPass-based streams are designed for this
use case.
A writable stream that parses a tar archive stream. All the standard
writable stream stuff is supported.
If the archive is gzipped, then tar will detect this and unzip it.
Emits 'entry' events with tar.ReadEntry objects, which are
themselves readable streams that you can pipe wherever.
Each entry will not emit until the one before it is flushed through,
so make sure to either consume the data (with on('data', ...) or
.pipe(...)) or throw it away with .resume() to keep the stream
flowing.
Returns an event emitter that emits entry events with
tar.ReadEntry objects.
The following options are supported:
strict Treat warnings as crash-worthy errors. Default false.
filter A function that gets called with (path, entry) for each entry being listed. Return true to emit the entry from the archive, or false to skip it.
onentry A function that gets called with (entry) for each entry that passes the filter.
onwarn A function that will get called with (message, data) for any warnings encountered.

Stop all parsing activities. This is called when there are zlib
errors. It also emits a warning with the message and error provided.
A representation of an entry that is being read out of a tar archive.
It has the following fields:
extended The extended metadata object provided to the constructor.
globalExtended The global extended metadata object provided to the constructor.
remain The number of bytes remaining to be written into the stream.
blockRemain The number of 512-byte blocks remaining to be written into the stream.
ignore Whether this entry should be ignored.
meta True if this represents metadata about the next entry, false if it represents a filesystem object.
All the fields from the header, extended header, and global extended header are added to the ReadEntry object. So it has path, type, size, mode, and so on.

Create a new ReadEntry object with the specified header, extended
header, and global extended header values.
A representation of an entry that is being written from the file
system into a tar archive.
Emits data for the Header, and for the Pax Extended Header if one is
required, as well as any body data.
Creating a WriteEntry for a directory does not also create
WriteEntry objects for all of the directory contents.
It has the following fields:
path The path field that will be written to the archive. By default, this is also the path from the cwd to the file system object.
portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations.
myuid If supported, the uid of the user running the current process.
myuser The env.USER string if set, or ''. Set as the entry uname field if the file's uid matches this.myuid.
maxReadSize The maximum buffer size for fs.read() operations.
linkCache A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
statCache A Map object that caches calls to lstat.
preservePaths Allow absolute paths. By default, / is stripped from absolute paths.
cwd The current working directory for creating the archive. Defaults to process.cwd().
absolute The absolute path to the entry on the filesystem. By default, this is path.resolve(this.cwd, this.path), but it can be overridden explicitly.
strict Treat warnings as crash-worthy errors. Default false.
win32 True if on a windows platform. Causes behavior where paths replace \ with / and filenames containing the windows-compatible forms of <|>?: characters are converted to actual <|>?: characters in the archive.
noPax Suppress pax extended headers. Note that this means that long paths and linkpaths will be truncated, and large or negative numeric values may be interpreted incorrectly.
noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.

path is the path of the entry as it is written in the archive.
The following options are supported:
portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations.
maxReadSize The maximum buffer size for fs.read() operations.
linkCache A Map object containing the device and inode value for any file whose nlink is > 1, to identify hard links.
statCache A Map object that caches calls to lstat.
preservePaths Allow absolute paths. By default, / is stripped from absolute paths.
cwd The current working directory for creating the archive. Defaults to process.cwd().
absolute The absolute path to the entry on the filesystem. By default, this is path.resolve(this.cwd, this.path), but it can be overridden explicitly.
strict Treat warnings as crash-worthy errors. Default false.
win32 True if on a windows platform. Causes behavior where paths replace \ with /.
onwarn A function that will get called with (message, data) for any warnings encountered.
noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.
umask Set to restrict the modes on the entries in the archive, somewhat like how umask works on file creation. Defaults to process.umask() on unix systems, or 0o22 on Windows.

If strict, emit an error with the provided message.
Otherwise, emit a 'warn' event with the provided message and data.
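The strict versus warn split can be sketched with a plain EventEmitter. The warn function below is a hypothetical stand-in written for illustration; attaching data to the error object is an assumption, not a documented part of tar.

```javascript
const { EventEmitter } = require('events')

// Hypothetical stand-in for a warn method: strict mode turns warnings
// into 'error' events, otherwise a 'warn' event is emitted.
function warn (emitter, strict, message, data) {
  if (strict) {
    const error = message instanceof Error ? message : new Error(message)
    error.data = data // assumption: carry the data along on the error
    emitter.emit('error', error)
  } else {
    emitter.emit('warn', message, data)
  }
}

const fakeStream = new EventEmitter()
const seen = []
fakeStream.on('warn', (message, data) => seen.push(['warn', message, data.path]))
fakeStream.on('error', error => seen.push(['error', error.message, error.data.path]))

warn(fakeStream, false, 'invalid entry', { path: 'x' })
warn(fakeStream, true, 'invalid entry', { path: 'x' })
console.log(seen)
```

Note that an 'error' event with no listener attached makes an EventEmitter throw, which is why strict mode without an error handler crashes the program.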
Synchronous version of tar.WriteEntry
A version of tar.WriteEntry that gets its data from a tar.ReadEntry
instead of from the filesystem.
readEntry is the entry being read out of another archive.
The following options are supported:
portable Omit metadata that is system-specific: ctime, atime, uid, gid, uname, gname, dev, ino, and nlink. Note that mtime is still included, because this is necessary for other time-based operations.
preservePaths Allow absolute paths. By default, / is stripped from absolute paths.
strict Treat warnings as crash-worthy errors. Default false.
onwarn A function that will get called with (message, data) for any warnings encountered.
noMtime Set to true to omit writing mtime values for entries. Note that this prevents using other mtime-based features like tar.update or the keepNewer option with the resulting tar archive.

A class for reading and writing header blocks.
It has the following fields:
nullBlock True if decoding a block which is entirely composed of 0x00 null bytes. (Useful because tar files are terminated by at least 2 null blocks.)
cksumValid True if the checksum in the header is valid, false otherwise.
needPax True if the values, as encoded, will require a Pax extended header.
path The path of the entry.
mode The 4 lowest-order octal digits of the file mode. That is, read/write/execute permissions for world, group, and owner, and the setuid, setgid, and sticky bits.
uid Numeric user id of the file owner
gid Numeric group id of the file owner
size Size of the file in bytes
mtime Modified time of the file
cksum The checksum of the header. This is generated by adding all the bytes of the header block, treating the checksum field itself as all ascii space characters (that is, 0x20).
type The human-readable name of the type of entry this represents, or the alphanumeric key if unknown.
typeKey The alphanumeric key for the type of entry this header represents.
linkpath The target of Link and SymbolicLink entries.
uname Human-readable user name of the file owner
gname Human-readable group name of the file owner
devmaj The major portion of the device number. Always 0 for files, directories, and links.
devmin The minor portion of the device number. Always 0 for files, directories, and links.
atime File access time.
ctime File change time.

data is optional. It is either a Buffer that should be interpreted
as a tar Header starting at the specified offset and continuing for
512 bytes, or a data object of keys and values to set on the header
object, and eventually encode as a tar Header.
Decode the provided buffer starting at the specified offset.
The buffer must contain at least 512 bytes starting at the offset.
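To give a feel for what decoding a header block involves, here is a sketch of the checksum and null-block checks on a raw 512-byte block. It relies on the standard ustar layout, where the checksum field occupies bytes 148 to 155. The function names are hypothetical and this is not tar.Header's actual code.

```javascript
const BLOCK_SIZE = 512
const CKSUM_START = 148 // checksum field offset in a ustar header
const CKSUM_END = 156

// True if the block is entirely 0x00 bytes (end-of-archive marker).
function isNullBlock (buf, off = 0) {
  for (let i = off; i < off + BLOCK_SIZE; i++) {
    if (buf[i] !== 0) return false
  }
  return true
}

// Sum all header bytes, counting the checksum field itself as spaces.
function computeChecksum (buf, off = 0) {
  let sum = 0
  for (let i = 0; i < BLOCK_SIZE; i++) {
    const inCksumField = i >= CKSUM_START && i < CKSUM_END
    sum += inCksumField ? 0x20 : buf[off + i]
  }
  return sum
}

// The stored checksum is an octal number written as ASCII text.
function readStoredChecksum (buf, off = 0) {
  return parseInt(buf.toString('latin1', off + CKSUM_START, off + CKSUM_END), 8)
}

// Build a toy header holding only a path, then stamp its checksum in.
const headerBlock = Buffer.alloc(BLOCK_SIZE)
headerBlock.write('hello.txt', 0, 'latin1')
const sum = computeChecksum(headerBlock)
headerBlock.write(sum.toString(8).padStart(6, '0') + '\0 ', CKSUM_START, 'latin1')

console.log(isNullBlock(headerBlock)) // false
console.log(readStoredChecksum(headerBlock) === computeChecksum(headerBlock)) // true
```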
Set the fields in the data object.
Encode the header fields into the buffer at the specified offset.
Returns this.needPax to indicate whether a Pax Extended Header is
required to properly encode the specified data.
An object representing a set of key-value pairs in a Pax extended
header entry.
It has the following fields. Where the same name is used, they have
the same semantics as the tar.Header field of the same name.
global True if this represents a global extended header, or false if it does not.
atime
charset
comment
ctime
gid
gname
linkpath
mtime
path
size
uid
uname
dev
ino
nlink

Set the fields set in the object. global is a boolean that defaults
to false.
Return a Buffer containing the header and body for the Pax extended
header entry, or null if there is nothing to encode.
Return a string representing the body of the pax extended header
entry.
Return a string representing the key/value encoding for the specified
fieldName, or '' if the field is unset.
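For reference, a pax record has the form "<length> <key>=<value>\n", where the length counts the whole record including its own digits. That self-referential length is the only tricky part, so here is a sketch of one way to compute it. encodePaxRecord is a hypothetical function name and this is not necessarily how tar.Pax does it.

```javascript
// Hypothetical helper: encode one pax key/value record, or '' if unset.
function encodePaxRecord (key, value) {
  if (value === undefined || value === null) return ''
  const body = ` ${key}=${value}\n`
  const bodyLength = Buffer.byteLength(body)
  // The length prefix counts its own digits, so adjust until it is stable
  // (e.g. a 9-byte body needs "11", not "10").
  let total = bodyLength + String(bodyLength).length
  while (String(total).length + bodyLength !== total) {
    total = bodyLength + String(total).length
  }
  return `${total}${body}`
}

console.log(JSON.stringify(encodePaxRecord('path', 'foo'))) // "12 path=foo\n"
console.log(JSON.stringify(encodePaxRecord('a', 'bcdef'))) // "11 a=bcdef\n"
```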
Return a new Pax object created by parsing the contents of the string
provided.
If the extended object is set, then also add the fields from that
object. (This is necessary because multiple metadata entries can
occur in sequence.)
A translation table for the type field in tar headers.
Get the human-readable name for a given alphanumeric code.
Get the alphanumeric code for a given human-readable name.
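A two-way translation table like this can be sketched with a pair of Maps. The snippet below covers only a handful of the standard tar type flags, and the human-readable names other than Link and SymbolicLink are assumptions chosen for illustration; the real table covers more types.

```javascript
// Illustrative subset of tar type flags; not the complete table.
const typeName = new Map([
  ['0', 'File'],
  ['1', 'Link'],
  ['2', 'SymbolicLink'],
  ['5', 'Directory']
])

// Reverse lookup built from the same data, so the two stay in sync.
const typeCode = new Map([...typeName].map(([code, name]) => [name, code]))

console.log(typeName.get('5')) // Directory
console.log(typeCode.get('SymbolicLink')) // 2
```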