Segmented Archives: Splitting and Spanning
Applies To: All

Introduction

The ZipArchive Library can create segmented archives using the following methods: splitting, binary splitting and spanning.
  • splitting - an archive is split into multiple files that are usually located in the same directory. This method creates the same internal structure as spanning.
  • binary splitting - an archive file is logically a regular single-segment archive, but is binary split into multiple files. A regular archive can be created from such split archive by simply concatenating all its parts.
  • spanning - an archive spans multiple removable disks (e.g. floppy disks).
The differences between splitting and spanning are summarized below:

| | Splitting | Spanning |
| Destination media | not limited to any | removable |
| Archive Structure | splits into volumes (usually in the same folder) | spans multiple disks |
| Naming | extension is based on the volume number (it is possible to implement a custom naming scheme) | each volume has the same name |
| Single Volume Size | declared by the user when creating an archive | auto-detected from the free space on the current disk |
| Callback | not needed, but possible | needed for changing volume |
  • Splitting and spanning are compatible with PKZIP and WinZip.
  • Binary splitting is compatible e.g. with 7-Zip, but is not compatible with WinZip.
  • To set a callback object for splitting or spanning, use the CZipArchive::SetSegmCallback() method. The class of the callback object must be derived from the CZipSegmCallback class.
  • The ZipArchive Library does not allow direct modifications of existing segmented archives. However, you can apply changes to an existing segmented archive by creating a new archive and copying data from the old archive using one of the CZipArchive::GetFromArchive() methods. These methods copy compressed data from the old archive without decompression. You can find more information about these methods here: Compressing Data.
  • The CZipArchive class uses a write buffer to optimize the speed of write operations. You can change its size with the CZipArchive::SetAdvanced() method (set the first argument). While creating a segmented archive, set the size of the buffer to the maximum size of the volume for the best performance.
  • To determine the total number of volumes in an archive, first request the central directory information using the CZipArchive::GetCentralDirInfo() method. The total number of volumes can then be obtained by adding one to the CZipCentralDir::CInfo::m_uLastVolume value, as illustrated in the sample code below.
    Sample Code
    CZipArchive zip;
    zip.Open(_T("C:\\Temp\\test.zip"));
    CZipCentralDir::CInfo info;
    zip.GetCentralDirInfo(info);
    ZIP_VOLUME_TYPE uTotalSegments = info.m_uLastVolume + 1;
    // ...
    zip.Close();

Conversion Between Split and Spanned Archives

To convert between split and spanned archives, it is enough to change the names of volumes and copy the volumes to appropriate locations.
  • To convert a spanned archive to a split archive, copy all the volumes into one location and rename their extensions using the printf format pattern z%.2u, with the one-based volume number as the argument. For volume numbers greater than 99, this pattern becomes z%d. Use the "zip" extension for the last volume. The volumes are then named as follows:
    • name.z01
    • name.z02
    • ...
    • name.z100
    • ...
    • name.zip

  • To convert a split archive to a spanned archive, copy each volume to a separate removable medium, giving it the "zip" extension. You should also give each disk the appropriate label, starting from "PKBACK# 001" (note the space between '#' and '0').
  • Conversion is not possible for binary split archives.
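
The naming rules above can be sketched in a few lines of standard C++. This is only an illustration, not part of the ZipArchive Library: the function names SplitVolumeExtension and SpanDiskLabel are hypothetical, and the numbering of disk labels after "PKBACK# 001" (three digits, one-based) is an assumption, since only the first label is given above.

```cpp
#include <cstdio>
#include <string>

// Extension of a volume in a split archive, following the patterns quoted
// above. 'volume' is one-based; the last volume keeps the "zip" extension.
// (Hypothetical helper, not a ZipArchive Library function.)
std::string SplitVolumeExtension(unsigned volume, unsigned totalVolumes)
{
    if (volume == totalVolumes)
        return "zip";
    char buffer[16];
    // "z%.2u" pads the number to two digits; above 99 it is written unpadded
    std::snprintf(buffer, sizeof(buffer), volume > 99 ? "z%u" : "z%.2u", volume);
    return buffer;
}

// Disk label for a volume of a spanned archive.
// ASSUMPTION: labels after "PKBACK# 001" continue as "PKBACK# 002", ...
std::string SpanDiskLabel(unsigned volume)
{
    char buffer[24];
    std::snprintf(buffer, sizeof(buffer), "PKBACK# %.3u", volume);
    return buffer;
}
```

With these helpers, a conversion tool would rename volume i of n to name + "." + SplitVolumeExtension(i, n) when going from spanned to split, and would label disk i with SpanDiskLabel(i) when going the other way.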

Limits in Number of Volumes

Zip format has the following limits on the number of volumes:

| | Splitting | Spanning |
| Standard Zip Format | 65,535 | 999 |
| Zip64 Format | 4,294,967,295 - 1 | 4,294,967,295 - 1 |

Splitting: All Volumes in One Folder

The volumes of a split archive are usually located in the same folder. You need to specify the size of a single volume when creating a split archive. Internal zip structures, such as file headers, are not split across volumes in a regular split archive. As a result, a volume may be slightly smaller than the declared size when a structure does not fit entirely into the current volume and is stored in the next volume instead. If the declared volume size is too small to hold an entire internal structure, that particular volume will be enlarged. It is recommended to use volume sizes of at least 64KB.

Under Linux/OS X, use the CZipArchive::zipOpenSplit mode when opening an existing split archive with the CZipArchive::Open(LPCTSTR) method. This is necessary because the ZipPlatform::IsDriveRemovable() function is not implemented on these platforms, so the device containing the archive is always assumed to be removable.

Sample Code
LPCTSTR zipFileName = _T("C:\\Temp\\test.zip");
CZipArchive zip;
// specify the segment size to be 1MB
zip.Open(zipFileName, CZipArchive::zipCreateSplit, 1024 * 1024);
zip.AddNewFile(_T("C:\\Temp\\big.dat"));
zip.Close();
// the segmentation type will be auto-detected as splitting
// (the archive is on a non-removable device)
zip.Open(zipFileName);
// under Linux/OS X, call instead: zip.Open(zipFileName, CZipArchive::zipOpenSplit);
zip.ExtractFile(0, _T("C:\\Temp"), false, _T("big.ext"));
zip.Close();
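
When choosing a volume size, it can help to check roughly how many volumes an archive will need against the limits listed earlier. The sketch below is a planning aid in standard C++ only; SegmentMethod, MaxVolumeCount and EstimateVolumeCount are hypothetical names, not library API. The estimate is a lower bound, because internal structures are not split across volumes and a volume may therefore end up smaller than declared.

```cpp
#include <cstdint>

// Hypothetical enum, used only by this sketch.
enum class SegmentMethod { Splitting, Spanning };

// Maximum number of volumes, taken from the limits table in this article.
std::uint32_t MaxVolumeCount(bool zip64, SegmentMethod method)
{
    if (zip64)
        return 4294967295u - 1u;
    return method == SegmentMethod::Splitting ? 65535u : 999u;
}

// Lower bound on the number of volumes: the archive size divided by the
// declared volume size, rounded up. The real count can be higher, because
// volumes may be slightly smaller than declared.
std::uint64_t EstimateVolumeCount(std::uint64_t archiveSize, std::uint64_t volumeSize)
{
    if (volumeSize == 0)
        return 0; // no meaningful estimate without a volume size
    return archiveSize / volumeSize + (archiveSize % volumeSize != 0 ? 1 : 0);
}
```

For example, if EstimateVolumeCount() for the expected archive size and a 1MB volume already approaches MaxVolumeCount(false, SegmentMethod::Splitting), a larger volume size or the Zip64 format is worth considering.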

Using Callback with Split Archives

Using a callback with split archives is not necessary, but it is possible. It is useful when you need, for example, to prompt the user for the location of a volume or to perform some other action when a volume changes.

When the callback is set, the CZipCallback::Callback method will be called every time a volume changes.

  • The reason for calling the callback is stored in CZipSegmCallback::m_iCode and takes one of the CZipSegmCallback::SegmCodes values.
  • You can change the filename and path of the current volume by modifying the CZipCallback::m_szExternalFile variable. When the callback is called, this variable holds the full path to the volume file as expected by the library. You can change this variable and the library will create the volume file under a new name or location. For implementing a custom naming scheme it is recommended to use the split names handler (see below) instead.
  • The number of the disk needed for reading or writing is stored in
    CZipSegmCallback::m_uVolumeNeeded.
  • To abort the archive processing, return false from this method. A CZipException will be thrown with the CZipException::aborted code.
  • The value of the uProgress parameter is 0, except when the callback is called for the last volume. It is then set to ZIP_SPLIT_LAST_VOLUME.
  • When creating a split archive, the callback is called twice for the last volume.
    • The first time it is called when the library doesn't know yet that the current volume is the last volume and the value of the uProgress parameter is set to 0.
    • The second time it is called when the library already knows that the current volume is the last volume and the value of the uProgress parameter is set to ZIP_SPLIT_LAST_VOLUME.
Sample Code
class CSplitCallback : public CZipSegmCallback
{
    bool Callback(ZIP_SIZE_TYPE)
    {
        switch (m_iCode)
        {
        case scVolumeNeededForRead:
        case scVolumeNeededForWrite:
        case scFileNameDuplicated:
            {
                if (m_iCode == scFileNameDuplicated)
                {
                    // it can happen only when writing an archive;
                    // delete the file, if it already exists
                    // it would be more optimal to check for the file existence
                    // when scVolumeNeededForWrite was called to save one turn, but
                    // this code is provided to illustrate the possible events
                    if (!ZipPlatform::RemoveFile(m_szExternalFile))
                    {
                        _tprintf(_T("Removing the existing file failed.\r\n"));
                        return false;
                    }
                }
                // it would be possible here to change the filename of the archive volume
                // and assign to m_szExternalFile
                break;
            }
        case scFileCreationFailure:
            _tprintf(_T("Could not create the file. ")
                     _T("Check if you have write permissions to the given location.\r\n"));
            // abort processing
            return false;
        case scFileNotFound:
            _tprintf(_T("The given volume could not be found.\r\n"));
            // abort processing, although we could ask a user here
            // to provide the location of our volume
            return false;
        default:
            _tprintf(_T("An unexpected code detected.\r\n"));
            // abort processing
            return false;
        }
        return true;
    }
};
void SplittingWithCallback()
{
    // this code is identical to the previous sample with the
    // exception of setting the callback
    LPCTSTR zipFileName = _T("C:\\Temp\\test.zip");
    CZipArchive zip;
    CSplitCallback callback;
    // set the callback before creating the archive;
    // note the second parameter value
    zip.SetSegmCallback(&callback, CZipArchive::scSplit);
    zip.Open(zipFileName, CZipArchive::zipCreateSplit, 1024 * 1024);
    zip.AddNewFile(_T("C:\\Temp\\big.dat"));
    zip.Close();
    // the callback is already set
    // under Linux/OS X, call instead: zip.Open(zipFileName, CZipArchive::zipOpenSplit);
    zip.Open(zipFileName);
    zip.ExtractFile(0, _T("C:\\Temp"), false, _T("big.ext"));
    zip.Close();
}

Custom Naming Scheme of Volumes

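You can implement a custom naming scheme of volumes for split archives. In order to do that, derive a class from CZipSplitNamesHandler, override its GetVolumeName() method, and pass an instance to the CZipArchive::SetSplitNamesHandler() method before opening the archive, as the sample code below does. If the last volume name is different from the archive name, you can retrieve it when closing the archive (it is the return value of the CZipArchive::Close() method).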
Sample Code
class CCustomNamesHandler : public CZipSplitNamesHandler
{
public:
    CZipString GetVolumeName(const CZipString& archiveName,
                             ZIP_VOLUME_TYPE uCurrentVolume,
                             ZipArchiveLib::CBitFlag flags) const
    {
        CZipString szExt;
        if (uCurrentVolume < 1000)
            szExt.Format(_T("vol%.3u"), uCurrentVolume);
        else
            szExt.Format(_T("vol%u"), uCurrentVolume);
        if (flags.IsSetAny(CZipSplitNamesHandler::flExisting))
        {
            // change the extension, if archive name is the name of an existing archive
            CZipPathComponent zpc(archiveName);
            zpc.SetExtension(szExt);
            return zpc.GetFullPath();
        }
        else
        {
            // otherwise, just append the extension
            return archiveName + _T(".") + szExt;
        }
    }
};
void CustomNaming()
{
    LPCTSTR zipFileName = _T("C:\\Temp\\test.zip");
    CZipArchive zip;
    CCustomNamesHandler namesHandler;
    // set a custom names handler before creating the archive
    zip.SetSplitNamesHandler(namesHandler);
    // specify the segment size to be 1MB
    zip.Open(zipFileName, CZipArchive::zipCreateSplit, 1024 * 1024);
    zip.AddNewFile(_T("C:\\Temp\\big.dat"));
    // get the last volume name - needed for opening the archive
    CZipString szLastVolumeName = zip.Close();
    if (szLastVolumeName.IsEmpty())
    {
        _tprintf(_T("An unexpected error occurred.\r\n"));
        return;
    }
    // set a custom names handler before opening the archive
    zip.SetSplitNamesHandler(namesHandler);
    // under Linux/OS X, call instead: zip.Open(szLastVolumeName, CZipArchive::zipOpenSplit);
    zip.Open(szLastVolumeName);
    zip.ExtractFile(0, _T("C:\\Temp"), false, _T("big.ext"));
    zip.Close();
}

Binary Split

Binary splitting produces archives with the internal structure of a single-segment archive, but splits the archive into multiple files. Here is a comparison between regular splitting and binary splitting:

| | Regular Splitting | Binary Splitting |
| Internal Archive Structure | Multi-segment. Each volume is logically represented inside of the archive. | Single-segment archive. |
| Volumes Extension | Replaced with z%.2u pattern to create volume filenames (e.g. archive.z01). | Consecutive numbers (%.3u pattern) are appended as an extension to an archive filename (e.g. archive.zip.001). |
| Last Volume's Filename | The same as the filename of the archive provided to the CZipArchive::Open(LPCTSTR) method (does not contain a volume number). | The filename is formed as any other volume name (contains a volume number). |
| Default Name Handler | CZipRegularSplitNamesHandler | CZipBinSplitNamesHandler |
| Opening of Existing Archive | The mode is automatically detected. You need to open the last volume. | You need to specify CZipArchive::zipOpenBinSplit when calling the CZipArchive::Open(LPCTSTR) method. You need to open the last volume. |
Sample Code
CZipString zipFileName = _T("C:\\Temp\\test.zip");
CZipArchive zip;
// specify the segment size to be 1MB
zip.Open(zipFileName, CZipArchive::zipCreateBinSplit, 1024 * 1024);
zip.AddNewFile(_T("C:\\Temp\\big.dat"));
// get the last volume name - needed for opening of the archive
zipFileName = zip.Close();
if (zipFileName.IsEmpty())
{
    _tprintf(_T("An unexpected error occurred.\r\n"));
    return;
}
// the segmentation mode needs to be specified
zip.Open(zipFileName, CZipArchive::zipOpenBinSplit);
zip.ExtractFile(0, _T("C:\\Temp"), false, _T("big.ext"));
zip.Close();
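
Because a binary split archive is logically a single-segment archive, a regular archive can be rebuilt by concatenating its parts in order. The following sketch does this with standard C++ streams only. It does not use the ZipArchive Library; JoinBinarySplitParts is a hypothetical helper name, and the part names are assumed to follow the default %.3u pattern described above (archive.zip.001, archive.zip.002, ...).

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Concatenates baseName.001, baseName.002, ... into outputName.
// Stops at the first part that cannot be opened and returns the number
// of parts that were joined. (Hypothetical helper, not library API.)
unsigned JoinBinarySplitParts(const std::string& baseName, const std::string& outputName)
{
    std::ofstream output(outputName, std::ios::binary | std::ios::trunc);
    if (!output)
        return 0;
    static char buffer[64 * 1024];
    unsigned joined = 0;
    for (unsigned part = 1; ; ++part)
    {
        char extension[16];
        std::snprintf(extension, sizeof(extension), ".%.3u", part);
        std::ifstream input(baseName + extension, std::ios::binary);
        if (!input)
            break; // no more parts
        // copy the part in chunks; the final chunk may be shorter than the buffer
        while (input.read(buffer, sizeof(buffer)) || input.gcount() > 0)
            output.write(buffer, input.gcount());
        ++joined;
    }
    return joined;
}
```

A call such as JoinBinarySplitParts("C:\\Temp\\test.zip", "C:\\Temp\\joined.zip") would read test.zip.001, test.zip.002 and so on. A production tool should also check the output stream for write errors and verify that no part is missing in the middle of the sequence.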

Spanning: Use on Removable Media

Sample Code
#include <conio.h> // for _getch()
class CSpanCallback : public CZipSegmCallback
{
    bool Callback(ZIP_SIZE_TYPE)
    {
        switch (m_iCode)
        {
        case scVolumeNeededForRead:
        case scVolumeNeededForWrite:
            _tprintf(_T("Insert the disk number %d\r\n"), m_uVolumeNeeded);
            break;
        case scFileNameDuplicated:
            _tprintf(_T("The file with the given name already ")
                     _T("exists on the disk.\r\n"));
            break;
        case scCannotSetVolLabel:
            _tprintf(_T("Cannot set the disk volume label. ")
                     _T("Check if the disk is not write-protected.\r\n"));
            break;
        case scFileCreationFailure:
            _tprintf(_T("Could not create file. ")
                     _T("Check if the disk is not write-protected.\r\n"));
            break;
        default:
            _tprintf(_T("An unexpected code detected.\r\n"));
            return false;
        }
        _getch();
        _tprintf(_T("...\r\n"));
        // return false here to abort processing
        return true;
    }
};
void Spanning()
{
    LPCTSTR zipFileName = _T("a:\\test.zip");
    CZipArchive zip;
    CSpanCallback callback;
    // set the callback before creating the archive
    zip.SetSegmCallback(&callback);
    zip.Open(zipFileName, CZipArchive::zipCreateSpan);
    zip.AddNewFile(_T("C:\\Temp\\big.dat"));
    zip.Close();
    // the callback is already set
    // and the segmentation type will be auto-detected as spanning
    // (the archive is on a removable device)
    zip.Open(zipFileName);
    zip.ExtractFile(0, _T("C:\\Temp"), false, _T("big.ext"));
    zip.Close();
}

Detecting Last Disk in Drive

When extracting a spanned archive, you need to insert the last disk into the drive before opening the archive. The central directory is written on it, and extraction starts with reading the central directory. There is no simple way to detect whether the right disk is in the drive, but the ZipArchive Library throws a CZipException with the CZipException::cdirNotFound code when the archive you are trying to open does not have a central directory. In the case of a spanned archive, this may mean that the user has not inserted the last disk into the drive.

Recovering from Invalid Disk Inserted

Invalid Last Disk

To recover from the situation when a user does not insert the last disk:
  • Catch the exception and verify its code. The code should be CZipException::cdirNotFound. Other codes may indicate a corrupted archive or file access problem.
  • Close the archive with the CZipArchive::Close() method, passing CZipArchive::afAfterException as the iAfterException parameter.
  • Prompt the user for the last disk again.
  • Open the archive.
  • Repeat the process until the archive is successfully opened or the user cancels the operation.
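
The control flow of these steps can be sketched independently of the library. In the sketch below, OpenResult and OpenLastDiskWithRetry are hypothetical names, and the three callables are stand-ins: tryOpen would wrap CZipArchive::Open() in a try/catch block and map the exception code to OpenResult, closeAfterException would call CZipArchive::Close() with CZipArchive::afAfterException, and promptForLastDisk would ask the user and return false on cancel.

```cpp
#include <functional>

// Outcome of one attempt to open the archive (hypothetical type).
enum class OpenResult { Opened, CentralDirNotFound, OtherError };

// Repeats the open attempt until it succeeds, the user cancels,
// or an error other than a missing central directory occurs.
bool OpenLastDiskWithRetry(const std::function<OpenResult()>& tryOpen,
                           const std::function<void()>& closeAfterException,
                           const std::function<bool()>& promptForLastDisk)
{
    for (;;)
    {
        const OpenResult result = tryOpen();
        if (result == OpenResult::Opened)
            return true;
        // the archive object must be closed after an exception
        closeAfterException();
        if (result != OpenResult::CentralDirNotFound)
            return false; // may indicate a corrupted archive or a file access problem
        if (!promptForLastDisk())
            return false; // the user cancelled the operation
    }
}
```

Passing the operations in as callables keeps the loop itself free of library types, so the same structure can be reused with a console prompt or a GUI dialog.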

Invalid Disk During Extraction

To recover from the situation when a user does not insert a correct disk during extraction:
  • Catch the exception.
  • Call the CZipArchive::ResetCurrentVolume() method. This will also close any file opened for extraction.
  • Prompt the user for the correct disk.
  • Retry the extraction of the file from the beginning.

Callbacks Called

Callback objects are called while a segmented archive is processed. To read more about using callback objects, see Progress Notifications: Using Callback Objects.


Article ID: 0610051553