eXdupe manual

eXdupe is an ultra fast file archiver and differential backup tool that supports data deduplication.

It uses sliding-window data deduplication, which is the strongest class of deduplication: identical data is searched for at byte-granularity positions, so unmodified data is found during a differential backup even if it has been displaced. Traditional data compression is applied after deduplication.
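The idea of matching at byte granularity can be illustrated with a toy sketch. This is not eXdupe's actual algorithm: the block size, the use of SHA-1, and the function names index_blocks and find_matches are assumptions made for illustration only. Real implementations typically use a rolling hash so that the window can advance one byte cheaply.

```python
import hashlib

BLOCK = 8  # toy block size in bytes (assumption; real tools use far larger blocks)

def index_blocks(old: bytes) -> dict:
    """Map the digest of each aligned BLOCK-sized chunk of `old` to its offset."""
    index = {}
    for off in range(0, len(old) - BLOCK + 1, BLOCK):
        index.setdefault(hashlib.sha1(old[off:off + BLOCK]).digest(), off)
    return index

def find_matches(new: bytes, index: dict) -> list:
    """Slide a BLOCK-sized window over `new`, one byte at a time.

    Returns (new_offset, old_offset) pairs for every window whose content
    also exists in the indexed data, no matter how far it has moved.
    """
    matches = []
    pos = 0
    while pos + BLOCK <= len(new):
        old_off = index.get(hashlib.sha1(new[pos:pos + BLOCK]).digest())
        if old_off is None:
            pos += 1        # no match: advance by a single byte
        else:
            matches.append((pos, old_off))
            pos += BLOCK    # match: skip past the duplicated block
    return matches

old = b"AAAAAAAABBBBBBBBCCCCCCCC"
new = b"xyz" + old          # the same data, displaced by three inserted bytes
matches = find_matches(new, index_blocks(old))
print(matches)              # [(3, 0), (11, 8), (19, 16)]
```

A scheme that only compared blocks at fixed offsets would find nothing here, because the three inserted bytes shift every block boundary.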

Path names and examples are for Windows, but the same rules apply on the supported *nix operating systems.

Simple backup and restore

exdupe z:\ backup.full
exdupe -R backup.full z:\

Simple backup, 2 differential backups, and restore

Once you have created a simple backup (from now on called full backup), future backups are faster and much smaller because only modifications compared to the full backup need to be stored:

exdupe z:\ backup.full
exdupe -D z:\ backup.full backup.diff1
exdupe -D z:\ backup.full backup.diff2
exdupe -RD backup.full backup.diff2 z:\

You can delete any intermediate .diff files that you no longer need to be able to restore.

Differential backups will result in larger and larger diff files as your data changes over time. At some point you might want to create a new full backup and start over.

Paths

You can specify multiple source files and directories by separating them with spaces. eXdupe will identify their common parent path (d:\projects below) and save the remainders in the archive as relative paths. Backup followed by restore to original locations:

exdupe d:\projects\android d:\projects\iphone d:\projects\test\script.pl backup.full
exdupe -R backup.full d:\projects
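The common-parent rule can be sketched in Python. This only models the rule as described here, using the standard ntpath and pathlib modules for Windows path handling; the function name stored_paths is hypothetical and this is not eXdupe's own code.

```python
import ntpath
from pathlib import PureWindowsPath

def stored_paths(sources):
    """Return (common_parent, relative_paths) for a list of absolute Windows paths."""
    common = ntpath.commonpath(sources)
    relative = [str(PureWindowsPath(s).relative_to(common)) for s in sources]
    return common, relative

common, relative = stored_paths([
    r"d:\projects\android",
    r"d:\projects\iphone",
    r"d:\projects\test\script.pl",
])
print(common)    # d:\projects
print(relative)  # ['android', 'iphone', 'test\\script.pl']
```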

To see exactly how paths were stored, use the -L flag to list contents:

exdupe -L backup.full

To restore only the iphone directory and script.pl file into their original locations, specify them after the destination directory as printed by the -L flag:

exdupe -R backup.full d:\projects iphone test\script.pl

You can also use the -a flag during backup to store full and absolute paths. In this case you would instead restore like this:

exdupe -R backup.full d:\projects d:\projects\iphone d:\projects\test\script.pl

When restoring individual items, eXdupe will identify the common parent path of the items specified on the command line, and append the remainders to the specified destination directory.
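As a rough model of this rule, the sketch below maps requested items to their restore targets. It assumes Windows path semantics via the standard ntpath and pathlib modules; the function name restore_targets is hypothetical and the code is not taken from eXdupe.

```python
import ntpath
from pathlib import PureWindowsPath

def restore_targets(destination, items):
    """Map each requested item to the path it would be restored to."""
    common = ntpath.commonpath(items)  # '' when relative items share no parent
    dest = PureWindowsPath(destination)
    targets = []
    for item in items:
        path = PureWindowsPath(item)
        remainder = path.relative_to(common) if common else path
        targets.append(str(dest / remainder))
    return targets

# Items stored as relative paths (the default)
relative_case = restore_targets(r"d:\projects", ["iphone", r"test\script.pl"])

# Items stored as absolute paths (backup made with -a)
absolute_case = restore_targets(
    r"d:\projects",
    [r"d:\projects\iphone", r"d:\projects\test\script.pl"],
)
print(relative_case)
print(absolute_case)
```

Both calls yield d:\projects\iphone and d:\projects\test\script.pl, which matches the two restore commands shown earlier in this section.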

Optimizing

Compression ratio

Data deduplication requires on the order of 1 GB of memory for each 1 TB of source data to yield a decent compression ratio. Use the -g flag to specify the number of gigabytes to use, or -m for the number of megabytes. The number must be a power of 2. Examples:

exdupe -g16 \\server\largedatabase backup.full
exdupe -m8 e:\dvd backup.full

Differential backups will automatically use the same amount of memory as the full backup. It is not possible to use a different amount.
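The rule of thumb above (about 1 GB of memory per 1 TB of source data, as a power of 2) can be turned into a small helper. This is only a sketch: the function name suggest_memory_flag is hypothetical, and both the use of binary units and the choice to round up to the next power of 2 are assumptions rather than documented behaviour.

```python
import math

GIB = 1024 ** 3
TIB = 1024 ** 4

def suggest_memory_flag(source_bytes):
    """Suggest a -g or -m value from the 1 GB per 1 TB rule of thumb."""
    # 1 GB per TB is the same ratio as 1 MB per GB of source data.
    megabytes = max(1, math.ceil(source_bytes / GIB))
    # Round up to the next power of 2, as the flags require.
    power_of_two = 1 << (megabytes - 1).bit_length()
    if power_of_two >= 1024:
        return f"-g{power_of_two // 1024}"
    return f"-m{power_of_two}"

print(suggest_memory_flag(16 * TIB))  # -g16
print(suggest_memory_flag(3 * TIB))   # -g4
print(suggest_memory_flag(8 * GIB))   # -m8
```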

Restore does not consume any significant memory.

Speed

eXdupe scales decently with respect to speed up to around 8-12 threads, which yields around 1 gigabyte/second. Use the -t flag to specify the number of threads to be used. Example:

exdupe -t2 database backup.full

Multiple threads are only supported for backup, not restore.

Filtering

During backup you can filter away files and directories with two methods:

Prefixing with --

In the list of source files and directories, prefix items that must be excluded with --. For example:

exdupe c:\ --c:\pagefile.sys backup.full

All items are resolved to their absolute paths, so they can be given as relative or absolute paths, and they are matched case-insensitively on Windows.

A verbose level 7 warning is printed prior to starting the backup process if an excluded item does not exist.
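The resolution step can be sketched as follows. This is an illustration under assumptions, not eXdupe's code: the function name resolve_item and the explicit cwd parameter (standing in for the current working directory) are hypothetical, and Windows rules are modelled with the standard ntpath module.

```python
import ntpath

def resolve_item(item, cwd):
    """Resolve a source or exclude item to a normalized absolute Windows path."""
    absolute = ntpath.normpath(ntpath.join(cwd, item))
    return ntpath.normcase(absolute)  # lower case, for case-insensitive matching

excluded = {resolve_item(r"C:\PageFile.sys", r"c:\work")}

print(resolve_item("pagefile.sys", "c:\\") in excluded)          # True
print(resolve_item(r"..\PAGEFILE.SYS", r"c:\work") in excluded)  # True
```

Because every item is normalized the same way, a relative and an absolute spelling of the same file compare equal.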

Lua scripting

You can provide a script written in the Lua language to filter away files or directories during backup. Before eXdupe opens a source item, it calls your script, which can return true to include the item or false to exclude it. Provide the script using the -f flag like this:

exdupe -f"return(dir or not contains({'tmp', 'temp'}, ext))" y:\ backup.full

Information about the item is given in variables:

Variable  Description
dir       Absolute path if the item is a directory, or Lua type nil if it is a file
file      Absolute path + filename if the item is a file, or Lua type nil if it is a directory
name      Name of the file or directory
size      Size of the file, or Lua type nil if the item is a directory
ext       Everything to the right of the last period in the filename, or Lua type nil if the item is a directory. Empty string if no period is present.
date      Last modified date of the file or directory, in the Lua date format
attrib    Windows: integer with file attribute bitmask. *nix: result of chmod

Examples of source files and directories and respective variable contents

Variable  d:\databases\somedir  d:\somefile.txt  d:\somefile     d:\
dir       d:\databases\somedir  nil              nil             d:\
file      nil                   d:\somefile.txt  d:\somefile     nil
name      somedir               somefile.txt     somefile        d:
size      nil                   1234             1234            nil
ext       nil                   txt              (empty string)  nil

All paths are passed to you as absolute paths, and in lower case on Windows to make comparison and pattern matching simpler.
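To make the variable rules concrete, the following sketch derives dir, file, name, size and ext for a given item. It is a model of the documented rules only: the function name filter_variables is hypothetical, Lua nil is represented by Python None, and date and attrib are left out.

```python
from pathlib import PureWindowsPath

def filter_variables(path, is_dir, size=None):
    """Build the filter variables for one source item (Windows rules)."""
    p = PureWindowsPath(path.lower())  # paths are passed in lower case on Windows
    full = str(p)
    name = p.name or p.drive           # a drive root such as d:\ is named "d:"
    if is_dir:
        return {"dir": full, "file": None, "name": name, "size": None, "ext": None}
    # Everything to the right of the last period, or "" if there is none.
    ext = name.rsplit(".", 1)[1] if "." in name else ""
    return {"dir": None, "file": full, "name": name, "size": size, "ext": ext}

items = [
    (r"d:\databases\somedir", True),
    (r"d:\somefile.txt", False),
    (r"d:\somefile", False),
    ("d:\\", True),
]
for item, is_dir in items:
    print(filter_variables(item, is_dir, size=None if is_dir else 1234))
```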

On Windows, attributes for files and directories can be read from the following boolean variables: ARCHIVE, COMPRESSED, DEVICE, DIRECTORY, ENCRYPTED, HIDDEN, NORMAL, NOT_CONTENT_INDEXED, OFFLINE, READONLY, REPARSE_POINT, SPARSE_FILE, SYSTEM, TEMPORARY, VIRTUAL.

A short intro to Lua filtering will follow soon.

Console output and verbose levels

Level  Applies to restore  Applies to backup  Message
2                          x                  Failed to open source file (permission denied, locked, etc.)
2      x                                      *nix symlink attempted to be restored on Windows
2      x                                      *nix filename contains character(s) invalid on Windows that were replaced by _
3                          x                  Source item given on the command line does not exist
4                          x                  Source item in exclude list (prefixed by --) does not exist
5      x                   x                  Summary after completion (number of files and data amount treated)
5                          x                  Notification that Volume Shadow Copy Service (Windows only) has succeeded
6      x                   x                  Prints each processed file and directory
7      x                   x                  Status bar (speed and current data amount treated)
9                          x                  Named pipe skipped because of no -p flag
9                          x                  Source item filtered away by -- prefixing
9                          x                  Source item filtered away by Lua script