As the man page says, the default behaviour of rsync is to create a new copy of the file in the destination and to move it into the right place when the transfer is completed. If the destination directory doesn’t exist, rsync will create it (only the last path component, not missing parent directories).
A trailing slash on the source avoids creating an additional directory level at the destination.
rsync -avz src/ dest # content of ./src/ transferred to ./dest/
rsync -avz src dest # content of ./src/ transferred to ./dest/src/
Resume interrupted sync (handle connection loss)
Some people have mentioned that the --partial flag is enough to resume a transfer. Note, however, that rsync only continues writing at the end of the partial file when the --append or --append-verify flag is used when resuming.
--partial keeps the file that has not finished the sync process when you interrupt syncing. rsync continues to complete that file when you use --append after resuming. Without --append, rsync does not pick up where it stopped; it uses the kept file as the basis for a regular delta transfer and rebuilds the whole file.
Conclusion: You can just interrupt rsync --partial using Ctrl + C if you use rsync --append when resuming (or --append-verify, which also includes the data already present in the checksum verification).
rsync examples
Misc.
- List files without copying them:
rsync -av --dry-run src dest
- Show progress per file:
rsync --progress src dest
- Show global progress:
rsync --info=progress2 src dest
Remote transfers
Settings in $HOME/.ssh/config are also respected by rsync, making commonly accessed systems far easier to use.
rsync -avz -e "ssh -l <user>" <src> <user>@<server>:<target>
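As an illustration of the ssh config point, the sketch below writes a sample entry to a scratch file rather than to the real $HOME/.ssh/config. The host alias staging, the host name, user, port and key path are all hypothetical.

```shell
# Write a sample entry to a scratch file instead of the real ~/.ssh/config.
# Host alias, host name, user, port and key path are made up for this example.
sample_config=$(mktemp)
cat > "$sample_config" <<'EOF'
Host staging
    HostName staging.example.com
    User deploy
    Port 2222
    IdentityFile ~/.ssh/id_rsa_for_rsync
EOF
cat "$sample_config"

# With such an entry in ~/.ssh/config, the alias replaces the -e options:
# rsync -avz ./deployment/ staging:/var/www/
```

Keeping user, port and key in the ssh config means the rsync command line only needs the alias.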
Non-standard SSH port:
rsync -e "ssh -p 2222" ...
Use a specific SSH key:
rsync -e "ssh -i $HOME/.ssh/id_rsa_for_rsync" ...
Tunnel through a jump host with key agent forwarding:
rsync -e "ssh -A -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ProxyCommand=\"ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -W %h:%p ${SSH_USER}@${BASTION_HOST}\"" ./deployment/ ${SSH_USER}@${TARGET_HOST}:/var/www/${ENV}/
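On OpenSSH 7.3 or newer, a jump host can usually be expressed more briefly with ssh's -J (ProxyJump) option instead of a ProxyCommand. This is an alternative technique, not the command shown above: the host key options are left out and the sketch only prints the resulting command. The fallback values for SSH_USER, BASTION_HOST, TARGET_HOST and ENV are hypothetical.

```shell
# Hypothetical fallback values; the real ones come from your environment.
SSH_USER=${SSH_USER:-deploy}
BASTION_HOST=${BASTION_HOST:-bastion.example.com}
TARGET_HOST=${TARGET_HOST:-app01.internal}
ENV=${ENV:-staging}

# -J makes ssh connect to the bastion first and tunnel on to the target.
remote_shell="ssh -J ${SSH_USER}@${BASTION_HOST}"

# Print the command instead of running it, so no host is contacted here.
printf 'rsync -avz -e "%s" ./deployment/ %s@%s:/var/www/%s/\n' \
    "$remote_shell" "$SSH_USER" "$TARGET_HOST" "$ENV"
```

Authentication against the target still happens from the local machine, so the local key or agent is used for both hops.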
Run with elevated permissions:
rsync --rsync-path 'sudo rsync' ...
Copy remote->local, keep attributes, use compression, be verbose and show human readable units:
rsync -avzh -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" \
--progress 'preview-host:/var/www/html/mb/app/*' /tmp/
Copy local->remote, keep attributes, be verbose, use root permissions on the target, delete files that are not in the source path and use checksums to compare files:
rsync --rsync-path 'sudo rsync' -avP \
-og --chown=root:root \
--checksum --delete \
./deployment/ ubuntu@${TARGET_HOST}:/opt/app/${ENV}/
Filtering
--exclude=important_file.txt
- Can be used to omit files or directories from being synced.
--include=backups/most_recent --exclude='backups/*'
- Inside an exclusion we can explicitly include certain files, folders or patterns that fall inside the broader exclude. The include has to come before the exclude, because the first matching rule applies.
In the following example, we are excluding the node_modules and tmp directories as well as .DS_Store files, all located inside the src_directory:
rsync -a --exclude=node_modules --exclude=.DS_Store --exclude=tmp /src_directory/ /dst_directory/
The second option is to use the --exclude-from argument and specify the files and directories you want to exclude in a file, one pattern per line.
rsync -a --exclude-from='/exclude-file.txt' /src_directory/ /dst_directory/
/exclude-file.txt
file1.txt
.DS_Store
node_modules
tmp
Exclude filters can be written in a condensed form using the brace expansion of shells like bash or zsh:
# Before:
rsync -a --exclude 'file1.txt' --exclude 'dir1/*' --exclude 'dir2' src_directory/ dst_directory/
# After:
rsync -a --exclude={'file1.txt','dir1/*','dir2'} src_directory/ dst_directory/
Pattern matching
'*.jpg*'
It is a little trickier to exclude everything except the files that match a certain pattern. Let’s say you want to copy only the files ending with .jpg and exclude all other files and directories.
One option is to use the following command:
rsync -a -m \
--include='*.jpg' \
--include='*/' \
--exclude='*' \
src_directory/ dst_directory/
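When using multiple include/exclude options, the first matching rule applies.
--include='*.jpg'
- First we are including all .jpg files.
--include='*/'
- Then we are including all directories inside the src_directory directory. Without this, rsync will only copy *.jpg files in the top-level directory.
--exclude='*'
- Finally we are excluding everything that was not matched by the rules before.
-m
- Prunes empty directories, so directories without any .jpg files are not created on the destination.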
Cleanup
# Automatically delete source files after successful transfer
rsync --remove-source-files -zvh backup.tar /tmp/backups/
Since --remove-source-files does not remove directories, issue the following commands to move files over ssh and delete the emptied directories afterwards:
rsync -avh --progress --remove-source-files /source/* user@server:/target \
&& find /source -type d -empty -delete
Logging
rsync ... > /tmp/rsyncbackup.log 2> /tmp/rsyncbackup.errors.log
Useful parameters
-o, --owner          preserve owner (super-user only)
-g, --group          preserve group
    --devices        preserve device files (super-user only)
    --specials       preserve special files
-D                   same as --devices --specials
-t, --times          preserve modification times
-p, --perms          preserve file/directory permissions
-l, --links          copy symlinks as symlinks
-u, --update         skip files that are newer on the receiver
-C, --cvs-exclude    auto-ignore files in the same way CVS does
    --progress       show progress during transfer
    --stats          give some file-transfer stats
    --list-only      list the files instead of copying them
    --bwlimit=KBPS   limit I/O bandwidth; KBytes per second
-a, --archive
- Archive mode, equivalent to -rlptgoD. This option tells rsync to sync directories recursively, transfer special and block devices, and preserve symbolic links, modification times, group, ownership, and permissions. Sometimes you will have to supplement the -a parameter with:
  -X
  - Preserve extended attributes, e.g. SELinux contexts may be stored as such attributes on distributions like CentOS/RedHat where these are used by default.
  -A
  - Preserve ACLs (Access Control Lists).
-z, --compress
- This option forces rsync to compress the data as it is sent to the destination machine. Use this option only if the connection to the remote machine is slow.
-P
- Equivalent to --partial --progress. When this option is used, rsync shows a progress bar during the transfer and keeps partially transferred files. It is useful when transferring large files over slow or unstable network connections. Without -P or --partial, if the connection drops during a transfer, the partial file is deleted and you will have to restart from scratch.
--delete
- Delete files in the destination that don’t exist anymore in the source location. Used when you want to keep an exact replica of the source files/directories. Without this option, files that have been deleted in the source won’t be deleted on the destination, which is preferable for most backup schemes. Keep in mind that the --delete parameter exposes you to the risk of losing the entire backup if used inappropriately (e.g., if you use the wrong source directory or an empty one). An option like --max-delete=3, which makes sure rsync never deletes more than 3 files, can reduce the amount of data you might lose. The number can be adjusted according to your use case.
-q, --quiet
- Use this option if you want to suppress non-error messages.
-e
- This option allows you to choose a different remote shell. By default, rsync is configured to use ssh.
-v
- Verbose mode prints more information: which files are being transferred and a summary of the bytes transferred and the speedup ratio.
-r
- Copy every object contained in directories and subdirectories. Without this option, directories are skipped and only files are copied. E.g., rsync -v root@example.com:/etc/* /tmp would only copy files from /etc/. When you are copying a single directory, you have to use this option or the -a parameter; otherwise nothing happens and the directory is simply skipped.
-h
- Show “human readable” numbers: instead of statistics being shown in bytes, they will be displayed in kilobytes, megabytes, etc., because 9.82M is easier to read than 9,821,016.
Edge Cases
Thin Provisioning
When files get significantly bigger on the other side, Thin Provisioning (TP) is probably enabled on the source system. TP is a method of optimizing the use of available space in Storage Area Networks (SAN) or Network Attached Storage (NAS).
E.g.: The source file occupied only 10GB because TP was enabled, but when it was transferred using rsync without any additional configuration, the destination received the full 100GB. rsync does not handle this automatically; it has to be configured.
The flag that does this work is -S or --sparse, which tells rsync to handle sparse files efficiently: runs of zeros are written as holes on the destination instead of being stored as data, so source and destination both end up with a file that occupies about 10GB.