I currently back up the contents of an external hard drive A by (1) deleting everything in the destination external hard drive B and (2) copying everything from A to B (in Finder). The external hard drive only contains “basic data” (folders, images, videos, documents, etc.), nothing fancy/weird.
But with all the amazing free tools that exist out there, I couldn't find a decent one for backups (suggestions accepted).
I want to automate this. After a few searches, it seems I can use rsync to avoid copying files that are already on the destination hard drive (of course, if a file or folder has changed, I want it updated, and likewise for deleted or new files and folders; the idea is to mimic what I usually do manually in Finder).
I got to this point
Now the log is full of files whose names start with ".", "._", or even "._.". Are they necessary?
Question
How can I copy only the files that matter, leaving behind all those that don't? For example, if I have photo.png, I would expect to copy photo.png, not ._photo.png or many other files.
Which --exclude or --exclude-from patterns can be used safely on a Mac? Maybe a good rule is “just copy anything that Finder shows”, which is what I would copy if I went through all the folders manually.
Extra question
Is -av --progress --delete enough? Am I committing a crime or risking some data? What options would you use?
Please, back your answer with some arguments, I would appreciate it.
I'm looking for a way to automate rsync, which seems quite powerful and free to use.
5 Answers
The safest option is to copy everything, including invisible meta-data files.
Files tend to exist for a reason and as software changes, so will the existence, purpose, and contents of these meta-data files. Copying everything will reduce the maintenance burden and offer peace of mind that nothing is being lost.
The log file created by rsync will be technically involved, by the very nature of rsync being a command-line tool. Consider presenting a tidied-up, meta-data filtered, and maybe even colour-coded output for the user to check. You could do this with a wide range of scripting tools and languages. Please ask more questions here on Ask Different and on Stack Overflow if you need help with this approach.
Delete Last & Copy OS X Resources
Regarding the rsync flags, take a look at the question “Fastest and safest way to copy massive data from one external drive to another”. In this question, a few additional flags are used and explained:
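In this situation, the -E flag ensures that resource forks and other Mac-specific properties are copied. This holds for the rsync 2.6.9 that ships with OS X; in rsync 3.x, -E means --executability and extended attributes are handled by -X.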
You may want to consider --delete-after to avoid deleting until the copy has completed; please note that this approach will potentially require a destination drive twice the size of the source.
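Before letting any --delete variant loose on a backup drive, it can help to preview what would be removed. The following is only a sketch: the function name preview_mirror is made up for illustration, and the Mac-specific flags are left out on purpose because their meaning differs between rsync versions.

```shell
# preview_mirror SRC DEST
# Lists what a mirroring run would copy and delete without changing
# anything on disk (--dry-run). Deletions show up as "*deleting" lines.
preview_mirror() {
    # Trailing slashes mean "the contents of SRC go into DEST".
    rsync -a --delete-after --dry-run --itemize-changes "${1%/}/" "${2%/}/"
}

# Example with hypothetical volume names:
#   preview_mirror /Volumes/A /Volumes/B
```

Once the list looks right, the same command without --dry-run performs the actual mirror.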
An answer to a related question, How can I omit FCPX Render Files from a Time Machine backup?, provided a useful link of OS X files and folders that can be excluded from most back-ups. This link provides a practical list of file patterns, folders, and paths that you could exclude.
Include dot Files
There are good reasons to back up files whose names begin with a dot, that is, files matching .*
Some software keeps preferences, settings, and other information of value in invisible dot-prefixed folders at the top of the user's folder. Running ls -la ~/ will reveal these folders and files.
If any user uses version control software, or has software that in turn uses it, be sure to back up dot files. Software like Subversion and Git both store critical information within their dot folders. These hidden folders can be scattered across your file system, wherever a project is checked out.
Spotlight is OS X's search service. Spotlight uses the mdworker process to index and update the search catalogue. If you are concerned about possible disk corruption or slow copies, disabling mdworker while running rsync may help. Personally, I leave Spotlight running while running large rsync transfers.
I'd advise against pruning meta-data during a backup, particularly the dot-files, e.g. ._$filename. However, if you really want to exclude the dot-files from your rsync command, add --exclude '.*' to it.
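A narrower variant than --exclude '.*' is to drop only the Finder debris and keep other dot-files such as .git folders. This is a sketch under assumptions: the function name is made up, and the choice of the two patterns (._* and .DS_Store) is mine, not a vetted list.

```shell
# mirror_without_finder_debris SRC DEST [extra rsync options...]
# Copies SRC into DEST, skipping AppleDouble files (._*) and .DS_Store
# but keeping every other dot-file and dot-folder.
mirror_without_finder_debris() {
    src=$1
    dest=$2
    shift 2
    rsync -a --exclude '._*' --exclude '.DS_Store' "$@" "${src%/}/" "${dest%/}/"
}

# Example with hypothetical volume names; --dry-run only reports:
#   mirror_without_finder_debris /Volumes/A /Volumes/B --dry-run -v
```

Because the patterns contain no slash, they match at any depth of the tree, so ._photo.png is skipped wherever it sits.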
If you're using rsync version 3.0.6 as per Carbon Copy Cloner or 3.1.2 as per Homebrew, you can take a cue from Carbon Copy Cloner's arguments:
rsync -A -X -H -p --fileflags --force-change -l -N -rtx --protect-decmpfs --numeric-ids -go --delete-during --backup --backup-dir=</PATH/TO/STICK/BACKUP_when_using_delete> --protect-args <SRC>/ <DEST>
I have used rsync for backups at several jobs, and I use it at home.
I highly recommend it, but with some modifications. As a backup tool, it's great, but as an archiving tool it falls a little flat. Yes, it copies everything, but you don't get versions of everything, you always get the latest versions only.
I used this guide http://www.mikerubel.org/computers/rsync_snapshots/ as a jumping-off point. Read the whole page. It does a great job of explaining the options, and outlines how you can implement incremental backups. And amazingly, the article is over 10 years old but is still applicable today. Gotta love unix.
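One way to get versions with plain rsync is its --link-dest option, which hard-links unchanged files to a previous snapshot. The sketch below is my own minimal take in the spirit of that guide, not a copy of it; the function name, the timestamp format, and the "latest" symlink are assumptions, and BACKUP_ROOT is assumed to be an absolute path on a filesystem that supports hard links.

```shell
# snapshot SRC BACKUP_ROOT
# Creates BACKUP_ROOT/<timestamp>. Files unchanged since the previous
# snapshot are hard links, so they take no extra space.
snapshot() {
    src=${1%/}
    root=${2%/}
    new="$root/$(date +%Y-%m-%d_%H%M%S)"
    mkdir -p "$root"
    if [ -e "$root/latest" ]; then
        # A relative --link-dest would be resolved against the
        # destination directory, hence the absolute BACKUP_ROOT.
        rsync -a --link-dest="$root/latest" "$src/" "$new/"
    else
        rsync -a "$src/" "$new/"
    fi
    # Repoint the "latest" symlink at the snapshot just made.
    rm -f "$root/latest"
    ln -s "$new" "$root/latest"
}

# Example with hypothetical paths:
#   snapshot /Volumes/A /Volumes/B/snapshots
```

Old snapshots can be removed with rm -rf; a file's data disappears only when the last snapshot linking to it is gone.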
I'm not completely pleased with the current answers, so I will try to cover here some of the possibilities that I've seen on the web while trying to find a nice setup for rsync.
And, by the way, if one is interested in Time Machine-like copies, there's rsnapshot. And there's also Unison for two-way synchronization. Plus, there are actually a few GUIs, like Backup Utility and arRsync; not exactly what I was looking for, but they might do the job for somebody.
First, my only intention was to duplicate photos and videos, so an exact copy wasn't needed, hence no need to care too much. In fact, my main doubt was whether I could exclude everything else (the same thing that happens if, say, I download a photo from the internet: I just download a .png; everything else is not downloaded but autogenerated).
Here's an unstructured list of thoughts that you may want to take into account:
- If you want to ensure that your laptop doesn't go to sleep, you may want to caffeinate the process: caffeinate -s rsync -av ... Taken from here.
- If you are doing local copies, as in my case, or if your network connection is not too slow, you should not use the -z option (compression); use -W (transfer whole files rather than delta transfers; this is the default when local) and probably --inplace to make transfers fast. Taken from here.
- You can use --delete-after so that every file is transferred first and stale copies on the destination (for example, the old location of a file you moved) are deleted only afterwards; this is safer than deleting before or during the transfer.
- You can stop the transfer by pressing Ctrl+C, and it will stop cleanly. Taken from here. This was one of my fears with SuperDuper!: if you need to stop a transfer you get the message “You will leave the hard drive in an unknown state…”.
- In recent versions (rsync 3.1 and later) there's --info=progress2, which reports progress for the whole transfer on top of what -v prints.
- There's -P (which equals --partial and --progress), which keeps partially transferred files so you can carry on when you restart the process (if for some reason you can't finish the synchronization in one go).
- One might be interested in stopping Spotlight or Time Machine before doing the copy and re-enabling them after the transfer, or even disabling Spotlight for the external disk.
- Other options used in all those references include -x (or --one-file-system), -E (--executability), -H (--hard-links), -X (--xattrs), -A (--acls), and --sparse, --hfs-compression, --protect-decmpfs. You may want to look at them.
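Several of the transfer options listed above can be combined in a small wrapper. This is a sketch, not a recommended final command: the function name local_copy is made up, --inplace and the Mac-specific flags are left out (they can be passed as extra options), and caffeinate is used only when it is available.

```shell
# local_copy SRC DEST [extra rsync options...]
# Mirrors SRC into DEST for a local disk-to-disk copy.
local_copy() {
    src=$1
    dest=$2
    shift 2
    # -W: whole files, no delta transfer; -P: --partial --progress;
    # --delete-after: remove stale files only once the copy is done.
    set -- rsync -a -W -P --delete-after "$@" "${src%/}/" "${dest%/}/"
    if command -v caffeinate >/dev/null 2>&1; then
        # Keep the Mac awake for as long as rsync runs.
        caffeinate -s "$@"
    else
        "$@"
    fi
}

# Example with hypothetical volume names:
#   local_copy /Volumes/A /Volumes/B --dry-run
```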
For me, a basic command looks like (I might use more options, but this is enough for an example):
Now, in my case, I could just include {*.jpg,*.png,*.mp4,*.txt,*.pdf,…} and no one would say “you also need system files”, but since I don't want to hunt down every file type I have, I prefer to exclude. And there are things that not only can be excluded but seem convenient to exclude.
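For comparison, the include-only approach could be sketched as follows. The function name is hypothetical, and only the five extensions named above are listed; anything hidden behind the “…” would need its own --include. Note that rsync takes one pattern per --include rather than a brace list, and that ._photo.png also matches *.png, so the AppleDouble files have to be excluded first.

```shell
# copy_media_only SRC DEST [extra rsync options...]
# Copies only the listed file types; rsync applies the first rule that matches.
copy_media_only() {
    src=$1
    dest=$2
    shift 2
    # 1. skip AppleDouble files, 2. descend into every directory,
    # 3. keep the listed types, 4. drop everything else.
    rsync -a --prune-empty-dirs \
        --exclude '._*' \
        --include '*/' \
        --include '*.jpg' --include '*.png' --include '*.mp4' \
        --include '*.txt' --include '*.pdf' \
        --exclude '*' \
        "$@" "${src%/}/" "${dest%/}/"
}

# Example with hypothetical volume names:
#   copy_media_only /Volumes/A /Volumes/B --dry-run -v
```

The patterns are case-sensitive, so files named like PHOTO.JPG would need their own --include.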
I found a few links, take what you want:
And from there you could probably get what's safe or sane to ignore. Here's the full list (I just removed duplicates)