I’m surprised I hadn’t figured this trick out sooner, but I was kind of forced to when I got a new hard drive for my laptop, didn’t want to reinstall Linux, and didn’t have a lot of options. The only place I could back it up was my local server. A USB-attached hard drive would probably have been best, but all I had was a network, so I needed to get the data to my server. Rsync might have worked, but I expected it would take a very long time. The best option would be to transfer a gzip’ed archive, but I couldn’t save it locally and then scp it. So I had to send the output of tar/gzip directly over the network. I’ve done a lot of things with ssh, but not this. The key turned out to be the way ssh handles stdin and stdout.
If you pass a command as the final argument to ssh, it executes the command remotely, feeding the local stdin to the remote command and sending the remote command’s stdout back to the local stdout. So all I had to do was execute a remote command that saves stdin to a file on the remote system. This can be done via the command:
$ tar -cz . | ssh user@host "cat > file.tar.gz"
The tar -cz . says “create a gzip-compressed archive of the current directory.” The | (pipe) says “take the output of the tar command and feed it to the ssh command.” The ssh user@host "cat > file.tar.gz" command says “ssh to host as user and execute the command ‘cat > file.tar.gz’.” In cat > file.tar.gz, the cat command is there to catch the incoming stream and says “take my stdin and write it to stdout,” and the > file.tar.gz says “direct the output of cat to the file file.tar.gz.”
The tar command gets us the gzip’d archive, the ssh command lets us pipe that output to a remote command, and cat gives ssh a command to execute that takes its input (the output of the tar command) and, via the redirect, writes it to a file.
Then, once I had the backup and had set up the new hard drive in my laptop, I could restore the data using the command:
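The pipeline can be tried out without a server. This is only a minimal sketch under a few assumptions: sh -c stands in for ssh user@host, the temporary directories and a.txt are made up for the demo, and -f - is added so tar writes to stdout explicitly instead of relying on its default archive device.

```shell
# Made-up source and "remote" directories for the demo.
src=$(mktemp -d)
dst=$(mktemp -d)
echo "hello" > "$src/a.txt"

# tar writes the gzip'd archive to stdout (-f -); the quoted command reads it
# on stdin and saves it. Against a real server, `sh -c` would be replaced
# with `ssh user@host`.
tar -czf - -C "$src" . | sh -c "cat > '$dst/file.tar.gz'"

# List the archive to confirm it arrived intact.
listing=$(tar -tzf "$dst/file.tar.gz")
echo "$listing"
```

Listing the archive right after the transfer is a cheap sanity check before wiping the old drive.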
$ ssh user@host "cat file.tar.gz" | tar -xz
This does the same thing in reverse. It sshs to host as user and runs the command cat file.tar.gz which reads the file.tar.gz and outputs it to stdout. Then we capture that output and pipe it through tar, locally, which gunzips/untars the data to the current directory.
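The reverse direction can be sketched the same way. Again this is an illustration rather than the exact commands from above: sh -c stands in for ssh user@host, the directories and a.txt are invented, -C picks the directory to extract into, and -p asks tar to keep the stored permissions.

```shell
# Made-up directories: original data, "remote" storage, and restore target.
src=$(mktemp -d)
store=$(mktemp -d)
restore=$(mktemp -d)
echo "hello" > "$src/a.txt"
chmod 640 "$src/a.txt"

# Build the archive that would already be sitting on the server.
tar -czf "$store/file.tar.gz" -C "$src" .

# The stand-in "remote" cat streams the archive to stdout; the local tar
# reads it from stdin (-f -) and unpacks it into $restore.
sh -c "cat '$store/file.tar.gz'" | tar -xzpf - -C "$restore"

ls -l "$restore"
```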
I could also have mounted a remote directory from my server using something like NFS, but I didn’t feel like taking the time to set that up.
This is a really neat example of stream manipulation in Linux. Hopefully you can learn something from it.
Note that I did all this from an Ubuntu Live CD, so I wasn’t actively using my old hard drive while backing up the data (it was mounted read-only with mount -o ro /dev/sda3 /mnt/sda3), and so that I could set up my new hard drive. The only other thing I had to do after restoring the data was reconfigure grub, /etc/fstab, and /etc/blkid.tab with the new UUIDs for the hard drive. At first I had to use /dev/sdaX instead of UUIDs just to be able to boot and find the UUIDs (I couldn’t find them from the Ubuntu 7.10 live CD, I’m guessing because it was a little old, and I didn’t feel like downloading and burning 8.10). Then I could configure the new UUIDs, reboot, and all was good. Let me know if you would like more details on the UUID part.
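For the /etc/fstab part, the edit amounts to swapping one UUID string for another. Here is a rough sketch on a throwaway file, not my real configuration: both UUIDs are placeholders, the ext3 line is an invented example entry, and on a real system the new UUID would have to be looked up first (for instance with blkid, assuming it is available).

```shell
# Placeholder UUIDs; on a real system the new one would come from
# something like `blkid /dev/sda3` or `ls -l /dev/disk/by-uuid`.
old_uuid="11111111-1111-1111-1111-111111111111"
new_uuid="22222222-2222-2222-2222-222222222222"

# Work on a throwaway copy rather than the real /etc/fstab.
fstab=$(mktemp)
printf 'UUID=%s / ext3 defaults 0 1\n' "$old_uuid" > "$fstab"

# Swap the old UUID for the new one, keeping a .bak copy of the original.
sed -i.bak "s/UUID=$old_uuid/UUID=$new_uuid/" "$fstab"
cat "$fstab"
```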
This is some fun io redirection for sure, but I’m curious why you threw out rsync? Rsync was basically designed for just this sort of task. rsync -uavP -e ssh (src) (dst) where src could be /mnt/sda3 and dst could be user@server:/home/user/backupPath. It allows resuming interrupted transfers and gives great progress detail, etc.
I had a few reasons:
1. I didn’t have enough space on my server for all my laptop data uncompressed but I did have enough if it was compressed.
2. I am not that familiar with rsync (hence all my following reasons).
3. I wasn’t sure how permissions and special files would be preserved.
4. In my experience, transferring many small files goes much, much slower than transferring one large file. I thought rsync might transfer the files one at a time like that.
I have a feeling these reasons would be invalid or could be worked around, but I wasn’t as confident with rsync as I was with io redirection and tar/ssh.
The progress detail would have been helpful. Those commands give no indication of progress so I would have to check file size / disk usage.
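Checking the size by hand could look roughly like this. It is only a sketch: the temporary file and the 2048 bytes written to it are stand-ins for the growing archive, which in my case would have lived on the server.

```shell
# Stand-in for the growing archive; the real one would live on the server
# and be checked with something like: ssh user@host "ls -lh file.tar.gz"
archive=$(mktemp)
head -c 2048 /dev/zero > "$archive"   # pretend 2 KiB has arrived so far

# Byte count of the archive at this moment.
size_bytes=$(( $(wc -c < "$archive") ))
echo "received so far: $size_bytes bytes"
```

Running the check every so often from a second terminal gives a crude progress indicator while the transfer is still going.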
The SSH redirect is pretty snazzy, provided you have the right permissions to use SSH or rlogin.
In some of my work, I’ve been limited to telnet due to security protocols (don’t ask) and found I had to write scripts in the “expect” language to log in automatically and pass remote commands back and forth. You could spend weeks trying to learn various redirection methods!