This page is part of my Western Digital My Book World Edition scratch pad.

rsync - make it fast!


rsync really is the Swiss army knife for backing up, shadowing, syncing, distributing and collecting files. I know no tool that is more powerful, comfortable and automatable. That is why I tried so desperately to make rsync fast on my MBWE box.

This was written specifically with the WD My Book series in mind, but generally works on any system. It is only worth investigating if you are running rsync on systems with a very weak cpu.

Before you start, understand this: What we do here is trade security for speed. Be aware of this. In my setting I gained a factor of 3 to 5 in speed, but lost all encryption and most of the authentication. If you understand the risk and consider your general environment safe enough for that, go ahead. NEVER use this over public lines!

About performance and speed

Under good conditions, the WD Caviar drives sport a transfer rate of well above 100 MB(ytes)/s, that is more than 800 Mb(its)/s and thus just short of the 1 Gb(its)/s the nic delivers. So theoretically...

... but I had to find out that rsync is slow on our box. The bottleneck is not the disk, and not the nic. It is only the cpu. Almost all modern transport protocols are encrypted, and so the weak cpu limits the speed! ssh will take about 80 %CPU, rsync the rest, more or less. I found a remedy to lower cpu usage.

Short version

Find and install rsh and/or rshd on the MBWE and clients. Configure the server (passive) locale. Configure and test until connecting with rsh works painlessly. Then use rsync as always, but add the flag --rsh=rsh. ssh was the cpu-killer, rsh is negligible.

About versions: There was a big performance gain in the step from rsync 2.x.x to 3.x.x. So type rsync --version to check and act accordingly. This is the easiest way to gain some speed! I am no prophet and cannot speak for future versions, so reading the rsync home page is always a good idea. Usually it is not a big issue to skip the minor releases, but do not miss a change in the major version number, say from 4.x.x to 5.x.x.
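The version check can be scripted. The following is only a minimal sketch: the helper name rsync_major is made up for illustration, and it assumes that the first line of the rsync --version output looks like "rsync  version 3.x.y  protocol version N" (some builds print a "v" in front of the number).

```shell
# Read an rsync version banner on stdin and print its major version number.
# Prints nothing if the banner does not have the assumed layout.
rsync_major() {
    sed -n 's/^rsync  *version v\{0,1\}\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

if command -v rsync >/dev/null 2>&1; then
    major=$(rsync --version | rsync_major)
    case "$major" in
        '')  echo "could not parse the rsync version banner" ;;
        1|2) echo "rsync $major.x: upgrade to 3.x first, it is the cheapest speed-up" ;;
        *)   echo "rsync $major.x: fine, go on with the rsh setup" ;;
    esac
else
    echo "rsync is not installed on this machine"
fi
```

Run it on both ends, the MBWE and the client, since each side has its own rsync.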

The Remedy

Bitchy OSes!

There are some (not WD) systems that link rsh to ssh, rcp to scp and so on. Doing this without backward compatibility is what I call effrontery! You bastards, and without even saying so! So you might think you have an rsh, but you don't. Complain to the packagers/distributors, remove the links and go ahead. But be careful, a re-install of the ssh tools might recreate the links.
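To find out whether your rsh is a disguised ssh, you can follow the link chain. This is a rough sketch, not a complete test: the function name points_to_ssh is made up, it relies on readlink -f (available in GNU coreutils and BusyBox), and it only catches the case where the final target is a file literally named ssh.

```shell
# Succeeds if PATH_TO_CMD, after resolving all symlinks, ends at a file named "ssh".
points_to_ssh() {
    real=$(readlink -f "$1") || return 2
    [ "$(basename "$real")" = "ssh" ]
}

rsh_path=$(command -v rsh) || rsh_path=""
if [ -z "$rsh_path" ]; then
    echo "no rsh found in PATH"
elif points_to_ssh "$rsh_path"; then
    echo "$rsh_path is really ssh, so you have no rsh yet"
else
    echo "$rsh_path does not resolve to ssh"
fi
```

The same check works for rcp and rlogin if you pass their paths instead.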

1st find a remote shell that is light on cpu and memory

I was surprised this was soooo hard to find!! From my older unix days I knew the "r-tools" like rlogin, rsh and rcp. But since the success of ssh, rsh has come to be considered dispensable, insecure and deprecated, and nobody really misses it today. Rightly so, one should say, and it simply vanished from modern machines. (But see Bitchy above!)

I was lucky that I still had binaries on my full-blown Intel Linux machine, but for the box I did not find rsh in optware, nor did I easily find source packages. Eventually I found sources here. Or look for netkit-rsh-0.17.tar.gz with your favourite search engine. You only need this one package; it contains the sources for rlogin(d), rsh(d), rcp and rexec(d).

Unpacking, configuring and compiling worked like a charm. The source is extremely simple and does not rely on rare libraries. I did not check in detail, but evidently everything needed was already there, because I had the development environment on the box. Both ways - installing compilers on your box or cross-compiling - are explained somewhere in the wiki. make install failed for some reason I did not bother to investigate, so I copied the files to appropriate directories (clients to /usr/bin and daemons to /usr/sbin, or some such).

2nd convince rsync to use it

The second task is simple: you can run rsync over rsh (remote shell), which is - largely - the same as ssh (the default) but without the security! You do this by adding the flag --rsh=rsh to your rsync command line.
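If you do not want to type the flag every time, a small wrapper helps. This is just a sketch: the function name rsync_over_rsh, the host name mbwe and all paths are made-up examples, and the RSYNC variable exists only so that the wrapper can be tried without a real transfer.

```shell
# Call rsync with rsh as the remote shell and pass all other arguments through.
# Set RSYNC to another program (for example echo) to see what would be run.
rsync_over_rsh() {
    "${RSYNC:-rsync}" --rsh=rsh "$@"
}

# Hypothetical usage, pushing a directory to the box:
#   rsync_over_rsh -av /home/mike/photos/ mbwe:/shares/backup/photos/
#
# Equivalent spellings that rsync itself understands:
#   rsync -av -e rsh /home/mike/photos/ mbwe:/shares/backup/photos/
#   RSYNC_RSH=rsh rsync -av /home/mike/photos/ mbwe:/shares/backup/photos/
```

Exporting RSYNC_RSH=rsh in the environment of a backup script has the same effect for every rsync call in that script.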

Configuration of rsh and rshd

Caveat: This version does not (really) work with an rsync server in daemon mode. And what follows is just a quick-fire config. You really should RTFM!

You need to set up an rsh server (rshd) on the passive locale, usually your data backing machine - the MBWE. The client (rsh) goes on the commanding locale, where you start rsync and automated scripts.
Your firewall needs to be configured to allow connections from the commanding locales to port shell (usually 514/tcp) on the passive locales. Note that the rsh client connects from a privileged source port (below 1024), not from a high port.

In my box I have inetd in use. On more modern systems you might have xinetd. As this is for the MBWE I only explain the inetd variant. Edit /etc/inetd.conf so it contains a line like:

shell stream tcp nowait root /usr/sbin/rshd rshd -h -a

of course with the right path to rshd on your box. Reload inetd or reboot. Have a look at the man page of rshd (the one supplied with the package, not any you find elsewhere, they might differ!) to find out about the flags; you might want to use different ones.

Be careful: My chosen flags "-h -a" plus the configuration in the files below offer a really simple but insecure way! This is for demonstration purposes only, and perhaps for use in your otherwise secure local home net. Again, read the man pages if you want a more secure approach (this concerns authentication only - you cannot keep your data from being sniffed!)

There is a global file, /etc/hosts.equiv, and/or a file ~/.rhosts in the home of every user whose credentials are to be used on the passive locale. Use your favourite search engine on /etc/hosts.equiv and ~/.rhosts, as the format is simple to learn but needs lengthy explanation. In very short words: hosts.equiv contains a list of REMOTE systems whose users may log in on THIS system, provided the same user exists here. And/or ~/.rhosts contains a list of systems and users who are allowed to log in as THIS user (~) HERE. Important: The owner of this file must be the user ~ refers to, and it must be readable and writable by that user only!

$ cat ~/.rhosts
ws mike
notebook mike
fserver mike
$ chown $USER ~/.rhosts
$ chmod 600 ~/.rhosts
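Wrong ownership or permissions on this file are an easy mistake to make, so a quick check before testing can save time. The sketch below is an assumption-laden helper, not part of the rsh package: the name check_rhosts is made up, it uses stat -c (GNU coreutils and BusyBox syntax), and it insists on exactly the mode 600 used above.

```shell
# Succeeds only if FILE (default ~/.rhosts) exists, belongs to the
# current user and has mode 600. Complaints go to stderr.
check_rhosts() {
    file=${1:-$HOME/.rhosts}
    [ -f "$file" ] || { echo "$file: no such file" >&2; return 1; }
    owner=$(stat -c '%u' "$file") || return 1
    mode=$(stat -c '%a' "$file") || return 1
    [ "$owner" = "$(id -u)" ] || { echo "$file: not owned by you" >&2; return 1; }
    [ "$mode" = "600" ] || { echo "$file: mode is $mode, expected 600" >&2; return 1; }
}

check_rhosts "$HOME/.rhosts" || echo "fix ~/.rhosts before testing rsh"
```

Run it as the user whose ~/.rhosts is meant, on the passive locale.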

Test the setup

Suppose user mike exists HERE and THERE, and the aforementioned files include the right thing, then if mike is logged in HERE he should type

$ rsh THERE ls -l

If everything runs well, he will see a directory listing of user mike's home on the remote system THERE! If not, try to figure out what happened by reading the error messages on screen HERE and in the message log THERE (tail -f /tmp/messages). Remember: Whenever you change /etc/inetd.conf, you also need to reload inetd. The other files take effect immediately.
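When the test fails, one thing worth ruling out on the passive locale is a missing or commented-out shell line in the inetd configuration. A minimal sketch follows; the function name has_shell_service is made up, and the reload command in the comment assumes your inetd re-reads its configuration on SIGHUP and that pidof is available.

```shell
# Succeeds if FILE (default /etc/inetd.conf) has an active, uncommented
# "shell stream tcp" service line.
has_shell_service() {
    grep -Eq '^[[:space:]]*shell[[:space:]]+stream[[:space:]]+tcp[[:space:]]' "${1:-/etc/inetd.conf}"
}

conf=/etc/inetd.conf
if [ ! -r "$conf" ]; then
    echo "$conf is not readable here (xinetd system, or not the server?)"
elif has_shell_service "$conf"; then
    echo "shell service is enabled in $conf"
else
    echo "no active shell line in $conf, so rshd will never be started"
fi

# After editing the file, make inetd re-read it (as root):
#   kill -HUP "$(pidof inetd)"
```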

Using rsync

Okay, using rsync works just as it did before; just add one more parameter, --rsh=rsh, to change it from the default --rsh=ssh. How to use rsync in an effective way is beyond the scope of this little hack.
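To see what the switch gains on your own hardware, you can time the same transfer over both transports. This is a rough sketch only: the function name time_sync, the host name mbwe and the paths are made up, the resolution is whole seconds, and the RSYNC variable merely allows a dry try-out with another program.

```shell
# time_sync TRANSPORT SRC DEST
# Run one rsync transfer over the given remote shell and report the
# wall-clock time in seconds together with the exit status.
time_sync() {
    transport=$1 src=$2 dest=$3
    start=$(date +%s)
    status=0
    "${RSYNC:-rsync}" -a --rsh="$transport" "$src" "$dest" || status=$?
    end=$(date +%s)
    echo "$transport: $((end - start)) s, exit status $status"
    return "$status"
}

# Hypothetical comparison; separate targets, so the second run has to
# copy everything again:
#   time_sync ssh /home/mike/testdata/ mbwe:/shares/backup/test-ssh/
#   time_sync rsh /home/mike/testdata/ mbwe:/shares/backup/test-rsh/
```

Use a test set large enough to run for a minute or so, otherwise connection setup dominates the numbers.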

Have fun!