16 April 2006

Backup blogger

I'm backing up this blog with:
wget --directory-prefix=/srv/blog --wait=7 --mirror --ignore-length --page-requisites --convert-links -e robots=off --span-hosts --domains=photos1.blogger.com,parwy.blogspot.com http://parwy.blogspot.com/
Some notes:

--directory-prefix=/srv/blog: saves all files under /srv/blog
--mirror: makes wget recurse to any depth and check timestamps, so it only downloads files newer than the local copies
--ignore-length: Blogger seems to report a different length for files, which causes wget to download files that haven't changed
-e robots=off: without this, photos at photos1.blogger.com are not downloaded, because that site has a robots file that prevents any crawling. Since I'm only fetching photos I link to, it's okay to disable it.
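
For running this regularly, the same command could be wrapped in a small shell function with one option per line. This is only a sketch: the names BLOG_URL, BACKUP_DIR, backup_blog and the DRY_RUN switch are made up for illustration and are not part of wget or of the command above.

```shell
#!/bin/sh
# Sketch of the backup command above as a reusable function.
# BLOG_URL, BACKUP_DIR, backup_blog and DRY_RUN are illustrative names (assumptions).
BLOG_URL="http://parwy.blogspot.com/"
BACKUP_DIR="/srv/blog"

backup_blog() {
    # If DRY_RUN is set to a non-empty value, the command is echoed instead of run.
    ${DRY_RUN:+echo} wget \
        --directory-prefix="$BACKUP_DIR" \
        --wait=7 \
        --mirror \
        --ignore-length \
        --page-requisites \
        --convert-links \
        -e robots=off \
        --span-hosts \
        --domains=photos1.blogger.com,parwy.blogspot.com \
        "$BLOG_URL"
}
```

Calling backup_blog runs the mirror; setting DRY_RUN=1 first only prints the command, which is handy for checking the options before letting it loose.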