
08 December 2010

Text only mirror site

Today I finished writing a very dirty shell script that automatically creates a text version of this site once a month, that is, a mirror site updated monthly. You can take a look at it here.

You may wonder why the text version is not a 'real text' version but a cheap copy of the actual site. Well, there are two good reasons for that:

1st The hosting server demands an index document in .htm format.
2nd I have tried processing the HTML files with html2text, but unfortunately the results are not good.

Well, if anyone is interested in the code I'll publish it here. It works as follows:

- The script first tests the working conditions and cleans up.
- Then it downloads the entire site and turns it into text, using a for loop for each of the two steps.
- After that it uploads the result to the server.
- Finally it sends me an e-mail to confirm success.
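
Until the real code is published, the four steps above could be sketched roughly like this. It is not the actual script, only a minimal outline under assumptions: the site address, the page list, the working directory and the mail address are placeholders, "turning into text" is assumed to mean stripping <img> tags so the pages stay in .htm format, and the upload step is left as a stub because the transfer method is not stated here.

```shell
#!/bin/sh
# Sketch only: every name, URL and address below is a placeholder,
# not a value taken from the real script.
SITE_URL="http://example.com"           # hypothetical site address
PAGES="index.htm about.htm"             # hypothetical list of pages
WORK_DIR="${TMPDIR:-/tmp}/text-mirror"  # hypothetical working directory
MAIL_TO="me@example.com"                # hypothetical recipient

# Step 1: test working conditions (required tools) and clean up.
check_and_clean() {
    for tool in wget sed mail; do
        command -v "$tool" >/dev/null 2>&1 || {
            echo "missing tool: $tool" >&2
            return 1
        }
    done
    rm -rf "$WORK_DIR" && mkdir -p "$WORK_DIR"
}

# Step 2a: download every page with a for loop.
download_site() {
    for page in $PAGES; do
        wget -q -O "$WORK_DIR/$page" "$SITE_URL/$page" || return 1
    done
}

# Step 2b (assumption): "text" means the same HTML without images.
# Reads HTML on stdin, writes it to stdout with <img ...> tags removed.
strip_images() {
    sed 's/<img[^>]*>//g'
}

# Apply strip_images to every downloaded page with a second for loop.
to_text() {
    for page in $PAGES; do
        strip_images < "$WORK_DIR/$page" > "$WORK_DIR/$page.tmp" &&
            mv "$WORK_DIR/$page.tmp" "$WORK_DIR/$page" || return 1
    done
}

# Step 3: upload stub; replace with whatever transfer the host allows.
upload_site() {
    echo "upload the contents of $WORK_DIR here"
}

# Step 4: send a confirmation e-mail.
notify() {
    echo "Text mirror updated on $(date)" | mail -s "Mirror OK" "$MAIL_TO"
}

main() {
    check_and_clean && download_site && to_text && upload_site && notify
}

# Do the real work only when called with --run.
if [ "${1:-}" = "--run" ]; then
    main
fi
```

Saved as, say, mirror.sh (a made-up name), it would be started with `sh mirror.sh --run`. For the monthly update, a crontab entry such as `0 3 1 * * sh /path/to/mirror.sh --run` is one option; the path and the time are placeholders too.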

TODO: - Consider creating a section of useful scripts?
      - Test mirror site with w3m and lynx. (I have tested it with elinks and it works great.)