Consider the following two situations where a local offline copy of websites can come in handy:
Situation A: You are travelling in a taxi or an airplane where you have access to a reading device (like a laptop computer or your mobile phone) but no Internet connection.
Situation B: You have just checked into a modern hotel where there is WiFi (or an Ethernet port) available in your room but the problem is access cost - it’s way too expensive as the hotel charges you per minute.
In the first case, you may want to download a copy of all the material that you want to read before embarking on the journey, while in the latter case the aim should be to download web pages and blogs as quickly as possible to save on WiFi access costs.
Store Web Pages for Offline Viewing
If you have Google Desktop running in the background, you already have a local copy of all web pages that you have recently opened / read in any browser on your computer. You can click "Browse Timeline" inside Google Desktop and your web history will be listed in reverse chronological order - the most recently visited websites will be listed at the top.
The problem with web history in Google Desktop is that it can get cluttered too easily and finding relevant pages from the history may require some effort. In that case you may install Scrapbook for Firefox and only save relevant web pages that you intend to read in an offline environment.
Scrapbook, like Google Notebook, is primarily for organizing web research but it’s an excellent offline browser as well. You can specify the depth level and all target links from the current web page (up to that level) will be saved offline automatically. For instance, say you want to read all stories on the CNN and BBC websites offline. Capture each home page with Scrapbook and set the depth as 1 - it will then save the full text of all the front page stories as well.
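To make the depth setting concrete, here is a minimal Python sketch of a depth-limited capture. It is not how Scrapbook itself is implemented; the names capture, fetch and LinkCollector are made up for this illustration, and a real tool would also save images and stylesheets and rewrite links for offline use.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page and return its HTML as text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def capture(start_url, depth, fetch_page=fetch):
    """Return {url: html} for start_url plus pages up to `depth` links away."""
    saved = {}
    frontier = [start_url]
    for level in range(depth + 1):
        next_frontier = []
        for url in frontier:
            if url in saved:
                continue  # already captured, do not download twice
            html = fetch_page(url)
            saved[url] = html
            if level < depth:
                collector = LinkCollector()
                collector.feed(html)
                for href in collector.links:
                    target = urljoin(url, href)  # resolve relative links
                    if target.startswith(("http://", "https://")):
                        next_frontier.append(target)
        frontier = next_frontier
    return saved
```

With depth set to 0 only the page itself is saved; with depth set to 1 you get the page plus every page it links to, which matches the front page example above.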
Scrapbook can export all the web captures as an HTML web page so you can easily read the saved content on a mobile phone or your PDA. Another popular tool for downloading web pages in Firefox is DownloadThemAll.
The limitation with both of the above tools is that they work only in Firefox and also require some manual work. What if you want to read all the front page stories from all major news websites while offline? All news sites provide RSS feeds but they aren’t full text so you have no option but to scrape content from the main website in order to read it offline.
HTTrack is a free website copier where you can create download jobs and execute them whenever you go online. For example you can create a single download job for all news websites (like BBC, NYT, etc.), set the depth limit as 1 and get an offline version of all the front page news stories in one go. You can also save this job and re-execute it anytime later, either manually or as a scheduled task.
Another good alternative to HTTrack is wget, which is available for Mac, Windows and Linux. You don’t have to spend time learning the complicated command line switches of wget as there are nice GUI apps available both for Mac (CocoaWget) and Windows (WinWget).
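If you would rather script the job than use a GUI, the following Python sketch builds a single wget run for several news sites with a depth limit of 1. The wget switches used are standard ones; the function names, the output folder and the example URLs are my own placeholders, so adjust them to your needs.

```python
import subprocess


def build_wget_command(urls, depth=1, output_dir="offline"):
    """Build the argument list for one recursive wget run over several sites."""
    return [
        "wget",
        "--recursive",                        # follow links from each start page
        f"--level={depth}",                   # stop after this many link levels
        "--page-requisites",                  # also fetch images and stylesheets
        "--convert-links",                    # rewrite links so they work offline
        f"--directory-prefix={output_dir}",   # save everything under this folder
        *urls,
    ]


def run_job(urls, depth=1, output_dir="offline"):
    """Run the download job; assumes wget is installed and on the PATH."""
    return subprocess.run(build_wget_command(urls, depth, output_dir), check=False)


if __name__ == "__main__":
    # Placeholder start pages; replace with the sites you actually read.
    news_sites = ["https://www.bbc.com/news", "https://www.nytimes.com/"]
    print(" ".join(build_wget_command(news_sites)))
```

Saving this as a script and calling run_job from a scheduled task gives you roughly the same "saved job" idea as in HTTrack.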
Download Blogs for Offline Reading
Blogs, or websites that offer RSS feeds, are much easier to handle and save because we know exactly what has changed since we last visited the site.
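As a rough illustration of why feeds are easier, here is a small Python sketch that picks out only the items published after your last visit. It assumes a plain RSS 2.0 feed with pubDate elements; the function name new_items is invented for this example, and Atom feeds would need different element names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def new_items(rss_xml, last_visit):
    """Return (title, link) pairs for RSS 2.0 items published after last_visit."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        if not pub_date:
            continue  # no date, so we cannot tell whether the item is new
        published = parsedate_to_datetime(pub_date)
        if published.tzinfo is None:
            # Assumption: treat dates without a timezone as UTC.
            published = published.replace(tzinfo=timezone.utc)
        if published > last_visit:
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh
```

Here last_visit must be a timezone-aware datetime, for example datetime(2009, 1, 5, tzinfo=timezone.utc).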
There are two categories of blog readers - (a) Addicts or people who are subscribed to several hundred feeds and want to read them all while offline and (b) Casual Readers or people who follow only a dozen or so feeds.
Casual readers can simply add their favorite feeds to Tabbloid and download them all as a PDF newsletter (example).
For people who fall in the category of addicts, the solution that will work best is a dedicated offline reader that can pre-fetch all the new articles and here are some good choices:
My first recommendation has always been FeedDemon - it’s fast, rich in features and the upcoming v2.8 is even better since it lets you export unread items as an HTML web page that can be read on any device.
If you are subscribed to feeds in Google Reader, you can try either RSS Bandit or Scoop - these are desktop based readers that work in offline mode and can synchronize with your Google Reader subscriptions. If you are on Bloglines, a similar solution exists in the form of GreatNews - a desktop RSS reader that is also portable. Google Gears is another solution for Google Reader users but it has limitations.
The advantage of all the above solutions is that they support synchronization - so if you mark an item as read in an offline environment, the change will get propagated when you next go online and there’s no double work.
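The general idea behind this kind of synchronization can be sketched in a few lines of Python. This is only an illustration and not how FeedDemon, RSS Bandit or any other reader actually works; the class ReadStateQueue and the send callback are hypothetical names. Read marks are stored in a local file while offline and pushed to the online service later.

```python
import json
from pathlib import Path


class ReadStateQueue:
    """Remembers items marked as read offline until they can be synced."""

    def __init__(self, path):
        self.path = Path(path)
        self.pending = []
        if self.path.exists():
            self.pending = json.loads(self.path.read_text(encoding="utf-8"))

    def mark_read(self, item_id):
        """Record a read mark locally; works without a connection."""
        if item_id not in self.pending:
            self.pending.append(item_id)
            self._save()

    def sync(self, send):
        """Push pending marks with `send`; keep any that fail for next time."""
        remaining = []
        for item_id in self.pending:
            try:
                send(item_id)  # hypothetical call to the online service
            except OSError:
                remaining.append(item_id)  # still offline, retry later
        synced = len(self.pending) - len(remaining)
        self.pending = remaining
        self._save()
        return synced

    def _save(self):
        self.path.write_text(json.dumps(self.pending), encoding="utf-8")
```

Because the pending list is written to disk after every change, the marks survive even if you close the reader before going back online.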
Saving Blogs & Web Pages for Mobile Phones
If you plan to save web pages for offline viewing on a mobile device (with a small screen), I would recommend Web2Book - it not only downloads multiple web pages and blogs in one go but also converts them into formats like HTML or PDF that are supported on almost every mobile device.
Web pages saved with Web2Book can be easily read on ebook devices like the Microsoft Reader or the new Sony Reader. Another option for mobile devices is Plucker - it’s an offline browser available both for Windows Mobile and Palm based PDAs.
If you are an iPod owner (the old models, not the latest iPod touch), you can even turn your MP3 player into a notes reader and read web pages as plain text.
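If you want to prepare such notes yourself, the Python sketch below strips a web page down to plain text and splits it into small chunks. The 4096 byte limit per note is my assumption about the older iPods, so check it for your model; the function names are made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keeps visible text and drops tags, scripts and styles."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of a page, one text fragment per line."""
    extractor = TextExtractor()
    extractor.feed(html)
    return "\n".join(extractor.parts)


def split_into_notes(text, max_bytes=4096):
    """Split text into chunks of at most max_bytes UTF-8 bytes, on word gaps.

    Line breaks are collapsed to spaces, and a single word longer than
    max_bytes is kept whole, so it would exceed the limit.
    """
    notes, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if current and len(candidate.encode("utf-8")) > max_bytes:
            notes.append(current)
            current = word
        else:
            current = candidate
    if current:
        notes.append(current)
    return notes
```

Each chunk returned by split_into_notes can then be written to its own text file and copied to the Notes folder of the iPod.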
Drawloop, an online service that I mentioned in the previous Adobe PDF guide, can also join multiple web pages and save them in a single PDF file, as in this example where the home pages of three news websites are saved in a single file.
Find this article at: http://www.labnol.org/internet/save-webpages-for-offline-reading/6352/
web: http://www.labnol.org/ email: amit@labnol.org


Reader Comments
I’ve been using HTTrack for the past couple of years, however it’s not very good for updating old archives. It messes up on that front.
Written by Supreet on 01.05.09
How about using Webaroo from those guys running smsgupshup? I have been using version 1 of their software for 2 years. Earlier there were web packs allowing users to store even Wikipedia with 6GB space. Anyways, thanks for sharing this extensive list of applications.
Written by techknowl on 01.05.09
I just use the save as webpage option in the browser and use the HTML to read later. Also I use RSSOwl, which is way simpler than FeedDemon and it’s open source.
Cheers
Arun, from kerala :)
Written by Arun Basil Lal on 01.05.09
Also check out Instapaper, which has iPhone & iPod touch versions as well.
Written by Bill on 01.05.09
Don’t forget Opera Mini, which can save your web pages offline. I use a combination of LaterLoop and Opera Mini on my J2ME phone.
Plus sending your notes to Evernote also helps.
And Google Reader with Gears.
Written by Adarsh on 01.05.09
I think Internet Explorer’s Synchronize feature is a better way to save and read offline than the others, from a simplicity point of view.
Written by Vinod Kumar on 01.05.09
Nice post Sir Amit, i was looking for this kind of process but have not found a better one, hope this post will help a lot specially those who are not connected on net regularly.
Written by Hezron on 01.05.09
Also try Offline explorer metaproducts.com/MD.html - it is a real good product.
Written by Vedhas on 01.05.09
I am pretty comfortable with Google Reader and it does have an offline feature that takes the last 1000 feeds offline.
Written by Praveen on 01.05.09
I use UnMHT (http://www.unmht.org/unmht/en_index.html) to download full pages to my desktop as single files for later reading.
Written by Chris Abernethy on 01.05.09
@Supreet - If you are having trouble with HTTrack, try the Wget based alternatives like WinWGet.
@techknowl - I have covered Webaroo before but they closed shop last year and their latest product webaroo 2.0 is nothing but a tool for downloading videos from YouTube.
@Arun - A lot of people put “opensource” as an advantage but for a non-techie, does it make any difference? FeedDemon is developed by Nick Bradbury and I know exactly where to go in case there’s a problem. Plus FeedDemon has a version for mobile phones as well.
@Bill - Thanks for mentioning Instapaper - I guess it is useful for saving up to 10 web pages in offline mode on the iPhone or iPod touch.
@Adarsh - EverNote is a great idea but the only problem is that you need to open the page in order to save it as a note inside EverNote. Scrapbook allows capture of other levels automatically.
@Vedhas - Products like Offline Explorer or Teleport Pro are definitely good but do you really want to spend money when such good free alternatives are available?
@Praveen - Google Reader offline mode is great for blog feeds but not for regular webpages.
@Chris - Thanks for mentioning UnMHT. Good alternative if you want to save all open web tabs in a single MHT file that can even be opened with IE.
Written by Amit on 01.05.09
Scrapbook is the best. I have been using it for a long, long time and must say I am very satisfied with it.
Written by conqrr on 01.05.09
Thanks for writing about Web2Book.
Written by ArpitNext on 01.06.09
Scrapbook is no doubt one of the coolest things that has ever happened to Firefox! I love it. Thanks for an exhaustive list :)
Written by Praval Singh on 01.06.09
I’m using Scrapbook, but when I want to use more than one browser the best option is PDF Download, a must-have for Firefox, Flock and Internet Explorer to save websites as PDF files and read them later.
Written by Oscar Antonio MoralĂ Torres on 01.06.09
Hi Amit,
Thanks for the comprehensive report on this topic. I use HTTrack to save web pages, as it lets me control how the spider should crawl, find pages and save them. For blogs I’d love to know how to save the entire blog as a CHM file, which is the format I prefer the most. Thanks again!
Esh
Written by Esh on 01.06.09
Really great collection! Bookmarked it for further reference!
Oh by the way I wrote about link a while back. Will update it with what you mentioned here!
PS: I find FeedBurner’s subscription through mail pretty simple to use. Oh and link has a simple interface too!
Written by RaSh on 01.06.09
Use Windows Live Mail or even Outlook. Both programs download blogs and treat content as an email message.
I prefer Windows Live Mail, because it is available for free and is great software.
The advantage is obvious for this type of solution. I can copy messages I like. Windows Vista automatically indexes content. I can forward message to my friends.
Written by Sergei on 01.06.09
I often save gmail messages to my MacBook running OSX 10.5.6 (Intel). When I go back to read one of those messages, it opens and a few seconds later disappears behind an appearance of a gmail screen with a list of messages. Is there a way to avoid this and still read the message that I have saved??
Written by Marion Kenworthy on 04.02.09