If you have ever tried converting an Excel sheet or a Word document into an HTML web page, you know how many useless tags and styles get added to the output, making the HTML file bulky and complex.
The situation is similar when you copy-paste rich text from an existing web page into another text editor. The CSS formatting styles that become part of the copied text make no sense once that HTML snippet is used elsewhere.

WordOff removed all the inline styles from the HTML table of a Word document.
To keep your HTML clean and beautiful, Tom Dyson has created an online utility at WordOff.org that takes your dirty HTML and strips out all the junk while preserving the links and basic formatting.
WordOff deletes all empty HTML elements, removes every <span> and <div> tag, and reduces the number of line breaks.
Developers can integrate WordOff into their web applications using the CURL API. The tool will also come in handy for people who use WYSIWYG HTML editors like Dreamweaver, as these too can sometimes generate bloated code.
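To make the cleanup steps concrete, here is a minimal Python sketch of the same kind of processing: dropping inline styles, unwrapping <span> and <div> tags, deleting empty elements, and collapsing repeated line breaks. This is not WordOff's actual code. The function name clean_html and the regular expressions are assumptions made for illustration, and a regex approach like this is only suitable for simple snippets, not arbitrary HTML.

```python
import re

def clean_html(html: str) -> str:
    """Hypothetical cleanup routine, loosely modelled on the steps described above."""
    # 1. Drop inline style and class attributes (quoted values only).
    html = re.sub(r'\s+(?:style|class)\s*=\s*(?:"[^"]*"|\'[^\']*\')', '', html, flags=re.I)

    # 2. Remove every <span> and <div> tag but keep the text inside them.
    html = re.sub(r'</?(?:span|div)\b[^>]*>', '', html, flags=re.I)

    # 3. Delete empty elements such as <p></p>; repeat so that elements
    #    which become empty after an inner removal are also deleted.
    empty = re.compile(r'<([a-z][a-z0-9]*)\b[^>]*>\s*</\1\s*>', re.I)
    previous = None
    while previous != html:
        previous = html
        html = empty.sub('', html)

    # 4. Collapse runs of two or more <br> tags into a single one.
    html = re.sub(r'(?:<br\s*/?>\s*){2,}', '<br>', html, flags=re.I)

    return html.strip()


dirty = (
    '<p class="MsoNormal" style="margin:0in">'
    '<span style="font-size:11pt">Hello <a href="http://example.com">link</a></span></p>'
    '<p class="MsoNormal"><span></span></p>'
    '<br><br><br><div>Bye</div>'
)
print(clean_html(dirty))
```

Links and basic tags such as <p> and <a> pass through untouched, which matches the behaviour described for the online tool. For real-world documents, a proper HTML parser would be a safer choice than regular expressions.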
If you are into web design, do check this set of tools to strip junk from CSS Stylesheets.
Find this article at: http://www.labnol.org/internet/clean-html-files-remove-word-tags/4686/
web: http://www.labnol.org/ email: amit@labnol.org


Reader Comments
WordOff is a great little tool! Word consistently creates mangled/bloated HTML and I have a lot of users who cut-and-paste from Word into rich-text web forms — often with disastrous results.
I’m going to forward the link to the WordOff API to a few of my developer friends. Nice find!
Written by Jeff Hester on 09.26.08
I am a long-time fan of BK Replace Em, which is a lot more powerful than anything else I have come across on Windows, and it’s not limited to just HTML files. Taking the above case, this is how it can be done using BK Replace Em:
Right click on a row of field > Advance Edit. Select Range search and provide a start range, say “” (without quotes), and you can clean up the whole thing in one go. I use one row per type of change (so for td, I use another row). You can do this for 100 files (or a folder full of files) in one go, if you like.
Written by Chetan Kunte on 10.02.08