scattrbrain
Darshan Patil's personal blog
Word to Text Converter
This tool allows you to convert a Microsoft Word document to plain text. It is the only free online tool of its kind that I know of. To convert a file, select it using the entry below, then click Convert. Depending on how big the file is and how busy my server is, this may take a few seconds. Enjoy, and spread the word.
Note that the files you upload are NOT stored on the server. The server processes the files and sends the text back in the HTTP response for the request. Please keep your document sizes small.
Implementation Details
For a project I am working on, I needed to batch convert Word documents to text files. On a Windows box with MS Word installed, this is trivial. On a Linux box it is hard: just viewing a Word document on Linux requires OpenOffice or the like. I thought about using Jakarta POI, but running a Java process to read .doc files on my piddly server/dev box was overkill. I looked online for Linux utilities that could do this and came across a great utility called antiword.
You can download it from its homepage.
antiword word.doc > word.txt
did exactly what I needed.
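The same command can be scripted for a whole directory of files. What follows is only a minimal sketch, not the code behind this tool: the function names `antiword_to_text` and `batch_convert` are made up for illustration, and it assumes antiword is on the PATH and prints UTF-8 text.

```python
import subprocess
from pathlib import Path


def antiword_to_text(doc_path):
    """Run `antiword <file>` and return whatever it prints to stdout."""
    result = subprocess.run(
        ["antiword", str(doc_path)],
        capture_output=True,
        check=True,  # raise CalledProcessError if antiword exits non-zero
    )
    # Assumption: the output is UTF-8. antiword's actual output encoding
    # depends on how it is configured, so undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")


def batch_convert(src_dir, dst_dir, convert=antiword_to_text):
    """Write one .txt file into dst_dir for every .doc file in src_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for doc in sorted(Path(src_dir).glob("*.doc")):
        out = dst / (doc.stem + ".txt")
        out.write_text(convert(doc), encoding="utf-8")
        written.append(out)
    return written
```

Calling `batch_convert("docs", "text")` would leave `docs/report.doc` untouched and write `text/report.txt`. The `convert` parameter exists only so the loop can be tried out without antiword installed.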
My project requires users to be able to upload their files, and the backend then converts the Word documents to text for further processing. Once I had coded this, I realized it would be mighty useful to put the .doc to text converter online.
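The upload-then-convert flow could look roughly like the sketch below. This is an illustration under assumptions, not the actual backend: `run_antiword`, `convert_upload` and `ConvertHandler` are hypothetical names, the handler expects the raw .doc bytes as the POST body rather than a browser form upload, and antiword is assumed to be on the PATH and to print UTF-8 text. The upload is written to a temporary file only for the duration of the conversion and is deleted afterwards, matching the note that files are not stored.

```python
import os
import subprocess
import tempfile
from http.server import BaseHTTPRequestHandler


def run_antiword(path):
    """Return antiword's stdout for the file at `path` as text."""
    result = subprocess.run(["antiword", path], capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")


def convert_upload(data, convert=run_antiword):
    """Convert uploaded .doc bytes to text without keeping the file around."""
    # antiword is invoked with a file path here, so the upload is
    # written to a temporary file first.
    fd, path = tempfile.mkstemp(suffix=".doc")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        return convert(path)
    finally:
        os.remove(path)  # delete the upload whether or not conversion worked


class ConvertHandler(BaseHTTPRequestHandler):
    """Assumed protocol: POST the raw .doc bytes, get plain text back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        data = self.rfile.read(length)
        try:
            body = convert_upload(data).encode("utf-8")
            status = 200
        except (subprocess.CalledProcessError, OSError):
            body = b"conversion failed\n"
            status = 500
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, the handler can be served with `http.server.HTTPServer(("127.0.0.1", 8000), ConvertHandler).serve_forever()`. A real deployment would also want a cap on the upload size, in line with the request to keep documents small.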
So lo and behold my antiword powered MS Word to text converter.