How To Read DOC file Using Java and Apache POI

One of my blog's visitors asked me to write about how to read a Word document file using Java. I wrote the following program to demonstrate how Apache POI can be used for this purpose.

I used the following library to write this program. If you have downloaded Apache POI, you should find this jar file within the bundle. The POIFSFileSystem and HPSF classes used below live in the main POI jar from the same bundle, so that jar needs to be on the classpath as well.

  • poi-scratchpad-3.2-FINAL-20081019.jar

The tutorial demonstrates the following features:

–How to read a simple Microsoft Word document (.doc) using Java and Apache POI (.docx is not supported)
–This includes reading the total number of paragraphs and the content of each paragraph
–How to read the document headers
–How to read the document footers
–How to read the document summary

Apache POI's Word support is not robust yet and has a long way to go before it can handle complex document formats. I have also noticed that classes move from one package to another between versions. If you are using an older or newer version of POI and get compilation errors on the imports, try looking for the classes in other packages.
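
If you are unsure which package a class lives in for the POI version on your classpath, a small lookup helper can save some guessing. The following is only a sketch: the class name ClassLocator and the list of candidate packages are my own illustration, not part of POI, and you should adjust the candidates to your version.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ClassLocator {

    /**
     * Tries each candidate package in order and returns the first fully
     * qualified name under which the class can be loaded.
     */
    static Optional<String> locate(String simpleName, List<String> candidatePackages) {
        for (String pkg : candidatePackages) {
            String fullName = pkg + "." + simpleName;
            try {
                // initialize=false: only check that the class exists,
                // do not run its static initializers.
                Class.forName(fullName, false, ClassLocator.class.getClassLoader());
                return Optional.of(fullName);
            } catch (ClassNotFoundException | LinkageError e) {
                // Not loadable from this package on the current classpath; try the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Candidate packages are guesses for illustration only.
        List<String> candidates = Arrays.asList(
                "org.apache.poi.hwpf.usermodel",
                "org.apache.poi.hwpf.model",
                "org.apache.poi.hwpf.extractor");
        System.out.println("HeaderStories: "
                + locate("HeaderStories", candidates).orElse("not found"));
        System.out.println("WordExtractor: "
                + locate("WordExtractor", candidates).orElse("not found"));
    }
}
```

Run it with the POI jars on the classpath; without them, both lookups simply report "not found".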

/**
 * @author Kushal Paudyal
 * www.sanjaal.com/java
 * Last Modified On: 03/23/2009
 */
package com.kushal.utils;

import org.apache.poi.poifs.filesystem.*;
import org.apache.poi.hpsf.DocumentSummaryInformation;
import org.apache.poi.hwpf.*;
import org.apache.poi.hwpf.extractor.*;
import org.apache.poi.hwpf.usermodel.HeaderStories;

import java.io.*;

public class ReadDocFileFromJava {

	public static void main(String[] args) {
		/** This is the document that you want to read using Java.
		 *  Backslashes in a Java string literal must be escaped. **/
		String fileName = "C:\\Documents and Settings\\kushalp\\Desktop\\Test.doc";

		/** Method call to read the document (demonstrates some usage of POI) **/
		readMyDocument(fileName);

	}
	public static void readMyDocument(String fileName){
		/** try-with-resources closes the input stream even if reading fails **/
		try (InputStream in = new FileInputStream(fileName)) {
			POIFSFileSystem fs = new POIFSFileSystem(in);
			HWPFDocument doc = new HWPFDocument(fs);

			/** Read the content **/
			readParagraphs(doc);

			int pageNumber=1;

			/** We will try reading the header for page 1**/
			readHeader(doc, pageNumber);

			/** Let's try reading the footer for page 1**/
			readFooter(doc, pageNumber);

			/** Read the document summary**/
			readDocumentSummary(doc);

		} catch (Exception e) {
			e.printStackTrace();
		}
	}	

	public static void readParagraphs(HWPFDocument doc) throws Exception{
		WordExtractor we = new WordExtractor(doc);

		/**Get the total number of paragraphs**/
		String[] paragraphs = we.getParagraphText();
		System.out.println("Total Paragraphs: "+paragraphs.length);

		for (int i = 0; i < paragraphs.length; i++) {

			System.out.println("Length of paragraph "+(i +1)+": "+ paragraphs[i].length());
			System.out.println(paragraphs[i]);

		}

	}

	public static void readHeader(HWPFDocument doc, int pageNumber){
		HeaderStories headerStore = new HeaderStories( doc);
		String header = headerStore.getHeader(pageNumber);
		System.out.println("Header Is: "+header);

	}

	public static void readFooter(HWPFDocument doc, int pageNumber){
		HeaderStories headerStore = new HeaderStories( doc);
		String footer = headerStore.getFooter(pageNumber);
		System.out.println("Footer Is: "+footer);

	}

	public static void readDocumentSummary(HWPFDocument doc) {
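		DocumentSummaryInformation summaryInfo=doc.getDocumentSummaryInformation();
		/** Some documents carry no summary stream; guard against a null result **/
		if (summaryInfo == null) {
			System.out.println("No document summary information found.");
			return;
		}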
		String category = summaryInfo.getCategory();
		String company = summaryInfo.getCompany();
		int lineCount=summaryInfo.getLineCount();
		int sectionCount=summaryInfo.getSectionCount();
		int slideCount=summaryInfo.getSlideCount();

		System.out.println("---------------------------");
		System.out.println("Category: "+category);
		System.out.println("Company: "+company);
		System.out.println("Line Count: "+lineCount);
		System.out.println("Section Count: "+sectionCount);
		System.out.println("Slide Count: "+slideCount);

	}

}
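
Since the program above only handles the binary .doc format and not .docx, it can help to check what kind of file you have before handing it to POI. A binary .doc file is stored in an OLE2 compound container, while a .docx file is a ZIP archive, and the two start with different signature bytes. The sketch below uses only the JDK; the class and method names are my own. Note that it identifies the container, not the application: .xls and .ppt files are also OLE2, and any ZIP archive will match the ZIP signature.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class WordContainerSniffer {

    enum Container { OLE2, ZIP, UNKNOWN }

    // Signature of an OLE2 compound file (container of binary .doc files).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // Signature of a ZIP archive (container of .docx files).
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Classifies a file by its leading bytes. */
    static Container sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Container.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Container.ZIP;
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            // Mask to compare as unsigned values; Java bytes are signed.
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    /** Reads up to 8 bytes from the start of the stream. */
    static byte[] readHeader(InputStream in) throws IOException {
        byte[] buffer = new byte[8];
        int total = 0;
        while (total < buffer.length) {
            int n = in.read(buffer, total, buffer.length - total);
            if (n < 0) break;
            total += n;
        }
        byte[] header = new byte[total];
        System.arraycopy(buffer, 0, header, 0, total);
        return header;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java WordContainerSniffer <file>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(args[0] + " looks like: " + sniff(readHeader(in)));
        }
    }
}
```

If the result is ZIP, the file is most likely a .docx and the HWPF classes used in this post will not be able to read it.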

What is the difference between java and javaw?

Every day is a new learning opportunity. I have been a programmer for more than 5 years now, but I have realized that people sometimes ignore small, handy features, only to regret later that they did not learn them years ago. Here is what I learned today about java and javaw.

The javaw command is identical to java, except that with javaw there is no associated console window. Use javaw when you don’t want a command prompt window to appear. The javaw launcher will, however, display a dialog box with error information if a launch fails for some reason.
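
Because a program started with javaw has no console window, anything it prints to System.out is not visible anywhere. One way to cope is to detect the missing console and send output to a log file instead. The following is a minimal sketch under a few assumptions: System.console() typically returns null when no interactive console is attached (which is what I would expect under javaw, though it is also null when the streams are redirected, and newer JDK versions may behave differently), and "app.log" is just a hypothetical file name.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

class ConsoleAwareOutput {

    /** Decides where output should go: the console if there is one, else the log file. */
    static String chooseTarget(boolean consoleAttached, String logFile) {
        return consoleAttached ? "console" : logFile;
    }

    public static void main(String[] args) throws IOException {
        // Typically null when the JVM has no interactive console (assumption, see above).
        boolean consoleAttached = System.console() != null;

        // "app.log" is a hypothetical file name used for illustration.
        String target = chooseTarget(consoleAttached, "app.log");

        if (!"console".equals(target)) {
            // Append to the log file and flush on every println,
            // so messages are not lost if the process is killed.
            System.setOut(new PrintStream(new FileOutputStream(target, true), true));
        }
        System.out.println("Started; output target = " + target);
    }
}
```

Started with java from a command prompt, this prints to the console; started with javaw, I would expect the message to end up in app.log.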

Here is some background on why this was useful to me: I was working on a machine shared with my co-workers, and I used to leave one of my tools running in a console window. The tool was supposed to stay alive all the time so that it could poll the logged-in users, but other users (and sometimes I myself) would unknowingly kill it by closing the console. Once I learned about ‘javaw’, I no longer had to worry about anyone closing a console window, because the program ran in the background without one.

You could also run your program as a service, but I am not going to go into that in this post.

Complete Tutorial On Using SOAP-UI to Mock Web Service Request / Response

I found the SOAP-UI tool while looking for a way to temporarily mock web services while my team and I waited for the real web service to be ready for integration testing with our front-end application. SoapUI is an open source web service testing application for service-oriented architectures (SOA). Its functionality covers web service inspection, invoking, development, simulation and mocking, functional testing, and load and compliance testing.

This tutorial covers the basics of using this tool to create a mock request/response from a sample WSDL (Web Services Description Language) file. Although the tool offers advanced options and features, the scope of this article is to enable readers to download, install and run a mock service using a simple WSDL file.

Here is the WSDL file that I will be using in this tutorial.

<definitions name="SampleService" targetNamespace="http://www.sanjaal.com/wsdl/SampleService.wsdl" 
	xmlns="http://schemas.xmlsoap.org/wsdl/" 
	xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" 
	xmlns:tns="http://www.sanjaal.com/wsdl/SampleService.wsdl" 
	xmlns:xsd="http://www.w3.org/2001/XMLSchema">

	<message name="SampleRequest">
		<part name="firstName" type="xsd:string"/>
	</message>
	<message name="SampleResponse">
		<part name="greeting" type="xsd:string"/>
	</message>
	<portType name="Sample_PortType">
		<operation name="sampleOperation">
			<input message="tns:SampleRequest"/>
			<output message="tns:SampleResponse"/>
		</operation>
	</portType>
	<binding name="SampleBinding" type="tns:Sample_PortType">
		<soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
		<operation name="sampleOperation">
			<soap:operation soapAction="sampleOperation"/>
			<input>
				<soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" 
					namespace="urn:sanjaal:sample-service" use="encoded"/>
			</input>
			<output>
				<soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" 
					namespace="urn:sanjaal:sample-service" use="encoded"/>
			</output>
		</operation>
	</binding>
	<service name="Sample_Service">
		<documentation>WSDL File for SampleService</documentation>
		<port binding="tns:SampleBinding" name="Sample_Port">
			<soap:address location="http://www.sanjaal.com/sample-service"/>
		</port>
	</service>
</definitions>
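
A typo in a binding reference is easy to miss in a hand-written WSDL, because each port refers to its binding only by name. Before loading a file into a tool, you can run a quick consistency check with the XML parser that ships with the JDK. The sketch below is my own illustration (the class name WsdlBindingCheck is hypothetical); it only verifies that every port's binding attribute matches a declared binding name and is not a full WSDL validator.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class WsdlBindingCheck {

    private static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    /** Returns the binding references on port elements that match no declared binding. */
    static List<String> unresolvedBindings(String wsdlXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to tell wsdl:binding from soap:binding
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(wsdlXml.getBytes(StandardCharsets.UTF_8)));

            // Collect the names of all declared WSDL bindings.
            Set<String> declared = new HashSet<>();
            NodeList bindings = doc.getElementsByTagNameNS(WSDL_NS, "binding");
            for (int i = 0; i < bindings.getLength(); i++) {
                declared.add(((Element) bindings.item(i)).getAttribute("name"));
            }

            // Check every port's binding reference against the declared names.
            List<String> unresolved = new ArrayList<>();
            NodeList ports = doc.getElementsByTagNameNS(WSDL_NS, "port");
            for (int i = 0; i < ports.getLength(); i++) {
                String ref = ((Element) ports.item(i)).getAttribute("binding");
                String localName = ref.substring(ref.indexOf(':') + 1); // drop a prefix such as "tns:"
                if (!declared.contains(localName)) {
                    unresolved.add(ref);
                }
            }
            return unresolved;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse WSDL", e);
        }
    }

    public static void main(String[] args) {
        // A trimmed-down WSDL with a deliberately mismatched binding name.
        String wsdl = "<definitions xmlns=\"http://schemas.xmlsoap.org/wsdl/\">"
                + "<binding name=\"DemoBinding\"/>"
                + "<service name=\"DemoService\">"
                + "<port name=\"DemoPort\" binding=\"tns:Demo_Binding\"/>"
                + "</service></definitions>";
        System.out.println("Unresolved binding references: " + unresolvedBindings(wsdl));
    }
}
```

An empty list means every port points at a binding that exists in the file.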


Download SOAP-UI:

The SOAP-UI tool is free and open source, and can be downloaded from www.soapui.org. I prefer to download it in zip format, which you can simply unzip to any location of your choice and start working with right away. There are also binary distributions with installers. No matter which option you use, there will be a folder where SOAP-UI is installed or unzipped.

As you can see in the screenshot below, I unzipped the SOAP-UI tool to the C:\soapui-4.5.1 folder.

To start the tool, navigate to the bin folder and double-click the soapui.bat file. You can also run this bat file from a command prompt.

SOAP-UI Webservice Mock Tutorial

Once the bat file runs, you will see the SOAP-UI window with an empty workspace.

To start a new project, click on File > New soapUI Project or just hit Ctrl + N.

You will be prompted with a New soapUI Project dialog box where you can provide a name for your project. You can also browse for your WSDL file from this dialog.

Go ahead and browse for your WSDL file. If you don’t have one, you can copy the sample WSDL provided at the beginning of this tutorial and save it to a file with a .wsdl extension, such as sample.wsdl.

Once the WSDL file is loaded, you will see something similar to the following. The view depends on how many operations you have defined in the WSDL file, what names you have chosen etc. For each operation, you will see the Requests created by default.

To mock a service for this request, right-click on the binding (its name depends on what you chose in your WSDL) and click on Generate Mock Service.

You will be prompted with a Generate MockService dialog where you can choose the operations and the path (which will be part of your endpoint definition), and you can even tell the tool what port to use for this service. Make sure to choose a port that is not already taken.
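
If you are not sure whether a port is already taken, a few lines of Java can tell you by trying to bind a server socket to it. This is only a quick sketch (the class name PortCheck is my own); 8088 is used as the default simply because that is the port my mock service ends up on later in this tutorial. Keep in mind the answer is only valid at the moment of the check, since another program could grab the port right afterwards.

```java
import java.io.IOException;
import java.net.ServerSocket;

class PortCheck {

    /** Returns true if a server socket could be bound to the port right now. */
    static boolean isFree(int port) {
        // try-with-resources releases the port again as soon as the check is done.
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Binding failed, most likely because another process is listening there.
            return false;
        }
    }

    public static void main(String[] args) {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8088;
        System.out.println("Port " + port + (isFree(port) ? " is free" : " is already in use"));
    }
}
```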

Once you hit OK on the above dialog, it will ask what you want to name this web service mock. Give it a name; it can be anything.

You will then see in the navigator that a MockService has been created with the name you provided in the previous step. When you double-click on the operation name of your choice, a window opens on the right-hand side where you have the option to run the MockService (the small green icon). Click on that icon to start the MockService.

The following window shows that the MockService is running (the red button stops the service). It also displays the port the service is running on.

Double-click on the request in the left-hand navigation. If you want to change the endpoint, you can edit the URL on this screen. Once you have selected the right endpoint, click on the small green icon in the request window.

If the mock service is running correctly, you will see the response in the right-hand window.

Once the service is running, you can also connect to it from your browser. My endpoint, which is my web service URL, is defined at http://localhost:8088/sampleService.

Just type in the URL without the service name (in my case, just http://localhost:8088). You will see a page listing all the services that are running.

If you click on the service of your choice, you will be shown the WSDL.
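
The whole point of the mock is that your own application can call it while the real service is not ready. Here is a rough sketch of a plain-JDK client that posts a SOAP request to the mock. The endpoint, operation name, namespace and SOAPAction value come from this tutorial; the class name MockServiceClient and the hand-built envelope are my own simplification, and I am assuming the mock does not enforce strict validation of the RPC/encoded message. By default the program only prints the envelope; pass the endpoint URL as an argument to actually send it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class MockServiceClient {

    /** Builds an RPC-style SOAP 1.1 request for sampleOperation. */
    static String buildEnvelope(String firstName) {
        return "<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\""
                + " xmlns:urn=\"urn:sanjaal:sample-service\">"
                + "<soapenv:Body><urn:sampleOperation>"
                + "<firstName>" + escapeXml(firstName) + "</firstName>"
                + "</urn:sampleOperation></soapenv:Body></soapenv:Envelope>";
    }

    /** Escapes the characters that would break element content. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Posts the envelope to the endpoint and returns the raw response body. */
    static String call(String endpoint, String envelope) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
        conn.setRequestProperty("SOAPAction", "\"sampleOperation\"");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(envelope.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String envelope = buildEnvelope("Kushal");
        System.out.println("Request:\n" + envelope);
        if (args.length > 0) {
            // Example argument: http://localhost:8088/sampleService
            System.out.println("Response:\n" + call(args[0], envelope));
        }
    }
}
```

With the MockService running, the response printed should be the same canned response you saw in the SOAP-UI request window.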

You have now learned how to mock a web service using SOAP-UI.