How to Easily Extract Plain Text from an HTML Website in Java

I was looking for a way to crawl websites and extract only their text. My goal was to collect text from various websites to build a text corpus for natural language processing of the Nepali language. There are several solutions on the internet, but none that I found was as simple as this one, which uses the jsoup library. The example below extracts the text of the entire body, but you can just as easily extract the text of a specific node (and its children).

/**
 * 
 * @author Kushal Paudyal
 * Created on: 3/9/2017
 * Last Modified on: 3/9/2017
 *
 */
package com.icodejava.research.nlp;

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

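public class HtmlTextExtractor {

	public static void main(String[] args) throws IOException {
		// Fetch the page over HTTP and parse it into a DOM tree.
		Document doc = Jsoup.connect("http://swasthyakhabar.com/news-details/3356/2017-03-09").get();

		// text() returns the combined, whitespace-normalized text of <body> and all of its children.
		System.out.println(doc.body().text());
	}

}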
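
To extract the text of a specific node instead of the whole body, a CSS selector can be used to pick the node first. The following is only a minimal sketch: the class name NodeTextExtractor, the helper extractText, and the sample markup with its "menu" and "article" ids are made up for illustration, and it assumes a reasonably recent jsoup release that provides selectFirst. It parses an in-memory string so it can be tried without a network connection.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class NodeTextExtractor {

	/**
	 * Returns the text of the first element matching the CSS selector,
	 * including the text of its children, or an empty string if nothing matches.
	 */
	public static String extractText(String html, String cssSelector) {
		Document doc = Jsoup.parse(html);
		// selectFirst returns null when no element matches the selector.
		Element node = doc.selectFirst(cssSelector);
		return node == null ? "" : node.text();
	}

	public static void main(String[] args) {
		// Hypothetical markup; a real page will use different ids and classes.
		String html = "<html><body>"
				+ "<div id=\"menu\">Home | About</div>"
				+ "<div id=\"article\"><h1>Title</h1><p>First <b>paragraph</b>.</p></div>"
				+ "</body></html>";

		// Prints only the text inside the "article" node; the menu text is skipped.
		System.out.println(extractText(html, "#article"));
	}
}
```

For a live page, the same selector call should work on the Document returned by Jsoup.connect(...).get(). The right selector depends on the markup of the site being crawled, so it is worth inspecting the page source first.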