Java Program to Count Number of Words in String

Introduction

Counting the number of words in a string is a common task in text processing. There are multiple ways to accomplish this in Java, each with its own advantages and use cases. In this blog post, we'll explore different methods to count the number of words in a given string.

Table of Contents

  1. Using split() Method
  2. Using StringTokenizer
  3. Using Regular Expressions
  4. Using Apache Commons Lang Library
  5. Complete Example Program
  6. Conclusion

1. Using split() Method

The split() method in Java is one of the simplest ways to count the number of words in a string. This method splits the string based on a given regular expression and returns an array of substrings.

Example:

public class WordCountUsingSplit {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun.";
        String[] words = input.split("\\s+");
        System.out.println("Number of words using split(): " + words.length);
    }
}

Explanation:

  • \\s+ is the Java string literal for the regular expression \s+, which matches one or more whitespace characters.
  • The split() method splits the string into an array of words based on the given regular expression.

Output:

Number of words using split(): 7
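
One caveat with this approach: if the input starts with whitespace, split("\\s+") returns an empty string as the first element, and an empty input yields a one-element array, so words.length can overcount by one. The sketch below shows one possible way to guard against that. The class name SafeSplitWordCount and the helper countWords are hypothetical names introduced only for this illustration, and treating null as zero words is an assumption.

```java
class SafeSplitWordCount {
    // Hypothetical helper (not part of the original example): counts
    // whitespace-separated words, treating null or blank input as zero words.
    static int countWords(String input) {
        if (input == null) {
            return 0;
        }
        // Remove leading and trailing whitespace so split() does not
        // produce an empty first element.
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Java is great and Java is fun.")); // prints 7
        System.out.println(countWords("   Java   is fun  "));             // prints 3
        System.out.println(countWords(""));                               // prints 0
    }
}
```

Without the trim() call, the second input would be reported as 4 words, because the leading spaces produce an empty first element.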

2. Using StringTokenizer

The StringTokenizer class is a legacy class that breaks a string into tokens. It is simple to use and does not require regular expressions, but it is retained mainly for compatibility, and its Javadoc discourages its use in new code.

Example:

import java.util.StringTokenizer;

public class WordCountUsingStringTokenizer {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun.";
        StringTokenizer tokenizer = new StringTokenizer(input);
        System.out.println("Number of words using StringTokenizer: " + tokenizer.countTokens());
    }
}

Explanation:

  • With the single-argument constructor, StringTokenizer splits the string on its default delimiters: space, tab, newline, carriage return, and form feed.
  • The countTokens() method returns the number of tokens.

Output:

Number of words using StringTokenizer: 7
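
StringTokenizer also has a two-argument constructor that accepts a custom set of delimiter characters. The sketch below uses it to treat a few punctuation marks as separators in addition to whitespace. The class name, the helper countWords, and the particular delimiter set are assumptions chosen for illustration, not part of the original example.

```java
import java.util.StringTokenizer;

class TokenizerDelimiterWordCount {
    // Assumed delimiter set for this sketch: the default whitespace
    // delimiters plus some common punctuation marks.
    private static final String DELIMITERS = " \t\n\r\f.,;:!?";

    // Hypothetical helper: counts tokens separated by any of DELIMITERS.
    static int countWords(String input) {
        if (input == null) {
            return 0; // the StringTokenizer constructor rejects null input
        }
        return new StringTokenizer(input, DELIMITERS).countTokens();
    }

    public static void main(String[] args) {
        String input = "Java is great,and Java is fun.";
        // Default delimiters keep "great,and" together as one token.
        System.out.println(new StringTokenizer(input).countTokens()); // prints 6
        // The custom delimiters split on the comma as well.
        System.out.println(countWords(input));                        // prints 7
    }
}
```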

3. Using Regular Expressions

You can use regular expressions with the Pattern and Matcher classes to count the number of words in a string.

Example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordCountUsingRegex {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun.";
        Pattern pattern = Pattern.compile("\\b\\w+\\b");
        Matcher matcher = pattern.matcher(input);

        int count = 0;
        while (matcher.find()) {
            count++;
        }
        System.out.println("Number of words using regex: " + count);
    }
}

Explanation:

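  • \\b\\w+\\b matches runs of word characters (by default ASCII letters, digits, and underscores) between word boundaries, so punctuation such as the trailing period is not counted.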
  • The Matcher class is used to find matches of the pattern in the string.

Output:

Number of words using regex: 7
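
Because an apostrophe is not a word character, the pattern above counts a contraction such as "Don't" as two words. If that matters for your text, a different pattern can be used. The following is a minimal sketch under an assumed definition of a word: runs of letters or digits, optionally joined by straight apostrophes. The class name, the helper countWords, and the pattern itself are illustrative choices, and unlike \w this pattern does not treat underscores as part of a word.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ApostropheAwareWordCount {
    // Assumed word definition for this sketch: one or more letters or digits,
    // optionally followed by apostrophe-joined groups (as in "don't").
    private static final Pattern WORD =
            Pattern.compile("[\\p{L}\\p{N}]+(?:'[\\p{L}\\p{N}]+)*");

    // Hypothetical helper: counts non-overlapping matches of WORD.
    static int countWords(String input) {
        if (input == null) {
            return 0;
        }
        Matcher matcher = WORD.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Matches: Don't, stop, it's, fun
        System.out.println(countWords("Don't stop, it's fun."));          // prints 4
        System.out.println(countWords("Java is great and Java is fun.")); // prints 7
    }
}
```

For comparison, \\b\\w+\\b finds six matches in the first input (Don, t, stop, it, s, fun).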

4. Using Apache Commons Lang Library

The Apache Commons Lang library provides a utility class, StringUtils, whose split() method can be used to count the number of words in a string.

Maven Dependency:

Add the following dependency to your pom.xml file:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.12.0</version>
</dependency>

Example:

import org.apache.commons.lang3.StringUtils;

public class WordCountUsingStringUtils {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun.";
        int count = StringUtils.split(input, ' ').length;
        System.out.println("Number of words using StringUtils: " + count);
    }
}

Explanation:

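  • StringUtils.split() splits the string on the given delimiter character and returns an array of substrings. Adjacent delimiters are treated as a single delimiter, so repeated spaces do not produce empty elements.
  • Because only the space character is passed as the delimiter here, tabs and newlines do not separate words in this example.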

Output:

Number of words using StringUtils: 7

5. Complete Example Program

Here is a complete program that demonstrates all the methods discussed above to count the number of words in a string.

Example Code:

import java.util.StringTokenizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang3.StringUtils;

public class WordCountExample {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun.";

        // Using split() method
        String[] wordsUsingSplit = input.split("\\s+");
        System.out.println("Number of words using split(): " + wordsUsingSplit.length);

        // Using StringTokenizer
        StringTokenizer tokenizer = new StringTokenizer(input);
        System.out.println("Number of words using StringTokenizer: " + tokenizer.countTokens());

        // Using Regular Expressions
        Pattern pattern = Pattern.compile("\\b\\w+\\b");
        Matcher matcher = pattern.matcher(input);
        int countUsingRegex = 0;
        while (matcher.find()) {
            countUsingRegex++;
        }
        System.out.println("Number of words using regex: " + countUsingRegex);

        // Using Apache Commons Lang StringUtils
        int countUsingStringUtils = StringUtils.split(input, ' ').length;
        System.out.println("Number of words using StringUtils: " + countUsingStringUtils);
    }
}

Output:

Number of words using split(): 7
Number of words using StringTokenizer: 7
Number of words using regex: 7
Number of words using StringUtils: 7

6. Conclusion

Counting the number of words in a string can be accomplished in multiple ways in Java. The split() method is the most concise, StringTokenizer avoids regular expressions, Pattern and Matcher give the most control over what counts as a word, and StringUtils.split() is convenient when Apache Commons Lang is already a project dependency. Understanding these differences helps you choose the most appropriate method for your specific use case.

Happy coding!
