Java Program to Remove Duplicate Words from String

In this blog post, we will write a Java program that removes duplicate words from a given string, keeping the first occurrence of each word. The program splits the string into words with a regular expression and uses a HashSet to detect words it has already seen. Let's dive into the code and see how it works!

Java Program to Remove Duplicate Words from String

import java.util.HashSet;
import java.util.Set;

public class DuplicateWordRemover {
    public static void main(String[] args) {
        String inputString = "Java is a programming language and Java is widely used in the software industry.";

        // Removing duplicate words
        String result = removeDuplicateWords(inputString);

        System.out.println("String after removing duplicate words: " + result);
    }

    private static String removeDuplicateWords(String inputString) {
        // Splitting the string into words
        String[] words = inputString.split("\\s+");

        // Creating a set to store unique words
        Set<String> uniqueWords = new HashSet<>();

        // Removing duplicate words
        StringBuilder resultBuilder = new StringBuilder();
        for (String word : words) {
            if (uniqueWords.add(word)) {
                resultBuilder.append(word).append(" ");
            }
        }

        // Converting the StringBuilder to a string
        String result = resultBuilder.toString().trim();

        return result;
    }
}

Output:

String after removing duplicate words: Java is a programming language and widely used in the software industry.

Explanation: 

1. The program starts by initializing the inputString variable with the desired text. 

2. The removeDuplicateWords() method takes in the input string as a parameter and returns a new string with duplicate words removed. 

3. Inside the removeDuplicateWords() method, the string is split into words using the split() method, which splits the string on runs of one or more whitespace characters (the regular expression \s+, written as "\\s+" in a Java string literal).

4. A HashSet called uniqueWords is created to store unique words. 

5. The program iterates through each word in the words array. For each word, it attempts to add it to the uniqueWords set using the add() method. The add() method returns true only if the word was not already in the set, so a word is appended to the resultBuilder, followed by a space, the first time it appears and skipped on every later occurrence.

6. Finally, the resultBuilder is converted to a string, and the trailing space left by the last append is removed using trim().

7. The resulting string with duplicate words removed is returned from the removeDuplicateWords() method. 

8. The program prints the updated string to the console. 
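
To make steps 3 and 5 concrete, here is a minimal sketch of how split("\\s+") and Set.add() behave on their own. The class name SplitAndAddDemo, the helper splitWords(), and the sample text are made up for illustration and are not part of the program above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative class name; not part of the original program.
class SplitAndAddDemo {
    // Splits on runs of one or more whitespace characters, as in step 3.
    static String[] splitWords(String text) {
        return text.split("\\s+");
    }

    public static void main(String[] args) {
        // Two spaces, a tab, and a newline each act as a single separator.
        String[] words = splitWords("Java  is\tfun\nJava");
        System.out.println(Arrays.toString(words)); // prints [Java, is, fun, Java]

        Set<String> seen = new HashSet<>();
        for (String word : words) {
            // add() returns true only the first time a given word is inserted.
            System.out.println(word + " -> " + seen.add(word));
        }
        // prints Java -> true, is -> true, fun -> true, Java -> false
    }
}
```

The boolean returned by add() is what lets the program test for a duplicate and record the word in a single call, without a separate contains() check.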

Feel free to modify the inputString variable to test the program with different strings. 
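
One thing you may notice when trying other inputs: the comparison is exact, so "Java" and "java", or "industry" and "industry.", count as different words. If you would rather treat them as the same word, one possible approach is to compare a normalized key while still printing the original word. The sketch below is a variant under that assumption, not the program above; the class name NormalizedDuplicateWordRemover and the normalize() helper are hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical variant: ignores case and punctuation when comparing words.
class NormalizedDuplicateWordRemover {
    // Hypothetical helper: builds the comparison key by lower-casing the word
    // and removing every character that is not a letter or a digit.
    static String normalize(String word) {
        return word.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{N}]", "");
    }

    static String removeDuplicateWords(String inputString) {
        Set<String> seenKeys = new HashSet<>();
        StringBuilder resultBuilder = new StringBuilder();
        for (String word : inputString.trim().split("\\s+")) {
            // The key decides uniqueness; the original spelling is what gets kept.
            if (seenKeys.add(normalize(word))) {
                if (resultBuilder.length() > 0) {
                    resultBuilder.append(' ');
                }
                resultBuilder.append(word);
            }
        }
        return resultBuilder.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicateWords("Java is fun, and java is FUN."));
        // prints: Java is fun, and
    }
}
```

Note the trade-off: the first spelling wins, so punctuation attached to a later duplicate (the final period here) is dropped along with the word.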

Java 8 Program to Remove Duplicate Words from String

Let's write the same program using the Java 8 Stream API:

import java.util.Arrays;
import java.util.stream.Collectors;

public class DuplicateWordRemover {
    public static void main(String[] args) {
        String inputString = "Java is a programming language and Java is widely used in the software industry.";

        // Removing duplicate words using Java 8
        String result = removeDuplicateWords(inputString);

        System.out.println("String after removing duplicate words: " + result);
    }

    private static String removeDuplicateWords(String inputString) {
        String[] words = inputString.split("\\s+");

        // Using Java 8 stream and distinct to remove duplicates
        String result = Arrays.stream(words)
                .distinct()
                .collect(Collectors.joining(" "));

        return result;
    }
}

Output:

String after removing duplicate words: Java is a programming language and widely used in the software industry.

Explanation: 

1. The removeDuplicateWords() method takes in the input string as a parameter and returns a new string with duplicate words removed. 

2. Inside the removeDuplicateWords() method, the string is split into words using the split() method, which splits the string on runs of one or more whitespace characters (the regular expression \s+, written as "\\s+" in a Java string literal).

3. Using Java 8's Stream API, we create a stream from the array of words using Arrays.stream(words).

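4. We apply the distinct() operation on the stream to remove duplicates. Because a stream created from an array is ordered, distinct() keeps the first occurrence of each word and preserves the original word order.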

5. Finally, we collect the distinct words using the Collectors.joining(" ") operation, which joins the words back into a single string separated by spaces. 

6. The resulting string with duplicate words removed is returned from the removeDuplicateWords() method. 

7. The program prints the updated string to the console.
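
One edge case is worth knowing about. If the input starts with whitespace, split("\\s+") produces an empty first token, and Collectors.joining(" ") then puts a leading space in the result (the first program hides this because it calls trim() at the end). A minimal sketch of one way to guard against it, assuming you want empty tokens dropped; the class name StreamDuplicateWordRemover and the sample input are made up for illustration:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Illustrative variant of the stream-based method with an extra filter step.
class StreamDuplicateWordRemover {
    static String removeDuplicateWords(String inputString) {
        return Arrays.stream(inputString.split("\\s+"))
                // Drop the empty token that split() yields for leading whitespace.
                .filter(word -> !word.isEmpty())
                // Keep only the first occurrence of each word, in encounter order.
                .distinct()
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Brackets make any stray leading or trailing space visible.
        System.out.println("[" + removeDuplicateWords("  to be or not to be") + "]");
        // prints: [to be or not]
    }
}
```

Calling inputString.trim() before split() would handle the same case; the filter is used here only to keep everything inside the stream pipeline.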

Conclusion

Congratulations! You have learned how to write a Java program to remove duplicate words from a given string. By utilizing string manipulation techniques, regular expressions, and the HashSet data structure, we efficiently identify and remove duplicate words. We also wrote the same program using Java 8.

Feel free to incorporate this code into your Java projects or customize it to suit your specific requirements. Happy coding!
