How Do You Find the Longest Common Subsequence of Two Strings in Java?

A string is simply a sequence of characters, and working with strings is a routine part of a programmer's work.

Questions related to strings are often asked in interviews. Whether it is about finding the next permutation or finding the longest common subsequence, you will find one or more questions related to strings in an interview.

Here, we are going to discuss one common problem: printing the longest common subsequence (LCS) of two strings.

Note that finding the length of the longest common subsequence and printing the subsequence itself are two different tasks.

The Problem Statement

You are given two strings and have to determine the longest subsequence that is common to both of them. The characters of the common subsequence must appear in the same relative order in both strings, but they do not have to be next to each other.

A subsequence of a string is a sequence of characters that you can derive from the string by deleting zero or more characters, without changing the order of the characters that remain.

To understand the problem, consider the following example:

You are given two strings: String 1: AECFDEF and String 2: BACGDGBF

The output of the program will be ACDF because, among all the subsequences common to both strings, it is the longest one.
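The definition can be checked mechanically. The following is a minimal sketch, not part of the original problem; the class name SubsequenceCheck and the method isSubsequence are hypothetical names chosen for this illustration. It confirms that ACDF is a subsequence of both example strings.

```java
// Hypothetical helper, named for this example only.
final class SubsequenceCheck {

    // Returns true if every character of candidate appears in text
    // in the same relative order (not necessarily next to each other).
    static boolean isSubsequence(String candidate, String text) {
        int matched = 0; // how many characters of candidate have been matched so far
        for (int i = 0; i < text.length() && matched < candidate.length(); i++) {
            if (text.charAt(i) == candidate.charAt(matched)) {
                matched++;
            }
        }
        return matched == candidate.length();
    }

    public static void main(String[] args) {
        System.out.println(isSubsequence("ACDF", "AECFDEF"));  // true
        System.out.println(isSubsequence("ACDF", "BACGDGBF")); // true
        System.out.println(isSubsequence("FA", "AECFDEF"));    // false: order matters
    }
}
```

The check walks through the longer string once, so it runs in time proportional to the length of that string.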

So, now you may have an idea of what a subsequence of a string is. Let’s proceed with how you can find the longest common subsequence in two strings.

How To Find The Longest Common Subsequence In Two Strings

To find the longest common subsequence, you can use two methods. They are:

- Simple approach
- Dynamic Programming

Simple Approach

The simple approach is to check every subsequence of string 1 and determine whether it is also a subsequence of string 2, keeping the longest one that matches.

Let m and n be the lengths of the strings S1 and S2, respectively.

This idea is usually implemented as a recursive function that accepts the two strings along with two indices, one into each string, and starts from the last character of each.

The base case is when either remaining length is 0. An empty string has no common subsequence with anything, so the function returns 0.

Otherwise, compare the current characters at the two indices. If they are equal, the character is part of the longest common subsequence: count it and recurse on the previous character of each string.

If they are not equal, the two characters cannot both end the LCS. Recurse twice, once after dropping the current character of S1 and once after dropping the current character of S2, and take the larger of the two results.
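The recursive idea above can be sketched in Java as follows. This is a minimal illustration under the assumption that only the length is needed; the class name LcsRecursive and the method lcsLength are hypothetical names, not taken from any library.

```java
// Hypothetical class and method names, chosen for this example.
final class LcsRecursive {

    // Length of the LCS of the first m characters of s1
    // and the first n characters of s2.
    static int lcsLength(String s1, String s2, int m, int n) {
        // Base case: one of the prefixes is empty.
        if (m == 0 || n == 0) {
            return 0;
        }
        // Last characters match: count one and shrink both prefixes.
        if (s1.charAt(m - 1) == s2.charAt(n - 1)) {
            return 1 + lcsLength(s1, s2, m - 1, n - 1);
        }
        // Last characters differ: drop one character from either string
        // and keep the better of the two results.
        return Math.max(lcsLength(s1, s2, m - 1, n),
                        lcsLength(s1, s2, m, n - 1));
    }

    // Convenience overload that starts from the full strings.
    static int lcsLength(String s1, String s2) {
        return lcsLength(s1, s2, s1.length(), s2.length());
    }

    public static void main(String[] args) {
        System.out.println(lcsLength("AECFDEF", "BACGDGBF")); // 4
    }
}
```

This version is fine for short inputs like the example, but it recomputes the same prefixes many times and becomes very slow as the strings grow.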

Complexity Analysis

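- Time Complexity: In the worst case, when characters rarely match and every call branches into two more calls, the running time is bounded by O(2^(m+n)). Here, m and n are the lengths of the given strings.

- Space Complexity: The algorithm allocates no additional data structure, but the recursion itself uses the call stack. The recursion depth is at most m + n, so the space complexity is O(m + n).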

Dynamic Programming

The simple approach solves the same subproblems again and again, because its recursive calls overlap.

Dynamic programming removes this repeated work: it stores the answer to each subproblem in a table, so each one is computed only once.

The process includes the following steps:

To begin with, create a table with (m+1) rows and (n+1) columns, where m and n are the lengths of the given strings. Fill the first row and the first column of the table with 0.

After this, fill the remaining cells of the table one at a time, row by row, using the following two rules:

If the characters for the current row and column are the same, set the current cell to the value of the diagonal cell (one row up and one column to the left) plus 1. Also, point an arrow toward that diagonal cell.

If the characters are not the same, set the current cell to the larger of the value in the cell above and the value in the cell to the left. The arrow must point to whichever of those two cells holds the larger value. When both values are equal, you can point the arrow to either one.

Repeat these two rules until the table is filled completely.

The value in the bottom-right cell, at the last row and last column, is the length of the longest common subsequence.

To recover the subsequence itself, follow the arrows starting from that bottom-right cell. Every time you follow a diagonal arrow, the character for that cell belongs to the subsequence. Because you are walking backwards, the characters are collected in reverse order.
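The table-filling steps above can be sketched in Java like this. It is a minimal illustration that computes only the values, not the arrows; the class name LcsTable and the methods buildTable and lcsLength are hypothetical names chosen for this example.

```java
// Hypothetical class and method names, chosen for this example.
final class LcsTable {

    // Builds the (m+1) x (n+1) table where table[i][j] is the LCS length
    // of the first i characters of s1 and the first j characters of s2.
    static int[][] buildTable(String s1, String s2) {
        int m = s1.length();
        int n = s2.length();
        // Java zero-fills new int arrays, so row 0 and column 0 are already 0.
        int[][] table = new int[m + 1][n + 1];

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    // Characters match: diagonal value plus 1.
                    table[i][j] = table[i - 1][j - 1] + 1;
                } else {
                    // Characters differ: larger of the cell above and the cell to the left.
                    table[i][j] = Math.max(table[i - 1][j], table[i][j - 1]);
                }
            }
        }
        return table;
    }

    // The bottom-right cell holds the length of the LCS.
    static int lcsLength(String s1, String s2) {
        return buildTable(s1, s2)[s1.length()][s2.length()];
    }

    public static void main(String[] args) {
        System.out.println(lcsLength("AECFDEF", "BACGDGBF")); // 4
    }
}
```

Each of the (m+1)*(n+1) cells is filled exactly once, so this takes O(m*n) time and O(m*n) space.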

One thing to remember is that, as usually implemented, these methods return only the length of the longest common subsequence, not the subsequence itself. If you wish to print the actual characters, see the next section on printing the longest common subsequence.

How To Print The Longest Common Subsequence

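Printing the longest common subsequence builds on the dynamic programming table and then walks back through it. Here are the steps involved in the process.

To begin with, create and fill the table L[m+1][n+1] in the same way as in the dynamic programming approach.

Here, L[m][n] contains the length of the longest common subsequence. Create a character array to hold the result. It needs room for L[m][n] characters (C-style implementations allocate L[m][n] + 1 to leave space for a terminating null character, which Java does not need).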

Then walk through the 2D array beginning from L[m][n]. For each cell L[i][j], perform the following steps.

If the characters corresponding to L[i][j], that is, character i of the first string and character j of the second, are the same, include that character in the longest common subsequence and move diagonally to L[i-1][j-1]. Since the walk goes backwards, fill the character array from its last position toward its first.

Otherwise, compare the values of L[i-1][j] and L[i][j-1] and move in the direction of the larger value. Stop when i or j reaches 0.
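The steps above can be put together into the following sketch. It is one possible implementation rather than a definitive one; the class name LcsPrinter and the method lcs are hypothetical names, and the result array is sized to exactly L[m][n] characters because Java strings need no terminator.

```java
// Hypothetical class and method names, chosen for this example.
final class LcsPrinter {

    // Returns one longest common subsequence of s1 and s2.
    static String lcs(String s1, String s2) {
        int m = s1.length();
        int n = s2.length();

        // Step 1: fill the length table L, as in the dynamic programming approach.
        int[][] L = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    L[i][j] = L[i - 1][j - 1] + 1;
                } else {
                    L[i][j] = Math.max(L[i - 1][j], L[i][j - 1]);
                }
            }
        }

        // Step 2: walk back from L[m][n], filling the result from the end.
        char[] result = new char[L[m][n]];
        int index = result.length; // next free slot is index - 1
        int i = m;
        int j = n;
        while (i > 0 && j > 0) {
            if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                // Matching character: part of the LCS, move diagonally.
                result[--index] = s1.charAt(i - 1);
                i--;
                j--;
            } else if (L[i - 1][j] >= L[i][j - 1]) {
                i--; // the cell above is at least as large: move up
            } else {
                j--; // the cell to the left is larger: move left
            }
        }
        return new String(result);
    }

    public static void main(String[] args) {
        System.out.println(lcs("AECFDEF", "BACGDGBF")); // ACDF
    }
}
```

When two strings have several longest common subsequences of the same length, this sketch returns just one of them; which one depends on the tie-breaking choice between moving up and moving left.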

Conclusion

A subsequence of a string is a sequence of characters taken from that string in their original order, though not necessarily next to each other. To find the length of the longest common subsequence, you can use either a simple recursive method or dynamic programming.

Printing the longest common subsequence takes one more step: walking back through the dynamic programming table and collecting the matching characters, as explained in the steps above.