The longest common subsequence and the longest common substring are not the same thing: a subsequence keeps the original order of its elements but does not have to be contiguous, whereas a substring must be contiguous. This article walks through implementing the longest common subsequence in JavaScript.
Introduction
The longest common subsequence (LCS) of two given sequences X and Y is obtained by taking as many characters as possible from both, arranged in the order in which they appear in the original sequences. Algorithms for the LCS problem have a wide range of uses. In version management, LCS is used to find the similarities and differences between old and new versions; in software testing, it is used to compare recorded and replayed sequences; in genetic engineering, it is used to compare a patient's DNA strand with a healthy DNA strand; in anti-plagiarism systems, it is used to estimate how much of a paper is copied. LCS can also be used for measuring program code similarity, retrieving human motion sequences, matching video segments, and so on, so research on LCS algorithms has high practical value.
Basic concepts
Subsequence: a subsequence of a given sequence is what remains after removing zero or more elements from it, without changing the relative order of the remaining elements. For example, the sequence ABCDAB has subsequences such as AB, BDA, and ACAB.
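The definition can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper named `isSubsequence` (it is not part of the original article), that scans the sequence once and consumes the candidate's characters in order:

```javascript
// Hypothetical helper: returns true if `sub` can be obtained from `str`
// by deleting zero or more characters without reordering the rest.
function isSubsequence(sub, str) {
  var k = 0; // index of the next character of `sub` still to be matched
  for (var i = 0; i < str.length && k < sub.length; i++) {
    if (str[i] === sub[k]) {
      k++;
    }
  }
  return k === sub.length;
}

console.log(isSubsequence("BDA", "ABCDAB")); // true
console.log(isSubsequence("BA", "ABCDAB"));  // true (B at index 1, A at index 4)
console.log(isSubsequence("DC", "ABCDAB"));  // false: no C appears after the D
```

The empty string counts as a subsequence of every string, which is why the tables that follow start with an empty-string row and column.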
Take X = "ABCDAB" and Y = "BDCABA". Start with the shortest prefix of each, that is, compare the empty string with the empty string. The solution of the LCS recurrence is a number, so the table is filled only with numbers. The length of the common part of two empty strings is 0.
x    ""   B    D    C    A    B    A
""   0
A
B
C
D
A
B
Next, keep X as the empty string and let Y contribute "B". Obviously the length of their common part is 0. Replacing Y with other characters such as D or C, or with longer runs such as DC or DDC, changes nothing: it is still 0. So the first row is all 0. Likewise, if Y stays empty while X grows, the same analysis applies and the first column is all 0.
x    ""   B    D    C    A    B    A
""   0    0    0    0    0    0    0
A    0
B    0
C    0
D    0
A    0
B    0
The LCS problem differs a little from the knapsack problem here. In the knapsack problem you can still set up a row -1, whereas in the LCS problem the empty subsequence fixes the left column and the top row from the very beginning.
Now enlarge the problem a little: this time both sides contribute one character. Obviously, only when the two characters are the same is there a common subsequence other than the empty string, and its length is 1. Take X as "A" and let Y run through the prefixes of "BDCA".
x    ""   B    D    C    A    B    A
""   0    0    0    0    0    0    0
A    0    0    0    0    1
B    0
C    0
D    0
A    0
B    0
Continue filling in the cells to the right. What should go there? Obviously the LCS cannot be longer than X, which has length 1 here, and every prefix of Y that already contains an A still has A in common with X, so the value stays 1.
x    ""   B    D    C    A    B    A
""   0    0    0    0    0    0    0
A    0    0    0    0    1    1    1
B    0
C    0
D    0
A    0
B    0
Now let X contribute two characters, "AB". Look at B first: ${X_1}$ equals ${Y_0}$, so we get a new common subsequence and should add 1. Why? Because the matrix is a state table that describes the state transitions from left to right and from top to bottom, and each state builds on states that already exist. What we need to pin down is the relationship between the value of the cell being filled and the values of the neighbouring cells that are already filled. For now there is too little information and this cell is an isolated point, so just fill in 1.
x    ""   B    D    C    A    B    A
""   0    0    0    0    0    0    0
A    0    0    0    0    1    1    1
B    0    1
C    0
D    0
A    0
B    0
Next let Y bring in an extra D: {"",A,B,AB} versus {"",B,D,BD}. Obviously the value is still 1. Keep filling in 1 up to just before the second B of Y, because once Y reaches BDCAB the two have a longer common subsequence, AB.
x    ""   B    D    C    A    B    A
""   0    0    0    0    0    0    0
A    0    0    0    0    1    1    1
B    0    1    1    1    1    2
C    0
D    0
A    0
B    0
At this step, we can summarize some rules. Then we will verify our ideas through calculations and add new rules or limiting conditions to improve them.
Let Y contribute all of its characters while X still has two. On inspection, the value is still 2.
Now look at the next row, where X contributes one more character, C. The subsequence set of ABC is larger than that of AB, so what it shares with the subsequence set of Y's prefix B can only be larger or the same, never smaller. Here the new C clearly adds nothing, because it is not a character the two have in common, so the value is the same as for AB.
x    ""   B    D    C    A    B    A
""   0    0    0    0    0    0    0
A    0    0    0    0    1    1    1
B    0    1    1    1    1    2    2
C    0    1
D    0
A    0
B    0
We can now be sure that if the two characters being compared are different, the cell to be filled depends on the cell to its left and the cell above it, and takes the larger of the two.
The case where the compared characters are equal comes up right away: the C of X has to be compared with the C of Y. That is, the subsequence set of ABC, {"",A,B,C,AB,AC,BC,ABC}, is compared with the subsequence set of BDC, {"",B,D,C,BD,BC,DC,BDC}, and the common subsequences are "", B, C, and BC. The longest has length 2, one more than the value 1 for AB versus BD in the upper-left cell. So when the characters are equal, the cell value equals the value of the upper-left cell plus 1. In this particular cell the left, upper, and upper-left neighbours also happen to be equal. A rigorous justification takes a little more mathematics.
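The comparison of subsequence sets can be reproduced by brute force for short strings. This is only an illustrative sketch: `subsequenceSet` and `commonSubsequences` are hypothetical helper names, and the bitmask enumeration is practical only for a handful of characters.

```javascript
// Hypothetical helper: collect every distinct subsequence of `str` in a Set.
// Each bitmask from 0 to 2^n - 1 selects which characters to keep.
function subsequenceSet(str) {
  var result = new Set();
  var total = 1 << str.length; // only sensible for short strings
  for (var mask = 0; mask < total; mask++) {
    var s = "";
    for (var k = 0; k < str.length; k++) {
      if (mask & (1 << k)) {
        s += str[k];
      }
    }
    result.add(s);
  }
  return result;
}

// Hypothetical helper: subsequences that two short strings have in common.
function commonSubsequences(a, b) {
  var setB = subsequenceSet(b);
  return Array.from(subsequenceSet(a)).filter(function (s) {
    return setB.has(s);
  });
}

var common = commonSubsequences("ABC", "BDC");
console.log(common); // [ '', 'B', 'C', 'BC' ]
```

The longest entry, BC, has length 2, which is the value that belongs in the cell for row C and column C.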
Suppose there are two arrays, A and B. A[i] is the i-th element of A, and A(i) is the prefix consisting of the first element to the i-th element of A. m(i, j) is the longest common subsequence length of A(i) and B(j).
Because the algorithm is recursive in nature, we only need to prove that for any i and j:
$m(i, j) = m(i-1, j-1) + 1$ when $A[i] = B[j]$
$m(i, j) = \max(m(i-1, j), m(i, j-1))$ when $A[i] \ne B[j]$
The first formula is easy to prove, that is, the case A[i] = B[j]. Use proof by contradiction: assume m(i, j) > m(i-1, j-1) + 1 (m(i, j) cannot be less than m(i-1, j-1) + 1, for the obvious reason). Then we can derive the contradictory result that m(i-1, j-1) is not the longest.
The second is a bit trickier. When A[i] != B[j], again argue by contradiction and assume m(i, j) > max( m(i-1, j), m(i, j-1) ).
From this assumption we get m(i, j) > m(i-1, j), which implies that A[i] must be in the LCS corresponding to m(i, j) (this can itself be shown by contradiction). Since A[i] != B[j], B[j] cannot also be in that LCS, so m(i, j) = m(i, j-1). Either way this contradicts the assumption.
This completes the proof.
We can now use these two rules to finish filling in the table.
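To see the finished table for the running example, the two rules can be applied cell by cell in code. This is a minimal sketch; `lcsTable` is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper: build the full (m+1) x (n+1) table of LCS lengths.
function lcsTable(x, y) {
  var dp = [];
  for (var i = 0; i <= x.length; i++) {
    dp[i] = [];
    for (var j = 0; j <= y.length; j++) {
      if (i === 0 || j === 0) {
        dp[i][j] = 0; // an empty prefix has nothing in common with anything
      } else if (x[i - 1] === y[j - 1]) {
        dp[i][j] = dp[i - 1][j - 1] + 1; // equal characters: upper-left + 1
      } else {
        dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]); // larger of top and left
      }
    }
  }
  return dp;
}

var table = lcsTable("ABCDAB", "BDCABA");
table.forEach(function (row, i) {
  var label = i === 0 ? '""' : "ABCDAB"[i - 1];
  console.log(label + "\t" + row.join(" ")); // one table row per line
});
```

The lower-right cell, `table[6][6]`, holds the LCS length of the two full strings, which is 4 for this pair (BCAB is one such subsequence).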
Program implementation
// by 司徒正美
function LCS(str1, str2){
    var rows = str1.split("")
    rows.unshift("")
    var cols = str2.split("")
    cols.unshift("")
    var m = rows.length
    var n = cols.length
    var dp = []
    for(var i = 0; i < m; i++){
        dp[i] = []
        for(var j = 0; j < n; j++){
            if(i === 0 || j === 0){
                dp[i][j] = 0
                continue
            }
            if(rows[i] === cols[j]){
                dp[i][j] = dp[i-1][j-1] + 1 // upper-left + 1
            }else{
                dp[i][j] = Math.max( dp[i-1][j], dp[i][j-1]) // larger of top and left
            }
        }
        console.log(dp[i].join("")) // debug output
    }
    return dp[m-1][n-1] // lower-right cell of the table
}
The LCS function can be simplified further: just shift the indices, which removes the need to build new arrays with a leading empty string.
// by 司徒正美
function LCS(str1, str2){
    var m = str1.length
    var n = str2.length
    var dp = [new Array(n+1).fill(0)] // the first row is all 0
    for(var i = 1; i <= m; i++){ // m+1 rows in total
        dp[i] = [0] // the first column is all 0
        for(var j = 1; j <= n; j++){ // n+1 columns in total
            if(str1[i-1] === str2[j-1]){
                // note: the first character of str1 sits at table index 1,
                // so subtract 1; the same goes for str2
                dp[i][j] = dp[i-1][j-1] + 1 // upper-left + 1
            } else {
                dp[i][j] = Math.max( dp[i-1][j], dp[i][j-1])
            }
        }
    }
    return dp[m][n];
}
Print an LCS
Next we give the printing functions, starting with how to print a single LCS. We start from the lower-right corner of the table and work back until we reach the top row or the left column, so the target string is constructed in reverse order. To avoid awkward intermediate objects such as a StringBuffer, we can implement it recursively: each call returns a single string, or an empty string at the boundary. Concatenating the result of printLCS(...) with str1[i-1] gives the string we are after.
We also write another method to verify that the string we get really is a common subsequence of both inputs. As someone who already works in the industry, I cannot write code the way a student does and put it online without unit tests for others to trip over.
// by 司徒正美: print one LCS
function printLCS(dp, str1, str2, i, j){
    if (i == 0 || j == 0){
        return "";
    }
    if( str1[i-1] == str2[j-1] ){
        return printLCS(dp, str1, str2, i-1, j-1) + str1[i-1];
    }else{
        if (dp[i][j-1] > dp[i-1][j]){
            return printLCS(dp, str1, str2, i, j-1);
        }else{
            return printLCS(dp, str1, str2, i-1, j);
        }
    }
}
// by 司徒正美: turn the target string into a regular expression and check
// that it is a subsequence of both input strings
function validateLCS(el, str1, str2){
    var pattern = el.split("").map(function(ch){
        return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    }).join(".*")
    var re = new RegExp(pattern)
    console.log(el, re.test(str1), re.test(str2))
    return re.test(str1) && re.test(str2)
}
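If recursion is not wanted, the same walk from the lower-right corner can be written as a loop. The following is a sketch under the assumption that ties are broken the same way as in printLCS; `lcsString` is a hypothetical name, and the function rebuilds the table itself so that it can be run on its own.

```javascript
// Hypothetical helper: returns one LCS of str1 and str2 without recursion.
function lcsString(str1, str2) {
  var m = str1.length;
  var n = str2.length;
  // Build the (m+1) x (n+1) table of LCS lengths.
  var dp = [];
  for (var i = 0; i <= m; i++) {
    dp[i] = [0];
    for (var j = 1; j <= n; j++) {
      if (i === 0) {
        dp[i][j] = 0;
      } else if (str1[i - 1] === str2[j - 1]) {
        dp[i][j] = dp[i - 1][j - 1] + 1;
      } else {
        dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
      }
    }
  }
  // Walk back from the lower-right corner, collecting matched characters.
  var chars = [];
  i = m;
  j = n;
  while (i > 0 && j > 0) {
    if (str1[i - 1] === str2[j - 1]) {
      chars.push(str1[i - 1]); // matched character, found in reverse order
      i--;
      j--;
    } else if (dp[i][j - 1] > dp[i - 1][j]) {
      j--; // the left neighbour carries the longer result
    } else {
      i--; // otherwise move up
    }
  }
  return chars.reverse().join("");
}

console.log(lcsString("ABCBDAB", "BDCABA").length); // 4
```

Collecting characters in an array and reversing once at the end plays the role of the intermediate buffer that the recursive version avoids.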
Usage:
function LCS(str1, str2){
    var m = str1.length
    var n = str2.length
    // ... omitted, fill in yourself
    var s = printLCS(dp, str1, str2, m, n)
    validateLCS(s, str1, str2)
    return dp[m][n]
}
var c1 = LCS( "ABCBDAB","BDCABA");
console.log(c1) // 4; the LCSs are BCBA, BCAB, BDAB
var c2 = LCS("13456778" , "357486782" );
console.log(c2) // 5; 34678
var c3 = LCS("ACCGGTCGAGTGCGCGGAAGCCGGCCGAA" ,"GTCGTTCGGAATGCCGTTGCTCTGTAAA" );
console.log(c3) // 20; GTCGTCGGAAGCCGGCCGAA
Print all LCS
The idea is similar to the above. Note that the LCS function contains a Math.max call, which actually folds three cases together (the left value is larger, the upper value is larger, or the two are equal), so the backtracking can branch. Our method returns an ES6 Set so that duplicates are removed automatically, and at each step a new set is built by merging in the strings from the sets returned by the recursive calls.
// by 司徒正美: print all LCSs
function printAllLCS(dp, str1, str2, i, j){
    if (i == 0 || j == 0){
        return new Set([""])
    }else if(str1[i-1] == str2[j-1]){
        var newSet = new Set()
        printAllLCS(dp, str1, str2, i-1, j-1).forEach(function(el){
            newSet.add(el + str1[i-1])
        })
        return newSet
    }else{
        var set = new Set()
        if (dp[i][j-1] >= dp[i-1][j]){
            printAllLCS(dp, str1, str2, i, j-1).forEach(function(el){
                set.add(el)
            })
        }
        if (dp[i-1][j] >= dp[i][j-1]){ // must use >= here; a plain else is not enough
            printAllLCS(dp, str1, str2, i-1, j).forEach(function(el){
                set.add(el)
            })
        }
        return set
    }
}
Usage:
function LCS(str1, str2){
    var m = str1.length
    var n = str2.length
    // ... omitted, fill in yourself
    var s = printAllLCS(dp, str1, str2, m, n)
    console.log(s)
    s.forEach(function(el){
        validateLCS(el,str1, str2)
        console.log("LCS output", el)
    })
    return dp[m][n]
}
var c1 = LCS( "ABCBDAB","BDCABA");
console.log(c1) // 4; the LCSs are BCBA, BCAB, BDAB
var c2 = LCS("13456778" , "357486782" );
console.log(c2) // 5; 34678
var c3 = LCS("ACCGGTCGAGTGCGCGGAAGCCGGCCGAA" ,"GTCGTTCGGAATGCCGTTGCTCTGTAAA" );
console.log(c3) // 20; GTCGTCGGAAGCCGGCCGAA
Space optimization
Using a rolling array:
function LCS(str1, str2){
    var m = str1.length
    var n = str2.length
    var dp = [new Array(n+1).fill(0)], now = 1, row // the first row is all 0
    for(var i = 1; i <= m; i++){ // only 2 rows are kept
        row = dp[now] = [0] // the first column is all 0
        for(var j = 1; j <= n; j++){ // n+1 columns in total
            if(str1[i-1] === str2[j-1]){
                // note: character i of str1 is str1[i-1]; the same goes for str2
                dp[now][j] = dp[1-now][j-1] + 1 // upper-left + 1; dp[1-now] is the previous row
            } else {
                dp[now][j] = Math.max( dp[1-now][j], dp[now][j-1])
            }
        }
        now = 1 - now; // 1-1=>0; 1-0=>1; 1-1=>0 ...
    }
    return row ? row[n] : 0
}
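The space can be reduced a little further. As a sketch of one possible variant (not the author's code), a single row of n+1 numbers is enough if the upper-left value is saved in a temporary variable before it is overwritten; `lcsLengthOneRow` is a hypothetical name.

```javascript
// Hypothetical variant: LCS length using a single row of n+1 numbers.
function lcsLengthOneRow(str1, str2) {
  var m = str1.length;
  var n = str2.length;
  // Holds row i-1 before the inner loop runs and row i after it finishes.
  var row = new Array(n + 1).fill(0);
  for (var i = 1; i <= m; i++) {
    var diag = 0; // value of dp[i-1][j-1]; dp[i-1][0] is always 0
    for (var j = 1; j <= n; j++) {
      var up = row[j]; // dp[i-1][j], about to be overwritten
      if (str1[i - 1] === str2[j - 1]) {
        row[j] = diag + 1; // upper-left + 1
      } else {
        row[j] = Math.max(up, row[j - 1]); // row[j-1] already holds dp[i][j-1]
      }
      diag = up; // becomes dp[i-1][j-1] for the next column
    }
  }
  return row[n];
}

console.log(lcsLengthOneRow("ABCBDAB", "BDCABA")); // 4
```

Like the rolling array, this keeps only the length; printing an LCS still needs the full table, because the backtracking reads cells from earlier rows.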
Dangerous recursive solution
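A subsequence of str1 corresponds to a subsequence of the index sequence {1, 2, …, m}, so str1 has up to ${2^m}$ different subsequences (likewise str2 has up to ${2^n}$). Comparing them all by brute force therefore takes an astonishing exponential time (${2^m \cdot 2^n}$). The plain recursion below avoids enumerating them explicitly, but it recomputes the same subproblems over and over, so its running time can still grow exponentially.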
// Warning: this becomes impractically slow (and can exhaust the call stack)
// once the strings get long
function LCS(str1, str2, a, b) {
    if(a === void 0){
        a = str1.length - 1
    }
    if(b === void 0){
        b = str2.length - 1
    }
    if(a == -1 || b == -1){
        return 0
    }
    if(str1[a] == str2[b]) {
        return LCS(str1, str2, a-1, b-1)+1;
    }
    // the characters differ: take the larger of the two sub-results
    var x = LCS(str1, str2, a, b-1)
    var y = LCS(str1, str2, a-1, b)
    return x >= y ? x : y
}
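The repeated work in the plain recursion can be removed by caching each (a, b) result. The following is a sketch of that idea rather than part of the original article; `lcsMemo` and its inner `solve` function are hypothetical names.

```javascript
// Hypothetical variant of the recursive solution with a cache keyed by (a, b).
function lcsMemo(str1, str2) {
  var memo = new Map();
  function solve(a, b) {
    if (a < 0 || b < 0) {
      return 0; // one of the prefixes is empty
    }
    var key = a + "," + b;
    if (memo.has(key)) {
      return memo.get(key); // already computed for this pair of indices
    }
    var result;
    if (str1[a] === str2[b]) {
      result = solve(a - 1, b - 1) + 1;
    } else {
      result = Math.max(solve(a, b - 1), solve(a - 1, b));
    }
    memo.set(key, result);
    return result;
  }
  return solve(str1.length - 1, str2.length - 1);
}

console.log(lcsMemo("ABCBDAB", "BDCABA")); // 4
```

Each pair of indices is now solved at most once, so the work is bounded by the number of table cells. The recursion depth can still reach m + n, so for very long strings the iterative table is the safer choice.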