**Given a string, please get the length of the longest substring which does not have duplicated characters. Supposing all characters in the string are in the range from ‘a’ to ‘z’.**

__Problem:__**It’s not difficult to get all substrings of a string, and to check whether a substring has duplicated characters. The only concern about this brute-force strategy is performance. A string with**

__Analysis:__*n*characters has O(

*n*) substrings, and it costs O(

^{2}*n*) time to check whether a substring has duplication. Therefore, the overall cost is O(

*n*).

^{3}
We may improve the efficiency with dynamic programming. Let’s
denote the length of longest substring ending with the

*i*character by^{th}*L*(*i*).
We scan the string one character after another. When the

*i*character is scanned,^{th}*L*(*i*-1) is already know. If the*i*character has not appeared before,^{th}*L*(*i*) should be*L*(*i*-1)+1. It’s more complex when the*i*character is duplicated. Firstly we get the distance between the^{th}*i*character and its previous occurrence. If the distance is greater than^{th}*L*(*i*-1), the character is not in longest substring without duplication ending with the (*i*-1)*character, so*^{th}*L*(*i*) should also be*L*(*i*-1)+1. If the distance is less than*L*(*i*-1),*L*(*i*) is the distance, and it means between the two occurrence of the*i*character there are no other duplicated characters.^{th}
This solution can be implemented in Java as the following
code:

public static int longestSubstringWithoutDuplication(String str) {

int curLength = 0;

int maxLength = 0;

int position[] = new int[26];

for(int i = 0; i < 26; ++i) {

position[i] = -1;

}

for(int i = 0; i < str.length(); ++i) {

int prevIndex = position[str.charAt(i) - 'a'];

if(prevIndex < 0 || i - prevIndex > curLength) {

++curLength;

}

else {

if(curLength > maxLength) {

maxLength = curLength;

}

curLength = i - prevIndex;

}

position[str.charAt(i) - 'a'] = i;

}

if(curLength > maxLength) {

maxLength = curLength;

}

return maxLength;

}

*L*(

*i*) is implemented as curLength in the code above. An integer array is used to store the positions of each character.

More coding interview questions are discussed in my book
<Coding Interviews: Questions, Analysis & Solutions>. You may find
the details of this book on Amazon.com, or Apress.

The
author Harry He owns all the rights of this post. If you are going to use part
of or the whole of this ariticle in your blog or webpages, please add a
reference to http://codercareer.blogspot.com/.
If you are going to use it in your books, please contact him via
zhedahht@gmail.com . Thanks.

Hello,

ReplyDeletehere my solution in C++:

string find_longest_sub_str(const string &s) {

unordered_set ht;

string max_str;

string s1;

for(int i = 0; i < s.size(); i++) {

auto it = ht.find(s[i]);

bool dublicate = false;

// no dublicate so far

if(it == ht.end()) {

s1 += s[i];

ht.insert(s[i]);

}

// dublicate found

else {

ht.clear();

dublicate = true;

}

// if current substr > max_substr

if(s1.size() > max_str.size()) {

max_str = s1;

}

// clear ht and string s1 and set start for next substr

if(dublicate) {

ht.clear();

s1.clear();

ht.insert(s[i]);

s1 += s[i];

}

}

return max_str;

}

public static void main(String[] args) {

ReplyDeleteSystem.out.println(longestSubstringWithoutDuplication("abcde"));

System.out.println(longestSubstringWithoutDuplication("ababa"));

System.out.println(longestSubstringWithoutDuplication("aabbc"));

System.out.println(longestSubstringWithoutDuplication("abcdabcd"));

System.out.println(longestSubstringWithoutDuplication("ababcde"));

}

public static int longestSubstringWithoutDuplication(String s) {

int[] prevP = new int[26];

int[] dp = new int[s.length() + 1];

int maxLen = 0;

for (int i = 0; i < prevP.length; i++)

prevP[i] = -1;

for (int i = 1; i <= s.length(); i++) {

if (prevP[s.charAt(i - 1) - 'a'] < 0

|| i - 1 - prevP[s.charAt(i - 1) - 'a'] > dp[i - 1]) {

dp[i] = dp[i - 1] + 1;

} else {

dp[i] = i - 1 - prevP[s.charAt(i - 1) - 'a'];

}

prevP[s.charAt(i - 1) - 'a'] = i - 1;

maxLen = Math.max(maxLen, dp[i]);

}

return maxLen;

}

Great blogs. I was wondering that if we can update the program to handle space, upper character etc. Example: "this is". In this case as space is not handle, it throw error. Thanks

ReplyDeleteCheck this for c# interview questions @ http://skillgun.com

ReplyDeleteThis comment has been removed by the author.

ReplyDeleteGreat article. This is a very popular interview question. I also wrote an in-depth analysis at https://goo.gl/UaLw1E.

ReplyDeleteThis comment has been removed by the author.

ReplyDeleteThis solution is incorrect. The line "If the ith character has not appeared before, L(i) should be L(i-1)+1" is not always true. What if there was a repetition just before i, and L(i) represents a sub string from (say) 0 to (i-5)?

ReplyDeleteSimple Bottom Up DP solution O(N^2) :

ReplyDelete#include

using namespace std;

int main() {

//code

int t;

cin>>t;

while(t--)

{

string temp;

cin>>temp;

int n = temp.length();

bool dp[n][n];

memset(dp,false,sizeof(dp));

int max_len = 1;

for(int i = 0; i < n - 1; i++)

{

if(temp[i] == temp[i+1])

dp[i][i+1] = true;

else

max_len = 2;

}

for(int l = 3; l <= n; l++)

{

for(int i = 0; i <= (n-l); i++)

{

int j = i+l-1;

if((temp[i] != temp[j])&&(!dp[i][j-1]&& !dp[i+1][j]))

{

dp[i][j] = false;

max_len = max(max_len,l);

}

else

dp[i][j] = true;

}

}

cout<<max_len<<endl;

}

return 0;

}