Finding Longest substring having exactly ‘k’ unique characters

We can start with a simple case k = 2 then we can generalize our solution.

Algorithm

1) Take 3 pointers to keep track of two ends of substring and start of last substring.
2) Increment end pointer until we find second different character while start points to first char.
3) If end pointer equals length of string return as string has one type of char only (base case).
4) Else store current two different characters in an array.
5) Loop until string length
6) Check if current character is same or different
7) If we have one of the same character then check if current character is different than previous character then update lastGroupStart variable to keep track of new potential longest substring.
8) If we have different character then compare local maximal substring with global maximal substring length. Update the start, lastGroupStart variable and correspondingly char array for new potential substring.
9) After loop check for one more potential substring from start to string length.

#include<iostream>
#include<string>

using namespace std;

string longestTwoUniqueChar(string str)
{
    int start = 0;
    int i = 1;
    string res = "";
    string maxs = "";

    while (i < str.length() && str[i] == str[start])
        i++;

    if (i == str.length())
        return str;

    char chars[2] = {str[start], str[i]};
    int lastGroupStart = 0;

    while (i < str.length())
    {
        if (str[i] == chars[0] || str[i] == chars[1])
        {
            if (str[i] != str[i - 1])
                lastGroupStart = i;
        }
        else
        {
            res = str.substr(start,i-start);
            if(res.length() > maxs.length())
                maxs = res;
        
            start = lastGroupStart;
            lastGroupStart = i;
            chars[0] = str[start];
            chars[1] = str[lastGroupStart];
        }
        i++;
    }
      res = str.substr(start,str.length()-start);
      if(res.length() > maxs.length())
        maxs = res;
      return res;
}

int main()
{
    string s  = "aabaaddaa";
    cout<<longestTwoUniqueChar(s)<<endl; 
    return 0;
}

Now we can come to generic solution of ‘k’ unique character substring.

Algorithm :-

1) Take an unordered map with key as character and its last position as value.
2) Loop through the length of string.
3) If size of map is less than ‘k’ or we have previously seen present character then update map entry.
4) Else we have found new character (k+1 th),
4a) Update the global maxima.
4b) Delete the earliest character from map which is not previous character.
4c) update the start pointer and map entry.
5) update the global maxima after loop.

#include<iostream>
#include<string>
#include<unordered_map>
#include<limits.h>

using namespace std;

string longestKCharSubstring(string s, int k)
{
    int i, max_len = 0, start = 0;
    // either unique char & its last pos
    unordered_map<char, int> ht;
    string res = "";
    string maxs = "";
    char prev_char;

    for (i = 0; i < s.size(); i++)
    {
        if (ht.size() < k || ht.find(s[i]) != ht.end())
        {
            ht[s[i]] = i;
            prev_char = s[i];
        }
        else
        { 
            // map size greater than k and character does not appear in first k unique character         
            // (k + 1)-th char
            res = s.substr(start,i-start);
            if(res.length() > maxs.length())
                maxs = res;
           
            // start points to the next of the earliest char
            char earliest_char;          
            for (auto key : ht)
                if (key.second < INT_MAX && key.first!=prev_char)
                    earliest_char = key.first;          
            start = ht[earliest_char] + 1;
            // replace earliest_char         
            ht.erase(earliest_char);
            ht[s[i]] = i;
        }
    }
    res = s.substr(start,i-start);
     if(res.length() > maxs.length())
                maxs = res;
    return maxs;
}


int main()
{
    string s  = "aaabbbbaaaacccccccc";  
    cout<<longestKCharSubstring(s,2)<<endl;
    return 0;
}

About neer1304

An INTJ Curious Geek
This entry was posted in Algorithms and tagged , . Bookmark the permalink.

Leave a comment