Question: My current code does not account for multiple white spaces and needs to reduce anytime there is more than one white space back to one.
My current code does not account for multiple white spaces and needs to reduce anytime there is more than one white space back to one. (Currently when reading a string with two white spaces in a row it assigns an extra token)
The main objectives of this lab include:
set up a 2d array (or matrix) with proper initial values using vector of vectors
given a string, implement a tokenizer to identify all the unique tokens contained within this string and the number of times (i.e., frequency) a given token appears in this string.
create a new datatype using struct to hold a token and its frequency; further store all the tokens and their frequencies into a vector
implement the insertion sort algorithm to sort the list of tokens in increasing order of frequency
implement the selection sort algorithm to sort the list of tokens in decreasing order of frequency
Submit a single .cpp program for grading. You are required to develop your program on a local IDE first. Then submit it for grading. You can submit your solution for grading up to 25 times.
1. Implement the following function to create a matrix of dimensionality numRows x numCols, where matrix starts with an initial size of 0. Furthermore, initialize the value at matrix[i][j] to the product of i and j.
void matrixInit( vector< vector>& matrix, int numRows, int numCols);
For example, if numRows is 3, numCols is 4, you are going to allocate space to matrix so that it contains three row vectors, where the size of each row vector will be 4. At the end of this function call, the content of matrix will be:
size of matrix is: 3x4 matrix[0][0]=0 matrix[0][1]=0 matrix[0][2]=0 matrix[0][3]=0 matrix[1][0]=0 matrix[1][1]=1 matrix[1][2]=2 matrix[1][3]=3 matrix[2][0]=0 matrix[2][1]=2 matrix[2][2]=4 matrix[2][3]=6
2. Given a string variable string s; , implement a tokenizer to identify the unique tokens contained in this string, identify all the unique tokens and their frequencies, and store such information into a vector. Tokens are sequences of contiguous characters separated by any of the specified delimiters (e.g., white spaces). In this lab, only white spaces will be considered as delimiters. For instance, the string "Hello, what's that thing? " contains four tokens: "Hello", "what's", "that" and "thing?". The frequency of a token is the number of times this token appears in this string. In this example, each token has a frequency of 1. Note that in this lab, these tokens are case insensitive. For example, "Hello" and "hello" are considered to be the same token.
Specifically, you are required to
declare a struct TokenFreq that consists of two data members: (1) string token; and (2) int freq; Obviously, an object of this struct will be used to store a specific token and its frequency. For example, the following object word stores the token "dream" and its frequency 100
TokenFreq word; word.token="dream"; word.freq=100;
implement the following function, where istr is the input string, and tfVec will be used to store the list of unique and case insensitive tokens and their corresponding frequencies identified within istr. You might find it's very convenient to use stringstream objects to tokenize a string.
void getTokenFreqVec( string& istr, vector& tfVec)
Assume that the value of istr is
And no, I'm not a walking C++ dictionary. I do not keep every technical detail in my head at all times. If I did that, I would be a much poorer programmer. I do keep the main points straight in my head most of the time, and I do know where to find the details when I need them. by Bjarne Stroustrup
After calling the above function, tfVec is expected to contain the following values (where order of appearances doesn't matter):
size=46 { [0] = (token = "and", freq = 2) [1] = (token = "no,", freq = 1) [2] = (token = "i'm", freq = 1) [3] = (token = "not", freq = 2) [4] = (token = "a", freq = 2) [5] = (token = "walking", freq = 1) [6] = (token = "c++", freq = 1) [7] = (token = "dictionary.", freq = 1) [8] = (token = "i", freq = 6) [9] = (token = "do", freq = 3) [10] = (token = "keep", freq = 2) [11] = (token = "every", freq = 1) [12] = (token = "technical", freq = 1) [13] = (token = "detail", freq = 1) [14] = (token = "in", freq = 2) [15] = (token = "my", freq = 2) [16] = (token = "head", freq = 2) [17] = (token = "at", freq = 1) [18] = (token = "all", freq = 1) [19] = (token = "times.", freq = 1) [20] = (token = "if", freq = 1) [21] = (token = "did", freq = 1) [22] = (token = "that,", freq = 1) [23] = (token = "would", freq = 1) [24] = (token = "be", freq = 1) [25] = (token = "much", freq = 1) [26] = (token = "poorer", freq = 1) [27] = (token = "programmer.", freq = 1) [28] = (token = "the", freq = 3) [29] = (token = "main", freq = 1) [30] = (token = "points", freq = 1) [31] = (token = "straight", freq = 1) [32] = (token = "most", freq = 1) [33] = (token = "of", freq = 1) [34] = (token = "time,", freq = 1) [35] = (token = "know", freq = 1) [36] = (token = "where", freq = 1) [37] = (token = "to", freq = 1) [38] = (token = "find", freq = 1) [39] = (token = "details", freq = 1) [40] = (token = "when", freq = 1) [41] = (token = "need", freq = 1) [42] = (token = "them.", freq = 1) [43] = (token = "by", freq = 1) [44] = (token = "bjarne", freq = 1) [45] = (token = "stroustrup", freq = 1) Implement the selection sort algorithm to sort a vector in ascending order of token frequency. The pseudo code of the selection algorithm can be found at this webpage. You can also watch an animation of the sorting process at http://visualgo.net/sorting -->under "select". This function has the following prototype:
void selectionSort( vector& tokFreqVector ); //This function receives a vector of TokenFreq objects by reference and applies the selections sort algorithm to sort this vector in increasing order of token frequencies.
Implement the insertion sort algorithm to sort a vector in descending order of token frequency. The pseudo code of the selection algorithm can be found at [http://www.algolist.net/Algorithms/Sorting/Insertionsort}(http://www.algolist.net/Algorithms/Sorting/Insertionsort). Use the same link above to watch an animation of this algorithm. This function has the following prototype:
void insertionSort( vector& tokFreqVector );
(CODE)
#include
#include
#include
#include
#include
using namespace std;
/* below is the struct TokenFreq which
will store token and it's corresponding frequency count*/
struct TokenFreq {
string token;
int freq = 1;
};
/* below function will create a matrix of size numRows x numCols
and also initialize with i x j*/
void matrixInit(vector< vector
int i;
int j;
matrix.resize(numRows, vector
for (i = 0; i< numRows; i++) {
for (j = 0; j < numCols; j++)
matrix[i][j] = i*j;
}
}
void getTokenFreqVec(const string &istr, vector
string token;
int i;
int flag;
TokenFreq t;
istringstream isStream(istr);
while (getline(isStream, token, ' ')) {
transform(token.begin(), token.end(), token.begin(), ::tolower);
flag = 1;
for (i = 0; i if (tfVec[i].token == token) { tfVec[i].freq += 1; flag = 0; } } if (flag) { t.token = token; tfVec.push_back(t); } } } void selectionSort(vector int i; int j; int minT; TokenFreq token; for (i = 0; i minT = i; for (j = i + 1; j if (tokFreqVector[j].freq < tokFreqVector[minT].freq) { token = tokFreqVector[minT]; tokFreqVector[minT] = tokFreqVector[j]; tokFreqVector[j] = token; } } } } void insertionSort(vector int i; int j; TokenFreq token; for (i = 0; i token = tokFreqVector[i]; for (j = i - 1; j >= 0 && tokFreqVector[j].freq < token.freq; j--) tokFreqVector[j + 1] = tokFreqVector[j]; tokFreqVector[j + 1] = token; } } /* below main function will test the correctness of our above defined function*/ int main() { cout << "Please enter the string you want to tokenise : "; string istr; //istr = "And no, I'm not a walking C++ dictionary. I do not keep every technical detail in my head at all times. If I did that, I would be a much poorer programmer. I do keep the main points straight in my head most of the time, and I do know where to find the details when I need them. by Bjarne Stroustrup"; getline(cin, istr); vector vector int i; int j; matrixInit(matrix, 3, 4); cout << " Size of matrix is: 3x4 "; for (i = 0; i< 3; i++) { for (j = 0; j < 4; j++) cout << " matrix[" << i << "][" << j << "]=" << matrix[i][j]; } getTokenFreqVec(istr, tfVec); cout << " Testing Tokenizer. Size = " << tfVec.size() << " {" << endl; for (i = 0; i < tfVec.size(); i++) cout << " [" << i << "]"<< " = (token = " << tfVec[i].token << ",Freq = " << tfVec[i].freq << ")" << endl; selectionSort(tfVec); cout << " Testing selection sort. "; for (i = 0; i < tfVec.size(); i++) cout << " [" << i << "]" << " = (token = " << tfVec[i].token << ",Freq = " << tfVec[i].freq << ")" << endl; insertionSort(tfVec); cout << " Testing insertion sort. "; for (i = 0; i < tfVec.size(); i++) cout << " [" << i << "]" << " = (token = " << tfVec[i].token << ",Freq = " << tfVec[i].freq << ")" << endl; return 0; }
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
