The two problems of this lab concern Java strings and substrings. Contrary to how things tend to
Question:
The two problems of this lab concern Java strings and substrings. Contrary to how things tend to work in other languages such as Python, Java 7 took advantage of the immutability of its String type by having the new object returned by the substring method share the same underlying character array with the original string. Since the contents of this shared char[] will never change, no harm can result in having multiple String objects share that same array. Immutability, huzzah! This policy allowed the substring method to work in the guaranteed constant time needed to allocate the object from the heap and initialize its Yields, regardless of the length of the original string and the substring being extracted into a separate object. On the other hand, the big downside of this policy is that every String object in Java must know its length and the starting offset from the beginning of its character array. This causes every String object in the heap to be eight bytes larger than would be necessary, creating an instance of a curious phenomenon for which we could perhaps coin the termspace-space tradeoff. Furthermore, sharing character arrays between strings and substrings prevent garbage collection of the large character array of the original string that went out of scope, still kept alive by some small substring object that uses only a small part of that big character array. In your labs project,two static methods
public static int countDistinctSubstrings(String text)
public static String reverseSubstringsBetweenParentheses(String text)
The Yirst method should count how many distinct non-empty substrings exist in the given text, and return that count. For example, the string "lollo" contains a total of eleven distinct substrings: "lollo" of length 5, "loll" and "ollo" of length 4, "lol", "oll" and "llo" of length 3, "lo", "ol" and "ll" of length 2, and "l" and "o" of length 1. This problem could be solved in a straightforward "Shlemiel" fashion with two nested for-loops to iterate through all pairs of possible starting points and substring lengths, and using some kind of Set to maintain the distinctiveness of the strings so discovered. Even though the constant factors in the running time of this method are small enough to make it fast when most of the substrings are distinct from each other and it become necessary to iterate over all of them anyway, this technique is subject to accidentally quadratic running time, provided that your worst enemy gets to choose the text. He can make the text consist ofnrepeated copies of the exact same character, the running time becoming painfully noticeable oncenreaches into the thousands, as it will in the JUnit test for this problem. Such a string contains preciselyndifferent distinct substrings, one for each possible length from 1 ton, but this algorithm would loop through a quadratic number of combinations to discover these substrings only to immediately throw away most of them! A more complicatedadversary-proofsolution guaranteed to be invulnerable to accidentally quadratic running time, but pays this guarantee with a higher constant factor in its running time over average-case inputs, stems from the realization that the substrings of lengthkcan be constructed from the substrings of lengthk+1, by removing either the First or the last character from each one of these longer substrings. Your method should therefore generate the distinct substrings in the descending order of length. Use two separate instances of HashSet, the First to remember the known distinct substrings of the previous lengthk, the second to store the substrings of lengthk-1 that are currently being generated. Initially, the First set is a singleton that contains only the original text itself. In each round of the loop, iterate through the known strings ofkcharacters currently in the First set. Add to the second set the two substrings of lengthk-1 constructed by dropping the First and last characters of that string of lengthk. Once you have done this for every string in the First set, add the size of the second set to your current tally of distinct substrings found. You now have the green light to clear the First set, and exchange the roles of the two sets for the next round of the loop to generate all the distinct substrings of lengthk-2. The second method of this lab also concerns substrings, but also illustrates the use and methods of the StringBuilder mutable string to efFiciently construct a new string piecemeal by repeatedly appending characters to the end, and modifying this string while it is under construction. Given a text that consists of letters grouped with possibly nested parentheses, this method should return a new string where the contents between each matching pair of parentheses have been reversed, leaving out the parentheses in this process. For example, given the string "(ab)c(de)", this method should return the string "baced". Whenever the text contains nested parentheses, these substring reversals must be performed inside out. For example, the string "(a(bc)d)" should become "dbca". The JUnit tester guarantees that the parentheses inside the text given to your method are alwaysproperly nested and balanced, so your method does not have to concern itself of handling malformed strings whose parenthesis structure is internally inconsistent. To perform this operation efFiciently, use a StringBuilder to accumulate the result, along with a Stack instance to keep track of the positions of the left parentheses encountered along the way. Loop through the characters of text from left to right, and append each character that is neither kind of parenthesis to the result. At each open parenthesis, push the current length of the result buffer into the stack. At each closing parenthesis, pop the position of its left counterpart from the stack. Extract and delete the substring of the result from that position into a separate substring that you then reverse and append back to the result. Wham, bam, thank you, ma'am... except that even after almost three decades of evolution of the Java programming language, its String objects still do not have the simple reverse method that we take for granted in basically any other programming language used for serious professional work. The canonical way to make a reversed version of some Java string object str is by going through the expression new StringBuilder(str).reverse().toString(). In general, Java enjoys so few advantages over Python 3 that we are almost obligated to point them out every time, and the guaranteed constant time operation of the substring extraction certainly qualiFied. This idea could potentially be taken much further especially if the String data type were not defined to be final, but allowed different implementations to store the immutable text in different ways. Then, text editors and other programs that need immutable strings that are not merely sliced and diced into smaller pieces but also concatenated to make new strings from these pieces, could use the rope data structure where these operations work in guaranteed constant time, as illustrated in the labs 64 and 65 of this document, The price of storing additional references inside each rope node would be well countered by the fact that each piece needs to exist only once in the memory, regardless of inside how many larger concatenated pieces of text that piece has been used as a building block.
Junit Test
public class P2J13Test { private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz"; @Test public void reverseSubstringsBetweenParenthesesExplicit() { assertEquals("Z", P2J13.reverseSubstringsBetweenParentheses("Z")); assertEquals("Z", P2J13.reverseSubstringsBetweenParentheses("(Z)")); assertEquals("YXXY", P2J13.reverseSubstringsBetweenParentheses("((XY)(YX))")); assertEquals("abcd", P2J13.reverseSubstringsBetweenParentheses("abcd")); assertEquals("acbd", P2J13.reverseSubstringsBetweenParentheses("a(bc)d")); assertEquals("dbca", P2J13.reverseSubstringsBetweenParentheses("(a(bc)d)")); assertEquals("XYZ", P2J13.reverseSubstringsBetweenParentheses("()X(())Y((()()))Z(())")); assertEquals("", P2J13.reverseSubstringsBetweenParentheses("(()(()())())")); assertEquals("ebcda", P2J13.reverseSubstringsBetweenParentheses("(((((a(b(c)d)e)))))")); } @Test public void testReverseSubstringsBetweenParenthesesHundred() { testReverseSubstringsBetweenParentheses(100, 2217732059L); } @Test public void testReverseSubstringsBetweenParenthesesTenThousand() { testReverseSubstringsBetweenParentheses(10_000, 1796567666L); } @Test public void testReverseSubstringsBetweenParenthesesHundredThousand() { testReverseSubstringsBetweenParentheses(100_000, 1760148165L); } private void testReverseSubstringsBetweenParentheses(int n, long expected) { Random rng = new Random(12345 + n); CRC32 check = new CRC32(); int count = 0, goal = 2, m = 7; for(int i = 0; i < n; i++) { StringBuilder text = new StringBuilder(); int openCount = 0; while(text.length() < m) { int choice = rng.nextInt(100); if(choice < 30) { text.append('('); openCount++; } else if(openCount > 0 && choice < 60) { text.append(')'); openCount--; } else { text.append(ALPHABET.charAt(rng.nextInt(ALPHABET.length()))); } } while(openCount > 0) { text.append(')'); openCount--; } String result = P2J13.reverseSubstringsBetweenParentheses(text.toString()); for(int j = 0; j < result.length(); j++) { check.update((int)(result.charAt(j))); } if(++count == goal) { count = 0; goal += 2; m++; } } assertEquals(expected, check.getValue()); } @Test public void testCountDistinctSubstringsExplicit() { assertEquals(1, P2J13.countDistinctSubstrings("Z")); assertEquals(3, P2J13.countDistinctSubstrings("aaa")); assertEquals(15, P2J13.countDistinctSubstrings("banana")); assertEquals(19, P2J13.countDistinctSubstrings("uuuXuuu")); assertEquals(300, P2J13.countDistinctSubstrings("baabaaboobaabaabaaboobaabaabaaboo")); } @Test public void testCountDistinctSubstringsHundred() { testCountDistinctSubstrings(100, 3997080533L); } @Test public void testCountDistinctSubstringsThousand() { testCountDistinctSubstrings(1000, 1195123337L); } @Test public void testCountDistinctSubstringsTenThousand() { testCountDistinctSubstrings(10000, 2175212746L); } @Test public void testCountDistinctSubstringDontBeShlemiel() { StringBuilder text = new StringBuilder(); for(int i = 0; i < 1000; i++) { text.append('$'); } for(int i = 1000; i < 2000; i++) { assertEquals(i, P2J13.countDistinctSubstrings(text.toString())); text.append('$'); } } private void testCountDistinctSubstrings(int n, long expected) { Random rng = new Random(12345 + n); CRC32 check = new CRC32(); int count = 0, goal = 3, m = 3; for(int i = 0; i < n; i++) { StringBuilder text = new StringBuilder(); int a = Math.min(ALPHABET.length(), 1 + i % m); while(text.length() < m) { if(text.length() > 2 && rng.nextBoolean()) { int s = rng.nextInt(text.length() - 2); int e = s + rng.nextInt(text.length() - s); text.append(text.substring(s, e)); } else { text.append(ALPHABET.charAt(rng.nextInt(a))); } } int result = P2J13.countDistinctSubstrings(text.toString()); check.update(result); if(++count == goal) { count = 0; goal += 2; m++; } } assertEquals(expected, check.getValue()); } }
International Marketing And Export Management
ISBN: 9781292016924
8th Edition
Authors: Gerald Albaum , Alexander Josiassen , Edwin Duerr