Question: In the Java language. The source code to modify: import java.io.File; import java.util.Scanner; import java.util.Map.Entry; import java.util.AbstractMap; import java.util.Arrays; import java.util.Comparator; public class WordCountHeap {

In the Java language.

In the Java language. The source code to modify: import java.io.File; import

The source code to modify:

import java.io.File; import java.util.Scanner; import java.util.Map.Entry; import java.util.AbstractMap; import java.util.Arrays; import java.util.Comparator; public class WordCountHeap { /** Merge two strings. See the textbook for explanation. **/ public static void merge(String[] S1, String[] S2, String[] S) { int i = 0, j = 0; while (i + j = 1; k--) sink(a, k,n); } private static void heapSort(String[] pq) { int n = pq.length; constructHeap(pq); while (n > 1) { exch(pq, 1, n); n--; sink(pq, 1,n); } } private static void sink(String[] pq, int k, int n) { //your code goes here } /** Note that less and exch are defined to offset the 1-based array **/ private static boolean less(String[] pq, int i, int j) { return pq[i-1].compareTo( pq[j-1]) count_ARRAY_SORT(String[] tokens, String sortMethod) { int CAPACITY = 1000000; String[] words = new String[CAPACITY]; int[] counts = new int[CAPACITY]; if (sortMethod.equals("HEAP")) heapSort(tokens); else if (sortMethod.equals("MERGE")) mergeSort(tokens); else System.out.println(sortMethod + " sorting method does not exist."); int j = 0, k = 0; int len = tokens.length; while (j  maxCount) { maxWord = words[i]; maxCount = counts[i]; } } // HeapInt h=new HeapInt(counts); // for (int i=0; i(maxWord, maxCount); } static String[] readText(String PATH) throws Exception { Scanner doc = new Scanner(new File(PATH)).useDelimiter("[^a-zA-Z]+"); // tokenize text. any characters other than English letters(a-z and A-Z // ) are delimiters. int length = 0; while (doc.hasNext()) { doc.next(); length++; } String[] tokens = new String[length]; Scanner s = new Scanner(new File(PATH)).useDelimiter("[^a-zA-Z]+"); length = 0; while (s.hasNext()) { tokens[length] = s.next().toLowerCase(); length++; } doc.close(); return tokens; } public static void main(String[] args) throws Exception { String PATH = "/Users/PATH/code/dblp"; String[] METHODS = { "HEAP","MERGE"}; String[] DATASETS = { "200", "500", "1k", "5k", "10k", "100k", "1m", "" }; String[] tokens; // run the experiments on different data sets for (int j = 0; j  entry = count_ARRAY_SORT(tokens, METHODS[i]); long endTime = System.currentTimeMillis(); String time = String.format("%12d", endTime - startTime); System.out.println(METHODS[i] + " method\t time=" + time + ". Most popular word is " + entry.getKey() + ":" + entry.getValue()); } } } }

The problern is the sane as in A2, i.e., to count the frequency of words in a text file using the sorting m ethod. This time, the sorting method is heapSort, and you will be required to write more code. More specifically, your tasks are: 1. (2 marks) Sort the words using HeapSort algorithm. You need to write the code for the heapSort. Here is the template for the code, including the MergeSort so that you can have a comparison. Note that this time the template can not run-you need to provide the missing code first for the sink procedure 2. (3 marks) in A2 we can only print out the largest frequency. Heap is partially sorted, hence easier to get/remove the top values. Please construct a heap for frequency counts, and printout the top 20 largest counts when the data size is 1 million lines. Here is the link for the 1 million data dblplm. More specifically, you need to . construct a heap for frequency counts; implement removeMax) method; invoke removeMax 20 times to obtain the top counts You can also use other programming languages of your choice. For the frequency heap, a hint is that you can write a Heap class like belo public class HeapInt private int [) heap; int n-0; //Heap constructor public HeapInt (int C] a) ( n a.length; for (intk- n/2; k >-1; k--) sink (a, k,n); heap-a; public int removeMax () ( //your code goes here The problern is the sane as in A2, i.e., to count the frequency of words in a text file using the sorting m ethod. This time, the sorting method is heapSort, and you will be required to write more code. More specifically, your tasks are: 1. (2 marks) Sort the words using HeapSort algorithm. You need to write the code for the heapSort. Here is the template for the code, including the MergeSort so that you can have a comparison. Note that this time the template can not run-you need to provide the missing code first for the sink procedure 2. (3 marks) in A2 we can only print out the largest frequency. Heap is partially sorted, hence easier to get/remove the top values. Please construct a heap for frequency counts, and printout the top 20 largest counts when the data size is 1 million lines. Here is the link for the 1 million data dblplm. More specifically, you need to . construct a heap for frequency counts; implement removeMax) method; invoke removeMax 20 times to obtain the top counts You can also use other programming languages of your choice. For the frequency heap, a hint is that you can write a Heap class like belo public class HeapInt private int [) heap; int n-0; //Heap constructor public HeapInt (int C] a) ( n a.length; for (intk- n/2; k >-1; k--) sink (a, k,n); heap-a; public int removeMax () ( //your code goes here

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!