Question: In the Java language. The source code to modify: import java.io.File; import java.util.Scanner; import java.util.Map.Entry; import java.util.AbstractMap; import java.util.Arrays; import java.util.Comparator; public class WordCountHeap {
In the Java language.

The source code to modify:
import java.io.File; import java.util.Scanner; import java.util.Map.Entry; import java.util.AbstractMap; import java.util.Arrays; import java.util.Comparator; public class WordCountHeap { /** Merge two strings. See the textbook for explanation. **/ public static void merge(String[] S1, String[] S2, String[] S) { int i = 0, j = 0; while (i + j = 1; k--) sink(a, k,n); } private static void heapSort(String[] pq) { int n = pq.length; constructHeap(pq); while (n > 1) { exch(pq, 1, n); n--; sink(pq, 1,n); } } private static void sink(String[] pq, int k, int n) { //your code goes here } /** Note that less and exch are defined to offset the 1-based array **/ private static boolean less(String[] pq, int i, int j) { return pq[i-1].compareTo( pq[j-1]) count_ARRAY_SORT(String[] tokens, String sortMethod) { int CAPACITY = 1000000; String[] words = new String[CAPACITY]; int[] counts = new int[CAPACITY]; if (sortMethod.equals("HEAP")) heapSort(tokens); else if (sortMethod.equals("MERGE")) mergeSort(tokens); else System.out.println(sortMethod + " sorting method does not exist."); int j = 0, k = 0; int len = tokens.length; while (j maxCount) { maxWord = words[i]; maxCount = counts[i]; } } // HeapInt h=new HeapInt(counts); // for (int i=0; i(maxWord, maxCount); } static String[] readText(String PATH) throws Exception { Scanner doc = new Scanner(new File(PATH)).useDelimiter("[^a-zA-Z]+"); // tokenize text. any characters other than English letters(a-z and A-Z // ) are delimiters. int length = 0; while (doc.hasNext()) { doc.next(); length++; } String[] tokens = new String[length]; Scanner s = new Scanner(new File(PATH)).useDelimiter("[^a-zA-Z]+"); length = 0; while (s.hasNext()) { tokens[length] = s.next().toLowerCase(); length++; } doc.close(); return tokens; } public static void main(String[] args) throws Exception { String PATH = "/Users/PATH/code/dblp"; String[] METHODS = { "HEAP","MERGE"}; String[] DATASETS = { "200", "500", "1k", "5k", "10k", "100k", "1m", "" }; String[] tokens; // run the experiments on different data sets for (int j = 0; j entry = count_ARRAY_SORT(tokens, METHODS[i]); long endTime = System.currentTimeMillis(); String time = String.format("%12d", endTime - startTime); System.out.println(METHODS[i] + " method\t time=" + time + ". Most popular word is " + entry.getKey() + ":" + entry.getValue()); } } } }The problern is the sane as in A2, i.e., to count the frequency of words in a text file using the sorting m ethod. This time, the sorting method is heapSort, and you will be required to write more code. More specifically, your tasks are: 1. (2 marks) Sort the words using HeapSort algorithm. You need to write the code for the heapSort. Here is the template for the code, including the MergeSort so that you can have a comparison. Note that this time the template can not run-you need to provide the missing code first for the sink procedure 2. (3 marks) in A2 we can only print out the largest frequency. Heap is partially sorted, hence easier to get/remove the top values. Please construct a heap for frequency counts, and printout the top 20 largest counts when the data size is 1 million lines. Here is the link for the 1 million data dblplm. More specifically, you need to . construct a heap for frequency counts; implement removeMax) method; invoke removeMax 20 times to obtain the top counts You can also use other programming languages of your choice. For the frequency heap, a hint is that you can write a Heap class like belo public class HeapInt private int [) heap; int n-0; //Heap constructor public HeapInt (int C] a) ( n a.length; for (intk- n/2; k >-1; k--) sink (a, k,n); heap-a; public int removeMax () ( //your code goes here
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
