Question: Fill in the TODO blanks in ModelMatcher.java and MatcherController.java


//MarkovModel.java

import java.util.Set;

/**
 * Construct a Markov model of order /k/ based on an input string.
 *
 * @author
 * @version
 */
public class MarkovModel {

    /** Markov model order parameter */
    int k;
    /** ngram model of order k */
    NgramAnalyser ngram;
    /** ngram model of order k+1 */
    NgramAnalyser n1gram;

    /**
     * Construct an order-k Markov model from string s
     * @param k int order of the Markov model
     * @param s String input to be modelled
     */
    public MarkovModel(int k, String s) {
        this.k = k; // store the order so that getK() reports it
        ngram = new NgramAnalyser(k, s);
        n1gram = new NgramAnalyser((k+1), s);
    }

    /**
     * @return order of this Markov model
     */
    public int getK() {
        return k;
    }

    /** Estimate the probability of a sequence appearing in the text
     * using simple estimate of freq seq / frequency front(seq).
     * @param sequence String of length k+1
     * @return double probability of the last letter occurring in the
     * context of the first ones or 0 if front(seq) does not occur.
     */
    public double simpleEstimate(String sequence) {
        double prob;
        String seqNotLast = sequence.substring(0, sequence.length()-1);

        if (ngram.getDistinctNgrams().contains(seqNotLast)) {
            double n1g = n1gram.getNgramFrequency(sequence);
            double ng = ngram.getNgramFrequency(seqNotLast);
            try {
                prob = n1g / ng;
            } catch (ArithmeticException e) {
                return 0.0;
            }
            return prob;
        } else {
            return 0.0;
        }
    }

    /**
     * Calculate the Laplacian probability of string obs given this Markov model
     * @param sequence String of length k+1
     * @return double Laplace-smoothed probability of the last letter given its context
     */
    public double laplaceEstimate(String sequence) {
        String context = sequence.substring(0, sequence.length()-1);
        double npc = n1gram.getNgramFrequency(sequence);
        double np = ngram.getNgramFrequency(context);
        double laplace;
        laplace = (npc + 1)/(np + ngram.getAlphabetSize());
        return laplace;
    }

    /**
     * @return String representing this Markov model
     */
    public String toString() {
        String toRet = "";
        String k = Integer.toString(getK());
        toRet += (k + " ");
        toRet += (Integer.toString(ngram.getAlphabetSize()) + " ");
        toRet += ngram.toString() + n1gram.toString();
        return toRet;
    }

}

--------------------------------------------------------------------------------------------------------------------------

//ModelMatcher.java

import java.util.HashMap;
import java.util.Collection;
import java.util.ArrayList;
import java.util.Arrays;

/**
 * Report the average log likelihood of a test String occurring in a
 * given Markov model and detail the calculated values behind this statistic.
 *
 * @author
 * @version
 */
public class ModelMatcher {

    /** log likelihoods for a teststring under a given model */
    private HashMap<String, Double> logLikelihoodMap;
    /** summary statistic for this setting */
    private double averageLogLikelihood;

    /**
     * Constructor to initialise the fields for the log likelihood map for
     * a test string and a given Markov model and
     * the average log likelihood summary statistic
     * @param model a given Markov model object
     * @param testString the test string
     */
    public ModelMatcher(MarkovModel model, String testString) {
        //TODO
    }

    /** Helper method that calculates the average log likelihood statistic
     * given a HashMap of strings and their Laplace probabilities
     * and the total number of ngrams in the model.
     *
     * @param logs map of ngram strings and their log likelihood
     * @param ngramCount int number of ngrams in the original test string
     * @return average log likelihood: the total of log likelihoods
     * divided by the ngramCount
     */
    private double averageLogLikelihood(HashMap<String, Double> logs, int ngramCount) {
        //TODO
        return 0.1;
    }

    /** Helper method to calculate the total log likelihood statistic
     * given a HashMap of strings and their Laplace probabilities
     * and the total number of ngrams in the model.
     *
     * @param logs map of ngram strings and their log likelihood
     * @return total log likelihood: the sum of log likelihoods in logs
     */
    private double totalLogLikelihood(HashMap<String, Double> logs) {
        //TODO
        return 0.1;
    }

    /**
     * @return the average log likelihood statistic
     */
    public double getAverageLogLikelihood() {
        return averageLogLikelihood;
    }

    /**
     * @return the log likelihood value for a given ngram from the input string
     */
    public double getLogLikelihood(String ngram) {
        return (logLikelihoodMap.get(ngram));
    }

    /**
     * Make a String summarising the log likelihood map and its statistics
     * @return String of ngrams and their log likelihood differences between the models
     * The likelihood table should be ordered from highest to lowest likelihood
     */
    public String toString() {
        //TODO
        return null;
    }

}
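One way the two helper TODOs and the ordered listing could be approached is sketched below. This is an illustration only, not the expected solution: the class name `LikelihoodSummarySketch`, the method `summarise`, and the four-decimal output format are hypothetical, and the sketch assumes each map value already accounts for how often its ngram occurs in the test string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-alone helper; not part of the provided skeleton.
class LikelihoodSummarySketch {

    /** Sum of all log likelihood values stored in the map. */
    static double totalLogLikelihood(Map<String, Double> logs) {
        double total = 0.0;
        for (double value : logs.values()) {
            total += value;
        }
        return total;
    }

    /** Total log likelihood divided by the number of ngrams in the test string. */
    static double averageLogLikelihood(Map<String, Double> logs, int ngramCount) {
        return totalLogLikelihood(logs) / ngramCount;
    }

    /** One "ngram value" line per entry, highest log likelihood first. */
    static String summarise(Map<String, Double> logs) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(logs.entrySet());
        // Sort descending by value; the output format is an assumption.
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : entries) {
            sb.append(e.getKey())
              .append(' ')
              .append(String.format(Locale.ROOT, "%.4f", e.getValue()))
              .append('\n');
        }
        return sb.toString();
    }
}
```

Inside `ModelMatcher` itself the same logic would operate on `logLikelihoodMap`, with `toString()` presumably also appending the average statistic.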

--------------------------------------------------------------------------------------------------------------------------

//MatcherController.java

import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Set;
import java.io.*;

/** Create and manipulate Markov models and model matchers for lists of training data
 * and a test data String, and generate output from it for convenient display.
 *
 * @author
 * @version
 *
 */
public class MatcherController {

    /** list of training data string used to generate markov models */
    ArrayList<String> trainingDataList;
    /** test data to be matched with the models */
    String testData;
    /** order of the markov models */
    int k;
    /** generated list of markov models for the given training data */
    ArrayList<MarkovModel> modelList;
    /** generated list of matchers for the given markov models and test data */
    ArrayList<ModelMatcher> matcherList;

    /** Generate models for analysis
     * @param k order of the markov models to be used
     * @param trainingDataList list of training strings, one per model
     * @param testData String to check against different models
     * @throws unchecked exceptions if the input order or data inputs are invalid
     */
    public MatcherController(int k, ArrayList<String> trainingDataList, String testData) {
        //TODO
    }

    /** @return a string containing all lines from a file
     * if file contents can be read, otherwise null
     * This method should process any exceptions that arise.
     */
    private static String getFileContents(String filename) {
        //TODO
        return null;
    }

    /**
     * @return the ModelMatcher object that has the highest average log likelihood
     * (where all candidates are trained for the same test string)
     */
    public ModelMatcher getBestMatch(ArrayList<ModelMatcher> candidates) {
        //TODO
        return null;
    }

    /** @return String an *explanation* of
     * why the test string is the match from the candidate models
     */
    public String explainBestMatch(ModelMatcher best) {
        //TODO
        return null;
    }

    /** Display an error to the user in a manner appropriate
     * for the interface being used.
     *
     * @param message
     */
    public void displayError(String message) {
        // LEAVE THIS METHOD EMPTY
    }

}
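The file-reading and best-match TODOs might be approached along the following lines. This is a hedged sketch, not the assignment's reference solution: `ControllerSketch` and `indexOfBest` are hypothetical names, the file is assumed to be UTF-8, and the selection works on a plain list of average log likelihoods so that the block compiles without the other project classes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical stand-alone helper; not part of the provided skeleton.
class ControllerSketch {

    /** Whole file as one String, or null if it cannot be read. */
    static String getFileContents(String filename) {
        try {
            // UTF-8 is an assumption; the task text does not name an encoding.
            return new String(Files.readAllBytes(Paths.get(filename)), StandardCharsets.UTF_8);
        } catch (IOException | RuntimeException e) {
            // Covers missing files, bad paths and null names alike.
            return null;
        }
    }

    /** Index of the highest average log likelihood, or -1 for an empty list. */
    static int indexOfBest(List<Double> averageLogLikelihoods) {
        int best = -1;
        for (int i = 0; i < averageLogLikelihoods.size(); i++) {
            if (best < 0 || averageLogLikelihoods.get(i) > averageLogLikelihoods.get(best)) {
                best = i;
            }
        }
        return best;
    }
}
```

In `getBestMatch` the same loop would compare `getAverageLogLikelihood()` across the candidate `ModelMatcher` objects and return the winning object rather than an index. Log likelihoods are negative, so the best match is the value closest to zero.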

Task 3 - Matching test strings to a model (ModelMatcher)

For this task, you will need to complete the ModelMatcher class, which determines which of two Markov models better matches a test string. Given two Markov models built from text samples taken from two different sources, we can use them to estimate which source it is more likely a test string was drawn from. Even using only zero-th order models constructed using English and Russian text, we should be able to tell, for instance, that the string bopLLI oe3EKyceH is more likely to be Russian than English.

We will compute a measure of fit of test strings against models, called the likelihood of a sequence under the model. The likelihood of a sequence under a k-th order Markov model is calculated as follows:

- For each symbol c in the sequence, compute the probability of observing c under the model, given its k-letter context p (assuming the sequence to "wrap around", as described above), using the Laplace-smoothed estimate of probability we used for the MarkovModel class.
- Compute the likelihood of the entire sequence as the product of the likelihoods of each character.

In mathematical notation: let s be an input sequence of length n, and let M be a k-th order Markov model. In order to calculate the likelihood of the sequence s under the model M, for each symbol $c_i$ in s (where $1 \le i \le n$), let $p_i$ be the k-length context of the symbol $c_i$ (assuming wraparound). The likelihood of the sequence s under the model is

$$\prod_{i=1}^{n} \mathrm{laplace}(c_i)$$

where $\mathrm{laplace}(c_i)$ is the Laplace-smoothed probability of $c_i$ occurring (given its context $p_i$) as described in the previous task.

The probability we obtain from this calculation may be very small; in fact, potentially so small as to be indistinguishable from zero when using Java's built-in floating-point arithmetic. Therefore we will calculate and express the likelihood using log probabilities, which do not suffer from this problem.
(A weakness of log probabilities is that they cannot straightforwardly represent probabilities of zero, but our use of Laplace smoothing allows us to avoid this problem.) The product of two probabilities p and q can be calculated by adding their log probabilities, log p and log q.

By way of example, suppose we have constructed a 2nd-order Markov model using the input string "aabcabaacaac", as described in the example for Task 2. If we are then given a test string "aabbcaac", we can compute its log likelihood as follows:

For each character in the test string, obtain its length-2 context (assuming wraparound). Note that the tri-gram "caa" occurs twice in the (wrapped) test string.

Context   Character   Frequency
aa        b           1
ab        b           1
bb        c           1
bc        a           1
ca        a           2
aa        c           1
ac        a           1
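The whole calculation for this example can be sketched as a small self-contained program. It does not use the project's NgramAnalyser or MarkovModel classes; instead it assumes that ngram counts wrap around the end of the string, that the alphabet size is the number of distinct characters in the training string, and that logarithms are base 10 (the visible task text does not state the base). The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone illustration; not part of the provided skeleton.
class LogLikelihoodSketch {

    /** Count every length-n substring of s, wrapping around the end of s. */
    static Map<String, Integer> wrappedCounts(String s, int n) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // Append the first n-1 characters so ngrams can run past the end.
        String wrapped = s + s.substring(0, Math.max(0, n - 1));
        for (int i = 0; i < s.length(); i++) {
            counts.merge(wrapped.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    /** Laplace estimate: (count(seq) + 1) / (count(context) + alphabetSize). */
    static double laplace(String seq, Map<String, Integer> contextCounts,
                          Map<String, Integer> fullCounts, int alphabetSize) {
        String context = seq.substring(0, seq.length() - 1);
        double npc = fullCounts.getOrDefault(seq, 0);
        double np = contextCounts.getOrDefault(context, 0);
        return (npc + 1) / (np + alphabetSize);
    }

    /** Average log10 likelihood of test under an order-k model of training. */
    static double averageLogLikelihood(int k, String training, String test) {
        Map<String, Integer> contextCounts = wrappedCounts(training, k);
        Map<String, Integer> fullCounts = wrappedCounts(training, k + 1);
        int alphabetSize = (int) training.chars().distinct().count();
        double total = 0.0;
        // Each distinct (k+1)-gram contributes once per occurrence in the test string.
        for (Map.Entry<String, Integer> e : wrappedCounts(test, k + 1).entrySet()) {
            double logProb = Math.log10(laplace(e.getKey(), contextCounts, fullCounts, alphabetSize));
            total += e.getValue() * logProb;
        }
        return total / test.length();
    }

    public static void main(String[] args) {
        System.out.println(averageLogLikelihood(2, "aabcabaacaac", "aabbcaac"));
    }
}
```

Under these assumptions the example gives an average log likelihood of roughly -0.385. Summing logs instead of multiplying the eight probabilities is what keeps the result away from floating-point underflow on longer test strings.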
