Predictive models of text: performing text analysis

import java.util.Set;

/**
 * Construct a Markov model of order /k/ based on an input string.
 *
 * @author
 * @version
 */
public class MarkovModel
{
    /** Markov model order parameter */
    int k;
    /** ngram model of order k */
    NgramAnalyser ngram;
    /** ngram model of order k+1 */
    NgramAnalyser n1gram;

    /**
     * Construct an order-k Markov model from string s
     * @param k int order of the Markov model
     * @param s String input to be modelled
     */
    public MarkovModel(int k, String s)
    {
        //TODO replace this line with your code
    }

    /**
     * @return order of this Markov model
     */
    public int getK()
    {
        return k;
    }

    /**
     * Estimate the probability of a sequence appearing in the text
     * using simple estimate of freq seq / frequency front(seq).
     * @param sequence String of length k+1
     * @return double probability of the last letter occurring in the
     * context of the first ones or 0 if front(seq) does not occur.
     */
    public double simpleEstimate(String sequence)
    {
        //TODO replace this line with your code
        return -1.0;
    }

    /**
     * Calculate the Laplacian probability of a sequence given this Markov model
     * @param sequence String of length k+1
     */
    public double laplaceEstimate(String sequence)
    {
        //TODO replace this line with your code
        return -1.0;
    }

    /**
     * @return String representing this Markov model
     */
    public String toString()
    {
        //TODO replace this line with your code
        return null;
    }
}
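The Javadoc for laplaceEstimate does not spell out the formula, and the task text in this question stops before reaching it. A minimal sketch, assuming the standard add-one (Laplace) smoothing rule, is shown below. The class name `LaplaceSketch`, the method signature, and the `alphabetSize` parameter are hypothetical and not part of the provided skeleton; check them against the assignment's own definition before relying on them.

```java
class LaplaceSketch {

    /**
     * Add-one smoothing (assumed formula, not confirmed by the assignment text):
     * (count(sequence) + 1) / (count(context) + alphabetSize).
     */
    static double laplaceEstimate(int sequenceCount, int contextCount, int alphabetSize) {
        return (sequenceCount + 1.0) / (contextCount + alphabetSize);
    }

    public static void main(String[] args) {
        // In "aabcabaacaac" (with wrap-around) "aa" occurs 3 times and "aab" once;
        // the alphabet {a, b, c} has 3 symbols.
        System.out.println(laplaceEstimate(1, 3, 3)); // (1 + 1) / (3 + 3)
        // "aaa" never occurs, yet it still gets a non-zero probability.
        System.out.println(laplaceEstimate(0, 3, 3)); // (0 + 1) / (3 + 3)
    }
}
```

Unlike the simple estimate, this rule never returns 0 for an unseen sequence, which matters once logarithms of the probabilities are taken in ModelMatcher.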

--------------------------------------------------------------------------------------------------------------------------

import java.util.HashMap;
import java.util.Collection;
import java.util.ArrayList;
import java.util.Arrays;

/**
 * Report the average log likelihood of a test String occurring in a
 * given Markov model and detail the calculated values behind this statistic.
 *
 * @author
 * @version
 */
public class ModelMatcher
{
    /** log likelihoods for a teststring under a given model */
    private HashMap<String, Double> logLikelihoodMap;
    /** summary statistic for this setting */
    private double averageLogLikelihood;

    /**
     * Constructor to initialise the fields for the log likelihood map for
     * a test string and a given Markov model and
     * the average log likelihood summary statistic
     * @param model a given Markov model object
     * @param testString the test string
     */
    public ModelMatcher(MarkovModel model, String testString)
    {
        //TODO
    }

    /**
     * Helper method that calculates the average log likelihood statistic
     * given a HashMap of strings and their Laplace probabilities
     * and the total number of ngrams in the model.
     *
     * @param logs map of ngram strings and their log likelihood
     * @param ngramCount int number of ngrams in the original test string
     * @return average log likelihood: the total of loglikelihoods
     * divided by the ngramCount
     */
    private double averageLogLikelihood(HashMap<String, Double> logs, int ngramCount)
    {
        //TODO
        return 0.1;
    }

    /**
     * Helper method to calculate the total log likelihood statistic
     * given a HashMap of strings and their Laplace probabilities
     * and the total number of ngrams in the model.
     *
     * @param logs map of ngram strings and their log likelihood
     * @return total log likelihood: the sum of loglikelihoods in logs
     */
    private double totalLogLikelihood(HashMap<String, Double> logs)
    {
        //TODO
        return 0.1;
    }

    /**
     * @return the average log likelihood statistic
     */
    public double getAverageLogLikelihood()
    {
        return averageLogLikelihood;
    }

    /**
     * @return the log likelihood value for a given ngram from the input string
     */
    public double getLogLikelihood(String ngram)
    {
        return logLikelihoodMap.get(ngram);
    }

    /**
     * Make a String summarising the log likelihood map and its statistics
     * @return String of ngrams and their log likelihood differences between the models
     * The likelihood table should be ordered from highest to lowest likelihood
     */
    public String toString()
    {
        //TODO
        return null;
    }
}
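The two helper methods can be read directly off their Javadoc: the total is the sum of the values in the map, and the average is that total divided by ngramCount. A minimal sketch under that literal reading follows. The class name `LogLikelihoodSketch`, the use of `Map` rather than `HashMap`, the base-10 logarithm, and the sample values in `main` are illustrative assumptions. Whether each entry should first be weighted by how often its ngram occurs in the test string is not stated in this question.

```java
import java.util.HashMap;
import java.util.Map;

class LogLikelihoodSketch {

    /** Sum of all log likelihood values stored in the map. */
    static double totalLogLikelihood(Map<String, Double> logs) {
        double total = 0.0;
        for (double value : logs.values()) {
            total += value;
        }
        return total;
    }

    /** Total log likelihood divided by the number of ngrams in the test string. */
    static double averageLogLikelihood(Map<String, Double> logs, int ngramCount) {
        if (ngramCount <= 0) {
            throw new IllegalArgumentException("ngramCount must be positive");
        }
        return totalLogLikelihood(logs) / ngramCount;
    }

    public static void main(String[] args) {
        // Illustrative values only: logs of two made-up Laplace probabilities.
        Map<String, Double> logs = new HashMap<>();
        logs.put("aab", Math.log10(1.0 / 3.0));
        logs.put("abb", Math.log10(1.0 / 5.0));
        System.out.println(averageLogLikelihood(logs, 2));
    }
}
```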

--------------------------------------------------------------------------------------------------------------------------

import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Set;
import java.io.*;

/**
 * Create and manipulate Markov models and model matchers for lists of training data
 * and a test data String, and generate output from it for convenient display.
 *
 * @author
 * @version
 */
public class MatcherController
{
    /** list of training data strings used to generate markov models */
    ArrayList<String> trainingDataList;
    /** test data to be matched with the models */
    String testData;
    /** order of the markov models */
    int k;
    /** generated list of markov models for the given training data */
    ArrayList<MarkovModel> modelList;
    /** generated list of matchers for the given markov models and test data */
    ArrayList<ModelMatcher> matcherList;

    /**
     * Generate models for analysis
     * @param k order of the markov models to be used
     * @param trainingDataList list of training data strings, one per model
     * @param testData String to check against different models
     * @throws unchecked exceptions if the input order or data inputs are invalid
     */
    public MatcherController(int k, ArrayList<String> trainingDataList, String testData)
    {
        //TODO
    }

    /**
     * @return a string containing all lines from a file
     * if file contents can be read, otherwise null
     * This method should process any exceptions that arise.
     */
    private static String getFileContents(String filename)
    {
        //TODO
        return null;
    }

    /**
     * @return the ModelMatcher object that has the highest average loglikelihood
     * (where all candidates are trained for the same test string)
     */
    public ModelMatcher getBestMatch(ArrayList<ModelMatcher> candidates)
    {
        //TODO
        return null;
    }

    /**
     * @return String an *explanation* of
     * why the test string is the match from the candidate models
     */
    public String explainBestMatch(ModelMatcher best)
    {
        //TODO
        return null;
    }

    /**
     * Display an error to the user in a manner appropriate
     * for the interface being used.
     *
     * @param message
     */
    public void displayError(String message)
    {
        // LEAVE THIS METHOD EMPTY
    }
}

--------------------------------------------------------------------------------------------------------------------------

import static org.junit.Assert.*;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

/**
 * The test class ProjectTest for student test cases.
 * Add all new test cases to this task.
 *
 * @author
 * @version
 */
public class ProjectTest
{
    /**
     * Default constructor for test class ProjectTest
     */
    public ProjectTest()
    {
    }

    /**
     * Sets up the test fixture.
     *
     * Called before every test case method.
     */
    @Before
    public void setUp()
    {
    }

    /**
     * Tears down the test fixture.
     *
     * Called after every test case method.
     */
    @After
    public void tearDown()
    {
    }

    //TODO add new test cases from here, include brief documentation

    @Test(timeout=1000)
    public void testLaplaceExample()
    {
        assertEquals(0,1); //TODO replace with test code
    }

    @Test(timeout=1000)
    public void testSimpleExample()
    {
        assertEquals(0,1); //TODO replace with test code
    }

    @Test
    public void testTask3example()
    {
        MarkovModel model = new MarkovModel(2,"aabcabaacaac");
        ModelMatcher match = new ModelMatcher(model,"aabbcaac");
        assertEquals(0,1); //TODO replace with test code
    }
}

Task 2 - Building a Markov model of a text sample (MarkovModel)

For this task, you will need to complete the MarkovModel class, which generates a Markov model from an input string, and also write a JUnit test for your model.

Markov models are probabilistic models (i.e., they model the chances of particular events occurring) and are used for a broad range of natural language processing tasks (including computer speech recognition). They are widely used to model all sorts of dynamical processes in engineering, mathematics, finance and many other areas. They can be used to estimate the probability that a symbol will appear in a source of text, given the symbols that have preceded it.

A zero-th order Markov model of a text source is one that estimates the probability that the next character in a sequence is, say, an "a", based simply on how frequently it occurs in a sample. Higher-order Markov models generalize this idea. Based on a sample text, they estimate the likelihood that a particular symbol will occur in a sequence of symbols drawn from a source of text, where the probability of each symbol occurring can depend upon preceding symbols.

In a first order Markov model, the probability of a symbol occurring depends only on the previous symbol. Thus, for English text, the probability of encountering a "u" can depend on whether the previous letter was a "q". If it was indeed a "q", then the probability of encountering a "u" should be quite high. For a second order Markov model, the probability of encountering a particular symbol can depend on the previous two symbols, and generally, the probabilities used by a k-th order Markov model can depend on the preceding k symbols.

A Markov model can be used to estimate the probability of a symbol appearing, given its k predecessors, in a simple way, as follows. For each context of characters of length k, we estimate the probability of that context being followed by each letter c in our alphabet as the number of times the context appears followed by c, divided by the number of times the context appears in total. As with our NgramAnalyser class, we consider our input string to "wrap round" when analysing contexts near its end. Call this way of estimating probabilities simple estimation.

For instance, consider the string "aabcabaacaac". The 2-grams and 3-grams in it (assuming wrap-around) are as follows.

2-gram frequencies

2-gram   frequency
aa       3
ab       2
ac       2
ba       1
bc       1
ca       3

3-gram frequencies

3-gram   frequency
aab      1
aac      2
aba      1
abc      1
aca      2
baa      1
bca      1
caa      2
cab      1

Given the context "aa", we can simply estimate the probability that the next character is "b" as

$$P(\text{next character is "b"} \mid \text{last 2 were "aa"}) = \frac{\text{number of occurrences of "aab"}}{\text{number of occurrences of "aa"}} = \frac{1}{3}$$
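The simple estimate for the example string can be checked with a short standalone program. This is only a sketch: the class `SimpleEstimateSketch` and its static methods are hypothetical stand-ins that count n-grams directly, whereas the actual MarkovModel is expected to delegate counting to the NgramAnalyser class, whose API is not shown in this question.

```java
import java.util.HashMap;
import java.util.Map;

class SimpleEstimateSketch {

    /** Count every n-gram of s, wrapping round at the end of the string. */
    static Map<String, Integer> countNgrams(String s, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < s.length(); i++) {
            StringBuilder gram = new StringBuilder();
            for (int j = 0; j < n; j++) {
                // The modulo makes indices past the end wrap to the start.
                gram.append(s.charAt((i + j) % s.length()));
            }
            counts.merge(gram.toString(), 1, Integer::sum);
        }
        return counts;
    }

    /** count(sequence) / count(front(sequence)), or 0 if the context never occurs. */
    static double simpleEstimate(String s, String sequence) {
        int k = sequence.length() - 1;
        String context = sequence.substring(0, k);
        int contextCount = countNgrams(s, k).getOrDefault(context, 0);
        if (contextCount == 0) {
            return 0.0;
        }
        int sequenceCount = countNgrams(s, k + 1).getOrDefault(sequence, 0);
        return (double) sequenceCount / contextCount;
    }

    public static void main(String[] args) {
        String s = "aabcabaacaac";
        System.out.println(countNgrams(s, 2));        // 2-gram counts (map order may vary)
        System.out.println(simpleEstimate(s, "aab")); // 1 occurrence of "aab" / 3 of "aa"
    }
}
```

Recounting the n-grams on every call keeps the sketch short; a real model would build both count tables once in the constructor.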
