Question: Speech recognition systems need to match audio streams that represent the same words spoken at different speeds. Suppose, therefore, that you are given two sequences

Speech recognition systems need to match audio streams that represent the same words spoken at different speeds. Suppose, therefore, that you are given two sequences of numbers, X = (x₁, x₂,...,x_n), and Y = (y₁, y₂,...,y_m), representing two different audio streams that need to be matched. A mapping between X and Y is a list, M, of distinct pairs, (i, j), that is ordered lexicographically, such that, for each i ∈ [1, n], there is at least one pair, (i, j), in M, and for each j ∈ [1, m], there is at least one pair, (i, j), in M. Such a mapping is monotonic if, for any (i, j) and (k,l) in M, with (i, j) coming before (k,l) in M, we have i ≤ k and j ≤ l. For example, given

X = (3, 9, 9, 5) and Y = (3, 3, 9, 5, 5),

one possible monotonic mapping between X and Y would be

M = [(1, 1),(1, 2),(2, 3),(3, 3),(4, 4),(4, 5)].

The dynamic time warping problem is to find a monotonic mapping, M, between X and Y , that minimizes the distance, D(X, Y ), between X and Y , subject to M, which is defined as

D(X,Y) = |T; – yjl, (i,j)EM

where this minimization is taken over all possible monotonic mappings between X and Y . For instance, in the example X, Y , and M, given above, we have D(X, Y )=0. Describe an efficient algorithm for solving the dynamic time warping problem. What is the running time of your algorithm?

D(X,Y) = |T; yjl, (i,j)EM

Step by Step Solution

★★★★★

3.51 Rating (171 Votes )

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock

Define a cost function Ci j which is the best way to match the first i numbers in X to the f... View full answer

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Data Structures Algorithms Questions!

A substring of some character string is a contiguous sequence of characters in that string (which is different than a subsequence, of course). Suppose you are given two character strings, X and Y ,...

True or false: a. If stored video is streamed directly from a Web server to a media player, then the application is using TCP as the underlying transport protoco\' b. When using RTP, it is possible...

Give an example of your own of a relation with (a) One relation-valued attribute and (b) Two such attributes. Also, give two more relation that represent the same information as those relations but...

SECTION A: J data is pro cessed, stored, and transmitted as a series of 1s and Os. Each 1 or 0 is called a(n) 7. A series of eight Os and 1s, called a(n) J. represents one character-a letter, number,...

1. Describe the communication process. 2. Understand the importance of feedback in the communication process. 3. Understand various verbal and nonverbal methods of communication. 4. Understand the...

Decide if the speech adjustments below are about dialect or register differences. I say when I'm in Mexico, but when I'm in Argentina. I say when my friends tell me an incredible story, but if a...

MATHEMATICS FOR MACHINE LEARNING Marc Peter Deisenroth A. Aldo Faisal Cheng Soon Ong Contents Foreword 1 Part I Mathematical Foundations 9 1 Introduction and Motivation 11 1.1 Finding Words for...

Please read chapter 8 and answer the questions and see the (guide to answer number 3) For each case study, you will view the material as the student's teacher, read the information provided and write...

Please read chapter 8 and answer the questions and see the ( guide to answer number 3) For each case study, you will view the material as the student's teacher, read the information provided and...

What is content analysis, and why is it useful to a forensic accountant?

In the adjusted trial balance of Stargate Consulting Co. for the current year, expenses are $192,700 and the revenues total $239,300. In preparing the income statement from the adjusted trial...

Some years ago T - Mobile and Sprint merged to form what is currently T - Mobile. Was this merger a horizontal or a vertical merger? Please explain.

11. Which of the following is not true about the MACRS depreciation system: a. A salvage value must be determined before depreciation percentages are applied to depreciable real estate. b....

Write a C++ class that is derived from the Progression class to produce a progression where each value is the square root of the previous value. You should include a default constructor that starts...

Draw a class inheritance diagram for the following set of classes. Class Goat extends Object and adds a member variable tail and functions milk and jump. Class Pig extends Object and adds a member...

Write a C++ program that has a Polygon interface that has abstract functions, area(), and perimeter(). Implement classes for Triangle, Quadrilateral, Pentagon, Hexagon, and Octagon, which implement...

Suppose rRF = 8%, rM = 12%, and bi = 1.8. a. What is ri , the required rate of return on Stock i? Round your answer to two decimal places. % b. Now suppose rRF (1) increases to 9%. The slope of the...

What would the missing values be and if you could explain that would be great $ (a) 0 110,000 2 Equity, December 31, 2015 3 Owner investments for stock during the year 4 Dividends during the year 5...

Biotech Corporation stock had its last dividend paid at 0.80 per share and the managers of the company expect the growth rate of the firm to be 4 percent. The market portfolio return has an annual...