Question: 6.19 Sequence kernels

6.19 Sequence kernels. Let $X = \{a, c, g, t\}$. To classify DNA sequences using SVMs, we wish to define a kernel between sequences defined over $X$. We are given a finite set $I \subset X^*$ of non-coding regions (introns). For $x \in X^*$, denote by $|x|$ the length of $x$ and by $F(x)$ the set of factors of $x$, i.e., the set of subsequences of $x$ with contiguous symbols. For any two strings $x, y \in X^*$, define $K(x, y)$ by

$$K(x, y) = \sum_{z \in (F(x) \cap F(y)) \setminus I} \rho^{|z|}, \tag{6.32}$$

where $\rho \geq 1$ is a real number.
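To make definition (6.32) concrete, here is a minimal brute-force sketch in Python. It is not the automaton-based computation the exercise asks about; it simply enumerates factors explicitly, which takes quadratic space in the string length. The function names `factors` and `sequence_kernel` are hypothetical, and the sketch assumes that only non-empty factors are counted (if the empty string is treated as a factor, every kernel value grows by 1 unless the empty string is in $I$).

```python
def factors(s):
    """Return the set of non-empty contiguous substrings (factors) of s.

    Assumption: the empty string is not counted as a factor.
    """
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}


def sequence_kernel(x, y, introns, rho):
    """Brute-force evaluation of K(x, y) from (6.32).

    Sums rho ** len(z) over every factor z shared by x and y
    that is not in the finite intron set.
    """
    common = (factors(x) & factors(y)) - set(introns)
    return sum(rho ** len(z) for z in common)


# Common factors of "acgt" and "cgta": a, c, g, t, cg, gt, cgt.
# Removing the intron "cg" leaves a, c, g, t, gt, cgt.
print(sequence_kernel("acgt", "cgta", {"cg"}, 2))  # 4*2 + 2**2 + 2**3 = 20
```

Because factors are collected in a set, each common factor contributes once regardless of how many times it occurs in either string, which matches the set-based sum in (6.32).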

(a) Show that $K$ is a rational kernel and that it is positive definite symmetric.

(b) Give the time and space complexity of the computation of $K(x, y)$ with respect to the size $s$ of a minimal automaton representing $X^* \setminus I$.

(c) Long common factors between $x$ and $y$ of length greater than or equal to $n$ are likely to be important coding regions (exons). Modify the kernel $K$ to assign weight $\rho_2^{|z|}$ to $z$ when $|z| \geq n$, and $\rho_1^{|z|}$ otherwise, where $1 \leq \rho_1 \ll \rho_2$. Show that the resulting kernel is still positive definite symmetric.
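The length-dependent weighting in part (c) can be illustrated with a small brute-force sketch in Python. As before, this enumerates factors directly instead of using a weighted automaton, the names `factors` and `weighted_sequence_kernel` are hypothetical, and the empty string is assumed not to count as a factor.

```python
def factors(s):
    """Return the set of non-empty contiguous substrings (factors) of s.

    Assumption: the empty string is not counted as a factor.
    """
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}


def weighted_sequence_kernel(x, y, introns, rho1, rho2, n):
    """Brute-force evaluation of the modified kernel from part (c).

    A common factor z outside the intron set contributes
    rho2 ** len(z) when len(z) >= n, and rho1 ** len(z) otherwise.
    """
    common = (factors(x) & factors(y)) - set(introns)
    return sum((rho2 if len(z) >= n else rho1) ** len(z) for z in common)


# With introns {"cg"}, the surviving common factors are a, c, g, t, gt, cgt.
# For n = 3 only "cgt" is long: 4 * 1**1 + 1**2 + 2**3 = 13.
print(weighted_sequence_kernel("acgt", "cgta", {"cg"}, 1, 2, 3))  # 13
```

In this sketch the weight of each factor depends only on the factor itself, not on $x$ or $y$, and it is non-negative. That observation is one natural starting point for the positive definite symmetric argument the exercise asks for.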
