Write a Python function seq_set_sim(seq_set1, seq_set2, k) that takes as arguments two sets of strings seq_set1 and seq_set2 and an integer k, and returns a floating point value between 0 and 1 (inclusive) giving the similarity between the sets of strings seq_set1 and seq_set2. Compute the similarity value as follows:
Use the Jaccard index to compute the similarity between individual strings.
Compute the similarity between the sets of strings seq_set1 and seq_set2 as the maximum similarity between a string in seq_set1 and one in seq_set2.
You can use the code from the previous short problems as helper functions for this problem.
You can assume that seq_set1 and seq_set2 are both non-empty and that the strings in these sets all have length at least k.
Examples
Call: seq_set_sim(set(['aaaa','aabb']), set(['aaab']), 3) Return value: 0.5
Call: seq_set_sim(set(['aaabba','aabbcc']), set(['aaab','abbc']), 4) Return value: 0.3333333333333333
Call: seq_set_sim(set(['aaabba','abbc']), set(['aaab','aabbcc']), 2) Return value: 0.6
Call: seq_set_sim(set(['ababab','acacac']), set(['bababa','cacaca']), 3) Return value: 1.0
Call: seq_set_sim(set(['abbbbba','bcccccb']), set(['aaaaab','aaaaac']), 3) Return value: 0.0
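One possible solution is sketched below. The helper code from the "previous short problems" is not shown in the question, so the helpers kmers and jaccard are hypothetical stand-ins. The sketch assumes that each string is represented by the set of its length-k substrings (k-mers) and that the Jaccard index of two strings is the size of the intersection of their k-mer sets divided by the size of the union. This reading reproduces all five example return values above.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq.

    Hypothetical helper: assumes strings are compared by their k-mer sets.
    """
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(seq1, seq2, k):
    """Jaccard index |A & B| / |A | B| of the k-mer sets of two strings.

    Hypothetical helper. The union is never empty because every string
    is assumed to have length at least k.
    """
    a, b = kmers(seq1, k), kmers(seq2, k)
    return len(a & b) / len(a | b)


def seq_set_sim(seq_set1, seq_set2, k):
    """Maximum Jaccard similarity over all pairs (s1, s2) with
    s1 in seq_set1 and s2 in seq_set2."""
    return max(jaccard(s1, s2, k) for s1 in seq_set1 for s2 in seq_set2)


print(seq_set_sim(set(['aaaa', 'aabb']), set(['aaab']), 3))  # 0.5
```

The function compares every pair of strings, so the running time grows with the product of the two set sizes. That is unavoidable for a maximum over all pairs unless extra structure is exploited.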
