Consider a dataset with data points each having 3 features, e.g., x 1 = { Atlanta ,
Fantastic news! We've Found the answer you've been seeking!
Question:
Consider a dataset with data points each having 3 features, e.g., x 1 = {"Atlanta", "house", 500k},
and x 2 = {"Houston", "house", 300k}. Define a proper similarity function d(x i , xj ) for this kind of data,
and argue why it is a reasonable choice. (Hint: The feature vector consists of categorial and real-valued features; for categorical variables, it is better to convert them into one-hot-keying binary vectors and use Hamming distance, and for real-valued features, you may use Euclidean distance, for instance. And then you can combine the similarity measure in some way.)
Related Book For
Posted Date: