Question 3. Logistic Regression: How 'unfair' can it be?
We have seen that the inductive bias of an SVC guarantees that, when the dataset is linearly separable, the SVC will return a hyperplane that lies at exactly the same distance from the two classes. But what about logistic regression? Can we guarantee that it is at least partially fair?
The answer is no (a short sketch of why appears after the list below). We can demonstrate how logistic regression can be 'unfair' by constructing a dataset with the following properties:
a. the data set is linearly separable, and
b. the optimal logistic regression model corresponds to a hyperplane that nearly 'touches' one of the two classes - that is, it has a very large margin with respect to one class and a very small margin with respect to the other.
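The reason no fairness guarantee is possible: on a linearly separable dataset, the cross-entropy loss of *any* separating hyperplane can be driven arbitrarily close to zero simply by scaling the weights, so an unfair separator is just as 'optimal' in the limit as a fair one. Here is a minimal sketch of that observation (generic notation, not taken from Question 1):

```latex
% Any separating hyperplane is asymptotically optimal for the logistic loss.
% Assume w^T x_i + b > 0 whenever y_i = 1 and w^T x_i + b < 0 whenever y_i = 0.
% Scaling (w, b) by c > 0, sigma(c(w^T x_i + b)) -> y_i as c grows, hence
\lim_{c \to \infty} \sum_{i}
  \Bigl[ -\, y_i \log \sigma\bigl(c\,(w^\top x_i + b)\bigr)
         - (1 - y_i) \log \bigl(1 - \sigma\bigl(c\,(w^\top x_i + b)\bigr)\bigr) \Bigr] = 0.
```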
Demonstrate your answer as follows:
Q3-1. Plot the data points, as we did above for the Iris data set. This will show that your data set is linearly separable.
Q3-2. Calculate the optimal logistic neuron weights using the function LogisticRegressionGD from Question 1.
Q3-3. Plot the decision regions to demonstrate how the learned separation line is unfair.
Hint: Try small datasets.
Note: It's best to use fresh variable names for your dataset, since the previous values of X, y will be reused in Question 4. One possible construction is sketched below.
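As one illustration, here is a minimal, hedged sketch of Q3-1 to Q3-3. The LogisticRegressionGD class below is only a stand-in with the interface such a class typically has (eta, n_iter, random_state, fit, predict); substitute your own class from Question 1. The dataset values and hyperparameters are illustrative assumptions; expect to tune them until the lopsided margin shows up clearly.

```python
import numpy as np
import matplotlib.pyplot as plt

class LogisticRegressionGD:
    """Stand-in for the Question 1 class: batch gradient descent on log-loss."""
    def __init__(self, eta=0.05, n_iter=1000, random_state=1):
        self.eta, self.n_iter, self.random_state = eta, n_iter, random_state

    def fit(self, X, y):
        rgen = np.random.RandomState(self.random_state)
        self.w_ = rgen.normal(loc=0.0, scale=0.01, size=1 + X.shape[1])
        for _ in range(self.n_iter):
            net = X @ self.w_[1:] + self.w_[0]
            output = 1.0 / (1.0 + np.exp(-np.clip(net, -250, 250)))
            errors = y - output
            self.w_[1:] += self.eta * (X.T @ errors)
            self.w_[0] += self.eta * errors.sum()
        return self

    def predict(self, X):
        return np.where(X @ self.w_[1:] + self.w_[0] >= 0.0, 1, 0)

# Q3-1: a tiny separable dataset -- deliberately imbalanced: four class-0
# points in a cluster and a single class-1 point far away (illustrative values).
X3 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [3.0, 3.0]])
y3 = np.array([0, 0, 0, 0, 1])
for cls, marker in [(0, 'o'), (1, 'x')]:
    plt.scatter(X3[y3 == cls, 0], X3[y3 == cls, 1], marker=marker, label=f'class {cls}')
plt.legend()
plt.show()

# Q3-2: fit the logistic neuron.
lrgd = LogisticRegressionGD(eta=0.05, n_iter=1000, random_state=1).fit(X3, y3)

# Q3-3: decision regions on a mesh grid (or reuse the notebook's
# plot_decision_regions helper, if one was defined earlier).
xx, yy = np.meshgrid(np.linspace(-1, 4, 300), np.linspace(-1, 4, 300))
Z = lrgd.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.3)
plt.scatter(X3[:, 0], X3[:, 1], c=y3, edgecolor='k')
plt.show()
```

The intuition behind the construction: the aggregate gradient of the four class-0 points outweighs the pull of the lone class-1 point, so after a finite number of epochs the boundary typically sits much closer to the class-1 point; the exact position depends on eta and n_iter.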
Q3-4. The standard scikit-learn implementation of logistic regression uses regularization by default (C=1). Can you come up with a linearly separable dataset that makes that default implementation fail?
[Note: This is an experimental question. You should be able to use the example from above, or a modification of it, to make the default implementation fail; one direction to try is sketched below.]
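One plausible direction to experiment with (an assumption to verify, not a confirmed answer): shrink the feature scale of the dataset above so the absolute margin becomes tiny. Separating such data confidently requires very large weights, which the default L2 penalty (C=1) suppresses, and the 4-to-1 class imbalance then pushes the regularized optimum toward predicting the majority class everywhere:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same illustrative dataset as in the sketch above, shrunk by a factor of 100
# so that the absolute margin between the classes becomes tiny.
X3 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [3.0, 3.0]])
y3 = np.array([0, 0, 0, 0, 1])
X_small = X3 / 100.0

clf = LogisticRegression()  # scikit-learn defaults: L2 penalty, C=1.0
clf.fit(X_small, y3)

print(clf.predict(X_small))    # is the lone class-1 point misclassified?
print(clf.score(X_small, y3))  # accuracy below 1.0 means the default 'fails'
```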