Question:
This will give you two different representations of the network: fr.n is a network object, while fr is a 71 × 71 matrix where a 1 in cell (i, j) indicates lawyer i likes lawyer j (directed edge).

(a) We would like to test whether popularity (indegree) is statistically different between different lawyers, in other words whether all indegree differences may be due to chance. We will do it in two ways:

i. First, fit both a p1 statistical model with indegree, outdegree, and mutuality, and one with outdegree and mutuality only. Assuming the likelihood calculations are correct, perform a generalized likelihood ratio test (GLRT) between them and conclude whether the inclusion of the indegree is statistically useful.

ii. Second, perform a permutation test using the matrix fr. You can choose how to do it; here's a suggestion: condition the analysis on the number of outgoing edges from each lawyer, and under the null these outgoing edges are randomly divided between the other 70 lawyers. Thus, each permuted matrix should:
- Preserve row sums
- Randomly shuffle columns for each row
- Avoid self-friending (no 1's on the diagonal)
Perform 10^4 permutations, or as many as you can. You will also have to choose a test statistic to represent how non-uniform the distribution of incoming edges is, then calculate it once on the real matrix and once on each permuted matrix. The p-value is the percentage of permuted matrices that give a higher value than the original matrix. You can try more than one statistic. Justify your choice.

Summarize your conclusion from both approaches: is popularity (indegree) uniform or non-uniform for this data? Confirm your conclusion using the plot of the network or other relevant simple analysis.

(b) The second task we would like to perform on this data is to identify groups and structure by using a latent variable model.

i. Fit at least four different models, with varying dimension, with/without a Gaussian mixture structure with different numbers of components, etc. State your conclusions about:
- Number of clusters in the data
- Other structures and patterns you observe that are of interest
Make sure to use both statistical and heuristic/graphical arguments to support your claims.

ii. The objects fr.big, fr.big.n contain the same network, with 8 rows corresponding to lawyers with 0 ties (either outgoing or incoming) removed. Repeat the previous analysis on this data. Do your conclusions change?
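
The permutation scheme suggested in (a) ii can be sketched as below. This is only an illustration under stated assumptions, not the course's reference solution: it is written in standard-library Python rather than R, `adj` is a hypothetical list of 0/1 rows standing in for the matrix fr, and the function names are invented for this sketch. The test statistic used here is the variance of the indegrees, which is one possible choice among several (the maximum indegree would be another); it grows as incoming edges concentrate on a few lawyers. The p-value follows the definition in the question: the share of permuted matrices whose statistic is strictly higher than the observed one.

```python
import random


def indegree_variance(adj):
    """Test statistic (an assumed choice): variance of the column sums."""
    n = len(adj)
    indeg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    mean = sum(indeg) / n
    return sum((d - mean) ** 2 for d in indeg) / n


def permute_rows(adj, rng):
    """Redraw each row's targets uniformly among the other n - 1 nodes.

    The row sum (outdegree) is preserved and the diagonal stays zero.
    """
    n = len(adj)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        k = sum(adj[i])  # outdegree of node i, kept fixed
        others = [j for j in range(n) if j != i]  # excludes self-ties
        for j in rng.sample(others, k):
            out[i][j] = 1
    return out


def permutation_pvalue(adj, n_perm=10_000, seed=0):
    """Share of permuted matrices with a strictly higher statistic."""
    rng = random.Random(seed)
    observed = indegree_variance(adj)
    higher = sum(
        indegree_variance(permute_rows(adj, rng)) > observed
        for _ in range(n_perm)
    )
    return higher / n_perm
```

A small p-value from `permutation_pvalue(adj)` would suggest that indegrees are more spread out than random allocation of each lawyer's outgoing ties would produce. Pure-Python loops over a 71 × 71 matrix with 10^4 permutations may take a while, so a smaller `n_perm` is reasonable for a first run.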
