Question: It takes forever to run questions 5.3 and 5.4 because the data set is large. Please redo the code so that it runs in less time. In addition, please add code to check that there are no edges between nodes of the same type.

from typing import List, Tuple

import networkx as nx
import numpy as np
import scipy as sp
import scipy.sparse
from networkx.algorithms import bipartite


def load_github_data() -> Tuple[nx.Graph, List[str], List[str]]:
    """
    Returns:
        G: NetworkX graph object
        uid_list (list): list of users
        pid_list (list): list of projects
    """
    G = nx.Graph()
    uid_list = set()
    pid_list = set()
    with open("githubdata.txt", "r") as file:
        for line in file:
            user_id, project_id = line.split()
            # Tag each node with its side of the bipartite graph.
            G.add_node(user_id, bipartite=0)
            G.add_node(project_id, bipartite=1)
            G.add_edge(user_id, project_id)
            uid_list.add(user_id)
            pid_list.add(project_id)
    # NOTE: We are also returning a list of users and projects. This will be helpful
    # when getting the correct user and project indices from the projections.
    uid_list = sorted(uid_list)
    pid_list = sorted(pid_list)
    return G, uid_list, pid_list


# Create a function to create the User-Project matrix.
def calculate_projections(G, uid_list, pid_list) -> Tuple[sp.sparse.spmatrix, sp.sparse.spmatrix]:
    """
    Inputs:
        G: NetworkX graph object
        uid_list (list): list of users
        pid_list (list): list of projects
    Returns:
        user_matrix (sp.sparse.spmatrix): one-mode projection for users
        project_matrix (sp.sparse.spmatrix): one-mode projection for projects
    """
    # Rows follow uid_list and columns follow pid_list.
    adjacency_matrix = bipartite.biadjacency_matrix(G, row_order=uid_list, column_order=pid_list)
    # Entry (i, j) counts the projects shared by users i and j.
    users_projection = adjacency_matrix.dot(adjacency_matrix.T)
    # Entry (i, j) counts the users shared by projects i and j.
    projects_projection = adjacency_matrix.T.dot(adjacency_matrix)
    return users_projection, projects_projection


# Write a function that will return the pair of users that share the highest number of GitHub projects between them.
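def get_user_pair(M, uid_list) -> Tuple[str, str]:
    """
    Inputs:
        M: projected matrix
        uid_list (list): list of users
    Returns:
        u1 (str): first user
        u2 (str): second user
    """
    max_value = 0
    # Initialised so the return is defined even if no pair shares a project.
    u1, u2 = None, None
    n = len(uid_list)
    # Scan every pair above the diagonal; this is the slow part on large data.
    for i in range(n):
        for j in range(i + 1, n):
            if M[i, j] > max_value:
                max_value = M[i, j]
                u1, u2 = uid_list[i], uid_list[j]
    return u1, u2


# Write a function that will return the pair of projects that share the highest number of users between them.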
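def get_project_pair(M, pid_list) -> Tuple[str, str]:
    """
    Inputs:
        M: projected matrix
        pid_list (list): list of projects
    Returns:
        p1 (str): first project
        p2 (str): second project
    """
    # max_value = np.max(M)
    # indices = np.where(pid_list == max_value)
    # p1 = pid_list[indices[0]]
    # p2 = pid_list[indices[1]]
    max_value = 0
    # Initialised so the return is defined even if no pair shares a user.
    p1, p2 = None, None
    M = sp.sparse.csr_matrix(M)
    n = len(pid_list)
    # Scan every pair above the diagonal; this is the slow part on large data.
    for i in range(n):
        for j in range(i + 1, n):
            if M[i, j] > max_value:
                max_value = M[i, j]
                p1, p2 = pid_list[i], pid_list[j]
    return p1, p2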
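
The nested loops above perform on the order of n² single-element lookups into a sparse matrix, which is the likely reason questions 5.3 and 5.4 run so long. The following is a minimal sketch of one possible alternative, not a definitive solution: it keeps only the stored entries above the diagonal and takes the largest of them. It also includes a check for edges between nodes of the same type. The helper names `has_same_type_edge` and `top_pair` are hypothetical, and the check assumes every node carries the `bipartite` attribute set by the loader. When several pairs tie for the maximum, this sketch may return a different tied pair than the loop would.

```python
import networkx as nx
import numpy as np
import scipy.sparse as sparse
from networkx.algorithms import bipartite


def has_same_type_edge(G):
    """Return True if any edge joins two nodes with the same 'bipartite' value."""
    return any(
        G.nodes[u]["bipartite"] == G.nodes[v]["bipartite"] for u, v in G.edges()
    )


def top_pair(M, labels):
    """Return the two labels whose off-diagonal entry in M is largest."""
    # Keep only the strict upper triangle: the diagonal (a node paired with
    # itself) and the mirrored lower half are discarded.
    upper = sparse.triu(M, k=1).tocoo()
    if upper.nnz == 0:
        return None, None  # no pair shares anything
    # Only stored (non-zero) entries are examined, not all n * n cells.
    best = int(np.argmax(upper.data))
    return labels[upper.row[best]], labels[upper.col[best]]


# Tiny made-up graph, for illustration only.
edges = [("u1", "p1"), ("u1", "p2"), ("u2", "p1"),
         ("u2", "p2"), ("u3", "p2"), ("u3", "p3")]
G = nx.Graph()
for user, project in edges:
    G.add_node(user, bipartite=0)
    G.add_node(project, bipartite=1)
    G.add_edge(user, project)

users = ["u1", "u2", "u3"]
projects = ["p1", "p2", "p3"]
A = bipartite.biadjacency_matrix(G, row_order=users, column_order=projects)

user_pair = top_pair(A @ A.T, users)        # users sharing the most projects
project_pair = top_pair(A.T @ A, projects)  # projects sharing the most users
print(has_same_type_edge(G), user_pair, project_pair)
```

With the functions from the question, the same calls would be `top_pair(users_projection, uid_list)` and `top_pair(projects_projection, pid_list)`, and `has_same_type_edge(G)` could be run once, straight after the graph is loaded.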
