Questions 5.3 and 5.4 take forever to run because the data set is large. Please rewrite the code so that it runs faster. In addition, please add code to check that there are no edges between nodes of the same type.

# Imports inferred from the names used below; they were not shown in the post.
from typing import List, Tuple

import networkx as nx
import numpy as np
import scipy as sp
import scipy.sparse
from networkx.algorithms import bipartite


def load_github_data() -> Tuple[nx.Graph, List[str], List[str]]:
    """
    Returns:
        G: NetworkX graph object
        uid_list (list): list of users
        pid_list (list): list of projects
    """
    G = nx.Graph()
    uid_list = set()
    pid_list = set()
    with open('github_data.txt', 'r') as file:
        for line in file:
            user_id, project_id = line.split()
            G.add_node(user_id, bipartite=0)
            G.add_node(project_id, bipartite=1)
            G.add_edge(user_id, project_id)
            uid_list.add(user_id)
            pid_list.add(project_id)

    # NOTE: We are also returning a list of users and projects. This will be helpful
    # when getting the correct user and project indices from the projections.
    uid_list = sorted(uid_list)
    pid_list = sorted(pid_list)
    return G, uid_list, pid_list


# Create a function to create the User-Project Matrix
def calculate_projections(G, uid_list, pid_list) -> Tuple[sp.sparse.spmatrix, sp.sparse.spmatrix]:
    """
    Inputs:
        G: NetworkX graph object
        uid_list (list): list of users
        pid_list (list): list of projects
    Returns:
        user_matrix (sp.sparse.spmatrix): one mode projection for users
        project_matrix (sp.sparse.spmatrix): one mode projection for projects
    """
    # Placeholder values, disabled in the original:
    # users_projection = np.array([0, 0])
    # projects_projection = np.array([0, 0])
    adjacency_matrix = bipartite.biadjacency_matrix(G, row_order=uid_list, column_order=pid_list)
    users_projection = adjacency_matrix.dot(adjacency_matrix.T)
    projects_projection = adjacency_matrix.T.dot(adjacency_matrix)
    return users_projection, projects_projection


# Write a function that will return the pair of users that share the highest number of Github projects between them.
def get_user_pair(M, uid_list) -> Tuple[str, str]:
    """
    Inputs:
        M: projected matrix
        uid_list (list): list of users
    Returns:
        u1 (str) - first user
        u2 (str) - second user
    """
    max_value = -1
    n = len(uid_list)
    for i in range(n):
        for j in range(i + 1, n):
            if M[i, j] > max_value:
                max_value = M[i, j]
                u1, u2 = uid_list[i], uid_list[j]
    return u1, u2


# Write a function that will return the pair of projects that share the highest number of users between them.
def get_project_pair(M, pid_list) -> Tuple[str, str]:
    """
    Inputs:
        M: projected matrix
        pid_list (list): list of projects
    Returns:
        p1 (str) - first project
        p2 (str) - second project
    """
    # max_value = np.max(M)
    # indices = np.where(pid_list == max_value)
    # p1 = pid_list[indices[0]]
    # p2 = pid_list[indices[1]]
    max_value = -1
    M = sp.sparse.csr_matrix(M)
    n = len(pid_list)
    for i in range(n):
        for j in range(i + 1, n):
            if M[i, j] > max_value:
                max_value = M[i, j]
                p1, p2 = pid_list[i], pid_list[j]
    return p1[0], p2[0]
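One possible way to speed up 5.3 and 5.4, offered as a sketch rather than the expected solution: both functions loop over every pair (i, j) in Python and index the sparse matrix once per pair, so the running time grows with the square of the number of users or projects. Since the projections are sparse, it should be enough to look only at the stored non-zero entries above the diagonal. The helper name get_top_pair below is hypothetical (it is not part of the assignment), and it assumes M is the square projection returned by calculate_projections with rows and columns ordered like the label list. Separately, note that get_project_pair returns p1[0], p2[0], which would give only the first character of each ID if the IDs are strings; the sketch returns the full labels.

```python
from typing import List, Tuple

import numpy as np
import scipy.sparse as sparse


def get_top_pair(M, labels: List[str]) -> Tuple[str, str]:
    """Return the two distinct labels with the largest shared count in M.

    Hypothetical helper: M is assumed to be a square one-mode projection
    whose rows and columns follow the order of `labels`.
    """
    # Keep only the strict upper triangle (i < j). This drops the diagonal,
    # which holds each node's own degree, and counts every pair once.
    upper = sparse.triu(sparse.csr_matrix(M), k=1).tocoo()
    upper.sum_duplicates()  # make sure each (i, j) is stored exactly once
    if upper.nnz == 0:
        raise ValueError("no two distinct nodes share a neighbour")
    # Search the stored non-zero values only, not all n * n cells.
    best = int(np.argmax(upper.data))
    return labels[upper.row[best]], labels[upper.col[best]]
```

With this helper, get_user_pair(M, uid_list) could reduce to get_top_pair(M, uid_list), and get_project_pair(M, pid_list) to get_top_pair(M, pid_list). When several pairs tie for the maximum, the pair returned may differ from the one the nested loops would pick.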

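For the second request, a minimal sketch of a check that no edge joins two nodes of the same type. It relies on the bipartite node attribute (0 for users, 1 for projects) that load_github_data sets; the function name find_same_type_edges is hypothetical.

```python
from typing import List, Tuple

import networkx as nx


def find_same_type_edges(G: nx.Graph) -> List[Tuple[str, str]]:
    """List every edge whose endpoints carry the same 'bipartite' value.

    Hypothetical helper: assumes each node was added with bipartite=0
    (user) or bipartite=1 (project), as in load_github_data.
    """
    side = nx.get_node_attributes(G, "bipartite")
    # An edge is valid only if it connects a 0-node to a 1-node.
    # Nodes missing the attribute compare as None and are flagged too.
    return [(u, v) for u, v in G.edges() if side.get(u) == side.get(v)]
```

A possible use after loading: bad = find_same_type_edges(G), followed by assert not bad. If a user ID and a project ID happen to be the same string, NetworkX treats them as one node and the edge becomes a self-loop, which this check would also report.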

