Question: Write in Python. We will compute the PageRank of the articles of the Hawaiian Wikipedia, which is available at haw.wikipedia.org. Additional information on the Hawaiian wiki can be found here.
Hints: If you don't speak Hawaiian, you might want to learn the wiki logic from the English Wikipedia and translate your findings. Also, caching is recommended.
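One way to follow the caching hint is to store each downloaded page on disk and reuse it on later runs. The sketch below is only an illustration under assumptions: the names `fetch_cached`, `download`, and `CACHE_DIR`, the cache folder, and the User-Agent string are all hypothetical and not part of the exercise.

```python
import hashlib
import urllib.request
from pathlib import Path

CACHE_DIR = Path("wiki_cache")  # hypothetical cache location


def download(url):
    """Fetch a page over HTTP and return its HTML as text."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "pagerank-exercise/0.1"}  # assumed UA string
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_cached(url, cache_dir=CACHE_DIR, fetcher=download):
    """Return the HTML for url, calling fetcher only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so that any article title maps to a safe file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = cache_dir / name
    if path.exists():
        return path.read_text(encoding="utf-8")
    html = fetcher(url)
    path.write_text(html, encoding="utf-8")
    return html
```

Passing the downloader in as `fetcher` keeps the cache logic testable without network access; a second call with the same URL reads the file instead of requesting the page again.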
b) i) Write a function that scans an article given by its URL and retrieves all links to other articles in the Hawaiian Wikipedia. Avoid links to special pages, images, or other websites. For links that point to a specific section, count only the article itself. Use regular expressions to handle these cases.
ii) Make sure to match redirections to their correct destination article. To this end, find out how Wikipedia treats redirections and retrieve the true article. Help: Try searching for 'uc davis' on en.wikipedia.org. To do this, I used the collection of article URLs obtained in a), which I stored in a dict object to allow for fast lookups. Then, for each newly found link, I checked whether that link appeared in the dict. If not, it might be a redirection and receives special attention.
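Parts i) and ii) could be sketched roughly as follows. This is not a reference solution. It assumes that article links appear as `href="/wiki/Title"`, that any title containing a colon belongs to a non-article namespace (which also drops the rare article with a colon in its name), and that a redirect page is served with a `<link rel="canonical">` tag naming the true article. The names `extract_article_titles`, `canonical_title`, `resolve_title`, and the `fetch_html` callable are hypothetical; `known_titles` stands for the dict from part a), assumed to be keyed by title in the same underscore form used in the links.

```python
import re
from urllib.parse import unquote

# Internal links start with /wiki/; external links start with http or //.
HREF_RE = re.compile(r'href="/wiki/([^"]+)"')
# Assumed marker of the true article on a page reached through a redirect.
CANONICAL_RE = re.compile(
    r'<link[^>]*rel="canonical"[^>]*href="[^"]*/wiki/([^"]+)"'
)


def extract_article_titles(html):
    """Return the set of article titles linked from an article's HTML."""
    titles = set()
    for target in HREF_RE.findall(html):
        target = target.split("#", 1)[0]  # section link -> whole article
        title = unquote(target)           # decode %XX escapes such as the okina
        # A colon marks a namespace (special pages, files, categories, ...).
        if not title or ":" in title:
            continue
        titles.add(title)
    return titles


def canonical_title(html):
    """Return the title named in the canonical link, or None if absent."""
    match = CANONICAL_RE.search(html)
    return unquote(match.group(1)) if match else None


def resolve_title(title, known_titles, fetch_html):
    """Map a link title to a known article, following a redirect if needed."""
    if title in known_titles:  # fast dict lookup, as described above
        return title
    # Not a known article: it may be a redirect, so inspect the served page.
    target = canonical_title(fetch_html(title))
    return target if target in known_titles else None
```

Only links that miss the dict trigger a page request, so most lookups stay cheap; combining `fetch_html` with a cache avoids requesting the same redirect page twice.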
iii) Request all articles and obtain all links to other articles.
How many links to other articles are there? I found
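A possible way to assemble the per-article links and count them is sketched below. It rests on assumptions the exercise does not state: each (source, target) pair is counted once, links from an article to itself are ignored, and links that cannot be resolved are dropped. A different counting convention would give a different total. The function names and the `get_links` and `resolve` callables are hypothetical.

```python
def build_link_graph(known_titles, get_links, resolve):
    """Map every known article to the set of distinct articles it links to.

    get_links(title) yields the raw link titles found on that article;
    resolve(link) returns the true article title or None if it is unknown.
    """
    graph = {}
    for title in known_titles:
        targets = set()
        for link in get_links(title):
            target = resolve(link)
            # Skip unresolved links and links from a page to itself.
            if target is not None and target != title:
                targets.add(target)
        graph[title] = targets
    return graph


def count_links(graph):
    """Total number of distinct (source, target) pairs in the graph."""
    return sum(len(targets) for targets in graph.values())
```

The resulting dict of sets is also a convenient input for a later PageRank computation, since it lists the outgoing links of every article.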
