Question: I am writing code to scrape pages from Amazon and match certain information about books. I'm coding in HTML, CSS, and PHP. I got the first two preg_match regexes to work, but I can't get the third one. I have included all the files I am using; the main script is scraper.php. The information I'm looking for in the source code of the Amazon pages is the author, title, and publisher. For now I have copied the source of two pages and saved them into files to test my regexes, since I can't constantly scrape from Amazon.

Here is the main part of my PHP script that matches phrases with preg_match:

 (.+)<\/span> <\/h1>)/",$line,$result)) {
        echo "Title: $result[1] ";
    }
    if (preg_match("/(.+)<\/a> /",$line,$author)) {
        echo "Author: $author[1] ";
    }
    if (preg_match("/Publisher.+(.+)<\/span>/",$line,$goal)) {
        echo "Publisher: $goal[1] ";
    }
}
?>

And here is the code snippet from the Amazon page that I'm trying to scrape:

Title:

A Tutorial Introduction to Occam Programming

Author:

    • s Publisher ‏ : ‎ McGraw-Hill (December 1, 1987)
  • Language ‏ : ‎

For the publisher regex, I tried to find the span with class a-text-bold that contains "Publisher", skip any other characters up to the start of the next span element, and then capture everything after that opening span tag up to the next closing span tag. Then print the result. Unfortunately, nothing was printed to my page, and I got no error messages either.
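The approach described above can be sketched as follows. Two things in the posted pattern are likely to get in the way: the greedy `.+` before the capture group swallows almost the whole value (leaving the group a single character), and the script matches against one `$line` at a time, while the label span and the value span probably sit on different lines of the page source. The sample markup below is an assumption, since the snippet in the question had its tags stripped, and `extractPublisher()` is a hypothetical helper name used only for illustration.

```php
<?php
// Assumed sample markup: the real Amazon source may differ. The label span
// and the value span are placed on separate lines on purpose.
$html = <<<'HTML'
<li><span class="a-list-item">
<span class="a-text-bold">Publisher
&rlm;
:
&lrm;
</span>
<span>McGraw-Hill (December 1, 1987)</span>
</span></li>
HTML;

// Hypothetical helper, not part of the original scraper.php.
function extractPublisher(string $html): ?string
{
    // Publisher        the label text inside the bold span
    // .*?<\/span>      lazily skip to the end of the label span
    // \s*<span[^>]*>   optional whitespace/newlines, then the value span's opening tag
    // (.*?)<\/span>    lazily capture up to the first closing span tag
    // s modifier       lets "." match newlines so the match can cross lines
    $pattern = '/Publisher.*?<\/span>\s*<span[^>]*>(.*?)<\/span>/s';

    if (preg_match($pattern, $html, $match)) {
        return trim($match[1]);
    }
    return null; // no match found
}

$publisher = extractPublisher($html);
echo "Publisher: " . ($publisher ?? "not found") . "\n";
```

For this to work on the saved test files, the whole page would need to be in one string, for example by loading it with `file_get_contents()` rather than testing each `$line` separately. If the real markup differs from the assumed sample, the pattern will need adjusting to match.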
