Question:
I'm writing code to scrape pages from Amazon and match certain information about books. I'm coding in HTML, CSS, and PHP. I got the first two preg_match regexes to work, but I can't get the third one working. I have included all the files I am using; the main script is in scraper.php. The information I'm looking for in the source of the Amazon pages is the author, title, and publisher. For now I have copied the source of two pages and saved them to files to test my regexes against, as I can't constantly scrape Amazon.
Here is the main part of my PHP script that matches phrases with preg_match:
(.+)<\/span> <\/h1>)/",$line,$result)) {
    echo "Title: $result[1] ";
}
if (preg_match("/(.+)<\/a> /",$line,$author)) {
    echo "Author: $author[1] ";
}
if (preg_match("/Publisher.+(.+)<\/span>/",$line,$goal)) {
    echo "Publisher: $goal[1] ";
}
}
?>
And here is the code snippet from the Amazon page that I'm trying to scrape:
Title:
Author:
- s Publisher : McGraw-Hill (December 1, 1987)
Language :
For the publisher regex, I tried to look for the span with class a-text-bold that has Publisher right after it, ignore any other characters until the beginning of the next span element, and then, from that opening span tag, capture all characters up to the next closing span tag and print the result. Unfortunately, no text appeared on my page, and I got no error messages either.
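One possible reason for the silent failure is that the script tests one $line at a time, while the Publisher label and its value may sit on different lines of the page source. In PHP, `.` does not match a newline unless the `s` modifier is set, so a per-line pattern can never span from one span element to the next. The sketch below shows the approach described above applied to the whole document at once. The sample markup in $html is an assumption reconstructed from that description (the original snippet is not fully visible here), and the real Amazon markup may differ.

```php
<?php
// Hypothetical sample markup, reconstructed from the description above.
// The real page source may be laid out differently.
$html = <<<HTML
<li><span class="a-list-item">
<span class="a-text-bold">Publisher
:
</span>
<span>McGraw-Hill (December 1, 1987)</span>
</span></li>
<li><span class="a-list-item">
<span class="a-text-bold">Language
:
</span>
<span>English</span>
</span></li>
HTML;

// The s modifier lets . match newlines, so the pattern can cross line breaks.
// .*? is lazy, so each part stops at the first closing tag it can reach
// instead of running on to the last </span> in the document.
$pattern = '/<span class="a-text-bold">\s*Publisher.*?<\/span>\s*<span>(.*?)<\/span>/s';

$publisher = null;
if (preg_match($pattern, $html, $goal)) {
    // Trim stray whitespace around the captured value.
    $publisher = trim($goal[1]);
    echo "Publisher: $publisher\n";
}
```

To try this against one of the saved test files, the whole file could be loaded into a single string with file_get_contents instead of being read line by line. If the value span in the real source carries attributes, the `<span>` part of the pattern would need to be loosened accordingly.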