Scrapy-Nordstrom

Scraping products from Nordstrom search results using Scrapy.

Problem

When you search Nordstrom for a word, it gives you a URL like this: http://shop.nordstrom.com/sr?origin=keywordsearch&keyword=suitcase

When you try to scrape it, you get only the first 12 items.

Some resort to Selenium to solve this kind of issue. This project offers a solution that uses Scrapy instead.

Solution

If you click through to the next page, you will see how the URL is formed. Notice the "top=72" parameter at the end: http://shop.nordstrom.com/sr?origin=keywordsearch&keyword=suitcase&page=1&top=72

The trick is to scrape only 12 items per page by setting the "top" parameter to 12 instead of 72. So, for example, instead of having 11 pages, you will have 63 pages.
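The page counts above follow from a simple ceiling division. Here is a minimal sketch of that arithmetic; the total of 756 items is a hypothetical figure chosen only because it is consistent with both page counts, not a number taken from Nordstrom.

```python
import math


def pages_needed(total_items, per_page):
    """Number of result pages needed to cover total_items."""
    return math.ceil(total_items / per_page)


# Hypothetical result count, chosen to match the 11 vs. 63 pages example.
total_items = 756

print(pages_needed(total_items, 72))  # pages when "top" is 72
print(pages_needed(total_items, 12))  # pages when "top" is 12
```

With these assumed numbers, the first call gives 11 and the second gives 63.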

So you should use this URL instead: http://shop.nordstrom.com/sr?origin=keywordsearch&keyword=suitcase&page=1&top=12
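As an illustration, a URL of this shape could be assembled with Python's standard library rather than typed by hand. The helper name build_search_url is hypothetical and is not part of the spiders in this repository.

```python
from urllib.parse import urlencode

BASE_URL = "http://shop.nordstrom.com/sr"
ITEMS_PER_PAGE = 12  # value of the "top" parameter used in this README


def build_search_url(keyword, page, top=ITEMS_PER_PAGE):
    """Return a search URL with the same parameters as the one shown above."""
    params = {
        "origin": "keywordsearch",
        "keyword": keyword,
        "page": page,
        "top": top,
    }
    # urlencode keeps the dict's insertion order and escapes the keyword.
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("suitcase", 1))
```

For the keyword "suitcase" and page 1, this prints the same URL as the one above. Keywords containing spaces are escaped by urlencode.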

This repository contains two Scrapy spiders, "nord" and "nordrules". One of them uses LinkExtractor.

Usage

Change url so that it contains the keyword you are searching for.
Change range to the number of pages + 1.
In your terminal, navigate to the nordstrom folder.
To run the "nord" spider, use: scrapy crawl nord -o nord.csv
To run the "nordrules" spider, use: scrapy crawl nordrules -o nord-rules.csv
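The "number of pages + 1" in the second step is needed because Python's range() excludes its end value. The sketch below shows how the list of page URLs could be generated under that rule. It assumes pages are numbered from 1, and the names SEARCH_URL and start_urls_for are hypothetical; the actual spiders may build their URLs differently.

```python
# URL template based on the one in the Solution section; "top" is fixed at 12.
SEARCH_URL = (
    "http://shop.nordstrom.com/sr"
    "?origin=keywordsearch&keyword={keyword}&page={page}&top=12"
)


def start_urls_for(keyword, number_of_pages):
    """Build one URL per result page, assuming pages are numbered from 1."""
    # range() stops before its end value, hence number_of_pages + 1.
    return [
        SEARCH_URL.format(keyword=keyword, page=page)
        for page in range(1, number_of_pages + 1)
    ]


urls = start_urls_for("suitcase", 63)
print(len(urls))
print(urls[0])
print(urls[-1])
```

A list like this could be assigned to a spider's start_urls attribute, which Scrapy uses as the initial requests.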

Scrapy Online Course

Check this Scrapy tutorial to learn much more:

Scrapy: Powerful Web Scraping & Crawling with Python

GoTrained/Scrapy-Nordstrom
