When performing an eCommerce Content Audit, our Analysts often need the specific category page content and product page excerpts so they can analyze them for length, originality and more. Our current go-to tool for crawling websites is Screaming Frog. Its relatively new (summer 2015) “extraction” feature has made grabbing specific snippets of content for later analysis much easier than before.
Before extraction was available, we had to either build our own crawler with a tool like import.io or Kimono, or request an export of the content from our client (who sometimes was unable to provide it easily) and then use VLOOKUP in a spreadsheet to “link” the content data to the data from Screaming Frog, Google Analytics and URL Profiler. Now, with the new extraction feature, we simply set it up before we crawl in Screaming Frog, and we have our content extracted all in one step.
Using Screaming Frog and XPath to Extract Product and Category Content
Screaming Frog provides three different ways to extract content from a Web page: CSSPath, XPath and RegEx. Don’t worry if you are not familiar with these technologies just yet. For our purposes, CSSPath and XPath are usually the simplest to work with, and by using Chrome Developer Tools you won’t have to learn any code.
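To make the difference between these selector types concrete, here is a minimal Python sketch (not something Screaming Frog itself runs) that pulls the same paragraph out of a made-up HTML fragment twice: once with an XPath-style expression and once with a regular expression. The markup, the `category-description` id and the variable names are all assumptions for illustration. Python's standard library supports only a subset of XPath, so the expression starts with `.//` rather than `//`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for a category page.
PAGE = (
    "<html><body>"
    "<h1>TV Amplifiers</h1>"
    "<div id='category-description'><p>Hear every word of your favorite shows.</p></div>"
    "</body></html>"
)

# XPath-style lookup: find the element by its id, wherever it sits in the tree.
root = ET.fromstring(PAGE)
node = root.find(".//div[@id='category-description']")
xpath_text = "".join(node.itertext()).strip()

# RegEx lookup: match the raw markup and capture what sits between the tags.
match = re.search(r"<div id='category-description'><p>(.*?)</p></div>", PAGE)
regex_text = match.group(1)

print(xpath_text)
print(regex_text)
```

Both lookups return the same sentence here, but the regular expression only works while the markup matches it character for character, which is one reason the path-based options tend to be easier to maintain.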
Step 1: Inspect the Element Where Your Content Resides
Go to a category page like https://adcohearing.com/categories/tv-amplifiers. Right-click in the area of the category content and select “Inspect Element” from the menu. Chrome Developer Tools will open in your browser. This works the same way for a product page.
Step 2: Copy the XPath
Highlight the content in the code and right-click again. This time, select “Copy XPath” (in newer versions of Chrome, this option sits under the “Copy” submenu).
Step 3: Set Up Your Extraction in Screaming Frog
Go to the “Extraction” setup under the “Custom” menu option. Paste in the XPath you copied from Chrome Developer Tools. Choose “XPath” as the type and choose “Extract Text.”
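The “Extract Text” choice matters because the element you copied usually wraps other tags. As a rough illustration of the idea (not Screaming Frog's own code), the Python sketch below applies one XPath-style expression to a made-up fragment and shows the text-only result next to the full element markup. The fragment, the `category-description` id and the helper name `extract` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; a real page would come from the crawl.
FRAGMENT = (
    "<div id='main'>"
    "<div id='category-description'>"
    "<p>Wireless <strong>TV amplifiers</strong> for clearer sound.</p>"
    "</div>"
    "</div>"
)


def extract(markup, xpath, mode="text"):
    """Return the first match as plain text or as element markup ('' if no match)."""
    root = ET.fromstring(markup)
    node = root.find(xpath)
    if node is None:
        return ""
    if mode == "text":
        # Text only: join every text node and collapse extra whitespace.
        return " ".join("".join(node.itertext()).split())
    # Otherwise keep the tags, similar in spirit to extracting the whole element.
    return ET.tostring(node, encoding="unicode")


xpath = ".//*[@id='category-description']"
text_only = extract(FRAGMENT, xpath, mode="text")
with_tags = extract(FRAGMENT, xpath, mode="html")
print(text_only)
print(with_tags)
```

For a content audit, the text-only form is usually what you want, since word counts and duplicate checks should not be skewed by markup.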
Step 4: Run Screaming Frog
Once the crawl finishes, check out the “Custom” tab in Screaming Frog. Make sure you have “Extraction” selected.
Step 5: Export the Screaming Frog Data
You can now export the extracted content along with the rest of your crawl data, including Google Analytics (if you set the tool up to pull that data with your crawl).
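Once the export is in hand, the length and originality checks mentioned earlier can be scripted. The sketch below is one possible approach in Python, assuming a CSV export with an `Address` column for the URL and an `Extractor 1 1` column for the extracted text. Check the header row of your own export, since the column names depend on how the extractor was named. The 50-word threshold, the function name `audit` and the sample rows are arbitrary placeholders.

```python
import csv
import io
from collections import defaultdict

# Assumed column names; adjust to match the header row of your export.
URL_COL = "Address"
TEXT_COL = "Extractor 1 1"
MIN_WORDS = 50  # arbitrary threshold for "thin" content


def audit(rows, min_words=MIN_WORDS):
    """Return (thin_urls, duplicate_groups) for an iterable of row dicts."""
    thin = []
    by_text = defaultdict(list)
    for row in rows:
        text = " ".join((row.get(TEXT_COL) or "").split())
        if len(text.split()) < min_words:
            thin.append(row[URL_COL])
        if text:
            # Group URLs whose extracted copy is identical, ignoring case.
            by_text[text.lower()].append(row[URL_COL])
    duplicates = [urls for urls in by_text.values() if len(urls) > 1]
    return thin, duplicates


# Made-up sample standing in for the exported file.
SAMPLE = io.StringIO(
    "Address,Extractor 1 1\n"
    "https://example.com/a,Same short blurb\n"
    "https://example.com/b,Same short blurb\n"
    "https://example.com/c,\n"
)
thin, duplicates = audit(csv.DictReader(SAMPLE))
print(thin)
print(duplicates)
```

To run this against a real export, replace `SAMPLE` with a file opened via `open(path, newline="", encoding="utf-8")`, where `path` points to your exported CSV. Exact-match grouping only catches copy that is pasted verbatim across pages; near-duplicate detection would need a fuzzier comparison.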
Now that you understand the basics of how to find the XPath to an “element” on a Web page and how to use that XPath to extract the content of that element via Screaming Frog, you can apply the same approach to almost any on-page content you want to audit.
What other creative ways have you found to use the extraction tool in Screaming Frog? Let us know in the comments below!