
How to Scrape Google Product Online Sellers with Python

Emily Chen

Advanced Data Extraction Specialist

20-Mar-2025

Introduction

In today’s competitive eCommerce landscape, monitoring product listings and analyzing online sellers' performance on platforms like Google can provide valuable insights. Scraping Google Product listings allows businesses to gather real-time data to compare prices, track trends, and analyze competitors. In this article, we’ll show you how to scrape Google product online sellers with Python, using a variety of methods. We will also explain why Scrapeless is the best choice for businesses seeking reliable, scalable, and legal solutions.

Understanding the Challenges of Scraping Google Product Online Sellers

When attempting to scrape Google Product Online Sellers, several key challenges can arise:

  • Anti-scraping Measures: Websites implement CAPTCHAs and IP blocking to prevent automated scraping, making data extraction difficult.
  • Dynamic Content: Google Product pages often load data with JavaScript, which static approaches like Requests & BeautifulSoup cannot execute.
  • Rate-limiting: Excessive requests in a short period can lead to throttled access, causing delays and interruptions in the scraping process.

Privacy Notice: We firmly respect website privacy. All data in this blog is publicly available and is used only to demonstrate the scraping process. We do not store any of the information or data collected.

Method 1: Scraping Google Product Online Sellers with Scrapeless API (Recommended Solution)

Why Scrapeless is a Great Tool:

  • Efficient Data Extraction: Scrapeless can bypass CAPTCHAs and anti-bot measures, enabling smooth, uninterrupted data scraping.
  • Affordable Pricing: At just $0.1 per 1,000 queries, Scrapeless offers one of the most affordable solutions for Google scraping.
  • Multi-Source Scraping: In addition to scraping Google product online sellers, Scrapeless allows you to collect data from Google Maps, Google Hotels, Google Flights, Google News and more.
  • Speed and Scalability: Handles large-scale scraping tasks without slowing down, making it ideal for both small projects and enterprise-level work.
  • Structured Data: The tool provides structured, clean data, ready for use in your analysis, reports, or integration into your systems.
  • Ease of Use: No complex setup—simply integrate your API key and start scraping data in minutes.

How to Use Scrapeless API:

  1. Sign Up: Register on Scrapeless and obtain your API key. You can also claim a free trial at the top of the Dashboard.
  2. Integrate the API: Include the API key in your code to authenticate requests to the service.
  3. Start Scraping: Send requests with product IDs or search queries, and Scrapeless will return structured data including product names, prices, reviews, and more.
  4. Use the Data: Leverage the retrieved data for competitor analysis, trend tracking, or any other project requiring Google data insights.

Complete Code Example:

import json
import requests


class Payload:
    """Wraps the actor name and input parameters expected by the Scrapeless API."""

    def __init__(self, actor, input_data):
        self.actor = actor
        self.input = input_data


def send_request():
    host = "api.scrapeless.com"
    url = f"https://{host}/api/v1/scraper/request"
    token = "your_token"  # replace with your Scrapeless API key

    headers = {
        "x-api-token": token
    }

    # Target the Google Product engine; gl/hl set the country and language.
    input_data = {
        "engine": "google_product",
        "product_id": "4172129135583325756",
        "gl": "us",
        "hl": "en",
    }

    payload = Payload("scraper.google.product", input_data)
    json_payload = json.dumps(payload.__dict__)

    response = requests.post(url, headers=headers, data=json_payload)

    if response.status_code != 200:
        print("Error:", response.status_code, response.text)
        return

    print("body", response.text)


if __name__ == "__main__":
    send_request()
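
The response body is JSON, so it is straightforward to turn into Python objects. Here is a minimal parsing sketch — note that the top-level online_sellers key and its fields are assumptions for illustration, so verify them against the actual response or the Scrapeless documentation:

import json

def parse_sellers(body: str):
    # Parse the JSON body returned by send_request() above. The
    # "online_sellers" key and its fields are assumed for illustration.
    data = json.loads(body)
    for seller in data.get("online_sellers", []):
        print(seller.get("name"), "-", seller.get("base_price"))

You could call parse_sellers(response.text) in place of the final print statement in send_request().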

Try Scrapeless for free and experience how our API can simplify your Google Product Online Seller scraping process. Start your free trial here.
Join our Discord community to get support, share insights, and stay updated on the latest features. Click here to join!

Method 2: Scraping Google Product Listings with Requests & BeautifulSoup

In this approach, we will take a deep dive into how to scrape Google product listings using two powerful Python libraries: Requests and BeautifulSoup. These libraries allow us to make HTTP requests to Google product pages and parse the HTML structure to extract valuable information.

Step 1. Setting up the environment

First, make sure Python is installed on your system, and create a new directory to hold this project's code. Next, install beautifulsoup4 and requests via pip:

$ pip install requests beautifulsoup4

Step 2. Use requests to make a simple request

Now we need to fetch the Google product data. Let's take the product with product_id 4172129135583325756 as an example and extract its online-seller information.
First, send a simple GET request with requests:

import requests

def google_product():
    # Browser-like headers make the request less likely to be flagged as a bot.
    headers = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/jpeg,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
        "Accept-Language": "en-US,en;q=0.9",
        "Cache-Control": "no-cache",
        "Pragma": "no-cache",
        "Priority": "u=0, i",
        "Sec-Ch-Ua": '"Not A(Brand";v="8", "Chromium";v="132", "Google Chrome";v="132"',
        "Sec-Ch-Ua-Mobile": "?0",
        "Sec-Ch-Ua-Platform": '"Windows"',
        "Sec-Fetch-Dest": "document",
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-Site": "none",
        "Sec-Fetch-User": "?1",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36",
    }
    response = requests.get('https://www.google.com/shopping/product/4172129135583325756?gl=us&hl=en&prds=pid:4172129135583325756,freeship:1&sourceid=chrome&ie=UTF-8', headers=headers)
    return response

As expected, the request returns the complete HTML page. Next, we need to extract the seller data from it.
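
Before parsing, it is worth a quick sanity check that Google returned the product page rather than a CAPTCHA or block page. A small sketch continuing from the function above (the file name is arbitrary):

response = google_product()

# Bail out early if Google did not return the page successfully.
if response.status_code != 200:
    print("Request failed:", response.status_code)

# Save the HTML locally so the selectors used in the next step can be
# inspected in a browser or editor.
with open("product_page.html", "w", encoding="utf-8") as f:
    f.write(response.text)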

Step 3. Get specific data

Inspecting the page shows that each seller row sits inside tr[jscontroller='d5bMlb']. We can parse the HTML with BeautifulSoup and extract the relevant fields from each row:

from urllib.parse import urlparse, parse_qs  # used to unwrap Google's redirect links

online_sellers = []
soup = BeautifulSoup(response.content, 'html.parser')
for row in soup.find_all("tr", {"jscontroller": "d5bMlb"}):
    name = row.find("a").get_text(strip=True)
    payment_methods = row.find_all("div")[1].get_text(strip=True)
    link = row.find("a")['href']
    details_and_offers_text = row.find_all("td")[1].get_text(strip=True)
    base_price = row.find("span", class_="g9WBQb fObmGc").get_text(strip=True)
    badge = row.find("span", class_="XhDkmd").get_text(strip=True) if row.find("span", class_="XhDkmd") else ""
    link = f"https://www.google.com/{link}"
    # Google wraps the seller's real URL in a redirect; the q query
    # parameter holds the direct link to the seller's site.
    parse = urlparse(link)
    direct_link = parse_qs(parse.query).get("q", [""])[0]
    online_sellers.append({
        'name': name,
        'payment_methods': payment_methods,
        'link': link,
        'direct_link': direct_link,
        'details_and_offers': [{"text": details_and_offers_text}],
        'base_price': base_price,
        'badge': badge
    })

This parses the HTML with BeautifulSoup and collects each seller's name, payment methods, link, price, and badge. Note that the class names (g9WBQb fObmGc, XhDkmd) are generated by Google and can change at any time, which is one reason this approach is fragile.
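
Once online_sellers is populated, it can be serialized for later analysis. A short sketch writing the results to a JSON file (the file name is arbitrary):

import json

# Persist the extracted seller data so it can be analyzed later.
with open("online_sellers.json", "w", encoding="utf-8") as f:
    json.dump(online_sellers, f, ensure_ascii=False, indent=2)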

Complete code

import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse, parse_qs


# Data models for a full seller record. They are defined for reference only;
# the simplified example below returns plain dictionaries instead.
class AdditionalPrice:
    def __init__(self, shipping, tax):
        self.shipping = shipping
        self.tax = tax


class OnlineSeller:
    def __init__(self, position, name, payment_methods, link, direct_link, details_and_offers, base_price,
                 additional_price, badge, total_price):
        self.position = position
        self.name = name
        self.payment_methods = payment_methods
        self.link = link
        self.direct_link = direct_link
        self.details_and_offers = details_and_offers
        self.base_price = base_price
        self.additional_price = additional_price
        self.badge = badge
        self.total_price = total_price


def sellers_results(url):
    headers = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/jpeg,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
        "Accept-Language": "en-US,en;q=0.9",
        "Cache-Control": "no-cache",
        "Pragma": "no-cache",
        "Priority": "u=0, i",
        "Sec-Ch-Ua": '"Not A(Brand";v="8", "Chromium";v="132", "Google Chrome";v="132"',
        "Sec-Ch-Ua-Mobile": "?0",
        "Sec-Ch-Ua-Platform": '"Windows"',
        "Sec-Fetch-Dest": "document",
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-Site": "none",
        "Sec-Fetch-User": "?1",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36",
    }

    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    online_sellers = []

    for row in soup.find_all("tr", {"jscontroller": "d5bMlb"}):
        name = row.find("a").get_text(strip=True)
        payment_methods = row.find_all("div")[1].get_text(strip=True)
        link = row.find("a")['href']
        details_and_offers_text = row.find_all("td")[1].get_text(strip=True)
        base_price = row.find("span", class_="g9WBQb fObmGc").get_text(strip=True)
        badge = row.find("span", class_="XhDkmd").get_text(strip=True) if row.find("span", class_="XhDkmd") else ""
        link = f"https://www.google.com/{link}"
        parse = urlparse(link)
        direct_link = parse_qs(parse.query).get("q", [""])[0]
        online_sellers.append({
            'name' : name,
            'payment_methods' : payment_methods,
            'link' : link,
            'direct_link' : direct_link,
            'details_and_offers' : [{"text": details_and_offers_text}],
            'base_price' : base_price,
            'badge' : badge
            })
    return online_sellers


url = 'https://www.google.com/shopping/product/4172129135583325756?gl=us&hl=en&prds=pid:4172129135583325756,freeship:1&sourceid=chrome&ie=UTF-8'
sellers = sellers_results(url)
for seller in sellers:
    print(seller)

Running the script prints each seller as a dictionary in the console.

Limitations

The example above does retrieve the data we want, but it carries a real risk of IP blocking. We cannot send requests at scale: a large volume of requests will trigger Google's anti-bot controls.
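
If you still want to use this approach for more than a handful of pages, spacing requests out reduces (but does not eliminate) the risk of being blocked. A minimal throttling sketch with randomized delays and simple exponential backoff:

import random
import time

import requests

def fetch_with_backoff(url, headers, max_retries=3):
    # Space requests out and back off on failure. This only lowers the
    # chance of being rate-limited; it does not bypass CAPTCHAs or IP bans.
    for attempt in range(max_retries):
        time.sleep(random.uniform(2, 5))  # polite, randomized delay
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            return response
        time.sleep(2 ** (attempt + 1))  # back off: 2s, 4s, 8s...
    raise RuntimeError(f"Failed to fetch {url} after {max_retries} attempts")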

Method 3: Scraping Google Product Online Sellers with Selenium

In this method, we will explore how to use Selenium, a powerful web automation tool, to scrape Google product listings from online sellers. Unlike Requests and BeautifulSoup, Selenium allows us to interact with dynamic pages that require JavaScript to execute, which makes it perfect for scraping Google product listings that load content dynamically.

Step 1: Set up the environment

First, make sure Python is installed on your system, and create a new directory to hold this project's code. Next, install selenium and webdriver_manager via pip:

pip install selenium
pip install webdriver_manager

Step 2: Initialize the selenium environment

Now we configure Selenium's Chrome options and initialize the driver:

options = webdriver.ChromeOptions()
options.add_argument("--headless")               # run Chrome without a visible window
options.add_argument("--no-sandbox")             # needed in some container environments
options.add_argument("--disable-dev-shm-usage")  # avoid /dev/shm size limits in Docker

# webdriver_manager downloads a ChromeDriver that matches the installed Chrome.
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)

Step 3: Get specific data

We use Selenium to open the page for the product with product_id 4172129135583325756 and extract the online-seller data:

driver.get(url)
time.sleep(5)  # wait for the page to load (a more robust wait is sketched below)

online_sellers = []

rows = driver.find_elements(By.CSS_SELECTOR, "tr[jscontroller='d5bMlb']")
for i, row in enumerate(rows):
    name = row.find_element(By.TAG_NAME, "a").text.strip()
    payment_methods = row.find_elements(By.TAG_NAME, "div")[1].text.strip()
    link = row.find_element(By.TAG_NAME, "a").get_attribute('href')
    details_and_offers_text = row.find_elements(By.TAG_NAME, "td")[1].text.strip()
    base_price = row.find_element(By.CSS_SELECTOR, "span.g9WBQb.fObmGc").text.strip()
    badge = row.find_element(By.CSS_SELECTOR, "span.XhDkmd").text.strip() if row.find_elements(By.CSS_SELECTOR, "span.XhDkmd") else ""
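
The fixed time.sleep(5) is fragile: it is too short on slow connections and wastes time on fast ones. Selenium's explicit waits are a more robust alternative — a sketch replacing the sleep:

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver.get(url)

# Wait up to 10 seconds for the seller rows to appear instead of
# sleeping for a fixed interval.
rows = WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located(
        (By.CSS_SELECTOR, "tr[jscontroller='d5bMlb']")
    )
)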

Complete code

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
import time
from urllib.parse import urlparse, parse_qs

# Data models for a full seller record. They are defined for reference only;
# this example returns plain dictionaries instead.
class AdditionalPrice:
    def __init__(self, shipping, tax):
        self.shipping = shipping
        self.tax = tax

class OnlineSeller:
    def __init__(self, position, name, payment_methods, link, direct_link, details_and_offers, base_price, additional_price, badge, total_price):
        self.position = position
        self.name = name
        self.payment_methods = payment_methods
        self.link = link
        self.direct_link = direct_link
        self.details_and_offers = details_and_offers
        self.base_price = base_price
        self.additional_price = additional_price
        self.badge = badge
        self.total_price = total_price

def sellers_results(url):
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")

    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)

    try:
        driver.get(url)
        time.sleep(5)

        online_sellers = []

        rows = driver.find_elements(By.CSS_SELECTOR, "tr[jscontroller='d5bMlb']")
        for row in rows:
            name = row.find_element(By.TAG_NAME, "a").text.strip()
            payment_methods = row.find_elements(By.TAG_NAME, "div")[1].text.strip()
            link = row.find_element(By.TAG_NAME, "a").get_attribute('href')
            details_and_offers_text = row.find_elements(By.TAG_NAME, "td")[1].text.strip()
            base_price = row.find_element(By.CSS_SELECTOR, "span.g9WBQb.fObmGc").text.strip()
            badge = row.find_element(By.CSS_SELECTOR, "span.XhDkmd").text.strip() if row.find_elements(By.CSS_SELECTOR, "span.XhDkmd") else ""

            parse = urlparse(link)
            direct_link = parse_qs(parse.query).get("q", [""])[0]

            online_seller = {
                'name': name,
                'payment_methods': payment_methods,
                'link': link,
                'direct_link': direct_link,
                'details_and_offers': [{"text": details_and_offers_text}],
                'base_price': base_price,
                'badge': badge
            }
            online_sellers.append(online_seller)

        return online_sellers

    finally:
        driver.quit() 

url = 'https://www.google.com/shopping/product/4172129135583325756?gl=us&hl=en&prds=pid:4172129135583325756,freeship:1&sourceid=chrome&ie=UTF-8'
sellers = sellers_results(url)
for seller in sellers:
    print(seller)

Running the script prints each seller as a dictionary in the console.

Limitations

Selenium is a powerful tool for automating web browsers and is widely used for automated testing and web scraping. However, it has to wait for each page to fully load and render, which makes it noticeably slower than direct HTTP requests for data extraction.

FAQ

What are the best methods for scraping Google product listings at scale?

The most effective method for large-scale Google product scraping is using Scrapeless. It provides a fast and scalable API, handling dynamic content, IP blocking, and CAPTCHAs efficiently, making it ideal for businesses.

How do I bypass Google’s anti-scraping measures when scraping product listings?

Google employs several anti-scraping measures, including CAPTCHAs and IP blocking. Scrapeless provides an API that bypasses these measures and ensures smooth, uninterrupted data extraction.

Can I use Python libraries like BeautifulSoup or Selenium to scrape Google Product listings?

While BeautifulSoup and Selenium can be used for scraping Google Product listings, they come with limitations such as slow performance, detection risk, and the inability to scale. Scrapeless offers a more efficient solution that handles all these issues.

Conclusion

In this article, we've discussed three methods for scraping Google Product Online Sellers: Requests & BeautifulSoup, Selenium, and Scrapeless. Each method offers distinct advantages, but when it comes to handling large-scale scraping, Scrapeless is undoubtedly the best choice for businesses.

  • Requests & BeautifulSoup are suitable for small-scale scraping tasks but face limitations when dealing with dynamic content or when scaling up. These tools also run the risk of being blocked by anti-scraping measures.
  • Selenium is effective for JavaScript-rendered pages, but it’s resource-intensive and slower than other options, making it less ideal for large-scale scraping of Google Product listings.

On the other hand, Scrapeless addresses all the challenges associated with traditional scraping methods. It's fast, reliable, and legal, ensuring that you can efficiently scrape Google Product Online Sellers at scale without worrying about being blocked or running into other obstacles.

For businesses seeking a streamlined, scalable solution, Scrapeless is the go-to tool. It bypasses all the hurdles of conventional methods and provides a smooth, hassle-free experience for collecting Google Product data.

Try Scrapeless today with a free trial and discover how easy it is to scale your Google Product scraping tasks. Start your free trial now.

At Scrapeless, we only access publicly available data while strictly complying with applicable laws, regulations, and website privacy policies. The content in this blog is for demonstration purposes only and does not involve any illegal or infringing activities. We make no guarantees and disclaim all liability for the use of information from this blog or third-party links. Before engaging in any scraping activities, consult your legal advisor and review the target website's terms of service or obtain the necessary permissions.
