
Web Scraping Walmart Product Rankings With Python and OkeyProxy

Tutorial
OkeyProxy

Walmart product ranking data offers insight into market trends, consumer preferences, and competitive dynamics. Scraping these rankings can fuel strategic decisions, but it can be quite challenging.

In this tutorial, we’ll guide you through scraping Walmart product rankings using Python, bypassing anti-bot measures with proxies, and exporting data for analysis.

What Is Walmart?


Walmart is one of the world’s largest retailers, originally known for its vast network of physical stores across the United States and internationally. Founded in 1962 by Sam Walton in Rogers, Arkansas, Walmart has evolved from a traditional discount store into a retail and technology leader with a significant presence in the online marketplace.

Why Walmart Product Ranking Data Matters

1. Visibility Drives Sales

Just like on Amazon or Google, the top-ranking products in Walmart's search results receive the most impressions, clicks, and conversions. If your product isn’t on the first page, or even in the first few results, it’s virtually invisible to most shoppers. Tracking ranking data helps you measure your online shelf performance in real time.

2. Real-Time Consumer Demand Signals

Search rankings on Walmart reflect actual shopping intent. If a product rises or falls in rankings, it's often a direct result of changes in customer demand, pricing, reviews, or supply, offering a valuable signal for product planning and inventory decisions.

3. Competitive Intelligence

Monitoring which products rank for which keywords reveals:

 ● Who your competitors are.

 ● How their listings are optimized.

 ● What pricing or review strategies they use.

This allows brands to reverse-engineer ranking performance and adjust their own tactics accordingly.

Note: Walmart's search algorithm considers price, availability, keyword relevance, reviews, and sales velocity, making ranking data a multi-dimensional performance indicator.

What You Can Do With Walmart Ranking Data

Product ranking data on Walmart.com can power a variety of eCommerce, analytical, and AI use cases:

 ● Digital Shelf Optimization: Understand how your product ranks and what affects its visibility.

 ● Pricing Strategy Analysis: See how price changes affect ranking and sales for you and competitors.

 ● Content Optimization: Improve titles, bullet points, and descriptions based on top-performing listings.

 ● Inventory & Demand Planning: Track trends and shifts in ranking as demand rises or falls.

 ● Competitor Monitoring: Identify new market entrants or growing competitors for key categories.

 ● AI & Recommendation Engines: Use ranking data to train models that simulate shopper behavior.

Tip: Combine ranking data with reviews, pricing history, and inventory status for a full-funnel performance view.

Who Can Benefit from Walmart Product Ranking Data?

🛍️ Brands & Manufacturers

Understand how your SKUs perform in search across regions and categories. Use this data to justify ad spend, optimize listings, or plan promotions.

📦 Retailers & Aggregators

Track how partner products rank to optimize assortment and surface the most profitable items in your inventory.

📈 eCommerce Analysts & Data Scientists

Use ranking data for trend analysis, seasonal forecasting, or predictive modeling of online consumer behavior.

🤖 AI & Machine Learning Developers

Leverage structured product and ranking data to train search relevance models, personalized recommendation engines, and automated pricing bots.

📊 Marketing & SEO Teams

Gain insight into which keywords and listing features impact Walmart’s search ranking, and refine SEO strategies accordingly.

Walmart Ranking Data as a Real-Time KPI


Walmart product ranking data allows businesses to react faster to market shifts, optimize product strategies dynamically, and reduce guesswork in promotional campaigns. Walmart’s ranking system is a live, algorithmic evaluation of how well your product meets consumer and platform expectations. 

Monitoring your product rankings helps you stay competitive and visible in the market. 

Let’s get started on how to extract Walmart product rankings efficiently!

Setup Instructions

Let’s prepare your environment for scraping. You’ll need a few tools and libraries to ensure smooth execution.

Prerequisites

 ● Python 3.8+: Install from python.org if not already set up.

 ● Libraries: We’ll use requests for HTTP requests, BeautifulSoup for HTML parsing, and pandas for data export.

 ● OkeyProxy Account: Sign up to obtain an API key for proxy management.

 ● Text Editor/IDE: Use VS Code, PyCharm, or your preferred editor.

 ● Basic Knowledge: Familiarity with Python and HTML/CSS selectors is helpful.

Installing Libraries

Install the required Python libraries by running:

bash

pip install requests beautifulsoup4 pandas

Project Setup

1.  Create a project directory: 

bash

mkdir walmart_scraper
cd walmart_scraper

2.  Create a Python file named walmart_scraper.py for the code.

3.  Sign up for OkeyProxy and note your API key for proxy integration.

With your environment ready, let’s get started!

Step-by-Step Data Extraction

We’ll build a Python script to scrape product rankings from Walmart’s search results for a query like “laptops.” The data is embedded in a JSON object within the __NEXT_DATA__ script tag, which we’ll extract efficiently.

Step 1: Import Libraries

Start by importing the necessary libraries in walmart_scraper.py:

python

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

Step 2: Send an HTTP Request to Walmart

Fetch the HTML for a search query like “laptops” using a User-Agent header to mimic a browser and avoid basic bot detection.

python

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

# Define headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}

# Walmart search URL for "laptops"
search_term = "laptop"
url = f"https://www.walmart.com/search?q={search_term}&sort=best_seller"

# Send GET request
response = requests.get(url, headers=headers)
response.raise_for_status()  # Check for request errors

Step 3: Parse HTML and Extract JSON Data

Walmart embeds product data in a <script> tag with the ID __NEXT_DATA__. Use BeautifulSoup to locate and parse it.

python

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

# Define headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}

# Walmart search URL for "laptops"
search_term = "laptop"
url = f"https://www.walmart.com/search?q={search_term}&sort=best_seller"

# Send GET request
response = requests.get(url, headers=headers)
response.raise_for_status()

# Parse HTML with BeautifulSoup
soup = BeautifulSoup(response.text, "html.parser")

# Find the __NEXT_DATA__ script tag
script_tag = soup.find("script", id="__NEXT_DATA__")
if not script_tag:
    raise ValueError("Could not find __NEXT_DATA__ script tag")

# Extract and parse JSON data
next_data = json.loads(script_tag.string)

Step 4: Extract Product Rankings

The JSON data contains products under next_data["props"]["pageProps"]["searchResult"]["itemStacks"]. Extract product name, price, and rank.

python

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

# Define headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}

# Walmart search URL for "laptops"
search_term = "laptop"
url = f"https://www.walmart.com/search?q={search_term}&sort=best_seller"

# Send GET request
response = requests.get(url, headers=headers)
response.raise_for_status()

# Parse HTML with BeautifulSoup
soup = BeautifulSoup(response.text, "html.parser")

# Find the __NEXT_DATA__ script tag
script_tag = soup.find("script", id="__NEXT_DATA__")
if not script_tag:
    raise ValueError("Could not find __NEXT_DATA__ script tag")

# Extract and parse JSON data
next_data = json.loads(script_tag.string)

# Extract product data
products = []
item_stacks = next_data["props"]["pageProps"]["searchResult"]["itemStacks"]
for stack in item_stacks:
    for rank, item in enumerate(stack.get("items", []), 1):
        product = {
            "Rank": rank,
            "Name": item.get("name", "N/A"),
            "Price": item.get("price", {}).get("priceString", "N/A")
        }
        products.append(product)

Step 5: Handle Pagination

Walmart’s search results span multiple pages (up to 25). Loop through pages by adding the page parameter to the URL. Here’s an example for two pages:

python

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

# Define headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}

# Initialize product list
products = []

# Loop through pages (e.g., 1 and 2)
for page in range(1, 3):
    # Walmart search URL for "laptops"
    search_term = "laptop"
    url = f"https://www.walmart.com/search?q={search_term}&sort=best_seller&page={page}"

    # Send GET request
    response = requests.get(url, headers=headers)
    response.raise_for_status()

    # Parse HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, "html.parser")

    # Find the __NEXT_DATA__ script tag
    script_tag = soup.find("script", id="__NEXT_DATA__")
    if not script_tag:
        print(f"Could not find __NEXT_DATA__ on page {page}")
        continue

    # Extract and parse JSON data
    next_data = json.loads(script_tag.string)

    # Extract product data
    item_stacks = next_data["props"]["pageProps"]["searchResult"]["itemStacks"]
    for stack in item_stacks:
        for rank, item in enumerate(stack.get("items", []), 1):
            product = {
                "Rank": rank + (page - 1) * 40,  # Adjust rank for page
                "Name": item.get("name", "N/A"),
                "Price": item.get("price", {}).get("priceString", "N/A")
            }
            products.append(product)

Data Export

Export the scraped rankings to a CSV file using pandas for easy analysis in tools like Excel or Jupyter Notebook.

python

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

# Define headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}

# Initialize product list
products = []

# Loop through pages (e.g., 1 and 2)
for page in range(1, 3):
    # Walmart search URL for "laptops"
    search_term = "laptop"
    url = f"https://www.walmart.com/search?q={search_term}&sort=best_seller&page={page}"

    # Send GET request
    response = requests.get(url, headers=headers)
    response.raise_for_status()

    # Parse HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, "html.parser")

    # Find the __NEXT_DATA__ script tag
    script_tag = soup.find("script", id="__NEXT_DATA__")
    if not script_tag:
        print(f"Could not find __NEXT_DATA__ on page {page}")
        continue

    # Extract and parse JSON data
    next_data = json.loads(script_tag.string)

    # Extract product data
    item_stacks = next_data["props"]["pageProps"]["searchResult"]["itemStacks"]
    for stack in item_stacks:
        for rank, item in enumerate(stack.get("items", []), 1):
            product = {
                "Rank": rank + (page - 1) * 40,
                "Name": item.get("name", "N/A"),
                "Price": item.get("price", {}).get("priceString", "N/A")
            }
            products.append(product)

# Export to CSV
df = pd.DataFrame(products)
df.to_csv("walmart_product_rankings.csv", index=False)
print("Data exported to walmart_product_rankings.csv")

This creates a walmart_product_rankings.csv file with Rank, Name, and Price columns.
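Once the file exists, you can load it back into pandas for analysis. Since the Price column holds formatted strings like "$1,299.99", a small converter to numeric values is handy. This is a sketch: the column names match the scraper above, but the sample rows below are hypothetical stand-ins for scraped data.

```python
import pandas as pd

def price_to_float(price_string):
    """Convert a price string like '$1,299.99' to a float; return None if unparseable."""
    try:
        return float(price_string.replace("$", "").replace(",", ""))
    except (ValueError, AttributeError):
        return None

# Hypothetical sample rows matching the scraper's CSV columns;
# in practice you would use pd.read_csv("walmart_product_rankings.csv")
df = pd.DataFrame([
    {"Rank": 1, "Name": "Laptop A", "Price": "$499.00"},
    {"Rank": 2, "Name": "Laptop B", "Price": "$1,299.99"},
    {"Rank": 3, "Name": "Laptop C", "Price": "N/A"},
])
df["PriceValue"] = df["Price"].apply(price_to_float)
print(df["PriceValue"].mean())  # mean of the parseable prices; "N/A" becomes NaN and is skipped
```

With a numeric column in place, aggregations like average price per rank band or price-vs-rank correlations become one-liners.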

Proxy Integration with OkeyProxy

Walmart’s anti-bot measures, such as CAPTCHAs and IP bans, can disrupt scraping. OkeyProxy’s residential proxies help bypass these by rotating IP addresses and mimicking human behavior.

Why Use Proxies?

 ● Avoid Blocks: Rotating IPs reduces detection risks.

 ● Geo-Spoofing: Access US-specific Walmart data.

 ● Scalability: Handle large-scale scraping without bans.
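If you manage a pool of proxy endpoints yourself rather than relying on a single rotating gateway, client-side rotation can be sketched with itertools.cycle. The endpoint URLs below are placeholders, not real OkeyProxy addresses; substitute the ones from your provider dashboard.

```python
from itertools import cycle

# Hypothetical proxy endpoints; replace with real ones from your provider
PROXY_URLS = [
    "http://user:pass@proxy1.example.com:8080",
    "http://user:pass@proxy2.example.com:8080",
    "http://user:pass@proxy3.example.com:8080",
]

proxy_pool = cycle(PROXY_URLS)

def next_proxies():
    """Return a requests-style proxies dict, advancing round-robin through the pool."""
    proxy_url = next(proxy_pool)
    return {"http": proxy_url, "https": proxy_url}

# Each call hands back the next endpoint in order
first = next_proxies()
second = next_proxies()
print(first["http"], second["http"])
```

Pass the returned dict as `proxies=` on each `requests.get` call so consecutive requests leave from different IPs.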

Integrating OkeyProxy

 1.  Get Your API Key: Sign up and copy your API key.

 2.  Update the Script: Add proxy settings to route requests through OkeyProxy.

python

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

# OkeyProxy configuration
OKEYPROXY_API_KEY = "YOUR_API_KEY"  # Replace with your OkeyProxy API key
proxy = {
    "http": f"http://{OKEYPROXY_API_KEY}:@proxy.okeyproxy.com:8080",
    "https": f"http://{OKEYPROXY_API_KEY}:@proxy.okeyproxy.com:8080"
}

# Define headers to mimic a browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}

# Initialize product list
products = []

# Loop through pages (e.g., 1 and 2)
for page in range(1, 3):
    # Walmart search URL for "laptops"
    search_term = "laptop"
    url = f"https://www.walmart.com/search?q={search_term}&sort=best_seller&page={page}"

    # Send GET request with proxy
    response = requests.get(url, headers=headers, proxies=proxy)
    response.raise_for_status()

    # Parse HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, "html.parser")

    # Find the __NEXT_DATA__ script tag
    script_tag = soup.find("script", id="__NEXT_DATA__")
    if not script_tag:
        print(f"Could not find __NEXT_DATA__ on page {page}")
        continue

    # Extract and parse JSON data
    next_data = json.loads(script_tag.string)

    # Extract product data
    item_stacks = next_data["props"]["pageProps"]["searchResult"]["itemStacks"]
    for stack in item_stacks:
        for rank, item in enumerate(stack.get("items", []), 1):
            product = {
                "Rank": rank + (page - 1) * 40,
                "Name": item.get("name", "N/A"),
                "Price": item.get("price", {}).get("priceString", "N/A")
            }
            products.append(product)

# Export to CSV
df = pd.DataFrame(products)
df.to_csv("walmart_product_rankings.csv", index=False)
print("Data exported to walmart_product_rankings.csv")

Replace YOUR_API_KEY with your OkeyProxy API key. This routes requests through OkeyProxy’s residential proxies, minimizing blocks.

Comparison Table: Manual Scraping vs. Proxy-Enabled vs. API-Based Approaches

 ● Manual Scraping. Pros: free, no external dependencies, simple for small-scale tasks. Cons: high risk of IP bans, frequent CAPTCHAs, limited scalability, manual effort. Ideal for one-time, small-scale scraping with minimal data needs.

 ● Proxy-Enabled Scraping. Pros: bypasses anti-bot measures, scalable, supports geo-spoofing, reliable with OkeyProxy. Cons: subscription cost, setup complexity, potential proxy configuration issues. Ideal for large-scale scraping, frequent requests, and region-specific data.

 ● API-Based Approaches. Pros: official data access, reliable, structured data, no anti-bot issues. Cons: limited data availability, high cost, restricted to API endpoints, requires authentication. Ideal for structured data needs, compliance-focused projects, and low volumes.

What is OkeyProxy?

OkeyProxy provides residential and datacenter proxies to streamline web scraping. Its features include:

 ● Rotating Residential Proxies: Cycles IPs to avoid detection.

 ● Global Coverage: Access US-specific Walmart data or other regions.

 ● Easy Integration: Simple API for Python, Scrapy, or Selenium.

 ● Scalability: Supports high-volume scraping with minimal downtime.

OkeyProxy is a reliable choice for bypassing anti-bot measures on sites like Walmart. Check out our website today!

FAQs

Here are answers to common questions about scraping Walmart product rankings, based on real user concerns:

1. Why do I keep getting blocked or see CAPTCHAs when scraping Walmart?

Walmart uses advanced anti-bot measures to detect automated requests. Without proxies, your IP may be flagged due to frequent requests. Using OkeyProxy’s rotating residential proxies reduces this risk by cycling IPs and mimicking human behavior. 
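Even with rotating proxies, individual requests occasionally fail, so a retry with exponential backoff is a common companion. Here is a generic sketch; the fetch function, retry count, and delays are placeholders to tune for your setup.

```python
import time

def fetch_with_retries(fetch, retries=3, base_delay=1.0):
    """Call fetch() until it succeeds, sleeping base_delay * 2**attempt between tries."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))

# Example: a flaky fetch that fails twice before succeeding
state = {"calls": 0}
def flaky_fetch():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("blocked")
    return "page HTML"

print(fetch_with_retries(flaky_fetch, retries=3, base_delay=0.01))  # page HTML
```

In the scraper, you would wrap each `requests.get` call in a small lambda and pass it to `fetch_with_retries`, ideally switching proxy IPs between attempts.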

2. How do I configure OkeyProxy correctly in my script?

Add your API key to the proxies dictionary exactly as shown in the integration section, verify the key is correct, and test with a single request before running the full scraper. If you encounter connection errors, check your internet connection or contact OkeyProxy support for the correct proxy endpoint details.
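As a concrete sanity check, you can build the proxies dictionary from your key in one helper and inspect it before making any requests. This sketch mirrors the URL format used in the integration section above; the default host and port are the ones shown there, but confirm them against your OkeyProxy dashboard.

```python
def make_okeyproxy_config(api_key, host="proxy.okeyproxy.com", port=8080):
    """Build a requests-compatible proxies dict in the format used in this tutorial."""
    proxy_url = f"http://{api_key}:@{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}

proxies = make_okeyproxy_config("YOUR_API_KEY")
print(proxies["https"])  # http://YOUR_API_KEY:@proxy.okeyproxy.com:8080
```

With the dict in hand, a single test request to a page that echoes your IP address confirms that traffic is actually routed through the proxy before you scrape at scale.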

3. Can I use this scraper for other Walmart data, like customer reviews?

Yes, but you’ll need to modify the script to target different JSON fields or page structures. For reviews, inspect Walmart’s product pages and locate the relevant __NEXT_DATA__ fields or use additional endpoints. Adjust the parsing logic accordingly, and ensure OkeyProxy is configured for product page URLs.
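When repurposing the scraper for other data, defensive navigation of the parsed __NEXT_DATA__ JSON avoids KeyErrors when Walmart's structure shifts. A small helper can walk an arbitrary path and fall back to a default; the nested payload below is illustrative, not Walmart's actual review schema.

```python
def deep_get(data, path, default=None):
    """Walk a nested dict/list structure along path, returning default on any miss."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current

# Hypothetical nested payload standing in for parsed __NEXT_DATA__
sample = {"props": {"pageProps": {"reviews": [{"rating": 5, "text": "Great laptop"}]}}}
print(deep_get(sample, ["props", "pageProps", "reviews", 0, "rating"]))         # 5
print(deep_get(sample, ["props", "pageProps", "missingField"], default="N/A"))  # N/A
```

Swapping the hard-coded `next_data["props"]["pageProps"]...` chains in the scraper for `deep_get` calls makes structure changes degrade gracefully instead of crashing mid-run.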

4. What should I do if the __NEXT_DATA__ script tag is missing?

This may occur if Walmart changes its page structure or your request is blocked. First, verify your proxy settings and headers. If the issue persists, use a headless browser like Selenium to render JavaScript-heavy pages. Alternatively, check Walmart’s API endpoints or contact OkeyProxy support for advanced troubleshooting.
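A defensive extraction helper makes the missing-tag case explicit instead of crashing mid-run. This sketch factors the tag lookup from the tutorial into a function that returns None when the payload is absent, exercised here on two minimal hand-written HTML samples.

```python
import json
from bs4 import BeautifulSoup

def extract_next_data(html):
    """Return the parsed __NEXT_DATA__ payload from an HTML string, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    script_tag = soup.find("script", id="__NEXT_DATA__")
    if not script_tag or not script_tag.string:
        return None
    return json.loads(script_tag.string)

# Minimal HTML samples for illustration
good_html = '<html><script id="__NEXT_DATA__">{"props": {"ok": true}}</script></html>'
bad_html = "<html><body>Robot check</body></html>"
print(extract_next_data(good_html))  # {'props': {'ok': True}}
print(extract_next_data(bad_html))   # None
```

Calling this on each fetched page lets the scraper log and skip blocked or restructured responses rather than raising deep inside the parsing code.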

5. How can businesses use Walmart product ranking data?

Businesses can analyze rankings to identify trending products, monitor competitor pricing, or optimize inventory. For example, e-commerce retailers can adjust pricing strategies, while market analysts can track category trends. Ensure data use complies with Walmart’s terms and local regulations.

Conclusion

Scraping Walmart product rankings unlocks valuable insights for data analysts, developers, and businesses, but it requires navigating anti-bot protections. 

With OkeyProxy’s rotating proxies, you can scale your scraper reliably and access US-specific data. Don't forget to review Walmart’s terms of service and consult legal professionals to ensure compliance.

Start scraping with OkeyProxy. Sign up today!