Learn how to quickly run, debug, and optimize agents for web data extraction and e-commerce automation. Includes step-by-step instructions, examples such as Amazon scraping and multi-platform price comparison, and model tuning strategies using GPT-4.1 and GPT-4.1-Mini.
Core Principle: For an easy start, run tasks first, then optimize based on the gap between results and expectations.
| Problem Type | Typical Symptoms | Specific Examples |
| --- | --- | --- |
| Step Errors/Redundancy | Invalid clicks, repetitive operations, retry loops | Repeatedly clicking the same product on Amazon listing pages, getting stuck in endless loops |
| Extraction Errors | Missing data, incorrect data | Required product "name/price/rating" fields missing or content clearly abnormal |
| Output Format Errors | Wrong file format, incorrect order | Need "name/price/rating" order but CSV exports as "price/name/rating" |
User Task Instructions:
Search Amazon for "wireless headphones" category, extract the first 10 products' names, prices, ratings, and links, then export as CSV file.
Problems Discovered:
Objective: Search Amazon for "wireless headphones", extract complete information for the first 10 products, complete within 30 steps.
Specific Requirements:
1. Search keyword: "wireless headphones"
2. Extract fields: Product Name, Price, Rating, Product Link
3. Product quantity: First 10 products
4. Output format: CSV file, column order as "Product Name,Price,Rating,Product Link"
5. Step limit: Complete within 30 steps
6. Data quality: Ensure each field has data or mark as "data missing"
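A quick way to measure the gap between these requirements and an actual run is to check the exported file automatically. The sketch below is one possible check, not part of any agent platform: the function name `check_csv` and the idea of passing the file content as a string are assumptions made for illustration. It verifies the column order, the product count, and that no field is left empty instead of being marked "data missing".

```python
import csv
import io

EXPECTED_HEADER = ["Product Name", "Price", "Rating", "Product Link"]
MISSING = "data missing"


def check_csv(text, expected_rows=10):
    """Return a list of human-readable problems found in the CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    problems = []
    header, data = rows[0], rows[1:]

    # Requirement 4: exact column order.
    if header != EXPECTED_HEADER:
        problems.append(f"header is {header}, expected {EXPECTED_HEADER}")

    # Requirement 3: number of products.
    if len(data) != expected_rows:
        problems.append(f"found {len(data)} products, expected {expected_rows}")

    # Requirement 6: every field has data or an explicit marker.
    for i, row in enumerate(data, start=1):
        if len(row) != len(EXPECTED_HEADER):
            problems.append(f"row {i} has {len(row)} fields")
            continue
        for name, value in zip(EXPECTED_HEADER, row):
            if not value.strip():
                problems.append(f"row {i}: '{name}' is empty, should be '{MISSING}'")
    return problems
```

An empty list means the output meets requirements 3, 4, and 6; any entries point to the instruction that needs tightening before the next run.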
You are a professional Amazon product data extraction specialist. You need to efficiently and accurately extract product information from the Amazon website.
【Extraction Process Optimization】
When on Amazon product listing pages:
① Load complete listing page → ② Click each product card once in sequence → ③ Extract title/price/rating/link → ④ Return to previous page and continue until all specified products collected
Constraints: Prohibited from clicking advertisement positions, prohibited from repeatedly clicking same product
【Data Extraction Standards】
When entering Amazon product detail pages:
① Enter detail page → ② Read product title, current price, overall rating, product link in sequence → ③ If any field "name/price/rating/link" is missing, read page section by section and search once more; if still not found, mark as "data missing"
Constraints: Prohibited from estimating values to fill gaps; all data must reference actual page data
【Output Format Control】
When exporting all extraction results as CSV file:
① Compile all records → ② Generate CSV file with header row first, column order strictly as "Product Name,Price,Rating,Product Link" → ③ Ensure data integrity and format consistency
Constraints: Replace line breaks or extra commas in fields with spaces first, standardize price to numeric format
【Quality Assurance】
- Each product must include 4 fields: name, price, rating, link
- Price format: Keep only numbers and decimal points, remove currency symbols
- Rating format: X.X/5.0 or data missing
- Link format: Complete Amazon product URL
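The output format and quality rules in these instructions can also be enforced after the run, as a safety net for whatever the agent returns. The following is a minimal sketch under stated assumptions: the raw record keys (`name`, `price`, `rating`, `link`), the example rating text, US-style thousands separators in prices, and the `https://www.amazon.<tld>/` URL shape are all illustrative guesses, not guarantees about Amazon pages.

```python
import re

MISSING = "data missing"


def clean_text(value):
    """Replace commas and line breaks with spaces, then collapse whitespace."""
    return re.sub(r"\s+", " ", value.replace(",", " ")).strip() or MISSING


def clean_price(value):
    """Keep only digits and the decimal point, e.g. '$1,299.00' -> '1299.00'."""
    match = re.search(r"\d+(?:\.\d+)?", value.replace(",", ""))
    return match.group(0) if match else MISSING


def clean_rating(value):
    """Format the first number found as 'X.X/5.0' if it lies in 0..5."""
    match = re.search(r"\d+(?:\.\d+)?", value)
    if not match:
        return MISSING
    score = float(match.group(0))
    return f"{score:.1f}/5.0" if 0 <= score <= 5 else MISSING


def is_amazon_url(value):
    """Assumed shape of a complete Amazon product URL."""
    return re.fullmatch(r"https://www\.amazon\.[a-z.]+/\S+", value) is not None


def normalize(record):
    """Map a raw record (hypothetical keys) to the four required CSV fields."""
    link = record.get("link", "")
    return {
        "Product Name": clean_text(record.get("name", "")),
        "Price": clean_price(record.get("price", "")),
        "Rating": clean_rating(record.get("rating", "")),
        "Product Link": link if is_amazon_url(link) else MISSING,
    }
```

Values that cannot be parsed are marked "data missing" rather than guessed, which matches the constraint against estimating values to fill gaps.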
You are a professional e-commerce price comparison analyst. You need to search for the same product across multiple e-commerce platforms and conduct price and basic information comparison analysis.
【Platform Access Strategy】
When accessing different e-commerce platforms:
① Visit specified platforms in sequence → ② Search using same keywords → ③ Select most relevant product for data extraction → ④ Record platform name, product information, and price
Constraints: Select only 1 best matching product per platform, avoid staying too long on single platform
【Data Standardization】
When extracting data from different platforms:
① Standardize product name format (remove platform-specific identifiers) → ② Standardize price format (unified currency) → ③ Record platform-specific information (shipping methods, promotional info)
Constraints: Prices must be converted to numeric format, platform names use standard abbreviations
【Comparative Analysis】
When all platform data collection is complete:
① Organize all platform data → ② Calculate price differences and percentages → ③ Mark lowest and highest price platforms → ④ Generate comparison report
Constraints: Price comparisons must be based on same or similar products, note product differences
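The comparative analysis step is simple arithmetic, so it is easy to verify the agent's report independently. A minimal sketch, assuming prices have already been converted to numbers in one shared currency and are greater than zero; the function name `compare_prices` and the result layout are hypothetical.

```python
def compare_prices(prices):
    """Compare platforms; prices maps platform name -> numeric price (> 0)."""
    if not prices:
        raise ValueError("no prices to compare")

    lowest = min(prices, key=prices.get)
    highest = max(prices, key=prices.get)
    base = prices[lowest]

    rows = []
    # Sort cheapest first and express each gap relative to the lowest price.
    for platform, price in sorted(prices.items(), key=lambda item: item[1]):
        diff = price - base
        rows.append({
            "platform": platform,
            "price": price,
            "diff": round(diff, 2),
            "diff_pct": round(diff / base * 100, 1),
        })
    return {"lowest": lowest, "highest": highest, "rows": rows}
```

With placeholder figures such as `{"Amazon": 100.0, "eBay": 90.0, "Best Buy": 99.0}` (not real prices), the result marks eBay as lowest, Amazon as highest, and reports Amazon as 11.1% above the lowest price.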
Task: Compare "iPhone 15 Pro 128GB" price information across Amazon, eBay, and Best Buy.
Execution Steps:
1. Visit Amazon, eBay, and Best Buy respectively
2. Search for "iPhone 15 Pro 128GB"
3. Select most matching product and extract information
4. Record: Platform name, product title, price, seller name, shipping info
5. Generate price comparison table
6. Mark best choice and price difference percentages
Output: CSV file
Step Limit: Complete within 45 steps
| Task Type | Recommended Model | Temperature Range | Reason |
| --- | --- | --- | --- |
| Data Extraction (Consistency) | GPT-4.1-Mini | 0~0.3 | Low randomness, ensures data accuracy |
| Multi-site Complex Parsing | GPT-4.1 | 0.3~0.6 | Moderate flexibility, handles page variations |
| Analysis Report Generation | GPT-4.1 | 0.6~0.8 | Higher creativity, generates deep analysis |
| High Uncertainty Scenarios | GPT-4.1 | 0.7~1 | High randomness, improves diversity |
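If agent settings are kept in code or config files, the recommendations above can be turned into a small lookup that flags settings outside the suggested range. This is only a sketch: the task-type keys and the `check_settings` helper are names invented for illustration, and the values are transcribed from the table.

```python
# Recommended (model, min temperature, max temperature) per task type,
# transcribed from the table; the dictionary keys are hypothetical labels.
RECOMMENDED = {
    "data_extraction": ("GPT-4.1-Mini", 0.0, 0.3),
    "multi_site_parsing": ("GPT-4.1", 0.3, 0.6),
    "analysis_report": ("GPT-4.1", 0.6, 0.8),
    "high_uncertainty": ("GPT-4.1", 0.7, 1.0),
}


def check_settings(task_type, model, temperature):
    """Return warnings; an empty list means the settings match the table."""
    rec_model, low, high = RECOMMENDED[task_type]
    warnings = []
    if model != rec_model:
        warnings.append(f"model {model} differs from recommended {rec_model}")
    if not low <= temperature <= high:
        warnings.append(f"temperature {temperature} is outside {low}-{high}")
    return warnings
```

For example, `check_settings("data_extraction", "GPT-4.1-Mini", 0.2)` returns an empty list, while a temperature of 0.8 for the same task produces a warning.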
Use GPT-4.1-Mini (10 credits/step):
Use GPT-4.1 (30 credits/step):
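Combining these per-step rates with a task's step limit gives a rough cost ceiling for a single run. The sketch below assumes that credits are charged strictly per step and that a run never exceeds its step limit; both are simplifying assumptions, and `max_credits` is a hypothetical helper.

```python
# Credits per step, as quoted for each model.
CREDITS_PER_STEP = {"GPT-4.1-Mini": 10, "GPT-4.1": 30}


def max_credits(model, step_limit):
    """Upper bound on credits for one run that stays within its step limit."""
    return CREDITS_PER_STEP[model] * step_limit


# Amazon extraction task: 30-step limit on GPT-4.1-Mini.
print(max_credits("GPT-4.1-Mini", 30))  # 300
# Price comparison task: 45-step limit on GPT-4.1.
print(max_credits("GPT-4.1", 45))       # 1350
```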
Initial State:
First Run Results:
First Optimization: Added step control system instructions:
When on Amazon product listing pages:
① Load listing page → ② Click each product once in sequence → ③ Extract info then return → ④ Avoid repetitive operations
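Whether a step-control instruction like this one worked can be checked from the run's action history. The sketch below assumes the history is available as a list of `(action, target)` pairs; that log format and the `find_repeated_clicks` helper are hypothetical and will need adapting to whatever the platform actually exports.

```python
from collections import Counter


def find_repeated_clicks(actions, max_repeats=1):
    """Return targets clicked more than max_repeats times.

    actions is a list of (action, target) tuples from a hypothetical run log.
    """
    clicks = Counter(target for action, target in actions if action == "click")
    return {target: count for target, count in clicks.items() if count > max_repeats}


log = [
    ("click", "product-1"),
    ("extract", "product-1"),
    ("back", ""),
    ("click", "product-1"),  # same product clicked again
    ("click", "product-2"),
]
print(find_repeated_clicks(log))  # {'product-1': 2}
```

An empty result after the optimized run suggests the repeated-click problem is resolved.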
Second Run Results:
Second Optimization: Added data quality control:
When extracting product information:
① Product name: Complete title → ② Price: Numeric format only → ③ Rating: X.X format → ④ Mark missing data as "N/A"
Third Run Results:
Final Optimization: Added output format control:
When exporting CSV:
① Header: Product Name,Price,Rating,Link → ② Data cleaning: Remove special characters → ③ Format validation: Ensure 4 fields per row
Final Results:
System Instructions: You are an e-commerce price tracking specialist...
Temperature: 0.2 (Low - for consistent data extraction)
Model: GPT-4.1-Mini (Cost-effective for structured tasks)
System Instructions: You are a content research analyst...
Temperature: 0.6 (Medium - for balanced analysis)
Model: GPT-4.1 (Better for complex reasoning)
System Instructions: You are a creative content strategist...
Temperature: 0.8 (High - for diverse creative outputs)
Model: GPT-4.1 (Better for nuanced creativity)
Remember: The goal is to systematically bridge the gap between expectation and reality through optimization, not to create perfect instructions from the start.