Price Monitoring with Web Scraping: Tracking Competitor Prices

Price monitoring is one of the most common use cases for web scraping. This article walks through building a price-tracking system.

Use Cases

  • E-commerce: track competitor prices
  • Dropshipping: monitor supplier prices
  • Consumers: price drop alerts
  • Market research: price trend analysis

Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Scraper   │────▶│   Database  │────▶│  Dashboard  │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │
       ▼                   ▼
┌─────────────┐     ┌─────────────┐
│    Proxy    │     │   Alerts    │
└─────────────┘     └─────────────┘
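
The Proxy box in the diagram sits between the scraper and the target site. A minimal sketch of that hop, assuming an HTTP proxy with optional username/password authentication: the host `proxy.example.com`, the port, and the credentials are placeholders, and `build_proxies` and `fetch_via_proxy` are hypothetical helper names, not part of any provider's API.

```python
from urllib.parse import quote


def build_proxies(host, port, username=None, password=None):
    """Return a mapping in the format accepted by requests.get(..., proxies=...)."""
    if username and password:
        # Percent-encode credentials so characters like '@' or ':' stay valid in the URL
        auth = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    else:
        auth = ""
    proxy_url = f"http://{auth}{host}:{port}"
    # The same HTTP proxy endpoint is used for both plain and TLS traffic
    return {"http": proxy_url, "https": proxy_url}


def fetch_via_proxy(url, proxies, timeout=10):
    """Fetch a page through the proxy; raises on network errors and HTTP error codes."""
    import requests  # imported lazily so build_proxies works without requests installed

    response = requests.get(url, proxies=proxies, timeout=timeout)
    response.raise_for_status()
    return response.text


# Placeholder endpoint and credentials, for illustration only
proxies = build_proxies("proxy.example.com", 8080, "user", "p@ss")
print(proxies["https"])  # http://user:p%40ss@proxy.example.com:8080
```

Keeping the proxy settings in one mapping means the scraper can switch providers or rotate endpoints without touching the parsing code.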

Scraper Code

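import sqlite3
from datetime import datetime

import requests
from bs4 import BeautifulSoup

def scrape_product(url):
    # A timeout keeps one slow site from hanging the whole run
    response = requests.get(url, headers={'User-Agent': '...'}, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'lxml')

    # These CSS selectors are site-specific; adjust them for each target site
    name_el = soup.select_one('.product-name')
    price_el = soup.select_one('.price')
    if name_el is None or price_el is None:
        raise ValueError(f'Name or price element not found: {url}')

    # Strip the currency symbol and thousands separators, e.g. "$1,299.00"
    price_text = price_el.text.replace('$', '').replace(',', '').strip()

    return {
        'url': url,
        'name': name_el.text.strip(),
        'price': float(price_text),
        'timestamp': datetime.now().isoformat()
    }

def save_price(product):
    # Assumes the products and prices tables from the Database Schema section exist
    conn = sqlite3.connect('prices.db')
    try:
        cursor = conn.cursor()

        # Register the product once (url is UNIQUE), then look up its id
        cursor.execute('''
            INSERT OR IGNORE INTO products (url, name)
            VALUES (?, ?)
        ''', (product['url'], product['name']))
        cursor.execute('SELECT id FROM products WHERE url = ?',
                       (product['url'],))
        product_id = cursor.fetchone()[0]

        cursor.execute('''
            INSERT INTO prices (product_id, price, timestamp)
            VALUES (?, ?, ?)
        ''', (product_id, product['price'], product['timestamp']))

        conn.commit()
    finally:
        conn.close()

# Monitor multiple products
urls = [
    'https://shop.com/product-1',
    'https://shop.com/product-2',
]

for url in urls:
    try:
        product = scrape_product(url)
    except (requests.RequestException, ValueError) as exc:
        # Skip this product but keep monitoring the others
        print(f'Skipped {url}: {exc}')
        continue
    save_price(product)
    print(f"{product['name']}: ${product['price']:.2f}")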
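
Price text on real pages is rarely a bare number, so the parsing step is worth isolating. A minimal sketch of a more tolerant parser, assuming US-style formatting (`,` for thousands, `.` for decimals); `parse_price` is a hypothetical helper name, and returning `None` instead of raising is a design choice made here so callers can handle a missing price explicitly.

```python
import re

# A number with optional thousands separators and decimals, e.g. "1,299.00" or "19.5"
_PRICE_RE = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?")


def parse_price(text):
    """Return the first price found in text as a float, or None if there is none."""
    if not text:
        return None
    match = _PRICE_RE.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting
    return float(match.group(0).replace(",", ""))


print(parse_price("$1,299.00"))       # 1299.0
print(parse_price("Now only $19.5"))  # 19.5
print(parse_price("Out of stock"))    # None
```

Sites that format prices European-style (`1.299,00`) would need a different pattern; this sketch does not try to guess the locale.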

Database Schema

CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    url TEXT UNIQUE,
    name TEXT
);

CREATE TABLE prices (
    id INTEGER PRIMARY KEY,
    product_id INTEGER,
    price REAL,
    timestamp DATETIME,
    FOREIGN KEY (product_id) REFERENCES products(id)
);

CREATE INDEX idx_prices_timestamp ON prices(timestamp);
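
To show how the two tables work together, here is a minimal sketch that loads the schema into an in-memory SQLite database and summarizes one product's history. The helper name `price_summary` and the sample rows are made up for illustration; timestamps are stored as ISO 8601 strings, which sort chronologically as text.

```python
import sqlite3

SCHEMA = """
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    url TEXT UNIQUE,
    name TEXT
);
CREATE TABLE prices (
    id INTEGER PRIMARY KEY,
    product_id INTEGER,
    price REAL,
    timestamp DATETIME,
    FOREIGN KEY (product_id) REFERENCES products(id)
);
CREATE INDEX idx_prices_timestamp ON prices(timestamp);
"""


def price_summary(conn, url):
    """Return the lowest, highest, and latest price for a URL, or None if unknown."""
    row = conn.execute("SELECT id FROM products WHERE url = ?", (url,)).fetchone()
    if row is None:
        return None
    product_id = row[0]
    low, high = conn.execute(
        "SELECT MIN(price), MAX(price) FROM prices WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    latest = conn.execute(
        "SELECT price FROM prices WHERE product_id = ? "
        "ORDER BY timestamp DESC, id DESC LIMIT 1",
        (product_id,),
    ).fetchone()
    return {"low": low, "high": high, "latest": latest[0] if latest else None}


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Illustrative sample data, not real prices
url = "https://shop.com/product-1"
conn.execute("INSERT INTO products (url, name) VALUES (?, ?)", (url, "Sample product"))
conn.executemany(
    "INSERT INTO prices (product_id, price, timestamp) VALUES (1, ?, ?)",
    [
        (100.0, "2024-01-01T09:00:00"),
        (120.0, "2024-01-02T09:00:00"),
        (90.0, "2024-01-03T09:00:00"),
    ],
)

print(price_summary(conn, url))  # {'low': 90.0, 'high': 120.0, 'latest': 90.0}
```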

Price Change Detection

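import sqlite3

def check_price_change(url, new_price):
    # Call this before saving new_price; otherwise the latest stored
    # row is new_price itself and no change is ever reported
    conn = sqlite3.connect('prices.db')
    try:
        cursor = conn.cursor()

        # prices has no url column, so resolve the product through products
        cursor.execute('''
            SELECT p.price FROM prices AS p
            JOIN products AS pr ON pr.id = p.product_id
            WHERE pr.url = ?
            ORDER BY p.timestamp DESC LIMIT 1
        ''', (url,))

        row = cursor.fetchone()
    finally:
        conn.close()

    if row:
        old_price = row[0]
        # A zero baseline would cause a division by zero, so skip it
        if old_price and new_price != old_price:
            change = ((new_price - old_price) / old_price) * 100
            return {
                'old': old_price,
                'new': new_price,
                'change_pct': change
            }
    return None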
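
The order of operations matters for change detection: the comparison has to read the previous price before the new one is stored. A minimal self-contained sketch of that sequence on an in-memory database; `latest_price`, `record_price`, and `observe` are hypothetical helper names, and the prices and timestamps are made up for illustration.

```python
import sqlite3


def latest_price(conn, url):
    """Return the most recently stored price for url, or None if there is none."""
    row = conn.execute(
        """
        SELECT p.price FROM prices AS p
        JOIN products AS pr ON pr.id = p.product_id
        WHERE pr.url = ?
        ORDER BY p.timestamp DESC, p.id DESC LIMIT 1
        """,
        (url,),
    ).fetchone()
    return row[0] if row else None


def record_price(conn, url, name, price, timestamp):
    """Store one observation, creating the product row on first sight."""
    conn.execute(
        "INSERT OR IGNORE INTO products (url, name) VALUES (?, ?)", (url, name)
    )
    product_id = conn.execute(
        "SELECT id FROM products WHERE url = ?", (url,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO prices (product_id, price, timestamp) VALUES (?, ?, ?)",
        (product_id, price, timestamp),
    )
    conn.commit()


def observe(conn, url, name, price, timestamp):
    """Compare against the last stored price first, then store the new one."""
    previous = latest_price(conn, url)
    record_price(conn, url, name, price, timestamp)
    if not previous or previous == price:
        return None  # first observation, zero baseline, or unchanged price
    return {
        "old": previous,
        "new": price,
        "change_pct": ((price - previous) / previous) * 100,
    }


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, url TEXT UNIQUE, name TEXT);
    CREATE TABLE prices (
        id INTEGER PRIMARY KEY, product_id INTEGER, price REAL, timestamp DATETIME,
        FOREIGN KEY (product_id) REFERENCES products(id)
    );
    """
)

url = "https://shop.com/product-1"
first = observe(conn, url, "Sample product", 100.0, "2024-01-01T09:00:00")
same = observe(conn, url, "Sample product", 100.0, "2024-01-02T09:00:00")
drop = observe(conn, url, "Sample product", 85.0, "2024-01-03T09:00:00")

print(first, same)
print(f"{drop['old']} -> {drop['new']} ({drop['change_pct']:.1f}%)")
```

The first two calls report no change; the third reports a drop from 100.0 to 85.0, which is 15% below the previous price.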

Alert System

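from textwrap import dedent

def send_alert(product, price_change, threshold_pct=-10):
    # Alert only when the price fell by more than 10%
    if price_change['change_pct'] < threshold_pct:
        # dedent removes the source indentation from the message body
        message = dedent(f"""\
            Price Drop Alert!
            Product: {product['name']}
            Old: ${price_change['old']:.2f}
            New: ${price_change['new']:.2f}
            Change: {price_change['change_pct']:.1f}%
        """)
        # Replace print with an email/Telegram/Slack notification
        print(message)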
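
For the email variant of the notification step, the standard library is enough. A minimal sketch, assuming an SMTP server that supports STARTTLS; `build_alert_email` and `send_email` are hypothetical helper names, and the addresses, host, port, and credentials are placeholders to replace with real values.

```python
import smtplib
from email.message import EmailMessage


def build_alert_email(product, price_change, sender, recipient):
    """Build the alert as an EmailMessage without sending it."""
    msg = EmailMessage()
    msg["Subject"] = (
        f"Price drop: {product['name']} ({price_change['change_pct']:.1f}%)"
    )
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Product: {product['name']}\n"
        f"URL: {product['url']}\n"
        f"Old: ${price_change['old']:.2f}\n"
        f"New: ${price_change['new']:.2f}\n"
        f"Change: {price_change['change_pct']:.1f}%\n"
    )
    return msg


def send_email(msg, host, port, username, password):
    """Send msg over SMTP with STARTTLS; all connection details are placeholders."""
    with smtplib.SMTP(host, port, timeout=10) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)


# Illustrative values only
product = {"name": "Widget", "url": "https://shop.com/product-1"}
price_change = {"old": 100.0, "new": 85.0, "change_pct": -15.0}
msg = build_alert_email(
    product, price_change, "alerts@example.com", "me@example.com"
)
print(msg["Subject"])  # Price drop: Widget (-15.0%)
```

Separating message construction from delivery keeps the formatting testable without a mail server, and the same message text can be reused for Telegram or Slack.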

Best Practices

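  • Scrape at consistent intervals so price histories are comparable
  • Store historical data for trend analysis
  • Handle a missing price gracefully instead of crashing the run
  • Detect website layout changes, which silently break selectors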
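
One simple way to notice a layout change is to count consecutive runs in which no price could be extracted for a URL. The sketch below is a heuristic, not a definitive detector: the class name `LayoutChangeDetector` and the threshold of three misses are assumptions chosen for illustration, and a run of misses can also mean the product is out of stock.

```python
class LayoutChangeDetector:
    """Flags a probable layout change after several consecutive extraction misses."""

    def __init__(self, max_misses=3):
        self.max_misses = max_misses
        self.misses = {}  # url -> consecutive runs where no price was extracted

    def record(self, url, price):
        """Record one scrape result; return True when the layout looks broken."""
        if price is None:
            self.misses[url] = self.misses.get(url, 0) + 1
        else:
            # Any successful extraction resets the counter
            self.misses[url] = 0
        return self.misses[url] >= self.max_misses


detector = LayoutChangeDetector(max_misses=3)
url = "https://shop.com/product-1"

# Illustrative sequence: one good scrape followed by three misses
flags = [detector.record(url, price) for price in (19.99, None, None, None)]
print(flags)  # [False, False, False, True]
```

When the detector fires, alerting a human to re-check the selectors is usually safer than letting the scraper keep writing gaps into the price history.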

VinaProxy + Price Monitoring

  • Monitor competitors 24/7 without getting blocked
  • Residential IPs for e-commerce sites
  • Priced at just $0.5/GB
