SEO Monitoring with Web Scraping: Tracking Keyword Rankings


Tracking keyword rankings is something you need to do regularly. This article walks through building a rank tracker with scraping.

Why Rank Tracking?

  • Measure SEO effectiveness
  • Detect ranking drops early
  • Monitor competitors
  • Track multiple keywords at once

Important Warning

⚠️ Google may block you if you scrape too aggressively. Use proxies and delays!

Basic Rank Checker

import random
import time
from urllib.parse import quote_plus

import requests
from bs4 import BeautifulSoup

def check_rank(keyword, domain, max_pages=5):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)...'
    }
    # Both keys are needed so the proxy is also used for https:// URLs
    proxies = {
        'http': 'http://proxy.vinaproxy.com:8080',
        'https': 'http://proxy.vinaproxy.com:8080',
    }
    
    for page in range(max_pages):
        start = page * 10
        # quote_plus() encodes spaces and accented characters in the query
        url = f'https://www.google.com/search?q={quote_plus(keyword)}&start={start}'
        
        response = requests.get(url, headers=headers, proxies=proxies)
        soup = BeautifulSoup(response.text, 'lxml')
        
        results = soup.select('div.g')
        for i, result in enumerate(results):
            link = result.select_one('a')
            if link and domain in link.get('href', ''):
                return (page * 10) + i + 1  # 1-based position across pages
        
        time.sleep(random.uniform(2, 5))  # random delay between result pages
    
    return None  # not in the top max_pages * 10 results

# Usage
rank = check_rank('proxy việt nam', 'vinaproxy.com')
print(f"Ranking: #{rank}" if rank else "Not found in top 50")
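Since Google can temporarily block a scraper (see the warning above), it helps to wrap rank checks in a retry with exponential backoff. This is a hedged sketch: `with_retries` is a hypothetical helper, not part of any library, and the delays are illustrative.

```python
import random
import time

def with_retries(fn, max_attempts=4, base_delay=2.0):
    """Call fn(); on failure, wait with exponential backoff and retry.

    Hypothetical helper: fn is any zero-argument callable, e.g.
    lambda: check_rank('proxy việt nam', 'vinaproxy.com').
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # base_delay, 2x, 4x... plus jitter so retries don't align
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

Used as `with_retries(lambda: check_rank('proxy việt nam', 'vinaproxy.com'))`, a transient block or network error no longer kills the whole run.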

Track Multiple Keywords

import csv
import time
from datetime import datetime

keywords = [
    'proxy việt nam',
    'residential proxy',
    'web scraping python',
]

results = []
for kw in keywords:
    rank = check_rank(kw, 'vinaproxy.com')
    results.append({
        'keyword': kw,
        'rank': rank,
        'date': datetime.now().strftime('%Y-%m-%d')
    })
    print(f"{kw}: #{rank}")
    time.sleep(5)  # Important delay!

# Append to CSV (write the header only once, when the file is new)
import os

write_header = not os.path.exists('rankings.csv')
with open('rankings.csv', 'a', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['date', 'keyword', 'rank'])
    if write_header:
        writer.writeheader()
    writer.writerows(results)
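Storing historical data only pays off if you compare snapshots. A minimal sketch of drop detection on top of the CSV data above; `rank_changes` is a hypothetical helper and the snapshot format (`{keyword: rank}`) is an assumption:

```python
def rank_changes(previous, current):
    """Compare two {keyword: rank} snapshots; rank None = not ranked.

    Returns {keyword: delta}, where a positive delta means the keyword
    moved up (e.g. #8 -> #3 gives +5). Hypothetical helper built on the
    rankings.csv data saved above.
    """
    changes = {}
    for kw, new_rank in current.items():
        old_rank = previous.get(kw)
        if old_rank is None or new_rank is None:
            changes[kw] = None  # newly ranked, or dropped out entirely
        else:
            changes[kw] = old_rank - new_rank
    return changes
```

A large negative delta (or a sudden `None`) is exactly the early-warning signal rank tracking is for.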

SERP Feature Detection

def analyze_serp(keyword):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)...'
    }
    proxies = {
        'http': 'http://proxy.vinaproxy.com:8080',
        'https': 'http://proxy.vinaproxy.com:8080',
    }
    
    # Fetch the SERP; params= lets requests URL-encode the keyword
    response = requests.get('https://www.google.com/search',
                            params={'q': keyword},
                            headers=headers, proxies=proxies)
    soup = BeautifulSoup(response.text, 'lxml')
    
    # Presence of a selector signals the corresponding SERP feature
    features = {
        'featured_snippet': soup.select_one('.kp-blk') is not None,
        'people_also_ask': soup.select_one('.related-question-pair') is not None,
        'local_pack': soup.select_one('.VkpGBb') is not None,
        'images': soup.select_one('.rg_meta') is not None,
        'videos': soup.select_one('.RzdJxc') is not None
    }
    
    return features

Competitor Tracking

competitors = ['competitor1.com', 'competitor2.com']

def compare_rankings(keyword, domains):
    rankings = {}
    for domain in domains:
        rank = check_rank(keyword, domain)
        rankings[domain] = rank
        time.sleep(3)
    return rankings

# Compare
comparison = compare_rankings('proxy việt nam', 
                              ['vinaproxy.com'] + competitors)
print(comparison)
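The raw dict from `compare_rankings()` is easier to read sorted best-first, with unranked domains (rank `None`) pushed to the end. A small sketch; `sort_by_rank` is a hypothetical helper:

```python
def sort_by_rank(rankings):
    """Order a {domain: rank} dict best-first; unranked (None) goes last.

    Hypothetical helper for the compare_rankings() output above.
    The sort key is a tuple: (is_unranked, rank), so all ranked
    domains come first, ordered by position.
    """
    return sorted(rankings.items(),
                  key=lambda item: (item[1] is None, item[1]))
```

For example, `sort_by_rank({'a.com': None, 'b.com': 2, 'c.com': 1})` yields `[('c.com', 1), ('b.com', 2), ('a.com', None)]`.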

Best Practices

  • Track weekly, not daily
  • Use residential proxies
  • Long delays (5-10s) between requests
  • Rotate User-Agents
  • Store historical data
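The User-Agent rotation practice above can be sketched as follows. The UA strings are illustrative examples (keep a pool of current ones), and `random_headers` is a hypothetical helper:

```python
import random

# A small pool of desktop User-Agent strings (illustrative; keep these current)
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36',
]

def random_headers():
    """Pick a random User-Agent per request to vary the fingerprint."""
    return {'User-Agent': random.choice(USER_AGENTS)}
```

Pass the result to each request, e.g. `requests.get(url, headers=random_headers(), proxies=proxies)`, so consecutive requests don't all share one UA.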

VinaProxy + SEO Monitoring

  • Residential IPs for Google scraping
  • Geo-targeted rankings
  • Only $0.5/GB

Try It Now →