[Error] "This user agent has been blocked due to abuse": fixing a crawler IP block
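I. What the error looks like
A quick note for the record: while working on my graduation project today, I was crawling records from crt.sh and found that my IP had been blocked.
requests.get('https://crt.sh/?q=' + domain, headers=headers, verify=False)
This is what I got back. The requests call returned: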
This user agent has been blocked due to abuse. Can we interest you direct access to the crt.sh DB to fetch the data you want? https://groups.google.com/g/crtsh/c/sUmV0mBz8bQ/m/K-6Vymd_AAAJ
I also tried requesting the page manually.
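So that the crawler notices this situation instead of treating the notice as search results, the response body can be checked for the block message. This is only a minimal sketch, assuming the notice always contains the first sentence quoted above; `BLOCK_MARKER` and `is_blocked` are hypothetical names, not part of requests or crt.sh.

```python
# Hypothetical helper: detect the crt.sh block notice in a response body.
# Assumption: the notice always contains the sentence below.
BLOCK_MARKER = "This user agent has been blocked due to abuse"


def is_blocked(body: str) -> bool:
    """Return True if the response body contains the block notice."""
    return BLOCK_MARKER.lower() in body.lower()


if __name__ == "__main__":
    notice = "This user agent has been blocked due to abuse. Can we interest you ..."
    print(is_blocked(notice))
    print(is_blocked("<html>certificate list</html>"))
```

With requests, the check would be applied to the `.text` attribute of the response returned by `requests.get(...)`.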
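II. Cause of the error
The crawler's requests got my IP blocked.
III. Solutions
1. Reset the PPPoE router so the machine is automatically assigned a new public IP (the direct way)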
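from urllib.request import urlopen

def public():  # print this machine's public IP address
    # the service answers with the caller's address as plain text
    with urlopen(r'http://ip.42.pl/raw') as fd:
        f = fd.read().decode()
    print(f)

if __name__ == '__main__':
    public()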
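The code above prints the machine's current public IP address, so it can be run before and after the reset to confirm that the address changed.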
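After the router reset the connection is down for a while, so the lookup may need a few retries before it shows a new address. The polling loop below is a sketch under assumptions: `wait_for_new_ip` is a hypothetical helper, and `get_ip` stands for any function that returns the public IP as a string (for example a variant of `public()` that returns the value instead of printing it).

```python
import time
from typing import Callable, Optional


def wait_for_new_ip(get_ip: Callable[[], str], old_ip: str,
                    attempts: int = 10, delay: float = 5.0) -> Optional[str]:
    """Poll get_ip() until it returns an address different from old_ip.

    Returns the new address, or None if it did not change within `attempts` tries.
    """
    for attempt in range(attempts):
        try:
            current = get_ip().strip()
        except OSError:
            # Network errors (including urllib's URLError) are expected
            # while the router is still redialing.
            current = ""
        if current and current != old_ip:
            return current
        if attempt < attempts - 1:
            time.sleep(delay)
    return None


if __name__ == "__main__":
    # Stand-in for a real lookup, using documentation-only addresses.
    answers = iter(["203.0.113.5", "203.0.113.5", "198.51.100.7"])
    print(wait_for_new_ip(lambda: next(answers), "203.0.113.5", delay=0))
```

Passing the lookup in as a function keeps the loop independent of any particular IP echo service.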
2. Spoof the User-Agent
1. Code
import random
import socket
import struct

from fake_useragent import UserAgent  # pip install fake-useragent

HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'User-Agent': "",       # filled in by get_ua()
    'Referer': 'https://www.google.com',
    'X-Forwarded-For': "",  # filled in by get_ua()
    'X-Real-IP': "",        # filled in by get_ua()
    'Connection': 'keep-alive',
}

def get_ua():
    ua = UserAgent()
    # pack a random 32-bit integer into a dotted-quad IPv4 address
    ip = socket.inet_ntoa(struct.pack('>I', random.randint(1, 0xffffffff)))
    HEADERS["User-Agent"] = ua.random
    HEADERS["X-Forwarded-For"] = HEADERS["X-Real-IP"] = ip
    # the same headers as a list of "Name: value" strings; built but not returned
    pyHEADERS = [
        'User-Agent: {}'.format(ua.random),
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Referer: https://www.google.com', 'X-Forwarded-For: {}'.format(ip),
        'X-Real-IP: {}'.format(ip), 'Connection: close'
    ]
    # return a copy so that a later call does not overwrite an earlier result
    return dict(HEADERS)

if __name__ == "__main__":
    print(get_ua())
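Output:
Four calls produced four different sets of request headers.
2. About User-Agent
User-Agent, or UA for short, is the user agent string. It lets the server identify the client's operating system and version, CPU type, browser and version, rendering engine, browser language, browser plugins, and so on.
Some common user agents: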
user_agents = [
    'Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1',
    'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50',
    'Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11',
]
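If installing fake-useragent is not an option, a fixed list like this one can be rotated using only the standard library. The sketch below reuses the three strings from the list above; `USER_AGENTS` and `pick_headers` are hypothetical names chosen for illustration.

```python
import random

# The three sample strings from the list above.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-us) AppleWebKit/534.50 "
    "(KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11",
]


def pick_headers() -> dict:
    """Return request headers with a randomly chosen User-Agent."""
    return {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "User-Agent": random.choice(USER_AGENTS),
    }


if __name__ == "__main__":
    print(pick_headers())
```

The returned dict can be passed as the `headers=` argument of `requests.get`, in the same way as in the request at the top of this post.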