Probing Website Information

Identifying the technology used by a website

We need to install the builtwith module first:

pip install builtwith

import builtwith
# Download the page and report the technologies detected on it
builtwith.parse('http://example.webscraping.com')

This returns the following information:

{u'javascript-frameworks': [u'jQuery', u'Modernizr', u'jQuery UI'],
 u'web-frameworks': [u'Web2py', u'Twitter Bootstrap'],
 u'programming-languages': [u'Python'],
 u'web-servers': [u'Nginx']}

This module takes a URL, downloads and analyzes the page, and returns
the technologies used by the website, grouped by category.
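
To make the "download and analyze" step more concrete, here is a minimal sketch of signature-based detection, the general idea behind tools of this kind. It is not builtwith's actual implementation: the SIGNATURES and SERVER_SIGNATURES tables, the detect function, and the sample headers and HTML are all hypothetical and hand-written for illustration, and a real tool relies on a far larger rule set. The sketch works on an already-downloaded response so that it runs offline.

```python
import re

# Hypothetical, hand-written signatures for illustration only.
# Each category maps a technology name to a pattern searched in the HTML.
SIGNATURES = {
    "javascript-frameworks": {
        "jQuery": re.compile(r"jquery[.\-\d]*(?:\.min)?\.js", re.I),
        "Modernizr": re.compile(r"modernizr[.\-\d]*(?:\.min)?\.js", re.I),
    },
    "web-frameworks": {
        "Twitter Bootstrap": re.compile(
            r"bootstrap[.\-\d]*(?:\.min)?\.(?:css|js)", re.I
        ),
    },
}

# Patterns matched against the Server response header (also hypothetical).
SERVER_SIGNATURES = {"Nginx": re.compile(r"nginx", re.I)}


def detect(headers, html):
    """Return {category: [technology, ...]} for every signature that matches."""
    found = {}
    for category, rules in SIGNATURES.items():
        names = [name for name, pattern in rules.items() if pattern.search(html)]
        if names:
            found[category] = names
    server = headers.get("Server", "")
    servers = [name for name, pattern in SERVER_SIGNATURES.items()
               if pattern.search(server)]
    if servers:
        found["web-servers"] = servers
    return found


# Made-up response data standing in for a downloaded page.
sample_headers = {"Server": "nginx/1.10.0"}
sample_html = (
    '<script src="/static/js/jquery-1.11.0.min.js"></script>'
    '<link href="/static/css/bootstrap.min.css" rel="stylesheet">'
)
result = detect(sample_headers, sample_html)
print(result)
```

The result has the same shape as the dictionary shown above: category names as keys and lists of technology names as values.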

Finding the owner of a website

You need to install the python-whois module:

pip install python-whois

import whois
# Query the WHOIS record for the domain and print it
print(whois.whois('weibo.com'))

The query returns a record like the following:

{
  "updated_date": [
    "2016-03-20 00:00:00",
    "2016-03-20 14:56:29"
  ],
  "status": [
    "clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited",
    "clientTransferProhibited https://icann.org/epp#clientTransferProhibited",
    "clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited"
  ],
  "name": "Beijing Weibo Internet Technology Co.,Ltd",
  "dnssec": "unsigned",
  "city": "Bei jing",
  "expiration_date": [
    "2026-03-20 00:00:00",
    "2111-03-20 04:00:00"
  ],
  "zipcode": "100080",
  "domain_name": [
    "WEIBO.COM",
    "weibo.com"
  ],
  "country": "CN",
  "whois_server": "whois.35.com",
  "state": "Bei jing",
  "registrar": "35 Technology Co., Ltd.",
  "referral_url": "http://www.35.com",
  "address": "Ideal Int'l Plaza, No. 58, West Of North Forth Ring Rd",
  "name_servers": [
    "NS1.SINA.COM.CN",
    "NS2.SINA.COM.CN",
    "NS3.SINA.COM",
    "NS3.SINA.COM.CN",
    "NS4.SINA.COM",
    "NS4.SINA.COM.CN",
    "ns1.sina.com.cn",
    "ns2.sina.com.cn",
    "ns3.sina.com.cn",
    "ns4.sina.com.cn",
    "ns3.sina.com",
    "ns4.sina.com"
  ],
  "org": "Beijing Weibo Internet Technology Co., Ltd.",
  "creation_date": [
    "1999-03-20 00:00:00",
    "1999-03-20 04:00:00"
  ],
  "emails": [
    "abuse@35.cn",
    "domainname@staff.sina.com.cn"
  ]
}
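
The record above mixes single values with lists and repeats some entries in different letter case, so a little normalization helps before using it. The following is a minimal sketch, assuming the result can be read like a dictionary, as the printed output suggests. The helper names as_list and summarize are hypothetical, and the sample record is a hand-copied excerpt of the output above rather than a live query.

```python
def as_list(value):
    """Normalize a field that may be missing, a single value, or a list."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def summarize(record):
    """Pick out the ownership fields and de-duplicate the name servers."""
    servers = sorted({ns.lower() for ns in as_list(record.get("name_servers"))})
    return {
        "registrar": record.get("registrar"),
        "org": record.get("org"),
        "name_servers": servers,
    }


# Excerpt of the weibo.com record shown above, copied by hand.
record = {
    "registrar": "35 Technology Co., Ltd.",
    "org": "Beijing Weibo Internet Technology Co., Ltd.",
    "name_servers": [
        "NS1.SINA.COM.CN", "NS2.SINA.COM.CN", "NS3.SINA.COM",
        "NS3.SINA.COM.CN", "NS4.SINA.COM", "NS4.SINA.COM.CN",
        "ns1.sina.com.cn", "ns2.sina.com.cn", "ns3.sina.com.cn",
        "ns4.sina.com.cn", "ns3.sina.com", "ns4.sina.com",
    ],
}

summary = summarize(record)
print(summary["org"])
print(summary["name_servers"])
```

With a live query, the same summarize call could be applied to the value returned by whois.whois('weibo.com'), provided that value supports dictionary-style access.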
