Identifying the technology used by a website
We need to install builtwith first (pip install builtwith):
import builtwith
builtwith.parse('http://example.webscraping.com')
We get the following information:
{u'javascript-frameworks': [u'jQuery', u'Modernizr', u'jQuery UI'],
 u'web-frameworks': [u'Web2py', u'Twitter Bootstrap'],
 u'programming-languages': [u'Python'], u'web-servers': [u'Nginx']}
This module takes a URL, downloads and analyzes the page, and then returns
the technologies used by the website.
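Since the result is an ordinary dictionary that maps a category to a list of technology names, it is easy to post-process. The following is a minimal sketch, not part of builtwith itself: summarize_technologies is a hypothetical helper name, and sample_report is simply the output shown above copied in by hand so the snippet runs without a network connection.

```python
def summarize_technologies(report):
    """Turn a builtwith-style report into sorted 'category: a, b' lines.

    `report` is assumed to be a dict mapping a category name to a list
    of technology names, like the output shown above.
    """
    lines = []
    for category in sorted(report):
        names = ", ".join(report[category])
        lines.append("{}: {}".format(category, names))
    return lines


# Sample data copied from the output above (not fetched live).
sample_report = {
    "javascript-frameworks": ["jQuery", "Modernizr", "jQuery UI"],
    "web-frameworks": ["Web2py", "Twitter Bootstrap"],
    "programming-languages": ["Python"],
    "web-servers": ["Nginx"],
}

for line in summarize_technologies(sample_report):
    print(line)
```

With builtwith installed, the same helper should work on a live result, for example summarize_technologies(builtwith.parse(url)).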
Finding the owner of a website
You need to install the module named python-whois
pip install python-whois
import whois
print(whois.whois('weibo.com'))
{
  "updated_date": [
    "2016-03-20 00:00:00",
    "2016-03-20 14:56:29"
  ],
  "status": [
    "clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited",
    "clientTransferProhibited https://icann.org/epp#clientTransferProhibited",
    "clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited"
  ],
  "name": "Beijing Weibo Internet Technology Co.,Ltd",
  "dnssec": "unsigned",
  "city": "Bei jing",
  "expiration_date": [
    "2026-03-20 00:00:00",
    "2111-03-20 04:00:00"
  ],
  "zipcode": "100080",
  "domain_name": [
    "WEIBO.COM",
    "weibo.com"
  ],
  "country": "CN",
  "whois_server": "whois.35.com",
  "state": "Bei jing",
  "registrar": "35 Technology Co., Ltd.",
  "referral_url": "http://www.35.com",
  "address": "Ideal Int'l Plaza, No. 58, West Of North Forth Ring Rd",
  "name_servers": [
    "NS1.SINA.COM.CN",
    "NS2.SINA.COM.CN",
    "NS3.SINA.COM",
    "NS3.SINA.COM.CN",
    "NS4.SINA.COM",
    "NS4.SINA.COM.CN",
    "ns1.sina.com.cn",
    "ns2.sina.com.cn",
    "ns3.sina.com.cn",
    "ns4.sina.com.cn",
    "ns3.sina.com",
    "ns4.sina.com"
  ],
  "org": "Beijing Weibo Internet Technology Co., Ltd.",
  "creation_date": [
    "1999-03-20 00:00:00",
    "1999-03-20 04:00:00"
  ],
  "emails": [
    "abuse@35.cn",
    "domainname@staff.sina.com.cn"
  ]
}
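The record above repeats several values: every name server appears in both upper and lower case, and each date field holds two entries. Below is a rough sketch of how such a record could be tidied up. The helper names unique_name_servers and first_value are made up for illustration and are not part of python-whois, and the sample data is copied from the output above rather than queried live.

```python
def unique_name_servers(name_servers):
    """Lowercase and de-duplicate name servers, returning a sorted list."""
    if isinstance(name_servers, str):
        # Assumption: a field may hold a single string instead of a list.
        name_servers = [name_servers]
    return sorted({ns.lower() for ns in name_servers or []})


def first_value(value):
    """Return the first entry of a list-valued field, or the value itself."""
    if isinstance(value, (list, tuple)):
        return value[0] if value else None
    return value


# Sample data copied from the WHOIS output above (not fetched live).
sample_name_servers = [
    "NS1.SINA.COM.CN", "NS2.SINA.COM.CN", "NS3.SINA.COM",
    "NS3.SINA.COM.CN", "NS4.SINA.COM", "NS4.SINA.COM.CN",
    "ns1.sina.com.cn", "ns2.sina.com.cn", "ns3.sina.com.cn",
    "ns4.sina.com.cn", "ns3.sina.com", "ns4.sina.com",
]
sample_creation_date = ["1999-03-20 00:00:00", "1999-03-20 04:00:00"]

print(unique_name_servers(sample_name_servers))
print(first_value(sample_creation_date))
```

The twelve name server entries collapse to six distinct hosts once case is ignored, which is usually the form you want when comparing domains.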