Imagine and innovate: Fetch urls anonymously with Python urllib2 and Tor

Sunday, October 19, 2008

Fetch urls anonymously with Python urllib2 and Tor

For various purposes (web scraping, spiders, data extraction, etc.) you need anonymity. With Tor it takes three steps:

1. Install Tor
2. Check that Tor is working
3. Write your script in Python
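
Step 2 can be partly automated. The following is a minimal sketch in Python 3 that checks whether anything is listening on the local proxy address used in this post (127.0.0.1:8118). The function name `proxy_is_listening` and the timeout value are assumptions made for illustration. It only confirms that the port accepts connections, not that traffic is actually leaving through Tor.

```python
import socket


def proxy_is_listening(host="127.0.0.1", port=8118, timeout=3.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: the default host and port match the proxy
    address used in the script in this post.
    """
    try:
        # create_connection raises OSError if the connection is refused
        # or the timeout expires
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `proxy_is_listening()` returns False, the proxy is probably not running or is bound to a different port.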

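import urllib2

# Route HTTP requests through the local proxy in front of Tor.
# Port 8118 is the Privoxy default; change it if your setup differs.
proxy_support = urllib2.ProxyHandler({"http": "http://127.0.0.1:8118"})
opener = urllib2.build_opener(proxy_support)

url = 'http://whatismyip.com/'
page = opener.open(url)
contents = page.read()
# The page should report the IP address the request appears to come from
print contents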

And that is all.
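
The script above targets Python 2. In Python 3, urllib2 was merged into `urllib.request`, so a roughly equivalent sketch might look like the following. The helper names `build_proxy_opener` and `fetch`, the timeout, and the UTF-8 decoding are assumptions for illustration, not part of the original script.

```python
import urllib.request

# Same local proxy address as in the script above
PROXY_URL = "http://127.0.0.1:8118"


def build_proxy_opener(proxy_url=PROXY_URL):
    """Build an opener that sends http:// requests through proxy_url."""
    proxy_support = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_support)


def fetch(url, opener=None, timeout=30):
    """Fetch url through the proxy and return the response body as text."""
    opener = opener or build_proxy_opener()
    with opener.open(url, timeout=timeout) as page:
        # Assumes the page is UTF-8; undecodable bytes are replaced
        return page.read().decode("utf-8", errors="replace")
```

Calling `print(fetch("http://whatismyip.com/"))` would then mirror the original script. Note that the mapping only lists the "http" scheme, so https:// URLs would not be sent through this proxy unless an "https" entry is added as well.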

For web scraping you can use Beautiful Soup. Adding some Beautiful Soup to the script above:

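import re
from BeautifulSoup import BeautifulSoup

# Parse the HTML already read into `contents`; `page` was consumed by page.read()
soup = BeautifulSoup(contents)
h1Tags = soup.findAll('h1')
# The IP address text is in the second h1 tag; strip the markup around it
ip = re.sub(r'<[^>]*?>', '', str(h1Tags[1]))
print ip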

If you want more readable output, look at the page source: it contains a comment naming a URL you can request to get only the IP address...
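
For readers without Beautiful Soup installed, the h1 extraction above can be approximated with Python 3's standard library alone. This is a sketch that swaps in `html.parser` in place of Beautiful Soup; the class name `H1Collector` and the function `extract_h1_texts` are hypothetical, and the sample markup in the usage note is made up rather than taken from the real page.

```python
from html.parser import HTMLParser


class H1Collector(HTMLParser):
    """Collect the text content of every h1 element, ignoring nested tags."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self.texts.append("")  # start a new entry for this h1

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) still arrives here
        if self._in_h1:
            self.texts[-1] += data


def extract_h1_texts(html):
    """Return the stripped text of each h1 element in document order."""
    parser = H1Collector()
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.texts]
```

Given a page whose second h1 holds the address, `extract_h1_texts(contents)[1]` plays the role of `ip` in the Beautiful Soup version, with no regular expression needed to strip the tags.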
