davehansen’s posterous

First python script

My first working Python script. It grabs the 'src' attribute of a specific image on a specific page, using XPath.

#!/usr/bin/python -u

from lxml import etree

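# Parse the page with lxml's lenient HTML parser.
parser = etree.HTMLParser()
html   = etree.parse('http://orgsci.journal.informs.org/', parser)

# Select the src of every <img> inside an <a> that is the third child of its parent.
img = html.xpath('//a[(((count(preceding-sibling::*) + 1) = 3) and parent::*)]//img/@src')

for x in img:
    print(x)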
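
To see what the XPath actually matches without hitting the network, here is a small sketch that runs the same expression against an inline HTML string. The markup, the `SAMPLE_HTML` name, and the `image_sources` helper are made up for illustration and are not taken from the journal page. It also tries a shorter expression which, as far as I can tell, selects the same nodes here: "position 3 among siblings" is the same as "exactly two preceding element siblings", and the `parent::*` test should always hold for an `<a>` inside an HTML document.

```python
from lxml import etree

# Hypothetical markup standing in for the real page.
SAMPLE_HTML = """
<html><body>
  <div>
    <a href="/one"><img src="first.png"/></a>
    <a href="/two"><img src="second.png"/></a>
    <a href="/three"><img src="third.png"/></a>
  </div>
</body></html>
"""

# The expression from the script above, unchanged.
ORIGINAL_XPATH = '//a[(((count(preceding-sibling::*) + 1) = 3) and parent::*)]//img/@src'

# A shorter form (my assumption: equivalent for ordinary HTML documents).
SHORT_XPATH = '//a[count(preceding-sibling::*) = 2]//img/@src'


def image_sources(markup, expression):
    """Parse an HTML string and return the attribute values the XPath selects."""
    root = etree.fromstring(markup, etree.HTMLParser())
    return [str(value) for value in root.xpath(expression)]


original_result = image_sources(SAMPLE_HTML, ORIGINAL_XPATH)
short_result = image_sources(SAMPLE_HTML, SHORT_XPATH)

for src in original_result:
    print(src)
```

Only the third link's image is printed, because the predicate counts all element siblings before the `<a>`, not just other `<a>` elements. If the real page puts a different kind of element ahead of the link, the position shifts accordingly.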