Extracting HTTP header information with Python

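I want to use Python to extract the values of specific HTTP headers, for example ETag and Content-Length. Is there a library that can do this?
I would rather not use urllib2 because it is too slow. Is there something with a C extension?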

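pycurl is simple, powerful, and fast.
pycurl.Curl() has a getinfo method.
Right, I have looked at the pycurl library and found some material online, but the detailed documentation is all in English.
Here is some example code: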
import StringIO  # Python 2 module; on Python 3 use io.BytesIO instead
import pycurl

b = StringIO.StringIO()  # buffer that collects the response body
c = pycurl.Curl()
c.setopt(pycurl.URL, "http://www.baidu.com/")
c.setopt(pycurl.HTTPHEADER, ["Accept:"])
c.setopt(pycurl.WRITEFUNCTION, b.write)
c.setopt(pycurl.FOLLOWLOCATION, 1)  # follow redirects...
c.setopt(pycurl.MAXREDIRS, 5)       # ...but at most five of them
c.perform()
c.getinfo(pycurl.CONTENT_LENGTH_DOWNLOAD)



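CONTENT_LENGTH_DOWNLOAD gives the value of the Content-Length header, but getinfo() has no constant for ETag. How can I get that one?
Also, what does c.setopt(pycurl.HTTPHEADER, ["Accept:"]) mean? The documentation explains it as follows, but I don't quite understand it: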
CURLOPT_HEADER

A non-zero parameter tells the library to include the header in the body output. This is only relevant for protocols that actually have headers preceding the data (like HTTP).
Can pycurl return all the HTTP headers as a dictionary, the way urllib2 does?
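A note on the two options, since they are easy to mix up. The quoted text describes CURLOPT_HEADER (pycurl.HEADER), which makes libcurl write the response headers into the body output. pycurl.HTTPHEADER is a different option: it sets custom request headers, and according to the libcurl documentation an entry with nothing after the colon, such as "Accept:", removes the header libcurl would otherwise send, while an entry with a value replaces it. The pure-Python sketch below only imitates that rule for illustration; apply_custom_headers and the default header values are made up here and are not part of pycurl.

```python
def apply_custom_headers(defaults, custom):
    """Imitate how a CURLOPT_HTTPHEADER list changes the default request headers.

    Hypothetical helper for illustration only; it is not a pycurl function.
    """
    headers = dict(defaults)
    for entry in custom:
        name, _, value = entry.partition(":")
        name = name.strip()
        value = value.strip()
        # Header names are case-insensitive, so drop any existing match first.
        for existing in [k for k in headers if k.lower() == name.lower()]:
            del headers[existing]
        # "Name:" with no value means "do not send this header at all".
        if value:
            headers[name] = value
    return headers


# Assumed defaults, chosen only for this illustration.
defaults = {"Host": "www.baidu.com", "Accept": "*/*"}
print(apply_custom_headers(defaults, ["Accept:"]))
print(apply_custom_headers(defaults, ["Accept: text/html"]))
```

So the ["Accept:"] line in the example code simply stops the request from carrying an Accept header; it has nothing to do with reading response headers.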
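import httplib  # Python 2 module; renamed http.client in Python 3

conn = httplib.HTTPConnection("www.sina.com")
conn.request("GET", "/")
r = conn.getresponse()
r.getheaders()  # get all HTTP headers as a list of (name, value) pairs
r.getheader("content-length")  # get one specific header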


>>> conn=httplib.HTTPConnection("www.sina.com.cn")
>>> conn.request("GET", "/")
>>> r=conn.getresponse()
>>> r.getheaders()
[('x-cache', 'HIT from sh-9.sina.com.cn'), ('x-powered-by', 'mod_xlayout_jh/0.0.
1vhs.markII.remix'), ('accept-ranges', 'bytes'), ('expires', 'Tue, 25 Mar 2008 0
8:43:33 GMT'), ('vary', 'Accept-Encoding'), ('server', 'Apache/2.0.54 (Unix)'),
('last-modified', 'Tue, 25 Mar 2008 08:32:57 GMT'), ('connection', 'close'), ('e
tag', '"b177fb-48d9a-cca4e040"'), ('cache-control', 'max-age=60'), ('date', 'Tue
, 25 Mar 2008 08:42:33 GMT'), ('content-type', 'text/html'), ('age', '54')]
>>> r.getheader("content-length")
>>>
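
The last call in the session above prints nothing because the header list contains no content-length entry, and getheader returns None for a header that is missing. Here is a small offline sketch of that behaviour, written for Python 3 and using the standard email parser on a few of the headers shown above. In Python 3 the response headers in http.client are kept in an email.message.Message subclass, which is why the lookups behave alike. The raw_headers string is assembled by hand for illustration.

```python
from email.parser import Parser

# A few of the headers from the session above, rebuilt as raw text.
raw_headers = (
    "Server: Apache/2.0.54 (Unix)\n"
    'ETag: "b177fb-48d9a-cca4e040"\n'
    "Cache-Control: max-age=60\n"
    "Content-Type: text/html\n"
    "\n"
)

msg = Parser().parsestr(raw_headers, headersonly=True)

etag = msg.get("etag")                      # header lookup is case-insensitive
content_length = msg.get("content-length")  # header is absent, so this is None
all_headers = dict(msg.items())             # all headers as a dictionary

print(etag)
print(content_length)
```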

Thanks to the poster above, but what I meant is that I want to do this with pycurl; pycurl performs better and is faster than httplib.
The Python board is really quiet!
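
For the pycurl side of the question, one common approach is to register a callback through pycurl.HEADERFUNCTION. libcurl calls it once per response header line, so the lines can be collected and turned into a dictionary afterwards. What follows is a rough Python 3 sketch, not a definitive implementation: parse_header_lines and fetch_headers are names invented here, and pycurl is imported inside fetch_headers only so that the parsing helper can be tried without pycurl installed. With FOLLOWLOCATION enabled, every redirect produces its own header block, so the helper starts over at each status line and keeps only the last response.

```python
def parse_header_lines(lines):
    """Turn raw header lines (bytes) into a dict with lower-cased names."""
    headers = {}
    for raw in lines:
        line = raw.decode("iso-8859-1").strip()
        if line.upper().startswith("HTTP/"):
            # A status line starts a new response (e.g. after a redirect).
            headers = {}
            continue
        name, sep, value = line.partition(":")
        if sep:
            # A repeated header (e.g. Set-Cookie) overwrites the earlier value.
            headers[name.strip().lower()] = value.strip()
    return headers


def fetch_headers(url):
    """Fetch url with pycurl and return the final response's headers."""
    import pycurl  # assumed to be installed

    lines = []
    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.FOLLOWLOCATION, 1)
    c.setopt(pycurl.MAXREDIRS, 5)
    c.setopt(pycurl.WRITEFUNCTION, lambda data: None)  # discard the body
    c.setopt(pycurl.HEADERFUNCTION, lines.append)      # collect header lines
    try:
        c.perform()
    finally:
        c.close()
    return parse_header_lines(lines)


# Usage (needs pycurl and network access):
# headers = fetch_headers("http://www.baidu.com/")
# print(headers.get("etag"), headers.get("content-length"))
```

As with getheader in httplib, headers.get("etag") returns None when the server did not send that header.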