Looking for a function in Python?

#1 lemonniu, posted 2008-09-27 15:23

I need a function that can fetch a web page's HTTP header information.

If anyone knows of one, please let me know.

Thanks.

#2 alan_yang, posted 2008-09-27 15:58

import urllib
f=urllib.urlopen("http://www.google.com")
print(f.headers)
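
The snippet above is Python 2. In Python 3 the same call lives in `urllib.request`. A minimal sketch, assuming Python 3; the helper name `fetch_headers` and the `timeout` default are illustrative and not from the thread:

```python
import urllib.request


def fetch_headers(url, timeout=10):
    """Send a GET request and return (status, headers) without reading the body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # resp.headers is a message object; items() yields (name, value) pairs.
        return resp.status, dict(resp.headers.items())
```

Calling `fetch_headers("http://www.google.com")` would mirror the post above. Note that this still issues a GET, so the server sends the page body even though the function never reads it.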

#3 lemonniu, posted 2008-09-27 16:59

QUOTE:

Originally posted by alan_yang on 2008-9-27 15:58
import urllib
f=urllib.urlopen("http://www.google.com")
print(f.headers)

thanks

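#4 ghostwwl, posted 2008-09-29 03:02

That does get you the headers, but it actually downloads the whole page as well.

I happened to run into the same need to fetch headers, though what I fetch is the status and the headers.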

CODE:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

'''
Get URL status and headers (Python 2; httplib was renamed http.client in Python 3)
'''

#************************
# FileName: GetHeard.py
# Author: ghostwwl
# Email: ghostwwl@gmail.com
#************************

import httplib

def getHead(url):
    # Strip the scheme, then split "host/path" into host and path.
    url = url.replace("http://", '')
    if '/' not in url:
        url += '/'
    site, page = url.split("/", 1)
    if not page.startswith('/'):
        page = '/%s' % page

    # HEAD asks the server for the status line and headers only, no body.
    httpconn = httplib.HTTPConnection(site)
    try:
        httpconn.request("HEAD", page)
        resp = httpconn.getresponse()
        # getheaders() returns a list of (name, value) tuples.
        return (resp.status, dict(resp.getheaders()))
    finally:
        httpconn.close()

if __name__ == '__main__':
    s, r = getHead("http://www.baidu.com/s?ct=0&ie=gb2312&bs=http+head&sr=&z=&cl=3&f=8&wd=http+head%CA%FD%BE%DD%BD%E2%CE%F6")
    print s
    print r

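#5 23号, posted 2008-09-29 08:39

QUOTE:

Originally posted by ghostwwl on 2008-9-29 03:02
That does get you the headers, but it actually downloads the whole page as well.

I happened to run into the same need to fetch headers, though what I fetch is the status and the headers.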

#!/usr/bin/env python
#-*- coding: utf-8 -*-

'''
Get Url status and head
''' ...

Not bad.

#6 pumaboyd, posted 2008-09-29 12:05

Nice, this shows some of the differences between urllib and httplib.

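#7 lemonniu, posted 2008-09-29 15:04

QUOTE:

Originally posted by ghostwwl on 2008-9-29 03:02
That does get you the headers, but it actually downloads the whole page as well.

I happened to run into the same need to fetch headers, though what I fetch is the status and the headers.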

#!/usr/bin/env python
#-*- coding: utf-8 -*-

'''
Get Url status and head
''' ...

I just tried it.
Testing against 163 returned:
[('x-cache', 'HIT from cache.163.com'), ('accept-ranges', 'bytes'), ('expires', 'Mon, 29 Sep 2008 07:01:06 GMT'), ('vary', 'Accept-Encoding'), ('server', 'Apache/2.2.6 (Unix)'), ('connection', 'close'), ('x-pad', 'avoid browser bug'), ('cache-control', 'max-age=180'), ('date', 'Mon, 29 Sep 2008 06:58:06 GMT'), ('content-type', 'text/html; charset=GB2312'), ('age', '173')]

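But I think something is still missing: a normal set of headers should also include the "HTTP OK 200" line, and I don't know why it is absent.
Or am I using the function incorrectly?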
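
On the missing line: the status line (for example `HTTP/1.1 200 OK`) is not a header field, so `getheaders()` does not include it. The response object exposes its parts separately as `version`, `status` and `reason`. A minimal sketch, assuming Python 3's `http.client`; the helper names `format_status_line` and `head_with_status_line` are hypothetical:

```python
import http.client


def format_status_line(version, status, reason):
    """Rebuild the status line from HTTPResponse.version/status/reason.

    http.client reports the protocol version as 10 (HTTP/1.0) or 11 (HTTP/1.1).
    """
    major, minor = divmod(version, 10)
    return "HTTP/%d.%d %d %s" % (major, minor, status, reason)


def head_with_status_line(host, path="/", timeout=10):
    """Send a HEAD request and return (status_line, list of header tuples)."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        line = format_status_line(resp.version, resp.status, resp.reason)
        return line, resp.getheaders()
    finally:
        conn.close()
```

The same three attributes exist on the Python 2 `httplib` response, so the function earlier in the thread could return `resp.reason` alongside `resp.status` in the same way.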

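#8 ghostwwl, posted 2008-09-29 17:05

I suggest you start with httplib, because urllib and urllib2 are built on top of httplib and related modules.

When urllib and urllib2 handle the HTTP protocol, they actually call into httplib and the like.

The standard library modules are interrelated, so there is no need to hunt around for source code elsewhere. When you have time, read more of the standard library's code.

You can discover quite a lot that way.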
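
One way to follow this advice without leaving the interpreter is to print a standard library method's source with `inspect`. The sketch below assumes Python 3, where `urllib2` became `urllib.request` and `httplib` became `http.client`; the helper name `http_open_source` is hypothetical, and the exact source text can differ between Python versions.

```python
import inspect
import urllib.request


def http_open_source():
    """Return the source of the method urllib.request uses to open http:// URLs."""
    return inspect.getsource(urllib.request.HTTPHandler.http_open)
```

Printing the result with `print(http_open_source())` should show, in current CPython releases, the request being handed to an `http.client` connection class, which is the layering described above.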

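#9 lemonniu, posted 2008-09-29 17:13

QUOTE:

Originally posted by ghostwwl on 2008-9-29 17:05
I suggest you start with httplib, because urllib and urllib2 are built on top of httplib and related modules.

When urllib and urllib2 handle the HTTP protocol, they actually call into httplib and the like.

The standard library modules are interrelated, so there is no need to hunt around for source code elsewhere. When you have time, read more of the standard library ...

I'm not at that level yet. There is so much to learn.

ghostwwl, does Python's httplib have a function that returns the complete header information?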

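#10 ghostwwl, posted 2008-09-29 17:18

Go and look at that library in detail and you will see; then find some material on the HTTP protocol and read the two side by side.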
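
To see what the protocol itself returns, one option is to send a HEAD request over a plain socket and split the reply by hand. This is a simplified sketch, assuming Python 3; the names `parse_head_response` and `raw_head` are hypothetical, and it ignores details such as obsolete header line folding.

```python
import socket


def parse_head_response(raw):
    """Split a raw HTTP response head into (status_line, [(name, value), ...])."""
    # Everything before the first blank line is the status line plus headers.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip().lower(), value.strip()))
    return lines[0], headers


def raw_head(host, path="/", port=80, timeout=10):
    """Send a HEAD request over a plain socket and return the parsed reply."""
    request = "HEAD %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n" % (path, host)
    raw = b""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request.encode("ascii"))
        # Read until the blank line that ends the headers, or until the peer closes.
        while b"\r\n\r\n" not in raw:
            data = sock.recv(4096)
            if not data:
                break
            raw += data
    return parse_head_response(raw)
```

The first element of the result is the status line that the earlier post was looking for; the rest are the header fields that `getheaders()` returns.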
