My Python Journey: Day 3


#1 wantfly posted on 2005-10-24 21:37


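Today I looked at how regular expressions are used. The notation is much the same as in AWK and SED, so that saves me some effort.
I also used Karrigell to produce web output. Karrigell itself installed fine, but I could not get gadfly installed.
RE example: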
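import re

# Open the log file and read every line into a list.
log_file = open("z:web.log", "r")
text = log_file.readlines()
log_file.close()

# Compile the regular expression. The dots are escaped so that they
# match a literal "." rather than any character.
keyword = re.compile(r"10\.7\.1\.31")

# Search the file content line by line.
for line in text:
    if keyword.search(line):
        print(line)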
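
The same kind of search can also be run over the whole file content in one call, closer to how grep is used. Below is a minimal sketch, not the only way to do it: the function name `find_matching_lines` and the sample log text are made up for illustration. `re.MULTILINE` makes `^` and `$` match at every line boundary, and `re.escape` keeps the dots in the address literal.

```python
import re

def find_matching_lines(text, literal):
    """Return every line of text that contains literal, using one regex pass."""
    # re.escape makes the dots in an IP address match literally.
    pattern = re.compile(r"^.*" + re.escape(literal) + r".*$", re.MULTILINE)
    return pattern.findall(text)

# Hypothetical log content, standing in for the result of log_file.read().
sample = (
    "10.7.1.31 - - GET /index.html\n"
    "10.7.1.32 - - GET /a.html\n"
    "10x7y1z31 - - GET /b.html\n"
    "10.7.1.31 - - GET /c.html\n"
)

matches = find_matching_lines(sample, "10.7.1.31")
for m in matches:
    print(m)
```

With a real log, the text would come from a single `read()` call instead of `readlines()`; for very large files, the line-by-line loop may still use less memory.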
################################
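A sample that uses Karrigell to produce web output.
My Karrigell is installed in c:Karrigell. Save the code below as index.py and put it in the c:karrigell directory.
Then open IE or Firefox and go to http://127.0.0.1/index.py to see the output.
I tried it a few times. My WEBEXTD20051013.log file is 82.5 MB and holds a little over 600,000 lines of log entries.
The page was generated in under a minute, so Python's execution speed and text-processing ability seem quite good.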
import sys
import re
import string
import datetime

input = sys.stdin
input = open("z:WEBEXTD20051013.log")
ip_list = []
user_list = []
#ii = 0
# Collect each distinct first field of the log lines in ip_list,
# in the order it is first seen.
for line in input.readlines():
    list1 = line.split(" ")
    if list1[0] not in ip_list:
        ip_list.append(list1[0])
        #user_list.append(list1[1])
#print ip_list
print("")
# Output one block per collected entry, starting from the third one.
for x in range(2, len(ip_list)):
    #print ip_list[x]
    print("")
    print("")
    print(ip_list[x])
    print("")
    print("")
    print("test width 30%")
    print("")
    print("test width 50%")
    print("")
print("")
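
For a log of this size, the `not in ip_list` test scans the whole list on every line. A possible variation, sketched here under assumptions, keeps a set alongside the list for fast membership tests. The function name `unique_first_fields` and the sample lines are hypothetical, not part of the script above.

```python
def unique_first_fields(lines):
    """Return the first space-separated field of each line, in first-seen order."""
    seen = set()     # fast membership test
    ordered = []     # keeps the first-seen order for output
    for line in lines:
        first = line.split(" ")[0]
        if first not in seen:
            seen.add(first)
            ordered.append(first)
    return ordered

# Hypothetical log lines for illustration only.
sample_lines = [
    "10.7.1.31 alice GET /\n",
    "10.7.1.32 bob GET /a\n",
    "10.7.1.31 alice GET /b\n",
]

ips = unique_first_fields(sample_lines)
print(ips)
```

An open file object can be passed in place of the list, since iterating over a file yields one line at a time without loading everything through `readlines()`.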
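Summary:
1. Using RE still feels less convenient than SED, AWK and GREP on Linux, because I can only process each line separately
   rather than handling the whole document directly the way GREP does. Maybe I just haven't found the right method yet.
2. While working on the second sample (the web page one), I kept getting a "list index out of range" error message and don't know why.
   At first I thought a list might have a size limit, but then I tested with code like
   a = []
   for x in range(200000):
       a.append("www.com")
   and it ran without any problem. I don't know where the problem is. I'll look into it again tomorrow.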
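
One possible source of the "list index out of range" message, offered only as a guess since the failing line is not shown: `split(" ")` returns a single-element list for a line that contains no spaces (a blank line, for example), so indexing element `[1]` of that result raises `IndexError`. The sketch below shows the behavior and a guarded lookup; the helper name `field_or_default` and the sample lines are hypothetical.

```python
def field_or_default(line, index, default=""):
    """Return the index-th space-separated field, or default if the line is too short.

    index is assumed to be zero or positive.
    """
    fields = line.rstrip("\n").split(" ")
    if index < len(fields):
        return fields[index]
    return default

# A line without spaces splits into a single element, so [1] would fail.
blank_fields = "\n".split(" ")
print(len(blank_fields))

print(field_or_default("10.7.1.31 alice GET /\n", 1))
print(field_or_default("\n", 1, default="-"))
```

A list of 200,000 items is well within what Python handles, which matches the test above; checking the length of the split result before indexing is one way to find out which log lines are shorter than expected.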