urlgrabber


http://linux.duke.edu/projects/urlgrabber/
urlgrabber is a pure Python package that drastically simplifies the fetching of files. It is designed to be used in programs that need common (but not necessarily simple) url-fetching features. It is extremely simple to drop into an existing program and provides a clean interface to protocol-independent file access. Best of all, urlgrabber takes care of all those pesky file-fetching details, and lets you focus on whatever it is that your program is written to do!
urlgrabber came into existence as the part of yum that downloads RPMs and header files, but it quickly became clear that this is a general problem that many applications must deal with.
Features
Using urlgrabber, data can be fetched in three basic ways:
urlgrab(url)
copy the file to the local filesystem
urlopen(url)
open the remote file and return a file object
urlread(url)
return the contents of the file as a string

When using these functions (or methods), urlgrabber supports the following features:

  • identical behavior for http://, ftp://, and file:// urls
  • http keepalive - faster downloads of many files by using only a single connection
  • byte ranges - fetch only a portion of the file
  • reget - for a urlgrab, resume a partial download
  • progress meters - the ability to report download progress automatically, even when using urlopen!
  • throttling - restrict bandwidth usage
  • batched downloads using threads - download multiple files simultaneously (feature still in progress)
  • retries - automatically retry a download if it fails. The number of retries and the failure types that trigger a retry are configurable
  • authenticated server access for http and ftp
  • proxy support - support for authenticated http and ftp proxies
  • mirror groups - treat a list of mirrors as a single source, automatically switching mirrors if there is a failure
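The three entry points described above can be exercised side by side. The following is a minimal sketch, not an official example: the helper name fetch_three_ways and its arguments are made up for illustration, and the exact return types (for instance, whether urlread gives back a str or bytes) may differ between urlgrabber releases.

```python
def fetch_three_ways(url, local_path):
    """Fetch `url` once with each of urlgrabber's three basic functions.

    Hypothetical helper for illustration; it is not part of urlgrabber.
    """
    # Imported inside the function so this module still loads on systems
    # where urlgrabber is not installed.
    from urlgrabber import urlgrab, urlopen, urlread

    # 1. urlgrab: copy the remote file to the local filesystem.
    saved_as = urlgrab(url, filename=local_path)

    # 2. urlopen: open the remote file and read from the file object.
    remote = urlopen(url)
    try:
        first_chunk = remote.read(64)
    finally:
        remote.close()

    # 3. urlread: return the whole contents in a single call.
    contents = urlread(url)

    return saved_as, first_chunk, contents
```

Because http://, ftp://, and file:// urls are meant to behave identically, a local file:// url is a convenient way to try this without touching the network.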

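To make the retry and mirror-group ideas in the feature list concrete, here is a rough sketch of the general technique using only the Python standard library. It is not urlgrabber's implementation and does not use its API; the function name read_from_mirrors and the retries parameter are assumptions chosen for this illustration.

```python
import urllib.request


def read_from_mirrors(mirrors, relative_path, retries=2):
    """Return the bytes of `relative_path` from the first mirror that works.

    Each mirror is tried up to `retries` times before moving on to the
    next one. Illustrative only; urlgrabber's own logic is more capable.
    """
    last_error = None
    for base in mirrors:
        url = base.rstrip("/") + "/" + relative_path.lstrip("/")
        for _attempt in range(retries):
            try:
                with urllib.request.urlopen(url) as response:
                    return response.read()
            except OSError as exc:  # urllib.error.URLError is an OSError
                last_error = exc
    raise OSError(f"all mirrors failed for {relative_path!r}") from last_error
```

The caller sees the list of mirrors as a single source: a failure on one mirror is invisible as long as a later one succeeds, which is the behavior the mirror groups feature describes.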
Not sure if urlgrabber is the tool for you? Check out our comparison of the major options.
Documentation, Examples, and Help
There are many sources of urlgrabber-related assistance and information:

  • The urlgrabber package documentation, built from the __doc__ strings using pydoc
  • The examples page
  • The urlgrabber package contents, including the source code
  • Browsable urlgrabber cvs (synced every few hours)
  • The yum-devel mailing list. For now, urlgrabber is piggy-backing on this list. If it becomes necessary, we will get our own list. When posting to this list, please indicate that it is a urlgrabber-related post by beginning the subject with [UG].

Authors and Credits
urlgrabber is written and maintained by Michael Stenner and Ryan Tomayko. We would like to thank Seth Vidal for many valuable ideas and suggestions, and also for the very earliest version of the code that became urlgrabber. We would also like to thank Linux@DUKE and Duke University for the resources they have provided.
All urlgrabber-related mail (questions, comments, requests, bug reports, praise) should be directed to the yum-devel mailing list. Please indicate that it is a urlgrabber-related post by beginning the subject with [UG].
License and Copyright
urlgrabber is © 2002-2006 Michael D. Stenner and Ryan Tomayko.
This software is licensed under the GNU LGPL and comes without any warranty, written or implied. For more information about the GNU LGPL, please see http://www.gnu.org/licenses/lgpl.html.
Download
Release information:
urlgrabber follows kernel-style version numbering. As such, the 3.0.x series is the current "stable" branch, and 3.1.x is considered development.
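Under kernel-style numbering, an even second component marks a stable series and an odd one marks development, which matches the 3.0.x and 3.1.x examples given here. A small sketch of that check follows; the function name is_stable_series is hypothetical and not part of urlgrabber.

```python
def is_stable_series(version):
    """Return True if `version` belongs to a stable (even-minor) series.

    Hypothetical helper; expects a dotted version string like "3.0.2".
    """
    minor = int(version.split(".")[1])
    return minor % 2 == 0
```

For example, is_stable_series("3.0.2") is true, while is_stable_series("3.1.0") is false.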
We are no longer providing RPMs for urlgrabber for two reasons:

  • It's too much of a pain to build them for all the relevant systems. We just don't have access to them.
  • It's really easy to build an RPM from the tarball. Simply unpack the tarball, cd into it, and run:
        python setup.py bdist_rpm
