Sharing an FTP sync script


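I have long wanted a tool that does roughly what FTP Synchronizer (http://www.ftpsynchronizer.com/) does. After a lot of searching I found a Perl script. The original version, with detailed commentary, is at http://www.linuxjournal.com/article/6686, and an upgraded version is at http://zuidema.org/edwin/Kiss_sync.html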

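First, thanks to the three authors for their work. However, I found that the script still has several problems:
1. FTP's inherent inability to set a file's modification time.
2. It cannot report sync conflicts.
3. Same name and same size do not guarantee that two files are identical.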
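
To illustrate the third point, here is a minimal sketch (not part of the script itself) showing two files of equal size whose contents differ, and how an MD5 digest tells them apart. The helper names `file_md5` and `write_file` and the file names are made up for this demo; only `Digest::MD5` and `File::Temp` are real modules.

```perl
use strict;
use warnings;
use Digest::MD5;
use File::Temp qw(tempdir);

# Return the hex MD5 digest of a file, read in binary mode.
sub file_md5 {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
    binmode($fh);
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Write $content to $path (hypothetical helper, for this demo only).
sub write_file {
    my ($path, $content) = @_;
    open(my $fh, '>', $path) or die "Cannot write $path: $!\n";
    binmode($fh);
    print {$fh} $content;
    close $fh;
}

my $dir = tempdir(CLEANUP => 1);
write_file("$dir/a.txt", "hello world\n");
write_file("$dir/b.txt", "hello_world\n");   # same length, different bytes

my $same_size = (-s "$dir/a.txt") == (-s "$dir/b.txt");
my $same_md5  = file_md5("$dir/a.txt") eq file_md5("$dir/b.txt");

print "same size: ", ($same_size ? "yes" : "no"), "\n";   # yes
print "same md5:  ", ($same_md5  ? "yes" : "no"), "\n";   # no
```

A size-only comparison would treat these two files as identical, which is why the modified script also compares checksums.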

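So I modified the code and am sharing it here. I also hope you will help me check it for possible problems (I have tested it, but I cannot guarantee it is bug-free), so that this program can fully replace FTP Synchronizer.
Usage:
-h  help
-v  verbose mode
-d  print FTP debugging information
-k  print the sync actions without executing them. Strongly recommended before a real sync, to avoid accidental damage.
-P  use FTP passive mode. Needed for machines behind a firewall.
-i  files to exclude from syncing (a regexp)
-s  FTP server address
-u  FTP user name
-p  FTP password
-r  remote (FTP) directory to sync
-l  local directory to sync
-o  time offset in seconds: local time zone minus FTP server time zone. Needed for the first sync only, not afterwards.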
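
As a rough illustration of the `-o` value, the sketch below derives an offset from one file whose modification time is known on both sides and then applies it the way the script does when it scans the server (remote mtime plus offset). The function names `time_offset` and `adjust_remote` and the two timestamps are made up for this example. Real listings often have only minute resolution, so treat the result as an estimate.

```perl
use strict;
use warnings;

# Offset in seconds such that remote_mtime + offset == local_mtime.
sub time_offset {
    my ($local_mtime, $remote_mtime) = @_;
    return $local_mtime - $remote_mtime;
}

# Apply the offset to an mtime taken from the server listing,
# mirroring "($mtime || 0) + $opt_o" in the script.
sub adjust_remote {
    my ($remote_mtime, $offset) = @_;
    return ($remote_mtime || 0) + $offset;
}

# Hypothetical epoch times for the same file: the server reports a
# time 8 hours behind the local one.
my $local_mtime  = 1220865000;
my $remote_mtime = 1220836200;

my $offset = time_offset($local_mtime, $remote_mtime);
print "offset: $offset seconds\n";                        # 28800
print "adjusted remote mtime: ",
      adjust_remote($remote_mtime, $offset), "\n";        # 1220865000
```

The value printed as `offset` is what would be passed to `-o` on the first run.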

CODE:
#!/usr/bin/perl

# This script is (c) 2002 Luis E. Muñoz, All Rights Reserved
#                (c) 2005 Peter Orvos, All Rights Reserved
#                (c) 2006 Edwin Zuidema, All Rights Reserved
#                 (c) 2008 Plato, All Rights Reserved
# This code can be used under the same terms as Perl itself. It comes
# with absolutely NO WARRANTY. Use at your own risk.
#
# TO BE DONE
# - mtime: if from L->R, R has current mtime. Then next round R will go L (newer)
#   And then L-R and so on. How to solve? Remote mtime? update local time?
#

use strict;
use warnings;
use Net::FTP;
use File::Find;
use File::Listing; # Try EZ
use Pod::Usage;
use Getopt::Std;
use POSIX 'strftime';
#Plato Wu,2008/09/06
use Storable;
use Digest::MD5;

use vars qw($opt_s $opt_k $opt_u $opt_l $opt_p $opt_r $opt_h $opt_v
            $opt_d $opt_P $opt_i $opt_o);

getopts('i:o:l:s:u:p:r:hkvdP');

if ($opt_h)
{
    pod2usage({-exitval => 2,
               -verbose => 2});
}
                                # Defaults are set here
$opt_s ||= 'localhost';
$opt_u ||= 'anonymous';
$opt_p ||= 'someuser@';
$opt_r ||= '/';
$opt_l ||= '.';
$opt_o ||= 0;

$opt_i = qr/$opt_i/ if $opt_i;

$|++;                           # Autoflush STDOUT

my %rem = ();
my %loc = ();

my $last_file = ".last";

print "Using time offset of $opt_o seconds\n" if $opt_v and $opt_o;

                                # Phase 0: Scan local path and see what we
                                # have
print "\n### Phase 0: Scanning local ###\n";

print "dir: $opt_l\n";
chdir $opt_l or die "Cannot change dir to $opt_l: $!\n";

# First get date/time of last sync
my $last = ((stat($last_file))[9] || 0);
my $mdtm_form = strftime("%c",localtime($last));
print "Last time synced: $mdtm_form\n";

find(
     {
         no_chdir       => 1,
         follow         => 0,   # No symlinks, please
         wanted         => sub
         {
             return if $File::Find::name eq '.';
             $File::Find::name =~ s!^\./!!;
             if (($opt_i and $File::Find::name =~ m/$opt_i/) || ($File::Find::name =~ m/\Q$last_file\E/))
             {
                 print "local: IGNORING $File::Find::name\n" if $opt_d;
                 return;
             }
             my @st = stat($File::Find::name);
             my $type = -f _ ? 'f' : -d _ ? 'd' : -l $File::Find::name ? 'l' : '?';
             # Only regular files are checksummed; reading a directory
             # through addfile() would fail
             my $md5 = '';
             if ($type eq 'f')
             {
                 open(my $fh, '<', $File::Find::name)
                     or die "Cannot open $File::Find::name: $!\n";
                 binmode($fh);
                 $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
                 close $fh;
             }
             my $r = $loc{$File::Find::name} =
             {
                 md5 => $md5,
                 mdtm => $st[9],
                 size => $st[7],
                 type => $type,
             };
             my $mdtm_form = strftime("%c",localtime($r->{mdtm}));
             print "local: adding $File::Find::name (",
             "$r->{mdtm}, $mdtm_form, $r->{size}, $r->{type})\n" if $opt_d;
         },
     }, '.' );


                                # Phase 1: Build a representation of what's
                                # in the remote site
print "\n### Phase 1: Scanning FTP ###\n";

my $ftp = Net::FTP->new($opt_s,
                        Timeout         => 999,
                        Debug           => $opt_d,
                        Passive         => $opt_P,
                        );

die "Failed to connect to server '$opt_s': $@\n" unless $ftp;
die "Failed to login as $opt_u\n" unless $ftp->login($opt_u, $opt_p);
die "Cannot change directory to $opt_r\n" unless $ftp->cwd($opt_r);
warn "Failed to set binary mode\n" unless $ftp->binary();

my $needhome = 0;

print "connected\n" if $opt_v;

sub scan_ftp
{
    my $ftp     = shift;
    my $path    = shift;
    my $rrem    = shift;
    print "scan_ftp $ftp, path $path, rrem $rrem\n" if $opt_d;
#    my $rdir = length($path) ? $ftp->dir($path) : $ftp->dir();
    # parse_dir of File:Listing better parses mtime for directories
#    my $rdir = length($path) ? parse_dir($ftp->dir($path)) : parse_dir($ftp->dir());
    my $rdir;
    my @r2dir;

#    $path =~ s/\s/\\ /g;
#    $path = "\"$path\"";

#    print "scan_ftp $ftp, path $path, rrem $rrem\n";

    if (length($path)) {
        # Already in a path
        $ftp->cwd("$opt_r/$path");
        # Plato Wu,2008/09/08
        # We are entering a subdirectory, so set a flag to return to $opt_r afterwards.
        $needhome = 1;
    } else {
        print "first call\n" if $opt_d;
        $ftp->cwd("$opt_r");
    }

    $rdir = parse_dir($ftp->dir());

    return unless $rdir and @$rdir;

#    print "Going through the files in this dir ($path)\n";
    for my $f (@$rdir)
    {
#        print "a file found in this dir ($path)\n";
        # parse_dir returns array refs: [name, type, size, mtime, mode]
        my ($n, $type, $size, $mtime, $mode) = @$f;

        # Skip the "." and ".." entries, if the listing contains them
        next if $n eq '.' or $n eq '..';

        my $name = '';
        $name = $path . '/' if $path;
        $name .= $n;

        if ($opt_i and $name =~ m/$opt_i/)
        {
            print "remote: IGNORING $name\n" if $opt_d;
            next;
        }

#        print "name '$name'\n" if $opt_v;

        next if exists $rrem->{$name};

        my $mdtm = ($mtime || 0) + $opt_o;
        $size = $size || 0;
#        my $mdtm = ($ftp->mdtm($name) || 0) + $opt_o;
#        my $size = $ftp->size($name) || 0;
#        my $type = (@line == 4) ? ($line[2] =~/\<DIR\>/i ? 'd' : 'f') : substr($f, 0, 1); # Compatibility with windows FTP

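        $type =~ s/-/f/;
        # File::Listing may report symlinks as "l <target>"; normalize so
        # the later "eq 'l'" checks can match
        $type = 'l' if $type =~ /^l/;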

        my $mdtm_form = strftime("%c",localtime($mdtm));

        if ($type eq 'd') {
            print "remote: recursing in dir $name: calling scan_ftp($ftp, $name, $rrem)\n" if $opt_v;
            scan_ftp($ftp, $name, $rrem);
        }
#        } else {
            print "remote: adding file $name (offset mtime $mdtm_form)\n" if $opt_v;
            $rrem->{$name} =
            {
                mdtm => $mdtm,
                size => $size,
                type => $type,
            }
#        }
    }
}
# Plato Wu,2008/09/06
if ($ftp->get($last_file, $last_file."remote")){
    # Restoring the mtime of the downloaded copy appears to be unnecessary.
    # To do: use parse_dir instead of mdtm, because some FTP servers do not support MDTM.
#    utime $ftp->mdtm($last_file), $ftp->mdtm($last_file), $last_file."remote";
    my $hash_ref = retrieve $last_file."remote";
    %rem = %$hash_ref;
#    unlink $last_file."remote";
#    exit;
}else{
    scan_ftp($ftp, '', \%rem);
}

$ftp->cwd($opt_r) if $needhome;

#
# Phase 2: Handle missing files
#
print "\n### Phase 2: Missing files ###\n";

# Algorithm
# If file is older than last sync delete it
# If file is newer than last sync sync it

# For local files:
for my $ml (sort { length($a) <=> length($b) } keys %loc)
{
    if ($loc{$ml}->{type} eq 'l')
    {
        warn "Symbolic link $ml not supported\n";
        next;
    }
        
    # Skip if file/dir exists also remotely (will be handled in phase 3)
    next if exists $rem{$ml};

    # File/dir exists locally but not remotely
    print "$ml file/dir missing from the FTP repository\n" if $opt_v;

    # Check if newer than last sync
    print "mdtm $loc{$ml}->{mdtm} last $last\n" if $opt_v;
    if ($loc{$ml}->{mdtm} > $last) {
        # Newer, so copy to remote
       if ($loc{$ml}->{type} eq 'd')
       {
            print "$ml dir missing remotely, making remotely\n" if $opt_v;
            $opt_k ? print "Kidding: MKDIR $ml\n" : $ftp->mkdir($ml)
                or die "Failed to MKDIR $ml\n";
       }
       else # Regular file
       {
            print "$ml file missing remotely, PUTting\n" if $opt_v;
            $opt_k ? print "Kidding: PUT $ml $ml\n" : $ftp->put($ml, $ml)
                or print "*** Failed to PUT $ml ***\n";
       }
    } else {
        # Local file older than last sync, so deleted from remote. Also delete locally
        if ($loc{$ml}->{type} eq 'd') {
            print "$ml dir removed remotely, removing locally\n" if $opt_v;
            $opt_k ? print "Kidding: rmdir $ml\n" : rmdir($ml)
                or print "*** Failed to rmdir dir $ml ***\n";
        } else {
            print "$ml file removed remotely, removing locally\n" if $opt_v;
            $opt_k ? print "Kidding: rm $ml\n" : unlink($ml)
                or print "*** Failed to rm $ml ***\n";
        }
        # Plato Wu,2008/09/07
        # maintain %loc
        delete $loc{$ml};

    }
}

# For remote files:
for my $mr (sort { length($a) <=> length($b) } keys %rem)
{
    if ($rem{$mr}->{type} eq 'l')
    {
        warn "Symbolic link $mr not supported\n";
        next;
    }
        
    # Skip if file/dir exists also locally (will be handled in phase 3)
    next if exists $loc{$mr};

    print "$mr file/dir missing locally\n" if $opt_v;

    # Check if newer than last sync
    print "mdtm $rem{$mr}->{mdtm} last $last\n" if $opt_v;
    if ($rem{$mr}->{mdtm} > $last) {
        # Plato Wu,2008/09/07
        # maintain %loc
        $loc{$mr} = $rem{$mr};
       
        # Newer, so copy to local
        if ($rem{$mr}->{type} eq 'd') {
            print "$mr dir missing in the local repository, making locally\n" if $opt_v;
            $opt_k ? print "Kidding: mkdir $mr\n" : mkdir($mr)
                or print "*** Failed to MKDIR $mr ***\n";
        } else {
            print "$mr file missing in the local repository, GETting\n" if $opt_v;
            $opt_k ? print "Kidding: GET $mr $mr\n" : $ftp->get($mr, $mr)
                or print "*** Failed to GET $mr ***\n";
        }
        # Added EZ: Set the file time to the mdtm
        my $mdtm_form = strftime("%c",localtime($rem{$mr}->{mdtm}));
        print "Setting mtime $mdtm_form to local $mr\n" if $opt_v;
        $opt_k ? print "Kidding: Set Utime\n" : utime $rem{$mr}->{mdtm}, $rem{$mr}->{mdtm}, $mr;

    } else {
        # Remote file older than last sync, so deleted locally
        # Also delete remotely
        if ($rem{$mr}->{type} eq 'd') {
            print "$mr dir deleted locally, removing remotely\n" if $opt_v;
            $opt_k ? print "Kidding: ftp->rmdir $mr\n" : $ftp->rmdir($mr)
                or print "*** Failed to remote rmdir $mr ***\n";
        } else {
            print "$mr file deleted locally, removing remotely\n" if $opt_v;
            $opt_k ? print "Kidding: ftp->delete $mr\n" : $ftp->delete($mr)
                or print "*** Failed to remote delete $mr ***\n";
        }
    }
}

#
# Phase 3: For files that exist on both sides
#
print "\n### Phase 3: Files on both sides ###\n";

# For remote files: Download if newer
for my $dl (sort { length($a) <=> length($b) } keys %rem)
{
    # only handle files that exist on both sides
    next if not exists $loc{$dl};

    warn "Symbolic link $dl not supported\n"
        if $rem{$dl}->{type} eq 'l';
   
    # forget dirs?
    if ($rem{$dl}->{type} eq 'f')
    {
        # Plato Wu,2008/09/07
        # size-only check disabled; identical files are detected further down by size and MD5
         # Skip if exactly the same size
#         next if $rem{$dl}->{size} eq $loc{$dl}->{size};

         # Skip if remote older (local newer)
         next if $rem{$dl}->{mdtm} <= $loc{$dl}->{mdtm};

#        # If remote smaller, remove remote and PUT
#        if ($rem{$dl}->{size} < $loc{$dl}->{size})
#        {
#            print "$dl file smaller in the remote repository ";
#            print "(local: $loc{$dl}->{size} remote: $rem{$dl}->{size})\n";
#            print "DELETEing\n";
#            $opt_k ? print "Kidding: ftp->delete $dl\n" : $ftp->delete($dl)
#                or print "*** Failed to remote delete $dl ***\n";
#            print "PUTting\n";
#            $opt_k ? print "Kidding: PUT $dl $dl\n" : $ftp->put($dl, $dl)
#                or print "*** Failed to PUT $dl ***\n";
#        } else {

        # GET if file local older
        my $mdtm_form_loc = strftime("%c",localtime($loc{$dl}->{mdtm}));
        my $mdtm_form_rem = strftime("%c",localtime($rem{$dl}->{mdtm}));

        # Plato Wu,2008/09/07
        # next if exactly the same size and md5 checksum
        # remote entries built from a live FTP scan carry no md5, hence the "|| ''"
        if (($rem{$dl}->{size} eq $loc{$dl}->{size}) && (($rem{$dl}->{md5} || '') eq ($loc{$dl}->{md5} || ''))){
            if($rem{$dl}->{mdtm} > $loc{$dl}->{mdtm}){
                print "Setting mtime $mdtm_form_rem to local $dl\n" if $opt_v;
                $opt_k ? print "Kidding: Set Utime\n" : utime $rem{$dl}->{mdtm}, $rem{$dl}->{mdtm}, $dl;
            }
            next;
        }

         # Plato, 08/09/06
         # if remote > local >= last sync, both sides changed after the last sync: a conflict
         # use >= to be cautious
         if ($loc{$dl}->{mdtm} >= $last) {
             print "CONFLICT: $dl was modified both locally and remotely since the last sync; please resolve it by hand\n";
             print "  local: $mdtm_form_loc, remote: $mdtm_form_rem, last sync: $mdtm_form\n";
             next;
         }

        if ($opt_v)
        {
            print "$dl file older in the local repository ";
            print "(local: $loc{$dl}->{mdtm} $mdtm_form_loc remote: $rem{$dl}->{mdtm} $mdtm_form_rem)\n";
            print "GETting\n";
        }
         $opt_k ? print "Kidding: GET $dl $dl\n" : $ftp->get($dl, $dl)
             or print "*** Failed to GET $dl ***\n";

         # Added EZ: Set the file time to the mdtm
         print "Setting mtime $mdtm_form_rem to local $dl\n" if $opt_v;
         $opt_k ? print "Kidding: Set Utime\n" : utime $rem{$dl}->{mdtm}, $rem{$dl}->{mdtm}, $dl;

        # Plato Wu,2008/09/07
        # maintain %loc for put it in the future.
        $loc{$dl} = $rem{$dl};

        }
#    }
}

# For local files: Upload if newer
for my $ul (sort { length($a) <=> length($b) } keys %loc)
{
    # only handle files that exist on both sides
    next if not exists $rem{$ul};

    warn "Symbolic link $ul not supported\n"
        if $loc{$ul}->{type} eq 'l';

    if ($loc{$ul}->{type} eq 'f')
    {
         # Skip if local older (remote newer)
         # fix with 100s for rounding errors
        # Plato Wu,2008/09/08
        # now it does not need fix rounding error for it use actual modification time
        # not ftp put time.
#         next if ($rem{$ul}->{mdtm} + 100) >= $loc{$ul}->{mdtm};
          next if $rem{$ul}->{mdtm} >= $loc{$ul}->{mdtm};

        # Plato Wu,2008/09/07
        # next if exactly the same size and md5 checksum
        # remote entries built from a live FTP scan carry no md5, hence the "|| ''"
        next if ($rem{$ul}->{size} eq $loc{$ul}->{size}) && (($rem{$ul}->{md5} || '') eq ($loc{$ul}->{md5} || ''));


         # PUT if file remote older
         my $mdtm_form_loc = strftime("%c",localtime($loc{$ul}->{mdtm}));
         my $mdtm_form_rem = strftime("%c",localtime($rem{$ul}->{mdtm}));


         # Plato, 08/09/06
         # if local > remote >= last sync, both sides changed after the last sync: a conflict
         # use >= to be cautious
         if ($rem{$ul}->{mdtm} >= $last) {
             print "CONFLICT: $ul was modified both locally and remotely since the last sync; please resolve it by hand\n";
             print "  local: $mdtm_form_loc, remote: $mdtm_form_rem, last sync: $mdtm_form\n";

             next;
         }



        if ($opt_v)
        {
            print "$ul file older in the FTP repository ";
            print "(local: $loc{$ul}->{mdtm} $mdtm_form_loc remote: $rem{$ul}->{mdtm} $mdtm_form_rem)\n";
            print "PUTting\n";
        }
         $opt_k ? print "Kidding: PUT $ul $ul\n" : $ftp->put($ul, $ul)
             or print "*** Failed to PUT $ul ***\n";
    }
}

# Update last sync time
my $now = time;
# Plato, 08/09/05, if file does not exist, utime can not create it, so add a open & close sentence
# $opt_k ? print "Kidding: TOUCH $last_file\n" : utime $now, $now, $last_file or (open F, ">$last_file") && (close F);

# Plato Wu,2008/09/06
# save local file information
if($opt_k){
    print "Kidding: Store sync file\n";
}else{
    # store() creates or overwrites the file itself, so no separate open is needed
    store(\%loc, $last_file)
        or print "*** Failed to store sync state in $last_file ***\n";
}
# open F, ">$last_file"; #or print "open error";
# $opt_k ? print "Kidding: Store sync file\n" : store \%loc, $last_file;
# # or print "write error";
# close F;

# Plato Wu,2008/09/07
$opt_k ? print "Kidding: PUT $last_file\n" : $ftp->put($last_file, $last_file)
    or print "*** Failed to PUT $last_file ***\n";

print "### Done ###\n";

__END__

=pod

=head1 NAME

ftpsync - Sync a hierarchy of local files with a remote FTP repository

=head1 SYNOPSIS

ftpsync [-h] [-v] [-d] [-k] [-P] [-s server] [-u username] [-p password] [-r remote] [-l local] [-i ignore] [-o offset]

=head1 ARGUMENTS

The recognized flags are described below:

=over 2

=item B<-h>

Produce this documentation.

=item B<-v>

Produce verbose messages while running.

=item B<-d>

Put the C<Net::FTP> object in debug mode and also emit some debugging
information about what's being done.

=item B<-k>

Just kidding. Only announce what would be done, but make no changes
to either local or remote files.

=item B<-P>

Set passive mode.

=item B<-i ignore>

Specifies a regexp. Files matching this regexp will be left alone.

=item B<-s server>

Specify the FTP server to use. Defaults to C<localhost>.

=item B<-u username>

Specify the username. Defaults to 'anonymous'.

=item B<-p password>

Password used for connection. Defaults to an anonymous pseudo-email
address.

=item B<-r remote>

Specifies the remote directory to match against the local directory.

=item B<-l local>

Specifies the local directory to match against the remote directory.

=item B<-o offset>

Allows the specification of a time offset between the FTP server and
the local host. This makes it easier to correct time skew or
differences in time zones.

=back

=head1 DESCRIPTION

This is an example script that should be usable as is for simple
website maintenance. It synchronizes a hierarchy of local files /
directories with a subtree of an FTP server.

The synchronization is quite simplistic. It was written to explain how
to C<use Net::FTP> and C<File::Find>.

Always use the C<-k> option before using it in production, to avoid
data loss.

=head1 BUGS

The synchronization is not quite complete. This script does not deal
with symbolic links. Many cases are not handled to keep the code short
and understandable.

=head1 AUTHORS

Luis E. MuE<ntilde>oz <luismunoz@cpan.org>

=head1 SEE ALSO

Perl(1).

=cut
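
The decision rule that Phase 3 of the script applies to a file present on both sides can be isolated roughly as follows. This is a simplified sketch rather than the script's actual code: the function name `classify`, its return values, and the sample records are invented here, and it assumes each record carries `mdtm`, `size`, and optionally `md5`, as the script's `%loc` and `%rem` entries do.

```perl
use strict;
use warnings;

# Decide what to do with a file that exists on both sides.
# Returns 'same', 'get', 'put', or 'conflict'.
sub classify {
    my ($loc, $rem, $last) = @_;

    # Equal timestamps: nothing to transfer.
    return 'same' if $loc->{mdtm} == $rem->{mdtm};

    # Same size and same checksum: contents are identical.
    return 'same'
        if $loc->{size} == $rem->{size}
        and defined $loc->{md5} and defined $rem->{md5}
        and $loc->{md5} eq $rem->{md5};

    # Pick the older side and the transfer that would refresh it.
    my ($older, $action) = $rem->{mdtm} > $loc->{mdtm}
        ? ($loc, 'get')
        : ($rem, 'put');

    # If even the older side changed since the last sync, both sides
    # were edited independently: report instead of overwriting.
    return 'conflict' if $older->{mdtm} >= $last;

    return $action;
}

my $last = 1000;   # hypothetical time of the last sync
my @cases = (
    [ { mdtm => 900,  size => 10, md5 => 'aaa' }, { mdtm => 1100, size => 12, md5 => 'bbb' } ],
    [ { mdtm => 1050, size => 10, md5 => 'aaa' }, { mdtm => 1100, size => 12, md5 => 'bbb' } ],
    [ { mdtm => 1200, size => 10, md5 => 'aaa' }, { mdtm => 900,  size => 12, md5 => 'bbb' } ],
    [ { mdtm => 900,  size => 10, md5 => 'aaa' }, { mdtm => 1100, size => 10, md5 => 'aaa' } ],
);
print classify($_->[0], $_->[1], $last), "\n" for @cases;
# prints: get, conflict, put, same (one per line)
```

When the remote side was scanned directly rather than read from the stored `.last` file, its records have no `md5`, so the checksum shortcut is skipped and the timestamps decide.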

Bump.
Bookmarked. I'll go through it slowly.
Showing my support. There's a lot I don't understand; I'll study it slowly.


QUOTE:
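Originally posted by dalian_gq on 2008-9-9 09:18
Showing my support. There's a lot I don't understand; I'll study it slowly.

The web page for the original version has very detailed commentary; I studied it first too.
  Really nice, it looks very detailed. Unfortunately my skills are too shallow and there is still a lot I can't follow :(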