Jul 14th
 

How to script Realtime backups in Python

Posted by admin in Python

In this tutorial we’ll show how easy it is to write simple real-time backup software with the Python module Pyinotify. The module is built on inotify, a Linux kernel feature (included since kernel 2.6.13) that notifies user-space programs of file-system events. The kernel exposes inotify through a handful of system calls, and Pyinotify wraps those calls in a Python API.

Let’s get to the code now: create a file named “backup.py” in your working directory and paste in the following code:

import os, signal, sys, glob, shutil, filecmp
from pyinotify import WatchManager, Notifier, ThreadedNotifier, EventsCodes, ProcessEvent
 
 
BACKUP = {
        "/home/user/my_dir" : "/home/user/my_backup",
        "/home/user/test" : "/home/user/test_backup"        
}
 
 
class FileSystemEvent(ProcessEvent):
 
        def __init__(self, watcher):
                self.f = FileBackup(watcher)                
 
        def process_event(self, event):
                self.f.process_file(event.path, event.name)
 
        def process_IN_CREATE(self, event):
                self.process_event(event)
 
        def process_IN_DELETE(self, event):
                self.process_event(event)
 
        def process_IN_MODIFY(self, event):
                self.process_event(event)
 
 
class FileBackup:
        def __init__(self, watcher):
                self.watcher = watcher
                self.l = BACKUP.keys()
 
        def process_file(self, event_path, event_name):
                path = os.path.join(event_path, event_name)
 
                if os.path.isdir(path):
                        self.watcher.add(path) #watch new dir
                        base = self.get_base_path(path)
                else:
                        base = self.get_base_path(os.path.dirname(path))
 
                dst = BACKUP[base[0]]+base[1]
 
                try:
                        os.makedirs(dst)
                except OSError: pass # directory already exists
 
 
                if os.path.isfile(path):
                        if os.path.isfile(dst+"/"+event_name):
                                if filecmp.cmp(path, dst+"/"+event_name):
                                        return                                
 
                        shutil.copy2(path, dst+"/"+event_name)
 
 
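        def get_base_path(self, path):
                # Return (watched root, remainder of the path below that root).
                for p in self.l:
                        # Match on a directory boundary, so that a root such as
                        # "/home/user/test" does not also match "/home/user/test2".
                        if path == p or path.startswith(p + os.sep):
                                return p, path[len(p):]
                return "",""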
 
 
class FileSystemWatcher():
        def __init__(self, dirs=()):
                self.wm       = WatchManager()
                self.fse      = FileSystemEvent(self)
 
                self.mask     = EventsCodes.IN_DELETE | EventsCodes.IN_CREATE | EventsCodes.IN_MODIFY # events
                self.notifier = ThreadedNotifier(self.wm, self.fse)
 
                for dirpath in dirs:
                        self.add(dirpath)
 
        def start(self):
                self.notifier.start()
 
        def block(self):
                self.notifier.join() 
 
        def close(self):
                self.notifier.close() 
 
        def add(self, dirpath):
                if os.path.isdir(dirpath):
                        self.wdd = self.wm.add_watch(dirpath, self.mask, rec=True)
 
 
class KeyboardCatcher:
 
        def __init__(self):
                self.child = os.fork()
 
                if self.child == 0:
                        return
                else:
                        self.watch()
 
        def watch(self):
                try:
                        os.wait()
 
                except KeyboardInterrupt:
                        print('KeyboardInterrupt')
                        self.kill()
                sys.exit()                               
 
        def kill(self):
                try:
                        os.kill(self.child, signal.SIGKILL)
                except OSError: pass
 
 
 
 
def _copy_base_dir(src, dst):
        try:
                os.makedirs(dst)
 
        except OSError:                
                pass # Directory already exists!
 
        for path in glob.glob("%s/*" % src):
                dst2 = os.path.join(dst, os.path.basename(path))
 
                if os.path.isdir(path):
                        _copy_base_dir(path, dst2)                        
 
                elif os.path.isfile(path):
                        if os.path.isfile(dst2):
                                if filecmp.cmp(path, dst2):
                                        continue
 
                        shutil.copy2(path, dst2)
 
 
def copy_base_dir():
        for src in BACKUP: 
                _copy_base_dir(src, BACKUP[src])        
 
 
 
def main():         
        KeyboardCatcher()
 
        #copy_base_dir()
 
        w = FileSystemWatcher(BACKUP.keys())
        w.start()          
        w.block()
 
 
if __name__ == "__main__":
        main()

With this script we can back up our files in real time.
First we create a dictionary called BACKUP, which maps each directory to be monitored to the directory where its copies are saved; this way we can back up one or more directories.

BACKUP = {
        "from1" : "to1",
        "from2" : "to2"
}
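
To make the mapping concrete, here is a minimal sketch of how a path inside a watched directory could be translated to its place in the backup. The helper name backup_destination and the sample paths are assumptions made for illustration (backup.py does this job in get_base_path), and the sketch assumes a current Python 3.

```python
import os.path

# Hypothetical mapping in the same shape as BACKUP: source -> destination.
BACKUP = {
    "/home/user/my_dir": "/home/user/my_backup",
    "/home/user/test": "/home/user/test_backup",
}


def backup_destination(path, mapping=BACKUP):
    """Return where `path` belongs in the backup, or None if it is not watched."""
    for src, dst in mapping.items():
        # Match the source directory itself or anything below it, so that
        # "/home/user/test2" is not mistaken for "/home/user/test".
        if path == src or path.startswith(src + os.sep):
            relative = os.path.relpath(path, src)
            return os.path.normpath(os.path.join(dst, relative))
    return None
```

Returning None for an unwatched path lets the caller decide what to do, instead of failing on a dictionary lookup.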

Now… on with the main() function:

The first instruction, KeyboardCatcher(), forks a new process. This class is not mandatory, but it lets us catch a keyboard interrupt (Ctrl-C): the parent process simply waits, and kills the child when Ctrl-C arrives, while the child process carries on with the rest of the script.
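
The fork-and-wait pattern described above can be sketched on its own. This is a simplified illustration rather than the KeyboardCatcher class itself: the function name supervise is hypothetical, it sends SIGTERM where the script uses SIGKILL, and it assumes a current Python 3 on a Unix-like system (os.fork is not available on Windows).

```python
import os
import signal


def supervise(child_main):
    """Fork, run child_main() in the child, and wait for it in the parent.

    Returns the raw wait status of the child (0 means a clean exit).
    """
    pid = os.fork()
    if pid == 0:
        # Child process: do the real work, then leave without ever
        # returning into the parent's code path.
        try:
            child_main()
            os._exit(0)
        except BaseException:
            os._exit(1)
    try:
        _, status = os.waitpid(pid, 0)  # the parent blocks here
    except KeyboardInterrupt:
        # Ctrl-C reached the parent: stop the child, then reap it.
        os.kill(pid, signal.SIGTERM)
        _, status = os.waitpid(pid, 0)
    return status
```

The status is the encoded value returned by os.waitpid; os.WEXITSTATUS(status) extracts the exit code when the child exited normally.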

We commented out the call to the copy_base_dir() function, but you can enable it if it suits your needs. In a nutshell, it backs up the directories before file-system monitoring starts, skipping every file that already has an identical copy in the backup.
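
For comparison, here is a minimal sketch of such an initial pass written with os.walk instead of recursion over glob. The name initial_sync is hypothetical and the sketch assumes a current Python 3. Unlike the glob pattern in the script, os.walk also visits hidden files.

```python
import filecmp
import os
import shutil


def initial_sync(src, dst):
    """Mirror src into dst, copying only files that are missing or differ.

    Returns the list of destination paths that were written.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        relative = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, relative))
        if not os.path.isdir(target_dir):
            os.makedirs(target_dir)
        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(target_dir, name)
            # shallow=False compares file contents, not just size and mtime.
            if os.path.isfile(target) and filecmp.cmp(source, target, shallow=False):
                continue
            shutil.copy2(source, target)
            copied.append(target)
    return copied
```

Running it a second time on an unchanged tree returns an empty list, because every file already has an identical copy.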

We now reach the very core of the program: monitoring the file system with Python.
We wrote a class called FileSystemWatcher, to which we pass the list of directories to be monitored. Its job is to ‘listen’ for file-system events via the Pyinotify module.

Of course, we decide which events are captured. The following line selects the events we want to receive.

self.mask = EventsCodes.IN_DELETE | EventsCodes.IN_CREATE | EventsCodes.IN_MODIFY
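
The mask is an ordinary integer in which each event type occupies one bit. The short sketch below shows the idea without importing Pyinotify; the three constants carry the values from the Linux header <sys/inotify.h>, while the helper is_selected is a hypothetical name used only for illustration.

```python
# Values as defined in the Linux header <sys/inotify.h>.
IN_MODIFY = 0x00000002
IN_CREATE = 0x00000100
IN_DELETE = 0x00000200

# OR-ing the flags yields one integer that selects all three event types.
mask = IN_DELETE | IN_CREATE | IN_MODIFY


def is_selected(event_mask, watched=mask):
    """True if any bit of the event is one of the bits being watched."""
    return bool(event_mask & watched)
```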

Naming the events is not enough, though: we also need a class whose callback methods are triggered when such an event occurs. Here that is the class FileSystemEvent, with its three callbacks: process_IN_CREATE, process_IN_DELETE and process_IN_MODIFY.
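
The naming convention behind those callbacks, one method per event called process_ followed by the event name, can be illustrated with a small stand-alone dispatcher. This is a simplified sketch of the convention, not Pyinotify’s actual implementation; the classes Event and Handler are hypothetical.

```python
class Event(object):
    """Stand-in for a notification: an event name plus the affected path."""

    def __init__(self, maskname, path):
        self.maskname = maskname
        self.path = path


class Handler(object):
    """Routes each event to a method named process_<event name>."""

    def __init__(self):
        self.seen = []

    def __call__(self, event):
        # Look the callback up by name; fall back to process_default.
        method = getattr(self, "process_" + event.maskname, self.process_default)
        return method(event)

    def process_IN_CREATE(self, event):
        self.seen.append(("created", event.path))

    def process_IN_DELETE(self, event):
        self.seen.append(("deleted", event.path))

    def process_default(self, event):
        # Events without a dedicated callback end up here.
        self.seen.append(("ignored", event.path))
```

Calling the handler with an IN_CREATE event reaches process_IN_CREATE, while an event that has no matching method falls through to process_default.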

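Finally, the class FileBackup, through its process_file method, checks whether an identical copy of the file is already in the backup directory and copies the file only if it is not. Deleted files are left untouched in the backup, since process_file only copies files that still exist.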
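
The compare-then-copy step for a single file can be isolated into a few lines. The sketch below is an illustration under assumptions: copy_if_changed is a hypothetical name, it targets a current Python 3, and it compares file contents with shallow=False, which is stricter than the default comparison used in the script.

```python
import filecmp
import os
import shutil


def copy_if_changed(src, dst_dir):
    """Copy src into dst_dir unless an identical copy is already there.

    Returns True if a copy was made, False if it was skipped.
    """
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    target = os.path.join(dst_dir, os.path.basename(src))
    if os.path.isfile(target) and filecmp.cmp(src, target, shallow=False):
        return False
    shutil.copy2(src, target)  # copy2 also preserves timestamps
    return True
```

Skipping identical files matters here because a single save in an editor can raise several IN_MODIFY events in a row.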

As mentioned, this is a very simple backup script, and it could certainly be improved and optimized.

Well, have fun!

NOTE: The script has been tested with Python 2.5.2 and, needless to say, on GNU/Linux; those of you who depend on MS Windows will probably have to sit this one out :-)



