In this tutorial we’ll show how easy it is to build a simple real-time backup tool with the Python module Pyinotify. The module is based on inotify, a Linux kernel feature included since kernel 2.6.13. inotify is an event-driven notifier: it delivers file-system notifications from kernel space to user space through system calls, and Pyinotify wraps those calls for Python.
Let’s get to the code. Create a file named “backup.py” in your working directory and paste in the following code:
```python
import os, os.path, time, signal, sys, operator, glob, shutil, filecmp
from pyinotify import WatchManager, Notifier, ThreadedNotifier, EventsCodes, ProcessEvent

BACKUP = {
    "/home/user/my_dir" : "/home/user/my_backup",
    "/home/user/test" : "/home/user/test_backup"
}

class FileSystemEvent(ProcessEvent):
    def __init__(self, watcher):
        self.f = FileBackup(watcher)

    def process_event(self, event):
        self.f.process_file(event.path, event.name)

    def process_IN_CREATE(self, event):
        self.process_event(event)

    def process_IN_DELETE(self, event):
        self.process_event(event)

    def process_IN_MODIFY(self, event):
        self.process_event(event)

class FileBackup:
    def __init__(self, watcher):
        self.watcher = watcher
        self.l = BACKUP.keys()

    def process_file(self, event_path, event_name):
        path = os.path.join(event_path, event_name)
        if os.path.isdir(path):
            self.watcher.add(path) #watch new dir
            base = self.get_base_path(path)
        else:
            base = self.get_base_path(os.path.dirname(path))
        dst = BACKUP[base[0]]+base[1]
        try:
            os.makedirs(dst)
        except:
            pass
        if os.path.isfile(path):
            if os.path.isfile(dst+"/"+event_name):
                if filecmp.cmp(path, dst+"/"+event_name):
                    return
            shutil.copy2(path, dst+"/"+event_name)

    def get_base_path(self, path):
        for p in self.l:
            if (len(path.replace(p, '')) < len(path)):
                return p, path.replace(p, '')
        return "",""

class FileSystemWatcher():
    def __init__(self, dirs=[]):
        self.wm = WatchManager()
        self.fse = FileSystemEvent(self)
        self.mask = EventsCodes.IN_DELETE | EventsCodes.IN_CREATE | EventsCodes.IN_MODIFY # events
        self.notifier = ThreadedNotifier(self.wm, self.fse)
        for dir in dirs:
            self.add(dir)

    def start(self):
        self.notifier.start()

    def block(self):
        self.notifier.join()

    def close(self):
        self.notifier.close()

    def add(self, dirpath):
        if os.path.isdir(dirpath):
            self.wdd = self.wm.add_watch(dirpath, self.mask, rec=True)

class KeyboardCatcher:
    def __init__(self):
        self.child = os.fork()
        if self.child == 0:
            return
        else:
            self.watch()

    def watch(self):
        try:
            os.wait()
        except KeyboardInterrupt:
            print 'KeyBoardInterrupt'
            self.kill()
        sys.exit()

    def kill(self):
        try:
            os.kill(self.child, signal.SIGKILL)
        except OSError:
            pass

def _copy_base_dir(src, dst):
    try:
        os.makedirs(dst)
    except OSError:
        pass # Directory already exists!
    for path in glob.glob("%s/*" % src):
        dst2 = os.path.join(dst, os.path.basename(path))
        if os.path.isdir(path):
            _copy_base_dir(path, dst2)
        elif os.path.isfile(path):
            if os.path.isfile(dst2):
                if filecmp.cmp(path, dst2):
                    continue
            shutil.copy2(path, dst)

def copy_base_dir():
    for src in BACKUP:
        _copy_base_dir(src, BACKUP[src])

def main():
    KeyboardCatcher()
    #copy_base_dir()
    w = FileSystemWatcher(BACKUP.keys())
    w.start()
    w.block()

if __name__ == "__main__":
    main()
```
With this script we can back up our files in real time.
First we define a dictionary called BACKUP, which maps each directory to be monitored to the directory where its files will be saved; this lets us back up one or more directories.
BACKUP = { "from1" : "to1", "from2" : "to2" }
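To make the mapping concrete, here is a minimal sketch of how a path inside a watched directory might be translated into its backup location. It assumes Python 3, and the helper name `backup_destination` is hypothetical (it is not part of the script). Unlike the script’s `get_base_path`, which relies on `str.replace`, this version only matches whole directory prefixes.

```python
import os

# Example mapping in the same shape as BACKUP: source dir -> backup dir.
BACKUP = {
    "/home/user/my_dir": "/home/user/my_backup",
    "/home/user/test": "/home/user/test_backup",
}

def backup_destination(path, mapping=BACKUP):
    """Return the backup path for `path`, or None if no watched dir contains it."""
    for src, dst in mapping.items():
        # Match whole path components, so "/home/user/test2" is not
        # mistaken for something inside "/home/user/test".
        if path == src or path.startswith(src + os.sep):
            rel = os.path.relpath(path, src)
            return dst if rel == os.curdir else os.path.join(dst, rel)
    return None
```

Returning None for unknown paths lets the caller decide what to do, whereas the script’s `"", ""` fallback would raise a KeyError on the BACKUP lookup.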
Now… on with the main() function:
The first instruction, KeyboardCatcher(), forks a child process. This class is not mandatory, but it lets us handle a keyboard interrupt (Ctrl-C) cleanly: the parent process waits in os.wait() and kills the child when the interrupt arrives, while the child process carries on with the rest of the script.
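The fork-and-wait pattern can be sketched on its own as follows. This is a simplified variant for Python 3 on a POSIX system, not the script’s exact mechanism: the hypothetical function `supervise` runs a callable in the child and makes the child exit afterwards, whereas KeyboardCatcher lets the child return and continue through main().

```python
import os
import signal

def supervise(work):
    """Fork, run `work()` in the child, and wait for it in the parent.

    Returns the child's exit code, or the negated signal number if the
    child was killed by a signal.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: do the real work, then exit immediately so that
        # control never returns to the caller in this process.
        try:
            work()
            os._exit(0)
        except BaseException:
            os._exit(1)
    try:
        # Parent process: block until the child terminates.
        _, status = os.waitpid(pid, 0)
    except KeyboardInterrupt:
        # Ctrl-C reached the parent: stop the child, then reap it.
        os.kill(pid, signal.SIGKILL)
        _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

Keeping the parent idle in a blocking wait means it is always ready to receive the interrupt, even if the child is busy inside a notifier thread.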
We commented out the call to the copy_base_dir() function, but you can enable it if it suits your needs. In a nutshell, it backs up the directories before file-system monitoring starts, skipping every file whose backup copy already exists and is identical.
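As an illustration of that initial pass, the sketch below does the same job with `os.walk` instead of the script’s recursive `glob` approach. It assumes Python 3, and the function name `sync_tree` is hypothetical.

```python
import filecmp
import os
import shutil

def sync_tree(src, dst):
    """Copy files from src to dst, skipping files whose backup already matches.

    Returns the list of relative paths that were actually copied.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = dst if rel == os.curdir else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(target_dir, name)
            if os.path.isfile(target) and filecmp.cmp(source, target):
                continue  # the backup copy is already up to date
            shutil.copy2(source, target)  # copy2 also preserves timestamps
            copied.append(os.path.normpath(os.path.join(rel, name)))
    return copied
```

Running it twice in a row on an unchanged tree should copy everything the first time and nothing the second time.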
Now we reach the core of the program: monitoring the file system from Python.
We wrote a class called FileSystemWatcher, to which we pass the list of directories to be monitored. This class is responsible for ‘listening’ for file-system events via the Pyinotify module.
Of course, we decide which events are captured. The following line selects them:
self.mask = EventsCodes.IN_DELETE | EventsCodes.IN_CREATE | EventsCodes.IN_MODIFY
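The mask is an ordinary bit field, so the events are combined with a bitwise OR. The stand-alone sketch below (Python 3, no Pyinotify needed) uses the numeric values from the Linux inotify header to show the idea; the helper `is_selected` is hypothetical and only illustrates how a flag can be tested against the mask.

```python
# Flag values as defined in the Linux <sys/inotify.h> header.
IN_ACCESS = 0x00000001  # file was read (not selected below)
IN_MODIFY = 0x00000002  # file was written to
IN_CREATE = 0x00000100  # file or directory created in a watched directory
IN_DELETE = 0x00000200  # file or directory deleted from a watched directory

# Each flag occupies its own bit, so OR-ing them builds the set of events.
mask = IN_DELETE | IN_CREATE | IN_MODIFY

def is_selected(event_flag, mask):
    """Return True if `event_flag` is one of the events enabled in `mask`."""
    return bool(event_flag & mask)
```

Adding another event to the watch is then just a matter of OR-ing one more flag into the mask.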
Naming the events is not enough, though: we also need a class whose callback methods are triggered when such an event occurs. Here that is the class FileSystemEvent, with its three callbacks: process_IN_CREATE, process_IN_DELETE and process_IN_MODIFY.
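The naming convention is what links an event to its callback. The toy model below (Python 3) shows one way such name-based dispatch can work; it is a simplified illustration with hypothetical class names, not Pyinotify’s actual implementation.

```python
class EventDispatcher:
    """Toy stand-in for an event-handler base class."""

    def dispatch(self, event_name, payload):
        # Look for a method called process_<EVENT_NAME>; fall back to a
        # default handler when the subclass does not define one.
        handler = getattr(self, "process_" + event_name, self.process_default)
        return handler(payload)

    def process_default(self, payload):
        return "ignored"


class BackupHandler(EventDispatcher):
    def process_IN_CREATE(self, payload):
        return "created " + payload

    def process_IN_MODIFY(self, payload):
        return "modified " + payload
```

With this design, supporting a new event only requires adding one more `process_…` method to the subclass.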
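Finally, the class FileBackup, through its process_file method, checks whether the file to be saved is already in the backup directory and copies it only if it is missing or differs from the backup copy. Newly created directories are added to the watch list. Deletions are not mirrored: a file deleted from a watched directory stays in the backup.
As mentioned, this is a very simple backup script, and it could certainly be improved and optimized.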
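One detail worth knowing about the comparison step: by default `filecmp.cmp` does a shallow comparison, treating two files as equal when their type, size and modification time match, without reading their contents. The sketch below (Python 3, run in a temporary directory) simulates an edit that leaves size and timestamp unchanged to show the difference from `shallow=False`. Such an edit is rare in practice, so treat this as an illustration of a possible refinement rather than a flaw you are likely to hit.

```python
import filecmp
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "original.txt")
backup = os.path.join(workdir, "backup.txt")

with open(original, "w") as fh:
    fh.write("version A")
shutil.copy2(original, backup)  # copies contents and timestamps

same_after_copy = filecmp.cmp(original, backup)

# Rewrite the original with different content of the same length, then put
# the old timestamps back to simulate an edit that the stat data misses.
stamp = os.stat(backup)
with open(original, "w") as fh:
    fh.write("version B")
os.utime(original, ns=(stamp.st_atime_ns, stamp.st_mtime_ns))

shallow_result = filecmp.cmp(original, backup)              # trusts os.stat() data
deep_result = filecmp.cmp(original, backup, shallow=False)  # compares contents

shutil.rmtree(workdir)
```

Passing `shallow=False` costs a full read of both files whenever their sizes match, which is the price of a byte-for-byte check.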
Well, have fun!
NOTE: The script has been tested with Python 2.5.2 and, needless to say, on GNU/Linux; those of you who depend on MS Windows will have to sit this one out.