Usage
Outline
The idea is that you have one folder you want to store a backup of your files / folders into. This folder will be named destination folder in the following.
You may have multiple files / folders that shall be backupped. Most of those shall be compressed, but some may be already compressed or shall be kept as-are for any reason. That's why grebakker supports both, copying and compressing files / folders.
In addition, you may want to define own backup rules for subfolders. E.g., you may have a folder where all your projects are located in. You could compress each project individually, but some may be too big for being stored in a single archive. Or you may want to exclude some files or folders in one part of the project, but not in another one. That's why grebakker supports subfolders.
Basics
grebakker backups files and folders into a destination folder. The files / folders to backup are either copied or compressed. In addition, grebakker may walk through a hierarchy of folders to backup.
As such, grebakker needs to know which files / folders shall be backupped and how. For each folder to backup, at least one definition file called grebakker.json
is needed. This file uses JSON to define what shall be processed and how. Here is an example:
{
"destination": "d/",
"copy": [
"document.pdf",
"old_backups"
],
"compress": [
{ "sub": "repository" },
{ "sub": "current", "exclude": ["venv"] }
],
"subfolders": [ "attic" ]
}
You will find the following information in it:
destination
is a subfolder in the global destination folder to save the backup into- the (optional) attribute
copy
lists files and folders that shall be copied into the destination folder - the (optional) attribute
compress
lists files and folders that shall be compressed; the generated archive is stored in the destination folder - the (optional) attribute
subfolders
lists (sub)folders that shall be processed; in each of these folder, a newgrebakker.json
file must be given
The synopsis of grebakker is briefly
python src\grebakker.py <ACTION> <DESTINATION> <SOURCE>
So given the file above, you may start grebakker like this:
python src\grebakker.py backup f:\backup\2025_05 d:\
Then, grebakker reads the d:\grebakker.json
file and copies all files / folders listed in the copy
attribute, compresses all those listed in the compress
attribute, and runs iteratively over all folders given in the subfolders
attribute.
Definition files
Take the example from above. For each project you want to backup, set up a file that describes how it shall be backupped - which files / folders shall be copied, which compressed and into which folders grebakker shall dive into.
Logging
grebakker logs performed actions - either using the csv or the json format.
A csv file lists each action in a line, consisting of <ACTION>;<SOURCE>;<DESTINATION>;<DURATION>. So a csv log file could look like this:
"copy";"document.pdf";"d/document.pdf";"0:00:01.11"
"copy";"old_backups";"d/old_backups";"0:01:02.2"
"compress";"repository";"d/repository.zip";"0:03:02.34545"
"compress";"current";"d/current.zip";"0:01:02.43242"
"sub";"attic";"d/attic";"0:00:00"
"compress";"attic/subfolder11";"d/attic/subfolder11.zip";"0:01:01.2398"
"compress";"attic/subfolder12";"d/attic/subfolder12.zip";"0:02:01.2342"
The json output stores the same information like '{ "action": "<ACTION>", "source": "<SOURCE>", "destination": "<DESTINATION>", "duration": "<DURATION>" }'.
[
{ "action": "copy", "source": "document.pdf", "destination": "d/document.pdf", "duration": "0:00:0111" },
{ "action": "copy", "source": "old_backups", "destination": "d/old_backups", "duration": "0:01:02.2" },
{ "action": "compress", "source": "repository", "destination": "d/repository.zip", "duration": "0:03:02.34545" },
{ "action": "compress", "source": "current", "destination": "d/current.zip", "duration": "0:01:02.43242" },
{ "action": "sub", "source": "attic", "destination": "d/attic", "duration": "0:00:00" },
{ "action": "compress", "source": "attic/subfolder11", "destination": "d/attic/subfolder11.zip", "duration": "0:01:01.2398" },
{ "action": "compress", "source": "attic/subfolder11", "destination": "d/attic/subfolder12.zip", "duration": "0:02:01.2342" }
]
The log shall report what has been done. It may serve further purposes in the future.
Safety
Well, ok, the thing is about backupping your files. This is important and it should work, right?
That's why grebakker fails early and loud. After a successful backup, you should have the files located in your destination folder, including the backup definition files, and the log file. Only, and only if all these constraints are true (and grebakker is not buggy, uh?), your backup was successful.
In any other cases grebakker will complain with an error message and will stop the operation. It is then up to you to check the error messages and adapt needed actions.
BTW, I am paranoid here. I make a backup, extract the backupped files to a second folder and check whether all that shall be backupped was backupped.
How to start?
Yes, well, basically, you have to define a backup definition file for all the projects you want to back up and that's all... Run grebakker on them.
Any issues?
Yes, there are.
Currently, grebakker does not report correctly when a folder was not copied or compressed completely. This must be changed in next versions.
As well, the tests - though covering almost 100% of the code - do not cover the compression of files well. This should be changed in the next steps as well.
- Outline
- Basics
- Definition files
- Logging
- Safety
- How to start?
- Any issues?
Table of contents