Lab Activity: Automate Backup and Cleanup of a Target Folder
Objectives:
- Practice working with file system operations in Python using os and pathlib.
- Learn how to generate timestamped folder names with the time module.
- Implement file copying using binary read/write operations.
- Understand how to automate cleanup by deleting or moving files after backup.
- Reinforce concepts of iteration, exception handling, and safe file handling in automation tasks.
In this lab, we will apply what we’ve learned to create a simple automation script. The goal is to back up all files from a target directory into a new folder (with a timestamped name), and then optionally clean up the original files. This simulates a backup rotation or cleanup task that might be scheduled to run periodically.
Scenario: Imagine you have a folder called “target_folder” that accumulates files (for example, daily reports or logs). You want to write a Python program that:
Creates a backup folder named with the current date/time.
Copies all files from target_folder into the backup folder.
(Optional) Deletes the original files in target_folder once backed up, or perhaps deletes only files older than a certain date (for safety, we will delete them in this lab to simulate a cleanup).
Follow the steps below to create this automation. (This can be done as a standalone script. Ensure you have a test directory with some sample files to try it out.)
Step-by-Step Instructions:
Step 1: Setup Paths and Timestamp
Decide on your target directory path. For example, you might have it as a Path or string. Let’s assume the target folder is “target_folder” in the current working directory.
Determine a name for the backup directory using the current timestamp. We can use the time module to get a timestamp string.
import os, time
from pathlib import Path

# Define the target directory
target_dir = Path("target_folder")

# Ensure the target folder exists before doing anything else
if not target_dir.exists():
    print(f"Target folder {target_dir} not found!")
    raise SystemExit(1)

# Create a timestamped folder name, e.g., "backup_20250806_232045"
timestamp = time.strftime("%Y%m%d_%H%M%S")  # format: YYYYMMDD_HHMMSS
backup_dir = Path(f"backup_{timestamp}")
print("Backup directory will be:", backup_dir)
We used Path for convenience. We check if target_folder exists; if not, exit or handle accordingly.
time.strftime("%Y%m%d_%H%M%S") gives a string like 20250806_232045 for August 6, 2025 at 23:20:45. We prepend "backup_" to form the backup folder name.
We print the name to verify. (In a real script, you might log this or omit the print.)
Step 2: Create the Backup Directory
Next, create the backup directory on disk. Use os.mkdir() or Path.mkdir().
It’s good to handle the case that the directory might already exist (though with a timestamp, it’s unlikely to collide).
# Create the backup directory
backup_dir.mkdir(exist_ok=True)
Using exist_ok=True prevents an error if, by chance, a folder with the same name exists (perhaps from a run in the same second).
Step 3: Copy Files to the Backup
We need to copy all files from target_folder into backup_dir.
We can use the glob module or Path.iterdir() to get all files. Let’s use iterdir() for simplicity.
For each file, open it in ‘rb’ (read binary) and create a new file in backup_dir with the same name in ‘wb’ (write binary), then copy bytes.
for item in target_dir.iterdir():
    if item.is_file():
        # Construct the corresponding path in backup_dir
        dest_file = backup_dir / item.name
        # Copy file contents, reading and writing in chunks
        with open(item, 'rb') as src, open(dest_file, 'wb') as dst:
            while chunk := src.read(4096):
                dst.write(chunk)
        print(f"Copied {item.name} to {dest_file}")
Explanation:
We loop through each item in the target directory.
Check item.is_file(). We skip subdirectories in this simple scenario. (If your target can have subfolders and you want to include them, you’d need a recursive copy approach or use shutil.copytree. But here we assume a flat structure in the target for simplicity.)
dest_file = backup_dir / item.name creates the new file path inside the backup directory with the same filename.
We open the source file in binary read and the destination in binary write. Then we read in chunks of 4096 bytes and write those chunks to the new file. This ensures even large files are handled without reading everything into memory at once.
After the loop, each file from the target should be duplicated in the backup folder. We print a message for each to confirm.
(Alternatively, we could have used shutil.copy2(item, dest_file) to copy files with metadata preserved, but doing it manually as above demonstrates the file handling concepts and uses binary mode reading/writing.)
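For comparison, here is what that alternative looks like. This is a small sketch (the function name backup_with_shutil is our own, not part of any library) showing how shutil.copy2 replaces the manual chunked read/write while preserving file metadata such as modification times:

```python
import shutil
from pathlib import Path

def backup_with_shutil(target_dir: Path, backup_dir: Path) -> int:
    """Copy every regular file from target_dir into backup_dir using
    shutil.copy2 (which preserves metadata). Returns the number copied."""
    backup_dir.mkdir(exist_ok=True)
    copied = 0
    for item in target_dir.iterdir():
        if item.is_file():
            # shutil.copy2 handles the chunked reading/writing internally
            shutil.copy2(item, backup_dir / item.name)
            copied += 1
    return copied
```

Either approach produces the same backup; the manual version simply makes the underlying file handling visible.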
Step 4: Clean Up Original Files (Careful!)
After successfully copying, if the goal is to remove the originals (to clean up space or avoid duplication), we can delete them. This step is irreversible, so in a real script ensure your backup is verified before deleting originals.
We can simply use os.remove() or Path.unlink() to delete each file.
Code (continuing from above):
        # After copying, delete the original file (optional cleanup)
        try:
            item.unlink()  # equivalent to os.remove(item)
            print(f"Deleted original {item.name}")
        except Exception as e:
            print(f"Could not delete {item.name}: {e}")
We put this in the same loop right after copying each file. We use Path.unlink() which removes the file. We wrap in try/except to handle any unexpected errors (like permission issues) – in a robust script you might handle those differently or log them.
Important: Only do this if you’re sure the backup succeeded. In our code, if an error occurs during copy (for instance, read error), the script would break out and not reach deletion. If you wanted to be extra safe, you could first collect all file paths, copy everything, verify, then if all good, loop again to delete. But for this exercise, we proceed as above for simplicity.
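The safer "copy everything, verify, then delete" pattern mentioned above can be sketched as follows (the function name is ours, and we use a simple file-size comparison as the verification step; a real script might use checksums instead):

```python
from pathlib import Path

def copy_then_verify_then_delete(target_dir: Path, backup_dir: Path) -> bool:
    """Two-pass cleanup: copy every file, verify the copies by size,
    and delete the originals only if every copy checks out."""
    backup_dir.mkdir(exist_ok=True)
    files = [p for p in target_dir.iterdir() if p.is_file()]
    # Pass 1: copy every file
    for src in files:
        (backup_dir / src.name).write_bytes(src.read_bytes())
    # Pass 2: verify sizes match before touching any original
    for src in files:
        if (backup_dir / src.name).stat().st_size != src.stat().st_size:
            return False  # verification failed; keep all originals
    # Pass 3: delete originals only after verification passed
    for src in files:
        src.unlink()
    return True
```

Note that this sketch reads each file fully into memory (read_bytes), which is fine for small files; for large files you would keep the chunked copy from Step 3.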
Step 5: Verify the Backup
It’s good practice to verify that the backup files exist and possibly have the same size as the originals. While not writing full verification code here, you can do a quick check:
for item in backup_dir.iterdir():
    print(item.name, "-", item.stat().st_size, "bytes")
Also, check that the target directory is now empty (if you deleted everything). You could print a message if list(target_dir.iterdir()) is empty, indicating cleanup done.
This verification could also involve checking the count of files, etc., but manual observation can suffice for now.
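If a size check feels too weak, a content hash gives much stronger verification. A minimal sketch (the helper name sha256_of is our own), reusing the same chunked-reading pattern from Step 3:

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hash a file in 4096-byte chunks so large files stay memory-friendly."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(4096):
            h.update(chunk)
    return h.hexdigest()
```

Comparing sha256_of(original) with sha256_of(copy) before deleting confirms the backup is byte-for-byte identical.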
Running the Lab Script
If you run the combined code (Steps 1–4 in one script) on a test folder, you should see output like:
Backup directory will be: backup_20250806_232045
Copied report1.txt to backup_20250806_232045/report1.txt
Deleted original report1.txt
Copied report2.txt to backup_20250806_232045/report2.txt
Deleted original report2.txt
Copied image.png to backup_20250806_232045/image.png
Deleted original image.png
After running, target_folder would be empty (files removed) and a new folder backup_20250806_232045 would contain the files that were in target_folder. The timestamp ensures each backup run creates a new directory and doesn’t overwrite previous backups.
Clean Up Strategy Variations:
Instead of deleting files, you could move them. Using os.rename(item, dest_file) could move the file into the backup folder (which is effectively a cut-and-paste). That’s even simpler and ensures the backup has the file (since it’s literally the same file moved). However, moving is not copying – if the intention is to keep an original and have a copy, use copy + delete.
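A sketch of the move-based variation (the function name move_files is ours; we use pathlib's Path.rename, which is equivalent to os.rename here):

```python
from pathlib import Path

def move_files(target_dir: Path, backup_dir: Path) -> None:
    """Move each file into backup_dir instead of copying it."""
    backup_dir.mkdir(exist_ok=True)
    for item in target_dir.iterdir():
        if item.is_file():
            # rename moves the file; no separate delete step is needed
            item.rename(backup_dir / item.name)
```

One caveat: os.rename / Path.rename can fail if the source and destination are on different filesystems or drives; shutil.move handles that case by falling back to copy-then-delete.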
You might want to only delete files older than a certain age. You could check file modification times via os.stat(item).st_mtime (or item.stat().st_mtime with pathlib) and compare with the current time.
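The age-based cleanup can be sketched like this (delete_older_than is our own name; st_mtime is compared against the current time in seconds):

```python
import time
from pathlib import Path

def delete_older_than(target_dir: Path, max_age_days: float) -> list:
    """Delete files whose modification time is older than max_age_days;
    return the names of the files that were removed."""
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds per day
    deleted = []
    for item in target_dir.iterdir():
        if item.is_file() and item.stat().st_mtime < cutoff:
            item.unlink()
            deleted.append(item.name)
    return deleted
```

For example, delete_older_than(target_dir, 7) would remove only files untouched for more than a week, leaving recent files in place.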
You could compress the backup folder into a zip file using Python’s shutil.make_archive or zipfile module if desired, instead of just copying to a folder.
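A compression variation is a one-liner with shutil.make_archive; a hedged sketch (zip_backup is our own wrapper name):

```python
import shutil
from pathlib import Path

def zip_backup(target_dir: Path, archive_name: str) -> str:
    """Compress the contents of target_dir into <archive_name>.zip
    and return the path of the created archive."""
    return shutil.make_archive(archive_name, "zip", root_dir=target_dir)
```

Calling zip_backup(target_dir, f"backup_{timestamp}") would produce a single backup_20250806_232045.zip instead of a folder of copies.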
For now, our lab accomplishes a basic backup and cleanup. It uses:
- os/pathlib for filesystem operations.
- time.strftime for the timestamp in naming.
- file reading/writing in binary mode to handle any file type.
- optionally os.remove (via Path.unlink) for cleanup.
This automation can be run manually or scheduled (with Windows Task Scheduler, cron on Linux, etc.) to periodically archive files from a directory.
