Python Snippets

Automatic File Backup with Compression and Scheduling

This script creates an automated backup system that compresses files and directories, manages backup rotation, and includes scheduling capabilities. It’s particularly useful for creating regular backups of important project files or configuration directories.

import os
import shutil
import datetime
import zipfile
import schedule
import time
from pathlib import Path

def create_backup(source_path, backup_dir, max_backups=5):
    """
    Create a compressed backup of a file or directory.
    
    Args:
        source_path (str): Path to the source file or directory to backup
        backup_dir (str): Directory where backups will be stored
        max_backups (int): Maximum number of backups to keep
    """
    # Create backup directory if it doesn't exist
    backup_path = Path(backup_dir)
    backup_path.mkdir(parents=True, exist_ok=True)
    
    # Get source name and create timestamp
    source_name = Path(source_path).name
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    backup_filename = f"{source_name}_backup_{timestamp}.zip"
    backup_filepath = backup_path / backup_filename
    
    # Create the backup
    source_path = Path(source_path)
    if source_path.is_file():
        # Single file backup
        with zipfile.ZipFile(backup_filepath, 'w', zipfile.ZIP_DEFLATED) as zipf:
            zipf.write(source_path, source_path.name)
    elif source_path.is_dir():
        # Directory backup
        with zipfile.ZipFile(backup_filepath, 'w', zipfile.ZIP_DEFLATED) as zipf:
            for file_path in source_path.rglob('*'):
                if file_path.is_file():
                    zipf.write(file_path, file_path.relative_to(source_path.parent))
    else:
        raise FileNotFoundError(f"Source {source_path} does not exist")
    
    print(f"Backup created: {backup_filepath}")
    
    # Remove old backups if we exceed max_backups
    remove_old_backups(backup_path, source_name, max_backups)

def remove_old_backups(backup_dir, source_name, max_backups):
    """
    Remove old backups to keep only the maximum number specified.
    
    Args:
        backup_dir (Path): Path to backup directory
        source_name (str): Name of the source being backed up
        max_backups (int): Maximum number of backups to keep
    """
    backup_files = list(backup_dir.glob(f"{source_name}_backup_*.zip"))
    backup_files.sort(key=lambda x: x.stat().st_mtime, reverse=True)
    
    # Delete old backups
    for backup_file in backup_files[max_backups:]:
        backup_file.unlink()
        print(f"Removed old backup: {backup_file}")

def schedule_backup(source_path, backup_dir, max_backups=5, interval_hours=24):
    """
    Schedule regular backups.
    
    Args:
        source_path (str): Path to the source file or directory to backup
        backup_dir (str): Directory where backups will be stored
        max_backups (int): Maximum number of backups to keep
        interval_hours (int): Backup interval in hours
    """
    schedule.every(interval_hours).hours.do(
        create_backup, 
        source_path=source_path, 
        backup_dir=backup_dir, 
        max_backups=max_backups
    )
    
    print(f"Backup scheduled for {source_path} every {interval_hours} hours")
    print("Press Ctrl+C to stop the scheduler")
    
    # Run the scheduler
    while True:
        schedule.run_pending()
        time.sleep(60)

# Example usage
if __name__ == "__main__":
    # Create a single backup
    source_path = "important_documents"  # Change this to your target file or directory
    backup_dir = "backups"
    create_backup(source_path, backup_dir)
    
    # Schedule regular backups (uncomment to run)
    # schedule_backup(source_path, backup_dir, interval_hours=6)

What This Code Does

This script creates an automated backup system with the following features:

  1. Compressed Backups: Automatically creates ZIP files from individual files or entire directories
  2. Backup Rotation: Automatically removes old backups to prevent storage overuse
  3. Flexible Scheduling: Option to schedule regular backups (e.g., every 24 hours)
  4. Timestamps: Each backup includes a timestamp for easy identification

How to Use This Script

  1. Install the required library:
    pip install schedule
    
  2. Modify the source path: Change the source_path variable in the __main__ section to point to your file or directory you want to backup

  3. Set backup location: Change the backup_dir variable to the desired backup storage location

  4. Run a single backup:
    python backup_script.py
    
  5. Schedule regular backups: Uncomment the schedule_backup function call and run the script

Key Features Explained

This script is ideal for backing up important data, project files, or configuration settings, ensuring you can always recover from previous versions. It’s especially useful for developers or users who want to automate their backup processes without relying on third-party tools.