Python Snippets

JSON Data Validator with Schema Enforcement

This snippet provides a JSON validation utility that checks whether JSON data conforms to a predefined schema. It’s useful for validating API payloads, configuration files, or other structured data before processing.

import json
from typing import Any, Dict, Union
from jsonschema import Draft7Validator, FormatChecker

def validate_json_data(data: Union[str, Dict], schema: Dict) -> Dict[str, Any]:
    """
    Validate JSON data against a schema and return detailed results.

    Args:
        data: JSON string or dictionary to validate
        schema: JSON schema as a dictionary

    Returns:
        Dictionary containing validation results with keys:
        - 'valid': Boolean indicating if data is valid
        - 'errors': List of error messages (empty if valid)
        - 'data': Parsed data if valid, None otherwise
    """
    # Parse JSON string if needed
    if isinstance(data, str):
        try:
            parsed_data = json.loads(data)
        except json.JSONDecodeError as e:
            return {
                'valid': False,
                'errors': [f"Invalid JSON format: {str(e)}"],
                'data': None
            }
    else:
        parsed_data = data

    # Validate against schema. A format checker must be passed explicitly,
    # otherwise "format" keywords such as "email" are not enforced.
    validator = Draft7Validator(schema, format_checker=FormatChecker())
    errors = []
    
    for error in sorted(validator.iter_errors(parsed_data), key=str):
        errors.append(f"{error.message} at {error.json_path}")
    
    return {
        'valid': len(errors) == 0,
        'errors': errors,
        'data': parsed_data if len(errors) == 0 else None
    }

# Example usage
if __name__ == "__main__":
    # Define a schema for user profile data
    user_schema = {
        "type": "object",
        "properties": {
            "name": {"type": "string", "minLength": 1},
            "age": {"type": "integer", "minimum": 0},
            "email": {"type": "string", "format": "email"},
            "tags": {
                "type": "array",
                "items": {"type": "string"}
            }
        },
        "required": ["name", "email"],
        "additionalProperties": False
    }
    
    # Valid data example
    valid_json = '''
    {
        "name": "John Doe",
        "age": 30,
        "email": "john.doe@example.com",
        "tags": ["developer", "python"]
    }
    '''
    
    # Invalid data example
    invalid_json = '''
    {
        "name": "",
        "age": -5,
        "email": "invalid-email",
        "tags": ["developer", 123]
    }
    '''
    
    # Validate data
    result1 = validate_json_data(valid_json, user_schema)
    result2 = validate_json_data(invalid_json, user_schema)
    
    print("Valid data result:")
    print(json.dumps(result1, indent=2))
    
    print("\nInvalid data result:")
    print(json.dumps(result2, indent=2))

Explanation

This JSON validator snippet validates JSON data against a defined schema and reports every violation it finds. It’s built on the widely used jsonschema library, which supports several JSON Schema drafts; this snippet uses Draft7Validator, so schemas are interpreted according to the Draft 7 specification.
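One detail of the jsonschema library is worth illustrating here: "format" keywords such as "email" are only enforced when a format checker is supplied to the validator. The sketch below compares the two behaviours; the variable names are illustrative only, and the built-in email check is understood to be deliberately loose rather than a full address validator.

```python
from jsonschema import Draft7Validator, FormatChecker

email_schema = {"type": "string", "format": "email"}

# Without a format checker, "format" acts as an annotation and is not enforced.
lenient = Draft7Validator(email_schema)

# With a format checker, the library's built-in "email" check is applied.
strict = Draft7Validator(email_schema, format_checker=FormatChecker())

lenient_ok = lenient.is_valid("invalid-email")
strict_ok = strict.is_valid("invalid-email")
strict_ok_for_real_address = strict.is_valid("john.doe@example.com")

print(lenient_ok, strict_ok, strict_ok_for_real_address)
```

If stricter email rules are needed, a "pattern" keyword or a separate check after validation is a common alternative.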

Key Features

  1. Flexible Input Handling: Accepts both JSON strings and Python dictionaries
  2. Detailed Error Reporting: Returns specific error messages with JSON path references
  3. Schema Compliance: Validates against the JSON Schema Draft 7 specification
  4. Clear Response Format: Returns structured results with validity status and errors

How It Works

  1. Data Parsing: Converts JSON strings to Python objects if needed
  2. Schema Validation: Uses Draft7Validator to check data against the schema
  3. Error Collection: Gathers all validation errors with precise location information
  4. Result Formatting: Returns a standardized result dictionary with validation outcome

Use Cases

Requirements

To run this snippet, you’ll need to install the required dependency:

pip install jsonschema
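
After installing, it can be useful to confirm which version is present, since the snippet relies on the json_path attribute of validation errors. The check below is a sketch that assumes json_path requires jsonschema 4.0 or later; the function name is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

def jsonschema_major_version():
    """Return the installed jsonschema major version, or None if not installed."""
    try:
        return int(version("jsonschema").split(".")[0])
    except PackageNotFoundError:
        return None

major = jsonschema_major_version()
if major is None:
    print("jsonschema is not installed")
elif major < 4:
    # Assumption: error.json_path is only available from 4.0 onward.
    print("jsonschema is older than 4.0; error.json_path may be unavailable")
else:
    print(f"jsonschema major version {major} found")
```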

Schema Definition

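The schema follows the JSON Schema specification. The example schema above defines property types, required fields, value constraints (minLength, minimum), string formats, array item types, and whether additional properties are allowed.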
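
As an illustration of a few other Draft 7 keywords, the sketch below uses a hypothetical product schema with pattern, enum, exclusiveMinimum, minItems, and uniqueItems. The schema and field names are invented for this example and are not part of the snippet above.

```python
from jsonschema import Draft7Validator

# Hypothetical schema showing additional Draft 7 keywords.
product_schema = {
    "type": "object",
    "properties": {
        "sku": {"type": "string", "pattern": "^[A-Z]{3}-[0-9]{4}$"},
        "status": {"enum": ["active", "discontinued"]},
        "price": {"type": "number", "exclusiveMinimum": 0},
        "sizes": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 1,
            "uniqueItems": True,
        },
    },
    "required": ["sku", "price"],
}

validator = Draft7Validator(product_schema)

good = {"sku": "ABC-1234", "status": "active", "price": 9.5, "sizes": ["S", "M"]}
bad = {"sku": "abc-1", "status": "retired", "price": 0, "sizes": ["S", "S"]}

good_is_valid = validator.is_valid(good)
# Collect the names of the schema keywords that the bad record violates.
failed_keywords = sorted({error.validator for error in validator.iter_errors(bad)})

print(good_is_valid)
print(failed_keywords)
```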

This validator is particularly valuable in production systems where data integrity is critical, because it reports every validation failure together with its location instead of stopping at the first error.