Python Snippets

CSV to JSON Converter with Custom Delimiters

This snippet converts a CSV file to JSON, with a configurable delimiter and optional parsing of JSON-encoded columns into nested objects and arrays. It’s useful for data preprocessing, API payload preparation, or migrating data between systems.

import csv
import json
from typing import List, Optional

def csv_to_json(
    input_file: str,
    output_file: str,
    delimiter: str = ",",
    nested_columns: Optional[List[str]] = None
) -> None:
    """
    Converts CSV to JSON with optional nested structures.
    
    Args:
        input_file: Path to input CSV file
        output_file: Path to output JSON file
        delimiter: CSV field delimiter (default: ',')
        nested_columns: List of columns to convert to nested objects
    """
    # newline='' lets the csv module handle line endings inside quoted fields
    with open(input_file, 'r', newline='') as csv_file:
        csv_reader = csv.DictReader(csv_file, delimiter=delimiter)
        data = []
        
        for row in csv_reader:
            if nested_columns:
                for col in nested_columns:
                    if col in row and row[col]:
                        try:
                            row[col] = json.loads(row[col])
                        except json.JSONDecodeError:
                            pass  # not valid JSON: keep the original string
            
            data.append(row)
    
    with open(output_file, 'w') as json_file:
        json.dump(data, json_file, indent=2)

if __name__ == "__main__":
    # Example usage
    csv_to_json(
        input_file="data.csv",
        output_file="output.json",
        delimiter=";",
        nested_columns=["metadata", "tags"]
    )

Why This Is Useful

  1. Data Transformation: Easily convert tabular data to JSON format for APIs or NoSQL databases
  2. Flexible Delimiters: Works with CSV files using commas, tabs, pipes, or other delimiters
  3. Nested Structure Support: Parses the columns named in nested_columns from JSON strings into nested objects and arrays
  4. Type Hints: Annotated parameters make the function easier to maintain and to check with static analysis tools
  5. Error Handling: Leaves a value unchanged, as a string, when a nested column does not contain valid JSON
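
The fallback described in the last point can be illustrated on a single cell. This is a minimal sketch, not part of the script: parse_cell is a hypothetical helper that mirrors the try/except inside csv_to_json. Note that json.loads only accepts valid JSON, so Python-style single-quoted text is left as a string.

```python
import json

def parse_cell(value):
    """Hypothetical helper: return parsed JSON, or the original string on failure."""
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        return value

valid = parse_cell('{"age": 30}')     # valid JSON: becomes a dict
invalid = parse_cell("{'age': 30}")   # single quotes: not JSON, stays a string

print(type(valid).__name__, type(invalid).__name__)
```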

How to Use

  1. Save the code as csv_to_json.py
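  2. Create a CSV file (example data.csv). Nested columns must hold valid JSON, so the inner double quotes are doubled as CSV quoting requires:
    id;name;metadata;tags
    1;Alice;"{""age"": 30}";"[""admin"",""user""]"
    2;Bob;"{""age"": 25}";"[""user""]"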
    
  3. Run the script:
    python csv_to_json.py
    
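  4. Output (output.json) will contain the following data (nested values are shown on one line here; with indent=2 they are spread across several lines):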
    [
      {
        "id": "1",
        "name": "Alice",
        "metadata": {"age": 30},
        "tags": ["admin", "user"]
      },
      {
        "id": "2",
        "name": "Bob",
        "metadata": {"age": 25},
        "tags": ["user"]
      }
    ]
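
The steps above can also be tried without creating any files. The sketch below is an assumption-laden, in-memory variant: convert_text is a hypothetical function that applies the same logic as csv_to_json to a string, and SAMPLE is illustrative data whose nested columns hold valid JSON with doubled inner quotes.

```python
import csv
import io
import json

# Illustrative sample data; inner double quotes are doubled per CSV quoting.
SAMPLE = (
    'id;name;metadata;tags\n'
    '1;Alice;"{""age"": 30}";"[""admin"",""user""]"\n'
    '2;Bob;"{""age"": 25}";"[""user""]"\n'
)

def convert_text(text, delimiter=",", nested_columns=None):
    """Hypothetical in-memory variant of csv_to_json: returns a list of dicts."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        for col in nested_columns or []:
            if row.get(col):
                try:
                    row[col] = json.loads(row[col])
                except json.JSONDecodeError:
                    pass  # not valid JSON: keep the original string
        rows.append(row)
    return rows

rows = convert_text(SAMPLE, delimiter=";", nested_columns=["metadata", "tags"])
print(json.dumps(rows, indent=2))
```

Columns that are not listed in nested_columns, such as id, remain strings because csv.DictReader does not infer types.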
    

Set the nested_columns parameter to the columns that contain JSON strings you want converted to objects or arrays. All other values are kept as strings, exactly as they appear in the CSV, and cells that are not valid JSON are left unchanged.
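
When you produce the input file yourself, one way to get correctly quoted JSON cells is to let the csv module do the escaping. The following is a rough sketch under stated assumptions: the records list and the tab delimiter are illustrative choices, and the data is written to an in-memory buffer rather than a file.

```python
import csv
import io
import json

# Illustrative records; "metadata" is the column that will hold JSON text.
records = [
    {"id": 1, "name": "Alice", "metadata": {"age": 30}},
    {"id": 2, "name": "Bob", "metadata": {"age": 25}},
]

buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["id", "name", "metadata"], delimiter="\t")
writer.writeheader()
for record in records:
    # json.dumps yields valid JSON; the csv writer doubles the quotes it contains.
    writer.writerow({**record, "metadata": json.dumps(record["metadata"])})

tsv_text = buffer.getvalue()

# Reading it back: every value is a string until the nested column is parsed.
rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
for row in rows:
    row["metadata"] = json.loads(row["metadata"])
```

A file written this way can be passed to csv_to_json with delimiter="\t" and nested_columns=["metadata"].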