This snippet shows how to parse large JSON log files line by line, so you can process log data without loading the entire file into memory.
import json
from typing import Iterator, Dict


def parse_large_json_log(file_path: str) -> Iterator[Dict]:
    """
    Efficiently reads and parses a large JSON log file line by line.
    Each line should be a valid JSON object (common in log formats).

    Args:
        file_path (str): Path to the JSON log file.

    Yields:
        Dict: Parsed JSON object for each line.
    """
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:
            line = line.strip()
            if not line:
                continue  # Skip blank lines instead of reporting them as invalid
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                print(f"Skipping invalid JSON line: {line[:100]}...")


# Example usage:
if __name__ == "__main__":
    log_file = "server_logs.json"
    for log_entry in parse_large_json_log(log_file):
        print(log_entry.get('timestamp'), log_entry.get('message'))
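In production code you might prefer the standard logging module over print for reporting bad lines, since log output is then configurable and separable from normal program output. A minimal variant (the function name parse_large_json_log_logged is my own, not part of the original snippet) could look like this:

import json
import logging
from typing import Iterator, Dict

logger = logging.getLogger(__name__)


def parse_large_json_log_logged(file_path: str) -> Iterator[Dict]:
    """Variant that reports invalid lines via logging instead of print."""
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                # %.100s truncates the offending line to 100 characters
                logger.warning("Skipping invalid JSON line: %.100s", line)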
Save the snippet as json_log_parser.py and run it with: python json_log_parser.py. Replace server_logs.json with your log file path.
Input (server_logs.json):
{"timestamp": "2024-01-01T12:00:00Z", "level": "INFO", "message": "Service started"}
{"timestamp": "2024-01-01T12:01:00Z", "level": "ERROR", "message": "Disk full"}
Output:
2024-01-01T12:00:00Z Service started
2024-01-01T12:01:00Z Disk full
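To reproduce the example, a small throwaway script (hypothetical; any way of creating the file works) can write the sample input shown above:

# Hypothetical helper: writes the two sample log lines to server_logs.json
sample_lines = [
    '{"timestamp": "2024-01-01T12:00:00Z", "level": "INFO", "message": "Service started"}',
    '{"timestamp": "2024-01-01T12:01:00Z", "level": "ERROR", "message": "Disk full"}',
]
with open("server_logs.json", "w", encoding="utf-8") as f:
    f.write("\n".join(sample_lines) + "\n")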
Because parse_large_json_log is a generator, memory use stays roughly constant regardless of file size, and malformed lines are skipped rather than aborting the run, which covers two common pitfalls in real-world log processing.
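Since the parser is a generator, it also composes with other streaming tools. As one illustration (my own sketch, not part of the original snippet), this counts entries per log level without ever materializing the whole log in memory:

from collections import Counter

# Stream the log once, tallying entries by level without building a list
level_counts = Counter(
    entry.get("level", "UNKNOWN")
    for entry in parse_large_json_log("server_logs.json")
)
print(level_counts["ERROR"])  # 1 for the sample input above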