Breach Parser – Forensic Analysis Report

Report ID: BP-2026-04-20-001
Date of Report: April 20, 2026
Prepared by: Security Incident Response Team (SIRT)
Classification: CONFIDENTIAL / TLP:AMBER

5. Credential Analysis

Check password hashes against known crackable patterns (rockyou, top 10k)
Decode base64 passwords if flagged as encoded
Flag empty, default, or placeholder passwords ("123456", "password", "changeme")
Detect password reuse within the same breach
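
The checks listed above could be sketched roughly as follows. This is an illustration, not the report's actual tooling: the `encoded` flag, the record layout, and the tiny `PLACEHOLDERS` set are assumptions, and a real run would load a wordlist such as rockyou instead.

```python
import base64
from collections import Counter

# Hypothetical placeholder list; a real check would load rockyou or a top-10k list.
PLACEHOLDERS = {"", "123456", "password", "changeme"}


def decode_if_base64(value: str, flagged: bool) -> str:
    """Decode a value only when the record is flagged as base64-encoded."""
    if not flagged:
        return value
    try:
        return base64.b64decode(value, validate=True).decode("utf-8")
    except ValueError:
        # binascii.Error and UnicodeDecodeError are both ValueError subclasses.
        return value  # leave undecodable input untouched


def analyze(records: list[dict]) -> list[dict]:
    """Annotate each record with placeholder and reuse flags (no plaintext kept)."""
    plain = [decode_if_base64(r["password"], r.get("encoded", False)) for r in records]
    counts = Counter(plain)
    return [
        {
            "email": r["email"],
            "placeholder": p.lower() in PLACEHOLDERS,
            "reused": bool(p) and counts[p] > 1,
        }
        for r, p in zip(records, plain)
    ]
```

The output carries only flags per account, so the decoded passwords do not need to leave the analysis step.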

The Ethical Gray Area

Services such as Have I Been Pwned (HIBP) parse breaches ethically. They ingest the dump, extract only email addresses and domain names, and never store plaintext passwords. For password checks they rely on a k-anonymity model: the client sends only a short prefix of a password hash and compares the returned candidates locally, so neither the raw password nor the full hash is exposed.
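
To make the k-anonymity idea concrete, here is a minimal sketch of the client side. It makes no network call; the `SUFFIX:COUNT` response format is modeled on the Pwned Passwords range API, and the function names are hypothetical.

```python
import hashlib


def split_for_range_query(password: str, prefix_len: int = 5) -> tuple[str, str]:
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest.

    Only the prefix would be sent to the remote service.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:prefix_len], digest[prefix_len:]


def is_exposed(suffix: str, range_response: str) -> bool:
    """Compare the locally held suffix against 'SUFFIX:COUNT' lines."""
    for line in range_response.splitlines():
        candidate, _, _count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return True
    return False
```

Because many different hashes share the same five-character prefix, the service cannot tell which one the client was checking.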

If you build a breach parser, architect it to ignore data you don't need. If you only care about domain exposure, drop the plaintext password column immediately.
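
A minimal sketch of that principle, assuming a CSV dump with a header row that contains an `email` column (the column name and the function name are illustrative):

```python
import csv
from typing import Iterable, Iterator


def minimal_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield only the domain of each email; other columns are never copied."""
    for row in csv.DictReader(lines):
        email = (row.get("email") or "").strip().lower()
        if "@" in email:
            # The password column is read by the CSV reader but discarded here,
            # so it never reaches storage or downstream analysis.
            yield {"domain": email.rsplit("@", 1)[1]}
```

Dropping the column at parse time, rather than filtering later, means no intermediate file or table ever holds the plaintext.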

4.3 Geographic Origin of Exposed IPs (top 3)

Russia (34%) – likely crawler
US (22%) – internal corporate
China (18%) – scanning activity

Overview

The Breach Parser is a system that automatically processes raw breach data dumps (TXT, CSV, JSON, SQL, or compressed files), extracts structured fields, validates data types, detects anomalies, and prepares the data for security analysis, credential monitoring, or threat intelligence.
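
As a rough illustration of the extract-and-validate stage, the sketch below handles one common TXT layout, `email:secret` per line. The layout, the loose email pattern, and all function names are assumptions; a full parser would need one reader per input format.

```python
import json
import re
from typing import Iterable, Optional

# Deliberately loose email check; an assumption, not a full RFC validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def parse_line(line: str, line_no: int, breach_id: str) -> Optional[dict]:
    """Parse one 'email:secret' line; return None when it does not fit."""
    email, sep, secret = line.strip().partition(":")
    if not sep or not EMAIL_RE.match(email):
        return None
    return {
        "email": email.lower(),
        "secret_present": bool(secret),
        "breach_id": breach_id,
        "source_line": line_no,
    }


def parse_dump(lines: Iterable[str], breach_id: str):
    """Return (records, rejected_line_numbers) for an iterable of raw lines."""
    records, rejected = [], []
    for line_no, line in enumerate(lines, start=1):
        record = parse_line(line, line_no, breach_id)
        if record is None:
            rejected.append(line_no)  # kept for anomaly review
        else:
            records.append(record)
    return records, rejected


def to_jsonl(records: Iterable[dict]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)
```

Keeping the rejected line numbers, instead of silently skipping them, gives the anomaly-detection step something to work with.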

1. Incident Response (IR)

Security teams use breach parsers to identify the scope of a compromise. If a database dump is found on a compromised server, the parser identifies how many unique accounts were exposed.

Example Output (Parsed):

{"username": "bob", "password": "password123", "email": "bob@mail.com", "ip": "192.168.1.1"}
{"username": "alice", "password": "letmein", "email": "alice@work.com", "ip": null}
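
Scoping an exposure from parsed records like these can be sketched as counting distinct identities. The choice of key (email first, then username, both case-insensitive) is an assumption; some investigations would treat usernames as case-sensitive.

```python
from typing import Iterable


def count_unique_accounts(records: Iterable[dict]) -> int:
    """Count distinct accounts, keyed on email and falling back to username."""
    seen = set()
    for record in records:
        key = (record.get("email") or record.get("username") or "").strip().lower()
        if key:
            seen.add(key)
    return len(seen)
```

Dumps often contain the same account several times, so a raw line count tends to overstate the scope.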

Example Output (JSONL)


{
  "email": "user@example.com",
  "password_hash": "5f4dcc3b5aa765d61d8327deb882cf99",
  "hash_type": "MD5",
  "password_plain": null,
  "weak_hash": true,
  "is_cracked": false,
  "breach_id": "acme_2024",
  "source_line": 4523
}
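
Fields such as `hash_type` and `weak_hash` could be filled in by a heuristic like the one below. It guesses from hex length alone, which is ambiguous (NTLM and MD4 digests are also 32 hex characters), and the set of algorithms treated as weak is an assumption rather than something the report specifies.

```python
import re

# Hex digest lengths of common unsalted hashes; a guess, not a proof.
HEX_LENGTHS = {32: "MD5", 40: "SHA1", 64: "SHA256", 128: "SHA512"}
# Assumed policy: fast, long-broken digests are flagged as weak.
WEAK_TYPES = {"MD5", "SHA1"}

HEX_RE = re.compile(r"[0-9a-fA-F]+")


def classify_hash(value: str) -> dict:
    """Guess the hash type from its shape; unknown shapes yield None fields."""
    if HEX_RE.fullmatch(value) is None:
        return {"hash_type": None, "weak_hash": None}
    hash_type = HEX_LENGTHS.get(len(value))
    if hash_type is None:
        return {"hash_type": None, "weak_hash": None}
    return {"hash_type": hash_type, "weak_hash": hash_type in WEAK_TYPES}
```

Returning `None` for unrecognized shapes keeps the parser from labeling salted or adaptive formats (for example bcrypt strings) as something they are not.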