Build a Flexible CSV Parser with Streaming, Validation, and Error Reporting

Create a streaming CSV parser with schema validation, error reporting, and data transformation for large file processing.

๐Ÿ“ The Prompt

Create a production-ready CSV parser module in [PROGRAMMING_LANGUAGE] that can handle large CSV files efficiently with streaming, data validation, and comprehensive error reporting. The parser should be designed for processing [DATA_DESCRIPTION] data (e.g., customer records, product catalogs, financial transactions) and meet these specifications:

1. **Core Parser Configuration**:
   - Accept configuration options: delimiter (default: comma), quote character, escape character, encoding ([ENCODING], e.g., UTF-8), whether the first row is a header, and custom line terminators.
   - Support both file path input and readable stream/buffer input.
   - Use streaming/chunked processing to handle files of [EXPECTED_SIZE] (e.g., 500MB+) without loading the entire file into memory.
2. **Column Mapping & Schema Definition**:
   - Define a schema for the expected CSV structure with these columns: [COLUMN_DEFINITIONS] (e.g., "email: string, required, unique | age: integer, min=0, max=150 | signup_date: date, format=YYYY-MM-DD | status: enum(active, inactive, pending)").
   - Map CSV column headers (or indices) to internal field names, supporting aliases for common header variations (e.g., 'Email Address' → 'email', 'E-mail' → 'email').
3. **Row-Level Validation**:
   - Validate each row against the schema: check required fields, data types, format patterns (regex), value ranges, and enum values.
   - Support custom validation functions for complex business rules (e.g., "if [BUSINESS_RULE]").
   - Collect all validation errors per row (don't stop at the first error).
4. **Error Handling & Reporting**:
   - Track and categorize errors: parsing errors (malformed CSV), validation errors (bad data), and warnings (e.g., trailing whitespace auto-trimmed).
   - Generate a structured error report including: row number, column name, provided value, error type, and human-readable message.
   - Support configurable behavior: skip invalid rows, abort after [MAX_ERRORS] errors, or collect all errors.
5. **Output & Transformation**:
   - Transform valid rows into [OUTPUT_FORMAT] (e.g., array of objects, JSON, database-ready insert statements, or another CSV).
   - Apply optional data transformations: trim whitespace, normalize case, parse dates, and convert number formats.
   - Provide a summary: total rows processed, valid rows, skipped rows, and error count by category.
6. **API Design**: Expose both a simple one-call function `parseCSV(source, schema, options)` and an event-driven/streaming interface with callbacks for `onRow`, `onError`, `onComplete`.

Provide complete code with thorough comments, type definitions (if applicable), and a usage example that parses a sample CSV matching the [DATA_DESCRIPTION] schema. Include unit test cases for: valid data, missing required fields, type mismatches, malformed rows, and large file streaming.
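
The following is not part of the prompt text. It is a minimal Python sketch of the kind of row-level validation and error collection that items 3 and 4 describe, included only to show what a response might roughly resemble. The schema dictionary format, the `RowError` fields, and the function names `validate_row` and `stream_rows` are all assumptions made for illustration; a real response will depend on the language and column definitions you supply.

```python
import csv
import re
from dataclasses import dataclass
from typing import Any, Iterable, Iterator


@dataclass
class RowError:
    row: int      # 1-based data row index (header excluded)
    column: str
    value: Any
    kind: str     # "required", "type", "pattern", "range", or "enum"
    message: str


# Hypothetical schema format, loosely based on the prompt's example columns.
SCHEMA = {
    "email": {"required": True, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
    "age": {"type": int, "min": 0, "max": 150},
    "status": {"enum": {"active", "inactive", "pending"}},
}


def validate_row(index: int, row: dict, schema: dict) -> tuple[dict, list]:
    """Return (cleaned_values, errors), collecting every error in the row."""
    cleaned, errors = {}, []

    def fail(column, value, kind, message):
        errors.append(RowError(index, column, value, kind, message))

    for column, rule in schema.items():
        raw = (row.get(column) or "").strip()
        if not raw:
            if rule.get("required"):
                fail(column, raw, "required", f"{column} is required")
            continue
        value = raw
        if "type" in rule:
            try:
                value = rule["type"](raw)
            except ValueError:
                fail(column, raw, "type", f"{column} has the wrong type")
                continue
        if "pattern" in rule and not re.fullmatch(rule["pattern"], raw):
            fail(column, raw, "pattern", f"{column} has an invalid format")
        if "min" in rule and value < rule["min"]:
            fail(column, raw, "range", f"{column} is below {rule['min']}")
        if "max" in rule and value > rule["max"]:
            fail(column, raw, "range", f"{column} is above {rule['max']}")
        if "enum" in rule and raw not in rule["enum"]:
            fail(column, raw, "enum", f"{column} is not an allowed value")
        cleaned[column] = value
    return cleaned, errors


def stream_rows(source: Iterable[str], schema: dict) -> Iterator[tuple]:
    """Lazily yield (index, cleaned_values, errors) one row at a time."""
    reader = csv.DictReader(source)
    for index, row in enumerate(reader, start=1):
        cleaned, errors = validate_row(index, row, schema)
        yield index, cleaned, errors
```

Because `stream_rows` is a generator over any iterable of lines, it can be fed an open file handle (for example `open(path, newline="", encoding="utf-8")`) and only one row is held in memory at a time. A caller would typically keep rows whose error list is empty and route the rest to an error report.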

💡 Tips for Better Results

Provide your actual column definitions and a few sample rows so the AI generates a parser that's immediately usable with your real data. Mention your expected file sizes: if files are under 10MB, a simpler non-streaming approach may be cleaner and sufficient. Ask for integration with your database ORM as a follow-up so parsed rows can be directly batch-inserted into your database.
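
To make the file-size tip concrete, here is a small Python sketch of the simpler non-streaming approach, run against a few sample rows of the sort you might paste into the prompt. The sample data and the function name `parse_small_csv` are hypothetical, chosen to match the prompt's example column definitions.

```python
import csv
import io

# Hypothetical sample rows; replace with a few lines from your real file.
SAMPLE = """email,age,signup_date,status
ada@example.com,36,2024-01-15,active
 bob@example.com ,41,2023-11-02,pending
"""


def parse_small_csv(text: str) -> list[dict[str, str]]:
    """Read every row into memory at once; reasonable only for small inputs."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        # Trim surrounding whitespace; skip the None key that DictReader
        # uses for surplus fields on over-long rows.
        {key: (value or "").strip() for key, value in row.items() if key is not None}
        for row in reader
    ]


rows = parse_small_csv(SAMPLE)
```

This version returns the whole file as a list, so memory use grows with file size. That trade-off is usually fine for small uploads and is the reason the prompt asks for streaming when files are large.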

🎯 Use Cases

For data engineers and backend developers who need to import, validate, and process CSV files from external sources such as client uploads, third-party exports, or batch data migrations.
