# Export Formats

> Supported export formats: CSV, JSON, and Parquet. Includes compression, splitting, and size comparison.

## Format Overview

|                     | CSV                          | JSON                  | Parquet                    |
| ------------------- | ---------------------------- | --------------------- | -------------------------- |
| **Extension**       | `.csv.gz`                    | `.json.gz`            | `.parquet`                 |
| **Compression**     | Gzip                         | Gzip                  | Snappy (built-in)          |
| **Split Threshold** | 50,000 records/file          | 50,000 records/file   | No splitting needed        |
| **Best For**        | Spreadsheets, quick analysis | App integration, APIs | Data warehouses, analytics |
| **Relative Size**   | Baseline (1×)                | \~1.1× CSV            | \~0.3× CSV                 |

## CSV

Comma-separated values compatible with Excel, Google Sheets, and most data tools.

* **Extension:** `.csv.gz`
* **Compression:** Gzip (automatic)
* **Splitting:** Automatically split at **50,000 records per file**
* **Best for:** Spreadsheets, quick analysis, data sharing

<Info>
  CSV files are always gzipped. If the export exceeds 50,000 records, multiple
  `.csv.gz` files are produced and bundled into a single `.zip` download.
</Info>
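
Because every CSV file arrives gzipped, it has to be decompressed before a spreadsheet or parser can use it. The sketch below streams records straight from a `.csv.gz` file using only the Python standard library. The function name `read_csv_gz` is illustrative, and the code assumes UTF-8 text with a header row, neither of which is confirmed on this page.

```python
import csv
import gzip
from typing import Iterator


def read_csv_gz(path: str) -> Iterator[dict[str, str]]:
    """Yield each record of a gzipped CSV export as a dict keyed by column name.

    Assumes UTF-8 encoding and a header row (assumptions, not documented facts).
    """
    # "rt" opens the gzip stream in text mode; newline="" lets the csv module
    # handle line breaks that appear inside quoted fields.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)
```

Records are yielded one at a time, so a full 50,000-record file never has to sit in memory at once.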

## JSON

Standard JSON array format for programmatic consumption.

* **Extension:** `.json.gz`
* **Compression:** Gzip (automatic)
* **Splitting:** Automatically split at **50,000 records per file**
* **Best for:** Application integration, API-style workflows, programmatic parsing
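
As a minimal sketch of programmatic consumption, the following loads one `.json.gz` file with the Python standard library and checks that the top level is an array, as described above. The function name `read_json_gz` is illustrative, and UTF-8 encoding is an assumption.

```python
import gzip
import json
from typing import Any


def read_json_gz(path: str) -> list[Any]:
    """Load one gzipped JSON export file and return its top-level array.

    Assumes UTF-8 encoding (an assumption, not a documented fact).
    """
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        records = json.load(handle)
    # Guard against a file whose top level is not the expected array.
    if not isinstance(records, list):
        raise ValueError(f"expected a JSON array in {path}, got {type(records).__name__}")
    return records
```

This reads the whole file into memory, which is bounded by the 50,000-record split limit per file.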

## Parquet

Columnar binary format optimized for analytics workloads.

* **Extension:** `.parquet`
* **Compression:** Built-in (Snappy)
* **Splitting:** Not required — Parquet handles large datasets efficiently in a single file
* **Best for:** Data warehouses, Spark/DuckDB/Pandas pipelines, BigQuery, Snowflake

<Info>
  Parquet files are significantly smaller than CSV/JSON for the same data
  (typically **3–5× smaller**) and support predicate pushdown for efficient
  querying.
</Info>
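
To illustrate what column-level reads and predicate pushdown look like in practice, here is a hedged sketch using `pyarrow.parquet.read_table`, which accepts `columns` and `filters` arguments. The helper name `read_parquet_subset` is illustrative, and any column names you pass depend on the columns selected for your export.

```python
from typing import Any, Optional, Sequence


def read_parquet_subset(
    path: str,
    columns: Optional[Sequence[str]] = None,
    filters: Optional[list[tuple[str, str, Any]]] = None,
):
    """Read only the requested columns and matching rows from a Parquet export.

    Returns a pyarrow.Table; call .to_pandas() on it for a DataFrame.
    """
    # Imported here so this module still loads where pyarrow is not installed.
    import pyarrow.parquet as pq

    # `columns` limits which columns are decoded; `filters` lets pyarrow skip
    # row groups whose statistics cannot satisfy the predicate.
    selected = list(columns) if columns else None
    return pq.read_table(path, columns=selected, filters=filters)
```

For example, `read_parquet_subset("export.parquet", columns=["id"], filters=[("country", "==", "US")])` would read a single column for matching rows, where the file name and the `id` and `country` columns are hypothetical.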

## File Splitting & Naming

CSV and JSON exports are automatically split when an export contains more than **50,000 records**, so that each file holds at most 50,000 records. Each split file follows this naming convention, shown here for CSV:

```
export_{export_id}_part001.csv.gz
export_{export_id}_part002.csv.gz
export_{export_id}_part003.csv.gz
```
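
The split rule and naming pattern can be combined to predict which files an export should contain, which is useful for verifying a download. This is a sketch under stated assumptions: the helper names are hypothetical, every file except the last is assumed to be filled to 50,000 records, and JSON parts are assumed to follow the same pattern with a `.json.gz` extension.

```python
import math

RECORDS_PER_FILE = 50_000  # split threshold stated on this page


def part_count(record_count: int) -> int:
    """Number of files needed when each file holds at most 50,000 records."""
    return max(1, math.ceil(record_count / RECORDS_PER_FILE))


def split_part_names(export_id: str, record_count: int, extension: str = "csv.gz") -> list[str]:
    """Expected names of the split files for an export above the threshold.

    Returns an empty list when the export fits in one file, because the name
    of an unsplit CSV/JSON file is not specified on this page.
    """
    parts = part_count(record_count)
    if parts == 1:
        return []
    # Part numbers start at 1 and are zero-padded to three digits.
    return [f"export_{export_id}_part{i:03d}.{extension}" for i in range(1, parts + 1)]


print(part_count(100_000))  # 2
print(split_part_names("abc", 120_000))
# ['export_abc_part001.csv.gz', 'export_abc_part002.csv.gz', 'export_abc_part003.csv.gz']
```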

When an export produces multiple files, they are bundled into a single `.zip` archive for download:

```
export_{export_id}.zip
├── export_{export_id}_part001.csv.gz
├── export_{export_id}_part002.csv.gz
└── export_{export_id}_part003.csv.gz
```
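
A bundle like this can be read without unpacking it to disk. The sketch below walks the `.csv.gz` parts in order using the Python standard library. The function name `iter_bundle_records` is illustrative, and the code assumes UTF-8 text and that each part carries its own header row, which this page does not confirm.

```python
import csv
import gzip
import zipfile
from typing import Iterator


def iter_bundle_records(zip_path: str) -> Iterator[dict[str, str]]:
    """Yield records from every .csv.gz part inside a multi-file export bundle.

    Assumes each part starts with its own header row (an assumption).
    """
    with zipfile.ZipFile(zip_path) as bundle:
        # Zero-padded part numbers (part001, part002, ...) sort correctly as strings.
        part_names = sorted(n for n in bundle.namelist() if n.endswith(".csv.gz"))
        for name in part_names:
            # Decompress each gzip member directly from the zip stream.
            with bundle.open(name) as compressed:
                with gzip.open(compressed, mode="rt", encoding="utf-8", newline="") as text:
                    yield from csv.DictReader(text)
```

If the parts turn out to share a single header in the first file only, the header from part 001 would need to be passed to `csv.DictReader` as `fieldnames` for the later parts.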

Parquet exports produce a single file regardless of size:

```
export_{export_id}.parquet
```

## Size Comparison

Approximate file sizes for a **100,000-record** export with all columns selected:

| Format               | Compressed Size | Notes                             |
| -------------------- | --------------- | --------------------------------- |
| CSV (`.csv.gz`)      | \~120 MB        | 2 files × \~60 MB each            |
| JSON (`.json.gz`)    | \~135 MB        | 2 files × \~67 MB each            |
| Parquet (`.parquet`) | \~35 MB         | Single file, columnar compression |
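
As a quick arithmetic check, the approximate sizes in this table reproduce the relative sizes quoted in the Format Overview. The figures are copied from the table and are estimates, not measurements of any particular export.

```python
# Approximate compressed sizes in MB, copied from the table above.
sizes_mb = {"csv": 120, "json": 135, "parquet": 35}

# Size of each format relative to CSV, rounded to one decimal place.
ratios = {fmt: round(size / sizes_mb["csv"], 1) for fmt, size in sizes_mb.items()}
print(ratios)  # {'csv': 1.0, 'json': 1.1, 'parquet': 0.3}

# How many times smaller Parquet is than CSV for this example.
csv_to_parquet = round(sizes_mb["csv"] / sizes_mb["parquet"], 1)
print(csv_to_parquet)  # 3.4
```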

<Tip>
  Choose **Parquet** when working with analytics tools — it's the most compact
  format and supports column-level reads, so tools like DuckDB or Pandas only
  decompress the columns you query.
</Tip>