Skip to content
/ csv Public

CSV engine with formulas, format rows, and SQL queries — implements the Photon CSV Format

License

Notifications You must be signed in to change notification settings

portel-dev/csv

Repository files navigation

@portel/csv

CSV is the one format every AI already speaks. When an LLM needs structured data, it doesn't reach for Protocol Buffers. It writes a comma-separated table. When a tool returns rows to an agent, it's CSV, or something that wishes it were.

And yet most CSV libraries give you a pile of string arrays and wave goodbye. Here's your data, good luck.

@portel/csv is different. It's a stateful engine that actually understands what's inside the cells. Formulas evaluate. Columns know they hold currency. Queries work without loading pandas. You get a spreadsheet brain in a 30KB package, with zero dependencies and no I/O.

:---,:---#:,:---$:
Product,Quantity,Price
Widget,42,9.99
Gadget,17,24.50

That first row? It's a format row. Any CSV reader that doesn't understand it just sees dashes. But tools that do understand it know that Quantity is a number (right-aligned) and Price is currency. The schema lives inside the file. No sidecar JSON. No separate config. Just CSV that knows what it is.

Built for AI Tool Chains

If you're building agents, MCP servers, or tool-calling pipelines, this is your CSV layer.

The problem: An LLM asks your tool for sales data. You read a CSV, return string arrays, and the model has to guess that column 3 is currency. It formats 1234.5 as plain text. The user sees a wall of numbers with no structure. You end up writing formatting hints into system prompts.

With @portel/csv:

import { CsvEngine } from '@portel/csv';

const engine = CsvEngine.fromCSV(csvText);
const snap = engine.snapshot('Sales data for Q1');

// snap.data        → evaluated values (formulas resolved)
// snap.columnMeta  → [{type:'text'}, {type:'number'}, {type:'currency'}]
// snap.charts      → any visual formulas, pre-resolved
// snap.table       → ASCII markdown table, ready to return

Your tool returns structured data and the metadata to render it. The client knows column 3 is currency and can format 1234.5 as $1,234.50 in whatever locale it wants. The AI doesn't have to guess. Neither does the UI.

Agents that mutate data get synchronous operations. No async, no file I/O, no "await the save." Just change the data and serialize when you're done.

engine.add({ Product: 'Thingamajig', Quantity: '100', Price: '4.99' });
engine.sort('Price', 'desc');
engine.set('D1', '=SUM(C1:C3)');

const csv = engine.toCSV();  // ready to write, with formulas preserved

SQL without the database. Need to answer "which products cost more than $10"? Your agent can query in plain SQL:

const result = engine.sql('SELECT Product, Price FROM data WHERE Price > 10');
// result.result     → [{ Product: 'Gadget', Price: 24.50 }]
// result.columnMeta → knows Price is currency

No database. No connection string. Just a CSV file and a question. (SQL needs alasql as a peer dep. Everything else works with zero dependencies.)

Install

npm install @portel/csv

For SQL queries:

npm install @portel/csv alasql

The Basics

Parse and explore

import { CsvEngine } from '@portel/csv';

const engine = CsvEngine.fromCSV(`
Name,Age,Score
Alice,30,95
Bob,25,87
Charlie,35,72
`);

engine.getHeaders();       // ['Name', 'Age', 'Score']
engine.rowCount;           // 3
engine.evaluate(0, 1);     // '30'
engine.toTable();          // ASCII markdown table
engine.toObjects();        // [{ Name: 'Alice', Age: 30, Score: 95 }, ...]

toObjects() auto-coerces numbers. Age comes back as 30, not "30". If you've set column metadata (or the CSV has a format row), currency columns strip $ and % before converting. The type information travels with the data.

Mutate

Every mutation is synchronous. No promises, no callbacks, no waiting for disk. The engine is pure in-memory state. You persist it when and how you want.

engine.set('D1', '=SUM(B1:B3)');                          // formulas just work
engine.add({ Name: 'Diana', Age: '28', Score: '91' });    // add by column name
engine.push([['Eve', '22', '88'], ['Frank', '40', '76']]); // batch append
engine.update(2, { Score: '90' });                         // update row 2
engine.remove(3);                                          // remove row 3
engine.sort('Score', 'desc');                              // sort descending
engine.rename('A', 'FullName');                            // rename column
engine.fill('C1:C5', '0');                                 // fill a range
engine.clear('B:B');                                       // clear entire column
engine.resize(10, 5);                                      // grow or shrink the grid

Query

Two ways to ask questions. Simple conditions for the common case, SQL for everything else.

// Simple: column, operator, value
const result = engine.query('Age > 25');
result.matchCount;    // 2
result.data;          // [['Alice','30','95'], ['Charlie','35','72']]
result.columnMeta;    // metadata for each column, so you can format the output

// SQL: full power when you need it
const sql = engine.sql('SELECT Name, Score FROM data WHERE Score > 85 ORDER BY Score DESC');
sql.result;           // [{ Name: 'Alice', Score: 95 }, { Name: 'Bob', Score: 87 }]
sql.columnMeta;       // still there, still useful

Both query() and sql() return columnMeta alongside the results. This is intentional. When your agent returns a table to the user, the rendering layer needs to know that Score is a number and Price is currency. The metadata is always available. You never have to ask for it separately.

Serialize

engine.toCSV();                        // CSV text, format row included if present
engine.toCSV({ formatRow: true });     // force the format row in
engine.toCSV({ formatRow: false });    // strip it out
engine.toObjects();                    // typed objects with number coercion
engine.toTable();                      // ASCII markdown table
engine.toTable('A1:B3');               // just a range
engine.snapshot('Quarterly report');    // everything: data, formulas, charts, metadata
engine.schema();                       // column types and fill statistics

snapshot() is the one you want for AI tool responses. It bundles evaluated data, raw formulas, chart descriptors, column metadata, and a message string into a single object. Hand it to your UI layer and it has everything it needs.

Formulas

Cells starting with = evaluate when read. The formula engine handles A1 references, ranges, and these functions:

Formula What it does Example
=SUM(range) Add up numbers =SUM(A1:A10)
=AVG(range) Average (alias: AVERAGE) =AVG(B1:B5)
=MAX(range) Largest value =MAX(C1:C10)
=MIN(range) Smallest value =MIN(C1:C10)
=COUNT(range) How many numbers =COUNT(A:A)
=IF(cond, t, f) Pick one or the other =IF(A1>10,"high","low")
=LEN(value) String length =LEN(A1)
=ABS(number) Absolute value =ABS(A1)
=ROUND(n, digits) Round to N places =ROUND(A1, 2)
=CONCAT(a, b, ...) Stick strings together =CONCAT(A1, " ", B1)

Formulas are stored as-is in the CSV. toCSV() preserves them. evaluate() resolves them. Round-trip safe.

Visual formulas

Some formulas don't produce numbers. They describe charts. The engine resolves the data ranges and gives you a ChartDescriptor. Your UI picks the charting library.

Formula Produces Example
=PIE(labels, values) Pie chart descriptor =PIE(A1:A5, B1:B5)
=BAR(labels, values) Bar chart descriptor =BAR(A1:A5, B1:B5)
=LINE(labels, values) Line chart descriptor =LINE(A1:A10, B1:B10)
=SPARKLINE(range) Sparkline descriptor =SPARKLINE(B1:B10)
=GAUGE(val, min, max) Gauge descriptor =GAUGE(B1, 0, 100)
const snap = engine.snapshot();
snap.charts;
// [{ cell: 'C1', type: 'pie', resolvedLabels: ['Q1','Q2'], resolvedValues: [40,60] }]

The data is resolved. The labels are resolved. The UI just draws.

The Format Row

This is the interesting part. Here's a normal CSV:

Product,Quantity,Price
Widget,42,9.99

And here's the same CSV with a format row:

:---,---#:,:---$:
Product,Quantity,Price
Widget,42,9.99

That first line tells any format-aware reader: column 1 is left-aligned text, column 2 is a right-aligned number, column 3 is right-aligned currency. Open this in Excel and you'll see a harmless row of dashes. Open it in @portel/csv and you get typed column metadata for free.

  :---#w120+*:
  │ │  ││   ││
  │ │  ││   │└─ right/center align marker
  │ │  ││   └── required field
  │ │  ││ └──── text wrapping
  │ │  │└────── width in pixels
  │ │  └─────── type indicator
  │ └────────── dashes (minimum 2)
  └──────────── left/center align marker

The syntax is inspired by Markdown table separators. If you've written |:---|---:| in a Markdown table, you already know how this works.

Alignment Pattern Types Char Modifiers Syntax
Left :--- number # width w120
Right ---: currency $ wrap +
Center :---: percent % required *
date D sort asc > prefix
bool ? sort desc < prefix
markdown M
longtext T

Full specification with ABNF grammar: FORMAT.md

Compatibility

The format was designed around one principle: zero cost to ignore.

Reader What happens
Standard CSV parser Sees a data row with dashes. Harmless.
Excel / Google Sheets Shows dashes as text. Ignore or delete the row.
pandas skiprows=[0] and carry on.
@portel/csv Full metadata extraction. Types, alignment, width, the works.

Metadata, not rendering

Worth repeating: the library stores column metadata but never formats output. type: 'currency' is a hint that says "this column holds money." Your UI decides whether to show $9.99 or €9,99 or 9.99 USD. The engine stays locale-agnostic and opinion-free.

const engine = CsvEngine.fromCSV(csvWithFormatRow);
const meta = engine.getColumnMeta();
// meta[2] = { align: 'right', type: 'currency' }

// Every result carries this metadata
const query = engine.query('Price > 5');
query.columnMeta[2].type;  // 'currency'
// Your formatter does: new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' })

Standalone Utilities

You don't have to use CsvEngine. Every module is exported individually:

import {
  escapeCSV, parseCSVLine,                          // CSV primitives
  numberToColumnName, columnNameToNumber,            // A ↔ 0, Z ↔ 25, AA ↔ 26
  cellToIndex, rangeToIndices,                       // A1 → {row:0, col:0}
  isFormatRow, parseFormatCell, buildFormatCell,     // format row handling
  parseCondition, matchCondition,                    // query conditions
  evaluateFormula, isVisualFormula, parseVisualFormula, // formula engine
} from '@portel/csv';

Building your own CSV viewer? Just pull parseCSVLine and isFormatRow. Writing a data pipeline? Use parseFormatCell to extract types and ignore the rest. The engine is convenient. The parts are flexible.

API Reference

CsvEngine

Construction

new CsvEngine()                                          // 10 empty columns
new CsvEngine({ headers: ['Name', 'Age'] })              // custom headers
new CsvEngine({ headers, columnMeta, defaultCols: 5 })   // full control

CsvEngine.fromCSV(csvText: string): CsvEngine             // parse CSV text

Read

Property / Method Returns Description
rowCount number Data rows
colCount number Columns
getHeaders() string[] Column headers (copy)
getColumnMeta() ColumnMeta[] Column metadata (copy)
evaluate(row, col) string Evaluated cell value
evaluateAll() string[][] Full evaluated grid
getRawCell(row, col) string Raw content (formula string if any)

Mutate

Method Signature Returns
set (cell: string, value: string) void
add (values: Record<string, string>) number (1-indexed row)
remove (row: number) void
update (row: number, values: Record<string, string>) string[] (change descriptions)
push (rows: (string[] | Record<string, string>)[]) number (rows added)
fill (range: string, pattern: string) void
clear (range?: string) void
resize (rows?: number, cols?: number) void
rename (column: string, name: string) string (old name)
sort (column: string, order?: 'asc' | 'desc') void
format (column: string, opts) void

Query

Method Returns Notes
query(where, limit?) QueryResult Conditions: >, <, =, !=, >=, <=, contains
sql(query) SqlResult Full SQL. Requires alasql.

Serialize

Method Returns Description
toCSV(options?) string CSV text. { formatRow: true/false } controls format row.
toObjects() Record[] Type-coerced objects
toTable(range?) string ASCII markdown table
snapshot(msg?) EngineSnapshot Full state for UIs
schema() SchemaColumn[] Column types and stats

Data Loading

Method Description
loadCSV(csvText) Replace state from CSV text
appendCSVLines(lines) Append lines, skipping headers/format rows

Types

Full TypeScript interfaces
interface ColumnMeta {
  align: string;           // 'left' | 'right' | 'center'
  type: string;            // 'text' | 'number' | 'currency' | 'percent' | 'date'
                           // | 'bool' | 'select' | 'formula' | 'markdown' | 'longtext'
  width?: number;          // pixels
  required?: boolean;
  sort?: string;           // 'asc' | 'desc'
  wrap?: boolean;
}

interface EngineSnapshot {
  table: string;                       // ASCII table
  data: string[][];                    // evaluated values (non-empty rows)
  formulas: Record<string, string>;    // cell ref → raw formula
  headers: string[];
  columnMeta: ColumnMeta[];
  charts: ChartDescriptor[];
  message: string;
  rows: number;
  cols: number;
}

interface QueryResult {
  table: string;
  data: string[][];
  headers: string[];
  columnMeta: ColumnMeta[];
  message: string;
  matchCount: number;
}

interface SqlResult {
  result: any;
  columnMeta: ColumnMeta[];
  count: number;
  message: string;
}

interface ChartDescriptor {
  cell: string;
  type: 'pie' | 'bar' | 'line' | 'sparkline' | 'gauge';
  labelRange?: string;
  valueRange?: string;
  resolvedLabels: string[];
  resolvedValues: number[];
  min?: number;
  max?: number;
}

Design Decisions

Pure library. No fs, no fetch, no async. Strings in, strings out. You bring the persistence layer. This means it works in browsers, workers, edge functions, Deno, Bun, wherever JavaScript runs.

Metadata, not opinions. Column types and alignment are stored and returned, never rendered. The engine is intentionally locale-agnostic. A currency column in Tokyo and Toronto should look different, and that's your formatter's job, not ours.

alasql is optional. Most people just need CSV parsing and formulas. SQL is powerful but heavy. It lives behind an optional peer dependency. If you call sql() without installing alasql, you get a clear error message telling you exactly what to do. Not a cryptic module resolution failure.

Format rows round-trip. Load a CSV with a format row, mutate the data, save it back. The format row survives. If you set column formatting programmatically, toCSV() auto-includes a format row even if the original didn't have one.

License

MIT

About

CSV engine with formulas, format rows, and SQL queries — implements the Photon CSV Format

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors