What is Tabular-JSON?

Tabular-JSON is a data format. It is a superset of JSON that adds CSV-like tables.

Real-world JSON data often consists of an array of nested objects, such as a list of products, a list of messages, or a list of clients. This is verbose to write in JSON because all field names are repeated for every item in the array. This common data structure can be written much more compactly in a tabular way, like CSV. Adding support for tables in a superset of JSON gives the best of both worlds.

Tabular-JSON aims to be just as simple as JSON and CSV. It combines the best of JSON and CSV, but without their drawbacks. It is human-readable, compact, and supports rich data structures and streaming. The aim of Tabular-JSON is to be a data format, not a configuration format.

Read “Tabular-JSON: Combining the best of JSON and CSV” to learn more about the background of Tabular-JSON.

Playground

Play around with Tabular-JSON in the interactive playground:

Try Tabular-JSON Online

Example

Here is an example of Tabular-JSON data:

{
  "name": "rob",
  "hobbies": [
    "swimming",
    "biking",
  ],
  "friends": ---
    "id", "name",  "address"."city", "address"."street"
    2,    "joe",   "New York",       "1st Ave"
    3,    "sarah", "Washington",     "18th Street NW"
  ---,
  "address": {
    "city": "New York",
    "street": "1st Ave",
  }
}

And here is a table at root level, with streamable rows:

"id", "name",  "address"."city", "address"."street"
2,    "joe",   "New York",       "1st Ave"
3,    "sarah", "Washington",     "18th Street NW"
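The dotted header fields in the table above suggest how each row maps back to a nested object. The following Python sketch illustrates that mapping under the assumption that a header such as "address"."city" denotes the key path address, then city. The helpers `set_path` and `rows_to_objects` are hypothetical names introduced here, not part of any Tabular-JSON library, and the input is the already-parsed form of the table (parsing itself is not shown).

```python
def set_path(target, path, value):
    """Store value in nested dicts along path, creating dicts as needed."""
    for key in path[:-1]:
        target = target.setdefault(key, {})
    target[path[-1]] = value


def rows_to_objects(header, rows):
    """Turn a list of key paths plus rows of values into nested objects."""
    objects = []
    for row in rows:
        obj = {}
        for path, value in zip(header, row):
            set_path(obj, path, value)
        objects.append(obj)
    return objects


# Already-parsed form of the root table above (assumed representation).
header = [["id"], ["name"], ["address", "city"], ["address", "street"]]
rows = [
    [2, "joe", "New York", "1st Ave"],
    [3, "sarah", "Washington", "18th Street NW"],
]

friends = rows_to_objects(header, rows)
print(friends[0])
# {'id': 2, 'name': 'joe', 'address': {'city': 'New York', 'street': '1st Ave'}}
```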

Ingredients

So what are the ingredients of Tabular-JSON?

And that’s it!

The grammar of Tabular-JSON can be found on the Specification page, alongside the grammar of JSON for comparison.

Features

File extension

Tabular-JSON files have a *.tjson extension, for example data.tjson.

Data types

Tabular-JSON supports the following data types:

Data type  | Example                      | Detection
object     | { "name": "Joe", "age": 24 } | Starts with {
array      | [7.4, 5.2, 8.1]              | Starts with [
table      | ---                          | Starts with ---
           | "id","name"                  |
           | 1018,"Joe"                   |
           | 1078,"Sarah"                 |
           | ---                          |
root table | "id","name"                  | Starts with a string followed by a comma and another string, or a string followed by a newline and another value
           | 1018,"Joe"                   |
           | 1078,"Sarah"                 |
boolean    | true                         | Equals true or false
null       | null                         | Equals null
number     | -2.3e5                       | Starts with a digit or a minus
string     | "hello world"                | Starts with "
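The detection rules in the table can be sketched as a small Python function. This is an illustration of the rules as listed, not the reference implementation: the name `detect_type` is hypothetical, the function only inspects the start of the text, and it does not validate the rest of the value. Note that the check for `---` has to come before the number check, because a number may also start with a minus.

```python
import re

# A double-quoted string with backslash escapes, as in JSON.
STRING = re.compile(r'"(?:[^"\\]|\\.)*"')


def detect_type(text):
    """Guess the data type of a value from how it starts."""
    s = text.strip()
    if s.startswith("{"):
        return "object"
    if s.startswith("["):
        return "array"
    if s.startswith("---"):
        return "table"
    if s in ("true", "false"):
        return "boolean"
    if s == "null":
        return "null"
    if re.match(r"[0-9-]", s):
        return "number"
    match = STRING.match(s)
    if match:
        rest = s[match.end():]
        # String, comma, another string: a header with several fields.
        if re.match(r'[ \t]*,\s*"', rest):
            return "root table"
        # String, newline, another value: a header with a single field.
        if re.match(r"[ \t]*\r?\n\s*\S", rest):
            return "root table"
        return "string"
    raise ValueError("cannot detect data type")


print(detect_type('"id","name"\n1018,"Joe"\n1078,"Sarah"'))  # root table
print(detect_type('"hello world"'))                           # string
```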

Differences between JSON and Tabular-JSON

Differences between CSV and Tabular-JSON

Feature | CSV | Tabular-JSON
Table header | Optional | Required
Nested header fields | Not officially supported | Multiple names separated by a dot
Double quotes | Precede with an extra double quote "" | Escape with a backslash \"
Delimiter | Comma (officially). In practice, it can sometimes be a semicolon or tab | Comma
White space around values | Part of the value, not allowed when the value is enclosed in double quotes | Not part of the value
Data types | No data types | object, array, table, string, number, boolean, null
Text value | Optionally enclosed in double quotes | Must be escaped and enclosed in double quotes
Control character | No need to escape | Escape with a backslash
Unicode characters | Not officially supported (only ASCII characters are supported officially) | Supported
Escaped unicode characters like \u263A | Not supported | Supported
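The difference in how double quotes are escaped can be seen with Python's standard `csv` and `json` modules. This sketch assumes that a Tabular-JSON string is written the same way as a JSON string, which follows from Tabular-JSON being a superset of JSON; no Tabular-JSON library is used here.

```python
import csv
import io
import json

value = 'She said "hi"'

# CSV: the embedded double quote is doubled.
buffer = io.StringIO()
csv.writer(buffer).writerow([value])
csv_cell = buffer.getvalue().strip()

# JSON-style string: the embedded double quote gets a backslash.
json_cell = json.dumps(value)

print(csv_cell)   # "She said ""hi"""
print(json_cell)  # "She said \"hi\""

# The CSV-escaped cell is not a valid JSON string.
try:
    json.loads(csv_cell)
    csv_cell_is_json = True
except json.JSONDecodeError:
    csv_cell_is_json = False

print(csv_cell_is_json)  # False
```

The two encodings of the same value are not interchangeable, which is why data that merely looks like the other format should still go through the parser for its own format.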

Differences between NDJSON and Tabular-JSON

Tabular-JSON can be used to stream data, but in general it isn’t a suitable replacement for NDJSON. NDJSON is often used to write log files. It is popular because it allows writing structured data line by line, without having to maintain a fixed data structure or write a header. Each line is a standalone JSON document, which makes NDJSON very robust and flexible. Writing Tabular-JSON in a streaming way, by contrast, requires a fixed data structure and a header line at the top of the file. In return, Tabular-JSON results in smaller files, since the field names are not repeated on every line. So if data size is important and you have a fixed data structure, Tabular-JSON can be a good alternative to NDJSON.
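The trade-off can be sketched in Python with two generators: one that emits NDJSON lines and one that emits a header line followed by rows. The names `stream_ndjson` and `stream_rows` and the `events` sample are made up for this illustration. The row writer only handles flat records and merely approximates a root table; it is not a conforming Tabular-JSON serializer.

```python
import json


def stream_ndjson(records):
    """Yield one standalone JSON document per record."""
    for record in records:
        yield json.dumps(record, separators=(",", ":"))


def stream_rows(records):
    """Yield a header line once, then one row per flat record."""
    fields = None
    for record in records:
        if fields is None:
            fields = list(record.keys())
            yield ",".join(json.dumps(field) for field in fields)
        if list(record.keys()) != fields:
            # The fixed structure is the price of the compact layout.
            raise ValueError("record does not match the header")
        yield ",".join(json.dumps(record[field]) for field in fields)


# Hypothetical log events with a fixed structure.
events = [
    {"time": 1, "level": "info", "message": "started"},
    {"time": 2, "level": "warn", "message": "disk almost full"},
]

ndjson_lines = list(stream_ndjson(events))
table_lines = list(stream_rows(events))

print("\n".join(table_lines))
print(sum(map(len, ndjson_lines)), sum(map(len, table_lines)))
```

A record with a different set of keys passes through `stream_ndjson` unchanged but makes `stream_rows` raise, which mirrors the robustness argument above.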

Best practices

  1. You can safely use a Tabular-JSON parser to read both JSON and Tabular-JSON data. When writing data, however, the output will always be Tabular-JSON unless the parser supports disabling tables. You can typically use that option to migrate smoothly from JSON to Tabular-JSON.
  2. Always use a CSV parser to parse CSV data. Do not use a Tabular-JSON parser to parse CSV data, even when the data looks like valid Tabular-JSON or the other way around. There are tricky edge cases around escaping (see the differences section).

Status

There is a JavaScript/TypeScript library available, but the data format still has to be implemented in a couple more languages (like Python). To facilitate this, a JSON-based test suite has to be implemented. Furthermore, there is a need for plugins for IDEs like VS Code and IntelliJ to get syntax highlighting.
