DataZen Documentation
DataZen User Guide

ADD COLUMN

Overview

Adds a column to the current pipeline data and sets its data type and default value as specified. If the column name already exists in the pipeline data set, this component will throw an error unless the IGNORE_IF_EXISTS flag is set.

This component supports a similar syntax to APPLY SCHEMA for a simpler specification.

FORMAT

When FORMAT is used, the behavior of this operation changes to build a JSON, XML, or CSV document fragment using the columns available or specified. This operation is typically used before calling the ZIP command to create the final document.

The following formatting options are available:

KeyValuesComments
headers0 or 1Indicates that a header line is present in the payload
skipan integer >= 0Skips N number of rows from the top of the content before parsing data
delimiterOne or more charactersMultiple characters may be provided when parsing content with more than one delimiter; use " " for space
quotedA single character, or 1 or 0A character to use as surrounding quotes; if 1 is provided, a double-quote is assumed; 0 means not quoted
trim0 or 1If 1, the content will be trimmed
fieldsa list of integersTo parse delimited files, provide a comma-separated list of integers without spaces (ex: 5,15,25 or "5, 15, 25")
flatten0 or 1When 1, flattens the content if line feeds are detected
commentscomma-separated list of stringsUsed to ignore a line when the characters are found at the start of a line (ex: --,#)
dateDate/time formatThe date/time format to use when a datetime field is detected (ex: "MM/dd/yy H:mm:ss zzz" or "s")
bytesascii, utf8, utf32, unicode, base64, hexConverts a byte[] field into the format specified

Syntax

Adds or alters a column to the current data set in a pipeline and sets a default value either as a calculation or the outcome of a DataZen function operation; can also be used to format a row into a JSON, XML or CSV output.

ADD { OR ALTER } COLUMN 
	({type} fieldName   { = {'} [ default | #func(...) | @func(...) | {{field}} ] {'} })
{
	FORMAT [ JSON | XML | CSV ] 
		{ [ INCLUDE | EXCLUDE ] '...' }
		{ COLLAPSE }
		{ WHEN '...' }
		{ OPTIONS '...' }
}
{ WITH 
	{ IGNORE_IF_EXISTS }
	{ CONTINUE_ON_ERROR }
}
;

ALTER

When the ADD OR ALTER command is used, the operation attempts to modify both the data type and/or the values of a column based on the specification provided

COLUMN

The column to add or alter using the APPLY SCHEMA column specification; if the column name already exists, this component throws an error unless the IgnoreIfExists option is set or ALTER is specified

FORMAT

When specified (JSON,XML,CSV), builds the corresponding formatted output

INCLUDE

Limits the FORMAT operation to the list of fields provided, or exclued them if NOT is specified

EXCLUDE

Use all available fields except those specified for the FORMAT operation (ignored if INCLUDE is specified)

WHEN

Limits the rows on which the FORMAT operation to the list of fields provided

COLLAPSE

Remove the fields used by the formatting operation

OPTIONS

Provides additional options for the FORMAT operation

CONTINUE_ON_ERRORS

Continues processing even if errors are detected

IGNORE_IF_EXISTS

Bypasses this component if the column name already exists (ignored if ALTER is used)

Example 1

-- This HTTP request uses the Rss Feed connection and calls the /econi.xml relative URL
-- The APPLY operation transforms the HTTP response body using an XPath command that 
-- turns the payload into rows and columns automatically
-- The component is added "inline" of the HTTP operation

SELECT * FROM HTTP [Rss Feed] (GET /econi.xml) APPLY TX (//item)

	-- This pipeline executes as part of the HTTP request
	-- If the HTTP request was made over multiple pages, this component would 
	-- execute for each page of data; since we are not using paging, 
	-- this will excute only once.

	APPLY PIPELINE (
		
		-- Now, add a column with the current time un UTC and 
		-- ignore this operation if the columns already exists
		-- The data type will be a datetime 

		ADD COLUMN 'datetime dateFound=#utcnow(s)' IGNORE_IF_EXISTS;

		-- Adding another columns, as a string; because we are formatting with 
		-- inner quotes, we can surround the expression with parenthesis 

		ADD COLUMN (dateFoundStr='#utcnow(s)') IGNORE_IF_EXISTS;

	)
;

Example 2

-- This HTTP request uses the Rss Feed connection and calls the /econi.xml relative URL
-- The APPLY operation transforms the HTTP response body using an XPath command that 
-- turns the payload into rows and columns automatically
-- The component is executed after the HTTP execution is completed

SELECT * FROM HTTP [Rss Feed] (GET /econi.xml) APPLY TX (//item);

-- Now, add a column with the current time un UTC and 
-- ignore this operation if the columns already exists
-- The data type will be a string

ADD COLUMN 'dateFound=#utcnow()' IGNORE_IF_EXISTS;


Example 3

-- Retrieves a list of tables from a SQL Server database
SELECT * FROM DB [sqldb] (SELECT * FROM sys.tables);

-- Add a JSON column for each row using a subset of the fields
-- returned by the previous command
ADD COLUMN 'jsonData' FORMAT JSON INCLUDE 'object_id,name,schema_id,[type]';

-- Add another column with a CSV output using a specific delimiter and date format
-- and remove the fields used by the format
ADD COLUMN 'csvData' FORMAT CSV 
	INCLUDE 'object_id,name,schema_id,[type],create_date' 
	OPTIONS 'delimiter:| date:s'
	COLLAPSE;

Example 4

-- Retrieves the raw payload of an HTTP endpoint that returns
-- a base64 encoded string

SELECT * FROM HTTP [ReportAPI] (GET /v2/getreport?id=1000);

-- Convert the payload returned by this API from a base64 to an ASCII string
-- using the #frombase64a() function
ADD OR ALTER COLUMN 'string payload=#frombase64a({{payload}})';

Example 5

-- Retrieves a list of tables from a SQL Server database
SELECT * FROM DB [sqldb] (SELECT * FROM sys.tables);

-- Add an XML column for each row using all available fields
-- returned by the previous command and wrap each XML fragment 
-- with an element called Item
ADD COLUMN 'xmlOutput' FORMAT XML OPTIONS 'element:Item';