APPLY TX
Overview
The APPLY TX operation transforms a specific column containing a, XML, JSON, or CSV document and replaces the current pipeline data with the transformed data set. It is equivalent to the APPLY TX option available through the SELECT HTTP and SELECT DRIVE command, but can be applied anywhere in an ETL pipeline regardless of the original source of the data.
XML and JSON
When processing an XML or JSON document, the value specified in the TX command is interpreted as a modified JSON or XML Path, with support for inner-document JOIN operation. See the Document Path documentation for further details.
CSV
When processing a CSV document, the value specified in the TX command is parsed as a space-delimited set of
key-value pairs separated by a column: key:value. Keys are not case-sensitive.
The following keys and values are supported:
| Key | Values | Comments |
| headers | 0 or 1 | Indicates that a header line is present in the payload |
| skip | an integer >= 0 | Skips N number of rows from the top of the content before parsing data |
| delimiter | One or more characters | Multiple characters may be provided when parsing content with more than one delimiter; use " " for space |
| quoted | A single character, or 1 or 0 | A character to use as surrounding quotes; if 1 is provided, a double-quote is assumed; 0 means not quoted |
| trim | 0 or 1 | If 1, the content will be trimmed |
| fields | a list of integers | To parse delimited files, provide a comma-separated list of integers without spaces (ex: 5,15,25 or "5, 15, 25") |
| flatten | 0 or 1 | When 1, flattens the content if line feeds are detected |
| comments | comma-separated list of strings | Used to ignore a line when the characters are found at the start of a line (ex: --,#) |
These additional formatting options are only available during write operations: bytes and date.
For example, this is a valid transformation that converts a flat file into rows and columns:
APPLY TX 'headers:1 delimiter:, quoted:1 trim:1' ON 'payload';
Syntax
Transforms an XML, JSON, CSV document from an existing field into rows and columns based on the path provided for each row found in the data pipeline. The resulting transformation becomes the new pipeline data set.
APPLY TX
'...' -- the JSON/XML path or CSV options
ON '...'
{ WITH
RAWCOL '...'
}
;
PATH |
The Json or XML Path that represents the node(s) to transform; this path supports the Document Path notation specified in the DataZen documentation; for CSV data, the file processing options to use (ex: headers:1 trim:1) |
ON |
The column name or column index to inspect containing the JSON or XML document |
ROWCOL |
When specified, represents the column name to add with the raw content for each row |
Example 1
SELECT * FROM HTTP [Rss Feed] (GET /econi.xml); -- Apply a JSON transformation on the payload column and add the original -- document used to create each row APPLY TX (//item) ON 'payload' WITH RAWCOL '_raw';
Example 2
-- Call an ADP Endpoint returning demographics data SELECT * FROM HTTP [ADP] (GET /worker-demographics?limit=100&select=workers/person/legalName); -- Apply a document transformation on the payload column and add the original -- document used to create each row; -- The transformation joins multiple nodes within the document APPLY TX (workers || person || legalAddress) ON 'payload' WITH RAWCOL '_raw';