DataZen Documentation
DataZen User Guide

Call an HTTP Function with SQL CDC

Overview

This document provides an overview on how to call an HTTP Endpoint as part of a SQL CDC Pipeline, such as a custom HTTP endpoint, a webhook, or cloud functions such as an Azure Function or an AWS Lambda endpoint. As long as the authentication and the HTTP Method are supported (GET, POST, PUT, DELETE, PATCH), you can call virtually any HTTP endpoint. The following types of calls are supported:

  • Call an HTTP Endpoint with or without sending a payload
  • Call an HTTP Endpoint per row of data in the pipeline
  • Call an HTTP Endpoint per batch of N records in the pipeline
  • Treat the response, if any, as additional columns to the current pipeline or replace the pipeline data completely
  • When calling per row, perform a left join or inner join of the response data

Call an HTTP Endpoint

You can call an HTTP Endpoint three different ways:

SELECT

The SELECT operation is typically used when starting a read operation and using an HTTP response as the initial data set of the pipeline. You can use the APPLY TX option to transform the HTTP response into rows and columns.

For example, to get weather alerts from the US National Weather Service called NWSWeather, you can execute this operation. This will return the HTTP Response payload, execution time, HTTP Headers are additional information. By inspecting the content of the payload returned, you will see that the features node contains weather data.

SELECT * FROM HTTP [NWSWeather] (GET /alerts/active);

This SELECT operation returns the data; however, the APPLY TX option transforms the features node into rows and columns automatically.

SELECT * FROM HTTP [NWSWeather] (GET /alerts/active) APPLY TX 'features';

APPLY HTTP

The APPLY HTTP operation can be used to send data using a fire-and-forget method or to obtain a response that should either enhance the current data set or replace it. The PROCESSING option indicates the mode being used. The following PROCESSING options can be specified:

  • continue: used to fire-and-forget a payload or trigger an HTTP action; the response payload is ignored
  • enhance: used to send payload and add columns with the data returned by the call assuming the same number of rows are returned (unless the SHIFT option is also used)
  • left join: used to send one row of data at a time and perform a left join operation using the response payload (when combined with APPLY TX, may return more rows than available initially in the pipeline)
  • inner join: used to send one row of data at a time and perform an inner join operation using the response payload (when combined with APPLY TX, may return fewer rows than available initially in the pipeline)

See HTTP Paging Options for more information on how to return multiple pages of data.

For example, this operation will create a CSV-formatted column called csv 10 rows at a time, then call the custom HTTP endpoint for every batch in the pipeline. After the execution of the APPLY HTTP operation, the resulting data set will contain the HTTP Response and HTTP Headers for each row processed because APPLY TX was not specified.

SELECT * FROM HTTP [NWSWeather] (GET /alerts/active) APPLY TX 'features';

ADD COLUMN 'csv' FORMAT CSV;
ZIP COLUMN 'csv' FORMAT CSV BATCH 10;

APPLY HTTP [httpEndpoint] (POST /sendData)
WITH PROCESSING 'left join'
    PAYLOAD ({{csv}});

Adding the APPLY TX option transforms the data back into rows and columns based on the data contained in the HTTP response payload. Using INNER JOIN would remove records for which the APPLY TX operation returns 0 records.

SELECT * FROM HTTP [NWSWeather] (GET /alerts/active) APPLY TX 'features';

ADD COLUMN 'csv' FORMAT CSV;
ZIP COLUMN 'csv' FORMAT CSV BATCH 10;

APPLY HTTP [httpEndpoint] (POST /sendData)
WITH PROCESSING 'left join'
    CONTENT_TYPE 'text/plain'
    PAYLOAD ({{csv}})
    APPLY TX '$.resp'
;

PUSH

The PUSH operation is used in a write pipeline and offers the ability to send N records at a time using the BATCH option. Unlike the APPLY HTTP operation, the PUSH operation does not process the HTTP response payload except when the TRACE INTO option is also provided. Regardless, the PUSH operation represents the last step in a Write pipeline, so no further action can be taken outside of TRIGGER calls. For example, this PUSH operation sends data to a custom HTTP endpoint by sending the content of the data field (as a JSON document), for every row in the data pipeline (if 50 rows are in the data pipeline, 50 calls will be made).

PUSH INTO HTTP [customEndpoint] 
ON_UPSERT (POST /sendData)
    CONTENT_TYPE 'application/json'
    PAYLOAD ({{data}})
WITH BATCH 1;

When BATCH is greater than 1, considering using one of the document-formatting functions to build a valid JSON, XML, or SQL payload. In this example, we are using the @concatjson() operation to build a JSON document and adding surrounding brackets to form an array of objects. When using this approach, it is your responsibility to ensure the JSON document created is valid. See Document Formatting for more information.

PUSH INTO HTTP [customEndpoint] 
ON_UPSERT (POST /sendData)
    CONTENT_TYPE 'application/json'
    PAYLOAD (
        [ @concatjson("id": {{id}}, "name": "{{name}}") ]
    )
WITH BATCH 50;