Calculation Engine Integration

Introduction

Analytical workloads are integrated into DCP using the ACDC component. For technical implementation details on how ACDC exposes R functions as callable endpoints, refer to the ACDC design specification. Communication with ACDC instances uses SSL and basic authentication. To secure the application and to avoid CORS issues, the frontend has no direct access to the calculation layer (ACDC nodes). All communication passes through the backend.

Calculation engines can be managed per module and per site in site administration. This makes it possible to distribute load and to keep data processing as close as possible to the data storage location.

Requests from the backend are built on the fly. Request payloads use different encodings to address the limitations of a specific scenario (maximum URL length, passing sessionIDs). Data exchanged between an ACDC instance and the backend is encoded as JSON.
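As an illustration of a request that is built on the fly, the following minimal Python sketch assembles (but does not send) an HTTPS request with a basic authentication header and a JSON body. The host name, route, payload fields, and the function name build_acdc_request are assumptions made for this example and are not taken from the DCP code base. The sketch also covers only the JSON-body case, not the other encodings mentioned above.

```python
import base64
import json
import urllib.request


def build_acdc_request(base_url, route, payload, user, password):
    """Build (but do not send) an HTTPS POST request with Basic auth and a JSON body."""
    # Refuse plain HTTP, since ACDC instances are only addressed over SSL.
    if not base_url.startswith("https://"):
        raise ValueError("ACDC instances must be addressed over https")
    # Basic authentication: base64("user:password") in the Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + route,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical host, route, and payload, used only for illustration.
request = build_acdc_request(
    "https://acdc.example.invalid",
    "/hypothetical/route",
    {"reportId": 42},
    "svc-user",
    "secret",
)
```

The request object can be inspected before it is sent, which keeps the construction logic testable without a running ACDC instance.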

ACDC client

Endpoint configuration

Endpoint routes used for accessing external systems (e.g. ACDC or a different microservice) are defined as string constants. A dedicated static class is defined for each accessed service.
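The convention of one class of route constants per service can be sketched as follows. This is a Python approximation: Python has no static classes, so a class used purely as a namespace stands in for one. All class names and route strings are hypothetical and do not reflect the actual DCP routes.

```python
class AcdcRoutes:
    """Route constants for the ACDC service (hypothetical values)."""

    HEALTH = "/hypothetical/health"
    RUN_MODEL = "/hypothetical/model/run"


class ReportServiceRoutes:
    """Route constants for a different microservice (hypothetical values)."""

    RENDER = "/hypothetical/report/render"


def resolve(base_url: str, route: str) -> str:
    """Join a configured base URL with a route constant."""
    return base_url.rstrip("/") + route


url = resolve("https://acdc.example.invalid/", AcdcRoutes.RUN_MODEL)
```

Keeping the routes in one place per service means a changed route is updated once rather than at every call site.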

Data Access

DCP implements two ways of getting the raw data into the calculation node. Which method is used in a specific scenario depends on the expected message size. Small messages (e.g. report metadata) are sent in a push manner: all raw data needed for the calculation is already present in the request. Where the message size would grow quickly (e.g. a modelling dataset), the raw data is transferred in a pull/trigger-only fashion: the request from the backend contains only the information needed to fetch the raw data, and the calculation node retrieves the actual values itself. The logic for interacting with the data source is abstracted in the acdcDataFactory package.
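The choice between the two transfer methods can be sketched as a simple size check. The threshold value, the field names, and the function names below are assumptions for illustration only. The source does not state how the expected message size is estimated or where the limit lies.

```python
from enum import Enum


class TransferMode(Enum):
    PUSH = "push"  # raw data travels inside the request
    PULL = "pull"  # request only describes where to fetch the raw data


# Hypothetical limit; the real threshold is not documented here.
PUSH_LIMIT_BYTES = 64 * 1024


def choose_transfer_mode(expected_size_bytes, limit=PUSH_LIMIT_BYTES):
    """Pick push for small messages and pull for large ones."""
    if expected_size_bytes <= limit:
        return TransferMode.PUSH
    return TransferMode.PULL


def build_payload(raw_data, data_reference, expected_size_bytes):
    """Embed the data (push) or only a reference to it (pull)."""
    mode = choose_transfer_mode(expected_size_bytes)
    if mode is TransferMode.PUSH:
        return {"mode": mode.value, "data": raw_data}
    return {"mode": mode.value, "dataReference": data_reference}


small = build_payload({"title": "Report A"}, None, expected_size_bytes=200)
large = build_payload(None, {"dataset": "modelling-set-1"}, expected_size_bytes=50_000_000)
```

In the pull case the payload stays small regardless of the dataset size, because only the reference is serialized.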

To implement the pull pattern, the credentials for accessing the third-party datasource need to be shared in a safe manner. This is done by putting only the secrets path on the request. All credentials are centrally managed inside Vault. Access to configuration and/or secrets is abstracted in the acdcClient package, which uses certificate-based authentication to connect to Vault.
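The idea of sending only a secrets path can be sketched as follows. The in-memory store is a stand-in for the central secret store and is not a real Vault client. The path, the field names, and all function names are hypothetical and do not describe the acdcClient API.

```python
import json


def build_pull_request(query, secrets_path):
    """Backend side: reference the credentials by path, never by value."""
    return {"query": query, "secretsPath": secrets_path}


class InMemorySecretStore:
    """Stand-in for the central secret store (illustration only)."""

    def __init__(self, secrets):
        self._secrets = dict(secrets)

    def read(self, path):
        return self._secrets[path]


def resolve_credentials(request, store):
    """Calculation-node side: look up the credentials behind the path."""
    return store.read(request["secretsPath"])


# Hypothetical secret path and credentials.
store = InMemorySecretStore(
    {"secret/hypothetical/datasource": {"user": "reader", "password": "s3cr3t"}}
)
pull_request = build_pull_request("SELECT 1", "secret/hypothetical/datasource")
wire_format = json.dumps(pull_request)
credentials = resolve_credentials(pull_request, store)
```

Because the serialized request carries only the path, intercepting or logging it does not expose the datasource credentials.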

Passing data specific errors

Sometimes special errors must be passed from the calculation nodes to the UI. This is especially important for datasource-related errors. In general, only custom error messages are forwarded. The developer on the R side has full control and can use the following pattern:

  • Raise an exception using the stop() directive.
    • Standard errors are sanitized.
    • If one of a set of special prefixes is added to the error message, the error is converted into the corresponding type on the calling side and is therefore forwarded to the user. Currently supported prefixes are:
      • Datasource: indicates datasource-specific errors
      • MathError: indicates numerical/mathematical problems, instability, etc.
sequenceDiagram
    autonumber
    box CalculationNode ACDC & RCode
    participant ACDC
    participant RCode
    end
    box BE
    participant HTTPClient
    participant DCPExceptionFilter
    participant BusinessLogic
    end
    BusinessLogic ->>HTTPClient: Build Request
    HTTPClient->>ACDC: Request:
    ACDC->>RCode: Invoke R function call
    RCode->>ACDC: Raise error using stop() directive
    ACDC->>HTTPClient: Convert to 404 status code
    HTTPClient->>DCPExceptionFilter: Check for specific keys
    DCPExceptionFilter --> BusinessLogic: specific error handling
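
The prefix check on the calling side can be sketched as below. This is a minimal Python illustration, not the DCPExceptionFilter implementation. It assumes the prefix sits at the start of the message and is separated from the detail text by a colon, which the source does not specify. The exception class names and the generic fallback message are also hypothetical.

```python
class CalculationError(Exception):
    """Generic, sanitized calculation failure."""


class DatasourceError(CalculationError):
    """Datasource-specific failure that may be shown to the user."""


class MathError(CalculationError):
    """Numerical or mathematical failure that may be shown to the user."""


# The two prefixes named in the text, mapped to hypothetical error types.
PREFIX_TO_ERROR = {"Datasource": DatasourceError, "MathError": MathError}

GENERIC_MESSAGE = "The calculation failed."


def convert_error(message):
    """Map a raw error message from the calculation node to a typed error."""
    # Assumed format: "<Prefix>: <detail>".
    prefix, separator, detail = message.partition(":")
    error_type = PREFIX_TO_ERROR.get(prefix.strip())
    if separator and error_type is not None:
        # Known prefix: forward the custom detail text.
        return error_type(detail.strip())
    # Anything else is sanitized into a generic message.
    return CalculationError(GENERIC_MESSAGE)


forwarded = convert_error("Datasource: connection refused")
sanitized = convert_error("object 'x' not found")
```

Only messages with a recognized prefix keep their detail text; all other messages are replaced, so internal R error details do not reach the UI.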
This page was last edited on 03 May 2024, 07:57 (UTC).