Analytical calculations can be resource intensive, especially during the training or fitting of a model. As DCP made the decision that data remains in its source systems with an online connection, handling these workloads is of major significance. In order to reduce the load on the source systems, DCP implements multiple layers of caching:
MVDA caches the real-time evaluation of batch evolution models, as batch-level models do not support real-time updates (the model vectors are only defined after the end of a batch).
Sources

In the MVDA module the following sources can lead to a sign-up which should be cached:
These sources are collected and registered for EventSubscription; see the implementation in BatchEventController for the exact conditions.
The MVDA cache worker service combines multiple WorkerServices using the ServiceCollectionHostedService method. The following services are registered:
The cache handler listens for BatchStart and BatchEnd events from the message broker. The high-level concept can be seen in the illustration below:
```mermaid
stateDiagram-v2
    state if_state1 <<choice>>
    state if_state2 <<choice>>
    [*] --> SignupSource: BatchStart
    SignupSource --> if_state1
    if_state1 --> ConditionCheck: Sources found
    if_state1 --> [*]: No sources found
    ConditionCheck --> if_state2
    if_state2 --> InactiveSignUp: Conditions do not match
    if_state2 --> ActiveSignUp: Conditions match
    state ActiveSignUp {
        Update --> Update: every 20s
    }
    ActiveSignUp --> ReleaseSignUp: BatchEnd
    ReleaseSignUp --> [*]
    InactiveSignUp --> [*]
```
With the start of an event the service performs a condition check. The conditions to satisfy are defined by the model developer and loaded from the model entity; the current context is taken from the message. A missing context attribute is treated as a non-match. When the conditions are not fulfilled, the state is saved in the database without any further action. If the condition check passes, the service populates the initial cache (covering the period from the batch start time to the current time) and then saves the state in the database.
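The BatchStart handling described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual .NET implementation; the names `check_conditions`, `handle_batch_start`, and the simplified `SignUp` record are assumptions:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class SignUp:
    """Simplified sign-up record (illustrative, not the real schema)."""
    model_id: int
    event_frame_id: str
    batch_start: datetime
    active: bool = False

def check_conditions(conditions: dict, context: dict) -> bool:
    """A missing context attribute is treated as a non-match."""
    return all(
        key in context and context[key] == expected
        for key, expected in conditions.items()
    )

def handle_batch_start(conditions, context, model_id, event_frame_id,
                       batch_start, now):
    signup = SignUp(model_id, event_frame_id, batch_start)
    if check_conditions(conditions, context):
        signup.active = True
        # Populate the initial cache: batch start time up to the current time.
        backfill_window = (batch_start, now)
        return signup, backfill_window
    # Conditions not fulfilled: state is persisted, no further action.
    return signup, None
```

In both branches the sign-up state is returned for persistence; only the active branch triggers a backfill of the cache.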
There are scenarios in which a batch end event is missed (e.g. a service restart) or the event is erroneously never closed on the data source layer (e.g. a wrong analysis configuration or a connection timeout). If active sign-ups existed for such events, they would never be released and would remain cached forever. To avoid this and the related performance degradation, the batch duration worker is implemented. Its responsibility is to detect abnormally long-running events and release them in a defined way.
The high-level concept is illustrated in the figure below:
```mermaid
stateDiagram-v2
    state if_state <<choice>>
    [*] --> ActiveSignUp
    ActiveSignUp --> if_state
    if_state --> Release: if DateTime.Now - Event.Starttime > Model.AvgRuntime * 3
    if_state --> KeepCaching: if DateTime.Now - Event.Starttime <= Model.AvgRuntime * 3
    KeepCaching --> ActiveSignUp
```
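The release check can be sketched in a few lines, assuming the average runtime is tracked per model. This is an illustrative Python sketch; the function name and the threshold factor of 3 (taken from the diagram) are the only inputs, everything else is an assumption:

```python
from datetime import datetime, timedelta

def should_release(event_start: datetime, avg_runtime: timedelta,
                   now: datetime, factor: int = 3) -> bool:
    """Release a sign-up once its event has been running longer than
    `factor` times the model's average runtime."""
    return now - event_start > factor * avg_runtime
```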
The validation status cache is a simple background task that runs every hour and updates the document status and the validation status of the related models. The document status needs to be cached because the EDMS status API only accepts one documentId per request; without caching, this would result in a large number of parallel requests whenever the models are presented in a list view. The cache workflow is illustrated in the figure below:
```mermaid
sequenceDiagram
    ValidationStatusCache->>WorkerService: Fetch oldest 5% of the records
    WorkerService->>EDMS: Request document status
    EDMS->>WorkerService: Return document status
    WorkerService->>ValidationStatusCache: Update document status in cache
```
As the oldest 5% of the records are updated on each iteration and the iteration is performed on an hourly basis, the maximum cache age in the system is 20 hours (the full set cycles in 100% / 5% = 20 hourly iterations). As document status changes are not volatile, this is acceptable.
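Selecting the hourly refresh batch can be sketched as follows. This is an illustrative Python sketch; the record shape and function name are assumptions:

```python
def oldest_fraction(records, fraction=0.05):
    """Return the oldest `fraction` of records by their last-update time.
    At least one record is returned so small tables still get refreshed."""
    count = max(1, int(len(records) * fraction))
    return sorted(records, key=lambda r: r["updated_at"])[:count]
```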
This worker performs the actual updates to the CacheData table. The cache uses the following database tables:
```mermaid
erDiagram
    SignUp {
        int ID ""
        int ModelId "The linked model on which the calculation is based"
        tinyint CalculationType "The calculation type - describing the type of the model output"
        nvarchar BatchStartTime ""
        nvarchar BatchId "Human-readable identifier for the batch - unique constraint not guaranteed"
        nvarchar EventFrameId "Unique identifier of the batch on the data source layer"
        tinyint EventType "Indicates the source of the sign-up (dashboard, notification, etc.)"
        nvarchar DeviceWebId "The equipment identifier on the data source"
        int ModelVersion "The model version on which the calculations are based"
        nvarchar YAxisLabel "The human-readable axis label describing the output"
        tinyint BatchMachingConditions "Is the batch passing the conditions as defined by the model"
        bit IsDurationExeeded "Is the sign-up present longer than the expected duration"
    }
    CacheData {
        bigint ID ""
        float Maturity "The (model-dependent) maturity value of the data point"
        float Value "The value of the calculation point at the maturity timepoint"
        datetime2 ObsID "Unique identifier of the observation/data point on the data source layer"
        float DistanceMetric "The calculated distance metric for the data point - used for limit assessment"
        bit BadValue "Is the data point marked as bad (data source or numerical problems)"
        int SignUpID ""
        tinyint SpecialValueType "Encodes special values not supported by the database, e.g. Inf, NaN"
    }
    SignUpTags {
        int ID ""
        int SignUpID ""
        nvarchar TagName "The name/identifier of the tag - describing the model output"
        nvarchar TagValue "The value of the tag - describing the model output"
        tinyint TagType "The type of the tag (calculation or limit related) - describing the model output"
    }
    Model ||--|| SignUp : "has"
    SignUp ||--|{ CacheData : "has"
    SignUp ||--|{ SignUpTags : "identified by"
```
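The SpecialValueType column suggests a simple encoding of IEEE-754 values that the database cannot store. A minimal sketch of such an encoding follows; the enum values and function names are assumptions, not the actual mapping used in the system:

```python
import math

# Hypothetical tinyint encoding; the real values are not documented here.
NONE, POS_INF, NEG_INF, NAN = 0, 1, 2, 3

def encode_value(x: float):
    """Split a float into a storable value plus a special-value marker."""
    if math.isnan(x):
        return 0.0, NAN
    if math.isinf(x):
        return 0.0, POS_INF if x > 0 else NEG_INF
    return x, NONE

def decode_value(value: float, special: int) -> float:
    """Reconstruct the original float from value and marker."""
    if special == NAN:
        return math.nan
    if special == POS_INF:
        return math.inf
    if special == NEG_INF:
        return -math.inf
    return value
```

With this pattern the Value column always holds a finite number, and the marker column restores Inf/NaN on read.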
| Specification | Value |
|---|---|
| Content/Overview | Currently active batches and their values |
| Data classification | Cache only |
| Change Tracking | No |
| Audit Trail | No |
| Retention period | N/A |
In order to minimize the number of requests, all sign-ups are grouped by model and then a bulk-request update is performed (restrictions imposed by the calculation node do not allow any further grouping).
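The grouping step can be sketched as follows. This is an illustrative Python sketch; the sign-up record shape is an assumption, and the downstream bulk request per model is only hinted at:

```python
from collections import defaultdict

def group_by_model(signups):
    """Group active sign-ups by model id so that a single bulk request
    per model can be sent to the calculation node."""
    groups = defaultdict(list)
    for signup in signups:
        groups[signup["model_id"]].append(signup)
    return dict(groups)
```

Each resulting group then maps to one bulk update request instead of one request per sign-up.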