
Managing dataflows

A data flow refers to the movement of data within a system. In Mammoth, you build a data flow whenever you create a View over a dataset, use a Join rule to fetch data from another View or dataset, branch out a View as a dataset, or export data from Mammoth into another database. For the purposes of this document, each of these Datasets or Views is called a node. Data moves forward from any node that changes, whether the change is in the data itself or in a rule in a View.

Data flow control lets you hold the forward propagation of changes at any node. This means you can work on one area of a data flow while pausing updates to other connected areas.

For a better understanding, look at the following data flow. The green points show the controls that can hold data at the entry or exit points of any node. These control points are called Data sync.

How the data flow control works in Mammoth
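The hold-and-propagate behaviour described above can be pictured with a small model. This is only an illustrative sketch, not Mammoth's implementation or API: the `Node` class, its fields, and the `propagate` function are hypothetical names invented for this example, and the sketch assumes a single sync control at each node's entry point.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a Dataset or View in a data flow."""
    name: str
    sync_enabled: bool = True   # models the Data sync control at the node's entry
    version: int = 0            # version of the data the node currently holds
    pending: bool = False       # True while an upstream change is being held
    downstream: list[Node] = field(default_factory=list)


def propagate(node: Node, version: int) -> None:
    """Push a change forward from `node`, stopping at nodes whose sync is off."""
    node.version = version
    node.pending = False
    for child in node.downstream:
        if child.sync_enabled:
            propagate(child, version)
        else:
            # The change is held here: the child keeps its old data
            # and nothing further downstream is updated.
            child.pending = True


# A dataset feeding two Views; view_b has sync off and feeds an export.
dataset = Node("sales_dataset")
view_a = Node("view_a")
view_b = Node("view_b", sync_enabled=False)
export = Node("export_of_view_b")
dataset.downstream = [view_a, view_b]
view_b.downstream = [export]

propagate(dataset, version=1)
print(view_a.version, view_b.version, view_b.pending, export.version)
# prints: 1 0 True 0
```

In this model, turning sync off on `view_b` leaves it, and everything that depends on it, on the old data until the held change is applied, while `view_a` updates normally.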

Data Sync

Data Sync for a Dataset

You can control the dataflow in multiple Views of a Dataset simultaneously. You can also allow or disallow syncing of a View from the source by changing the Data sync settings in the Data Library. This is how:

When Sync is turned off, the data stays in a pending state at the source, and Mammoth shows a warning like this:

Pending data updates warning for a View

When you update the View with the changes, this warning goes away.

In such cases, these pending data updates also appear in the Dataflow status.

Data Sync for a View

You can control the dataflow in individual Views as well. The Data sync settings appear at the bottom of the data pipeline.

Here's how you can change Data sync settings in a View:

Alternatively, you can disable dataflow to a View with the Data sync toggle in the navbar menu. This stops incoming data from all nodes to the View.

When a View is out of sync with new data or pipeline updates, the system shows a warning like the following:

Warning showing inconsistent data in a View

When you update the View with the changes, this warning goes away.

Data Sync for Tasks

The Data sync feature is also available as a toggle for individual tasks in the pipeline, such as Crosstab, Join, Lookup, Branchout, and Exports to databases. Enable or disable the toggle to allow or disallow data flow from the respective Views.

Dataflow Status

The Dataflow status is a global monitor that tracks pending updates across your workspaces. It provides a summary of:

  • pending data updates,
  • pending pipeline changes,
  • pipelines in error,
  • active pipelines,
  • queued pipelines, and so on.

Data flow status

It helps you stay on top of Views and pipelines that require your attention. You can also use this modal to manage all pending updates from a single window, like this:

note

Updating data in the Dataflow status modal is a manual action, and it does not alter Data sync settings elsewhere in your workspace.
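To make the note concrete, here is a minimal sketch of a status summary and a manual update that leaves sync settings alone. It is a conceptual model only: `ViewState`, `summarize`, `apply_pending_update`, and the status strings are hypothetical names chosen for this example and do not correspond to any Mammoth API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ViewState:
    """Hypothetical record of one View as a status monitor might track it."""
    name: str
    sync_enabled: bool
    status: str  # e.g. "pending_data_update", "pending_pipeline_change", "active"


def summarize(views):
    """Count Views per status, similar in spirit to the summary list above."""
    return Counter(v.status for v in views)


def apply_pending_update(view):
    """Manually apply a held update; deliberately leaves sync_enabled untouched."""
    if view.status in ("pending_data_update", "pending_pipeline_change"):
        view.status = "up_to_date"


views = [
    ViewState("orders_view", sync_enabled=False, status="pending_data_update"),
    ViewState("customers_view", sync_enabled=True, status="active"),
    ViewState("returns_view", sync_enabled=False, status="pending_pipeline_change"),
]

before = summarize(views)
apply_pending_update(views[0])   # the manual action from the status window
after = summarize(views)

print(before["pending_data_update"], after["pending_data_update"])  # prints: 1 0
print(views[0].sync_enabled)                                        # prints: False
```

After the manual update, `orders_view` is current but its sync is still off, so in this model the next upstream change would be held again.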