Branch out to Dataset
A View can be saved as a dataset using the Branch out to Dataset task. Optionally, you can append the View's data to an existing dataset that was created using the same mechanism. This is useful for combining data from different Views.
Quickstart
Let us say you have two views with the following sample data:
Dataset 1 → View 1:
Student | Science | Language |
---|---|---|
Alice | 140 | 138 |
Bob | 135 | 145 |
Dataset 2 → View 1:
Student | Science | Language |
---|---|---|
Frank | 122 | 102 |
Judy | 185 | 135 |
Let us try to combine these two views into one target dataset. Complete the following steps:
- Open Dataset 1 → View 1.
- Go to the Combine menu.
- Select the Branch out to Dataset option.
- Enter a name for the new dataset. In this example, use Final Scores.
- Click APPLY.
The new dataset is created in the same folder. The data in the new dataset looks exactly like Dataset 1 → View 1.
Now open Dataset 2 → View 1, and complete the following steps:
- Go to the Combine menu.
- Select Branch out to Dataset.
- Select the Combine with an existing Dataset option at the top of the panel.
- Find the dataset Final Scores.
- Verify the column mappings.
- Click APPLY.
The data in Final Scores appears as shown below:
Student | Science | Language |
---|---|---|
Alice | 140 | 138 |
Bob | 135 | 145 |
Frank | 122 | 102 |
Judy | 185 | 135 |
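
The append behaviour in this walkthrough can be mimicked outside Mammoth with a few lines of Python. The sketch below is only an illustration, not how Mammoth implements the task: datasets are modelled as lists of row dictionaries, and `branch_out` is a hypothetical helper rather than a Mammoth API.

```python
# Illustrative only: datasets are modelled as lists of row dictionaries.
# branch_out is a hypothetical helper, not part of any Mammoth API.

def branch_out(view_rows, target=None):
    """Return a new dataset built from view_rows, or append them to target."""
    if target is None:
        # "Create a new Dataset": the result is a copy of the View's rows.
        return [dict(row) for row in view_rows]
    # "Combine with an existing Dataset": new rows follow the existing ones.
    return target + [dict(row) for row in view_rows]


view_1 = [
    {"Student": "Alice", "Science": 140, "Language": 138},
    {"Student": "Bob", "Science": 135, "Language": 145},
]
view_2 = [
    {"Student": "Frank", "Science": 122, "Language": 102},
    {"Student": "Judy", "Science": 185, "Language": 135},
]

final_scores = branch_out(view_1)                # first branch out creates the dataset
final_scores = branch_out(view_2, final_scores)  # second branch out appends to it

for row in final_scores:
    print(row["Student"], row["Science"], row["Language"])
```

Running the sketch prints the same four rows, in the same order, as the Final Scores table above.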
Supported Options
The following options are supported in this task:
- Create a new Dataset - Creates a new dataset in the same folder as the parent dataset.
  - Dataset Name - The name of the dataset that this operation creates. If a dataset with the same name already exists, the system appends an incrementing numerical suffix to prevent duplicate names.
- Combine with an existing Dataset - Appends the data to an existing dataset. Only Mammoth-generated datasets can be used as targets.
  - Select a Dataset - The target dataset that receives the data. The list shows all datasets in the system that support appending of data.
  - Match Columns - Maps the columns of the current View to the columns of the target dataset. The system suggests a mapping automatically, but you can change it. You can only match columns that have the same type, and the matching is one to one.
- Conditions - Filters the data before it is branched out as a new dataset or appended to an existing one.
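
The two Match Columns rules (same type, one to one) can be expressed as a small check. The following is a minimal sketch under stated assumptions: `validate_mapping` is a hypothetical helper, and the type labels `TEXT` and `NUMERIC` are placeholders rather than Mammoth's actual type names.

```python
# Illustrative only: validate_mapping is a hypothetical helper, and the
# type labels "TEXT" / "NUMERIC" are placeholders, not Mammoth type names.

def validate_mapping(source_types, target_types, mapping):
    """Check a source-column -> target-column mapping.

    Returns a list of problems: unknown columns, target columns used more
    than once (not one to one), and pairs whose types differ.
    """
    problems = []
    seen_targets = set()
    for src, dst in mapping.items():
        if src not in source_types:
            problems.append(f"unknown source column: {src}")
            continue
        if dst not in target_types:
            problems.append(f"unknown target column: {dst}")
            continue
        if dst in seen_targets:
            problems.append(f"target column mapped more than once: {dst}")
        seen_targets.add(dst)
        if source_types[src] != target_types[dst]:
            problems.append(
                f"type mismatch: {src} ({source_types[src]}) -> "
                f"{dst} ({target_types[dst]})"
            )
    return problems


view_columns = {"Student": "TEXT", "Science": "NUMERIC", "Language": "NUMERIC"}
target_columns = {"Student": "TEXT", "Science": "NUMERIC", "Language": "NUMERIC"}

good = {"Student": "Student", "Science": "Science", "Language": "Language"}
bad = {"Student": "Science", "Science": "Science"}

print(validate_mapping(view_columns, target_columns, good))
print(validate_mapping(view_columns, target_columns, bad))
```

The first mapping passes with no problems. The second is rejected twice over: `Student` is text while `Science` is numeric, and `Science` in the target is used by two source columns.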
Managing Task
Besides these options, there are options to manage the Task in the Pipeline. These include:
- Edit - This option allows you to edit the Task.
- Delete - Delete the Branch out to Dataset task with this option.
- Suspend/Restore - You can suspend or restore the Task.
- Copy - Copies the Task so that it can be pasted into the pipeline of another View.
- Run now - Re-runs the Task to sync all pending data updates. When data updates are pending, a yellow refresh icon appears on the Task. Clicking the icon syncs all updates.
On updates
Apart from these options, you can decide what happens to the branched out dataset on data updates. There are two modes you can choose from:
- Replace mode: Only the batch linked to this task in the target dataset is replaced.
- Combine mode: The data is added to the target dataset as a new batch. This can produce duplicate rows in the target dataset if the new batch overlaps with data that was already appended.
Notes
- The system ignores hidden columns while branching out a View.
- Currently only datasets created from Branch out to Dataset and Crosstab support appending of data from Views.
- The resultant dataset gets modified only when there are changes in the original dataset.
- The resultant dataset does not reflect View-level changes such as column renames, column reordering, and format changes for numeric or date columns.
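
The difference between Replace mode and Combine mode can be sketched by modelling the target dataset as a list of batches, each tagged with the task that wrote it. This is an assumption made for illustration only: `apply_update`, `task_id`, and the extra student `Mallory` are hypothetical and do not come from Mammoth.

```python
# Illustrative only: a target dataset is modelled as a list of batches, each
# tagged with the id of the task that produced it. apply_update and task_id
# are hypothetical names, not Mammoth APIs.

def apply_update(target_batches, task_id, new_rows, mode):
    new_batch = {"task_id": task_id, "rows": list(new_rows)}
    if mode == "replace":
        # Drop only the batch previously written by this task; keep the rest.
        kept = [b for b in target_batches if b["task_id"] != task_id]
        return kept + [new_batch]
    if mode == "combine":
        # Keep everything and add the data as one more batch.
        return target_batches + [new_batch]
    raise ValueError(f"unknown mode: {mode}")


def all_rows(batches):
    """Flatten the batches into the rows a user would see in the dataset."""
    return [row for batch in batches for row in batch["rows"]]


target = [
    {"task_id": "task_a", "rows": ["Alice", "Bob"]},
    {"task_id": "task_b", "rows": ["Frank", "Judy"]},
]
# The View behind task_b gains one row (hypothetical data) and is re-synced.
updated_view = ["Frank", "Judy", "Mallory"]

replaced = apply_update(target, "task_b", updated_view, "replace")
combined = apply_update(target, "task_b", updated_view, "combine")

print(all_rows(replaced))  # ['Alice', 'Bob', 'Frank', 'Judy', 'Mallory']
print(all_rows(combined))  # ['Alice', 'Bob', 'Frank', 'Judy', 'Frank', 'Judy', 'Mallory']
```

In the sketch, Replace mode leaves the batch from `task_a` untouched, while Combine mode repeats `Frank` and `Judy`, which is the kind of duplication the Combine mode description warns about.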