Apply Filter

You can use the Apply Filter function when you want to keep or remove certain rows of data from the View based on some condition(s). These conditions are also applicable in other Tasks.

apply filter example

Fig. 108 Filter for Female on Column Gender

Quick Start

Let’s use the same dataset as above to reach the final output using the Apply Filter rule.

  1. Go to Transform > Label, Filter and Replace > Apply Filter.

  2. Select Keep from the Keep/Remove option to include the values which satisfy the condition;

  3. Select Gender from the column drop-down;

  4. Select is from the operator list;

  5. Select Value from the operand list;

  6. Select the operand value Female;

  7. Click Apply.

Fundamentals

In order to use filters effectively, let’s understand the following terminologies:

Condition

A condition is a statement against which every row of data is checked. The rows that satisfy the condition appear in the results.

Value or Column Value

A condition in Mammoth has two types of operands: Value and Common Value.

  1. Value

A Value lets you build a condition attached to any value from the selected column. For example, in the Dataset in Fig. 108, we can filter the Gender data for one of the two values - “Female” or “Male”.

Value example

Fig. 109 Selecting “Female” as value

  1. Column Value

A Column Value lets you build a condition connected to values in another Column. For example, in the Dataset in Fig. 108, a Column Value operand can help compare the exam marks over two semesters for all students.

Column Value example

Fig. 110 Selecting Column Value as Eng(Sem-2) for comparison

If we want to select the best performing students in Sem-1, we can build a condition comparing one (Sem-1) column with another (Sem-2) using the greater-than operator. The result is below:

Table 3 Output on comparison of marks

Student

Gender

English (Sem-1)

English (Sem-2)

Alice

Female

95

90

Chuck

Male

80

60

Building Complex Conditions

Let’s take a look at the Dataset in Fig. 108 . Suppose you want to filter the Gender data to show all “Female” exam marks in English in Sem-1 greater than 70. You’ll need to build two conditions for the Task using the AND or OR operation.To build multiple conditions, click on the ‘+’ icon.

To solve the problem above, create two conditions:

  • Gender is “Female”.

  • Value of English (Sem-1) column is greater than 70.

Multiple conditions

Fig. 111 Building complex conditions

Setting “AND” on both conditions to true will get the desired result.

Or perhaps you want to select the students with average marks of more than 65 but less than 80 in English across both Semesters or who scored above 90 in Sem-1. From the defined selection criteria, this requires two conditions. But the first condition needs to compare the marks for both semesters which means there is a condition within a condition, also called a nested condition. To create a nested condition, click on the ‘→’ icon.

So, to solve the above problem, you can build a nested condition as:

  • English (Sem-1) is greater than 90

    OR

  • Output of (Use ‘’ icon to create nested condition) -

    • English (Sem-1) in between 65-80

      AND

    • English (Sem-2) in between 65-80.

Nested conditions

Fig. 112 Building nested conditions

Use “OR” and “AND” as the operators in between the conditions to get the desired result.

Supported Options

  • Keep/Remove: To filter for or against of the condition statement. The Keep option retains rows satisfying the condition while the Remove option removes the columns satisfying the condition.

  • Select a column: Shows a drop-down menu of all the columns. Select a column from the menu to build your condition on.

  • Select an operator: Contains a list of operations. These operations vary as per the datatype. Here’s the complete list of operations for different datatypes:

    • Operations for text columns:


    Table 4 Operations for text columns

    Operation type

    Filters rows

    is

    matching the value

    is NOT

    not matching the value

    contains

    containing the value

    does NOT contain

    not containing the value

    starts with

    starting with the value

    does NOT start with

    not starting with the value

    ends with

    ending with the value

    does NOT end with

    not ending with the value

    is empty

    with no value in the selected column

    is NOT empty

    with value in the selected column

    • Operations for numeric columns:


    Table 5 Operations for numeric columns

    Operation type

    Filters rows

    is

    matching the value

    is NOT

    not matching the value

    is less than

    containing value less than the specified value

    is less than or equal to

    containing value less than or equal to the specified value

    is greater than

    with a value greater than the specified value

    is greater than or equal to

    with a value greater than equal to the specified value

    is the maximum value

    containing the highest value for the selected column

    is NOT the maximum value

    containing values other than the highest value

    is the minimum value

    containing the lowest value for the selected column

    is NOT the minimum value

    containing values other than the lowest value

    is empty

    with no value in the selected column

    is NOT empty

    with value in the selected column

    in between

    with values in the specified range

    • Operations for date columns:


    Table 6 Operations for date columns

    Operation type

    Filters rows

    is

    matching the value

    is NOT

    not matching the value

    contains

    containing the value

    does NOT contain

    not containing the value

    starts with

    starting with the value

    does NOT start with

    not starting with the value

    ends with

    ending with the value

    does NOT end with

    not ending with the value

    is empty

    with no value in the selected column

    is NOT empty

    with value in the selected column

  • Select an operand: Offers two options: Value and Common Value.

    • Operand for date type:


    Table 7 Operands for date columns

    Operand type

    Filters rows

    Date

    matching the date

    Date-Time in seconds

    matching date and time to the second

    Date/Minute

    matching date and time to the minute

    Date/Hour

    matching date and time to the hour

    Year/Month

    matching the year and the month

    Year

    matching the year

    Day of Month

    matching the day of the month

    Month

    matching the month

    Weekday

    matching the weekday

    Earliest single value

    with the earliest date and the time

    Latest single value

    with the latest date and the time

    Earliest Date-Time

    with earliest date-time increased to a certain date-time

    Latest Date-Time

    with latest date-time decreased to a certain date-time

    Current Date

    matching the current date or the current date increased/decreased by specific days, weeks, months, or years

  • Add condition: You can build complex conditions using this. The Add condition option is present as the + sign next to the condition.

  • AND/OR: Operators for complex conditions. The And operator checks for rows statisfying all the conditions. The OR operator checks for rows satifying at least any one of the statements.

Note

  1. Filtering does not delete data. It only filters the data for the next step of the pipeline. Filters can be modified or deleted;

  2. You can make the text values case sensitive by clicking on the case-sensitive checkbox present below the last condition ;

  3. Filter conditions are suggested based on current data in the View. It does not account for any future data the View may get. This could affect the filter results. For example, if you need to remove all those rows with a Column Value of -(hyphen) or _ (underscore) or if your current data in the View only has Values with - , the suggestions would reflect this. You need to explicitly add _ to account for any future data with this Value.

  4. Apply Filter can also be applied via explore cards which filter out the Values but don’t add the rule to the Data Pipeline. From the explore card you can add it to the Pipeline and it will open the Apply Filter task pre-filled with the filter conditions matching the explore card selection.

  5. You can place a condition outside the parentheses by clicking on ‘’ icon.