Group and Aggregate¶
The Group and Aggregate task lets you summarize your data by doing one or both of the following:
Grouping rows with the same set of values.
Calculate aggregate functions.
Table of Contents
Quick Start¶
Let us start with the following sample data:
Product |
Price |
Category |
---|---|---|
HP Elite |
900.00 |
Laptop |
Lenovo Thinkpad |
1100.00 |
Laptop |
Sony VAIO |
600.00 |
Laptop |
Microsoft Lumia |
300.00 |
Smartphone |
HTC One |
400.00 |
Smartphone |
iPad |
700.00 |
Tablet |
Kindle Fire |
300.00 |
Tablet |
Let us calculate the average price in each Category. Complete the following steps:
Open Data Preparation menu and click on Reshape, Group & Aggregate.
Select Group & Aggregate.
In the Group the following section, find the column Category and add it by clicking on the + button.
In the And Aggregate By section, find the column Price. Change the dropdown before it to AVG. Click on the + button. Change the format to have two decimal places.
Click APPLY.
The following table shows the final result:
Category | Avg. of Price |
|
---|---|
Laptop |
866.67 |
Smartphone |
350.00 |
Tablet |
500.00 |
We can calculate multiple aggregations. Let us edit the task we just created. Let us add these items to the result from the Aggregate the following section.
COUNT
MAX of Price
MIN of Price
The following table shows the final result:
Category |
Avg. of Price |
Count |
Max of Price |
Min of Price |
---|---|---|---|---|
Laptop |
866.67 |
3 |
1,100 |
600 |
Smartphone |
350.00 |
2 |
400 |
300 |
Tablet |
500.00 |
2 |
700 |
300 |
You can add multiple items from group the following section too. For example, say you add two different columns. Then the data will be grouped by each unique combination values in those two columns from the data.
Supported Options¶
The following inputs are expected by this task:
Group the following: Select any column you would like to group by. If multiple columns are selected, the data will be grouped by each unique combination values in those columns from the data.
Aggregate the following: Select any aggregations you would like to include in your summary. The aggregation will be calculated for each group. In case you have not selected any column to group by, aggregation will be calculated over the entire data.
Note
You can rename or change the format of the selected aggregations.
You drag-and-drop to reorder the results.