Remove Duplicates
This task removes duplicate rows from your data.
![]()
Fig. 105 Remove Duplicate rows
Two or more rows are duplicates of each other if they have the same values in all corresponding cells.
Mammoth lets you exclude columns from the duplicate comparison (see Fig. 106). To ignore a column listed in the left-side selection box, click the ‘+’ button. The ignored columns are then shown in the box on the right side. To compare entire rows when checking for duplicates, make sure the right-side box is empty.
![]()
Fig. 106 Remove Duplicates task window.
Because Remove Duplicates does not compare cells in ignored columns, duplicated rows may hold different values in those columns. For each ignored column, the task picks a value at random from the duplicated rows (see Fig. 107).
![]()
Fig. 107 Values for ignored columns in the resulting row are picked at random from the duplicated rows.
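The comparison described above can be sketched in plain Python. This is only an illustration of the idea, not Mammoth's implementation: the function `remove_duplicates`, its `ignored_columns` parameter, and the sample data are all hypothetical. The sketch also keeps the first row it sees for each group, whereas Mammoth picks the values for ignored columns at random, so do not rely on which value survives.

```python
def remove_duplicates(rows, ignored_columns=()):
    """Keep one row per distinct combination of the compared columns.

    Hypothetical helper for illustration only. It keeps the first row seen
    in each group of duplicates; Mammoth picks ignored-column values at random.
    """
    ignored = set(ignored_columns)
    seen = set()
    result = []
    for row in rows:
        # Build the comparison key from every column that is not ignored.
        key = tuple((col, row[col]) for col in sorted(row) if col not in ignored)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Made-up sample data: the first two rows differ only in visit_id.
rows = [
    {"name": "Ann", "city": "Oslo", "visit_id": 1},
    {"name": "Ann", "city": "Oslo", "visit_id": 2},
    {"name": "Bob", "city": "Rome", "visit_id": 3},
]

# Comparing entire rows: nothing is a duplicate, so all rows are kept.
full_compare = remove_duplicates(rows)

# Ignoring visit_id: the two "Ann" rows collapse into one.
ignoring_visit_id = remove_duplicates(rows, ignored_columns=["visit_id"])

print(len(full_compare), len(ignoring_visit_id))
```

With `visit_id` ignored, the two "Ann" rows count as duplicates, and the surviving row carries only one of the two `visit_id` values.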
Note
If Remove Duplicates does not seem to be working, check whether numbers or dates are formatted to look the same but are stored as different values internally. Check whether any formats are set on the columns: the data shown on screen may differ from what is stored internally.
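As a generic illustration of the note above (plain Python, not Mammoth-specific, with made-up values), two numbers or two timestamps can display identically under a format while being stored as different values:

```python
from datetime import datetime

# Two numbers that display the same at two decimal places but differ internally.
a = 0.1 + 0.2
b = 0.3
numbers_look_equal = f"{a:.2f}" == f"{b:.2f}"
numbers_are_equal = a == b

# Two timestamps that display the same under a date-only format
# but hold different times of day.
d1 = datetime(2024, 1, 5, 9, 0)
d2 = datetime(2024, 1, 5, 17, 30)
dates_look_equal = d1.strftime("%Y-%m-%d") == d2.strftime("%Y-%m-%d")
dates_are_equal = d1 == d2

print(numbers_look_equal, numbers_are_equal)
print(dates_look_equal, dates_are_equal)
```

In both cases the displayed text matches while the stored values do not, so a comparison on the stored values would not treat the rows as duplicates.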