How do I remove duplicate rows from my downloaded data?

Learn why downloaded data may seem to have duplicate values and how to remove these duplicates.

There are a few instances where the data downloaded from the accounting software returns some data as multiple rows / lines which may look to have identical / duplicate values - such as the Contacts or Invoices data tables.

If you look closely at these ‘identical’ rows, you will notice that they are not identical and there are data differences within one or more columns of those rows. For example, a contact may have multiple telephone numbers or multiple emails - hence showing on different rows / lines. The Invoices may contain multiple payments and these are also represented as 2 separate rows / lines.

Unique Records only

When creating or refreshing a data table, duplicate values may be avoided by selecting the Unique records only in the Data cleaning section

image
When this option is used, DataDear will ensure that the returned list will not contain any duplicates and only the first line of the rows will be returned. This is particularly useful in data tables where the columns cannot be replicated - for example when a formula using =SUM() of the totals column is used.

Removing Duplicate Values

Duplicate values may sometimes give incorrect results when generating some reports and hence need to be removed before processing the data. DataDear already provides a solution in the parameters window however Excel can also help you achieve this task as per below instructions.

In this example, we will take the Invoices data table which contains a number of duplicate rows due to multiple payments for each Invoice (when the Unique records only is not ticked in the parameters window). Excel has a function which can remove duplicates.

  1. Once the report is created, under the Data toolbar , click on Remove Duplicates .

  2. Choose the Unselect all button and tick the columns which are duplicated. In this case we will tick the Invoice Number and click the OK button.

  3. Excel will return a message window indicating the number of duplicate values that were found and removed.

  4. You may now process this data without duplicates.

Note - This step needs to be repeated every time the data is refreshed as this setting is lost once the data changes. Using the Unique records only may be more suitable for reporting as refresh will remember this option.