Mass delete

Question

Mass delete

Anonymous

Hi

somehow i have appended the same records to a table multiple times.

The table has a data time stamp down to the second called

CallDate/Time

is there a cleaver way of matching the duplicates in this field, then running a delete that will only remove one of the duplicates

Anyhelp would be appreciated

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

Answer accepted by question author

8 additional answers

Answer 1

Yes, there's a method to do that.

However, it depends in part on there being a unique value in one field, i.e. a Primary Key, for records. Do you have a Primary Key?

Also, it may be the case that you could revert to a version of the table from a backup taken just before the erroneous append. That might be the least hassle. It might involve backfilling some records not available in that backup, but it would arguably be less difficult than finding and deleting all but one of a number of duplicate records.

Microsoft has a page describing one approach to deleting duplicates, if you need to go that route.

https://support.microsoft.com/en-us/office/delete-duplicate-records-with-a-query-4e6c3183-689c-439d-9062-326c85d80a83 It doesn't seem to be complete, although it should get you started, IMO.

Answer 2

Anonymous

Hi yes had a primary id but the data is used as a subform so the same id can have several different updates that’s why I really wanted to used the date/time field p

Answer 3

You might like to take a look at DeleteDemo.zip in my Dropbox public databases folder at:

https://www.dropbox.com/scl/fo/0scigd3r48hx5xrev2jrf/AB0-GMdTgMAO5O1cGdr3QW0?rlkey=ib6bs6g9jqcrywwzivur3265t&dl=0

This little demo file includes an illustration of how to delete all but one row of each subset of rows which have the same FirstName and LastName values. The first method retains the row with the highest ContactID primary key value of the duplicated rows. You could use the date/time stamp column, assuming that this is indexed uniquely:

DELETE *

FROM Contacts AS C1

WHERE ContactID <>

     (SELECT MAX(ContactID)

      FROM Contacts AS C2

      WHERE C2.LastName = C1.LastName

      AND C2.FirstName = C1.FirstName);

The second methods uses a less arbitrary attribute to determine which row is to be retained, in this example retaining those with the latest date of birth, or if that is also duplicated, with the highest ContactID of the rows with the duplicated dates of birth:

DELETE *

FROM Contacts AS C1

WHERE ContactID <>

     (SELECT MAX(ContactID)

      FROM Contacts AS C2

      WHERE C2.LastName = C1.LastName

      AND C2.FirstName = C1.FirstName

      AND C2.DoB =

             (SELECT MAX(DoB)

              FROM Contacts AS  C3

              WHERE C3.LastName = C1.LastName

              AND C3.FirstName = C1.FirstName));

Answer 4

I think what you want is the second of his two methods, since you have a unique date/time value for your differentiating field.

And, don't forget to make a backup before trying one of Ken's methods. I assume from the non-answer that you do not have a backup from which to recover, so let's not get in that pickle again while trying to recover.

Answer 5

Hi yes had a primary id but the data is used as a subform so the same id can have several different updates that’s why I really wanted to used the date/time field p

I don't fully understand what you are saying here, it sounds like the id column is a foreign key of a referencing table, not the primary key of the table from which you wish to delete all duplicates bar one, but if the date/time column is a candidate key, i.e. is, or can be, indexed uniquely, then you can use the first of the two methods I posted, otherwise you will need to use the second method. Either way you will need to identify a combination of columns which will reliably determine the subsets of rows which are duplicates.

Share via

Mass delete

8 additional answers