How to find identical values in Excel. The best programs for finding duplicate (identical) files

Let's take advantage of conditional formatting. We have already discussed this topic in the article, and now we will apply it to solve another problem.

Looking for duplicate records in Excel 2007

Select the column in which to look for duplicates (in our example, the column with catalog numbers), and on the Home tab find the “Conditional Formatting” button. Then proceed step by step, as shown in the figure.

In the new window, all we have to do is agree with the proposed color scheme (or choose another) and click “OK”.


Now the repeated values are colored red. But they are scattered throughout the table, which is inconvenient, so the rows need to be sorted to bring them together. Note that the table above has an “Item No.” column containing row numbers. If yours does not have one, create it so that the original order of the data can be restored later.
Select the entire table, go to the “Data” tab and click the “Sort” button. In the new window, set the sort order: choose the values you need and add a second level. The rows must be sorted first by cell color and then by cell value so that the duplicates end up next to each other.


Now deal with the duplicates found. In this case, the duplicate rows can simply be deleted.


Notice that as duplicates are removed, the red cells return to white.
Once no colored cells remain, select the entire table again and sort it by the “Item No.” column. After this, all that remains is to correct the numbering, which now has gaps because of the deleted rows.
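The sort-then-restore idea can also be sketched outside Excel. The snippet below is only an illustration in Python, not part of the Excel procedure; the field names `item_no` and `catalog` and the sample values are made up for the example.

```python
from collections import Counter

def group_duplicates_first(rows, key):
    """Sort rows so that repeated key values come first, side by side."""
    counts = Counter(row[key] for row in rows)
    # False sorts before True, so rows whose value repeats lead the list;
    # within each part the rows are ordered by the value itself.
    return sorted(rows, key=lambda row: (counts[row[key]] == 1, row[key]))

# Hypothetical sample data: a numbering column plus catalog numbers.
rows = [
    {"item_no": 1, "catalog": "A-100"},
    {"item_no": 2, "catalog": "B-200"},
    {"item_no": 3, "catalog": "A-100"},
    {"item_no": 4, "catalog": "C-300"},
    {"item_no": 5, "catalog": "B-200"},
]

grouped = group_duplicates_first(rows, "catalog")
# The numbering column is what makes it possible to restore the original order.
restored = sorted(grouped, key=lambda row: row["item_no"])
```

As in the worksheet, the helper numbering column is the only thing that lets the original order be recovered after sorting.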

How to do this in Excel 2003

It is a little more complicated here: you will have to use the “COUNTIF()” function.
Select the cell containing the first of the values among which you will look for duplicates.

  • Format.
  • Conditional formatting.

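In the first field, select "Formula" and enter the formula "=COUNTIF(C;RC)>1". The formula is written in the R1C1 reference style, where C stands for the whole current column and RC for the current cell, so it is true whenever the value of the cell occurs more than once in its column. The semicolon is the argument separator used with Russian regional settings. In the Russian version of Excel the function name (СЧЁТЕСЛИ) is typed in the Russian keyboard layout and “(C;RC)>1” in the English one, so don’t forget to switch layouts in time.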


Click the “Format” button and select a color on the “View” tab (called “Patterns” in the English version of Excel 2003).
Now we need to copy this format to the entire column.

  • Edit.
  • Copy.

Select the entire column with the data being checked.

  • Edit.
  • Paste Special.


Select “Formats”, “OK” and the conditional formatting is copied to the entire column.
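To see what the COUNTIF condition actually tests, here is a minimal Python sketch of the same check. The helper names `countif` and `duplicate_flags` and the sample column are assumptions made for illustration, not Excel functions.

```python
def countif(values, target):
    """Count how many entries equal target, like COUNTIF over one column."""
    return sum(1 for value in values if value == target)

def duplicate_flags(values):
    """True for every cell whose value occurs more than once in the column."""
    return [countif(values, value) > 1 for value in values]

# Hypothetical column of catalog numbers.
column = ["A-100", "B-200", "A-100", "C-300"]
flags = duplicate_flags(column)
# Note: this comparison is case-sensitive, whereas COUNTIF ignores case in text.
```

Each `True` corresponds to a cell that the conditional format would color.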
Conquer Excel and see you soon!

Let's consider how to find and highlight identical values in Excel. Conditional formatting will help us. For what conditional formatting is and how to work with it, see the article "Conditional formatting in Excel". You can highlight duplicate values in Excel either in the entire table or in a specific range (row, column). The "Filter in Excel" function will help hide them if necessary. Let's consider several ways.
First way.
How to find identical values in Excel.
For example, number, last name, etc. How to do this, see the article “How to select cells in Excel”.
Second way.
How to highlight duplicate values in Excel. In this table we need to highlight the year of birth 1960. Select the “Year of Birth” column. On the “Home” tab, in the “Styles” section, click the “Conditional Formatting” button. Then, in the “Highlight Cells Rules” section, select “Duplicate Values”.
In the dialog box that appears, choose what to highlight: duplicate or unique values. Select the cell fill color or font color.
For more details, see the article “Highlighting a date or day of the week in Excel by condition”.
Click "OK". In column D, all cells containing the year 1960 are highlighted.

In conditional formatting, you can also select the “Text that Contains” rule in the “Highlight Cells Rules” section. Enter the text (for example, a last name or a number), and all cells containing it will be highlighted in color. In our example we entered the surname “Ivanov”. There are many more ways to find identical values in Excel and highlight them not only with color, but also with words, numbers, and symbols. You can set up the table so that duplicates are not only highlighted but also counted. You can highlight repeated values starting from the first occurrence, or only from the second occurrence onwards. Read about all this and more in the article "

Finding duplicates in Excel may not be an easy task, but if you are armed with some basic knowledge, you will find several ways to tackle it. When I first thought about this problem, I quickly came up with a couple of ways to find duplicates, and after thinking about it a little, I discovered a few more ways. So, let's look at a couple of simple ones first, and then move on to more complex methods.

The first step is that you need to put the data into a format that makes it easy to manipulate and change. Creating headings on the top row and placing all the data under those headings allows you to organize your data into a list. In a word, the data turns into a database that can be sorted and various manipulations performed with it.

Find duplicates using built-in Excel filters

Once your data is organized as a list, you can apply various filters to it. Depending on your data set, you can filter the list by one or more columns. Since I'm using Office 2010, all I have to do is select the top row containing the headings, go to the Data tab and click the Filter command. Downward-pointing triangular arrows (drop-down menu icons) will appear next to each heading, as in the image below.

Clicking one of these arrows opens a filter drop-down menu listing all the values found in that column. Select any item from this list and Excel will display the data according to your selection. This is a quick way to summarize the data or see its extent. You can uncheck Select All and then check one or more of the items you want. Excel will show only the rows that contain the items you selected, which makes duplicates, if there are any, much easier to find.

After setting up the filter, you can remove duplicate rows, calculate subtotals, or filter the data further by another column. You can edit the data in the table as you need. In the example below I have selected the items XP and XP Pro.

As a result of the filter, Excel displays only those rows that contain the elements I selected (that is, people on whose computers XP and XP Pro are installed). You can choose any other combination of data, and if necessary, even set up filters in several columns at once.

Advanced filter to find duplicates in Excel

On the Data tab, to the right of the Filter command, there is a button for filter settings: Advanced. This tool is a little more difficult to use and requires some setup first. Your data should be organized as described previously, i.e. like a database.

Before you can use the advanced filter, you must set up a criterion for it. Look at the picture below: it shows a list of data and, on the right, in column L, the criterion. I have written the column heading with the criterion directly under it. The picture shows a table of football matches, and the task is to show only the home matches. That's why I copied the heading of the column I want to filter on and placed the criterion (H) below it.

Now that the criterion is set up, select any cell in the data and click the Advanced command. Excel will select the entire list of data and open this dialog box:

As you can see, Excel has selected the entire table and is waiting for us to specify a range with the criterion. Click in the Criteria range field of the dialog box, then use the mouse to select cells L1 and L2 (or whichever cells contain your criterion) and click OK. The table will display only the rows whose Home / Visitor column contains the value H and will hide the rest. Thus we have found the duplicate data (in one column), showing only the home matches:

This is a fairly simple way to find duplicates, which can help save time and get the necessary information quickly. Remember that the criterion must be placed in cells separate from the data list so that you can find it and use it. You can change the filter by changing the criterion (mine is in cell L2). You can also turn the filter off by clicking the Clear button on the Data tab, in the Sort & Filter group.
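The logic of a criteria range (a heading with the required value under it) can be sketched in a few lines of Python. This is a simplified illustration: the function name `advanced_filter` and the team names are invented, and exact equality is an assumption that glosses over Excel's own text-matching rules.

```python
def advanced_filter(rows, criteria):
    """Keep the rows whose columns equal every value in criteria (heading -> value)."""
    return [
        row for row in rows
        if all(row.get(heading) == value for heading, value in criteria.items())
    ]

# Hypothetical match list; only the "Home / Visitor" heading comes from the article.
matches = [
    {"Team": "Lions", "Home / Visitor": "H"},
    {"Team": "Lions", "Home / Visitor": "V"},
    {"Team": "Bears", "Home / Visitor": "H"},
]

# The dictionary plays the role of cells L1 (heading) and L2 (criterion).
home_only = advanced_filter(matches, {"Home / Visitor": "H"})
```

Changing the value in the dictionary corresponds to editing the criterion cell and applying the filter again.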

Built-in tool to remove duplicates in Excel

Excel has a built-in Remove Duplicates command. You can select a column of data and use it to remove all duplicates, leaving only unique values. The Remove Duplicates button is on the Data tab.

Be sure to select the columns in which only unique values should remain. If the data has no headers, the dialog box will show Column A, Column B and so on, so it is much more convenient to work with headings.

When you're done with the settings, click OK. Excel will display an information window with the result of the operation (example in the figure below), in which you also need to click OK. Excel automatically removes rows with duplicate values, leaving only unique values in the columns you selected. This tool is available in Excel 2007 and newer versions.

Finding duplicates using the Find command

If you need to find a small number of duplicate values in Excel, you can do this using search. Go to the Home tab and click Find & Select. A dialog box will open in which you can enter any value to search for in your table. To avoid typos, you can copy the value directly from the data list.

If the volume of information is very large and you need to speed up the search, select the row or column in which you want to search, and only then start the search. If you don't do this, Excel will search through all available data and return irrelevant results.

If you do need to search through all available data, the Find All button may be more useful.

In conclusion

All three methods are easy to use and will help you find duplicates:

  • Filter: ideal when your data contains multiple categories that you may need to split, summarize, or remove. Creating such subsets is the best use for the advanced filter.
  • Remove Duplicates reduces the amount of data to a minimum. I use this method when I need a list of all the unique values in one of the columns, which I later use for vertical lookups with the VLOOKUP function.
  • I use the Find command only when I need to find a small number of values, and the Find and Replace tool when I find errors and want to correct them all at once.

This is not an exhaustive list of methods for finding duplicates in Excel. There are many ways, and these are just a few of them that I use regularly in my daily work.

In today's Excel files, duplicates are ubiquitous. For example, when you build a combined table from other tables, you may find duplicate values in it, or two different users may have entered the same data in a shared file. Duplicates can occur in one column, in multiple columns, or even in the entire worksheet. Microsoft Excel provides several tools for finding, highlighting, and optionally removing duplicate values. Below are the basic techniques for identifying duplicates in Excel.

1. Removing duplicate values in Excel (2007+)

Let's say you have a three-column table that contains identical records and you need to get rid of them. Select the area of the table in which you want to remove duplicate values. You can select one or more columns, or the entire table. On the Data tab, in the Data Tools group, click the Remove Duplicates button.

If each table column has a header, check the My data has headers box. Also check the boxes next to the columns in which duplicates should be searched for.

Click OK. The dialog box will close and the rows containing duplicates will be deleted.

This command is designed to delete records that completely duplicate other rows in the table. If you check only some of the columns, rows that match in those columns will be removed as well, even if they differ in the columns you left unchecked.
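The effect of checking all columns versus only some of them is easy to see in a small Python sketch. The function name `remove_duplicates` and the sample rows are assumptions for illustration; the sketch keeps the first occurrence of each record.

```python
def remove_duplicates(rows, columns):
    """Keep the first row for every distinct combination of the chosen columns."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row[index] for index in columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows

# Hypothetical three-column table: surname, city, year of birth.
table = [
    ("Ivanov", "Moscow", 1960),
    ("Ivanov", "Kazan", 1960),
    ("Petrov", "Moscow", 1971),
    ("Ivanov", "Moscow", 1960),
]

all_columns = remove_duplicates(table, columns=[0, 1, 2])  # only the exact repeat goes
name_only = remove_duplicates(table, columns=[0])          # every later "Ivanov" goes
```

With only the first column checked, the row for Ivanov from Kazan disappears even though it is not a full duplicate, which is exactly the behaviour to watch out for.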

2. Using an advanced filter to remove duplicates

Select any cell in the table and, on the Data tab, in the Sort & Filter group, click the Advanced button.

In the Advanced Filter dialog box, set the switch to Copy to another location. In the List range field, specify the range containing the table; in the Copy to field, specify the top-left cell of the future filtered table; and check Unique records only. Click OK.

In the place specified for placing the results of the advanced filter, another table will be created, but with data filtered by unique values.

3. Highlight duplicate values using conditional formatting in Excel (2007+)

Select the table in which you need to detect duplicate values. On the Home tab, in the Styles group, choose Conditional Formatting -> Highlight Cells Rules -> Duplicate Values.

In the Duplicate Values dialog box that appears, select a format for highlighting duplicates. My default is a light red fill with dark red text. Note that in this case Excel does not compare entire table rows for uniqueness, only individual cells, so values that repeat in just one column are formatted too. In the example, you can see that Excel has filled some cells in the third column, the one with names, even though the rows those cells belong to are unique.

4. Using Pivot Tables to Determine Repeating Values

Let's take the already familiar three-column table, add a fourth column called Counter, and fill it with ones (1). Select the entire table and, on the Insert tab, in the Tables group, click the PivotTable button.

Create the pivot table. Place the first three columns in the Row Labels area and the Counter column in the Values area. In the resulting pivot table, records with a value greater than one are duplicates, and the value itself is the number of times the record occurs. For greater clarity, you can sort the table by the Counter column to group the duplicates.
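Summing a column of ones per distinct record is the same as counting occurrences. A minimal Python sketch of that idea, with made-up sample rows, might look like this:

```python
from collections import Counter

# Hypothetical three-column table: surname, city, year of birth.
table = [
    ("Ivanov", "Moscow", 1960),
    ("Ivanov", "Kazan", 1960),
    ("Petrov", "Moscow", 1971),
    ("Ivanov", "Moscow", 1960),
]

# Counting each row plays the role of summing the Counter column in the pivot table.
counts = Counter(table)

# Records that occur more than once are the duplicates.
duplicates = {row: n for row, n in counts.items() if n > 1}

# Sorting by count, highest first, mirrors sorting the pivot table by Counter.
by_count = counts.most_common()
```

Here `duplicates` holds one entry, the Ivanov from Moscow record with a count of 2.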

In this material we will talk about tools for identifying duplicate photos. In particular, today we will review six programs for finding duplicate photos on a Windows computer. We will compare and choose the best and fastest among them.

Finding identical photos: programs and their comparison

There may be several reasons for the demand for programs for searching identical photos on a computer, for example:

  • Your collection may have grown so large that duplicates are already taking up a lot of space;
  • You need a tool that will find the same or similar photos without having to go through those images yourself.

Our selection includes six interesting programs, four of which are distributed free of charge. Below we will:

  1. describe each of these programs and help you quickly find your way around its interface;
  2. compare all the programs to see how they cope with finding identical images that have been slightly modified;
  3. check how well the programs cope with a large set of photos totalling several gigabytes.

Find duplicate photos with Image Comparer

The first program in our review that searches for duplicate photos is called Image Comparer. Its strengths: good functionality and an interface translated into Russian, including detailed reference information.

Now about the disadvantages. First, the program is not free. However, the cost of a license is a humane 350 rubles (although for some reason the number on the website is 500). In addition, you can use Image Comparer for free for the first 30 days.

The second drawback is that the interface is slightly confusing, which can trip up an inexperienced user. For example, in order to search within one folder (which may contain others), you need to click the “create gallery” button and select the directory to scan.

Next, you will immediately be prompted to give a name and save the file of the gallery being created to any convenient location (the program itself will need this file). Once this is done, a list of all the images in the specified folder and its subfolders will open in front of you in the form of a list or thumbnails:

Buttons marked with arrows start searching for duplicates. The first button is a search within one gallery (the folder you selected), the second button a little to the right is within several galleries. We went with the first option.

Next, the program suggested creating another service file in which the results will be saved for further convenient access to them. Actually, creating a file for the gallery and this file with search results can be a little confusing for an inexperienced user. However, then everything is already simple. The found duplicates will appear in front of you:

You can view them in the form of thumbnails, or by clicking on the “image pairs” tab, go to the view where the photos will be compared with each other:

The center slider lets you adjust the image similarity threshold. Set it to 100% and the list will contain only pictures that are identical to each other. At lower values, pictures that are merely similar will be shown as well.
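For the strictest case, files that are byte-for-byte identical, no image analysis is needed at all: comparing file hashes is enough. The sketch below illustrates that general technique in Python; it is not a description of how Image Comparer works internally, and the function name `find_identical_files` is an assumption.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_identical_files(folder):
    """Group the files under folder (including subfolders) by the SHA-256 of their bytes."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            # Reads each file fully into memory; fine for a sketch, not for huge files.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Only hashes shared by two or more files indicate duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

A hash comparison finds only exact copies. The same photo saved in another format or with a watermark has different bytes, which is why dedicated programs compare image content instead.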

In the settings you can see a huge list of formats the program works with, from basic JPG and PNG to more exotic ones. Formats can be added to and excluded from the search. You can also configure whether mirrored and flipped images are taken into account.

  • Image Comparer program. Official website;
  • Language: Russian;

Finding identical photos in three clicks with VisiPics

The next program is VisiPics. Unlike the Image Comparer discussed above, VisiPics is a free application that also specializes in photo duplicates. Alas, there is no localization into Russian here, but you definitely shouldn’t be upset about this: everything is very simple and extremely clear.

Using the side navigation bar (we've framed it), select the desired directory. Next, click the arrow with the “+” sign to add this folder to the list that will be searched. If you wish, you can select several more folders in the same way. Finally, as a third step, click the Play button to start the process of finding duplicates.

To the right of it is a special slider where you can adjust the level of “attention” of the program. With the default baseline, VisiPics found only two groups of duplicates for us, one of which consisted of three images and the other of two:

These are the images that the program considers to be almost identical duplicates. However, if you lower the slider to the Loose level, then there will be images that are simply similar to each other. In our case, when installing Loose instead of Basic, the application found four more (5 in the final test below) groups of duplicates, and added one more picture to one of the two already found:

The program has relatively few additional options. Here you can configure search in subfolders (it is enabled by default), display of hidden folders, and take into account photos rotated by 90 degrees. On the loader tab, you can ask VisiPics to ignore small files or, conversely, pictures with too high a resolution. The latter is important for speed.

  • VisiPics program. Official website;
  • Language: English;
  • Distribution: free.

Awesome Duplicate Photo Finder

If you are looking for an extremely simple program for high-quality search of duplicate photos and images, which would be extremely easy to understand, then pay attention to Awesome Duplicate Photo Finder. The interface is in English, but it is so simple that anyone can understand it.

Using the “+” button, specify the directory or several directories you need to search, then click Start Search and the search will begin. The Scan Subdirectories option is enabled by default and is responsible for searching subfolders. The program copes with its tasks, finding both very similar:

And here are pictures that are slightly more different from each other:

In the program settings, you can set the match to 100% if you only need absolutely identical photos.

As you can see, there are few settings themselves. Perhaps the saddest thing is that the program works with only five main formats: BMP, JPG, PNG, GIF and TIFF. Moreover, the latter is not taken into account by default.

There are also options to ensure duplicates are deleted directly to the Trash and disable pop-up confirmation. The program can also update itself automatically.

  • Awesome Duplicate Photo Finder. Official website;
  • Language: English;
  • Distribution: free.

Similar Images Finder

The Similar Images Finder application greets us with an unkind message in English that we need to pay $34 for it. However, the program is ready to work for free for 30 days. Next, a window appears asking you to select directories to search for duplicates:

From it we learn that Similar Images Finder supports 29 image formats, and the user can select specific formats to search or exclude unnecessary ones. In the list, among other things, you can see ico and wbmp.

Clicking Next will start the search for duplicates, and when it's finished, click Next again to see additional settings. By adjusting these, you can more carefully customize what appears in the results list. Finally, by clicking Next a third time, you will see the result itself:

You can move to the next found picture by clicking the miniature arrow in the upper right corner. The entire list of found duplicates opens by clicking on the large button at the top with the addresses of the current files.

In turn, clicking the Next button at the bottom will lead to the final stage of work. There the program will display a list of what, in its opinion, are definitely duplicates and offer to delete them. In the screenshot above, Similar Images Finder coped with an image where a watermark was added and the histogram contrast was changed.

The program estimated the difference between the pictures at 5.5%. Moreover, in another example, where we added a strong blur effect to the second picture, the difference, according to the application, for some reason amounted to only 1.2%:

Alas, the program, while finding real duplicates, by default also shows many images that are completely different from each other, as if they have something in common:

  • Similar Images Finder. Official website;
  • Language: English;
  • Distribution: paid, 30 days free use.

Universal search for duplicates with Duplicate Remover Free

Duplicate Remover Free is the only program in our review that is not focused specifically on duplicate photos, but on finding duplicates in general.

As practice shows, such universal solutions do not perform very well in problems related to some narrower area.

However, today we are giving one such program a chance. As the word Free suggests, it is distributed free of charge. The second advantage of the application is the Russian language, and the third is its relative modernity compared to other programs in this collection, many of which, unfortunately, have not been updated for many years.

You should click on the “add directory” button and select the desired folders. By default, the program did not find anything for us in the given directory, however, when at the top instead of “exact duplicates” we selected the “similar images” item, four groups of duplicates were immediately found, one of which consisted of three files at once:

The application has very few additional features. In particular, you can exclude files below or above a certain size from the search.

  • Duplicate Remover Free. Official website;
  • Language: Russian;
  • Distribution: free.

Search for matching photos using various algorithms with AntiDupl

The final participant in our review, the AntiDupl program, may appeal to you for several reasons. First of all, it's free. Secondly, it has a Russian interface. The latter, however, is not obvious. In order to enable Russian, open the View menu and in the Language section select the appropriate item:

Unlike others, this program is not installed, but is located in a self-extracting archive, which extracts it along with the necessary files into a separate folder.

To prepare the search for duplicates, click on the button labeled Open and add the necessary directories in the window that appears:

Next, you can click OK, and then activate the green “start search” button on the toolbar. Using a basic algorithm, the program found several groups of duplicates for us:

After switching the algorithm at the top to the more permissive SSIM, we received two more groups of duplicates, and after increasing the “freedom” of the search from 20 to 35%, the program gave us an even more detailed list:

Moreover, in all cases, there were indeed images that were at least noticeably similar to each other. So don't hesitate to experiment with the settings.
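How a percentage threshold turns a difference measure into a duplicate or not-duplicate decision can be shown with a toy example. The sketch below uses a plain mean absolute pixel difference on tiny grayscale grids. It is neither SSIM nor the algorithm of AntiDupl or any other program in this review, and the names `difference_percent` and `is_similar` are assumptions.

```python
def difference_percent(image_a, image_b):
    """Mean absolute pixel difference as a percentage of the 0-255 range."""
    same_shape = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    )
    pixels = sum(len(row) for row in image_a)
    return 100.0 * total / (255 * pixels)

def is_similar(image_a, image_b, threshold_percent):
    """Treat two images as duplicates when they differ by no more than the threshold."""
    return difference_percent(image_a, image_b) <= threshold_percent

# Two hypothetical 2x2 grayscale images (0 = black, 255 = white).
original = [[0, 0], [255, 255]]
edited = [[0, 51], [255, 204]]
```

With these sample grids the difference is 10%, so the pair counts as similar at a 20% threshold but not at 5%. Raising the threshold admits more pairs, which matches the longer result lists seen above.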

The program has many additional options:

On the “search” tab you can find out that AntiDupl supports 13 formats, including, in addition to traditional JPG/PNG, ICON, PSD and EXIF. Of course, you can choose formats. The options also include checking for defects, blockiness and blurriness, and in the last two cases you can set a threshold. It is possible to search in hidden and system directories.

  • AntiDupl program;
  • Language: Russian;
  • Distribution: free.

When searching for duplicates, some users are interested in 100% matching in order to get rid of duplicates in their collection. However, the task often arises of finding simply similar pictures.

And here there is a huge space for possible differences. This can be different formats, resolutions, cropped versions of the same image, adding frames and watermarks, changed colors and captions on the pictures.

We tried to take into account most of these factors and, after all the testing, we ended up creating a small set with more than six dozen pictures. In them we created nine groups of duplicates. Let’s be honest, our experience certainly doesn’t claim to be the ultimate truth, but it was interesting to try. The results are as follows:

  • Duplicate Remover Free: found only 3 groups of duplicates;
  • Similar Images Finder: found 4 groups, but the inconvenient interface, many false results, and the paid nature of the application greatly spoiled the overall impression;
  • AntiDupl found 3 groups of duplicates by default, and switching to the SSIM algorithm increased the result to 5 groups;
  • VisiPics found only 2 groups of duplicates at the basic search level, but setting the slider to the Loose level allowed it to find 7 groups;
  • Awesome Duplicate Photo Finder found 7 groups of duplicates;
  • Image Comparer was also able to detect 7 groups.

At the same time, Image Comparer was able to find images that Awesome Duplicate Photo Finder and VisiPics missed, and they, in turn, filled in the gaps of Image Comparer.

The fastest programs for finding duplicate photos

At the same time, the quality of a program also depends on its speed. A set of 60+ pictures is, of course, far smaller than what users actually work with. So we ran another test, this time for speed. To do this, we took a selection of 4450 very different images with a total size of more than 2.1 GB.

Unfortunately, two programs from this review could not be ranked in this test. As it turned out, Similar Images Finder, which costs $34, in its free version is ready to process no more than 200 images at a time.

In turn, the universal duplicate search engine Duplicate Remover Free, faced with a catalog of serious size, worked intensively for more than five minutes, and then completely froze. The remaining programs showed the following times:

  • AntiDupl: 0:39;
  • Image Comparer: 1:02 (35 seconds to create gallery and 27 to search);
  • VisiPics: 2:37;
  • Awesome Duplicate Photo Finder: 3:17.

As a result, Image Comparer and AntiDupl clearly took the lead in the speed test. Processing our archive took Image Comparer about a minute and AntiDupl less than a minute.

Conclusion

Let's summarize. If you need to find not identical, but rather similar photographs that differ, for example, in a signature or watermark, then Image Comparer, Awesome Duplicate Photo Finder and VisiPics in Loose mode will cope with this task better than others.

In terms of processing speed for a large collection of images, the undisputed leaders are AntiDupl and Image Comparer.

Finally, in terms of interface convenience, we liked Image Comparer and VisiPics, which immediately allow you to visually evaluate all groups of duplicates. In turn, for the clarity of comparing the characteristics of individual duplicates, we will also note AntiDupl.