I shy away from using Microsoft Excel for data management, but sometimes it is necessary to do quick manipulations on a relatively small dataset, and for this purpose Excel shines. One task I assumed Excel could do, but never knew exactly how, was filtering out duplicate records from a dataset. After some Googling, I found out how to do it:
(1) Select the column headers and rows in the worksheet you want to remove duplicate records from.
(2) Go to the Data tab and select Advanced Filter.
(3) Under Action, select the "Filter the list, in-place" radio button and check the "Unique records only" box.
(4) Press OK.
You should now have a filtered worksheet in which duplicate rows are hidden (filtering in place hides them rather than deleting them). The visible rows can be copied and pasted into a new worksheet for further manipulation. I hope this was helpful. If you have improvements to the method or further questions, please add them in the comments section below.
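For anyone who would rather script this step than click through Excel, here is a minimal Python sketch of the same idea: keep the first occurrence of each row and drop later exact copies. It assumes the worksheet has been exported as CSV. The sample data, the column names, and the unique_rows helper are all made up for illustration, and the comparison is on exact text, which may not match Excel's own notion of a duplicate in every case.

```python
import csv
import io


def unique_rows(rows):
    """Return rows with exact duplicates dropped, keeping the first occurrence."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, so compare rows as tuples
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical sample data standing in for a worksheet exported as CSV
sample = io.StringIO(
    "sample_id,genotype\n"
    "S1,AA\n"
    "S2,AG\n"
    "S1,AA\n"
    "S3,GG\n"
)

reader = csv.reader(sample)
header = next(reader)          # keep the header row out of the comparison
deduped = unique_rows(reader)  # data rows, duplicates dropped

for row in [header] + deduped:
    print(",".join(row))
```

To run it on a real file, replace the io.StringIO sample with an opened CSV file, for example open("my_data.csv", newline=""), where the file name is again just a placeholder.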