
Tip #648: Uniq - Removing duplicate lines


created:   February 1, 2004 20:45      complexity:   intermediate
author:   Michael Geddes      as of Vim:   6.0

There are two versions, both of which collapse runs of identical adjacent lines. The first leaves only the last line of each run; the second leaves only the first line.

g/^\(.*\)$\n\1$/d

g/\%(^\1$\n\)\@<=\(.*\)$/d

Breakdown of the second version:

g//d       <-- Delete every line that matches the regexp

\@<=       <-- Lookbehind: the atom before this symbol must match immediately before the text that follows it

\(.*\)$    <-- Match the whole line and capture it as sub-expression 1 (\1)

\%( \)     <-- Group without capturing a sub-expression
^\1$\n     <-- Match sub-expression 1 as a complete line, followed by the newline between the 2 lines
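
To check which line of each run survives, the two whole-line commands can be modeled outside Vim. The Python sketch below is only an illustration and not part of the tip: the name uniq_adjacent and its keep parameter are made up here. It assumes, as :g does, that every matching line is marked against the unmodified buffer before anything is deleted.

```python
def uniq_adjacent(lines, keep="first"):
    # Hypothetical helper that collapses runs of identical adjacent lines.
    #   keep="last"  models g/^\(.*\)$\n\1$/d           (drop a line equal to the next one)
    #   keep="first" models g/\%(^\1$\n\)\@<=\(.*\)$/d  (drop a line equal to the previous one)
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    result = []
    for i, line in enumerate(lines):
        if keep == "first":
            # Marked for deletion when the previous line is identical.
            doomed = i > 0 and lines[i - 1] == line
        else:
            # Marked for deletion when the following line is identical.
            doomed = i + 1 < len(lines) and lines[i + 1] == line
        if not doomed:
            result.append(line)
    return result


buffer = ["a", "a", "b", "a", "a", "a", "c"]
print(uniq_adjacent(buffer, keep="first"))
print(uniq_adjacent(buffer, keep="last"))
```

Both calls print ['a', 'b', 'a', 'c']: when whole lines are compared, the remaining text is the same whichever copy of a run is kept.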

In this simple form (matching the whole line), it makes little difference which version you use, but it starts to matter if you want to match on only part of the line, such as the first word.

This does a uniq on the first word in the line, and deletes all but the first line:

g/\%(^\1\>.*$\n\)\@<=\(\k\+\).*$/d
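
The intent of the first-word command can be sketched in Python for a quick sanity check. This is an approximation under stated assumptions, not the tip's own code: the name uniq_first_word is hypothetical, and \k\+ is stood in for by \w+, whereas in Vim the keyword characters depend on the 'iskeyword' option.

```python
import re

# Assumption: \w+ approximates Vim's \k\+ (keyword characters).
FIRST_WORD = re.compile(r"\w+")


def uniq_first_word(lines):
    # Hypothetical helper: drop a line when the line directly before it
    # starts with the same first word; the first line of each run is kept.
    result = []
    prev_key = None
    for line in lines:
        m = FIRST_WORD.match(line)
        key = m.group(0) if m else None
        # A line that does not start with a word character cannot match
        # the Vim pattern, so it is always kept.
        if key is None or key != prev_key:
            result.append(line)
        prev_key = key
    return result


buffer = ["apple 1", "apple 2", "banana 1", "apple 3", "banana 2", "banana 3"]
print(uniq_first_word(buffer))
```

This prints ['apple 1', 'banana 1', 'apple 3', 'banana 2']. Here it does matter which line of a run is kept, because the rest of each line differs.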



Additional Notes

[email protected], February 2, 2004 4:57
Or you could simply pipe the file (or a range of lines) through uniq(1), thusly:
:%!uniq

Cheers,
Morel.
Anonymous, February 2, 2004 5:35
Unless you are stuck on a Windows machine using Vim, in which case this tip is most appreciated  :)

-- RS
[email protected], February 2, 2004 10:33
Then again, Windows users can have 'sort', 'uniq', 'grep' and a host of others if they visit the unxutils.sourceforge.net site.
Anonymous, February 2, 2004 15:34
Or http://www.cygwin.com
Anonymous, February 2, 2004 16:38
Of course, personally, I use  sort | uniq  whether on my Windows or my Unix box.  However, if you were (for example) going to write a script that needed uniq, then you shouldn't assume that either exists.

As somebody else has come up with sort, I thought I'd have a go at a pure Vim version of uniq.

I would definitely use this in a script over assuming an environment.  Not everybody wants to download cygwin or friends for the lack of one or two commands. (Not that M*cr*s*ft doesn't suck in so many ways).  I'm sure that with the multitude of platforms that Vim runs on, there are a few out there that don't have convenient ports of, or alternatives to, Unix commands.
//.ichael Geddes
[email protected], February 4, 2004 8:08
Here are some more vim-native ways for removing duplicate
lines. This time they don't have to be adjacent. Line order
is preserved.

This one can be a bit slow.
:nno \d1 :g/^/m0<CR>:g/^\(.*\)\n\_.*\%(^\1$\)/d<CR>:g/^/m0<CR>

This is faster (some help from Preben Guldberg with this one).
Uses mark l.
:nno \d2 :g/^/kl\|if search('^'.escape(getline('.'),'\.*[]^$/').'$','bW')\|'ld<CR>

Antony
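
For comparison, the result these two mappings aim for (drop repeats anywhere in the buffer, keep line order) can be sketched in a few lines of Python. This is an illustration, not a translation of either mapping: the name uniq_anywhere is hypothetical, and it assumes the first occurrence of each line is the one that survives, which is how both mappings read.

```python
def uniq_anywhere(lines):
    # Hypothetical helper: keep the first occurrence of every line and
    # drop later repeats, without changing the order of what remains.
    seen = set()
    result = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result


buffer = ["b", "a", "b", "c", "a", "b"]
print(uniq_anywhere(buffer))
```

This prints ['b', 'a', 'c']. The set lookup is what keeps the sketch fast on large inputs; the first mapping instead rescans the rest of the buffer for every line, which would explain why it can be slow.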