Programming :: Remove Duplicate Lines From Shell Script
Apr 8, 2010
I have a file with semi duplicate lines, like:
abc 12 32
agsi 82
sha 26
abc 1
iaij
agsi 3
Now I want to edit my file and make it:
abc 12 32
agsi 82
sha 26
iaij
i.e. remove the second occurrence of a line when the first column is abc or agsi.
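A minimal sketch of one common approach, assuming the goal is to keep only the first line seen for every first-column value (not just abc and agsi). The file name data.txt is hypothetical; the sample lines are the ones from the post.

```shell
# Sample data copied from the post; data.txt is a hypothetical file name.
tmp=$(mktemp -d)
printf '%s\n' 'abc 12 32' 'agsi 82' 'sha 26' 'abc 1' 'iaij' 'agsi 3' > "$tmp/data.txt"

# seen[$1]++ evaluates to 0 (false) the first time a first-column value
# appears, so the negated expression prints only the first line per key.
out=$(awk '!seen[$1]++' "$tmp/data.txt")
printf '%s\n' "$out"
```

Because awk reads the file once and never sorts it, the original line order is preserved.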
View 13 Replies
May 8, 2010
I have a file that contains lines representing the nodes of a polyline but I only need the first point in each segment. With the following text:
0,"013A",0.57,260739.891,4379258.87
0,"013A",0.57,260737.674,4379258.94
0,"013A",0.57,260684.628,4379258.35
1,"013A",0.545,260769.915,4379257.84
1,"013A",0.545,260739.891,4379258.87
[Code]....
The problem with uniq is that the last two columns differ. I don't care about the x/y values of any point after the first one in each segment.
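One possible sketch, assuming the first comma-separated field is the segment id and that the first line of each segment is the one to keep. The file name nodes.csv is hypothetical.

```shell
# The five sample lines from the post; nodes.csv is a hypothetical name.
tmp=$(mktemp -d)
cat > "$tmp/nodes.csv" <<'EOF'
0,"013A",0.57,260739.891,4379258.87
0,"013A",0.57,260737.674,4379258.94
0,"013A",0.57,260684.628,4379258.35
1,"013A",0.545,260769.915,4379257.84
1,"013A",0.545,260739.891,4379258.87
EOF

# -F, splits on commas; only the first line for each segment id is printed,
# so the differing x/y columns no longer matter.
out=$(awk -F, '!seen[$1]++' "$tmp/nodes.csv")
printf '%s\n' "$out"
```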
View 4 Replies
Dec 16, 2010
Each of these 67 text files contains about 1 million URLs. Yes, I have 67 text files with 1 million lines of URLs each, and I am sure I am swimming in duplicates. I tried opening one text file and clicking Sort -> Remove Duplicates. Now Gedit is not responding, my processor is maxed out at 100%, and I think I am finally ready to delve into some command-line code. Can anyone give me idiot-proof instructions on how to sort the duplicates out of each one of these 67 text files? How about no duplicates across all 67?
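A hedged sketch of both requests using sort -u, assuming it is acceptable for the URLs to end up sorted. The file names urls1.txt, urls2.txt and all-unique.out are hypothetical stand-ins for the 67 files.

```shell
# Two tiny stand-in files; the real ones would hold about a million URLs each.
tmp=$(mktemp -d)
printf '%s\n' 'http://a.example/' 'http://b.example/' 'http://a.example/' > "$tmp/urls1.txt"
printf '%s\n' 'http://b.example/' 'http://c.example/' > "$tmp/urls2.txt"

# De-duplicate each file in place. sort reads all input before writing,
# so -o may name the same file that is being sorted.
for f in "$tmp"/urls*.txt; do
    sort -u -o "$f" "$f"
done

# One merged list with no duplicates across all files.
sort -u "$tmp"/urls*.txt > "$tmp/all-unique.out"
out=$(cat "$tmp/all-unique.out")
printf '%s\n' "$out"
```

sort spills to temporary files when the input does not fit in memory, which is why it tends to cope with inputs that make a graphical editor hang.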
View 7 Replies
Apr 14, 2010
I have a big file of random numbers that I generated at some point and have since worked with in different ways (how fun that was). I want to remove duplicate lines and I'm not sure I'm doing this right.
Here's the command:
Code:
sort random.txt | uniq -u > rand-shorter.txt
The file is pretty big, with every number on its own line. I found the command on a website, so I'm sure it's correct (I'm a bit of a Linux command-line newbie).
Can anyone confirm whether this will remove duplicate lines (keeping one copy) and dump what is left in a file named rand-shorter.txt?
EDIT: I think it's actually working, just taking a really long time (on an old Pentium 4 from 2000).
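A small comparison that may help answer the question. uniq -u prints only the lines that are not repeated at all, so every duplicated value disappears completely; sort -u (or sort piped into plain uniq) keeps one copy of each. The sample numbers are made up for illustration.

```shell
# Hypothetical sample: 3 appears three times, 7 twice, 9 once.
tmp=$(mktemp -d)
printf '%s\n' 7 3 7 9 3 3 > "$tmp/random.txt"

# uniq -u drops every value that occurs more than once.
only_unrepeated=$(sort "$tmp/random.txt" | uniq -u)

# sort -u keeps exactly one copy of each distinct line.
one_copy_each=$(sort -u "$tmp/random.txt")

printf 'uniq -u:\n%s\n' "$only_unrepeated"
printf 'sort -u:\n%s\n' "$one_copy_each"
```

With this input the first command prints only 9, while the second prints 3, 7 and 9.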
View 8 Replies
Mar 17, 2011
Trying to remove lines from a syslog text file that have duplicate strings
Mar 10 06:51:11[http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360]
then a few lines down
Mar 10 06:52:03 [http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360
Both lines have the same u: number. I need to remove the duplicates and leave just one line per u: number; the file contains multiple duplicates for different u: numbers and is 14,000 lines long. Can anyone tell me whether I can use awk, sed, or sort for this, i.e. removing lines in which a certain string is a duplicate?
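A sketch with awk, assuming the key is the digits that follow "[u:" and that the first line for each key should be kept. The log lines below are shortened, hypothetical versions of the ones in the post, and syslog.txt is a placeholder name.

```shell
tmp=$(mktemp -d)
cat > "$tmp/syslog.txt" <<'EOF'
Mar 10 06:51:11 [http-8080-1] INFO Controller [u:2533274802474744|360] Authorize
Mar 10 06:51:40 [http-8080-2] INFO Controller [u:1111111111111111|360] Authorize
Mar 10 06:52:03 [http-8080-1] INFO Controller [u:2533274802474744|360] Authorize
EOF

# match() sets RSTART and RLENGTH to the position of "[u:<digits>".
# Lines whose key was already seen are skipped; lines without a key pass through.
out=$(awk '
    match($0, /\[u:[0-9]+/) {
        key = substr($0, RSTART, RLENGTH)
        if (seen[key]++) next
    }
    { print }
' "$tmp/syslog.txt")
printf '%s\n' "$out"
```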
View 4 Replies
Nov 24, 2009
How do you remove parts of strings using python? Such as, if I have something like:
Code:
erme1 sdifskenklsd
erme2 sdfjksliel
[code]....
View 3 Replies
Nov 1, 2010
we have a variable LIBS:
LIBS='-lpq -lmysqlclient -lssl -lpthread -lresolv -lssl -lpthread -lresolv'
I'd want to get:
LIBS='-lpq -lmysqlclient -lssl -lpthread -lresolv'
Bear in mind that LIBS can vary: I need to drop every duplicate and retain only the last one of each distinct entry. We must also keep the order as is; I must not sort them.
So, if LIBS is:
LIBS='-lpq -lmysqlclient -lpthread -lresolv -lpthread -lresolv'
I need: LIBS='-lpq -lmysqlclient -lpthread -lresolv'
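A possible sketch, assuming entries are separated by whitespace and contain no spaces themselves. It walks the fields from right to left so that the last occurrence of each entry is the one that survives, while the relative order of the survivors is unchanged.

```shell
LIBS='-lpq -lmysqlclient -lssl -lpthread -lresolv -lssl -lpthread -lresolv'

# Scan fields from last to first; prepend each entry the first time it is
# met, which corresponds to its last occurrence in the original string.
LIBS=$(printf '%s\n' "$LIBS" | awk '{
    out = ""
    for (i = NF; i >= 1; i--)
        if (!seen[$i]++)
            out = (out == "" ? $i : $i " " out)
    print out
}')
printf '%s\n' "$LIBS"
```

Keeping the last occurrence rather than the first can matter for static linking, where a library usually has to appear after the objects that reference it.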
View 3 Replies
May 12, 2010
I have a huge (over 10 GB) file with a list of IPs, each followed by a corresponding number, like this:
Code:
12.32.34.23 10
143.32.34.543 11
232.32.45.65 12
54.23.5.232 13
143.32.34.43 14
and so on..
I'm trying to sort this file numerically and weed out any duplicate IP addresses. How do I do this in bash? I have come up with this, but obviously it doesn't work.
Code:
$ sort -n myfile.txt | cut -f1 | uniq -u
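A hedged sketch, assuming "sort numerically" means ordering by the number in the second column and that the first line seen for each IP is the one to keep. The sample data is hypothetical and includes one repeated IP.

```shell
tmp=$(mktemp -d)
printf '%s\n' '12.32.34.23 12' '143.32.34.43 10' '12.32.34.23 14' '54.23.5.232 11' > "$tmp/myfile.txt"

# Step 1: keep the first line for each IP (column 1).
# Step 2: sort what is left numerically on column 2.
out=$(awk '!seen[$1]++' "$tmp/myfile.txt" | sort -k2,2n)
printf '%s\n' "$out"
```

Two details of the original attempt are worth noting: cut -f1 splits on tabs by default, so it does nothing useful on space-separated lines, and uniq -u discards every repeated line instead of keeping one copy. The awk step holds one array entry per distinct IP in memory, which may be a concern on a file of this size.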
View 2 Replies
May 25, 2010
Thanks y'all for the great script and explanation. This helped a lot in my own project. I thought I'd share the efforts. The project is this: I've got lots of duplicate JPGs from all the family members who've named the same photo with different names. Since md5sum generates a "fingerprint" based on the file contents, not the name, I want to use the md5sum of each JPG to uniquely name each photo and also remove exact duplicates.
It has the following flaws:
0) it doesn't handle certain non-alphanumerics
1) it keeps both photo-shopped and unaltered photos (different md5s)
2) it (currently) doesn't preserve descriptive filenames.
(For me, removal of duplicates is more important than keeping the filenames. I may change that to concatenate the md5 and the filename.) Please note that the commented "rename" command should be used to strip non-alphanumerics from the file names, and the script should be launched with the commented "find" command.
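The script itself is not reproduced in this listing. The following is only an illustrative sketch of the idea described, not the poster's code: it names each file after its md5sum and deletes a file when that name is already taken. The sample file names and contents are hypothetical.

```shell
# Three stand-in "photos"; two have identical bytes under different names.
tmp=$(mktemp -d)
printf 'photo-A' > "$tmp/beach.jpg"
printf 'photo-A' > "$tmp/IMG_0001.jpg"
printf 'photo-B' > "$tmp/party.jpg"

for f in "$tmp"/*.jpg; do
    # md5sum on stdin prints "<hash>  -"; keep only the hash.
    sum=$(md5sum < "$f" | cut -d' ' -f1)
    target="$tmp/$sum.jpg"
    if [ "$f" = "$target" ]; then
        continue              # already named after its checksum
    elif [ -e "$target" ]; then
        rm -- "$f"            # exact duplicate of a file already kept
    else
        mv -- "$f" "$target"
    fi
done

count=$(ls "$tmp" | wc -l | tr -d ' ')
printf '%s files remain\n' "$count"
```

As the flaws listed in the post say, this treats an edited copy as a different photo and throws away the descriptive file names.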
View 1 Replies
Nov 3, 2010
How can I remove all lines which contain A,,,,,,? I tried the following sed statements but had no luck.
Code:
sed "/A,,,,,,/d file"
sed "/A,,,,,,/d file"
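In the attempts as shown, the file name sits inside the quotes, so sed treats it as part of the script instead of as an input file. A minimal sketch with the quotes closed after the d command, using made-up sample lines:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'A,,,,,,' 'B,1,2,3,4,5,6' 'xA,,,,,,y' 'A,1,,,,,' > "$tmp/file"

# The sed script is only /A,,,,,,/d; the file name is a separate argument.
out=$(sed '/A,,,,,,/d' "$tmp/file")
printf '%s\n' "$out"
```

This writes the result to standard output; redirecting it to a new file and then replacing the original is the portable way to make the change permanent.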
View 6 Replies
Apr 14, 2011
I really need help with this part of a shell script which I am trying to migrate to DOS batch script.
View 3 Replies
Sep 16, 2010
I have such a file(test.txt):
abc 123 456
abc 256 145
axd 125 225
[code]...
View 8 Replies
Nov 7, 2010
I am using 'sed -e /foo/d' to match lines which I want to delete from a file. I discovered I have some lines which contain random (extended?) characters like 'ủ' which I would also like to delete. The lines in the file should contain only alphanumeric characters.
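One way this might be done, assuming spaces and tabs are also acceptable inside a line. Running sed in the C locale makes every byte outside plain ASCII fall outside [:alnum:], so a line containing a character such as 'ủ' is deleted. The sample lines are hypothetical.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'abc123' 'foủbar' 'xyz 9' > "$tmp/words.txt"

# Delete any line containing a byte that is neither ASCII alphanumeric
# nor whitespace. Note that punctuation also triggers deletion.
out=$(LC_ALL=C sed '/[^[:alnum:][:space:]]/d' "$tmp/words.txt")
printf '%s\n' "$out"
```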
View 8 Replies
Jul 15, 2010
I'm trying to search through some PDF files by converting them to text files using pdftotext, which works fine. However, I'm trying to count the occurrences of different words in a paragraph, and pdftotext adds a newline character at what it thinks is the right-hand margin. I'm trying to remove all these single newline characters but keep the double ones, and I can't seem to work it out, i.e.
This is some text that has been broken.
Another paragraph.
becomes
This is some text that has been broken.
Another paragraph
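A sketch using awk's paragraph mode, assuming paragraphs are separated by at least one blank line. Where exactly the first sentence was broken is not visible in the example, so the break position in the sample input is an assumption, as is the file name page.txt.

```shell
tmp=$(mktemp -d)
printf 'This is some text that\nhas been broken.\n\nAnother paragraph.\n' > "$tmp/page.txt"

# RS = "" makes awk read one blank-line-separated paragraph per record.
# Newlines inside a record become spaces; ORS restores the blank line.
out=$(awk 'BEGIN { RS = ""; ORS = "\n\n" } { gsub(/\n/, " "); print }' "$tmp/page.txt")
printf '%s\n' "$out"
```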
View 9 Replies
Jan 21, 2011
I'm trying to come up with ideas for a simple way to strip a specific "entry" from a text file. I know tools like sed and perl can remove specific lines from a file but I haven't been able to come up with an elegant way to do my group of lines. In my file, the first "Location" line and the "SVNPath" line should be unique every time... but are they enough to strip out the whole group plus the trailing line of white space separating each group? Add to this, my file will grow as new entries are added (always appended to the end) but new entries will have the same formatting.
View 9 Replies
Jun 21, 2011
I have a CSV file (A.csv) with a total of 4,600,000 lines. That's too many, and only a few are necessary. I have a text file with 150 lines (X.txt); each line is a dataset name from a mainframe and looks like abc.def.123.456. How do I remove the lines from A.csv in which none of the dataset names from X.txt is present?
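A minimal sketch with grep, assuming a line should be kept when any name from X.txt occurs anywhere in it. The CSV layout and the sample names are hypothetical.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'abc.def.123.456' 'xyz.qrs.000.111' > "$tmp/X.txt"
printf '%s\n' '1,abc.def.123.456,ok' '2,no.match.here,skip' '3,xyz.qrs.000.111,ok' > "$tmp/A.csv"

# -f reads one pattern per line from X.txt; -F treats each pattern as a
# fixed string, so the dots in dataset names are not regex wildcards.
out=$(grep -F -f "$tmp/X.txt" "$tmp/A.csv")
printf '%s\n' "$out"
```

Two caveats: an empty line in X.txt matches every line of A.csv, and matching is by substring, so a short name also matches longer names that contain it.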
View 13 Replies
Jul 7, 2010
I have a series of input files formatted like this:
Code:
RTREVF, KOG3266 = 111
RTREVF, KOG3294 = 130
RTREVF, KOG3295 = 177
WAGF, KOG3307 = 107
JTTF, KOG3320 = 174
Each line represents a portion of a data matrix. I want to convert the numbers after the "=" to the range of that partition in the matrix such that the output file looks like this:
Code:
RTREVF, KOG3266 = 1-111
RTREVF, KOG3294 = 112-241
RTREVF, KOG3295 = 242-418
WAGF, KOG3307 = 419-525
JTTF, KOG3320 = 526-699
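The conversion is a running total: each partition starts one past the end of the previous one and ends at the previous end plus its own length. A sketch in awk, assuming every line has the form "name = length"; partitions.txt is a hypothetical file name.

```shell
tmp=$(mktemp -d)
cat > "$tmp/partitions.txt" <<'EOF'
RTREVF, KOG3266 = 111
RTREVF, KOG3294 = 130
RTREVF, KOG3295 = 177
WAGF, KOG3307 = 107
JTTF, KOG3320 = 174
EOF

# Split on " = "; "last" holds the end of the previous partition (0 at first).
out=$(awk -F' = ' '{
    start = last + 1
    last += $2
    print $1 " = " start "-" last
}' "$tmp/partitions.txt")
printf '%s\n' "$out"
```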
View 5 Replies
Jul 16, 2010
I have a large text file that's formatted sort of like this:
Code:
foo bar
blah
[code]...
View 2 Replies
Jun 5, 2009
I want to remove duplicate or multiple similar lines from multiple files. I.e. if I have four files, file1.txt, file2.txt, file3.txt and file4.txt, I would like to find and remove similar lines from all these files, keeping only one line from each set of similar lines. I only know that uniq can be used to remove similar lines from a sorted file.
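A sketch that avoids sorting, assuming "similar" means identical and that each file should keep its own surviving lines. It writes a cleaned copy next to each input with a hypothetical .dedup suffix; only two small sample files are used here.

```shell
tmp=$(mktemp -d)
printf '%s\n' alpha beta > "$tmp/file1.txt"
printf '%s\n' beta gamma alpha > "$tmp/file2.txt"

# The seen array is shared across all input files, so a line is written
# only for the first file (and first position) in which it appears.
awk '!seen[$0]++ { print > (FILENAME ".dedup") }' "$tmp/file1.txt" "$tmp/file2.txt"

first=$(cat "$tmp/file1.txt.dedup")
second=$(cat "$tmp/file2.txt.dedup")
printf 'file1: %s\nfile2: %s\n' "$(echo $first)" "$(echo $second)"
```

A file whose lines all appeared earlier produces no .dedup file at all, which a wrapper script would need to handle.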
View 9 Replies
Feb 7, 2011
I have a file "test.txt" with following data
#1
aaa
#2
bbb
#3
aaa
#4
ddd
I wanted it to be displayed as
#1
aaa
#2
bbb
#4
ddd
I used awk '!x[$0]++' test.txt > file.new, but it deleted #1 also. I tried using the uniq command but it didn't work.
Can anyone please let me know whether there is a way to do this using a shell script?
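A possible sketch, assuming the file strictly alternates between a "#" header line and a value line, and that a header should be dropped together with its value when the value was seen before.

```shell
tmp=$(mktemp -d)
printf '%s\n' '#1' aaa '#2' bbb '#3' aaa '#4' ddd > "$tmp/test.txt"

# Remember each header instead of printing it; print the header and the
# value together only when the value has not been seen before.
out=$(awk '/^#/ { hdr = $0; next } !seen[$0]++ { print hdr; print }' "$tmp/test.txt")
printf '%s\n' "$out"
```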
View 2 Replies
Mar 22, 2010
I have some big log files that contain errors printed by an app. The errors are usually relevant, but most of them are similar. So I figured I could check what happened within a time interval with a find.
I'm using this one:
Code:
And I get an output similar to this one.
Code:
Is there a way to condense the output lines to only one or two, indicating the first and last occurrence of a block? Or do I need to write a program to do so?
Right now I get thousands of similar lines, and when I'm scrolling through them I sometimes miss relevant information that I would otherwise have noticed if the output weren't so spammy.
View 10 Replies
Jul 25, 2011
I'm using sed to remove a certain line in a text file based on a match with 2 variables from input. Here is what the file looks like:
Philip S:Odds:45:343
Mike Junior:Odds:3:56
I prompt for 2 inputs in variable form, which are compared to the first 2 fields of the above text (colon-separated). So if I enter Philip S and Odds, it should delete the entire first line.
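A sketch that swaps sed for awk, since awk can compare whole fields as plain strings and the user input never has to be escaped as a regular expression. The variable names name and category and the file name records.txt are hypothetical.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Philip S:Odds:45:343' 'Mike Junior:Odds:3:56' > "$tmp/records.txt"

# In the real script these would come from read prompts.
name='Philip S'
category='Odds'

# Print every line except those whose first two fields equal both inputs.
out=$(awk -F: -v n="$name" -v c="$category" '!($1 == n && $2 == c)' "$tmp/records.txt")
printf '%s\n' "$out"
```

One caveat: awk interprets backslash escapes in values passed with -v, so input containing backslashes would need extra care.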
View 4 Replies
Oct 4, 2010
I'm looking for a script (bash, python, perl etc) or even a one liner (sed, awk etc) that can take a set of files and remove any line that has more than "x" instances of any character (case sensitive). I have been doing a lot of searching and can only come up with examples of how to remove blank lines, lines that start with a certain character or lines that contain a certain string. This will be used on a system running a Kubuntu derivative.
As a very poor and basic example, I would like to take files that contain lines like:
Code:
And end up with the files only containing the lines:
Code:
if I tell the script that 2 is the maximum number of times any character can appear in any line.
I know this must be possible, but for the life of me I cannot find even an example that will lead me in the right direction or better yet a piece of code I can use.
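The example lines from the post are not available here, so the following sketch uses made-up input. It counts each character per line, case-sensitively, and drops the line as soon as any count exceeds the limit passed in the hypothetical variable max.

```shell
tmp=$(mktemp -d)
printf '%s\n' hello banana Aaa aaa > "$tmp/words.txt"

out=$(awk -v max=2 '{
    split("", count)                  # portable way to empty the array
    for (i = 1; i <= length($0); i++) {
        ch = substr($0, i, 1)
        if (++count[ch] > max) next   # too many of one character: skip line
    }
    print
}' "$tmp/words.txt")
printf '%s\n' "$out"
```

With max=2, "banana" and "aaa" are removed because they contain three of the letter a, while "Aaa" survives because A and a are counted separately. Whether length and substr work on characters or bytes for non-ASCII text depends on the awk implementation and locale.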
View 15 Replies
Oct 21, 2009
I have Fedora 11 installed (32-bit) with Xfce as the desktop. When I click on the Fedora icon for the menu and select Preferences, there are 2 input methods listed even though I did not have any installed. Since there is no menu editor any more, does anybody know how to edit the menu so that I can get rid of these entries?
View 1 Replies
Dec 19, 2010
An unknown notification popup appears at the top left of the screen, in addition to the one on the panel. Does anyone have experience removing the top notification popup? This happens on my root account, which I mainly use every day; if I create a new account, the top popup does not appear.
View 2 Replies
Feb 23, 2010
I did apt-get install qtcreator and it installed qt 4.5.3(qt4.5.2real) I had qt 4.5.2. If I go in Applications->programming I see 2 shortcuts for qtcreator, one of them being newer. How do I remove the older one? On another note, if I want to update Qt to 4.6 what would be the steps if I already have qt 4.5
View 1 Replies
Dec 13, 2010
I am looking for a Linux app that can find and remove duplicate images (with different filenames if that's at all possible).
View 5 Replies
Aug 19, 2010
I have two folders, Folder abc and Folder xyz, which contain thousands of files, a few of which have the same file names. How can I remove the duplicates from Folder abc?
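A hedged sketch, assuming the two folders are directories named abc and xyz, that "duplicate" means only "same file name" and that subdirectories can be ignored. The sample files are hypothetical.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/abc" "$tmp/xyz"
touch "$tmp/abc/one.txt" "$tmp/abc/two.txt" "$tmp/xyz/two.txt" "$tmp/xyz/three.txt"

for f in "$tmp"/abc/*; do
    name=${f##*/}                     # strip the directory part
    if [ -f "$tmp/xyz/$name" ]; then
        rm -- "$f"                    # same name exists in xyz: remove from abc
    fi
done

remaining=$(ls "$tmp/abc")
printf '%s\n' "$remaining"
```

This compares names only. If two files share a name but differ in content, comparing them first (for example with cmp) before deleting would be safer.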
View 14 Replies
Feb 12, 2010
I'm looking for a program to remove songs I downloaded more than once. They're all tagged differently, and some are of varying sizes/lengths/types.
View 1 Replies
Mar 29, 2011
I would like to find a command which automatically finds and removes phrases which appear more than once in a text file. I still want to keep one of these phrases, but I only want to see one of them.
View 9 Replies