Ubuntu :: Remove Duplicate Lines From A Text File?
Dec 16, 2010
Contained within each of these 67 text files is about 1 million urls. Yes. I have 67 text files that contain 1 million lines of urls each. I am sure I am swimming in duplicates. I tried opening one text file and clicking sort ----->remove duplicates. Now Gedit is not responding my processor is maxed out to 100% and I think I am finally ready to delve into some command line code. Can anyone give me idiot proof instructions on how to sort the duplicates out of each one of these 67 text files? How about no duplicates across all 67?
View 7 Replies
ADVERTISEMENT
Apr 14, 2010
i have a big file of random numbers i generated at some point in time, after working with it with different things(how fun that was)... i want to remove duplicate lines and i'm not sure i'm doing this right
heres the command
Code:
sort random.txt | uniq -u > rand-shorter.txt
the file is pretty big, everything on a new line. i found the command on a web site so i'm sure its correct(bit of a command line in linux newbie)
can anyone confirm if this will remove lines duplicate lines (keeping one copy) and dump what is left in a file named rand-shorter.txt?
EDIT: i think its actually working, just taking a reallllly long time (on an old pen 4 from 2000)
View 8 Replies
View Related
Mar 17, 2011
Trying to remove lines from a syslog text file that have duplicate strings
Mar 10 06:51:11[http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360]
then a few lines down
Mar 10 06:52:03 [http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360
got the same thing in terms of a u: number but the issue is I need to remove duplicates and just leave one and the file has multiple duplicates of different u: numbers and it's 14,000 lines long. can anyone tell me if I can use awk? sed? or sort for something like this to? removing lines that have a certain string in there that's a duplicate.
View 4 Replies
View Related
Jul 22, 2011
I am basically trying to remove duplicate words in my <title></title> tag after I got hit by Google Panda. I have around 750 .html files and it will be difficult for to me remove one by one. I am looking for a way to remove only from within <title> </title>
Example of a duplicate title I have:
Code:
<title>Pasta, Pasta Recipe and Pasta Guide</title>
I dont want to replace those words anywhere else in the file except for within the <title>
View 14 Replies
View Related
Jan 28, 2009
I have a text file called file1.txt containing many lines eg.
line1
line2
line3
line4
line5
line6
Then i have another text file called file2.txt contains
3
5
6
Is there a command to remove the lines in file1.txt based on the keywords in file2.txt? note: It should remove line3,line5,line6 based on 3,5,6
View 10 Replies
View Related
Sep 6, 2010
I am creating my own address book Python program and I want to create a nction that removes some specified entries. The code looks like this now.
Code:
def remove():
delentry= raw_input('Enter the entry name to delete: ')
[code]...
View 1 Replies
View Related
Jul 6, 2011
anyone has ideas how to remove lone lines from a text file?
If I have a file that is like this:
-----------------------------------
line 1
[code]...
View 14 Replies
View Related
May 8, 2010
I have a file that contains lines representing the nodes of a polyline but I only need the first point in each segment. With the following text:
0,"013A",0.57,260739.891,4379258.87
0,"013A",0.57,260737.674,4379258.94
0,"013A",0.57,260684.628,4379258.35
1,"013A",0.545,260769.915,4379257.84
1,"013A",0.545,260739.891,4379258.87
[Code]....
The problem with uniq is that the last two colums will differ. I don't care about the x/y for any points following the first one.
View 4 Replies
View Related
Apr 8, 2010
I have a file with semi duplicate lines, like:
abc 12 32
agsi 82
sha 26
abc 1
iaij
agsi 3
Now I want to edit my file and make it:
abc 12 32
agsi 82
sha 26
iaij
i.e. remove second occurrence of line when 1st column is abc or agsi.
View 13 Replies
View Related
Aug 14, 2011
I have a text file with many pairs of number, one pair in each line. Each 25 of these pairs are a solution to a math problem I've been working on, and each solution is separated from another by a line with "**********".The problem is that there are duplicate solutions. In order to know exactly how many solutions I found, I have to delete the duplicate ones. How can I do that?Just to make things clear, here are the first three solutions:
1 1
3 2
5 3
[code]....
View 3 Replies
View Related
Jan 19, 2009
I need to insert 3-4 lines of text to the beginning of a text file. The file is a largish MYSQL dump, the result of a backup shell script. This shell script should insert the required text.I've wrestled with sed, but lost.
View 2 Replies
View Related
Nov 11, 2010
What I plan to do is, create a duplicate file of a text file, and then append some text into the new text file.
View 1 Replies
View Related
Dec 10, 2010
I have a text file that is filled with references to duplicate files. I'm trying to create a text file for each duplicate file found that contains the paths to the duplicates. I would also like the text file names to be based on the size and file name.
Some thing like:
231.5 KB - P&S.doc.txt
138.5 KB - LIMITED#C71.doc.txt
Code:
NamePathSizeLast ChangeLast AccessFile TypeOwnerAttributes
P&S.doc(3 Files)
P&S.docZ:Leg\_Pri_LegPurP&SBUYBarry V231.5 KB11/2/2001 4:07 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)Lou_AC
P&S.docZ:Leg\_Pri_LegP&SBUYBarry V231.5 KB11/2/2001 4:07 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
P&S.docZ:Leg\_Pri_LegPropsPurP&SBUYBarry V231.5 KB11/2/2001 4:07 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
LIMITED#C71.doc(2 Files)
LIMITED#C71.docZ:Leg\_Pri_LegPurCV138.5 KB12/15/2003 1:04 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)Lou_AC
LIMITED#C71.docZ:Leg\_Pri_LegPropsPurCV138.5 KB12/15/2003 1:04 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
ps revised.8.30.05.clean.doc(3 Files)
ps revised.8.30.05.clean.docZ:Leg\_Pri_LegPropsPurP&SSellVPSummit54.5 KB8/31/2005 11:46 AM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
ps revised.8.30.05.clean.docZ:Leg\_Pri_LegP&SSellVPSummit54.5 KB8/31/2005 11:46 AM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
ps revised.8.30.05.clean.docZ:Leg\_Pri_LegPurP&SSellVPSummit54.5 KB8/31/2005 11:46 AM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)Lou_AC
Copy of 08 Lee All July Billing.xls(2 Files)
Copy of 08 Lee All July Billing.xlsZ:IS\_Sh_ISDevDocDocl 26 upgradeAS6 backup codeAPImport131.5 KB7/30/2010 12:11 PM11/22/2010 2:38 AM.xls (Microsoft Office Excel 97-2003 Worksheet)AdministratorsC
Copy of 08 Lee All July Billing.xlsZ:APKellie131.5 KB7/30/2010 10:03 AM11/22/2010 2:38 AM.xls (Microsoft Office Excel 97-2003 Worksheet)KellieC
View 5 Replies
View Related
Dec 6, 2010
I am looking for a way to keep a log and make if then statements if a line exitsts in the log. I also am looking for a way to make a simple loop, like goto line number, and I also am wondering how to add/remove bits of text from a text file (plugins line in server.properties)
View 5 Replies
View Related
Jul 27, 2011
I have a few rather large text files, and I need a way to look at the first three lines of each. Is there a way to do this using awk?
View 3 Replies
View Related
Aug 21, 2010
I have a plain text file with 360 lines of varying length text. How do I add a comma or other symbol to the end of each line so that I can convert the file to csv format that I can open in a spreadsheet (45 rows, 8 columns). That means each 8 lines of text forms 8 columns, with 45 rows.
View 9 Replies
View Related
Jun 17, 2009
I have a list of words that I want to grep in many files to see which ones have it and which ones dont. in the text file I have all the words listed line by line, ex: list.txt:
check
try this
word1
word2
open space
list ..
I want to grep each line one by one. like I want it to
grep "check" *.log
grep "try this" *.log
grep "word1" *.log .. etc how can I do this?
and maybe write the output to a file.
View 5 Replies
View Related
Feb 3, 2011
I have created a text file in Linux, and I only want to show certain users. Here is my text file:
usr user tty Limbo?
11 12:06:13 APW no
12 12:06:13 APW no
[code]...
View 12 Replies
View Related
Feb 17, 2011
how can I set the cat command to read specified lines of a text file,like if I have a text file with 100 lines, who can I say cat only line 23 to 42?
View 3 Replies
View Related
Jul 7, 2011
I need to chop of the top 30ish lines of several log files until a line starting with "Initialization completed."The trouble is that it's not always the same amount of lines that need to be deleted, and they don't always contain the same information, which is why I would need to delete everything priorhe line starting with "Initialization completed."Right now I have a little script I wrote based on looping each file through several "grep -v" commands with each known pattern of lines I want to ignore, but it is tedious and I have to inspect each file afterwards to make sure nothing is left from above "Initialization completed
View 3 Replies
View Related
Mar 11, 2011
For example, I have a text file with data which lists numerical values from two separate individuals
Code:
Person A
100
[code]...
View 1 Replies
View Related
Dec 17, 2010
I need to create a script to count the number of lines from a text file . The output must be put on another text file (no_lines.txt) and in this file i need to generate from the script this output :"File $FILE has $NO_LINES lines ".
View 3 Replies
View Related
Mar 11, 2011
For example, I have a text file with data which lists numerical values from two separate individuals
Code:
Person A
100
200
300
400
500
600
700
800
900
1000
1100
1200
Person B
1200
1100
1000
900
800
700
600
500
400
300
200
100
How would I go about reading the values for each Person, then being able to perform mathematical equations for each Person (finding the sum for example)?
View 13 Replies
View Related
Aug 6, 2010
When i want to remove particular lines containing a specific word in from entire document at a time,i am using the following command.
awk '$columnno !~/specificword/' inputfile > outputfile
But here, coulmn no is my problem, because iam having this in different columns. So i need a solution for it.
How to write such removal command without mentioning column no. , ie irrespective of column no, it has to remove all lines having that specific word.
View 10 Replies
View Related
Jan 21, 2011
I'm trying to come up with ideas for a simple way to strip a specific "entry" from a text file.I know tools like sed and perl can remove specific lines from a file but I haven't been able to come up with an elegant way to do my group of lines.In my file, the first "Location" line and the "SVNPath" line should be unique every time... but are they enough to strip out the whole set of the group plus the trailing one line of white space separating each group? Add to this, my file will grow as new entries are added (always appended to the end) but new entries will have the same formatting.
View 9 Replies
View Related
Nov 5, 2010
Using sed, I am trying to append four commas ',,,,' at the end of lines containing the pattern 'Response' in a text file with lines such as these:
6,Pulse,50,254968,14886,NA,,,,
7,Picture,8,265157,0,1,15045,2,0,15000
7,Response,1,271553,6396,1
7,Pulse,50,274969,9812,NA,,,,
8,Picture,1,290232,0,1,15045,2,0,15000
8,Pulse,50,294969,4737,NA,,,,
[Code].....
View 1 Replies
View Related
Jun 16, 2010
Is there any commands or scripts to remove only selected line in the history file.
View 1 Replies
View Related
Jun 21, 2011
I a csv-file (A.csv) with a total of 4.600.000 lines. Thats to many and only a few is necessary. I have a txt-file with 150 lines (X.txt) (all lines is dataset from a mainframe and looks like abc.def.123.456. How do I remove lines from A.csv where none of the dataset from x.txt is present?
View 13 Replies
View Related
Apr 1, 2009
Was wondering if any perl guru's could help me with a quick log file adjustment. I have a text file that looks like so (tabs and newlines are revealed so you can see what separates the data):
There are maybe 100 lines of text in this file at any given time. I need to delete all duplicate lines only looking at the first bit of text prior to the first tab. It doesn't matter which one gets deleted as long as there are no two lines that begin with that same text at the beginning before the first tab. So in this example, either the fist line "1234" or the last line "1234" would need to be deleted. I already have code in my script that opens the files - I just need the code to read the text into an array and the part that would find matches based on the above criteria, and make the deletions.
If it would be easier, I can even do a system call and use SED (v4.1.5) and/or AWK (3.1.5) instead.
View 7 Replies
View Related
Feb 23, 2010
This should be simple but I can't seem to find what I am looking for.I want to search a text file for the existence of certain strings and execute a command if they exist, something along the lines of:
if <string> exists
command
or
if <any member of this list exists>
command
I know how to manually search a file with grep, cat, etc., but the "if this exists" part eludes me.
View 7 Replies
View Related