Ubuntu :: Remove Duplicate Lines From A Text File?

Dec 16, 2010

Each of these 67 text files contains about 1 million lines of URLs. I am sure I am swimming in duplicates. I tried opening one text file in Gedit and clicking Sort -> Remove duplicates. Now Gedit is not responding, my processor is maxed out at 100%, and I think I am finally ready to delve into the command line. Can anyone give me idiot-proof instructions for removing the duplicates from each of these 67 text files? How about no duplicates across all 67?
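One possible approach, as a rough sketch: `sort -u` sorts a file and keeps a single copy of each line, and it spills to temporary files, so a million lines should not be a problem. The function names and the `urls-*.txt` file pattern below are made up for illustration; adjust them to the real file names.

```shell
#!/bin/sh
# Hypothetical helpers; names and file patterns are assumptions.

# Deduplicate one file in place (sort -o may safely name the input file).
dedupe_file() {
    sort -u -o "$1" "$1"
}

# Write one merged list with no duplicates across all the given files.
# $1 = output file, remaining arguments = input files.
dedupe_across() {
    out=$1
    shift
    sort -u "$@" > "$out"
}

# Usage (uncomment to run against real files):
# for f in urls-*.txt; do dedupe_file "$f"; done
# dedupe_across all-unique.txt urls-*.txt
```

Note that the result is sorted, so the original line order is lost. The output file of `dedupe_across` should not match the input pattern.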

View 7 Replies



Ubuntu :: Remove Duplicate Lines In Plain Text File?

Apr 14, 2010

I have a big file of random numbers that I generated at some point, and after working on it with various tools (how fun that was) I want to remove the duplicate lines. I'm not sure I'm doing this right.

Here's the command:

Code:
sort random.txt | uniq -u > rand-shorter.txt

The file is pretty big, with everything on its own line. I found the command on a website, so I assume it's correct (I'm a bit of a Linux command-line newbie).

Can anyone confirm that this will remove duplicate lines (keeping one copy) and dump what is left into a file named rand-shorter.txt?

EDIT: I think it's actually working, just taking a really long time (on an old Pentium 4 from 2000).
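A note on the command itself: `uniq -u` prints only the lines that are not repeated, so every copy of a duplicated line is dropped rather than one copy being kept. To keep one copy of each line, `sort -u` (or `sort | uniq` without `-u`) is the usual choice. The two function names in this small sketch are made up for the comparison.

```shell
#!/bin/sh
# Hypothetical names, for comparison only.
only_unrepeated() { sort | uniq -u; }   # what the posted command does
keep_one()        { sort -u; }          # keeps one copy of every line

printf '7\n3\n7\n9\n' | only_unrepeated   # prints 3 and 9; both 7s are gone
printf '7\n3\n7\n9\n' | keep_one          # prints 3, 7 and 9
```

For the file in the question that would be `sort -u random.txt > rand-shorter.txt`.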

View 8 Replies View Related

General :: Remove Lines From A Syslog Text File That Have Duplicate Strings

Mar 17, 2011

Trying to remove lines from a syslog text file that have duplicate strings

Mar 10 06:51:11[http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360]

then a few lines down

Mar 10 06:52:03 [http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360

Both lines have the same u: number. I need to remove the duplicates and leave just one line per u: number. The file has multiple duplicates for different u: numbers and is 14,000 lines long. Can anyone tell me whether I can use awk, sed, or sort for this, i.e. to remove lines containing a string that duplicates one on an earlier line?
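A minimal awk sketch of one way to do this, assuming the key is the `[u:<digits>` part of each line as in the two sample lines. The function name `dedupe_by_user` is hypothetical. It keeps the first line seen for each u: number and passes through any line that has no u: number at all.

```shell
#!/bin/sh
# Hypothetical helper: reads the log on stdin, writes the filtered log to stdout.
dedupe_by_user() {
    awk '
        match($0, /\[u:[0-9]+/) {
            key = substr($0, RSTART, RLENGTH)   # e.g. "[u:2533274802474744"
            if (seen[key]++) next               # a line for this key was already printed
        }
        { print }                               # first occurrence, or no u: number
    '
}

# Usage (file names are assumptions):
# dedupe_by_user < syslog.txt > syslog-deduped.txt
```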

View 4 Replies View Related

General :: Remove Duplicate Words Within A Particular Text In A File?

Jul 22, 2011

I am basically trying to remove duplicate words in my <title></title> tag after I got hit by Google Panda. I have around 750 .html files and it would be difficult for me to fix them one by one. I am looking for a way to remove the duplicates only from within <title> </title>.

Example of a duplicate title I have:

Code:

<title>Pasta, Pasta Recipe and Pasta Guide</title>

I don't want to replace those words anywhere else in the file, only within the <title>.

View 14 Replies View Related

Programming :: Remove Lines In A Text File Based On Another Text File?

Jan 28, 2009

I have a text file called file1.txt containing many lines eg.

line1
line2
line3
line4
line5
line6

Then i have another text file called file2.txt contains

3
5
6

Is there a command to remove the lines in file1.txt based on the keywords in file2.txt? note: It should remove line3,line5,line6 based on 3,5,6
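One way to sketch this with awk, assuming every entry in file1.txt is literally the word `line` followed by a number from file2.txt, as in the example. The function name is hypothetical. Exact matching is used so that `3` does not also remove something like `line13`.

```shell
#!/bin/sh
# Hypothetical helper.
# $1 = file listing the numbers (file2.txt), $2 = file to filter (file1.txt)
remove_listed() {
    awk '
        NR == FNR { drop["line" $1]; next }   # first file: remember "line3", "line5", ...
        !($0 in drop)                         # second file: print lines not in the list
    ' "$1" "$2"
}

# Usage:
# remove_listed file2.txt file1.txt > file1-filtered.txt
```

If the keywords should instead match anywhere inside a line, `grep -v -F -f file2.txt file1.txt` is a simpler alternative, at the cost of also removing partial matches. The awk version assumes file2.txt is not empty.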

View 10 Replies View Related

Ubuntu :: Python : Remove Some Lines From A Text File?

Sep 6, 2010

I am creating my own address book program in Python and I want to create a function that removes specified entries. The code looks like this now.

Code:
def remove():
    delentry = raw_input('Enter the entry name to delete: ')

[code]...

View 1 Replies View Related

General :: Remove Lone Lines From A Text File?

Jul 6, 2011

Does anyone have ideas on how to remove lone lines from a text file?

If I have a file that is like this:
-----------------------------------
line 1

[code]...

View 14 Replies View Related

General :: Using Sed To Remove Lines With Duplicate ID's?

May 8, 2010

I have a file that contains lines representing the nodes of a polyline but I only need the first point in each segment. With the following text:

0,"013A",0.57,260739.891,4379258.87
0,"013A",0.57,260737.674,4379258.94
0,"013A",0.57,260684.628,4379258.35
1,"013A",0.545,260769.915,4379257.84
1,"013A",0.545,260739.891,4379258.87

[Code]....

The problem with uniq is that the last two columns differ. I don't care about the x/y for any points following the first one.
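The thread asks about sed, but awk is arguably a better fit here, so this sketch swaps it in. It assumes the segment ID is the first comma-separated field and prints only the first line seen for each ID. The function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: keep the first line for each value of field 1.
first_per_id() {
    awk -F, '!seen[$1]++'
}

# Usage (file names are assumptions):
# first_per_id < nodes.csv > first-points.csv
```

The expression `!seen[$1]++` is true only the first time a given ID appears, so later lines with the same ID are skipped no matter what the x/y columns contain.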

View 4 Replies View Related

Programming :: Remove Duplicate Lines From Shell Script

Apr 8, 2010

I have a file with semi duplicate lines, like:
abc 12 32
agsi 82
sha 26
abc 1
iaij
agsi 3

Now I want to edit my file and make it:
abc 12 32
agsi 82
sha 26
iaij
i.e. remove second occurrence of line when 1st column is abc or agsi.

View 13 Replies View Related

Debian :: Remove Duplicate "paragraphs" In Text File?

Aug 14, 2011

I have a text file with many pairs of numbers, one pair on each line. Every 25 of these pairs form a solution to a math problem I've been working on, and each solution is separated from the next by a line with "**********". The problem is that there are duplicate solutions. In order to know exactly how many solutions I found, I have to delete the duplicate ones. How can I do that? Just to make things clear, here are the first three solutions:

1 1
3 2
5 3

[code]....
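A possible awk sketch, assuming each solution is a block of lines terminated by a line made up only of asterisks, and that two solutions count as duplicates only when their lines are identical and in the same order. The function name `unique_blocks` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads the solutions on stdin, prints each distinct block once.
unique_blocks() {
    awk '
        function flush() {
            if (block != "" && !(block in seen)) {
                seen[block] = 1
                printf "%s**********\n", block
            }
            block = ""
        }
        /^\*+$/ { flush(); next }        # separator line closes the current block
        { block = block $0 "\n" }        # collect the lines of the current block
        END { flush() }                  # last block may have no trailing separator
    '
}

# Usage (file name is an assumption); the second command counts the solutions left:
# unique_blocks < solutions.txt > unique-solutions.txt
# grep -c '^\*' unique-solutions.txt
```

If the same solution can appear with its pairs in a different order, the blocks would need to be sorted internally first; this sketch does not attempt that.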

View 3 Replies View Related

Programming :: Adding Lines Of Text To Beginning Of Text File

Jan 19, 2009

I need to insert 3-4 lines of text at the beginning of a text file. The file is a largish MySQL dump, the result of a backup shell script. This shell script should insert the required text. I've wrestled with sed, but lost.
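Instead of sed, one simple option is to concatenate the header lines and the dump into a temporary file and copy that back. This is only a sketch; the function name and the file names in the usage line are assumptions.

```shell
#!/bin/sh
# Hypothetical helper.
# $1 = file holding the lines to insert, $2 = file to prepend them to
prepend_lines() {
    tmp=$(mktemp) || return 1
    # Build header + original in a temp file, then overwrite the original's contents
    # (keeps the original file's permissions and ownership).
    cat "$1" "$2" > "$tmp" && cat "$tmp" > "$2"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage:
# prepend_lines header.sql backup-dump.sql
```

This needs enough free space in the temporary directory for a second copy of the dump.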

View 2 Replies View Related

Programming :: Create A Duplicate File Of A Text File In PHP?

Nov 11, 2010

What I plan to do is, create a duplicate file of a text file, and then append some text into the new text file.

View 1 Replies View Related

Programming :: Splitting Text File Into Each Duplicate

Dec 10, 2010

I have a text file that is filled with references to duplicate files. I'm trying to create a text file for each duplicate file found that contains the paths to the duplicates. I would also like the text file names to be based on the size and file name.

Some thing like:
231.5 KB - P&S.doc.txt
138.5 KB - LIMITED#C71.doc.txt

Code:
NamePathSizeLast ChangeLast AccessFile TypeOwnerAttributes
P&S.doc(3 Files)
P&S.docZ:Leg\_Pri_LegPurP&SBUYBarry V231.5 KB11/2/2001 4:07 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)Lou_AC
P&S.docZ:Leg\_Pri_LegP&SBUYBarry V231.5 KB11/2/2001 4:07 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
P&S.docZ:Leg\_Pri_LegPropsPurP&SBUYBarry V231.5 KB11/2/2001 4:07 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
LIMITED#C71.doc(2 Files)
LIMITED#C71.docZ:Leg\_Pri_LegPurCV138.5 KB12/15/2003 1:04 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)Lou_AC
LIMITED#C71.docZ:Leg\_Pri_LegPropsPurCV138.5 KB12/15/2003 1:04 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
ps revised.8.30.05.clean.doc(3 Files)
ps revised.8.30.05.clean.docZ:Leg\_Pri_LegPropsPurP&SSellVPSummit54.5 KB8/31/2005 11:46 AM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
ps revised.8.30.05.clean.docZ:Leg\_Pri_LegP&SSellVPSummit54.5 KB8/31/2005 11:46 AM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
ps revised.8.30.05.clean.docZ:Leg\_Pri_LegPurP&SSellVPSummit54.5 KB8/31/2005 11:46 AM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)Lou_AC
Copy of 08 Lee All July Billing.xls(2 Files)
Copy of 08 Lee All July Billing.xlsZ:IS\_Sh_ISDevDocDocl 26 upgradeAS6 backup codeAPImport131.5 KB7/30/2010 12:11 PM11/22/2010 2:38 AM.xls (Microsoft Office Excel 97-2003 Worksheet)AdministratorsC
Copy of 08 Lee All July Billing.xlsZ:APKellie131.5 KB7/30/2010 10:03 AM11/22/2010 2:38 AM.xls (Microsoft Office Excel 97-2003 Worksheet)KellieC

View 5 Replies View Related

Ubuntu :: Add / Remove Bits Of Text From A Text File

Dec 6, 2010

I am looking for a way to keep a log and write if/then statements that test whether a line exists in the log. I am also looking for a way to make a simple loop (like a goto to a line number), and I am wondering how to add or remove bits of text in a text file (the plugins line in server.properties).

View 5 Replies View Related

General :: Use AWK To Print Out First Few Lines Of A Text File?

Jul 27, 2011

I have a few rather large text files, and I need a way to look at the first three lines of each. Is there a way to do this using awk?
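Yes, awk can do this; the sketch below stops reading each file once it is past line 3, so large files are not read in full. The function name and the header format are made up. Without awk, `head -n 3 file` gives the same lines.

```shell
#!/bin/sh
# Hypothetical helper: print a small header and the first three lines of each file.
first_three() {
    for f in "$@"; do
        printf '==> %s <==\n' "$f"
        awk 'NR > 3 { exit } { print }' "$f"
    done
}

# Usage:
# first_three big1.txt big2.txt
```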

View 3 Replies View Related

Programming :: Add Comma To End Of Lines In Text File?

Aug 21, 2010

I have a plain text file with 360 lines of varying length text. How do I add a comma or other symbol to the end of each line so that I can convert the file to csv format that I can open in a spreadsheet (45 rows, 8 columns). That means each 8 lines of text forms 8 columns, with 45 rows.
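Rather than adding a comma to every line, `paste` can join each group of 8 input lines into one comma-separated row directly. This is a sketch with a hypothetical function name; it assumes the text itself contains no commas or double quotes, which would need CSV quoting.

```shell
#!/bin/sh
# Hypothetical helper: every 8 lines of stdin become one comma-separated row.
to_csv_rows() {
    paste -d, - - - - - - - -
}

# Usage (file names are assumptions): 360 lines in, 45 rows of 8 columns out.
# to_csv_rows < input.txt > output.csv
```

Each `-` tells `paste` to take the next line from standard input, so eight of them consume eight lines per output row.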

View 9 Replies View Related

General :: Grep Multiple Lines From A Text File

Jun 17, 2009

I have a list of words that I want to grep for in many files, to see which files contain them and which don't. The words are listed line by line in a text file, e.g. list.txt:

check
try this
word1
word2
open space
list ..

I want to grep each line one by one, like this:

grep "check" *.log
grep "try this" *.log
grep "word1" *.log

and so on. How can I do this, and maybe write the output to a file?
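A sketch of a loop that reads list.txt line by line and reports which files contain each phrase. The function name is hypothetical. `-F` treats each phrase as a fixed string, so spaces and dots are matched literally, and `-l` prints only the names of matching files.

```shell
#!/bin/sh
# Hypothetical helper.
# $1 = file with one phrase per line, remaining arguments = files to search
grep_each() {
    list=$1
    shift
    while IFS= read -r phrase; do
        [ -n "$phrase" ] || continue          # skip empty lines in the list
        printf '== %s ==\n' "$phrase"
        grep -l -F -e "$phrase" -- "$@" </dev/null || printf '(no matches)\n'
    done < "$list"
}

# Usage, writing the report to a file:
# grep_each list.txt *.log > report.txt
```

If only the combined matches are needed, `grep -F -f list.txt *.log` searches for all phrases in a single pass.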

View 5 Replies View Related

General :: Show Specific Lines In A Text File?

Feb 3, 2011

I have created a text file in Linux, and I only want to show certain users. Here is my text file:

usr user tty Limbo?
11 12:06:13 APW no
12 12:06:13 APW no

[code]...

View 12 Replies View Related

Server :: Set The Cat Command To Read Specified Lines Of A Text File?

Feb 17, 2011

How can I get the cat command to read only specified lines of a text file? For example, if I have a text file with 100 lines, how can I cat only lines 23 to 42?
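`cat` itself has no option for selecting lines, so the sketch below swaps in `sed`. The function name is made up. `-n` suppresses normal output, `p` prints the lines in the range, and `q` quits after the last wanted line so the rest of the file is not read.

```shell
#!/bin/sh
# Hypothetical helper.
# $1 = first line, $2 = last line, $3 = file
print_range() {
    sed -n "${1},${2}p;${2}q" "$3"
}

# Usage:
# print_range 23 42 file.txt
```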

View 3 Replies View Related

Software :: Delete Top Lines Of A Text File Until A Word Is Met?

Jul 7, 2011

I need to chop off the top 30 or so lines of several log files, up to a line starting with "Initialization completed." The trouble is that the number of lines to delete is not always the same, and they don't always contain the same information, which is why I need to delete everything prior to the line starting with "Initialization completed." Right now I have a little script that loops each file through several "grep -v" commands, one for each known pattern of lines I want to ignore, but it is tedious and I have to inspect each file afterwards to make sure nothing is left from above "Initialization completed."
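A sed range can do this without knowing how many lines precede the marker. This sketch assumes the marker line itself should be kept; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print from the first line starting with
# "Initialization completed." through the end of the file.
# $1 = log file
from_marker() {
    sed -n '/^Initialization completed\./,$p' "$1"
}

# Usage (file names are assumptions):
# from_marker server.log > server-trimmed.log
```

If the marker never appears, nothing is printed, so it is worth checking that the output is not empty before replacing the original file.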

View 3 Replies View Related

Programming :: Read Multiple Lines From A Text File?

Mar 11, 2011

For example, I have a text file with data which lists numerical values from two separate individuals

Code:
Person A
100

[code]...

View 1 Replies View Related

Ubuntu :: Create A Script To Count The Number Of Lines From A Text File?

Dec 17, 2010

I need to create a script to count the number of lines from a text file . The output must be put on another text file (no_lines.txt) and in this file i need to generate from the script this output :"File $FILE has $NO_LINES lines ".
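A minimal sketch using `wc -l`. The function name is made up. Reading the file through `<` keeps `wc` from printing the file name, and the arithmetic expansion strips the leading spaces some `wc` versions add.

```shell
#!/bin/sh
# Hypothetical helper.
# $1 = file to count
count_lines() {
    n=$(( $(wc -l < "$1") ))
    printf 'File %s has %d lines\n' "$1" "$n"
}

# Usage, writing the message to no_lines.txt (input name is an assumption):
# count_lines myfile.txt > no_lines.txt
```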

View 3 Replies View Related

Programming :: (BASH) How To Read Multiple Lines From Text File

Mar 11, 2011

For example, I have a text file with data which lists numerical values from two separate individuals

Code:
Person A
100
200
300
400
500
600
700
800
900
1000
1100
1200

Person B
1200
1100
1000
900
800
700
600
500
400
300
200
100

How would I go about reading the values for each Person, then being able to perform mathematical equations for each Person (finding the sum for example)?
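Pure bash can do this with a `while read` loop, but awk keeps it shorter, so this sketch uses awk instead. It assumes each section starts with a line beginning with `Person ` and that the values are whole numbers on their own lines. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads the data on stdin, prints one total per person.
sum_per_person() {
    awk '
        /^Person /  { name = $0; order[++n] = name; next }   # start of a new section
        /^[0-9]+$/  { total[name] += $1 }                    # add value to current person
        END {
            for (i = 1; i <= n; i++)
                print order[i] ": " total[order[i]]
        }
    '
}

# Usage (file name is an assumption):
# sum_per_person < data.txt
```

Other calculations such as an average could be added in the `END` block by also keeping a per-person count.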

View 13 Replies View Related

General :: Remove All Lines In A File Containing Sameword?

Aug 6, 2010

When I want to remove all lines containing a specific word from an entire document at once, I use the following command:

awk '$columnno !~/specificword/' inputfile > outputfile

But the column number is my problem, because the word appears in different columns, so I need a solution for that.

How do I write such a removal command without specifying a column number, so that it removes every line containing that word regardless of column?
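Matching against the whole line (`$0`) instead of one column avoids the column number entirely. The sketch below wraps that in a hypothetical function and uses `index()` so the word is treated as a literal string rather than a regular expression.

```shell
#!/bin/sh
# Hypothetical helper: drop every line of stdin that contains the given text anywhere.
# $1 = text to look for (a literal string; avoid backslashes, which awk -v interprets)
drop_lines_with() {
    awk -v word="$1" 'index($0, word) == 0'
}

# Usage:
# drop_lines_with specificword < inputfile > outputfile
```

The regex form `awk '!/specificword/' inputfile > outputfile` and `grep -v 'specificword' inputfile > outputfile` behave the same way for a plain word.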

View 10 Replies View Related

Programming :: Remove Specific Lines From File

Jan 21, 2011

I'm trying to come up with ideas for a simple way to strip a specific "entry" from a text file. I know tools like sed and perl can remove specific lines from a file, but I haven't been able to come up with an elegant way to do my group of lines. In my file, the first "Location" line and the "SVNPath" line should be unique every time... but are they enough to strip out the whole group plus the one trailing blank line separating each group? Add to this, my file will grow as new entries are added (always appended to the end), but new entries will have the same formatting.

View 9 Replies View Related

General :: Sed - Append Four Commas ',,,,' At The End Of Lines Containing The Pattern 'Response' In A Text File

Nov 5, 2010

Using sed, I am trying to append four commas ',,,,' at the end of lines containing the pattern 'Response' in a text file with lines such as these:

6,Pulse,50,254968,14886,NA,,,,
7,Picture,8,265157,0,1,15045,2,0,15000
7,Response,1,271553,6396,1
7,Pulse,50,274969,9812,NA,,,,
8,Picture,1,290232,0,1,15045,2,0,15000
8,Pulse,50,294969,4737,NA,,,,
[Code].....
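In sed, an address restricts a substitution to matching lines, and `$` in the pattern matches the end of the line. A minimal sketch, with a made-up function name:

```shell
#!/bin/sh
# Hypothetical helper: append ",,,," to every line containing "Response".
# Reads the named files, or stdin when no file is given.
pad_response() {
    sed '/Response/s/$/,,,,/' "$@"
}

# Usage (file names are assumptions):
# pad_response data.csv > data-padded.csv
```

Lines without `Response`, such as the Pulse and Picture lines, pass through unchanged.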

View 1 Replies View Related

General :: Commands To Remove Particular Few Lines In History File?

Jun 16, 2010

Is there any commands or scripts to remove only selected line in the history file.

View 1 Replies View Related

Programming :: Remove Many Lines Based On Content Of Other File

Jun 21, 2011

I have a CSV file (A.csv) with a total of 4,600,000 lines. That's too many, and only a few are needed. I also have a text file (X.txt) with 150 lines; each line is a dataset name from a mainframe and looks like abc.def.123.456. How do I remove the lines from A.csv in which none of the dataset names from X.txt is present?
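Removing the lines that match none of the names is the same as keeping the lines that match at least one, which `grep -f` does in a single pass. In this sketch the function name is hypothetical; `-F` makes the dots in names like abc.def.123.456 literal instead of regex wildcards.

```shell
#!/bin/sh
# Hypothetical helper.
# $1 = file with one dataset name per line (X.txt), $2 = file to filter (A.csv)
keep_matching() {
    grep -F -f "$1" -- "$2"
}

# Usage:
# keep_matching X.txt A.csv > A-filtered.csv
```

Two things worth checking first: an empty line in X.txt matches every line of A.csv, and Windows line endings in X.txt would stop anything from matching.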

View 13 Replies View Related

Programming :: Perl - Delete Line From Text File With Duplicate Match At Beginning Of Line

Apr 1, 2009

Was wondering if any perl guru's could help me with a quick log file adjustment. I have a text file that looks like so (tabs and newlines are revealed so you can see what separates the data):

There are maybe 100 lines of text in this file at any given time. I need to delete all duplicate lines, looking only at the first bit of text before the first tab. It doesn't matter which one gets deleted, as long as no two lines begin with the same text before the first tab. So in this example, either the first line "1234" or the last line "1234" would need to be deleted. I already have code in my script that opens the files; I just need the code to read the text into an array, find matches based on the above criteria, and make the deletions.

If it would be easier, I can even do a system call and use SED (v4.1.5) and/or AWK (3.1.5) instead.

View 7 Replies View Related

CentOS 5 :: Search A Text File For The Existence Of Certain Strings And Execute A Command If They Exist, Something Along The Lines?

Feb 23, 2010

This should be simple but I can't seem to find what I am looking for. I want to search a text file for the existence of certain strings and execute a command if they exist, something along the lines of:

if <string> exists
command
or

if <any member of this list exists>
command

I know how to manually search a file with grep, cat, etc., but the "if this exists" part eludes me.
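The missing piece is that `if` tests a command's exit status, and `grep -q` exits with success when it finds a match while printing nothing. A sketch with two hypothetical helper functions, one for a single string and one for a list of strings kept in a file:

```shell
#!/bin/sh
# Hypothetical helpers.

# $1 = file to search, $2 = string; remaining arguments = command to run on a match
run_if_found() {
    file=$1
    string=$2
    shift 2
    if grep -q -F -e "$string" -- "$file"; then
        "$@"
    fi
}

# $1 = file to search, $2 = file with one string per line; rest = command to run
run_if_any_found() {
    file=$1
    list=$2
    shift 2
    if grep -q -F -f "$list" -- "$file"; then
        "$@"
    fi
}

# Usage (file names and the command are assumptions):
# run_if_found /var/log/app.log 'disk full' echo 'found it'
# run_if_any_found /var/log/app.log strings.txt echo 'found one of them'
```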

View 7 Replies View Related







Copyrights 2005-15 www.BigResource.com, All rights reserved