Debian :: Remove Duplicate "paragraphs" In Text File?
Aug 14, 2011
I have a text file with many pairs of number, one pair in each line. Each 25 of these pairs are a solution to a math problem I've been working on, and each solution is separated from another by a line with "**********".The problem is that there are duplicate solutions. In order to know exactly how many solutions I found, I have to delete the duplicate ones. How can I do that?Just to make things clear, here are the first three solutions:
1 1
3 2
5 3
[code]....
View 3 Replies
ADVERTISEMENT
Dec 16, 2010
Contained within each of these 67 text files is about 1 million urls. Yes. I have 67 text files that contain 1 million lines of urls each. I am sure I am swimming in duplicates. I tried opening one text file and clicking sort ----->remove duplicates. Now Gedit is not responding my processor is maxed out to 100% and I think I am finally ready to delve into some command line code. Can anyone give me idiot proof instructions on how to sort the duplicates out of each one of these 67 text files? How about no duplicates across all 67?
View 7 Replies
View Related
Jul 22, 2011
I am basically trying to remove duplicate words in my <title></title> tag after I got hit by Google Panda. I have around 750 .html files and it will be difficult for to me remove one by one. I am looking for a way to remove only from within <title> </title>
Example of a duplicate title I have:
Code:
<title>Pasta, Pasta Recipe and Pasta Guide</title>
I dont want to replace those words anywhere else in the file except for within the <title>
View 14 Replies
View Related
Apr 14, 2010
i have a big file of random numbers i generated at some point in time, after working with it with different things(how fun that was)... i want to remove duplicate lines and i'm not sure i'm doing this right
heres the command
Code:
sort random.txt | uniq -u > rand-shorter.txt
the file is pretty big, everything on a new line. i found the command on a web site so i'm sure its correct(bit of a command line in linux newbie)
can anyone confirm if this will remove lines duplicate lines (keeping one copy) and dump what is left in a file named rand-shorter.txt?
EDIT: i think its actually working, just taking a reallllly long time (on an old pen 4 from 2000)
View 8 Replies
View Related
Mar 17, 2011
Trying to remove lines from a syslog text file that have duplicate strings
Mar 10 06:51:11[http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360]
then a few lines down
Mar 10 06:52:03 [http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360
got the same thing in terms of a u: number but the issue is I need to remove duplicates and just leave one and the file has multiple duplicates of different u: numbers and it's 14,000 lines long. can anyone tell me if I can use awk? sed? or sort for something like this to? removing lines that have a certain string in there that's a duplicate.
View 4 Replies
View Related
Nov 11, 2010
What I plan to do is, create a duplicate file of a text file, and then append some text into the new text file.
View 1 Replies
View Related
Dec 10, 2010
I have a text file that is filled with references to duplicate files. I'm trying to create a text file for each duplicate file found that contains the paths to the duplicates. I would also like the text file names to be based on the size and file name.
Some thing like:
231.5 KB - P&S.doc.txt
138.5 KB - LIMITED#C71.doc.txt
Code:
NamePathSizeLast ChangeLast AccessFile TypeOwnerAttributes
P&S.doc(3 Files)
P&S.docZ:Leg\_Pri_LegPurP&SBUYBarry V231.5 KB11/2/2001 4:07 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)Lou_AC
P&S.docZ:Leg\_Pri_LegP&SBUYBarry V231.5 KB11/2/2001 4:07 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
P&S.docZ:Leg\_Pri_LegPropsPurP&SBUYBarry V231.5 KB11/2/2001 4:07 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
LIMITED#C71.doc(2 Files)
LIMITED#C71.docZ:Leg\_Pri_LegPurCV138.5 KB12/15/2003 1:04 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)Lou_AC
LIMITED#C71.docZ:Leg\_Pri_LegPropsPurCV138.5 KB12/15/2003 1:04 PM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
ps revised.8.30.05.clean.doc(3 Files)
ps revised.8.30.05.clean.docZ:Leg\_Pri_LegPropsPurP&SSellVPSummit54.5 KB8/31/2005 11:46 AM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
ps revised.8.30.05.clean.docZ:Leg\_Pri_LegP&SSellVPSummit54.5 KB8/31/2005 11:46 AM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)DMsC
ps revised.8.30.05.clean.docZ:Leg\_Pri_LegPurP&SSellVPSummit54.5 KB8/31/2005 11:46 AM11/22/2010 2:38 AM.doc (Microsoft Office Word 97 - 2003 Document)Lou_AC
Copy of 08 Lee All July Billing.xls(2 Files)
Copy of 08 Lee All July Billing.xlsZ:IS\_Sh_ISDevDocDocl 26 upgradeAS6 backup codeAPImport131.5 KB7/30/2010 12:11 PM11/22/2010 2:38 AM.xls (Microsoft Office Excel 97-2003 Worksheet)AdministratorsC
Copy of 08 Lee All July Billing.xlsZ:APKellie131.5 KB7/30/2010 10:03 AM11/22/2010 2:38 AM.xls (Microsoft Office Excel 97-2003 Worksheet)KellieC
View 5 Replies
View Related
Jan 28, 2009
I have a text file called file1.txt containing many lines eg.
line1
line2
line3
line4
line5
line6
Then i have another text file called file2.txt contains
3
5
6
Is there a command to remove the lines in file1.txt based on the keywords in file2.txt? note: It should remove line3,line5,line6 based on 3,5,6
View 10 Replies
View Related
Mar 28, 2010
I just wrote an html file, but somewhere on the write I lost count of the paragraphs.
In an html file paragaphs starts with <p> and ends with </p>
What I want to know is how to find the valid paragraph.
I used grep '<p>' file_name | grep -wc '</p>' . I dont know if this works.
View 7 Replies
View Related
Dec 6, 2010
I am looking for a way to keep a log and make if then statements if a line exitsts in the log. I also am looking for a way to make a simple loop, like goto line number, and I also am wondering how to add/remove bits of text from a text file (plugins line in server.properties)
View 5 Replies
View Related
Feb 8, 2011
I have just upped from lenny to squeeze. I didn't mean to, really, but the package manager was well into its stride by the time I realised what was happening. Mostly all went well, BUT /usr is now 100% full. I notice that there are duplicate files in /usr/lib, eg Oct 11 22:35 libgcj.so.10.0.0 and Sep 14 2008 libgcj.so.90.0.0 (I assume the latter has been replaced by the former?). Is it safe to remove the "outdated" lib files? Is there an elegant way of doing it?
View 4 Replies
View Related
Apr 1, 2009
Was wondering if any perl guru's could help me with a quick log file adjustment. I have a text file that looks like so (tabs and newlines are revealed so you can see what separates the data):
There are maybe 100 lines of text in this file at any given time. I need to delete all duplicate lines only looking at the first bit of text prior to the first tab. It doesn't matter which one gets deleted as long as there are no two lines that begin with that same text at the beginning before the first tab. So in this example, either the fist line "1234" or the last line "1234" would need to be deleted. I already have code in my script that opens the files - I just need the code to read the text into an array and the part that would find matches based on the above criteria, and make the deletions.
If it would be easier, I can even do a system call and use SED (v4.1.5) and/or AWK (3.1.5) instead.
View 7 Replies
View Related
Oct 14, 2010
I want to be able to check the contents of a text file for a specific string and remove it from the file from the command prompt. I would basically be searching through a number of files and if a specific string is found I would like it removed automatically. pretty much a find and replace, were the replace is nothing. any one got any ideas on how you would do this. I already have the search part sorted just need to be able to remove the string I don't want from the multiple files.
View 4 Replies
View Related
Jul 5, 2011
I have 2 lists of names, they aren't sorted, and may contain repeats.What I would like to do with a bash script is compare the 2 documents and find and remove each repeat name, saving only one of them. Then concatenate the files. Or if it were easier, concatenate first and find and remove all internal repeats.
View 5 Replies
View Related
Jun 5, 2009
I have a text file which include code...
I mean, this string should be removed from each line and save in another file.
View 9 Replies
View Related
Sep 6, 2010
I am creating my own address book Python program and I want to create a nction that removes some specified entries. The code looks like this now.
Code:
def remove():
delentry= raw_input('Enter the entry name to delete: ')
[code]...
View 1 Replies
View Related
Mar 24, 2010
[Code]....
I am trying to remove <a href links using SED but unable to do it.
The finale result I am looking for is
[Code]....
Is it possible with Linux or should I try with Php?
View 2 Replies
View Related
Jul 6, 2011
anyone has ideas how to remove lone lines from a text file?
If I have a file that is like this:
-----------------------------------
line 1
[code]...
View 14 Replies
View Related
Jul 1, 2010
I have a text file (actually a log file from a sensor) that looks like this:
Date/Time: 10.07.01 11:03:59
00 Battery Voltage 13.5 Volt
01 Reference 71
02 Wind speed 6.68 m/s
03 Wind gust 9.3 m/s
[Code]....
I want to delete every block that is not complete. If any of the above lines (Date line or lines 00 to 08)is missing I want to completely remove the block.
View 1 Replies
View Related
Sep 9, 2010
I use the show-leaves yum plugin, and sometimes I use that info to remove some unnecessary packages. But that list can be long, so I write them on a text file.
1) Instead of removing packages one by one is there a way to remove all packages written in that text file?
2) Why isn't the output of the show-leaves plugin compatible with the output of "package-cleanup --leaves"?
View 1 Replies
View Related
Sep 15, 2011
Is it possible to remove multiple packages listed in a text file? Similar to "cat orphan.txt | zypper rm" or "zypper rm <orphan.txt." Neither worked.
View 8 Replies
View Related
Oct 10, 2009
I need to do some text file manipulation which I think should be done with standard commands in BASH. I'm looking at comma seperated text files (stock market data). It comes in the form of date, stock code, open, high, low, close, volume. What I need to do first is move all data with same stock code sequentially into individual files.
While doing this since the stock code will now be the file name I need to remove the stock code. Next I need to filter out overlapping data from different files with the same date. ie. where two files contain the same date on the one line only one line will be added to the combined file. I think there must be a tutorial out there for basic text manipulation like this, I just haven't found it yet.
View 11 Replies
View Related
Feb 2, 2011
Any way to remove the background color from the icon text? The little color bubble that holds the icon text? Yes I know I am picky...
2011-02-01-203633_1280x800_scrot.png (6.88 KiB) Viewed 504 times
View 12 Replies
View Related
May 8, 2010
I have a file that contains lines representing the nodes of a polyline but I only need the first point in each segment. With the following text:
0,"013A",0.57,260739.891,4379258.87
0,"013A",0.57,260737.674,4379258.94
0,"013A",0.57,260684.628,4379258.35
1,"013A",0.545,260769.915,4379257.84
1,"013A",0.545,260739.891,4379258.87
[Code]....
The problem with uniq is that the last two colums will differ. I don't care about the x/y for any points following the first one.
View 4 Replies
View Related
Nov 1, 2010
we have a variable LIBS:
LIBS='-lpq -lmysqlclient -lssl -lpthread -lresolv -lssl -lpthread -lresolv'
I'd want to get:
LIBS='-lpq -lmysqlclient -lssl -lpthread -lresolv'
Bear in mind that LIBS can be variable, I mean I need to drop any duplicate and only retain the last one of each different entry. And we must keep the order as is, I must not sort out them.
So, if LIBS is:
LIBS='-lpq -lmysqlclient -lpthread -lresolv -lpthread -lresolv'
I need: LIBS='-lpq -lmysqlclient -lpthread -lresolv'
View 3 Replies
View Related
Oct 21, 2009
I have Fedora 11 installed-32bit-with xfce installed as the desktop. When I click on the fedora icon for the menu and select Preferences, there are 2 input methods listed even though I did not have any installed.Since there is no menu editor any more, does anybody know how to edit the menu so that I can get rid of these entries?
View 1 Replies
View Related
Dec 19, 2010
there is this unknown notification popup appear on top left of the screen other than the one on the panel. Anyone have experience on remove the top notification popup? this is my root account that i mainly use everyday, but if i created new account the top navigation not exist.
View 2 Replies
View Related
Feb 23, 2010
I did apt-get install qtcreator and it installed qt 4.5.3(qt4.5.2real) I had qt 4.5.2. If I go in Applications->programming I see 2 shortcuts for qtcreator, one of them being newer. How do I remove the older one? On another note, if I want to update Qt to 4.6 what would be the steps if I already have qt 4.5
View 1 Replies
View Related
Dec 13, 2010
I am looking for a Linux app that can find and remove duplicate images (with different filenames if that's at all possible).
View 5 Replies
View Related
Aug 19, 2010
I have two folders - Folder abc and Folder xyz which contains 1000's of files with few of them having the same file names. How can I remove the duplicates from Folder abc?
View 14 Replies
View Related