Trying to remove lines from a syslog text file that have duplicate strings
Mar 10 06:51:11[http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360]
then a few lines down
Mar 10 06:52:03 [http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360
Both lines have the same u: number. I need to remove the duplicates and leave just one line per u: number. The file has multiple duplicates of different u: numbers and is 14,000 lines long. Can anyone tell me whether I can use awk, sed, or sort for something like this, i.e. removing lines that contain a string that already appeared on an earlier line?
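One possible approach is awk, keeping only the first line seen for each u: token. This is a minimal sketch, not a definitive answer: the function name dedupe_by_user is made up for illustration, and it assumes the token always looks like [u:digits|digits], as in the two sample lines. Lines without such a token are passed through unchanged.

```shell
# Keep only the first line seen for each [u:...] token.
# Usage: dedupe_by_user syslog.txt > syslog.dedup.txt
dedupe_by_user() {
    awk '
        # Assumption: the token is "[u:" followed by digits and "|", then "]".
        match($0, /\[u:[0-9|]+\]/) {
            key = substr($0, RSTART, RLENGTH)
            if (seen[key]++) next    # a line for this u: number was already printed
        }
        { print }                    # first occurrence, or a line with no token
    ' "$1"
}
```

The result goes to standard output, so the original file is left untouched until you have checked the output.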
I have 67 text files, and each one contains about 1 million lines of URLs. I am sure I am swimming in duplicates. I tried opening one text file and clicking Sort -> Remove duplicates. Now Gedit is not responding, my processor is maxed out at 100%, and I think I am finally ready to delve into some command-line code. Can anyone give me idiot-proof instructions on how to sort the duplicates out of each one of these 67 text files? How about no duplicates across all 67?
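A hedged sketch using sort -u, which sorts and drops duplicate lines in one pass and can handle files far larger than an editor can. The function names and the .dedup suffix are made up for illustration. "No duplicates across all 67" is read here as merging everything into one combined file, which is only one possible interpretation.

```shell
# Write a sorted, duplicate-free copy of each file next to the original.
# Usage: dedupe_each urls_*.txt
dedupe_each() {
    for f in "$@"; do
        LC_ALL=C sort -u "$f" > "$f.dedup"
    done
}

# Merge all input files into one sorted list with no duplicates at all.
# Usage: dedupe_all all_urls.out urls_*.txt
dedupe_all() {
    out=$1
    shift
    LC_ALL=C sort -u "$@" > "$out"
}
```

LC_ALL=C makes sort compare raw bytes, which is usually faster and is enough for exact-duplicate removal. The output file of dedupe_all must not be one of its input files, because the redirection truncates it before sort reads it.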
I have a big file of random numbers that I generated at some point. After working with it in different ways (how fun that was), I want to remove duplicate lines, and I'm not sure I'm doing this right.
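If the original order of the numbers matters, sorting is not an option. A common awk idiom keeps the first occurrence of each line and preserves order. The wrapper name dedupe_keep_order is made up for illustration.

```shell
# Print each distinct line once, in order of first appearance.
# Usage: dedupe_keep_order numbers.txt > numbers.unique.txt
dedupe_keep_order() {
    # seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++
    # is true exactly once per distinct line.
    awk '!seen[$0]++' "$1"
}
```

This holds every distinct line in memory, so for very large files sort -u may be the better fit when order does not matter.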
I need to insert 3-4 lines of text at the beginning of a text file. The file is a largish MySQL dump, the result of a backup shell script. This shell script should insert the required text. I've wrestled with sed, but lost.
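One way to do this without sed is to concatenate the new lines and the dump into a temporary file and then copy the result back. A minimal sketch, assuming the lines to insert are stored in a separate file; prepend_header and both file names in the usage line are hypothetical.

```shell
# Put the contents of a header file in front of a target file.
# Usage: prepend_header header.txt dump.sql
prepend_header() {
    header=$1
    target=$2
    tmp=$(mktemp) || return 1
    # Build the combined file first, then overwrite the target in place
    # so that its permissions and ownership stay as they were.
    cat "$header" "$target" > "$tmp" && cat "$tmp" > "$target"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For a large dump this needs enough free space for one temporary copy of the file.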
I am basically trying to remove duplicate words in my <title></title> tag after I got hit by Google Panda. I have around 750 .html files, and it will be difficult for me to fix them one by one. I am looking for a way to remove duplicates only from within <title></title>.
Example of a duplicate title I have:
<title>Pasta, Pasta Recipe and Pasta Guide</title>
I don't want to replace those words anywhere else in the file, only within the <title> tag.
I am looking for a way to keep a log and write if/then statements that check whether a line exists in the log. I am also looking for a way to make a simple loop, like a goto to a line number, and I am wondering how to add or remove bits of text in a text file (the plugins line in server.properties).
I have a plain text file with 360 lines of text of varying length. How do I add a comma or other symbol to the end of each line so that I can convert the file to CSV format and open it in a spreadsheet (45 rows, 8 columns)? That is, each group of 8 consecutive lines forms the 8 columns of one row, giving 45 rows.
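Rather than appending a comma to every line, paste can join each group of 8 consecutive lines into one comma-separated row directly. A minimal sketch; the function name lines_to_csv is made up, and it assumes no line itself contains a comma or a double quote, since nothing is quoted or escaped here.

```shell
# Join every 8 consecutive input lines into one comma-separated row.
# Usage: lines_to_csv input.txt > output.csv
lines_to_csv() {
    # Each "-" makes paste read one more line from standard input for the
    # same output row, so eight of them give eight columns per row.
    paste -d, - - - - - - - - < "$1"
}
```

With 360 input lines this yields 45 rows of 8 columns each.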
I need to chop off the top 30ish lines of several log files, up to a line starting with "Initialization completed." The trouble is that it's not always the same number of lines that need to be deleted, and they don't always contain the same information, which is why I would need to delete everything prior to the line starting with "Initialization completed." Right now I have a little script I wrote based on looping each file through several "grep -v" commands with each known pattern of lines I want to ignore, but it is tedious and I have to inspect each file afterwards to make sure nothing is left from above "Initialization completed."
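A sed address range can express "from the marker line to the end of the file" without listing the unwanted patterns. A minimal sketch; strip_preamble is a made-up name, and it assumes the marker line itself should be kept.

```shell
# Print a log file starting at the first line that begins with
# "Initialization completed." and drop everything before it.
# Usage: strip_preamble server.log > server.trimmed.log
strip_preamble() {
    # -n suppresses default output; the range /marker/,$ selects from the
    # first matching line through the last line, and p prints that range.
    sed -n '/^Initialization completed\./,$p' "$1"
}
```

If a file never contains the marker, the output is empty, which is worth checking before overwriting anything.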
I'm trying to come up with ideas for a simple way to strip a specific "entry" from a text file. I know tools like sed and perl can remove specific lines from a file, but I haven't been able to come up with an elegant way to remove my group of lines. In my file, the first "Location" line and the "SVNPath" line should be unique every time... but are they enough to strip out the whole group plus the one trailing blank line separating each group? Add to this, my file will grow as new entries are added (always appended to the end), but new entries will have the same formatting.
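The file itself is not shown, so the following is only a sketch under assumptions: entries are groups of lines separated by a truly empty line (no spaces or tabs on it), and one entry is identified by a literal piece of text such as its SVNPath line. The function name remove_entry and the sample values are hypothetical. awk's paragraph mode (an empty RS) treats each blank-line-separated group as one record.

```shell
# Drop every blank-line-separated entry that contains the given literal text.
# Usage: remove_entry 'SVNPath x/b' entries.conf > entries.new
remove_entry() {
    # RS='' makes each group of lines one record; ORS re-adds the blank
    # line after every entry that is kept. index() is a plain substring
    # test, so no regex characters in the text need escaping.
    awk -v RS='' -v ORS='\n\n' -v pat="$1" 'index($0, pat) == 0' "$2"
}
```

Because the match is a substring test, the text passed in should be specific enough not to occur in any other entry.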
I need to create a script that counts the number of lines in a text file. The output must go into another text file (no_lines.txt), and the script needs to write this line to it: "File $FILE has $NO_LINES lines".
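A minimal sketch using wc -l. The function name count_lines and the optional second argument are additions for illustration; FILE, NO_LINES, and no_lines.txt come from the question.

```shell
# Append "File <name> has <n> lines" to no_lines.txt (or to the file
# given as an optional second argument).
# Usage: count_lines somefile.txt
count_lines() {
    FILE=$1
    # Reading from standard input keeps wc from printing the file name;
    # tr removes the padding spaces that some wc implementations add.
    NO_LINES=$(wc -l < "$FILE" | tr -d ' ')
    echo "File $FILE has $NO_LINES lines" >> "${2:-no_lines.txt}"
}
```

wc -l counts newline characters, so a last line without a trailing newline is not counted.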
I have a CSV file (A.csv) with a total of 4,600,000 lines. That's too many, and only a few are necessary. I have a text file with 150 lines (X.txt); each line is a dataset name from a mainframe and looks like abc.def.123.456. How do I remove the lines from A.csv in which none of the datasets from X.txt is present?
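grep can read its search strings from a file, which fits this case: keep only the lines of A.csv that contain at least one line of X.txt. A minimal sketch; filter_by_patterns is a made-up name.

```shell
# Print only the lines of the data file that contain at least one of the
# strings listed, one per line, in the pattern file.
# Usage: filter_by_patterns X.txt A.csv > A_filtered.csv
filter_by_patterns() {
    # -F treats each pattern as a literal string, so the dots in names
    # like abc.def.123.456 are not regex wildcards; -f reads the patterns.
    grep -F -f "$1" -- "$2"
}
```

Two things to check in X.txt first: an empty line matches every line of A.csv, and Windows-style line endings would make the patterns fail to match.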
This should be simple, but I can't seem to find what I am looking for. I want to search a text file for the existence of certain strings and execute a command if they exist, something along the lines of:
if <string> exists command or
if <any member of this list exists> command
I know how to manually search a file with grep, cat, etc., but the "if this exists" part eludes me.
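The "if this exists" part can lean on grep's exit status: grep -q prints nothing and exits 0 only when a match is found, so it can be used directly as the condition of an if. A minimal sketch; both function names are made up for illustration.

```shell
# Run a command only if the file contains the given literal string.
# Usage: run_if_found "some string" file.txt echo "found it"
run_if_found() {
    needle=$1
    file=$2
    shift 2
    if grep -qF -- "$needle" "$file"; then
        "$@"
    fi
}

# Run a command only if the file contains any string from a list file
# (one string per line).
# Usage: run_if_any_found list.txt file.txt echo "found one"
run_if_any_found() {
    list=$1
    file=$2
    shift 2
    if grep -qF -f "$list" -- "$file"; then
        "$@"
    fi
}
```

Dropping -F makes grep treat the strings as regular expressions instead of literal text.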
I want to be able to check the contents of a text file for a specific string and remove it from the file, from the command prompt. I would basically be searching through a number of files, and if a specific string is found I would like it removed automatically. It is pretty much a find and replace, where the replacement is nothing. Has anyone got any ideas on how to do this? I already have the search part sorted; I just need to be able to remove the unwanted string from the multiple files.
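A sketch of the removal step, assuming a Unix-like shell with awk is available (the question says "command prompt", so this may not match the actual environment). It removes every occurrence of a literal string and rewrites each file in place through a temporary copy. The name remove_string is made up.

```shell
# Remove every occurrence of a literal string from each listed file.
# Usage: remove_string "unwanted text" file1.txt file2.txt
remove_string() {
    needle=$1
    shift
    [ -n "$needle" ] || return 1      # an empty string would match nothing useful
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        awk -v s="$needle" '
            {
                line = $0
                out = ""
                # Cut out each occurrence of s, working from left to right.
                while ((i = index(line, s)) > 0) {
                    out = out substr(line, 1, i - 1)
                    line = substr(line, i + length(s))
                }
                print out line
            }
        ' "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

index() compares plain text, so characters such as . or * in the string need no escaping. One caveat: awk -v interprets backslash escapes, so a string containing backslashes would need extra care.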
I have this massive table file with some data in it and I want to replace some lines that are wrong with the correct ones that are in another table file of the same format. The wrong lines are not all together in a block but randomly distributed so I need to make a loop checking if the line is in the other file and if it is, replace it. I want to try and do it with sed or awk but I don't really know how to....
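No explicit loop is needed if awk reads the corrections file first and then streams the big table. This is a sketch under a loud assumption: the first whitespace-separated column is a key that identifies a row in both files. The question does not say how rows are matched, so adjust $1 to whatever really identifies a line. apply_fixes is a made-up name.

```shell
# Replace rows of a table with the corrected rows from another file,
# matching rows by their first column (assumed to be a unique key).
# Usage: apply_fixes corrections.txt table.txt > table.fixed.txt
apply_fixes() {
    awk '
        NR == FNR { fix[$1] = $0; next }       # first file: remember corrected rows
        ($1 in fix) { print fix[$1]; next }    # second file: swap in the correction
        { print }                              # rows without a correction stay as is
    ' "$1" "$2"
}
```

The NR == FNR test relies on the corrections file being non-empty; with an empty first file, awk would treat the table itself as the corrections.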
Is there any way to delete certain paragraphs from a text file and then insert them into another text file? I just cannot figure out how to remove the specific lines from the file and then insert them into another file at a certain line within that new file. Thanks again