Trying to remove lines from a syslog text file that have duplicate strings
Mar 10 06:51:11[http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360]
then a few lines down
Mar 10 06:52:03 [http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360
I get the same thing in terms of the u: number, but the issue is I need to remove the duplicates and leave just one, and the file has multiple duplicates of different u: numbers and it's 14,000 lines long. Can anyone tell me if I can use awk, sed, or sort for something like this too? I want to remove lines that contain a certain string that is a duplicate of one already seen.
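A minimal awk sketch, assuming the u:<number>|<number> token is what defines a duplicate and that the file name syslog.txt is just a placeholder: it keeps the first line seen for each u: value and passes through any line that has no u: token at all.
Code:
awk '{
    # Pull out the u:2533274802474744|360 style key, if the line has one
    if (match($0, /u:[0-9]+[|][0-9]+/)) {
        key = substr($0, RSTART, RLENGTH)
        if (seen[key]++) next        # already printed a line with this key
    }
    print
}' syslog.txt > deduped.txt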
I'm trying to come up with ideas for a simple way to strip a specific "entry" from a text file. I know tools like sed and perl can remove specific lines from a file, but I haven't been able to come up with an elegant way to remove my group of lines. In my file, the first "Location" line and the "SVNPath" line should be unique every time... but are they enough to strip out the whole group plus the trailing line of white space separating each group? Add to this, my file will grow as new entries are added (always appended to the end), but new entries will have the same formatting.
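One possible sketch with awk in paragraph mode, assuming each group really is separated by a single blank line and that the SVNPath value (here /svn/oldrepo, purely a placeholder) identifies the group to drop: every blank-line-separated record except the matching one is printed back out.
Code:
awk -v RS= -v ORS='\n\n' '!/SVNPath: \/svn\/oldrepo/' config.txt > config.new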
Contained within each of these 67 text files is about 1 million URLs. Yes, I have 67 text files that contain 1 million lines of URLs each. I am sure I am swimming in duplicates. I tried opening one text file and clicking Sort -> Remove Duplicates. Now gedit is not responding, my processor is maxed out at 100%, and I think I am finally ready to delve into some command-line code. Can anyone give me idiot-proof instructions on how to sort the duplicates out of each one of these 67 text files? How about removing duplicates across all 67?
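A sketch using sort, assuming the files match a glob like urls_*.txt (adjust the pattern to the real names): sort -u both sorts and drops exact duplicate lines, and it handles million-line files far better than gedit.
Code:
# Deduplicate each of the 67 files in place
for f in urls_*.txt; do
    sort -u "$f" -o "$f"
done

# Build one combined list with no duplicates across all 67 files
sort -u urls_*.txt > all_urls_unique.txt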
I have a CSV file (A.csv) with a total of 4,600,000 lines. That's too many, and only a few are necessary. I have a txt file with 150 lines (X.txt); each line is a dataset name from a mainframe and looks like abc.def.123.456. How do I remove the lines from A.csv where none of the dataset names from X.txt is present?
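A grep sketch, assuming every line of A.csv you want to keep literally contains one of the 150 dataset names listed one per line in X.txt; -F treats them as fixed strings rather than regular expressions, which also keeps it fast on a 4,600,000-line file.
Code:
grep -F -f X.txt A.csv > A_filtered.csv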
I have a big file of random numbers I generated at some point in time. After working with it on different things (how fun that was)... I want to remove the duplicate lines and I'm not sure I'm doing this right.
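Two common sketches, with numbers.txt standing in for the real file name: sort -u if the order of the lines does not matter, or an awk one-liner that keeps only the first occurrence of each line and preserves the original order.
Code:
# Order does not matter
sort -u numbers.txt > numbers_unique.txt

# Keep the original order, dropping any line already seen
awk '!seen[$0]++' numbers.txt > numbers_unique.txt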
I have this massive table file with some data in it, and I want to replace some lines that are wrong with the correct ones from another table file of the same format. The wrong lines are not all together in a block but randomly distributed, so I need a loop that checks whether the line is in the other file and, if it is, replaces it. I want to try to do it with sed or awk, but I don't really know how.
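A possible awk sketch, assuming the first column of each table is a unique key that identifies a row (the file names are placeholders): rows present in the corrections file replace the matching rows in the main table, and everything else passes through untouched.
Code:
awk 'NR == FNR { fix[$1] = $0; next }   # first file: remember corrected lines by key
     $1 in fix { print fix[$1]; next }  # key found: print the corrected line instead
     { print }                          # otherwise keep the original line
' corrections.txt table.txt > table.fixed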
I need to filter the log from a massive wget. I want to remove the progress lines and leave only the last one. Now each progress line starts with a newline '
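If the goal is to keep only the final progress line of each run of progress output, one rough sketch is below; it assumes progress lines are the ones containing a run of dots and a percentage, as in wget's default progress display, which may not match your log exactly.
Code:
awk '/\.\.\.\.\..*%/ { last = $0; pending = 1; next }    # remember the latest progress line
     { if (pending) { print last; pending = 0 } print }  # flush it before the next normal line
     END { if (pending) print last }
' wget.log > wget_trimmed.log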
I have model output data in ASCII format. It contains thousands of lines. The output file contains multiple text lines with variable values. Here I copy-paste some of its contents.
I have a txt file with a couple of comment lines:
Number of title = !num!
#line1
#line2
#line3
I wrote a script with "sed" to replace !num! in this file, which is very straightforward. However, I also want to remove a number of the "#" lines based on the !num! value. Is there an easy way to do that with "sed"? Otherwise, I will have to write a script to loop through the file.
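A sketch that does both steps in awk instead of sed, assuming num holds the value to substitute for !num! and that the comment lines to trim are the ones starting with "#" (an assumption based on the example above); it keeps only the first num of them, so flip the comparison if the opposite is wanted.
Code:
num=2
awk -v n="$num" '
    /^#/ { if (++count > n) next }   # drop comment lines beyond the first n
    { gsub(/!num!/, n); print }      # substitute !num! as before
' input.txt > output.txt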
I want to be able to remove the first character of each line when I highlight multiple lines in gedit. Example:
%Example is
%Commented Code
%Uncomment using this shortcut
I would then highlight/select these lines, and remove the first character to make it look like this:
Example is
Commented Code
Uncomment using this shortcut
I'm pretty sure there is an actual shortcut for this. If there is another text editor on Linux that it would work in, it would be nice to know how to do it in that editor as well.
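If a command-line fallback is acceptable, sed can do the same job outside the editor; this sketch assumes the file is called file.m and that the character to strip is a leading %.
Code:
# Remove a leading % sign from every line
sed 's/^%//' file.m
# Or remove the first character of every line, whatever it is
sed 's/^.//' file.m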
I have a CSV file that's created in an application that can't output lines longer than 250 characters. The data fields, all together, are longer than this. How would I remove the line break from every line that ends with a comma? For example:
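An awk sketch, assuming a record was split whenever a physical line ends in a comma (broken.csv is a placeholder name): it keeps appending the following line until the record no longer ends with a comma.
Code:
awk '{
    line = $0
    # While the accumulated record still ends in a comma, pull in the next physical line
    while (line ~ /,$/ && (getline nxt) > 0)
        line = line nxt
    print line
}' broken.csv > fixed.csv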
I'm looking for a script (bash, python, perl etc) or even a one liner (sed, awk etc) that can take a set of files and remove any line that has more than "x" instances of any character (case sensitive). I have been doing a lot of searching and can only come up with examples of how to remove blank lines, lines that start with a certain character or lines that contain a certain string. This will be used on a system running a Kubuntu derivative.
As a very poor and basic example, I would like to take files that contain lines like:
Code:
And end up with the files only containing the lines:
Code:
If I tell the script that 2 is the maximum number of times any character can appear in any line.
I know this must be possible, but for the life of me I cannot find even an example that will lead me in the right direction or better yet a piece of code I can use.
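A rough sketch with GNU awk (splitting on an empty separator to get one array element per character is a GNU awk feature), assuming the threshold is passed in as max; it is case sensitive because awk comparisons are. Run it once per file, or wrap it in a for loop over the set of files.
Code:
awk -v max=2 '{
    delete count                      # reset the per-line character counts
    n = split($0, chars, "")          # one array element per character (GNU awk)
    keep = 1
    for (i = 1; i <= n; i++)
        if (++count[chars[i]] > max) { keep = 0; break }
    if (keep) print
}' input.txt > output.txt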
I'm looking for a way to insert the number of lines in a file at the start of the aforementioned file. This should be simple, but as I am not used to scripting in Linux, I am finding it tough going. I can find the number of lines in a file easily enough via
filesize=$(awk 'END {print NR}' $1)
but as for inserting this into the first line, I'm failing to do so. I've tried some of the other approaches on these forums, but none so far have worked.
I've tried:
sed '1i$filesize' $1
but sed's i command requires a literal string, not a variable, so no go. I've also tried:
but again with no luck, as cat seems to need an input stream. Just to recap, I want to insert a line at the start of a given file that holds the number of lines the original file has.
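One sketch that works with GNU sed: the trick is to use double quotes so the shell expands the variable before sed parses the expression, and -i so the file is rewritten in place.
Code:
#!/bin/bash
# Prepend the line count to the file named in $1
filesize=$(awk 'END { print NR }' "$1")

# Double quotes let $filesize expand; -i (GNU sed) edits the file in place
sed -i "1i $filesize" "$1"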
I am using RHEL 5. I have a very large text file which cannot be opened in vi. The file has some 8,000 lines. I need to view the lines between 5680 and 5690. How can I view these particular lines in a large file? What command and options do I need to use?
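A couple of sketches that print just that range without opening the whole file in an editor (bigfile.log is a placeholder name); the 5691q tells sed to quit as soon as it has passed the range, which matters on large files.
Code:
# With sed
sed -n '5680,5690p; 5691q' bigfile.log

# With head and tail (lines 5680 through 5690 inclusive)
head -n 5690 bigfile.log | tail -n 11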
I have a large text file containing over 180k lines and another text file containing about 1k lines. I would like to remove the lines in the 180k-line file that exist in the 1k-line file.
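A grep sketch, assuming the 1k-line file holds complete lines that should match exactly (big.txt and small.txt are placeholder names): -v inverts the match, -x requires whole-line matches, and -F treats the patterns as fixed strings rather than regexes.
Code:
grep -v -x -F -f small.txt big.txt > big_cleaned.txt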
I would like to modify the content of a text file in Linux in the following way: the file has several lines like
./run_pest3 ./g134366.04080_0.062 x 2_d043 1 0.43 results_EC
and I want to modify all such lines to be
./run_pest3 ./g134366.04080_0.062 x 2_d043 1 0.43 results_EC0.062
i.e., the last number of $2 should be "attached" to the end of $7, for each line.
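An awk sketch, assuming the fields are whitespace-separated as shown and the value to copy is everything after the last underscore in $2; note that awk rebuilds each modified line with single spaces between fields.
Code:
awk '{
    n = split($2, parts, "_")   # parts[n] is the trailing number, e.g. 0.062
    $7 = $7 parts[n]            # append it to the end of the seventh field
    print
}' input.txt > output.txt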
I'm trying to output two certain fields of a very large text file to two very small text files, then take those files and add all the lines together to come up with a total from each file (two totals).
Breakdown: Put 0 in a text file to be drawn by respective while loops for math later
Output the last 60 integers to a file for total A (new integer every minute)
Output the last 60 integers to a file for total B (new integer every minute)
The two while loops are supposed to be adding the lines together. The echo commands at the end are for testing purposes, just to see the output. However, when I run this, I get the output of
Code:
0 0
Which is obviously not what it's supposed to be. Is there a more efficient way to do this or am I missing something in the script that would reset the values to "0"?
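Without seeing the script it is hard to be sure, but an output of 0 0 very often means the totals were accumulated inside a cat file | while read pipeline: the loop runs in a subshell, so the variable falls back to its initial 0 when the loop ends. A sketch that sidesteps the issue by letting awk do the summing (fileA.txt and fileB.txt are placeholder names, each assumed to hold one integer per line):
Code:
#!/bin/bash
# Sum the integers in each file; "+ 0" forces numeric output even for an empty file
totalA=$(awk '{ sum += $1 } END { print sum + 0 }' fileA.txt)
totalB=$(awk '{ sum += $1 } END { print sum + 0 }' fileB.txt)

echo "$totalA $totalB"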