Contained within each of these 67 text files is about 1 million urls. Yes. I have 67 text files that contain 1 million lines of urls each. I am sure I am swimming in duplicates. I tried opening one text file and clicking sort ----->remove duplicates. Now Gedit is not responding my processor is maxed out to 100% and I think I am finally ready to delve into some command line code. Can anyone give me idiot proof instructions on how to sort the duplicates out of each one of these 67 text files? How about no duplicates across all 67?
i have a big file of random numbers i generated at some point in time, after working with it with different things(how fun that was)... i want to remove duplicate lines and i'm not sure i'm doing this right
Trying to remove lines from a syslog text file that have duplicate strings
Mar 10 06:51:11[http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360]
then a few lines down
Mar 10 06:52:03 [http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360
got the same thing in terms of a u: number but the issue is I need to remove duplicates and just leave one and the file has multiple duplicates of different u: numbers and it's 14,000 lines long. can anyone tell me if I can use awk? sed? or sort for something like this to? removing lines that have a certain string in there that's a duplicate.
I need to insert 3-4 lines of text to the beginning of a text file. The file is a largish MYSQL dump, the result of a backup shell script. This shell script should insert the required text.I've wrestled with sed, but lost.
I am looking for a way to keep a log and make if then statements if a line exitsts in the log. I also am looking for a way to make a simple loop, like goto line number, and I also am wondering how to add/remove bits of text from a text file (plugins line in server.properties)
I have a plain text file with 360 lines of varying length text. How do I add a comma or other symbol to the end of each line so that I can convert the file to csv format that I can open in a spreadsheet (45 rows, 8 columns). That means each 8 lines of text forms 8 columns, with 45 rows.
I have a list of words that I want to grep in many files to see which ones have it and which ones dont. in the text file I have all the words listed line by line, ex: list.txt:
check try this word1 word2 open space list ..
I want to grep each line one by one. like I want it to
grep "check" *.log grep "try this" *.log grep "word1" *.log .. etc how can I do this?
I need to chop of the top 30ish lines of several log files until a line starting with "Initialization completed."The trouble is that it's not always the same amount of lines that need to be deleted, and they don't always contain the same information, which is why I would need to delete everything priorhe line starting with "Initialization completed."Right now I have a little script I wrote based on looping each file through several "grep -v" commands with each known pattern of lines I want to ignore, but it is tedious and I have to inspect each file afterwards to make sure nothing is left from above "Initialization completed
I need to create a script to count the number of lines from a text file . The output must be put on another text file (no_lines.txt) and in this file i need to generate from the script this output :"File $FILE has $NO_LINES lines ".
For example, I have a text file with data which lists numerical values from two separate individuals
Code: Person A 100 200 300 400 500 600 700 800 900 1000 1100 1200
Person B 1200 1100 1000 900 800 700 600 500 400 300 200 100
How would I go about reading the values for each Person, then being able to perform mathematical equations for each Person (finding the sum for example)?
I'm trying to come up with ideas for a simple way to strip a specific "entry" from a text file.I know tools like sed and perl can remove specific lines from a file but I haven't been able to come up with an elegant way to do my group of lines.In my file, the first "Location" line and the "SVNPath" line should be unique every time... but are they enough to strip out the whole set of the group plus the trailing one line of white space separating each group? Add to this, my file will grow as new entries are added (always appended to the end) but new entries will have the same formatting.
I a csv-file (A.csv) with a total of 4.600.000 lines. Thats to many and only a few is necessary. I have a txt-file with 150 lines (X.txt) (all lines is dataset from a mainframe and looks like abc.def.123.456. How do I remove lines from A.csv where none of the dataset from x.txt is present?
This should be simple but I can't seem to find what I am looking for.I want to search a text file for the existence of certain strings and execute a command if they exist, something along the lines of:
if <string> exists command or
if <any member of this list exists> command
I know how to manually search a file with grep, cat, etc., but the "if this exists" part eludes me.
I want to be able to check the contents of a text file for a specific string and remove it from the file from the command prompt. I would basically be searching through a number of files and if a specific string is found I would like it removed automatically. pretty much a find and replace, were the replace is nothing. any one got any ideas on how you would do this. I already have the search part sorted just need to be able to remove the string I don't want from the multiple files.
I have 2 lists of names, they aren't sorted, and may contain repeats.What I would like to do with a bash script is compare the 2 documents and find and remove each repeat name, saving only one of them. Then concatenate the files. Or if it were easier, concatenate first and find and remove all internal repeats.
i am on processing text tasks And i found that if you assign a text to a variable is chomp'ed automatically the newline
Code:
variable=$(cat file.txt)
The problem is i can only access the items/lines using:
Code:
for line in $variable do echo $line # Other commands done
how do i convert this to an indexed array. More importantly, how do i get access to individual $line[0], ..., $line[n] Another thing, if the file.txt, has lines with spaces it is a mess using the for...in..., but echoing prints line by line...o_0