Programming :: Script To Remove Lines In A File With More Than "x" Instances Of Any Character ?
Oct 4, 2010
I'm looking for a script (bash, python, perl etc) or even a one liner (sed, awk etc) that can take a set of files and remove any line that has more than "x" instances of any character (case sensitive). I have been doing a lot of searching and can only come up with examples of how to remove blank lines, lines that start with a certain character or lines that contain a certain string. This will be used on a system running a Kubuntu derivative.
As a very poor and basic example, I would like to take files that contain lines like:
Code:
And end up with the files only containing the lines:
Code:
If I tell the script that 2 is the maximum number of times any character can appear in any line.
I know this must be possible, but for the life of me I cannot find even an example that will lead me in the right direction or better yet a piece of code I can use.
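A minimal Python sketch of one way to do this, assuming that every character (including spaces and punctuation) counts toward the limit and that the trailing newline does not. The function names `within_limit` and `filter_lines` are made up for illustration.

```python
from collections import Counter


def within_limit(line: str, max_count: int) -> bool:
    """Return True if no character occurs more than max_count times in line."""
    counts = Counter(line.rstrip("\n"))  # case sensitive: 'A' and 'a' differ
    return all(n <= max_count for n in counts.values())


def filter_lines(lines, max_count):
    """Keep only the lines in which every character appears at most max_count times."""
    return [line for line in lines if within_limit(line, max_count)]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python filter.py 2 < input.txt > output.txt
    limit = int(sys.argv[1])
    sys.stdout.writelines(filter_lines(sys.stdin, limit))
```

Reading from standard input keeps the sketch usable on a set of files through a shell loop, without it having to overwrite anything itself.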
View 15 Replies
Jul 16, 2010
I have a large text file that's formatted sort of like this:
Code:
foo bar
blah
[code]...
View 2 Replies
Feb 18, 2010
In my command prompt I did:
Code:
sed 's/://' mytextfile > newtextfile
But it only deleted the first instance of : in each line when some lines have multiple : appearing in each one. How can I delete all the : from the entire file?
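For context, sed's `s` command replaces only the first match on each line unless the `g` (global) flag is appended, as in `sed 's/://g' mytextfile > newtextfile`. The same idea as a small Python sketch (the function name `strip_colons` is just for illustration):

```python
def strip_colons(text: str) -> str:
    """Remove every ':' from the text, not just the first one on each line."""
    return text.replace(":", "")


if __name__ == "__main__":
    # File names taken from the question above.
    with open("mytextfile", encoding="utf-8") as src:
        cleaned = strip_colons(src.read())
    with open("newtextfile", "w", encoding="utf-8") as dst:
        dst.write(cleaned)
```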
View 6 Replies
Jan 21, 2011
I'm trying to come up with ideas for a simple way to strip a specific "entry" from a text file. I know tools like sed and perl can remove specific lines from a file, but I haven't been able to come up with an elegant way to remove my group of lines. In my file, the first "Location" line and the "SVNPath" line should be unique every time, but are they enough to strip out the whole group plus the one trailing line of white space separating each group? Add to this that my file will grow as new entries are added (always appended to the end), but new entries will have the same formatting.
View 9 Replies
Jan 23, 2010
I want to be able to remove the first character of a line when I highlight multiple lines in gedit. Example:
%Example is
%Commented Code
%Uncomment using this shortcut
I would then highlight/select these lines, and remove the first character to make it look like this:
Example is
Commented Code
Uncomment using this shortcut
I'm pretty sure there is an actual shortcut for this. If there is another text editor on Linux that it would work in, it would be nice to know how to do it in that editor as well.
View 2 Replies
Jun 21, 2011
I have a CSV file (A.csv) with a total of 4,600,000 lines. That's too many, and only a few are necessary. I have a text file with 150 lines (X.txt); every line is a dataset name from a mainframe and looks like abc.def.123.456. How do I remove the lines from A.csv in which none of the datasets from X.txt is present?
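One possible approach, sketched in Python under the assumption that a plain substring match is good enough (a dataset name counts as "present" if it appears anywhere in the CSV line). The helper names and the output file name `A_filtered.csv` are hypothetical.

```python
def load_names(path):
    """Read the dataset names, one per line, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def keep_matching(lines, names):
    """Yield only the lines that contain at least one of the names."""
    for line in lines:
        if any(name in line for name in names):
            yield line


if __name__ == "__main__":
    names = load_names("X.txt")
    # Stream the big file line by line instead of loading it into memory.
    with open("A.csv", encoding="utf-8") as src, \
            open("A_filtered.csv", "w", encoding="utf-8") as dst:
        dst.writelines(keep_matching(src, names))
```

Because the input is streamed, memory use stays small even with millions of lines; only the 150 names are held in memory.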
View 13 Replies
Nov 24, 2009
How do you remove parts of strings using python? Such as, if I have something like:
Code:
erme1 sdifskenklsd
erme2 sdfjksliel
[code]....
View 3 Replies
Oct 3, 2010
I have a set of files containing DNA or amino acid sequences from various organisms:
Code:
>14432|LGIG|186221
--MISVLAMA-NRITAAEKR
>14432|CAP1|21057
MVRVNVLADALKSI-TAEKR
>14432|HROB|156827
--RMNVLADALXSIC?AEKR
>14432|NVEC|159589
-VRVNVLN-ALNSICNAEX-
[Code]...
Can anyone help me get the position of the first and last non-missing data characters (while allowing missing data characters in the middle of the sequence)? I'm sure it is a simple sed or awk command but I can't figure it out. I think I can produce the output file I want once I have figured those commands out.
My ultimate goal is to write a script that can make composite sequences from two or more non-overlapping sequences (e.g., the two sequences from NEOM). I may also want to merge sequences that partially overlap (e.g., those from TEST) but that would complicate things. Is this a logical first step for such a script or would you do it differently?
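As a starting point, here is a hedged Python sketch that reports the first and last non-missing positions per sequence. It assumes that `-` and `?` are the missing-data characters (adjust `MISSING` if `X` should count too), that positions are 0-based, and that each record is a `>` header followed by a single sequence line. The names `data_span` and `spans` are made up for this example.

```python
MISSING = set("-?")  # assumption: these characters mark missing data


def data_span(seq, missing=MISSING):
    """Return (first, last) 0-based indices of non-missing characters, or None."""
    positions = [i for i, ch in enumerate(seq) if ch not in missing]
    if not positions:
        return None
    return positions[0], positions[-1]


def spans(lines):
    """Yield (header, span) pairs from FASTA-like lines."""
    header = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            header = line[1:]
        elif line and header is not None:
            yield header, data_span(line)
```

With the span of each sequence known, two sequences do not overlap when one span ends before the other begins, which is the check a merging script would need.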
View 10 Replies
Jan 28, 2009
I have a text file called file1.txt containing many lines eg.
line1
line2
line3
line4
line5
line6
Then I have another text file called file2.txt that contains
3
5
6
Is there a command to remove the lines in file1.txt based on the keywords in file2.txt? Note: it should remove line3, line5 and line6 based on 3, 5 and 6.
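A small Python sketch, assuming the numbers in file2.txt are 1-based line numbers in file1.txt (in the example above that coincides with the line contents). The function name `drop_numbered_lines` is hypothetical.

```python
def drop_numbered_lines(lines, numbers):
    """Return the lines whose 1-based position is not listed in numbers."""
    skip = set(numbers)
    return [line for i, line in enumerate(lines, start=1) if i not in skip]


if __name__ == "__main__":
    with open("file2.txt", encoding="utf-8") as fh:
        numbers = [int(tok) for tok in fh.read().split()]
    with open("file1.txt", encoding="utf-8") as fh:
        print("".join(drop_numbered_lines(fh.readlines(), numbers)), end="")
```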
View 10 Replies
Jun 27, 2010
I have a string like this "/home/test/filename.txt" and I want to delete all characters after the last "/". How can I do that using sed or awk?
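For comparison, the same operation as a Python sketch. It assumes the trailing "/" itself should be kept and that a string without any "/" is returned unchanged; the function name is made up.

```python
def strip_after_last_slash(path: str) -> str:
    """Drop everything after the last '/', keeping the slash itself."""
    head, sep, _tail = path.rpartition("/")
    # rpartition returns ('', '', path) when there is no '/' at all.
    return head + sep if sep else path


print(strip_after_last_slash("/home/test/filename.txt"))  # prints /home/test/
```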
View 5 Replies
Feb 24, 2010
Which is the simplest way to crop the first and last characters from a string? Something simpler than
Code:
echo $STRING | cut -b 2- | rev | cut -b 2- | rev
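In Python, for comparison, this is a single slice. A tiny sketch (the name `crop` is just for illustration):

```python
def crop(s: str) -> str:
    """Return s without its first and last characters."""
    return s[1:-1]


print(crop("abcd"))  # prints bc
```

Strings shorter than two characters simply come back empty.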
View 6 Replies
Jun 22, 2010
I have the following:
Quote:
echo %host%|sed "$s/.$//"
This removes the last character of the value assigned to %host%. For example, if my value is abcd, I get abc. But I am not able to assign the output. For example, when I do
Quote:
set k=`echo %host%|sed "$s/.$//"`
and then run echo %k, I get no output at the command prompt, whereas when I just type:
Quote:
echo abcd|sed "$s/.$//"
at the command prompt, I get abc. Are there other ways to remove the last character?
View 13 Replies
Jul 22, 2011
I have certain files whose names somehow contain the abnormal character DEL (0x7f, octal 177). This is causing SVN to reject these files and abruptly end the process. I need to remove those characters from the file names but am not able to: find and grep do not find the files. This is how the file looks like with ls or find code...
View 3 Replies
Nov 3, 2010
How can I remove all lines which contain A,,,,,,? I tried the following sed statements but had no luck.
Code:
sed "/A,,,,,,/d file"
sed "/A,,,,,,/d file"
View 6 Replies
Apr 1, 2009
I am trying to copy a large number of files from a Linux server to a Windows file share. Unfortunately, all of the files and folders I have to copy have 10 numbers followed by 2 colons "::" in the name (example: 1234567890::WordDoc.doc), which of course is invalid under Windows naming conventions. So now I'm trying to come up with a way to change the file and folder names on the fly to replace the colons with a dash "-" or space " ". I'm even willing to delete the first 12 characters if necessary. I have tried cp, mv, tr, and several bash scripts but got no positive results.
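One way to approach this is to rename everything on the Linux side first and copy afterwards. The Python sketch below assumes each run of colons should become a single dash and that renaming in place is acceptable; `windows_safe` and `rename_tree` are hypothetical names. It would be wise to try it on a copy of the tree first.

```python
import os
import re


def windows_safe(name: str, replacement: str = "-") -> str:
    """Replace each run of colons in a file or folder name."""
    return re.sub(r":+", replacement, name)


def rename_tree(root: str) -> None:
    """Rename files and directories under root, deepest entries first."""
    # topdown=False visits children before their parent directory, so the
    # paths handed to os.rename are still valid when they are used.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new_name = windows_safe(name)
            if new_name == name:
                continue
            target = os.path.join(dirpath, new_name)
            if os.path.exists(target):
                print("skipping, target exists:", target)
                continue
            os.rename(os.path.join(dirpath, name), target)
```

Other characters that Windows rejects are not handled here; the pattern in `windows_safe` would need widening for those.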
View 4 Replies
Nov 7, 2010
I am using 'sed -e /foo/d' to match lines which I want to delete from a file. I discovered I have some lines which contain random (extended?) characters like 'ủ' which I would also like to delete. The lines in the file should only contain alpha numeric characters.
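A Python sketch of the filtering step, assuming "alphanumeric" means ASCII letters and digits only, so lines with spaces, punctuation or no characters at all are dropped as well. The function name is made up; widen the character class if spaces should be allowed.

```python
import re

# Assumption: a valid line consists solely of ASCII letters and digits.
ALNUM_LINE = re.compile(r"[A-Za-z0-9]+")


def keep_alnum_lines(lines):
    """Keep only the lines made up entirely of ASCII letters and digits."""
    return [line for line in lines if ALNUM_LINE.fullmatch(line.rstrip("\n"))]
```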
View 8 Replies
Jul 15, 2010
I'm trying to search through some PDF files by converting them to text files with pdftotext, which works fine. I want to count the occurrences of different words in each paragraph, but pdftotext adds a newline character at what it thinks is the right-hand margin. I'm trying to remove all these single newline characters but keep the doubles, and I can't seem to work it out, i.e.
This is some text that has been broken.
Another paragraph.
becomes
This is some text that has been broken.
Another paragraph
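A possible way to do this with a regular expression, sketched in Python. It assumes paragraphs are separated by exactly one blank line, and it replaces each lone newline with a space so that words do not run together. The name `unwrap` is hypothetical.

```python
import re

# A newline that is neither preceded nor followed by another newline,
# and that is not the very last character of the text.
LONE_NEWLINE = re.compile(r"(?<!\n)\n(?!\n|\Z)")


def unwrap(text: str) -> str:
    """Join hard-wrapped lines while keeping blank-line paragraph breaks."""
    return LONE_NEWLINE.sub(" ", text)
```

The final newline of the file is left alone so the output still ends with a line terminator.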
View 9 Replies
Apr 8, 2010
I have a file with semi duplicate lines, like:
abc 12 32
agsi 82
sha 26
abc 1
iaij
agsi 3
Now I want to edit my file and make it:
abc 12 32
agsi 82
sha 26
iaij
i.e. remove second occurrence of line when 1st column is abc or agsi.
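A short Python sketch of "keep only the first line for each first-column value". It assumes columns are whitespace separated and that the rule applies to every key, not only abc and agsi; the function name is made up.

```python
def first_per_key(lines):
    """Keep the first line seen for each value of the first column."""
    seen = set()
    kept = []
    for line in lines:
        fields = line.split()
        if not fields:  # leave blank lines alone
            kept.append(line)
            continue
        if fields[0] in seen:
            continue
        seen.add(fields[0])
        kept.append(line)
    return kept
```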
View 13 Replies
Aug 6, 2010
When I want to remove all lines containing a specific word from an entire document at once, I use the following command:
awk '$columnno !~/specificword/' inputfile > outputfile
But the column number is my problem, because the word appears in different columns. So I need a solution for that.
How do I write such a removal command without mentioning the column number, i.e. so that it removes all lines containing that specific word irrespective of the column?
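In awk, a pattern without a field prefix is matched against the whole line, so `awk '!/specificword/' inputfile > outputfile` drops every line containing the word. The equivalent as a Python sketch, assuming a plain substring match as in the awk version (the function name is hypothetical):

```python
def drop_lines_with_word(lines, word):
    """Return only the lines that do not contain word anywhere."""
    return [line for line in lines if word not in line]


if __name__ == "__main__":
    with open("inputfile", encoding="utf-8") as src:
        kept = drop_lines_with_word(src, "specificword")
    with open("outputfile", "w", encoding="utf-8") as dst:
        dst.writelines(kept)
```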
View 10 Replies
Sep 6, 2010
I am creating my own address book program in Python and I want to create a function that removes specified entries. The code looks like this now.
Code:
def remove():
    delentry = raw_input('Enter the entry name to delete: ')
[code]...
View 1 Replies
Dec 16, 2010
Contained within each of these 67 text files is about 1 million urls. Yes. I have 67 text files that contain 1 million lines of urls each. I am sure I am swimming in duplicates. I tried opening one text file and clicking sort ----->remove duplicates. Now Gedit is not responding my processor is maxed out to 100% and I think I am finally ready to delve into some command line code. Can anyone give me idiot proof instructions on how to sort the duplicates out of each one of these 67 text files? How about no duplicates across all 67?
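For a single file, `sort -u` on the command line prints each distinct line once. For duplicates across all files, one option is the Python sketch below. It assumes the set of distinct URLs fits in memory, which may not hold for tens of millions of long URLs; the file pattern `urls_*.txt` and the output name `unique_urls.txt` are invented for the example.

```python
import glob


def unique_lines(line_iterables):
    """Yield each distinct non-empty line once, in first-seen order."""
    seen = set()
    for lines in line_iterables:
        for line in lines:
            url = line.strip()
            if url and url not in seen:
                seen.add(url)
                yield url


def read_lines(paths):
    """Open the files one after another and yield their line iterators."""
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as fh:
            yield fh


if __name__ == "__main__":
    with open("unique_urls.txt", "w", encoding="utf-8") as out:
        for url in unique_lines(read_lines(sorted(glob.glob("urls_*.txt")))):
            out.write(url + "\n")
```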
View 7 Replies
Jun 16, 2010
Are there any commands or scripts to remove only selected lines from the history file?
View 1 Replies
Jul 6, 2011
Does anyone have ideas on how to remove lone lines from a text file?
If I have a file that is like this:
-----------------------------------
line 1
[code]...
View 14 Replies
Apr 14, 2010
I have a big file of random numbers that I generated at some point and have since processed in various ways (how fun that was). I want to remove duplicate lines and I'm not sure I'm doing this right.
Here's the command:
Code:
sort random.txt | uniq -u > rand-shorter.txt
The file is pretty big, with every number on its own line. I found the command on a website, so I'm sure it's correct (I'm a bit of a newbie at the Linux command line).
Can anyone confirm whether this will remove duplicate lines (keeping one copy) and dump what is left in a file named rand-shorter.txt?
EDIT: I think it's actually working, just taking a really long time (on an old Pentium 4 from 2000).
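Worth noting: `uniq -u` prints only the lines that occur exactly once, so every copy of a duplicated line is discarded, whereas `sort -u` (or `sort | uniq`) keeps one copy of each. The Python sketch below mimics both behaviours to make the difference concrete; the function names are made up, and Python's string ordering may differ from `sort` under some locales.

```python
from collections import Counter


def like_uniq_u(lines):
    """Mimic `sort | uniq -u`: keep only lines that appear exactly once."""
    counts = Counter(lines)
    return sorted(line for line, n in counts.items() if n == 1)


def like_sort_u(lines):
    """Mimic `sort -u`: keep one copy of every distinct line."""
    return sorted(set(lines))


sample = ["3", "1", "3", "2"]
print(like_uniq_u(sample))  # ['1', '2']       (both copies of 3 are gone)
print(like_sort_u(sample))  # ['1', '2', '3']  (one copy of 3 is kept)
```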
View 8 Replies
Jun 5, 2009
I want to remove duplicate or multiple similar lines from multiple files. I.e., if I have four files (file1.txt, file2.txt, file3.txt and file4.txt), I would like to find and remove similar lines from all these files, keeping only one line from each group of similar lines. I only know that uniq can be used to remove similar lines from a sorted file.
View 9 Replies
Mar 17, 2011
Trying to remove lines from a syslog text file that have duplicate strings
Mar 10 06:51:11[http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360]
then a few lines down
Mar 10 06:52:03 [http-8080-1] INFO com.MYCOMPANY.webservices.userservice.web.UserServiceController [u:2533274802474744|360] Authorize [platformI$tformIdAndOs=2533274802474744|360, userRegion=America|360
These have the same u: number. I need to remove the duplicates and leave just one line per number; the file contains multiple duplicates for different u: numbers and is 14,000 lines long. Can anyone tell me whether I can use awk, sed, or sort for something like this, i.e. removing lines that contain a duplicate of a certain string?
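A hedged Python sketch for this: it keeps the first line seen for each u: number and passes through any line that has no such number. The regular expression assumes the identifier always appears as `[u:<digits>|`, as in the two sample lines; the names `USER_ID` and `first_per_user` are hypothetical.

```python
import re

# Assumption: the identifier always looks like "[u:2533274802474744|".
USER_ID = re.compile(r"\[u:(\d+)\|")


def first_per_user(lines):
    """Yield the first line for each u: number; keep lines without one."""
    seen = set()
    for line in lines:
        match = USER_ID.search(line)
        if match is None:
            yield line
            continue
        if match.group(1) not in seen:
            seen.add(match.group(1))
            yield line
```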
View 4 Replies
Jun 23, 2011
I'm trying to read a file in C++ and search for a particular character. For example, if this is the list that I have:
Alice
Bob
David
[code]....
If the input is D, it should give David; if it's B, it gives Bob. So in this case it reads the first character of every line. If possible, I want to make this dynamic so the user can specify which character position to look at; if they are looking for R at character index 3 in all lines, it should give Charlie. But the problem is, it does now recognize , besides, I do not know how to specify the character position in each line.
Here is my code:
Code:
#include <iostream>
#include <fstream>
#include <cstring>
[code]....
View 1 Replies
Jun 15, 2011
I need some software that will check .xml file and tell me which character is malformed in 'utf-8'. I am using perl for some parsing.
View 2 Replies
Sep 7, 2009
I have a script that looks like:
Code:
cat servers.txt
trivia:P:N
trivia:D:N
tucana:P:Y
[code]....
I want to find the lines that match my input and change the N to a Y, but only for the lines that match the name, not any other N's. My problem is that the line does not always contain a P; it can be a D as well, so my matching did not work. If my script is given the name $1=trivia, the lines will change to:
Code:
trivia:P:Y
trivia:D:Y
I have the following code so far but as you can see it does not change the D's
Code:
sed -i 's/trivia:P:Y/trivia:P:N/g' servers.txt
*** UPDATE ***
Should I be using a method as follows? I am still stuck on changing all instances, though.
Code:
$1=server
sed -i 's/$server1:P:Y/$server:P:N/g' server.txt
sed -i 's/$server1:D:Y/$server:D:N/g' server.txt
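Two things stand out in the attempts above: inside single quotes the shell does not expand `$server`, and `$1=server` is not a valid shell assignment. Leaving sed aside, the intended change can be sketched in Python, assuming every line has exactly three colon-separated fields (name, type, flag) and that N should become Y regardless of the type. The function name `enable_server` is made up.

```python
def enable_server(lines, name):
    """Set the flag to Y on every name:type:N line for the given server."""
    updated = []
    for line in lines:
        body = line.rstrip("\n")
        ending = line[len(body):]  # preserve the original line ending
        fields = body.split(":")
        if len(fields) == 3 and fields[0] == name and fields[2] == "N":
            fields[2] = "Y"
            body = ":".join(fields)
        updated.append(body + ending)
    return updated
```

Matching on the first field rather than on the whole `name:P:N` string is what makes the P and D lines behave the same way.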
View 7 Replies
Sep 17, 2009
I am trying to delete lines of a file if they contain text that is present on another file. For example
> cat one.txt:
a
b
c
d
[code]....
I get the following output:
> ./test.sh one.txt two.txt
a
b
d
e
[code]....
View 6 Replies