Programming :: Script To Delete Aligned Single-character Columns With No Field Separator?
Apr 20, 2010
The lines beginning with greater-than symbols are the sequence descriptors, and the lines immediately after each descriptor, with A-Z characters, dashes, and question marks, are the aligned DNA sequences. The sequences are always the same length within a file and never span/wrap across more than one line. I am trying to write a script to remove positions in the sequences that are only represented by a -, X, ?, or N (these represent gaps or missing data). Also, if there is exactly one non-gap/missing character in a position, it is also useless (there is nothing to compare it to), so I would like to remove those positions as well.
Position 5 (from the left) was removed because it was all gap/missing characters. Position 9 was removed because only one character was a non-gap/missing character. Position 10 was retained because there were 2 non-gap/missing characters. I'm really not sure where to start here. My first concern is that I can't figure out how to tell awk to treat each character in lines not containing a greater-than symbol as a separate field. After that, I'm thinking I should set up a counter to count the number of lines with gap/missing characters at each position and compare that to the total number of lines not containing greater-than signs?
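The filtering rule described above (drop a position unless at least two sequences carry a real character there) can be sketched in Python rather than awk. This is only a sketch under assumptions: the function name is made up for illustration, lines are already stripped of newlines, gap symbols are uppercase, and every sequence line has the same length, as the question states.

```python
GAP_CHARS = set("-X?N")  # gap/missing symbols listed in the question (uppercase assumed)

def drop_uninformative_columns(lines):
    """Remove positions where fewer than two sequences have a non-gap character.

    `lines` is a list of strings without trailing newlines; descriptor lines
    start with ">" and are passed through unchanged.
    """
    seq_rows = [ln for ln in lines if not ln.startswith(">")]
    if not seq_rows:
        return list(lines)
    length = len(seq_rows[0])  # all sequences are assumed to share this length
    # A position is kept only if two or more sequences have data there.
    keep = [
        i for i in range(length)
        if sum(1 for seq in seq_rows if seq[i] not in GAP_CHARS) >= 2
    ]
    result = []
    for ln in lines:
        if ln.startswith(">"):
            result.append(ln)
        else:
            result.append("".join(ln[i] for i in keep))
    return result
```

Indexing a string by position sidesteps the field-separator problem entirely, since each character is already addressable on its own.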
I'm trying to display fields from flat files where the first 8 fields are always the same. Fields 9 - n are varied but will contain specific patterns I'm after. I'm using this so far because "mySearch" is on each line I want to examine.
Code:
How would you pattern match and include 2 additional fields above field $9 but change field position from line to line?
I am creating a game with random variables. In the game I have created a dialogue exchange to players. I have set up a table with various returns and I inserted {$fields} to represent various random variables. When I call on the requested fields, I only see the field text and my field names. Am I supposed to parse something and call it back another way?
ie: myfield is: "You have won {$random1} silver! <br />{$wi['gender'] majesty rewards you well." the code I am using to call that field is:
I use a network with 11.2 without problems. Now I have installed 11.3 on another HDD in the same PC, and in YaST2 network I get the message "eth0 not aligned", and the same for wlan. My driver is Realtek r8192s_usb. What can I do?
I am unable to write a simple Makefile. Though I know the concept, I am facing this error: Makefile:2: *** missing separator (did you mean TAB instead of 8 spaces?). Stop. Should I use a tab or spaces? I am not able to continue.
I tried to find out how awk works with multiline strings. I found this. I hope it will be useful for somebody. 1. I know that awk can search for simple patterns like '/^one/'
Code: s="one two three" echo $s | awk '/^one/'
2. I know that "Awk can handle multiline records by specifying the field separator to be a newline and set the record separator to an empty string." I've found it here
I am running RHEL 5, and when I hit the backspace key it deletes a section to the left (such as /usr/localbin) instead of just a single character to the left. How do I change that? So it deletes the section instead of just the single character n.
Can anyone offer a code snippet to recursively go through directories and replace any single or double quotes found in a filename with another character (e.g. "_")? If any of the filenames contain a single quote or double quote, then replace it with an underscore.
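One possible way to do the recursive rename is with Python's `os.walk`. This is a rough sketch, not a tested production tool: the function names are invented for illustration, only file names (not directory names) are changed, and a file is skipped if the sanitized name already exists.

```python
import os

def sanitize_name(name, replacement="_"):
    """Replace single and double quotes in a name with `replacement`."""
    return name.replace("'", replacement).replace('"', replacement)

def rename_quoted(root):
    """Rename files under `root` whose names contain quotes.

    Returns a list of (old_path, new_path) pairs for the renames performed.
    """
    renamed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            new_name = sanitize_name(name)
            if new_name == name:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, new_name)
            if os.path.exists(dst):
                continue  # do not overwrite an existing file
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

Calling `rename_quoted(".")` would process the current directory tree; it is worth trying it on a copy of the data first.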
I want to write a function which calculates the space needed between fields, to generate a table with aligned fields, like when you type "ls -l", the operating system generates a table with beautifully aligned fields. I've got this code so far:
Code:
for line in $(cat tmpSearch) do line=`echo $line | tr ":" " "`
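The spacing calculation asked about here comes down to finding the widest cell in each column and padding every cell to that width. A minimal Python sketch of that idea follows; the function names and the two-space gap are assumptions for illustration, not part of the original shell script.

```python
def column_widths(rows):
    """Return the maximum cell length found in each column."""
    widths = []
    for row in rows:
        for i, cell in enumerate(row):
            if i == len(widths):
                widths.append(0)  # first time this column index is seen
            widths[i] = max(widths[i], len(cell))
    return widths

def format_table(rows, gap=2):
    """Left-justify every cell to its column width and join with `gap` spaces."""
    widths = column_widths(rows)
    return [
        (" " * gap).join(cell.ljust(widths[i]) for i, cell in enumerate(row)).rstrip()
        for row in rows
    ]
```

Rows could be produced from colon-separated input with `line.split(":")`, mirroring the `tr ":" " "` step in the shell loop.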
I increase my knowledge in vim in two ways. Little hints about doing this or that and scattered studies using the vim help files. Please do not believe I always rely on the first one.
I am calling malloc and getting a chunk of dynamic memory. Now I want to align that memory to 64KB. This means that the address of the memory starts at 64KB or a multiple of 64KB.
I need some help to solve this. If I have CREATE TABLE "HALOOO" on one line, and after this line there is "BRANCH INFO", how do I use the (") that is in the CREATE TABLE line and not affect the other line?
We saw some time ago various tricks to remove the carriage-return character from MS-DOS text files on Linux. Here is a new trick to do this directly from the vim editor. To convert a file opened with vim to UNIX format, simply use the following command code...
I have a question about UNIX sockets. My goal is to connect multiple sockets from a single client to a single server and keep them open... I'm not sure whether that is possible or not. Do you have any suggestion or an example of code?
I've had a very colorful morning learning the inner parts of Linux's sort command, and have come across yet another issue that I can't seem to find an answer for in the documentation. I'm currently using -t, to indicate that my fields are split by the comma character, but I'm finding that in some of my files, the comma is used (between double-quotes) within values:
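The arithmetic behind the alignment can be illustrated with plain integers. The sketch below is in Python only to show the rounding step; `align_up` is a hypothetical helper name. In C, the usual approach would be to over-allocate by at least the alignment minus one byte and round the returned address up (keeping the original pointer for `free`), or to use `posix_memalign` where available.

```python
ALIGNMENT = 64 * 1024  # 64 KB; the mask trick below requires a power of two

def align_up(address, alignment=ALIGNMENT):
    """Round `address` up to the next multiple of `alignment`."""
    return (address + alignment - 1) & ~(alignment - 1)
```

An address that is already a multiple of 64 KB is returned unchanged; anything else moves up to the next boundary.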
Jonathan Sampson,,foo@bar.com,0987654321
"Foobar CEO,","CEO,",ceo@foobar.com,,
How can I use a comma to terminate my fields, but ignore the occurrences of it within values? Is this fairly simple, or do I need to re-export all of my data using a more-foreign field-terminator? (Unfortunately, I do not have any control over declaring a different terminator with this particular project).
How can I add columns to the right of GtkTreeView? How can I add the menu to the right of the window? How can I change the position of the icon in the GnomeMessageBox to the right of the dialog? And how can I change the arrangement of the buttons from right to left in GnomeMessageBox, and the position of the icon on the buttons in the GnomeMessageBox?
I have a file that contains a couple of email addresses and I want to extract the usernames (letters before the @ symbol). How can I do that using sed/awk?
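As far as I know, `sort -t,` has no notion of quoting, so one workaround is to parse the file with a CSV-aware tool before sorting. The following Python sketch uses the standard `csv` module instead of `sort`; the function name and the choice of sort column are assumptions for illustration.

```python
import csv
import io

def sort_csv_text(text, key_column=0):
    """Sort CSV records by one column, honouring commas inside double quotes."""
    rows = list(csv.reader(io.StringIO(text)))
    rows.sort(key=lambda row: row[key_column])
    out = io.StringIO()
    # The writer re-quotes any field that contains a comma.
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

With the two sample records, the quoted value "Foobar CEO," is treated as a single field, so the record is compared on its full first field rather than being split at the embedded comma.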
I know cut will work, but the current environment doesn't allow me to use cut command. I can use either awk or sed.
Say I have a text file with 10 columns. I need to reorder them based on a list of column numbers.
My problem is this:
If I want to cut out 5 columns (columns 1,2,3,9,10) in the order 1,10,2,9,3 then I have tried using:
Code: cut -f1,10,2,9,3 my_file.txt > reordered_file.txt
But this just extracts the columns in order as if I used:
Code: cut -f1,2,3,9,10 my_file.txt > reordered_file.txt
How can I cut these columns and place them into the new file in the order I specify?
While this might seem quite trivial, I will actually need to do this for a file containing ~14000 columns with ~12000 columns that I need to extract in a particular order.
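`cut` always emits fields in their original file order, regardless of the order given to `-f`, so the reordering has to be done by a tool that indexes fields explicitly. A small Python sketch of that approach is below; the function name is hypothetical, and tab-separated input is assumed because that is the default delimiter for `cut -f`.

```python
def reorder_columns(line, order, sep="\t"):
    """Return the fields of `line` in the given 1-based `order`."""
    fields = line.rstrip("\n").split(sep)
    return sep.join(fields[i - 1] for i in order)
```

For the large case, the list of roughly 12000 column numbers could be read from a file into `order` and the function applied to each line of the input in turn.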
I have a folder with only 24 files named <number>.dat (i.e. 4.dat, 6.dat and so on) where <number> is between 0 and 256. Each file has just two columns of data and nothing else.
I'm trying to combine all the second columns ($2) together. I've been fiddling around with getline and so far have
which takes file 4.dat and adds $2 from 6.dat, but I want a single command to take each $2 from every file and add them to (for example) 4.dat (having $1 from 4.dat is no problem). A command that takes every file in the folder and grabs $2 and places them in a common file would be ideal. Frankly I can work around if you combine both columns from every file.
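A single pass over all the files can collect each second column and paste them side by side. The Python sketch below is one way to do it, with several assumptions: columns are whitespace-separated, every file has the same number of rows, the first column is taken from the first file in the list, and the function names are invented for illustration.

```python
import os

def numeric_key(path):
    """Sort key so that 4.dat comes before 16.dat (numeric, not alphabetical)."""
    return int(os.path.splitext(os.path.basename(path))[0])

def combine_second_columns(paths):
    """Return lines holding $1 from the first file followed by $2 from every file."""
    if not paths:
        return []
    first_column = None
    columns = []
    for path in paths:
        with open(path) as handle:
            rows = [line.split() for line in handle if line.strip()]
        if first_column is None:
            first_column = [row[0] for row in rows]
        columns.append([row[1] for row in rows])
    # zip stops at the shortest column; equal row counts are assumed.
    return [
        " ".join([x] + list(ys))
        for x, ys in zip(first_column, zip(*columns))
    ]
```

The file list could be built with `sorted(glob.glob("*.dat"), key=numeric_key)` after importing `glob`.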
I bet this is a Perl one-liner (or very simple Python script). I have a tab-separated file in which each row looks like: Unique_Eight_Character_Sequence [3 tabs] data1~moredata1~moredata1 [3 tabs] data2~ moredata2~ moredata2 ... dataN~. The output file should have each column converted into a row (with the unique character sequence copied in for the first column), and then each "~" replaced by a comma.
Is there any way to filter the output of a command based on the values in the output columns? For example, I execute du -h on a directory with many files. Now I want to filter the output based on the size (i.e. M or G or K). The filtered output should contain only M (megabytes) or G (gigabytes) entries, and also all columns.
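The transformation described here can be sketched in a few lines of Python. The exact output layout is not specified in the question, so this sketch assumes the unique sequence and the data are joined by a comma; the function name is also made up for illustration.

```python
def expand_row(line, sep="\t\t\t"):
    """Turn one input row into one output row per data column.

    Each output row starts with the unique sequence, and every "~" in the
    data column is replaced by a comma.
    """
    key, *groups = line.rstrip("\n").split(sep)
    return [key + "," + group.replace("~", ",") for group in groups if group]
```

Applying `expand_row` to each line of the input file and writing the returned rows out in order would produce the converted file.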
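Since `du -h` puts the human-readable size in the first column, the filter only needs to look at the last character of that column. A minimal Python sketch follows; the function name is hypothetical, and it assumes sizes end in a single unit letter such as K, M, or G.

```python
def filter_by_unit(lines, units=("M", "G")):
    """Keep only lines whose first column ends with one of `units`."""
    kept = []
    for line in lines:
        fields = line.split(None, 1)  # size column, then the rest of the line
        if fields and fields[0][-1:] in units:
            kept.append(line)  # the whole line is kept, so no columns are lost
    return kept
```

The input lines could come from `subprocess.run(["du", "-h", path], capture_output=True, text=True).stdout.splitlines()`.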
I have a Perl script that has two arrays - they are related. I would like to print out the contents into two columns next to each other.
#!/usr/bin/perl open(PINGFILE, "</home/casper/pingdata.txt") or die " can not open file "; my @totalfile=<PINGFILE>; foreach $string(@totalfile) { if ($string =~ m/(^1sping)(?=.*max)/) { push(@usecstring,"$string");
I have an array with 15 elements, and I want to break it down into three columns. The array is split into groups of three elements; however, on iteration, the output does not conform to that structure.
I've been hitting my head against a wall for a while with this one: As the last part of some data analysis I am performing, I would like to construct a matrix from a series of different files. These files have the format:
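Breaking a flat list into rows of three is a matter of slicing at a fixed step. The Python sketch below shows the idea rather than a fix for the original code, which is not shown; it assumes a row-major layout (elements 1 to 3 on the first row, and so on) and uses a hypothetical function name.

```python
def in_columns(items, ncols=3):
    """Split `items` into consecutive rows of `ncols` elements each."""
    return [items[i:i + ncols] for i in range(0, len(items), ncols)]
```

With 15 elements this yields five rows of three; if the length is not a multiple of three, the last row is simply shorter.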