Programming :: Check For Word Frequency In 1 Column
Jul 6, 2010
I have a large tab-delimited text file, about 17 GB. It has only 6 columns. Column 4 contains only numbers, ranging from 1 to 1000. I want to count how many times each number occurred, so the output should have 2 columns: the first is the number, the second is how many times it occurred. I tried
head -n 1000 coverage | cut -f 4 | uniq -c
Didn't work for me, the first column returned is not unique.
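A likely reason for the duplicates is that `uniq -c` only merges adjacent identical lines, so the input has to be sorted (or counted some other way) first. Here is a minimal sketch that counts with awk instead, assuming the file really is tab-delimited; the function name `count_col4` is made up for illustration.

```shell
#!/bin/sh
# count_col4 (hypothetical name): read tab-delimited text on stdin and
# print "value<TAB>count" for column 4, sorted numerically by value.
count_col4() {
    awk -F'\t' '{ count[$4]++ } END { for (v in count) print v "\t" count[v] }' |
        sort -n
}

# The classic pipeline gives the same counts; sort must run before uniq:
#   cut -f 4 coverage | sort -n | uniq -c
```

It would be run as `count_col4 < coverage`. Since there are at most about 1000 distinct values, the awk array stays small even for a 17 GB file, and the file is read only once.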
I'm building a script for my place of employment. The next step is checking what the user entered and determining whether they included a certain part. The script prompts for a hostname. Hostnames have the form localhost.localdomain. I want the script to check whether they included localdomain and, if they did, to write just what they entered to /etc/sysconfig/network without adding the domain again. So say the user inputs:
I would like to make a file with all these data in one column, like
a1 a2 . .
[code]....
Can it be done with awk or some other command? Also, is it possible to then add another column in front of this one with the line numbers (for every previous column), like
Well, I am facing one issue: how can I read two files word by word at the same time using a loop? I need a word-by-word comparison in a shell script. Please let me know the pseudocode.
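One way to approach this is to split each file into one word per line and then read both streams in step on separate file descriptors. This is only a sketch under assumptions: "word" is taken to mean any whitespace-separated token, and the names `split_words` and `compare_words` are hypothetical.

```shell
#!/bin/sh
# split_words (hypothetical helper): print each whitespace-separated word
# of a file on its own line.
split_words() {
    awk '{ for (i = 1; i <= NF; i++) print $i }' "$1"
}

# compare_words (hypothetical name): read two files one word at a time,
# in step, and report each position where the words differ.
compare_words() {
    words_a=$(mktemp) || return 1
    words_b=$(mktemp) || return 1
    split_words "$1" > "$words_a"
    split_words "$2" > "$words_b"
    pos=0
    # File descriptors 3 and 4 keep the two streams independent.
    while read -r word_a <&3 && read -r word_b <&4; do
        pos=$((pos + 1))
        if [ "$word_a" != "$word_b" ]; then
            echo "word $pos differs: $word_a vs $word_b"
        fi
    done 3< "$words_a" 4< "$words_b"
    rm -f "$words_a" "$words_b"
}
```

The loop stops as soon as either file runs out of words, so extra words at the end of the longer file are not reported. In bash, process substitution could replace the temporary files.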
I am pretty new to bash scripting...I am trying to write a script that will take an input and read it word for word and then DO something with it like echo. I have been able to find how to read word for word from a file but I don't know how to do it with input.
I was looking for something like
Code:
exit 0
The input would be A-Z a-z 0-9 and have a single space between each word.
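Reading words from input rather than from a file can be sketched as below. It assumes the input arrives on standard input (typed at the keyboard or piped in); the function name `echo_words` is made up, and `echo` stands in for whatever should be done with each word.

```shell
#!/bin/sh
# echo_words (hypothetical name): read lines from standard input and
# handle each space-separated word in turn; here the action is just echo.
echo_words() {
    # The extra test keeps a final line that lacks a trailing newline.
    while read -r line || [ -n "$line" ]; do
        # Unquoted $line is split on whitespace into separate words.
        # set -f turns off filename globbing while that happens.
        set -f
        for word in $line; do
            echo "$word"
        done
        set +f
    done
}
```

Calling `echo_words` with no redirection reads from the terminal until Ctrl-D; `echo "some typed input" | echo_words` feeds it a single line.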
I had it in mind that Ubuntu ran disk checks every 30 boots, but mine are more frequent - running between 10 & 25, which is an irritation. Records show checks after: 12-21-10-20-10-20-13-25-16-21 boots. Should I worry about either the frequency or the variability? I found threads suggesting how to change the frequency using tune2fs, so I suppose I can try that to stretch the interval to maybe 50 or weekly? Will it have any effect, since there is so much variation already? Is there a GUI for setting this frequency, instead of fiddling in terminal?
Is there any software for Linux that reports audio quality (frequency, bits/s, etc.) and is designed specifically for audio quality features? I have tried Themonospot, but it only handles video formats. I want software for audio formats only.
The problem is simple, but I can't figure out how to solve it. I have tried every way I know, with no result. I'm using a simple Perl script with DBI that does a select from one table and an update in another table with the results of the select, but I can't preserve the '\n' returned from the select when doing the update. I simply want the '\n' from the first table to be '\n' in the second, but Postgres turns them into real new lines. I tried escaping the '\n' in several ways, including putting E (I mean E'value here') in front of the value being updated, but they are always real new lines, not '\n', in the new table.
I've got a bit of an obscure question for you to test your brains a wee bit. I'm trying to implement a search program to find areas of high density in a binary string.
Density is the number of 1s divided by the number of digits, with the maximum number of digits being the current size of the buffer (50 in this example). So for the example, the density of the whole buffer is 15/50, but the density of Buffer[14..20]=[1110001]=4/7. When looking for areas of density = 1/3, it would find the longest sequences of density over 1/3. So in the example, Buffer[4..9]=[100101]=3/6=1/2, which is above 1/3, but it lies within Buffer[4..48]=[100101000011100010000001000100100001001011001]=15/45=1/3
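For a buffer this small, a brute-force search over every window is a reasonable starting point. The sketch below is one possible reading of the problem, not a definitive answer: it treats a window as qualifying when its density is at least the threshold (the example above counts 15/45 as matching 1/3), it returns only the first longest window, and the name `longest_dense` and its arguments are hypothetical.

```shell
#!/bin/sh
# longest_dense (hypothetical name): brute-force search for the longest
# window of a 0/1 string whose density (ones / length) is at least num/den.
# Usage: longest_dense BITS NUM DEN
# Prints "start length" with a 1-based start, or nothing if no window fits.
longest_dense() {
    printf '%s\n' "$1" | awk -v num="$2" -v den="$3" '{
        n = length($0)
        # ones[i] = number of 1s among the first i digits (prefix sums).
        ones[0] = 0
        for (i = 1; i <= n; i++)
            ones[i] = ones[i - 1] + (substr($0, i, 1) == "1")
        best_len = 0
        for (start = 1; start <= n; start++) {
            for (end = n; end >= start; end--) {
                len = end - start + 1
                if (len <= best_len) break    # cannot beat the current best
                # Integer test for ones/len >= num/den, avoiding fractions.
                if ((ones[end] - ones[start - 1]) * den >= num * len) {
                    best_len = len
                    best_start = start
                    break
                }
            }
        }
        if (best_len > 0) print best_start, best_len
    }'
}
```

Run on the 45-digit string from the example with `longest_dense 100101000011100010000001000100100001001011001 1 3`, it reports the whole string, since its density is exactly 15/45. The search is quadratic in the buffer length, which is fine for 50 digits but would need rethinking for long streams.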
#!/bin/bash
ls -lhGg | while read line; do echo "$line"; done | awk ' { print $3" "$6 } '
What I want to do is print column 3 and every column greater than 5, through to the end of the line, since different filenames can contain different numbers of words and whitespace is the separator. My current code works fine only if the filename contains no spaces.
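In awk, the usual way to get "this field through the last one" is a loop up to `NF`. A minimal sketch, with the hypothetical name `col3_and_rest`, and assuming that collapsing runs of spaces inside a filename to a single space is acceptable:

```shell
#!/bin/sh
# col3_and_rest (hypothetical name): print field 3, then fields 6 through
# the last one, so names containing spaces are kept whole.
# Lines with fewer than 6 fields (such as the "total" line of ls -l)
# are skipped.
col3_and_rest() {
    awk 'NF >= 6 {
        out = $3
        for (i = 6; i <= NF; i++)
            out = out " " $i
        print out
    }'
}
```

It would be used as `ls -lhGg | col3_and_rest`; the `while read` loop in the original pipeline only copies lines through and can be dropped.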
I have a text field that is just a list of servers, and I need to add the word hostname in front of each of them. It must be a brain fart, but I can't think of how to do this. Basically I need this:
With tr '''' < file I can turn all columns into separate rows, but as you see, x3 and x4 have to be grouped when transposing. Or should I use awk for this one?
I've been using an awk script to calculate my data. I have 3 files:
file a1.txt:
2 3 4
[code]....
The results were (3.5, 6 and 3), which is pretty easy. Now I want to combine all of this into 1 file called avg.txt, with each result in a different column, so that it looks something like this in the end:
I'm having problems adding up column totals using arrays. I've got it to add up the row totals and display at the end of the row. Here is my code so far
Code:
#include <stdio.h>
#include <string.h>
const int maxrows=10;
[code]...
What I need it to do is add up the columns and display each total at the bottom of its column, similar to how the row totals display.
I need to extract the info from the RC column for the first 4 players of Liverpool. The test code I have does that, but can anyone show me a better way of doing it? I could do it easily with gawk -F"|" and print the respective column, but I need to do this in Perl.
I have a huge (over 10 gb) file with a list of IP's each followed by a corresponding number like this:
Code:
12.32.34.23 10
143.32.34.543 11
232.32.45.65 12
54.23.5.232 13
143.32.34.43 14
and so on..
I'm trying to sort this file numerically and weed out any duplicate IP addresses. How do I do this in bash? I have come up with this, but obviously it doesn't work.
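One possible approach is to group the lines by IP with `sort`, keep a single line per IP, and then sort what is left by the number. The sketch below makes two assumptions the question does not spell out: "numerically" means ordering by the second column, and when an IP appears more than once the line with the smallest number is the one to keep. The name `dedupe_by_ip` is hypothetical.

```shell
#!/bin/sh
# dedupe_by_ip (hypothetical name): read "IP number" lines on stdin, keep
# one line per IP (the one with the smallest number), and print the result
# ordered by number.
dedupe_by_ip() {
    # 1) Group identical IPs together, smallest number first within a group.
    # 2) Keep only the first line of each group.
    # 3) Order the survivors by the number in column 2.
    LC_ALL=C sort -k1,1 -k2,2n |
        awk '$1 != prev { print; prev = $1 }' |
        LC_ALL=C sort -k2,2n
}
```

It would be run as `dedupe_by_ip < ips.txt > sorted.txt`, where both file names are placeholders. Because awk only remembers the previous IP, memory use stays flat; `sort` spills to temporary files on its own for inputs larger than memory, so a file over 10 GB mainly needs enough free disk space in the temporary directory.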
I just started programming in PHP so I haven't figured out how to do this yet, but I have a multi-dimensional array that I need to sort by one column. That's fine...but I need the sort to ignore case! Right now I have it sorted by 'name' (the other column is 'uid').
The problem is that by default the sort is case-sensitive, so the array looks like this:
Code:
Apple 4015
Banana 4011
Cherry 4045
avocado 4046
I want to be able to sort the 'name' column in a case-insensitive manner so that the array actually looks like:
Code:
Apple 4015
avocado 4046
Banana 4011
Cherry 4045
How to accomplish this? Just FYI I'm not actually sorting the PLUs for fruits...but it was a simple example. I'm actually doing this for a Facebook application.