Programming :: HUGE Files - Compare A List Of Patterns From One File And Grep Them Against Another File And Print Out Only The Unique Patterns?
Aug 13, 2010
I am trying to compare a list of patterns from one file and grep them against another file and print out only the unique patterns. Unfortunately, the files are so large that the command has yet to run to completion. Here's the command that I used:
Code:
grep -L -f file_one.txt file_two.txt > output.output
Here's some example data:
Code:
>FQ4HLCS01BMR4N
>FQ4HLCS01BZNV6
>FQ4HLCS01B40PB
>FQ4HLCS01BT43K
>FQ4HLCS01CB736
>FQ4HLCS01BU3UM
>FQ4HLCS01BBIFQ
How can I increase efficiency, or is there another command I should use?
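One possible approach to the question above, offered as a sketch rather than a definitive answer: `grep -L` lists files without matches, not patterns, so it cannot print the unmatched IDs. The function below assumes each ID occupies a whole line in both files and that the second file's lines fit in memory; the function name `unique_patterns` is made up for illustration.

```shell
# Sketch: print lines of the pattern list that never appear as a whole line
# in the other file. Assumes exact, whole-line IDs (not regular expressions).
unique_patterns() {
    # $1 = pattern list, $2 = file to check against
    awk '
        FILENAME == ARGV[1] { seen[$0] = 1; next }   # first file read: remember every line
        !($0 in seen)                                # second file read: print unseen lines
    ' "$2" "$1"
}

# Usage with the file names from the post:
# unique_patterns file_one.txt file_two.txt > output.output
```

This reads each file once, instead of testing every pattern against every line. If the IDs are embedded in longer lines, `grep -F -f` with fixed strings is the usual alternative.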
View 14 Replies
Aug 10, 2010
I am interested in using the grep method in the shell of my CentOS machine to obtain patterns from a file and use them to search through another file and highlight the patterns found. For example:
pattern file:
one
two
three
test file:
AAAAAAAAAAAAAAAAAAAAAoneAAAAAAAAAAAAAAAAthreeAAAAAAAAAAAA
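A minimal sketch for the highlighting question, assuming GNU grep (as shipped with CentOS) and a pattern file with no blank lines, since a blank pattern matches every line. The function names are hypothetical.

```shell
# Sketch: highlight every fixed-string pattern from a pattern file.
highlight_patterns() {
    # $1 = pattern file (one pattern per line), $2 = file to search
    grep --color=always -F -f "$1" -- "$2"
}

# Variant: list only the matched parts, one per line.
matched_parts() {
    grep -o -F -f "$1" -- "$2"
}
```

With the sample files from the post, `matched_parts` would be expected to list `one` and `three`, since `two` does not occur in the test line.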
View 8 Replies
Jun 9, 2010
I want to traverse a directory and get a list of files that contain a set of patterns. I assumed I could use grep for this, but I am having trouble getting grep to only return files that match ALL patterns. Here's what I've come up with so far:
Code:
grep --recursive --file=searchpatterns.txt --files-with-matches somedirectory/*
However, this gives me a list of files that match ANY of the patterns in the searchpatterns.txt file. I want to match ALL of the patterns. I've looked through the man page, but can't find anything that allows me to change the "OR" to "AND" for multiple patterns.
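As far as I know grep has no single option that turns the "OR" into "AND", so one workaround is to test each file against each pattern in turn. The sketch below assumes file names without embedded newlines; `matches_all` and `files_matching_all` are illustrative names, not existing tools.

```shell
# Sketch: list files under a directory that match every pattern in a file.
matches_all() (
    # $1 = file to test, $2 = pattern file (one regex per line)
    while IFS= read -r pat; do
        [ -z "$pat" ] && continue            # skip blank lines
        grep -q -e "$pat" -- "$1" || exit 1  # one missing pattern rejects the file
    done < "$2"
    exit 0
)

files_matching_all() {
    # $1 = pattern file, $2 = directory to search
    find "$2" -type f | while IFS= read -r f; do
        if matches_all "$f" "$1"; then printf '%s\n' "$f"; fi
    done
}

# Usage with the names from the post:
# files_matching_all searchpatterns.txt somedirectory
```

Each file is scanned once per pattern in the worst case, which is acceptable for a handful of patterns but slow for very long pattern lists.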
View 5 Replies
May 4, 2011
I have a file with wildcard (joker character) patterns:
./include/*
./src/*
etc.
From the current directory I would like to recursively get the list of files that do not match these patterns.
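One way to approach this, sketched under the assumption that every line of the pattern file is a shell-style pattern written relative to `.` exactly as in the example: turn each pattern into a `! -path` test for find. The name `list_unmatched` is hypothetical.

```shell
# Sketch: list files below the current directory that match none of the
# shell-style patterns stored one per line in a file.
list_unmatched() (
    patfile=$1
    set --                                   # reuse "$@" as the find argument list
    while IFS= read -r pat; do
        if [ -n "$pat" ]; then
            set -- "$@" ! -path "$pat"       # exclude paths matching this pattern
        fi
    done < "$patfile"
    find . -type f "$@"
)
```

In `find -path`, `*` also matches `/`, so `./src/*` excludes everything below `./src`, including nested subdirectories.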
View 2 Replies
Dec 13, 2010
I have to write a script that searches for IP addresses in a given directory.
Below is my command.
Code:
grep -HwrnI --exclude=*.log '[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}' *|grep -v '/.svn/'
I have to exclude the following from the search results.
1. Comments
a. Can start with /, * or #...
b. Can be in the middle of a line
EX: some text... #comment1
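Two details in the command above are worth noting: without `-E`, grep treats `{1,3}` as literal text, and an unescaped `.` matches any character. A rough sketch for a single file follows, assuming comments are either whole lines starting with `/` or `*`, or trailing `#` text. It would also hide lines that legitimately begin with `/`, such as absolute paths. The name `find_ips` is made up.

```shell
# Sketch: report IPv4-looking strings in one file, ignoring simple comments.
find_ips() {
    # $1 = file to scan
    # 1st sed rule: blank out lines whose first non-blank character is / or *
    # 2nd sed rule: drop everything from the first # to the end of the line
    # Lines are blanked rather than deleted so grep -n reports true line numbers.
    sed -e 's,^[[:space:]]*[/*].*$,,' -e 's,[[:space:]]*#.*$,,' "$1" |
        grep -nE '([0-9]{1,3}\.){3}[0-9]{1,3}'
}
```

The pattern only checks the shape of an address, so values such as 999.1.1.1 would still be reported.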
View 9 Replies
Apr 11, 2011
I have a file that goes like this:
I need to grep the lines between pattern 1 and pattern 2 and not the lines following pattern 2. Cannot use grep -A(num), as there are varying number of lines following pattern 1. Also, used awk one-liners, but results are erroneous.
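Since the sample file is not shown, the following is only a sketch: it prints the lines strictly between a start marker and an end marker, however many there are. The function name and parameter order are assumptions.

```shell
# Sketch: print the lines between two markers, excluding the markers.
lines_between() {
    # $1 = start regex, $2 = end regex, $3 = file
    awk -v start="$1" -v stop="$2" '
        $0 ~ stop  { inside = 0 }    # end marker: stop printing (marker excluded)
        inside     { print }
        $0 ~ start { inside = 1 }    # start marker: begin printing on the next line
    ' "$3"
}
```

If the marker lines themselves should be included, `sed -n '/pattern 1/,/pattern 2/p' file` is the shorter form.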
View 14 Replies
Jul 29, 2011
Awk varying patterns to different file?
[Code]...
What I want to do is when the records have identical $3 i.e. same gene:blabla, I want to put them in a file with $3.out (P.S. along with the lines below it) I tried grepping out $3 first separately onto a file, and then taking each line in that file as a pattern and pulling out records using awk. Somehow I faced probs with pulling out onto $3.out
View 6 Replies
Nov 22, 2010
I want to replace a pattern with another pattern in a text file. The same pattern appears twice, but I need to change only the second occurrence. EG:
Text file is
aaaa=1
bbbb=2
cccc=3
dddd=4
[code]....
Now I want to change aaaa=x into some other entry.
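A possible sketch, assuming the file holds `key=value` lines and that "second occurrence" means the second line starting with the same key. The function name `replace_second` is hypothetical, and the result goes to standard output rather than back into the file.

```shell
# Sketch: replace only the second line that starts with "key=".
replace_second() {
    # $1 = file, $2 = key (text before "="), $3 = replacement line
    awk -v key="$2" -v new="$3" '
        index($0, key "=") == 1 && ++n == 2 { $0 = new }   # second line starting with key=
        { print }
    ' "$1"
}

# Usage: replace_second settings.txt aaaa 'aaaa=x' > settings.new
```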
View 3 Replies
Jul 28, 2010
I am trying to delete any blank lines within two patterns e.g.
Address: 53 HIGH STREET Cred Id :
MYTOWN
MYCOUNTY
MM12 6MM
Pay Method : Crossed Cheque
The start of my pattern is "Cred Id" and the end is "Pay Method" and I want to delete the blank lines between county and post code. I did find the code below but it doesn't seem to change anything:
sed -ne '/Cred Id/,/Pay Method/!bp' -e '/^$/b' -e -e p ll.out
I can get it to print just the range I'm interested in by doing sed -ne '/Cred Id/,/Pay Method/p'.
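One way this could be written, as a sketch: restrict a delete command to the address range and remove only empty (or whitespace-only) lines inside it. The function name `squeeze_block` is made up; the file name in the usage line is the one from the post.

```shell
# Sketch: delete blank lines only between "Cred Id" and "Pay Method".
squeeze_block() {
    # $1 = file; prints the result to standard output
    sed -e '/Cred Id/,/Pay Method/{' -e '/^[[:space:]]*$/d' -e '}' "$1"
}

# Usage: squeeze_block ll.out > ll.clean
```

Blank lines outside the range are left untouched, because the `d` command sits inside the braces that belong to the range.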
View 5 Replies
Jan 20, 2009
I'm having a small issue with regex matching in Perl. I'm pretty certain it's a simple fix, but it all looks correct to me...
If I run the following:
Code:
It prints out all the lines containing a 'P', as one would expect. But when the regex is
Code:
I get zero lines printed. It seems to match only single-character patterns.
The file I'm reading is: (It has the same effect whether I leave it with Windows linebreaks or convert them to unix).
Code:
View 3 Replies
Apr 4, 2011
I remember using some dos program that would scan text files for particular words/patterns. It had an ncurses-like interface and if I'm not wrong its name was "Concord". Is there anything like that on linux? The main functionality was as follows:
1. support for regular expressions
2. print lines containing a particular word or pattern (highlighting it) and printing the surrounding lines.
3. print lines containing a particular word or pattern only if another pattern occurs within N words to the left/right.
The second point is easy to achieve in grep. The 3rd one could be done in awk. The problem is that as much fun as it would be to put it all together and embellish with some nice ncurses interface(eg. with dialog), I don't want to reinvent the wheel. Besides, I have just relocated and have been waiting for my phone line welcome pack for almost 2 weeks now (ie. no internet apart from work and mobile phone), which makes it difficult for me to get anything done.
View 2 Replies
Apr 7, 2011
I have a problem deleting a line that contains two specific patterns from a text file. I am using "sed -i "/$name/ d" peop.txt", but I must also match one more variable, which is the surname.
"
burak:ak:3242:2342:dsa@a.com
gokhan:an:432:4234:da@a.com
"
This is the content of the text file. My second question: when I use "/$name/ d", it deletes not only the lines whose name matches $name but also every line with a word that contains $name. How can I fix these problems?
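Both problems can probably be handled by anchoring the pattern to the first two colon-separated fields. The sketch below assumes the name and surname contain no regex metacharacters or slashes; `delete_person` is an illustrative name, and output goes to standard output.

```shell
# Sketch: drop only the record whose first two fields are exactly name:surname.
delete_person() {
    # $1 = name, $2 = surname, $3 = file; prints the remaining records
    sed "/^$1:$2:/d" "$3"
}

# Usage: delete_person "$name" "$surname" peop.txt > peop.new
```

The leading `^` and the trailing `:` stop a name such as burak from also matching buraks or a name that merely contains it.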
View 2 Replies
Jun 16, 2010
I have a file that I need to scan and output data between Number and End containing string 123.
Number 1:
6
7
123
1
End
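A sketch of one way to do this in awk, assuming blocks open with a line starting with "Number" and close with a line starting with "End": buffer each block and print it only if some line in it contains the wanted string. The name `blocks_with` is hypothetical.

```shell
# Sketch: print whole Number...End blocks that contain a given string.
blocks_with() {
    # $1 = file, $2 = string that must occur inside the block
    awk -v want="$2" '
        /^Number/       { buf = ""; hit = 0; inblk = 1 }          # a new block starts
        inblk           { buf = buf $0 "\n"; if (index($0, want)) hit = 1 }
        /^End/ && inblk { if (hit) printf "%s", buf; inblk = 0 }  # block ends: print if matched
    ' "$1"
}

# Usage: blocks_with data.txt 123
```

Because the test is a substring match, 1234 would also count as a hit; compare with `$0 == want` instead if the line must equal the string exactly.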
View 1 Replies
Jul 5, 2011
I'm trying to use sed to search for a certain 'primary' pattern that may exist on several lines, with each primary pattern followed by an --unknown-- number of 'secondary' patterns. The lines containing the pattern start with: test(header_name) On that same line is an arbitrary number of strings that come after it. I want to move those strings over to their own lines so that they each are preceded by their own test(header_name). e.g. Original file (mytest.txt):
apples
test("Type1", "hat", "cat", "dog", "house");
bananas
[code]....
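The desired output is cut off in the post, so the following awk sketch rests on a guess: each secondary string gets its own `test(header, string);` line. It also assumes the quoted strings contain no commas. The name `expand_tests` is made up.

```shell
# Sketch: expand test("H", "a", "b"); into one test("H", "x"); line per string.
expand_tests() {
    # $1 = input file; prints the result to standard output
    awk '
        /^test\(/ {
            line = $0
            sub(/^test\(/, "", line)              # strip the leading test(
            sub(/\);[ \t]*$/, "", line)           # strip the trailing );
            n = split(line, parts, /,[ \t]*/)     # parts[1] is the header name
            for (i = 2; i <= n; i++)
                printf "test(%s, %s);\n", parts[1], parts[i]
            next
        }
        { print }                                 # all other lines pass through
    ' "$1"
}

# Usage: expand_tests mytest.txt > mytest.new
```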
View 2 Replies
Apr 7, 2011
In Midnight Commander, is it possible to exclude some directories/patterns/... when doing search? (M-?) I'm specifically interested in skipping the .hg subdirectory.
View 1 Replies
May 1, 2011
This is my script:
Code:
Problem: the output file does not exclude the values given to grep -av
View 3 Replies
Feb 22, 2011
I need to get names of all installed packages in 2 machines and save them in 2 text files, then I want to compare these 2 files to know the differences between 2 files and from that I could know the differences between 2 machines. Is it possible to do that and what program I could use?
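This is possible with standard tools. A sketch, assuming each machine's package names have already been saved one per line: sort both lists and let `comm` show what is unique to each side. The function name `package_diff` and the file names in the comments are illustrative.

```shell
# Sketch: compare two package lists (one package name per line).
package_diff() (
    # $1 = list from the first machine, $2 = list from the second machine
    a=$(mktemp) && b=$(mktemp) || exit 1
    LC_ALL=C sort -u "$1" > "$a"             # comm needs both inputs sorted the same way
    LC_ALL=C sort -u "$2" > "$b"
    echo "Only in $1:"; LC_ALL=C comm -23 "$a" "$b"
    echo "Only in $2:"; LC_ALL=C comm -13 "$a" "$b"
    rm -f "$a" "$b"
)

# Creating the lists (run one of these on each machine):
#   rpm -qa --qf '%{NAME}\n' > machineA.txt            # RPM-based systems
#   dpkg-query -W -f='${Package}\n' > machineA.txt     # Debian-based systems
```

`diff` on the two sorted files gives the same information in a different layout.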
View 2 Replies
Nov 22, 2010
I need to kind of grep within grep. My input file would be something like:
[Code]....
and I need to find the first occurrence of hello before MY PATTERN (hello 9008 in this case), so the output should be:
[Code]....
View 4 Replies
Dec 9, 2010
I am learning how to program in bash and I have a problem. I am trying to compare values against ranges defined by pairs of values (from another file). So far my solution is a nested for loop, but that causes it to compare every value. Here is a visualization of what I want:
file.a 2,3,4,5
file.b
3 5
[code]...
I want the values 2, 3, 4, 5 from file.a to be compared against the ranges 3 5, 6 9, 1 2, 4 7 from file.b (var1 is the value I'm comparing, var2 is the lower value, var3 is the greater value)
for i in $var1
do
for k in $var2
do
[code]....
My problem with the above code is that it compares EVERYTHING, not just the values in between the pairs I want (which are 3 5, 6 9, etc.).
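The intent here is not entirely clear, so this sketch rests on an assumption: every line of file.b is one "low high" pair, and a value from file.a should be reported for each pair it falls inside (bounds included). Reading a pair as one unit per line, rather than looping over lows and highs separately, avoids the all-against-all comparison. The name `in_ranges` is hypothetical.

```shell
# Sketch: report which comma-separated values fall inside which low/high pair.
in_ranges() {
    # $1 = file with comma-separated values, $2 = file with "low high" pairs
    awk '
        FILENAME == ARGV[1] {                 # first file: collect the values
            n = split($0, tmp, ",")
            for (i = 1; i <= n; i++) vals[++count] = tmp[i] + 0
            next
        }
        NF >= 2 {                             # second file: one range per line
            for (i = 1; i <= count; i++)
                if (vals[i] >= $1 + 0 && vals[i] <= $2 + 0)
                    print vals[i], "is between", $1, "and", $2
        }
    ' "$1" "$2"
}

# Usage: in_ranges file.a file.b
```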
View 8 Replies
Sep 12, 2010
I have to write a script that accepts two directory names (JIIT, JUIT) as positional parameters and checks which files are identical in both directories and files having same contents are also considered as identical in same directory. I tried using diff:
#both directories contain three files...file1, file2, file3
echo "Enter the directories:"
read d1
read d2
cd $d1
if diff file1.sh file2.sh > /dev/null
then echo same 1,2
else echo different
fi
if diff file1.sh file3.sh > /dev/null
then echo same 1,3
else echo different
fi
if diff file2.sh file3.sh > /dev/null
then echo same 3,2
else echo different
fi
cd ../
cd $d2
.....
I used the same code in the other directory for the three files. This is not running. I also want to know what to do when I need to compare files from different directories. i.e., JIIT, JUIT..
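For the cross-directory part of the question, a sketch using `cmp -s`, which compares contents byte by byte and reports only through its exit status. It assumes the comparison is between same-named files in the two directories; `same_files` is an illustrative name.

```shell
# Sketch: compare same-named regular files in two directories.
same_files() (
    # $1, $2 = the two directories, e.g. JIIT and JUIT
    for f in "$1"/*; do
        [ -f "$f" ] || continue              # skip subdirectories and empty globs
        name=$(basename "$f")
        [ -f "$2/$name" ] || continue        # no counterpart in the second directory
        if cmp -s "$f" "$2/$name"; then
            echo "same: $name"
        else
            echo "different: $name"
        fi
    done
)

# Usage: same_files JIIT JUIT
```

Finding identical contents under different names, or within one directory, would need a different approach, for example grouping files by checksum.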
View 3 Replies
May 30, 2011
I am trying to write a program in C which compares two files and prints the line that is equal.
Here file1.txt has
and file2.txt has
Note: file2.txt consists of only a single string, whereas file1.txt has multiple lines. I am actually comparing two files containing md5sum values.
Here is the code, but it compares only the first line of the files, when it should compare against the whole of file1. Sorry, I am a beginner in C. Can anyone suggest a modification to this code so that it can compare file2 with the entire file1?
Quote:
View 9 Replies
Feb 15, 2011
I'm working with Radiotap headers right now. I want to get the RSSI data. I came across a problem that I can't figure out right now. The value that I need to get is:
Code:
s8 IEEE80211_RADIOTAP_DBM_ANTSIGNAL
now, when I printf it:
[code]...
View 4 Replies
May 3, 2010
I want to know that is there any method to grep a particular data from a file without using the "cat --- | grep ' ' " command....I need to use a system call for this functionality.
View 1 Replies
Oct 27, 2010
I want to write some code to search for a specific string in a text file, but without using grep command.
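If the goal is simply to avoid grep, the shell's own `case` pattern matching can do a literal substring search. This is a sketch only; `find_string` is a made-up name, and it is slow on large files because the shell reads one line at a time.

```shell
# Sketch: print lines containing a literal string, using only shell built-ins.
find_string() {
    # $1 = literal string to look for, $2 = file
    while IFS= read -r line || [ -n "$line" ]; do   # also handles a final line without newline
        case $line in
            *"$1"*) printf '%s\n' "$line" ;;        # quoted, so $1 is matched literally
        esac
    done < "$2"
}

# Usage: find_string 'needle' notes.txt
```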
View 5 Replies
Jul 16, 2011
I am using File::Find to go through a very large tree. I am looking for all xml files and open only those that contain a tag <Updated>. I then want to capture the contents of two tags <Old> and <New>.
My problem is, after I open the file and do the first grep for <Updated> (which does work), I am unable to grep again unless I close the file and open it.
I did something like this:
Quote:
find(&check, $dir);
sub check {
if ($_ =~ /.xml/){
open(FILE,"$_");
if (grep{/Updated/} <FILE>){ # <-- works
[Code]....
View 6 Replies
Apr 2, 2010
What I'm trying to do is grep [mysqld] from inside my.cnf and then add lines inside the file should they not be there. how do i do that?
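A rough sketch of one approach: check whether the exact line already exists, and if not, insert it right after the `[mysqld]` header. It only checks for the line anywhere in the file, not specifically inside the section, and does nothing if the header is missing. The function name and the sample setting in the usage line are illustrative.

```shell
# Sketch: add a setting below [mysqld] unless the exact line is already present.
ensure_mysqld_setting() (
    # $1 = config file, $2 = setting line to add
    grep -qxF -e "$2" -- "$1" && exit 0      # already present somewhere: nothing to do
    tmp=$(mktemp) || exit 1
    awk -v add="$2" '{ print } $0 == "[mysqld]" { print add }' "$1" > "$tmp" &&
        cat "$tmp" > "$1"                    # overwrite in place, keeping permissions
    rm -f "$tmp"
)

# Usage (the setting shown is only an example value):
# ensure_mysqld_setting /etc/my.cnf 'max_connections=200'
```

Running it twice leaves the file unchanged the second time, which makes it safe to call from a provisioning script.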
View 3 Replies
Jul 13, 2010
On one of my servers, it appears that a bunch of html files got the following code added to it...
Quote:[URL]
I was going to try to remove this line using grep & sed... as sample
grep -lr -e 'apples' *.html | xargs sed -i 's/apples/oranges/g'
I can get the grep portion to work...
Code:
grep "<script src='http://b.rtbn2.cn/E/J.JS'[>][<]/script[>]" *
But not the sed
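The slashes inside the URL clash with sed's default `/` delimiter, which may be why the sed half fails. A sketch that uses `|` as the delimiter instead, assuming the injected tag is exactly the string from the grep command above; `strip_injected` is a hypothetical name and it writes to standard output.

```shell
# Sketch: remove the injected script tag from one HTML file.
strip_injected() {
    # $1 = HTML file; prints the cleaned text to standard output
    sed "s|<script src='http://b\.rtbn2\.cn/E/J\.JS'></script>||g" "$1"
}

# Usage: strip_injected page.html > page.clean.html
# With GNU sed, the same expression can be applied in place with -i to the
# files reported by grep -l, in the style of the apples/oranges sample.
```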
View 1 Replies
Mar 4, 2010
Suppose i have a file(1.txt) separated by TAB delimiter in a line
1 B AB 2
2 C AB 2
How can I search for the records having B using grep? And what if I need to perform a multiple search, like lines having "C and AB" or "B and AB"?
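For tab-separated records, awk is often easier than grep because it can test individual fields. A sketch, assuming the letters of interest sit in the second and third columns as in the sample; `match_fields` is an illustrative name.

```shell
# Sketch: print records whose 2nd and 3rd tab-separated fields both match.
match_fields() {
    # $1 = wanted 2nd field, $2 = wanted 3rd field, $3 = tab-separated file
    awk -F '\t' -v f2="$1" -v f3="$2" '$2 == f2 && $3 == f3' "$3"
}

# Usage: match_fields C AB 1.txt
```

Comparing whole fields avoids the false hits a plain `grep B` would give, since B also occurs inside AB.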
View 5 Replies
Nov 13, 2010
I have a huge binary log file. There are lets say 4 id's that I want to find in a log file. I know that those 4 id's will be present in the log file and I also know in what order they will be present. I want to find 1st id from the log then 2nd id and then third id and so on..
Simple/inefficient solution is: Loop through the id's and then grep in the log file. Problem with this solution is for each id grep will search from the beginning of the file.
Better/efficient solution would be: Since I know the order in which the id's will be present in the log file, loop through the id's, grep the 1st id and then move on to grep the 2nd id and so on... this way I can grep all id's in one pass. Is this solution possible?
I have 500000 + values to find in log files and I have to find efficient solution for it.
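A single pass is possible in principle. The sketch below assumes the log can be read line by line as text (a truly binary log may need a different tool) and that the IDs are stored one per line in their expected order. The name `find_in_order` is made up.

```shell
# Sketch: find IDs in one pass, relying on their known order in the log.
find_in_order() {
    # $1 = file of IDs in their expected order, $2 = log file
    awk '
        FILENAME == ARGV[1] { ids[++n] = $0; next }   # load the ID list
        {
            # advance through the list while the current line holds the next ID
            while (i < n && index($0, ids[i + 1]) > 0) {
                i++
                print FNR ": " ids[i]
            }
            if (i == n) exit                          # all IDs found: stop early
        }
    ' "$1" "$2"
}

# Usage: find_in_order ids.txt logfile
```

Each log line is compared against only the next expected ID, so the cost is roughly one scan of the log regardless of how many IDs there are.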
View 2 Replies
May 28, 2010
I'm trying to match all class references in a C++ file using grep with a regular expression. I'm trying to find out whether a specific include is useless or not, so I have to know if there is a reference in the cpp file. I wrote this RE that searches for a reference to class ABCZ, but unfortunately it isn't working as I expected:
grep -E '^[^(/*)(//)].*[^a-zA-Z]ABCZ[]*[*(<:;,{& ]'
^[^(/*)(//)] don't match comments at the beginning of the line ( // or /* )
.* followed by any character
[code]....
Well, I can get patterns like this:
class Test: public ABCZ{
class Test: public ABCZ {
class Test : public ABCZ<T>
[code]....
View 4 Replies