General :: Regular Expressions Match 2 File Names?
Nov 20, 2010
How can we do a simple regular-expression match on two filenames? I plan to use it in the command 'find -regex'.
Code:
hosts.txt
ipaddress.txt
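A minimal sketch of one pattern that covers both names, written in Python so it can be tried without touching the filesystem. It assumes the behaviour of GNU find, where -regex has to match the whole path rather than just the base name; under that assumption the shell equivalent would look roughly like find . -regex '.*/\(hosts\|ipaddress\)\.txt'. The sample paths are made up for illustration.

```python
import re

# Alternation between the two base names, anchored to the end of the path.
NAME_RE = re.compile(r".*/(hosts|ipaddress)\.txt")

def matches(path):
    # fullmatch mirrors a regex that must cover the entire path
    return NAME_RE.fullmatch(path) is not None

# Hypothetical paths, as find would print them
paths = ["./hosts.txt", "./etc/ipaddress.txt", "./hosts.txt.bak", "./myhosts.txt"]
print([p for p in paths if matches(p)])
```

The leading `.*/` is what keeps names such as myhosts.txt from matching.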
I'm writing a program that works with text files, and I'm trying to create some filters with grep. I have various questions here, so I'll number them for clarity.
1) First of all, I'd like to know what wc -w is actually returning. The word count is less than what gedit is counting in Document Statistics, so obviously gedit is counting something (like newlines) that wc -w is not.
2) Secondly, I was wondering if there was a way to grep x number of words. I'm looking for something like the -m option, but returning a certain number of words instead of lines. For example, to find the first 2000 words, do something like grep -someoption 2000 ".*" or using {1,2000}.
3) Finally, I'm trying to filter out headers and footers of a text file but having no luck. The text files are Project Gutenberg files, so they have standardized headers and footers. Here's an example: [URL]...
The header starts with "The Project Gutenberg EBook of" and ends with the line containing "START OF THIS PROJECT GUTENBERG EBOOK". The footers begin with: "End of the Project Gutenberg EBook of". My problem is, grep can find:
[Code]...
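A rough sketch of questions 2 and 3 in Python rather than grep, since grep works line by line and has no word-count limit. It assumes the header ends at the line containing the START marker and the footer begins at the line starting with the "End of" marker, as described above. The function names and the sample text are invented for illustration. Splitting on runs of whitespace is roughly how wc -w counts words.

```python
import re

START_RE = re.compile(r"^.*START OF THIS PROJECT GUTENBERG EBOOK.*$", re.MULTILINE)
END_RE = re.compile(r"^End of the Project Gutenberg EBook of", re.MULTILINE)

def strip_gutenberg(text):
    # Keep only what lies between the end of the header line
    # and the start of the footer line.
    start = START_RE.search(text)
    end = END_RE.search(text)
    begin = start.end() if start else 0
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()

def first_words(text, n):
    # split() with no arguments splits on any run of whitespace
    return text.split()[:n]

# Hypothetical miniature file
sample = """The Project Gutenberg EBook of Example, by A. Author
*** START OF THIS PROJECT GUTENBERG EBOOK EXAMPLE ***
Call me Ishmael. Some years ago
never mind how long.
End of the Project Gutenberg EBook of Example
"""
body = strip_gutenberg(sample)
print(len(body.split()), first_words(body, 3))
```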
I can't get this simple regular expression to work for matching emails: '\w*(?:\.\w*)*@\w*(?:\.\w*)*\w{2,5}'
It should be working as I have tested it with regex pal and it works just fine. I think there's a problem with optional character class but I'm not sure.
I am pretty new to this topic, but I would like to learn it by example. The first thing I am working on is to modify the output of the date command so that it is shown as DD/MM/YY using only regular expressions, but I don't know how to combine what is in the online regexp tutorials with the syntax for batch scripting. Any help?
Here is what I want.
run a file test: ~# ./test
Where file test is:
#!/bin/bash
#
DATE=$( date )
[Code]..
Also if you can point me to good regexp tutorials (directed towards batch scripting), that will be great.
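As an illustration of the regex part only, here is a small Python sketch that reorders an ISO-style date into DD/MM/YY with capture groups. It assumes the input looks like 2010-11-20; the function name is made up. The second print shows that a format string gives the same result without any regex, which is also what date +%d/%m/%y does in the shell.

```python
import re
from datetime import date

def to_ddmmyy(iso_date):
    # Groups: century, two-digit year, month, day; emit day/month/year
    return re.sub(r"^(\d{2})(\d{2})-(\d{2})-(\d{2})$", r"\4/\3/\2", iso_date)

print(to_ddmmyy("2010-11-20"))
print(date(2010, 11, 20).strftime("%d/%m/%y"))
```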
As the subject says, can anyone explain to me what is the difference between Regular Expressions and Globbing?
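One way to see the difference is to run the same pattern text through a glob matcher and a regex matcher. The sketch below uses Python's fnmatch and re modules as stand-ins for shell globbing and for tools like grep; the file names are invented.

```python
import fnmatch
import re

names = ["FOO01", "FOO1", "FOO", "xFOO01"]

# Glob: * on its own stands for "any run of characters".
glob_hits = fnmatch.filter(names, "FOO*")

# Regex: * is a quantifier, "zero or more of the previous item",
# so FOO* means "FO" followed by zero or more "O".
regex_hits = [n for n in names if re.fullmatch(r"FOO*", n)]

# The regex that corresponds to the glob FOO* is FOO.*
regex_equiv = [n for n in names if re.fullmatch(r"FOO.*", n)]

print(glob_hits)
print(regex_hits)
print(regex_equiv)
```

Globs are expanded by the shell against file names before the command runs, while regular expressions are interpreted by the program that receives them.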
How to make tools like sed operate on the whole file, instead of line-by-line?
Let's say I have 20 files named FOOXX, where XX is the number of the file, e.g. 01, 02 etc. At the moment, if I want to delete all files numbered lower than 10, this is easy and I just use a wildcard, e.g. rm FOO0*. However, if I want to delete specific files in a range, e.g. 13-15, this becomes more difficult. rm FOO[13-15] does not work, and asks me if I wish to delete all files. Likewise rm FOO1[3-5] wishes to delete all files that begin with FOO1. So, what is the best way to delete ranges of files like this? I have tried with both bash and zsh, and I don't think they differ so much for such a basic task?
I have a file like this
# comments
#comments
#comments
bla bla
[code]....
I want to grep lines which do not start with # or a blank space, like
bla bla
bla bla
How do I do this? I tried grep --invert-match '^#', which gives lines not starting with # but gives me blank lines too. I tried grep --invert-match '^#|^ ', which will give lines not starting with # OR not starting with a blank (which means any line, including ones starting with #).
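Instead of inverting a match, one option is a positive match on "first character is neither # nor whitespace", which also drops empty lines because they have no first character. A small Python sketch of that idea follows; the sample text is made up, and with grep something like grep '^[^# ]' file should behave similarly.

```python
import re

# Requires a first character, and that character may not be '#' or whitespace.
KEEP_RE = re.compile(r"^[^#\s]")

def keep_lines(text):
    return [line for line in text.splitlines() if KEEP_RE.match(line)]

sample = "# comments\n#comments\n\n indented\nbla bla\nbla bla\n"
print(keep_lines(sample))
```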
Using a list of names (over 4000 of them) painstakingly gleaned from the source file years ago for a database file, I want to match the names against the source file so that they can be updated with the tags <forename></forename> in the original source file.
I placed the list of names in @forenames (only posted a few of them here).
Perl script is:
I am able to get the name bracketed by the tags to appear on the console screen but don't know how to apply the output to the source file. Perhaps I need to do a match on the words then some kind of edit to surround the matching words with the xml tags? I'm a rank novice doing this as a labour of love for a friend.
I want to use regular expressions and sed to remove html tags from a text file.
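A minimal sketch of the usual pattern, shown in Python: a tag is treated as "<", then anything that is not ">", then ">". In sed the same idea is commonly written as s/<[^>]*>//g. This is only an approximation of HTML; it assumes each tag sits on one line and that no attribute value contains ">".

```python
import re

TAG_RE = re.compile(r"<[^>]+>")

def strip_tags(html):
    # Delete every <...> run and keep the text between tags
    return TAG_RE.sub("", html)

print(strip_tags("<p>Hello <b>world</b></p>"))
```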
Is there a way that I can edit the metadata tags on some MP3s using regex?
I've got almost 100 MP3, all named "01 - <song title>," "02 - <song title>," etc., and I, understandably, don't want to edit them all by hand.
Running "s/\d{2} - //g" would be so much easier.
I've begun to develop with C++ (Eclipse+Qt) and the first problem I see is that there are no good functions for manipulating strings. Is there a library for manipulating strings with regular expressions?
I am trying to create an exclude regular expression for my build.xml. The problem is that I am trying to find some info on which REs are acceptable/valid for ant... Is ant using standard regular expressions? POSIX ones? Since it is a Java-based tool, the "Java REs" are probably valid. I am a little bit confused. If somebody can help me out with the different RE standards, I would be most obliged.
Gidday, I'm puzzled as to why this works:
Code:
find /Data/ -type f -iname "*7pm*"
But this doesn't:
Code:
find /Data/ -type f -regex *7[Pp][Mm]*
I've tried MANY variations, but I'm getting no error messages, just no returns, and yet the first find, will find the sorts of files I'm looking for. I realise a win is a win, but I'm of the understanding that the -regex switch allows for some really complex use of regular expressions - but I can't even get a very simple one to work,
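A likely explanation, sketched in Python: -iname takes a glob that is compared with the base name, while -regex takes a regular expression that has to match the entire path. In a regex, "any characters" is .* rather than *, and the pattern should be quoted so the shell does not expand it first. The helper and the sample path below are invented; fullmatch stands in for the whole-path rule.

```python
import re

def find_regex(pattern, path):
    # Mimics a -regex test that must cover the whole path
    return re.fullmatch(pattern, path) is not None

path = "/Data/shows/News at 7PM.avi"  # hypothetical file

print(find_regex(r".*7[Pp][Mm].*", path))  # covers the whole path
print(find_regex(r"7[Pp][Mm]", path))      # only matches a fragment, so it fails
```

Under that reading, find /Data/ -type f -regex '.*7[Pp][Mm].*' would be the counterpart of the working -iname command.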
The * should not be needed, because it matches everything ([az] [0-9 ][$%&!"/()=?'=), but I don't know how to solve it [URL]..
ls [0-9a-zA-Z]*[@]*[gmail | yahoo | hotmail]*[.]*[com]
ls [0-9a-zA-Z][.-_][0-9a-zA-Z]*[@]*[gmail | yahoo | hotmail]*[.]*[com]
How do I make Vim use extended regular expressions?
I really wish I didn't have to use all these ugly backslashes to do backreferences.
I'm attempting to search through a rather large assortment of html files created in Word using 'save as html'. What I'm trying to do is find and delete these tags (they're causing browsers to display black diamonds with white question marks):
<span style='mso-spacerun:yes'> </span>
Tags contain from 1 to 4 spaces between opening and closing. I get positive results from this:
grep <span style='mso-spacerun:yes'> filename.html
but once I attempt to tell it to match any number of characters up until the next '>' symbol, it tells me I'm using an invalid regex:
grep <span style='mso-spacerun:yes'>[^>]+> filename.html
I've been nose-deep in regex tutorials for the past day or so, and I'm still not understanding why this doesn't work. If I put the pattern (without backslashes) into a separate file and use `grep -f patternfile filename.html`, I get no error but no matches either. So far as I can figure, the above regex boils down to:
Match the string "<span style='mso-spacerun:yes'>", followed by any number of characters that are not a ">", followed by a ">". If someone could tell me where I'm going wrong with this,
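For comparison, here is how the described pattern could look in Python, where quoting is not complicated by the shell. The sketch assumes the filler between the tags is 1 to 4 characters that are not angle brackets; they may well be non-breaking spaces rather than plain spaces, which would fit the black-diamond symptom. The function name and the choice to substitute a single space are illustrative.

```python
import re

# Opening tag, 1 to 4 filler characters, closing tag
SPACER_RE = re.compile(r"<span style='mso-spacerun:yes'>[^<>]{1,4}</span>")

def drop_spacers(html, replacement=" "):
    return SPACER_RE.sub(replacement, html)

# Hypothetical fragment using two non-breaking spaces as filler
sample = "one<span style='mso-spacerun:yes'>\u00a0\u00a0</span>two"
print(drop_spacers(sample))
```

On the command line, the unquoted <, > and ' characters are interpreted by the shell before grep sees them, so the whole pattern would need to be quoted.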
I have this..
RewriteRule ^(apes|ape)/(.*)$ $2?fh=$1 [L,QSA]
I only want to match the directories ape/ and apes/ but I think it is matching any directory that ends in "ape" or "apes" or maybe does it match any string containing those characters in any order? I am not great at regex, and have read a lot, but still not sure if I understand this correctly.
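The pattern itself can be checked in isolation. The sketch below runs the same expression through Python's re module, which treats ^, $, | and groups the same way for a pattern this simple. It assumes the rule is tested against a path without a leading slash, as in a per-directory context; the sample paths are made up.

```python
import re

RULE_RE = re.compile(r"^(apes|ape)/(.*)$")

def rewrite(path):
    # Returns the substituted target, or None when the rule does not apply
    m = RULE_RE.match(path)
    if m is None:
        return None
    return f"{m.group(2)}?fh={m.group(1)}"

for p in ["ape/page.html", "apes/page.html", "grape/page.html", "shape/apes/x"]:
    print(p, "->", rewrite(p))
```

Because of the ^ anchor, the path has to begin with ape/ or apes/; (apes|ape) is an alternation between two whole words, not a set of characters in any order.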
I have a list of urls like code...
How can I use grep to match the domain names only?
All the urls have a / after the domain. And there are a lot of tlds, not sure how many, the list is quite big.
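Since every URL has a slash after the domain, one approach that does not depend on knowing the TLDs is to skip an optional scheme and then take everything up to the first slash or colon. A hedged Python sketch follows; the URLs in the usage lines are invented and the function name is arbitrary.

```python
import re

# Optional scheme such as http://, then the host up to the first / or :
HOST_RE = re.compile(r"^(?:[a-zA-Z][a-zA-Z0-9+.-]*://)?([^/:]+)")

def domain(url):
    m = HOST_RE.match(url.strip())
    return m.group(1) if m else None

print(domain("http://www.example.com/a/b"))
print(domain("example.org/index.html"))
```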
Is it possible to point to a two-digit interval with regular expressions?
Background:
I'm using mplayer to watch tv shows, that often have episode numbers in their names. I know how to easily add several files to the playlist by using brackets, by typing something like
$ mplayer tv/South.Park.S1.E0[1-5]*avi
Is there a way to point to files 06-13 in a single expression?
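A bracket expression only ever describes one character, so a two-digit interval has to be split by tens digit: 06-09 and 10-13. In the shell that would mean two globs, e.g. E0[6-9]*avi E1[0-3]*avi; as a regular expression the two halves can be joined with alternation. A small Python sketch, with file names generated for illustration:

```python
import re

# 06-09 or 10-13, not followed by a further digit
EP_RE = re.compile(r"E(?:0[6-9]|1[0-3])(?!\d)")

def in_range(name):
    return EP_RE.search(name) is not None

names = [f"South.Park.S1.E{n:02d}.avi" for n in range(1, 16)]
print([n for n in names if in_range(n)])
```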
Here's my need:
If Calendar Day =      Then FCPeriod =    And FCYear =
Jan 11 to Feb 10       02                 Current Year of Feb 10
Feb 11 to Mar 10       03                 Current Year of Mar 10
Mar 11 to Apr 10       04                 Current Year of Apr 10
Apr 11 to May 10       05                 Current Year of May 10
May 11 to June 10      06                 Current Year of June 10
June 11 to July 10     07                 Current Year of July 10
July 11 to Aug 10      08                 Current Year of Aug 10
Aug 11 to Sept 10      09                 Current Year of Sept 10
Sept 11 to Oct 10      10                 Current Year of Oct 10
Oct 11 to Nov 10       11                 Current Year of Nov 10
Nov 11 to Dec 10       12                 Current Year of Dec 10
Dec 11 to Jan 10       01*                Current Year of Jan 10
* Note for Dec 11 - Dec 31: the next year is to be used.
IE: Current date is Dec 28th, 2010. Year = 2011.
IE: Current date is Jan 8th, 2011. Year = 2011.
Looks like I'll need a case statement with some regular expressions...
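The table follows one rule: from the 11th onward a date belongs to the next month's period, and the year is the year in which that period ends. The sketch below expresses this with plain date arithmetic in Python rather than a case statement with regular expressions, which is a different technique from the one proposed above; the function name is made up.

```python
from datetime import date

def fc_period(day):
    """Return (FCPeriod, FCYear) for a calendar date."""
    if day.day >= 11:
        # From the 11th onward the date counts toward the next month's period.
        if day.month == 12:
            return 1, day.year + 1  # Dec 11 - Dec 31 roll into next year
        return day.month + 1, day.year
    # The 1st to the 10th closes the period named after the current month.
    return day.month, day.year

print(fc_period(date(2010, 12, 28)))
print(fc_period(date(2011, 1, 8)))
```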
What I am doing is reading the text from a text document and storing all of the text inside of an ArrayList. I then set one of the values of the ArrayList as a String. I want to use regular expressions to find out what the first two characters of the String are: if the first two characters = "//" then function(); I only care about the first two characters though. If you need any more information, just ask.
I need to use sed to edit a file that contains just one line. This should be pretty simple, but I've googled and can't seem to figure it out. I need to match everything from a certain string up until the first comma in the line. There are multiple commas in the line and my matching pattern is matching up until the last comma, not the first.
Here is what I'm trying:
As you can see it is matching up until the last comma. Seems like the .* is matching any character including the other commas. The output from this that I am hoping to achieve:
How can I get the regular expression to match from asdf: up until the first comma?
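The usual fix for a greedy .* is to replace it with a class that cannot cross the delimiter, here [^,]*. The Python sketch below shows the idea on an invented line, since the original input and desired output are not shown; only the asdf: prefix comes from the question. In sed the same pattern would look roughly like s/asdf:[^,]*/.../.

```python
import re

# [^,]* cannot cross a comma, so the match stops before the first one.
FIELD_RE = re.compile(r"asdf:[^,]*")

def replace_field(line, new_value):
    # A function replacement avoids backslash handling in new_value
    return FIELD_RE.sub(lambda m: "asdf:" + new_value, line, count=1)

line = "qwer:1,asdf:2 3 4,zxcv:5,uiop:6"  # hypothetical sample line
print(replace_field(line, "NEW"))
```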
I have made this:
Code:
from urllib import urlopen
import re
current_site = urlopen("http://www.krak.dk/").read()
search = re.findall("((http://|https://|ftp://)|(www.))+(([a-zA-Z0-9.-]+.[a-zA-Z]{2,4})|([0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}))(/[a-zA-Z0-9%:/-_?.'~]*)?", current_site)
[code]....
I only want to match complete URL's. how do i avoid matching the fragments ?
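A probable cause of the fragments is that re.findall returns the contents of the capturing groups, not the whole match, whenever the pattern contains groups. One way around it is to make the groups non-capturing with (?:...) and read group(0) from re.finditer. The sketch below uses a simplified pattern and an inline sample string instead of a download; both are assumptions made for illustration, not a complete URL grammar.

```python
import re

URL_RE = re.compile(
    r"(?:(?:https?|ftp)://|www\.)"         # scheme, or a leading www.
    r"(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,4}"   # host name ending in a TLD
    r"(?:/[a-zA-Z0-9%:/_?.'~-]*)?"         # optional path
)

def find_urls(html):
    # group(0) is always the complete match, regardless of inner groups
    return [m.group(0) for m in URL_RE.finditer(html)]

sample = '<a href="http://www.krak.dk/kort">kort</a> and www.example.com'
print(find_urls(sample))
```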
#!/bin/csh -f
source /xxxx/
set $spicedir = `sim_${1%.*}`
printf "Output folder is %s\n" $spicedir
toolcommand -i $1 -o $spicedir
(-o output directory, -i =input)
I got "spicedir: Undefined variable" in my xterm.
How could I match a file name without its extension in csh?
I am supposed to take some small files, and print them to a specific printer, such that the small files are concatenated into one file. The file name has to be included in the file that gets printed.
Should I be looking to concatenate the files into one file with the file names included, and then print them?
something like: -printfunction -printername < file*
I have a considerable number of files in a subdirectory (some fascinating old military clips from archive.org - search on Big Picture if interested). Anyhow, I am downloading them using Internet Download Manager running in an XP virtual machine in VMWare on my Ubuntu 10.04 PC (due to the queuing, restart and speed capabilities of IDM). But I digress - the files are being saved on the host (Samba share) without a file extension. So I have a collection of files with names like
Quote:
The Douglas MacArthur Story
THEY WERE THERE (1960)
I wish to add the extension ".mp4". In Windows this is simply done with the command
Quote:
rename *. *.mp4
This of course does not work in Linux. I have researched the Linux rename command and reviewed a lot of examples. However, I have not found a way to add an extension to a batch of files which are named with no extension to start with. The spaces in the file names also seem to present an issue. At the moment I am renaming them from the Windows VM while they are sitting on the Samba share using the ancient File Manager program from Windows NT which works great on XP. I have experimented with the file rename facility in Gnome Commander however, it does not seem to want to do something so simple.
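A minimal sketch of the batch rename in Python, which sidesteps quoting problems with spaces because each name is passed as a single value. It treats "no extension" as "no dot-suffix in the name", so a title containing a dot would be skipped; that rule and the function name are assumptions. The demonstration runs in a temporary directory with the two names quoted above plus one already-renamed file.

```python
import tempfile
from pathlib import Path

def add_extension(directory, ext=".mp4"):
    """Append ext to every regular file in directory whose name has no suffix."""
    renamed = []
    # sorted() materialises the listing before any file is renamed
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix == "":
            target = path.with_name(path.name + ext)
            path.rename(target)
            renamed.append(target.name)
    return renamed

# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    for name in ["The Douglas MacArthur Story", "THEY WERE THERE (1960)", "done.mp4"]:
        (Path(tmp) / name).touch()
    print(add_extension(tmp))
```

In bash, a loop such as for f in *; do mv -- "$f" "$f.mp4"; done does the same for every file in the current directory, with the quotes taking care of the spaces.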
Generally SSH related log messages are logged in /var/log/messages file. Is there a way to log them in another different file? I mean is there some configuration setting to enable this?
I use the command line frequently to navigate my files so I try not to have spaces in file names. Typically I have used an underscore to connect words but it was recently suggested that I should use a dash. Are there any disadvantages to using an underscore in file names? Should I switch to a dash? My system is running Xubuntu and I almost exclusively use the bash shell.
I am running Gentoo with Openbox (ROX file manager and desktop). I installed Digikam and Amarok, but I have problems with files which include special characters in their names (such as ğ...). The files are shown with strange and weird characters in the file dialogs of Digikam and Amarok.
I don't have this problem in other applications. I can create files with special character included. I think some settings do not agree with KDE4. How can I solve this problem? Does anyone have an idea? I also installed KDE systemsettings program but could not find a relevant config option for character encoding.