General :: Using SED Or AWK To Cut Data From A File Between Certain Characters
Oct 23, 2010
I would like to create a .ksh script which cuts certain data from a document. I have tried to use sed and awk with piping, but it's been some time since I have worked on Linux and my memory is patchy.
What command could I use in terminal to delete all ASCII characters? That is, delete a-z, A-Z, 0-9, and all punctuation? I have a file containing Chinese characters, and I want to remove everything else and leave just the Chinese.
I can use grep to leave only the lines that have Chinese in them, but this still leaves a lot of non-Chinese stuff on those lines. Does anyone know how I could actually remove everything that isn't Chinese?
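One possible approach, sketched under the assumption that the file is UTF-8 encoded: every byte of a multibyte UTF-8 character is 0x80 or higher, so deleting the printable ASCII byte range leaves the Chinese text untouched. The function name keep_non_ascii is made up for illustration. Newlines are kept, so lines that held only ASCII become empty lines, and full-width (non-ASCII) punctuation is not removed.

```shell
#!/bin/sh
# keep_non_ascii (hypothetical name): filter stdin to stdout.
# LC_ALL=C makes tr work on single bytes instead of characters.
# Deleted: tab (octal 011) and space through tilde (octal 040-176).
# Kept: newline (octal 012) and every byte >= 0x80 (UTF-8 multibyte text).
keep_non_ascii() {
    LC_ALL=C tr -d '\011\040-\176'
}

# Example use:
#   keep_non_ascii < input.txt > chinese_only.txt
```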
I would like to modify the content of a text file in Linux in the following way.
The file has several lines like this:
./run_pest3 ./g134366.04080_0.062 x 2_d043 1 0.43 results_EC
I want to modify all lines to be:
./run_pest3 ./g134366.04080_0.062 x 2_d043 1 0.43 results_EC0.062
i.e., the last number of $2 should be appended to the end of $7, for each line.
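A minimal awk sketch of that edit, assuming the number wanted is whatever follows the last underscore in field 2 and that fields are separated by whitespace. The function name append_last_number is hypothetical. Assigning to $7 makes awk rebuild the line with single spaces between fields, so runs of spaces or tabs are collapsed.

```shell
#!/bin/sh
# append_last_number (hypothetical name): reads the named files, or stdin
# when no file is given, and writes the modified lines to stdout.
append_last_number() {
    awk '
    NF >= 7 {
        n = split($2, parts, "_")   # pieces of field 2 separated by "_"
        $7 = $7 parts[n]            # glue the last piece onto field 7
    }
    { print }                       # shorter lines pass through unchanged
    ' "$@"
}

# Example use:
#   append_last_number jobs.txt > jobs.new
```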
I have a problem with both genisoimage and mkisofs. Both of them are limited to 8 characters. There are very many options for them. Which one would remedy the issue?
I have a simple logging script that lets the user enter quick notes and questions, but I cannot get it to pass punctuation such as ' and ?. Whatever I type after 'n' needs to be appended to the end of the working project note file. Any help and working examples would be appreciated, but please also direct me to the proper reading material so I can learn something; I am not looking for someone to just do it for me. Usage:
I have a txt file with a list of IDs, and I need to insert a comma after every line and then remove the newline characters so the file becomes one long string. To clarify, my txt file content looks like this:
234
5466
2356
... and so on.
I would like this to change to 234,5466,2356,... I looked at sed and tried to wrap my head around the commands, but it is really confusing for me. I've managed to add commas to the end of each line (sed "s/$/,/g" filename), but I can't seem to remove the newline character from each line.
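sed works one line at a time, which is why the newline is awkward to remove with it. One alternative, given here as a sketch rather than the only way, is paste, whose -s option joins all input lines into one and whose -d option sets the separator. The function name join_ids is made up for illustration.

```shell
#!/bin/sh
# join_ids (hypothetical name): reads one ID per line on stdin and prints
# them as a single comma-separated line.
join_ids() {
    paste -s -d, -
}

# Example use:
#   join_ids < ids.txt > ids_joined.txt
```

With the three sample IDs above as input, this prints 234,5466,2356 with no trailing comma.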
I am running Gentoo with Openbox (ROX file manager and desktop). I installed Digikam and Amarok, but I have problems with files which include special characters in their names (such as ğ and other non-ASCII letters). These files are shown with strange characters in the file dialogs of Digikam and Amarok.
I don't have this problem in other applications, and I can create files with special characters in their names. I think some settings do not agree with KDE4. How can I solve this problem? Does anyone have an idea? I also installed the KDE systemsettings program but could not find a relevant config option for character encoding.
Does anybody know a command to count occurrences of a character in a file? I would like to know the total number of " (quote) characters in a file. My idea is to check in a script whether the total number of this character is even.
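A small sketch of one way to do this: tr -cd deletes every byte except the listed one, and wc -c counts what remains. The function names count_quotes and quotes_balanced are hypothetical.

```shell
#!/bin/sh
# count_quotes (hypothetical name): print how many double quotes file $1 contains.
count_quotes() {
    n=$(tr -cd '"' < "$1" | wc -c)
    echo $((n))    # arithmetic expansion strips the padding some wc versions add
}

# quotes_balanced (hypothetical name): succeed when the count is even.
quotes_balanced() {
    [ $(( $(count_quotes "$1") % 2 )) -eq 0 ]
}

# Example use:
#   if quotes_balanced script.sh; then echo "even"; else echo "odd"; fi
```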
While modifying the definition of my PS1, I saw that "\[" and "\]" markers should be added to help bash compute the correct display length. Many examples on the web do not use them or even mention them. I searched for a way to add them automatically, for example with sed, but I didn't find any example. Are they still needed, and is there a recommendation against using sed to define PS1?
I know GEdit has a bug which prevents it from opening a file with null (\0) characters in it. This is a huge inconvenience for me because I frequently have to open big log files with only a couple of rogue \0's in them.
Sometimes I just run a quick tr -d '\000' < file.log > file.log.correct and open the corrected file. This is a big nuisance. I would like to have maybe an external tool in GEdit that would execute the above command. I tried writing an external tool action (GEdit plugin) using just:
#!/bin/bash
tr -d '\000'
Input is "current document", output action is "replace current document". But this isn't working. When I open the file, GEdit shows the familiar red warning; activating the external tool with the warning showing apparently has no effect (I think the script is being called but its input/output are not set).
I am having difficulty getting sed to replace a string of text in an XML file, despite the fact that I have no trouble using grep to find that same string. Since the new string and old string to be replaced contain a lot of special characters, I thought it best to store them in variables as opposed to using a slew of backslashes:
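Storing the strings in variables does not stop sed from reading them as a regular expression and as replacement syntax, so characters such as ., *, [, / and & still carry special meaning. One way around that, sketched here as an alternative to sed rather than a fix for the original command (which is not shown), is a literal search with awk's index() function. The function name replace_literal is hypothetical; the strings are passed through the environment because awk -v would interpret backslash escapes in them.

```shell
#!/bin/sh
# replace_literal (hypothetical name): $1 = text to find, $2 = replacement.
# Reads stdin, writes stdout; no character in either string is special.
replace_literal() {
    OLD="$1" NEW="$2" awk '
    BEGIN { old = ENVIRON["OLD"]; new = ENVIRON["NEW"]; n = length(old) }
    n == 0 { print; next }          # nothing to search for: pass the line through
    {
        out = ""; rest = $0
        while ((i = index(rest, old)) > 0) {
            out = out substr(rest, 1, i - 1) new   # text before the match, then replacement
            rest = substr(rest, i + n)             # continue after the match
        }
        print out rest
    }'
}

# Example use (file names are placeholders):
#   replace_literal "$old_string" "$new_string" < config.xml > config.new.xml
```

Matches are found within a single line only, so a search string that spans several lines of the XML file would need a different approach.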
I was wondering if someone can help me. I have a file that I am trying to LFTP from my Linux server to my mainframe. The file uploads, but the data looks slightly corrupted. First off, the text is French and it contains accented characters. Below is how the data looks when you view it through vi or on the mainframe: factureNumÃ©ro
Note how the accented e is turned into an uppercase accented A and a copyright symbol. When I view the PC file in hex through Notepad++, the é turns into Ã©, so one character becomes two: Ã is a hex C3 and © is a hex A9.
When I LFTP the file to the mainframe, the characters are transferred and I end up with a hex 66 and a hex E4. The end result I am after is that I want to transfer my Linux file, whose data looks like this on the Linux server: factureNumÃ©ro and like this when viewed on a PC: factureNuméro, to my mainframe, and I want it to look the same as when I transfer it to the PC. Like this: factureNuméro.
We're in the process of implementing an offsite backup of all our servers to a remote Linux server. We're using rsync over ssh. What I've found is that characters such as ±, ¶, ´ and £ are replaced on the Linux server with underscores. I don't mind if it changes these characters in the filenames of documents, but when it renames a language pack from Espa±ol.clx to Espa_ol.clx, it could cause issues for us further down the line.
What do I need to do differently to make the special characters copy over correctly? For the initial sync, which will take place locally before the machine is moved offsite, I have Samba enabled. I am able to copy files from Windows to the Samba share, retaining the original filename, though it looks different in the Linux directory listing, i.e. t̻st becomes Ļst. These files get deleted by rsync when it runs, because the filenames do not match.
I have a log file that contains information like this:
----------------------------
r11141 | prasath-palani | 2010-12-23 16:21:24 +0530 (Thu, 23 Dec 2010) | 1 line
Changed paths:
M /projects/
M /projects/
[code]....
What I need is to copy the data between the "---" lines into separate files, e.g. the first set of data between the "---" lines should go into one file and the next set of data into another file.
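A rough awk sketch of that split, assuming every separator is a line consisting only of dashes and that numbered output names are acceptable. The function name split_log and the prefix001.txt naming scheme are made up for illustration.

```shell
#!/bin/sh
# split_log (hypothetical name): $1 = log file, $2 = output prefix.
# Writes each entry to <prefix>001.txt, <prefix>002.txt, and so on.
split_log() {
    awk -v prefix="$2" '
    /^-+$/ {                        # a line made only of dashes starts a new entry
        if (out != "") close(out)
        n++
        out = sprintf("%s%03d.txt", prefix, n)
        next
    }
    out != "" { print > out }       # lines before the first separator are skipped
    ' "$1"
}

# Example use:
#   split_log svn.log entry_
```

A trailing separator with nothing after it does not create an empty file, because a file is only opened when the first line of an entry is written.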
I am new to shell scripting. I am trying to write a shell script which takes an input file and produces output as described below. The output file should have the data up to SOK (marked in red) from every second line, followed by the selected data (marked in green) from the fourth line. So the selected data from the 2nd and 4th lines goes on the first line of the output file, the selected data from the 6th and 8th lines goes on the second line of the output file, and so on. Input File:
I just installed Fedora 12 on a Core i3 machine. Everything looks fine, but I have a huge problem: every time I upload a file (using ftp or sftp), some weird characters are included inside the file, for example.
We purchased a new database system at work last October, ditching the old system because of a lack of support from the vendor. This is a retail Point of Sale and Backoffice database system. I am not sure what system the new one runs on, but the system we replaced was a Firebird data base. The reason I am posting is because we are now in need of the information contained in the old database which was not completely imported into the new system.
Basically the problem is this: the database is on a Windows XP system, and I found a copy of SQL Manager Lite 2008 on the system, which, after quite a bit of studying, I figured out how to use to extract the database into a removable file. I have this file (178MB) on a USB stick in a file called Backoffice.fbd.
My studying suggests to me that I can get into this database with MySQL. I have never used this and have no clue how to do it. All I want to be able to do is get into the database and create tab-delimited spreadsheet files for each of the database sections (Customers, Repairs, Sales History, stock files, etc.). Is it possible to do this with Ubuntu and MySQL, and if so, can an expert suggest one or two things to get me started? While a guided tutorial would be nice because I am not an expert, I am willing to learn on my own if someone could point me in the right direction.
The first file is a list of "keys" for the second file. In the second file, the first word of each line is its key, and each key and its sentence occupy exactly one line. The second file has many keyed lines, and not every key in it appears in file1, but every key in file1 must exist in the second file. How can I get a result like this? (It needs to be sorted in the order of the keys in file1.)
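The usual two-file awk idiom fits this description; what follows is a sketch assuming keys contain no whitespace and each key occurs once in the second file. The function name lookup_in_key_order is hypothetical.

```shell
#!/bin/sh
# lookup_in_key_order (hypothetical name):
#   $1 = key file (one key per line), $2 = data file (key is the first word).
# Prints the data lines whose key is listed in $1, in the order of $1.
lookup_in_key_order() {
    awk '
    NR == FNR { line[$1] = $0; next }   # first file read (data): remember lines by key
    ($1 in line) { print line[$1] }     # second file read (keys): emit in key order
    ' "$2" "$1"
}

# Example use:
#   lookup_in_key_order file1 file2 > result.txt
```

The data file is read first so that its lines can be looked up while the key file is scanned; the idiom relies on the data file not being empty.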
I'm confused about the "hard link" feature. I've learned from my UNIX Academy DVD training that a file can have many hard links and that each of them is an effective filename for the associated data. So let's assume that we have some very sensitive data in a file that has 20 links, and we want it deleted. I "delete" the file, but in fact I have deleted only one "name" of it. My understanding from the training is that the data is still there until we delete the last associated hard link. But how can I find the names of all of them? If we have the names, they can be removed one by one. Or maybe there's a command that can trace all the "names" and remove them at once?
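A minimal sketch of the search, assuming a find that supports -samefile (GNU and BSD find do). Hard links cannot cross filesystems, so the search is limited with -xdev and should start at the mount point of the filesystem that holds the file. The function name list_hard_links is made up for illustration.

```shell
#!/bin/sh
# list_hard_links (hypothetical name):
#   $1 = any one name of the file
#   $2 = directory to search, ideally the mount point of the file's filesystem
# Prints every path that refers to the same inode as $1.
list_hard_links() {
    find "${2:-/}" -xdev -samefile "$1" 2>/dev/null
}

# Example use:
#   list_hard_links /home/me/secret.txt /home
# GNU find can also remove every name it finds if -delete is appended
# to the find command; review the printed list first.
```

The link count shown in the second column of ls -l tells how many names to expect.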
My PHP page asks the user to enter some input, for example a year. I want that input to be passed to my .sh file, which will then show its output. How can I make the .sh file receive data from PHP, and have PHP send data to the .sh file?
We plan to migrate data files from Unix to Linux. The files on Unix are in big-endian format, whereas Linux is configured with a little-endian byte order. This is causing problems in data computation.
How can the data be ported to Linux (converting big endian to little endian)?
How can Linux be configured for big-endian byte order?
I'm trying to write a shell script which finds bits of data in a text file. At the moment I'm using grep, and basically I need a function which will look through the text file and take the data out of it. The file has days, months, years, etc., and I want to be able to type feb 06 and have it find all of the data for feb 06.
The problem I have is that I can type feb and all the information comes back for feb, but I can't make it more precise, e.g. typing feb 2009 to find just feb 2009; it seems to ignore the latter half. I've tried experimenting with egrep and having two inputs, but I can't seem to fuse them together; it only takes the first input.
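grep treats only its first non-option argument as the pattern and every later argument as a file name, so a second search word has to be folded into the one pattern. A sketch of that, assuming the month appears before the year on the same line of the data file; the function name find_entries is hypothetical.

```shell
#!/bin/sh
# find_entries (hypothetical name): $1 = month, $2 = year, $3 = file.
# Prints lines where the month is followed, anywhere later on the line,
# by the year. -i ignores case, so "feb" also matches "Feb".
find_entries() {
    grep -i -- "$1.*$2" "$3"
}

# Example use:
#   find_entries feb 2009 data.txt
```

If the two words are always adjacent in the file, quoting them as a single pattern, as in grep -i "feb 2009" data.txt, is enough.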
I would like to write a script that pulls the last line of data out of a txt file and then saves it to another txt file. The txt file that I am looking at resides here. [URL]
I know I can grab that file using wget. I've done a little scripting but nothing major.
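A short sketch of the idea, with the download and the extraction kept separate so the second part can be tried on a local file first. The function name save_last_line and the output file name are made up, and $url stands in for the address that is not shown above.

```shell
#!/bin/sh
# save_last_line (hypothetical name): reads text on stdin and writes its
# final line to the file named in $1, replacing any earlier content.
save_last_line() {
    tail -n 1 > "$1"
}

# Example use with wget writing the download to stdout (-q quiet, -O - to stdout):
#   wget -q -O - "$url" | save_last_line last_line.txt
```

Changing > to >> inside the function would append each new last line instead of overwriting the file, which may suit a script that runs repeatedly.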
Basically the situation I'm in is that someone mistakenly expanded a NAS without unmounting the drive on the server. This corrupted the superblock, and it's apparent that all the backups are no good. The drive in question was expanded from about 800GB to 1.8TB; it's done via a NAS.
At this point I'm most concerned about getting the files off the drive, I can deal with resetting the file system but I really need those files. This happened within a week of me joining this group so I'm kind of doing damage control here, backups were not taken of this particular drive.
I have a Xeltek EEPROM programmer, but I cannot read the chip data in the buffer file. When I read the chip using the programmer, the data is sent to the buffer, where I can see only the address column, the hex column, and the ASCII column, and I don't know which one is the actual data.
I manually rotated my catalina.out file, and now the file jumps to 30+Mb and when I try to view it, less tells me it might be a binary file. It sure appears to have binary data in it, about 30meg of it.
I did the rotate via a copy: I copied catalina.out to another file, then ran cat /dev/null > catalina.out
I have tried using echo: echo "" > catalina.out ...also with the same result.
This application isn't something I can just bounce when necessary. It kind of appears that the original file is still there, sort of. But it is not readable text anymore.
SunOS 5.10 tomcat 5.5.26 (version required by app vendor)
I have accidentally removed a VMware virtual disk. My host operating system is RHEL 5.2 with an ext3 file system. I have used photorec, magicrescue and foremost, but still had no luck recovering the vmdk file. I have seen in the foremost configuration file that there are some predefined file types (e.g. doc, pdf, jpg, avi, zip, etc.).
1. Is there any way to add the vmdk file extension to that configuration file?
2. If yes, how can I do it?
3. After adding vmdk to the configuration file, can I run the recovery specifically for vmdk files?