Programming :: Deleting Unwanted Characters In File
Mar 31, 2010
I'm currently working on a personal project and I'm kind of stuck. I want to open a file and edit it by deleting unwanted characters. The problem is that after I delete the unwanted characters, the file still has the same length as the original. Suppose the file contains "1234". I delete "3" (I overwrite it with "\0"), so when I check the file it shows 124. But when I check the length, both have the same size: 4.
Here is some example source code:
int num, length, length2;
num = open("a.dat", 2); // open for reading and writing
length = lseek(num, 0, 2); // Initial length
lseek(num, 2, 0); // editing
write(num, "\0", 1);
length2 = lseek(num, 0, 2); // Final length
close(num);
When I print those values, they are exactly the same. length2 should be one less than length, but both are 4. What's wrong in my code? Am I supposed to use a different character rather than "\0"?
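One way to look at it: writing a byte at an offset replaces the byte that was there, so the file keeps its length no matter which character is written. To actually remove a byte, the rest of the file has to be shifted left by one and the file then shortened. Below is a minimal sketch in Python, whose os module mirrors the open/lseek/write calls used above. The helper name delete_byte is hypothetical, and the sketch assumes a small regular file that fits in memory.

```python
import os
import tempfile


def delete_byte(path, offset):
    """Remove the byte at `offset` by shifting the tail left and truncating.

    Returns the new file length. Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)      # initial length
        if not 0 <= offset < size:
            raise ValueError("offset is outside the file")
        # Read everything that follows the unwanted byte.
        os.lseek(fd, offset + 1, os.SEEK_SET)
        tail = os.read(fd, size - offset - 1)
        # Write that tail back one position earlier.
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, tail)
        # The last byte is now a stale duplicate; cut it off.
        os.ftruncate(fd, size - 1)
        return os.lseek(fd, 0, os.SEEK_END)      # final length
    finally:
        os.close(fd)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"1234")
    os.close(fd)
    print(delete_byte(path, 2))
    with open(path, "rb") as f:
        print(f.read())
    os.remove(path)
```

In C the same idea would use ftruncate() after moving the remaining bytes. For large files, copying the wanted bytes to a new file is usually simpler than shifting in place.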
I have a very, very large log file (360MB) that I'm trying to thin out. As it turns out the majority of this file has entries that aren't necessary so I'm attempting to build a command that will strip these out. The following command works to display only the data that I do not want:
This displays exactly the data I want to delete from the file by displaying the expression and six lines above it and five lines below it. However I'm at a loss as to how to remove this data from the output and display everything else. I looked into the -v option with grep redirecting the output to a new file:
However, it doesn't work: the new file is the same size as the old one. What am I doing wrong? Is there a better method of doing this? I'm a bit out of my element since the method I'd normally use can't handle files of this size.
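Since the original commands are not shown, here is only a rough sketch of one alternative: a streaming filter in Python that drops each matching line together with the six lines before it and the five lines after it, and passes everything else through. The function name drop_with_context and its parameters are made up for this example. It holds at most `before` lines in memory, so the size of the log should not matter.

```python
import re
from collections import deque


def drop_with_context(lines, pattern, before=6, after=5):
    """Yield the lines that are NOT a match and NOT within its context."""
    regex = re.compile(pattern)
    pending = deque()  # recent lines that a later match could still remove
    skip = 0           # lines left to drop after the most recent match
    for line in lines:
        if regex.search(line):
            pending.clear()   # discard the leading context
            skip = after      # and schedule the trailing context
            continue
        if skip > 0:
            skip -= 1
            continue
        pending.append(line)
        if len(pending) > before:
            # This line is too far back to belong to any future match.
            yield pending.popleft()
    yield from pending        # whatever is left at end of input is safe


if __name__ == "__main__":
    sample = [f"line {i}\n" for i in range(20)]
    for kept in drop_with_context(sample, r"line 10$", before=2, after=1):
        print(kept, end="")
```

On a real file this could be used as `out.writelines(drop_with_context(src, "expression"))` with `src` and `out` being open file objects, where "expression" stands for whatever pattern was passed to grep.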
For example, I have a file called "file" like this one:
type=strongsubj len=1 word=absolve pos=verb stemmed=y priorpolarity=positive
type=strongsubj len=1 word=unique pos=adj stemmed=n priorpolarity=neutral
type=strongsubj len=1 word=absolutely pos=adj stemmed=n priorpolarity=neutral
type=weaksubj len=1 word=taking pos=verb stemmed=y priorpolarity=positive
type=weaksubj len=1 word=friend pos=noun stemmed=n priorpolarity=positive
type=weaksubj len=1 word=usually pos=adverb stemmed=n priorpolarity=positive
type=strongsubj len=1 word=purecolor pos=anypos stemmed=n priorpolarity=negative
type=strongsubj len=1 word=accusingly pos=anypos stemmed=n priorpolarity=negative
I want to add the plural for each noun. For example, when I find this line:
type=weaksubj len=1 word=friend pos=noun stemmed=n priorpolarity=positive
I want to add one more line:
type=weaksubj len=1 word=friends pos=noun stemmed=n priorpolarity=positive
where "s" is appended to the word "friend". I tried something like this:
<code>
cat file | while read LINE ; do
set -- ${line}
if [[ "${4#pos1=}" == "noun" ]];then
#I tried this line but it doesn't work properly:
v3==$(echo $line |sed 's/$3/$s')
#I want to find the third word "word=friend" in that line and add "s" after that word
# I don't know what command to add this new line "$v3" to the file ???
done
</code>
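As a rough illustration of the transformation being asked for, here is a small Python sketch that echoes every line and, for lines containing the field pos=noun, emits a second copy with "s" appended to the word= value. The function name add_plurals is made up for this example, and simply appending "s" is the naive rule from the question, not real English pluralization.

```python
import re

# Matches the "word=<value>" field; the value runs to the next whitespace.
WORD_FIELD = re.compile(r"\bword=(\S+)")


def add_plurals(lines):
    """Yield each line, followed by a pluralized copy when pos=noun."""
    for line in lines:
        yield line
        if "pos=noun" in line.split():
            # Append "s" to the value of the first word= field only.
            yield WORD_FIELD.sub(
                lambda m: "word=" + m.group(1) + "s", line, count=1
            )


if __name__ == "__main__":
    sample = [
        "type=weaksubj len=1 word=taking pos=verb stemmed=y priorpolarity=positive\n",
        "type=weaksubj len=1 word=friend pos=noun stemmed=n priorpolarity=positive\n",
    ]
    for out in add_plurals(sample):
        print(out, end="")
```

Writing the result to a new file and then replacing the original avoids reading and writing the same file at once.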
What command could I use in terminal to delete all ASCII characters? That is, delete a-z, A-Z, 0-9, and all punctuation? I have a file containing Chinese characters, and I want to remove everything else and leave just the Chinese.
I can use grep to leave only the lines that have Chinese in them, but this still leaves a lot of non-Chinese stuff on those lines. Does anyone know how I could actually remove everything that isn't Chinese?
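One possible approach, sketched in Python rather than as a terminal one-liner: remove every character that is not in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), keeping newlines so the line structure survives. The function name keep_chinese is made up for this example. The chosen range is an assumption about what "Chinese" should mean here: it also strips full-width punctuation and does not cover the rarer extension blocks.

```python
import re

# Anything that is neither a basic CJK ideograph nor a newline.
NON_CJK = re.compile(r"[^\u4e00-\u9fff\n]")


def keep_chinese(text):
    """Return `text` with everything except CJK ideographs and newlines removed."""
    return NON_CJK.sub("", text)


if __name__ == "__main__":
    print(keep_chinese("abc 你好, world 123!\n"), end="")
```

If only plain ASCII should go and all other characters should stay, the pattern `[\x00-\x09\x0b-\x7f]` (ASCII except newline) could be used instead.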
Often in bash we read lines from stdin in a loop and implicitly discard the remaining stdin by terminating the loop. Is it possible to discard it without terminating the loop? It could lead to smaller code.
Here's an example which uses two loops and below is the same algorithm assuming unwanted stdin can be discarded
I would like to know how I can get the output from the following dmidecode command in example 1 to look like example 2 without having to grep -v all the unwanted lines. Is there a way in awk or sed? Example 1
Code: Processor Information Socket Designation: Socket 1 CPU 1
I am using g++ 4.5.2. I copied and tried a simple program (Making a Class Writable to a Stream), Example 10-6 from page 363 of the book C++ Cookbook. You can download and test it yourself: [URL]
I have a problem. I have a txt file that was converted from pcap. I want to eliminate unwanted text from this txt file. Here is an example of what I want to do:
There are these shortcuts in the GTK3 file chooser: [URL] ....
With the exception of "desktop", which I included in the red box by mistake, I don't want those shortcuts to be there. I've even deleted most of those folders from my home directory because I have no use for them, but the shortcuts remain even after the folders are gone.
How can I remove or disable these shortcuts? And while I'm at it, I notice that it's not using my selected icon theme for those icons. Is there any way to make it use my choice of icon theme?
I just installed Fedora 12 on a Core i3 machine. Everything looks fine, but I have a huge problem: every time I upload a file (using ftp or sftp), some weird characters are included inside the file. For example:
I'm trying to get my program to go through the string typed in by the user and strip it of EVERYTHING but the numbers. I can't place my finger on what I'm missing.
Code:
#include <iostream>
#include <string>
using namespace std;
int main()
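The posted code is cut off, so this is not a fix for it, only a sketch of the general idea in Python: walk the string once and keep a character only if it is one of the ten ASCII digits. The function name digits_only is made up for this example.

```python
import string


def digits_only(text):
    """Return only the ASCII digits 0-9 from `text`, in their original order."""
    return "".join(ch for ch in text if ch in string.digits)


if __name__ == "__main__":
    print(digits_only("Call 555-0199 ext. 42!"))
```

The same loop carries over to C++ by appending to a second std::string whenever isdigit() is true for the current character.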
I tried running a back-up/restore script in a WordPress install to migrate from one server to another... long story made short, I ended up doing it manually and all is well on that front
The one remnant from that botched script is that it tried creating a directory 'wp-backup' and then a file inside that directory, but it tried using '\' instead of '/'. So what it created was a file named 'wp-backup\index.php' with a file size of 0 bytes.
The problem is thus: I can't change the permissions nor delete the file, because of the invalid file name. I don't have direct shell access (that cost *extra*, of course) and every time I try with the web-based file manager (Quixplorer) it sees it as 'wp-backupindex.php', as though the '\' is acting as an escape sequence in the file name. Same thing in FileZilla: I can't do anything to the file without it complaining about the invalid file name.
How can I ixnay this one file given the limitations above (no shell access), short of calling and bugging tech support to delete the file for me?
I'm working on my ncurses application, written in C. I get user input through a loop which uses getchar(). I was able to recognize Ctrl-n by comparing the keypress to ASCII character 16, and this seems to work fine. However, I noticed that the ASCII character for Ctrl-j (10) is the same as Line Feed. I tested this, and if I press enter on the keyboard I get the same ASCII value as when I press Ctrl-j.
So, what do I do if I want Ctrl-j to mean something different in my program than pressing enter? The ncurses terminal mode is set to raw, with a 100 millisecond timeout, and keypad is on (I'm already using the up and down arrow keys).
I see I'm finally posting an AWK question rather than an answer for a change. I wanted to make an AWK script that would scramble all the characters in each field, but leave the first and last characters where they were.
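To make the goal concrete, here is the same idea sketched in Python rather than AWK: split the line into whitespace-separated fields, shuffle the interior characters of each field, and leave the first and last characters alone. The names scramble_field and scramble_line are made up for this example. Fields are rejoined with single spaces, which is also what AWK does with its default output separator once a field has been modified.

```python
import random


def scramble_field(field, rng=random):
    """Shuffle the interior characters; keep the first and last in place."""
    if len(field) <= 3:
        return field  # zero or one interior character: nothing to scramble
    middle = list(field[1:-1])
    rng.shuffle(middle)
    return field[0] + "".join(middle) + field[-1]


def scramble_line(line, rng=random):
    """Apply scramble_field to every whitespace-separated field of a line."""
    return " ".join(scramble_field(f, rng) for f in line.split())


if __name__ == "__main__":
    print(scramble_line("scramble all the characters in each field"))
```

Passing a seeded `random.Random` instance as `rng` makes the output repeatable, which helps when testing.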
In a file, I have to grep for a particular word, cut 8 characters of that word, and replace the last characters with spaces if they are _1, e.g. HP4350_1. I did grep | cut -c 2-9 but didn't know how to truncate the last two characters if they are '_1'. I used tr '[_1] '[ ]', but it replaced every underscore and every 1 individually instead of only the sequence '_1'.
I'm trying to make a webpage that will display the bash variables I have in a file. These variables are used in a bash script that is run on my server. The file looks like this:
I just started using Eclipse. I followed all the instructions to set up C++ and run a simple hello world program. However, I seem to have hit a snag. When I build the solution I get an error. I realized that where there should be a > there is a | instead. Every time I type >, a | prints instead, and I have no idea how to fix this.
I wrote a java program that writes strings to a file. The strings contain foreign language characters. When I run the program in Windows, the output file shows the foreign characters. However, when I attempt the same operation in Linux, the output file shows a white question mark in a black background instead of the foreign characters. The same Linux system could display the foreign characters if I copy the output file from Windows to Linux. I tried to create the output file using gedit that my program would then add additional strings to and chose Unicode-32 for encoding but still the same problem.
What could I do to get the program to display the foreign language characters from output text file?