Programming :: Remove Odd Characters From CSV Files Using A Script?
Apr 27, 2010
I hope you can help. I have a collection of spreadsheets with data that needs to be imported into SQL. The data was entered manually, although some portions were copied and pasted from the web.
When converting these sheets to a CSV I get strange characters where it looks as though data has been copied and pasted. Is it possible to write a script (AWK?) to pull out these characters?
I guess the script will need to keep alpha characters, spaces, numerics and commas but nothing else. How easy is this to do?
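A minimal sketch of the whitelist approach described above, using `tr` instead of AWK (a swapped-in tool, not necessarily what the replies suggested). The function name `clean_csv` is made up for illustration. It keeps ASCII letters, digits, spaces, commas and newlines and drops every other byte.

```shell
# Hypothetical helper: keep only ASCII letters, digits, spaces, commas and
# newlines; delete every other byte read from standard input.
clean_csv() {
    # LC_ALL=C makes tr work byte by byte, so multi-byte characters pasted
    # from the web are removed completely.
    LC_ALL=C tr -cd 'A-Za-z0-9 ,\n'
}

# Example: the curly quotes and the trademark sign are dropped.
printf 'Acme “Widget”™,42,New York\n' | clean_csv
```

Note that such a strict whitelist also removes periods, hyphens and double quotes, so decimal numbers and quoted CSV fields would be altered; add those characters to the set if the data needs them.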
View 7 Replies
Feb 20, 2011
I am running a MySQL query from a bash shell script like this:
mysql translator -u root --password=******** -e "SELECT word FROM tagalog ORDER BY RAND() LIMIT 1" | while read line; do
echo $line
done
So when I echo the value of $line I get:
word
magandang umaga
"word" is the name of the column in the table and "magandang umaga" is a randomly selected value from that column. Is there a way I can keep the column name out of $line, so that echo $line outputs only the randomly selected entry, e.g. magandang umaga?
View 13 Replies
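The `mysql` client has a `-N` (`--skip-column-names`) option that suppresses the header row, which is probably the most direct fix for the question above. As a fallback, the header can be dropped with `tail`. The sketch below simulates the client output with `printf` because no database is assumed here; `skip_header` is a made-up name.

```shell
# Hypothetical helper: drop the first line (the column header) and pass the
# remaining lines through unchanged.
skip_header() {
    tail -n +2
}

# Simulated client output, standing in for the real command:
#   mysql translator -u root -p -e "SELECT word FROM tagalog ORDER BY RAND() LIMIT 1"
printf 'word\nmagandang umaga\n' | skip_header | while IFS= read -r line; do
    echo "$line"
done
```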
Jun 3, 2010
I'm having a bit of a headbanger trying to work this one out. I'm trying to remove all of the characters on a line apart from the last 17. For example, I need to change this:
Code:
00000000000000000089;0bbfaeb8
01000000000000000089;0bcb5948
00000000000000000089;0bcc4c40
[code]....
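One way to keep only the last 17 characters of each line is a `sed` substitution with a capture group. This is a sketch, not necessarily the answer given in the thread; `last17` is a made-up name, and lines shorter than 17 characters are left unchanged.

```shell
# Hypothetical helper: keep only the last 17 characters of each input line.
last17() {
    # The greedy .* swallows everything except the final 17 characters,
    # which the group captures and the replacement keeps.
    sed -E 's/^.*(.{17})$/\1/'
}

printf '%s\n' '00000000000000000089;0bbfaeb8' | last17
```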
View 5 Replies
Nov 7, 2010
I am using 'sed -e /foo/d' to delete matching lines from a file. I discovered that some lines contain random (extended?) characters like 'ủ', and I would like to delete those lines as well. The lines in the file should contain only alphanumeric characters.
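A sketch of one possible approach: delete any line that contains a byte outside the ASCII letters and digits. The name `drop_non_alnum_lines` is hypothetical, and forcing the C locale is an assumption made so that each byte of a multi-byte character counts as "not alphanumeric".

```shell
# Hypothetical helper: delete every line that contains anything other than
# ASCII letters and digits.
drop_non_alnum_lines() {
    LC_ALL=C sed '/[^A-Za-z0-9]/d'
}

printf 'abc123\nfooủbar\nxyz\n' | drop_non_alnum_lines
```

Lines containing spaces or punctuation are deleted too; if those are legitimate, add them inside the brackets. The same expression can be combined with the existing one, as in `sed -e '/foo/d' -e '/[^A-Za-z0-9]/d'`.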
View 8 Replies
Apr 14, 2010
How can I remove characters from grep output using sed? code...
View 9 Replies
Jan 30, 2011
I am reading strings from a file using the readline() function. The file contains some strings that consist only of special characters, and I need to skip those strings. The special characters are not all the same. How can I do this in Python?
View 2 Replies
Sep 18, 2011
I have a directory (Linux user) with a number of files which contain an added [!] to the end of each file name so that each file reads out as:
foo something [!].zip
bar something [!].zip
helloworld [!].zip
etc.
What is the quickest way to batch rename these to remove the ending [!] character combination from these file names?
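A plain shell loop is one way to do the batch rename. This is only a sketch: `strip_bang_suffix` is a made-up name, it works on the current directory, and it skips a file when the target name already exists.

```shell
# Hypothetical helper: in the current directory, rename "NAME [!].zip" to
# "NAME.zip".
strip_bang_suffix() {
    suffix=' [!].zip'
    # The quoted part of the glob is literal, so [!] is not a bracket pattern.
    for f in *" [!].zip"; do
        [ -e "$f" ] || continue        # no matches: the glob stays literal
        new="${f%"$suffix"}.zip"       # quoted suffix is removed literally
        if [ -e "$new" ]; then
            echo "skipping $f: $new already exists" >&2
        else
            mv -- "$f" "$new"
        fi
    done
}

# Usage: cd into the directory holding the archives, then run
#   strip_bang_suffix
```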
View 2 Replies
Jul 26, 2010
I am working with a Tcl script and have some strings in the following format (RE):
[a-zA-Z]+[0-9]{6}-[0-9]
There are some leading letters, combinations of capital and lowercase. Then six digits, followed by a hyphen, then one more digit. I would like to remove all of the leading alphabetic characters from the string. The resulting string would then be in this format: [0-9]{6}-[0-9]. In other words, six numeric digits, a hyphen, then one more digit.
I have tried:
Code:
set newstr [string trimleft $origstr alpha]
But that only removes the first alphabetic character, not all of them.
I couldn't get anything with regsub to work correctly, but I am somewhat of a noob with RE's in general and regsub in particular. There are usually 5 leading letters at the beginning of these strings, and I could in most cases get away with using string replace and constant indices to extract the substring. However, my preference is for this to be robust enough to handle all cases with 1 through n leading alphabetic characters.
View 3 Replies
Jul 19, 2010
I have a bunch of files that I need to rename. Ordinarily this is a pretty easy task, but here the file names contain Chinese / Japanese characters, i.e. [$$$$$$$$].SOMETHING BLAH BLAH.ext, where the "$$$$" stand in for the Chinese characters. sed and perl don't seem to handle the Chinese characters correctly, so a regular expression like 's/^[*.]//', which would normally work, doesn't. From what I have read so far, I believe these characters are double-encoded UTF-8 (not 100% sure), which could be the problem. So far I've tried numerous different regexes, as well as playing around with convmv to see if I could convert the filenames to single-encoded characters, but I've had no luck.
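Worth noting: in the expression quoted above, `[*.]` is a character class that matches a single `*` or `.`, so it would not strip a bracketed tag whatever the encoding. A sketch that sidesteps the regex engine altogether is shell prefix removal, which matches the literal `[`, anything, then `].`. The function name `strip_bracket_prefix` is hypothetical.

```shell
# Hypothetical helper: print NAME with a leading "[...]." tag removed.
strip_bracket_prefix() {
    name=$1
    # "#" removes the shortest prefix matching: "[", any characters, "]."
    printf '%s\n' "${name#\[*\].}"
}

strip_bracket_prefix '[日本語].SOMETHING BLAH BLAH.ext'
```

Inside a loop this could feed `mv -- "$f" "$(strip_bracket_prefix "$f")"`, after checking that the new name differs from the old one.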
View 1 Replies
Apr 8, 2010
I was preparing a script that removes all files in a directory that are more than 24 hours old. I tried something like this: find . ( -name 'log.*' -mtime +1 ) -exec rm {}; but it throws an error: missing argument to exec.
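That error generally means `find` never saw a separate `;` (or `+`) argument ending the `-exec` clause: `{};` without a space is a single word, and a bare `;` has to be escaped from the shell as `\;`. The parentheses need escaping as `\(` and `\)` as well. A minimal sketch follows; `purge_old_logs` is a made-up name, and the `log.*` pattern is taken from the post.

```shell
# Hypothetical helper: delete files named log.* under DIR whose contents were
# last modified at least 24 hours ago.
purge_old_logs() {
    dir=${1:-.}
    # -mtime +0 means "age in whole days is greater than 0", i.e. 24 hours
    # or more; -mtime +1 would only match files at least two days old.
    find "$dir" -type f -name 'log.*' -mtime +0 -exec rm -f -- {} +
}

# Usage (the path is only an example):
#   purge_old_logs /path/to/logs
```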
View 8 Replies
May 25, 2010
Thanks y'all for the great script and explanation. This helped a lot in my own project, and I thought I'd share the effort. The project is this: I've got lots of duplicate JPGs from family members who've given the same photo different names. Since md5sum generates a "fingerprint" based on the file contents, not the name, I want to use the md5sum of each JPG to uniquely name each photo and also remove exact duplicates.
It has the following flaws:
0) it doesn't handle certain non-alphanumerics
1) it keeps both photo-shopped and unaltered photos (different md5s)
2) it (currently) doesn't preserve descriptive filenames.
(For me, removal of duplicates is more important than keeping the filenames. I may change that to concatenate the md5 and the filename.) Please note that the commented "rename" command should be used to strip non-alphanumerics from the file names, and the script should be launched with the commented "find" command.
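The poster's script is not reproduced in this excerpt, so the following is only a sketch of the idea described above, not the original code. It assumes GNU `md5sum` is installed, handles only lowercase `.jpg` names in a single directory, and uses the made-up name `md5_rename`.

```shell
# Hypothetical helper: rename every *.jpg in DIR to <md5>.jpg and delete
# byte-identical duplicates.
md5_rename() {
    dir=${1:-.}
    for f in "$dir"/*.jpg; do
        [ -f "$f" ] || continue
        # Reading from stdin makes md5sum print "<hash>  -"; keep the hash.
        sum=$(md5sum < "$f" | cut -d ' ' -f 1)
        target="$dir/$sum.jpg"
        if [ "$f" = "$target" ]; then
            continue                   # already named after its checksum
        elif [ -e "$target" ]; then
            rm -f -- "$f"              # same contents already kept
        else
            mv -- "$f" "$target"
        fi
    done
}
```

As the post says, an edited copy of a photo has a different checksum, so it survives as a separate file.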
View 1 Replies
Nov 9, 2010
However, the ffmpeg command generates a temporary file blahblah.mpg.tmp of about 1GB per hour of transcoded video. My issue is that I can't seem to delete these files automatically from any bash script. From the command line, I can cd to the directory and just rm -f *.tmp and they get deleted. However, from my script, that same command doesn't remove those files. I thought maybe the file was in use, so I put in a sleep command for about an hour before the delete happens, but it still fails. I also put rm -f /mnt/mythtv/*.tmp in a root cronjob and it still doesn't delete the files.
If I just rm *.tmp I do get a prompt about "Are you sure you want to delete this write protected file?". But the -f switch seems to work fine as a normal user from the command line and just deletes them. Does anyone have an idea how to troubleshoot this problem? The tmp files are generated on their own xfs partition, mounted as /mnt/mythtv.
View 8 Replies
Dec 13, 2010
file = TT.ParlayX_RequestLog_78653_20101212180044.log.17490
1. Want to remove the characters before the first dot (.) including the dot (.)
2. Want to remove the characters after the last dot (.) including the dot (.)
That is, basically, I want the output as:
ParlayX_RequestLog_78653_20101212180044.log
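Both trims can be done with shell parameter expansion alone. A short sketch using the file name from the post; the variable name `trimmed` is just for illustration.

```shell
file='TT.ParlayX_RequestLog_78653_20101212180044.log.17490'

trimmed=${file#*.}      # remove the shortest prefix ending in "."   (drops "TT.")
trimmed=${trimmed%.*}   # remove the shortest suffix starting at "." (drops ".17490")

echo "$trimmed"
```

`#` and `%` pick the shortest match, which is what "first dot" and "last dot" require here; `##` and `%%` would pick the longest match instead.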
View 7 Replies
Oct 19, 2010
I want to delete all files within a specific folder without actually deleting the folder. What is a good bash command for this? I found this one but encountered some errors, even though I am executing it within the specific folder:
useratdebian:/home/user/folder# find . -type f -exec rm -rf {} ;
[1] 5052
useratdebian:/home/user/folder# find: missing argument to `-exec'
[1]+ Exit 1 find . -type f -exec rm -rf
The command as it appears is:
find . -type f -exec rm -rf {} ;
how to delete only the files contained within the folder called "folder" for example?
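The `[1] 5052` job line in the transcript suggests the shell saw an `&` and ran the command in the background, so `find` probably never received the terminating `;` (which also has to be escaped as `\;`). A sketch that avoids the issue by ending `-exec` with `+`; the function name `empty_folder` is hypothetical.

```shell
# Hypothetical helper: remove all regular files below DIR, leaving DIR itself
# and any subdirectories in place.
empty_folder() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    # "+" ends the -exec clause and needs no escaping, unlike ";".
    find "$dir" -type f -exec rm -f -- {} +
}

# Usage (the path is the example from the post):
#   empty_folder /home/user/folder
```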
View 4 Replies
Jun 5, 2009
I want to remove duplicate or multiple similar lines from multiple files. For example, if I have four files, file1.txt, file2.txt, file3.txt and file4.txt, I would like to find and remove similar lines across all of them, keeping only one copy of each. I only know that uniq can be used to remove similar lines from a sorted file.
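If "similar" means identical lines, a common awk idiom prints each line only the first time it is seen, without sorting. This sketch writes the merged result to standard output and leaves the input files untouched; `unique_lines` is a made-up name.

```shell
# Hypothetical helper: print each distinct line once, in first-seen order,
# across all the files given as arguments.
unique_lines() {
    # seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++
    # is true only for first occurrences.
    awk '!seen[$0]++' "$@"
}

# Usage:
#   unique_lines file1.txt file2.txt file3.txt file4.txt > merged.txt
```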
View 9 Replies
Feb 21, 2011
I need a shell script which will search for and remove a JavaScript snippet from all htm, html and php files.
Code:
<script type="text/javascript"> if (navigator.cookieEnabled) {var user = getCookie("seostop");if (user !=1){winchristop();setCookie("seostop", "1", 7, "/");}}function setCookie(name, value, expiredays, path, domain, secure) { if
[code]....
This is the script that I want to remove. I don't want to change the ownership of any of the files from which this script will be removed. I just want to remove this specific line from all files in all directories.
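A sketch of one way to do this while keeping ownership: each matching file is overwritten in place rather than replaced, so its inode, owner and mode stay the same. The snippet in the post is truncated, so the sketch matches on a fixed substring instead of the whole line. The name `remove_injected_line` and the choice of marker are assumptions, and file names containing newlines are not handled.

```shell
# Hypothetical helper: under DIR, delete every line containing MARKER from
# *.htm, *.html and *.php files.
remove_injected_line() {
    dir=$1
    marker=$2
    find "$dir" -type f \( -name '*.htm' -o -name '*.html' -o -name '*.php' \) |
    while IFS= read -r f; do
        grep -qF -- "$marker" "$f" || continue   # leave clean files untouched
        tmp=$(mktemp) || continue
        grep -vF -- "$marker" "$f" > "$tmp"      # every line without the marker
        cat "$tmp" > "$f"                        # overwrite in place
        rm -f "$tmp"
    done
}

# Usage (the path is a placeholder; the marker is a fragment of the snippet):
#   remove_injected_line /path/to/site 'getCookie("seostop")'
```

Taking a backup first is advisable, since any legitimate line containing the marker would be removed as well.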
View 2 Replies
Feb 16, 2009
I'm working on my ncurses application, written in C. I get user input through a loop which uses getchar(). I was able to recognize Ctrl-n by comparing the keypress to ASCII character 16, and this seems to work fine. However, I noticed that the ASCII character for Ctrl-j (10) is the same as the Line Feed. I tested this, and if I press Enter on the keyboard I get the same ASCII value as when I press Ctrl-j.
So, what do I do if I want Ctrl-j to mean something different in my program than pressing Enter? The ncurses terminal mode is set to raw, with a 100 millisecond timeout, and keypad is on (I'm already using the up and down arrow keys).
View 1 Replies
Feb 18, 2010
I am trying to create an array containing all ASCII characters, how do I create one:
Code:
#!/bin/bash
CHARLIST=( a b c d e f g h i j k l m n o p q r s t u v w x y z
[code]...
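Rather than typing every character by hand, the array can be built from character codes. This sketch covers only the 95 printable characters (codes 32 to 126), which is an assumption about what is wanted: NUL cannot be stored in a bash variable, and a newline would be stripped by command substitution. It reuses the `CHARLIST` name from the post.

```shell
#!/bin/bash
# Build an array of the printable ASCII characters, codes 32 to 126.
CHARLIST=()
for code in $(seq 32 126); do
    # Turn the code into a three-digit octal escape such as \101 ...
    octal=$(printf '%03o' "$code")
    # ... and let printf expand that escape into the character itself.
    CHARLIST+=("$(printf "\\$octal")")
done

echo "${#CHARLIST[@]} characters, first: '${CHARLIST[0]}', last: '${CHARLIST[94]}'"
```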
View 6 Replies
May 12, 2010
I see I'm finally posting an AWK question rather than an answer for a change. I wanted to make an AWK script that would scramble all the characters in each field, but leave the first and last characters where they were.
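A minimal sketch of one way to do this in awk, shuffling the interior characters of each field with a Fisher-Yates shuffle. It is not the poster's script; `scramble_fields` is a made-up name, fields are assumed to be whitespace-separated, and rebuilding a field makes awk rejoin the line with single spaces.

```shell
# Hypothetical helper: shuffle the interior characters of every field on each
# input line, keeping the first and last character of each field in place.
scramble_fields() {
    awk 'BEGIN { srand() }
    {
        for (f = 1; f <= NF; f++) {
            n = length($f)
            if (n > 3) {                       # fewer than 2 interior chars: skip
                m = n - 2
                for (i = 1; i <= m; i++) c[i] = substr($f, i + 1, 1)
                for (i = m; i > 1; i--) {      # Fisher-Yates shuffle of c[1..m]
                    j = int(rand() * i) + 1
                    t = c[i]; c[i] = c[j]; c[j] = t
                }
                inner = ""
                for (i = 1; i <= m; i++) inner = inner c[i]
                $f = substr($f, 1, 1) inner substr($f, n, 1)
            }
        }
        print
    }'
}

echo 'scramble these words' | scramble_fields
```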
View 14 Replies
Mar 17, 2010
In a file I have to grep for a particular word, cut 8 characters of that word, and replace the last characters with spaces if they are _1, e.g. HP4350_1. I used grep | cut -c 2-9 but didn't know how to blank out the last two characters when they are '_1'. I tried tr '[_1]' '[ ]', but it replaced every underscore and every 1 rather than only '_1' together.
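`tr` maps individual characters, which is why every `_` and every `1` was replaced. Matching the two characters as a sequence at the end of the line needs something like `sed`. A sketch, with the made-up name `blank_suffix`, meant to be placed after the existing grep and cut stages:

```shell
# Hypothetical helper: replace a trailing "_1" with two spaces, leaving any
# other "_" or "1" characters alone.
blank_suffix() {
    sed 's/_1$/  /'
}

printf 'HP4350_1\nHP1_3501\n' | blank_suffix
```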
View 5 Replies
Nov 12, 2010
How do I delete every file on my whole disk whose name says something along the lines of "chromium os"? I'm just wondering because I tried to install it through VirtualBox and failed, and I probably have three different partitions consisting of nothing.
View 3 Replies
Aug 15, 2010
I'm trying to make a webpage that will display the bash variables I have in a file. These variables are used in a bash script that is run on my server. The file looks like this:
SERVER=canfs01
SHARE=public
USERNAME=guest
[code]....
View 7 Replies
Mar 3, 2011
I just started using Eclipse. I followed all the instructions to set up C++ and run a simple hello world program. However, I seem to have hit a snag. When I build the solution I get an error. I realized that where there should be a > there is a | instead. Every time I type >, a | is printed instead, and I have no idea how to fix this.
View 3 Replies
Jan 17, 2011
I need help splitting a string every n characters. I'm using Python 2.7.1 because I'm using older libraries, if that matters.
For example, if this is my input:
Quote:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed ac elit nibh, vitae venenatis ligula. Vestibulum a varius turpis.
Splitting it by every 10th character should produce this list:
Quote:
[ "Lorem ipsu", "m dolor si", "t amet, co", "nsectetur ", "adipiscing", " elit. Sed", " ac elit n", "ibh, vitae", " venenatis", " ligula. V", "estibulum ", "a varius t", "urpis." ]
View 6 Replies
Feb 19, 2011
I wrote a Java program that writes strings to a file. The strings contain foreign language characters. When I run the program in Windows, the output file shows the foreign characters. However, when I attempt the same operation in Linux, the output file shows a white question mark on a black background instead of the foreign characters. The same Linux system can display the foreign characters if I copy the output file from Windows to Linux. I also tried creating the output file in gedit, choosing Unicode-32 as the encoding, so that my program would then append strings to it, but the problem remains.
What could I do to get the program to display the foreign language characters from output text file?
View 6 Replies
Nov 11, 2010
My goal is to send escape characters from Linux to make scanner's LED blink. I've started with a simple "beep" command:
echo -e "\07"
and it worked. We are using WaveLink emulator and the escape sequences for the LED are
echo -e "\033%150;200;5L"
Linux returns me this: 150;200;5L
So, it doesn't work. What am I doing wrong for the LED sequences?
View 1 Replies
Mar 8, 2010
How can I filter ASCII quotes ( ' ) and double quotes ( " ) so that I can replace them with the UTF-8 equivalent? If I copy text from a Word document (ASCII) and upload it to a web page with PHP, the database (UTF-8) replaces these characters with incorrect character(s). I need some function that will replace these characters, but I don't know how to differentiate the ASCII quotes from the UTF-8 quotes without (somehow) converting the string to hex and then preg_replace'ing the hex code for the symbol.
View 8 Replies
Oct 13, 2009
I am working on an application that will convert English text into equivalent Indian language text. Since Unicode is the standard, I will be using it. In most of the western languages each code-value directly refers to the glyph index and placing the code-values side by side will give the required display. This one to one mapping is not possible in Indian languages where rendering syllables is required rather than rendering just consonants and vowels. Many of the complex characters are made up by combining several unicode values.
My question here is: how does Linux render this Unicode text correctly? More specifically, what package is used? I believe Windows uses Uniscribe for rendering. I assume there is an operating system library for handling the text rendering, or do I need to write my own rendering engine? How do programs like Firefox and gedit show Unicode text? Do they also have proprietary engines for correct rendering?
View 2 Replies
Jan 29, 2011
For example, I have a file called "file" like this one:
type=strongsubj len=1 word=absolve pos=verb stemmed=y priorpolarity=positive
type=strongsubj len=1 word=unique pos=adj stemmed=n priorpolarity=neutral
type=strongsubj len=1 word=absolutely pos=adj stemmed=n priorpolarity=neutral
type=weaksubj len=1 word=taking pos=verb stemmed=y priorpolarity=positive
type=weaksubj len=1 word=friend pos=noun stemmed=n priorpolarity=positive
type=weaksubj len=1 word=usually pos=adverb stemmed=n priorpolarity=positive
type=strongsubj len=1 word=purecolor pos=anypos stemmed=n priorpolarity=negative
type=strongsubj len=1 word=accusingly pos=anypos stemmed=n priorpolarity=negative
I want to add the plural for the noun, for example if find this line:
type=weaksubj len=1 word=friend pos=noun stemmed=n priorpolarity=positive
will add one more line :
type=weaksubj len=1 word=friends pos=noun stemmed=n priorpolarity=positive
where we add "s" for the word friend
I did try to do like that:
<code>
cat file | while read LINE ; do
set -- ${line}
if [[ "${4#pos1=}" == "noun" ]];then
#I tried this line but it doesn't work properly:
v3==$(echo $line |sed 's/$3/$s') #I want to find the third word "word=friend" in that line and add "s" after that word
# I don't know what command to add this new line "$v3" to the file ???
done
</code>
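A few things stand out in the attempt above: the loop reads into LINE but then uses $line, the test strips "pos1=" although the field starts with "pos=", and the if has no closing fi. Instead of patching the loop, here is a sketch that does the whole job in awk. It assumes the field order shown in the sample (word= third, pos= fourth) and simply appends "s", which is not a correct plural for every noun. The name `add_plurals` is made up.

```shell
# Hypothetical helper: echo every line, and after each noun line emit a copy
# whose word= field has an "s" appended.
add_plurals() {
    awk '{
        print                                   # the original line
        if ($4 == "pos=noun" && $3 ~ /^word=/) {
            $3 = $3 "s"                         # word=friend -> word=friends
            print                               # the added plural line
        }
    }' "$@"
}

# Usage: write to a new file, then replace the original once it looks right.
#   add_plurals file > file.new
```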
View 12 Replies
Apr 26, 2010
I have a web application on a Linux server, and all my Java code lives there. Whenever a user enters non-ASCII characters (e.g. ∞) in a text field in my web application and I check the log of my Java code on the Linux server, the log shows weird characters.
Suppose the user entered ∞ in the text field. I should get ∞ in my log too. However, I get weird characters instead.
View 14 Replies