Say I want to write some of the more exotic Unicode characters to a file. What's the proper way to do it? For decimal integers we use %d, for floating point we use %f, and for hex we use %x. What's the equivalent marker for Unicode values that C understands?
I wrote a Java program that writes strings to a file. The strings contain foreign-language characters. When I run the program on Windows, the output file shows the foreign characters. However, when I attempt the same operation on Linux, the output file shows a white question mark on a black background instead of the foreign characters. The same Linux system can display the foreign characters if I copy the output file from Windows to Linux. I also tried creating the output file in gedit with UTF-32 encoding, so that my program would then append strings to it, but the problem remained.
What can I do so that the foreign-language characters in the program's output text file display correctly?
I am working on an application that will convert English text into equivalent Indian-language text. Since Unicode is the standard, I will be using it. In most Western languages, each code value maps directly to a glyph index, and placing the code values side by side gives the required display. This one-to-one mapping is not possible in Indian languages, where syllables must be rendered rather than just consonants and vowels. Many of the complex characters are made up by combining several Unicode values.
My question is: how does Linux render this Unicode text correctly? More specifically, what package is used? I believe Windows uses Uniscribe for rendering. I assume there is an operating system library that handles text rendering, or do I need to write my own rendering engine? How do programs like Firefox and gedit show Unicode text? Do they also have proprietary engines for correct rendering?
From time to time, new characters are added to the Unicode standard. For instance, in 2008 a capital sharp s (the upper-case form of the German eszett) was added at position U+1E9E. What needs to be done to make the new character part of the various fonts we use on our desktops?
While modifying the definition of my PS1, I saw that "\[" and "\]" markers should be added to help bash compute the right display length. Many examples on the web do not use them or even mention them. I searched for a way to add them automatically, for example with sed, but I didn't find any example. Are they still needed, and is there a recommendation not to use sed to define PS1?
In previous versions of Fedora I was able to press Ctrl + Shift + U, enter the Unicode number (e.g., 20ac), press Enter, and get a euro character. In Fedora 12 I do not have that feature. My language is US English.
I'm using openSUSE 11.2 with GNOME, dual-booted with Windows 7, installed from scratch about a week ago. The bottom line is: Nautilus displays a series of matrices, "x"s and other symbols instead of Hebrew characters.
Screenshot:
It worked fine at the beginning, but once I started installing updates it stopped working. I installed a whole bunch of updates and programs, so I don't know what changed it. The weird part is (as you can see in the screenshot) that the shortcut to the left of a Hebrew-named folder shows up correctly only the first time Nautilus opens after starting. As soon as I closed the Nautilus window after taking the screenshot and reopened it, that shortcut also displayed like the others. The screenshot is of my NTFS Windows drive; however, the problem occurs in my home folder as well.
On SuSE 10.0 I used to be able to use Shift + Ctrl + the Unicode code. That does not seem to work now. How can I get this feature back? I miss it. I used to use it a lot to put the copyright symbol over my artwork in GIMP.
I recently installed Debian Lenny and I'm having issues with some Unicode characters. Instead of displaying the symbols properly, it shows one of the following, depending on the font/app:
1) Square outline with four letters/numbers arranged inside
2) Just a blank square outline
3) Just a blank space
I haven't been able to test all possible characters, but from a quick check it seems that Cyrillic works properly and Japanese doesn't. A few Google searches later, I'm no wiser on how to fix the issue. Any help?
My terminal shows Unicode squares (the little square with its 2-byte Unicode value inside it) whenever I press a control character while running a program (e.g. cat or ping). See this example, where I show the keys I pressed, then turn off echoctl and repeat the sequence: http://imagebin.ca/img/mXbutJ1.png
The 0003 is when I pressed Ctrl+C, and the 001A is when I pressed Ctrl+Z. Can anybody tell me why this happens or how to turn it off? This is inside a gnome-terminal session, though I don't think gnome-terminal is the cause. If, inside this exact same bash session, I open screen (by typing "screen"), it doesn't do this anymore, and Ctrl+C/Z/etc. is completely quiet.
For some reason unknown to me, "Ctrl+Shift+U, <unicode number>" doesn't work in Fedora 12. I had gotten quite used to this method for entering several symbols, and if you know what you want, it is a lot faster than using the character map. It worked in all recent Fedora versions. Does anyone know how to enable this functionality?
I am looking for a function that checks whether a character is an ASCII or a Unicode character. According to the following [URL], the function IsTextUnicode belongs to the Windows VC++ library. I would like to know which library/function provides such a facility.
I'm working on my ncurses application, written in C. I get user input through a loop which uses getchar(). I was able to recognize Ctrl-n by comparing the keypress to ASCII character 16, and this seems to work fine. However, I noticed that the ASCII character for Ctrl-j (10) is the same as Line Feed. I tested this, and if I press Enter on the keyboard I get the same ASCII value as when I press Ctrl-j.
So, what do I do if I want Ctrl-j to mean something different in my program than pressing Enter? The ncurses terminal mode is set to raw, with a 100 millisecond timeout, and keypad is on (I'm already using the up and down arrow keys).
I see I'm finally posting an AWK question rather than an answer for a change. I wanted to make an AWK script that would scramble all the characters in each field, but leave the first and last characters where they were.
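The scrambling described above can be sketched outside AWK to pin down the intended behaviour. The following is a minimal Python sketch, not an AWK solution; the names `scramble_field` and `scramble_line` are hypothetical, and it assumes fields are separated by whitespace and rejoined with single spaces.

```python
import random


def scramble_field(field, rng):
    """Shuffle the inner characters of one field; first and last stay put."""
    if len(field) <= 3:
        # Zero or one inner character: nothing to reorder.
        return field
    inner = list(field[1:-1])
    rng.shuffle(inner)
    return field[0] + "".join(inner) + field[-1]


def scramble_line(line, rng=None):
    """Scramble every whitespace-separated field of a line."""
    rng = rng or random.Random()
    return " ".join(scramble_field(f, rng) for f in line.split())


if __name__ == "__main__":
    # A seeded generator makes the shuffle repeatable between runs.
    print(scramble_line("scramble every field here", random.Random(0)))
```

Passing a seeded `random.Random` is a design choice for reproducible output; omitting it gives a different scramble on each run.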
In a file, I have to grep for a particular word, cut 8 characters of that word, and replace the last characters with spaces if they are "_1", e.g. HP4350_1. I did grep | cut -c 2-9, but didn't know how to truncate the last two characters if they are '_1'. I used tr '[_1]' '[ ]', but it replaced every underscore and every 1, instead of only '_1' together.
I'm trying to make a webpage that will display the bash variables I have in a file. These variables are used in a bash script that is run on my server. The file looks like this:
I just started using Eclipse. I followed all the instructions to set up C++ and run a simple hello world program. However, I seem to have hit a snag. When I build the solution I get an error. I realized that where there should be a > there is a | instead. Every time I type >, a | prints instead, and I have no idea how to fix this.
How can I filter ASCII quotes ( ' ) and double quotes ( " ) so that I can replace them with the UTF-8 equivalent? If I copy text from a Word document (ASCII) and upload it to a web page with PHP, the database (UTF-8) replaces these characters with incorrect character(s). I need some function that will replace these characters, but I don't know how to differentiate the ASCII quotes from the UTF-8 quotes without (somehow) converting the string to hex and then preg_replace'ing the hex code for the symbol.
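One possible reading of the problem above is that the quotes pasted from Word are not ASCII at all but Windows-1252 "smart quotes" (bytes 0x91 to 0x94), which a UTF-8 database cannot interpret as-is. That is an assumption, not something the post confirms. Under it, a minimal Python sketch of the idea (the post itself uses PHP; the function names here are hypothetical) would either re-encode the text or fold the curly quotes back to plain ASCII ones:

```python
# Assumption: the uploaded bytes are Windows-1252, as text pasted from
# Word often is. Plain ASCII quotes (0x27, 0x22) are identical in UTF-8.


def cp1252_to_utf8(raw: bytes) -> bytes:
    """Decode Windows-1252 bytes and re-encode them as UTF-8."""
    return raw.decode("cp1252").encode("utf-8")


# Map the four curly quote characters to their straight ASCII forms.
SMART_TO_ASCII = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ASCII quotes."""
    return text.translate(SMART_TO_ASCII)


if __name__ == "__main__":
    raw = b"\x93It\x92s fine\x94"
    print(cp1252_to_utf8(raw).decode("utf-8"))
    print(straighten_quotes(raw.decode("cp1252")))
```

Neither function needs a hex conversion: decoding with the right source encoding is what distinguishes the curly quotes from the ASCII ones.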
For example, I have a file called "file" like this one:
type=strongsubj len=1 word=absolve pos=verb stemmed=y priorpolarity=positive
type=strongsubj len=1 word=unique pos=adj stemmed=n priorpolarity=neutral
type=strongsubj len=1 word=absolutely pos=adj stemmed=n priorpolarity=neutral
type=weaksubj len=1 word=taking pos=verb stemmed=y priorpolarity=positive
type=weaksubj len=1 word=friend pos=noun stemmed=n priorpolarity=positive
type=weaksubj len=1 word=usually pos=adverb stemmed=n priorpolarity=positive
type=strongsubj len=1 word=purecolor pos=anypos stemmed=n priorpolarity=negative
type=strongsubj len=1 word=accusingly pos=anypos stemmed=n priorpolarity=negative
I want to add the plural for each noun. For example, if it finds this line:
type=weaksubj len=1 word=friend pos=noun stemmed=n priorpolarity=positive
it will add one more line:
type=weaksubj len=1 word=friends pos=noun stemmed=n priorpolarity=positive
where "s" is added to the word "friend". I tried to do it like this:
<code>
cat file | while read LINE ; do
set -- ${line}
if [[ "${4#pos1=}" == "noun" ]];then
#I tried this line but it doesn't work properly:
v3==$(echo $line |sed 's/$3/$s')
#I want to find the third word "word=friend" in that line and add "s" after that word
# I don't know what command to add this new line "$v3" to the file ???
done
</code>
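The transformation asked for above can be sketched as follows. This is a minimal Python sketch rather than a fix of the shell loop; the function name `add_plurals` is hypothetical, and it assumes that appending "s" is an acceptable plural for every noun and that fields are separated by single spaces.

```python
def add_plurals(lines):
    """Yield each record, followed by a pluralised copy when pos=noun."""
    for line in lines:
        line = line.rstrip("\n")
        yield line
        fields = line.split()
        if "pos=noun" not in fields:
            continue
        # Naive pluralisation (assumption): append "s" to the word= field.
        plural = [f + "s" if f.startswith("word=") else f for f in fields]
        yield " ".join(plural)


if __name__ == "__main__":
    sample = [
        "type=weaksubj len=1 word=friend pos=noun stemmed=n priorpolarity=positive",
        "type=weaksubj len=1 word=taking pos=verb stemmed=y priorpolarity=positive",
    ]
    for record in add_plurals(sample):
        print(record)
```

Matching on the `pos=` and `word=` prefixes rather than on field positions avoids depending on the order of the fields.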
I have a web application on a Linux server, and all my Java code is there. Whenever a user enters non-ASCII characters (e.g. ∞) in a text field in my web application and I check the log of my Java code on the Linux server, it shows weird characters.
Suppose the user enters ∞ in the text field. I should get ∞ in my log too. However, I get weird characters instead.
I want to print all ASCII characters, kind of like a table, but I really don't have an idea of how to do it. I don't know if there is a built-in method or something to accomplish this; if not
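A table like the one described can be built with a short loop. The post does not name a language, so the following is a minimal Python sketch; the function name `ascii_table` is hypothetical, and it assumes only the printable range (codes 32 to 126) is wanted, since control characters have no visible glyph.

```python
def ascii_table(columns=8):
    """Return printable ASCII as rows of 'code char' cells."""
    # Each cell is the decimal code, right-aligned, then the character.
    cells = [f"{code:3d} {chr(code)}" for code in range(32, 127)]
    rows = [
        "   ".join(cells[i:i + columns])
        for i in range(0, len(cells), columns)
    ]
    return "\n".join(rows)


if __name__ == "__main__":
    print(ascii_table())
```

Changing `columns` adjusts the width of the table; the last row is simply shorter when the 95 printable characters do not divide evenly.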
I am doing a MySQL query with a bash shell script like:
mysql translator -u root --password=******** -e "SELECT word FROM tagalog ORDER BY RAND() LIMIT 1" | while read line; do echo $line
So when I echo the value of $line I get:
word
magandang umaga
"word" is the name of the column in the table and "magandang umaga" is a randomly selected entry from that column. Is there a way I can remove the column name from the variable $line, so that echo $line outputs only the randomly selected entry, e.g. magandang umaga?