emoji problems

    #16
    Hmm. Yeah, and on the website I posted emojis to, I still don't see why it renders normally when I open the site with Firefox, but Chromium shows the emojis as empty boxes. Chromium versus Firefox? It makes me wonder how other people actually see all the emojis posted on all the websites! Internet searches simply suggest that if you need an emoji when posting at a website, you can just copy-paste one from any emoji website. Now, I've been doing that using Firefox, and the result looks OK to me; but when I view my post (at that website) using Chromium, I do not see my emojis rendered right, nor other people's. Both mine and other people's render OK when I use Firefox. Lots of issues here, and I'm not sure how they might be related. Thanks for your input on the above. Really makes you wonder: what IS real when I use an emoji? According to the Wikipedia article on emoji, I would not be the only human posing that question.
    An intellectual says a simple thing in a hard way. An artist says a hard thing in a simple way. Charles Bukowski

    Comment


      #17
      Something maybe obvious just occurred to me. You can post an emoji from ANY OS/browser, and if you did, in fact, copy-paste the correct emoji code, then even though you might not be able to see it correctly on the resulting post, other people can see it, provided they are on a more emoji-capable system. If I'm making any sense here. E.g., I can't seem to post a basic red heart--it shows up as a black heart. But others may still see it correctly as a red heart on their system. Have I been working too hard at this? Maybe. (I don't even want to get into how I got hooked on all this, which, btw, is a mainstream legit site, a dance site, one of many such, pretty friendly community and fully dressed, more or less, dancers ...) That's a nice thing about being 71. You can do damn well whatever you want to do, and it won't make any difference to anyone ...

      Anyone else master emojis on their 18.04? Do I have to upgrade to 20.04 just to do a little harmless flirting?
      An intellectual says a simple thing in a hard way. An artist says a hard thing in a simple way. Charles Bukowski

      Comment


        #18
        Wrt the red and black hearts: the Unicode code point is U+2764 HEAVY BLACK HEART, but most of the fonts on my 20.04 (not all) render it as a red heart. DejaVu Sans Mono Book shows it as white on black.

        I suspect this is because there are code points for blue, green, yellow, purple and orange hearts, but no red one.
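        A sketch of what may be going on here, assuming Python 3: U+2764 by itself is a text-style character, and many renderers only pick the colour emoji glyph when it is followed by U+FE0F VARIATION SELECTOR-16. Which glyph you actually get still depends on the installed fonts.

```python
# U+2764 alone is the text-style HEAVY BLACK HEART; appending
# U+FE0F (VARIATION SELECTOR-16) asks the renderer for the
# colour emoji glyph instead.
text_heart = "\u2764"
emoji_heart = "\u2764\ufe0f"

print(len(text_heart))   # 1 code point
print(len(emoji_heart))  # 2 code points, ideally one red glyph
print(emoji_heart.encode("utf-8"))
```

        So a black heart on one system and a red one on another can be the very same code point, just drawn by different fonts.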
        Regards, John Little

        Comment


          #19
          This page uses the charset iso-8859-1. That's the standard charset for HTML 4, which is pretty old. Emoticons are part of utf-8, a completely different charset.
          That means the browser, or some script, or some other magician has to do some magic to show any emoticon at all, because in iso-8859-1 emoticons simply don't exist.
          So it's very well possible everybody sees something different.
          Not all emoticons are part of a font, and not all emoticons are strictly defined in utf-8. So it's very well possible Apple shows a (little bit) different emoticon than Windows does.
          Not all browsers are able to show all colours of every emoticon.
          If an emoticon (or any character) is missing in the font in use, there's some kind of translation table in the operating system to substitute the character. Of course that's not the same everywhere.
          Long story short: it's a complete mess. That's the reason I don't use emoticons.
          In e-mail it's even far worse.
          If any of these mechanisms doesn't work, you'll see a box or something like that. Or even two boxes.
          So I'm afraid that if you want to use emoticons, you'll keep being surprised by the rendering for the next 20 years.
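          A quick sketch of the iso-8859-1 limitation, assuming Python 3: the old single-byte charset simply has no number for an emoji, so encoding one fails outright, while utf-8 handles it in four bytes.

```python
# iso-8859-1 can only represent 256 characters, so an emoji
# has no possible encoding in it at all.
wink = "\U0001F609"  # WINKING FACE

try:
    wink.encode("iso-8859-1")
except UnicodeEncodeError:
    print("iso-8859-1 cannot represent this character")

print(len(wink.encode("utf-8")))  # 4 bytes in utf-8
```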

          (I wrote this out of my head. It's a really complicated topic, so I may have been not completely correct about some details.)

          Addition: more and more emoticons are appearing with skin colours, sexual meanings, etc. I wouldn't be surprised if some countries simply forbid the use of these emoticons. I don't see Iran, for example, allowing an emoticon depicting a homosexual relationship.

          Addition 2: On this page https://www.fileformat.info/info/emoji/browsertest.htm you can see a lot of emoticons: how they should render (an image), and three different encodings, one of them utf-8. I had a look in Firefox and Chrome on Kubuntu. In both, a lot of emoticons are simply missing, or show as one or two boxes. Or are not coloured. Or look different. Or...
          It's a pretty good page for showing the problem, because they put an image of how it should look next to the actual rendering of the encoded character.
          And even if the browser shows it as it should on that page, which uses the utf-8 charset (the most used charset today), that doesn't mean it shows well in, for example, this forum, which uses the charset iso-8859-1.
          Last edited by Goeroeboeroe; Jun 10, 2020, 08:17 AM.

          Comment


            #20
            Thanks, jlittle. And thanks Goeroeboeroe--you are confirming my thoughts (and worst feelings) about all this. I'm new to this, but I'm getting the impression that a distinction is made between an emoticon, like this: ;-) and an emoji, which is graphical; I hope this looks right: 😉 I think Don B. Cilly in Post #4 was suggesting an emoticon. Emoticons may be safer than the graphical emoji.
            The Wikipedia article is good on all this and hits the points you've made,
            https://en.wikipedia.org/wiki/Emoji

            Yeah, Goeroeboeroe, I'm sort of new to social media (participating in a YouTube community), but I'm getting real punchy (adverse) about using these emojis -- too easy to convey either no message or the wrong message (like between a black heart and a red heart).

            Sort of related: Google translators are great, imo. Or ... they can be. But, again, especially with Asian languages, you sure have to be careful going from English to, say, Korean, or Japanese. Too many idiomatic nuances.

            Best to be straight and honest! Use your language, don't send pictures of emotions!
            ;-)

            Thanks. Anyone else care to drop anything here, please do. Kind of an interesting topic, actually. Trivial, but interesting.
            An intellectual says a simple thing in a hard way. An artist says a hard thing in a simple way. Charles Bukowski

            Comment


              #21
              Goeroeboeroe, that test page is neat, useful. Had the thought that sometimes you can get around some issues. The red heart shows black for me (Firefox, 18.04). But the beating heart, or two hearts etc. shows red. However, that is only on the sender's browser -- you can't really know how the recipient is seeing it.
              An intellectual says a simple thing in a hard way. An artist says a hard thing in a simple way. Charles Bukowski

              Comment


                #22
                Originally posted by Qqmike View Post
                Goeroeboeroe, that test page is neat, useful. Had the thought that sometimes you can get around some issues. The red heart shows black for me (Firefox, 18.04). But the beating heart, or two hearts etc. shows red. However, that is only on the sender's browser -- you can't really know how the recipient is seeing it.
                Yes, that's exactly the reason I don't use emoticons. If (IF) the recipient is using the exact same version of the operating system with the exact same software with the exact same settings (not forcing a different charset in the browser, etc.), only then can you be sure it shows the same. Luckily, as far as I know, a friendly emoticon has never been changed into a declaration of war...

                Hmm, warning: this got a bit longer than I planned. If you're not interested in fonts and emoticons, I suggest you drink a nice beer instead of reading this. With or without alcohol.

                The first charset had only 128 characters: the western alphabet and some more. Then you got (oh, what luxury!) a charset with 256 characters. For reasons I never understood, about 60 of those extra characters were used for symbols to draw boxes.
                Of course, with 256 characters (actually about 150, without the box symbols, newline symbol, etc.) it's a bit difficult to write in Russian or Greek. Did somebody say Chinese?
                So more charsets with 256 characters came along, based on different languages. If somebody wrote something in Russian and I read it with a western charset, it was utter rubbish. And, of course, Microsoft made its own charsets, just a little bit different from the standardized iso charsets.
                Only the first 128 characters have always been exactly the same in every charset. Probably thanks to IBM.

                It used to be that characters with diacritics (like é) also quite often showed as a box, a question mark, or some other symbol of desperation, because of the differences between these charsets: especially between the standardized ones Apple and others used, and the slightly different ones Microsoft used.

                In a charset every character has a number. You type a character, the editor translates it into a number, and the computer stores that number. Reading is the same process in the other order. Different charsets assign different characters to the same number, except for the first 128.
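                A minimal sketch of that mismatch, assuming Python 3: the very same stored byte decodes to different characters under different charsets.

```python
# One stored byte, two charsets, two different characters.
b = bytes([0xE9])

print(b.decode("iso-8859-1"))  # é (Latin small e with acute)
print(b.decode("iso-8859-5"))  # щ (a Cyrillic letter)
```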

                Then utf came along. On websites you always use utf-8 these days, and (almost) every editor uses it too. utf-8 has enough space for ALL the languages of the world and a lot more, like dominoes, runes, etc. And emoticons.
                The old charsets used one byte for every character (one byte can hold 256 values). utf-8 has over a million 'numbers'. So a naive encoding would need four bytes for every character, which would make every website, document, ... four times as big. Every character would use four bytes instead of just one. Oops.
                They did something very clever: the first 128 numbers (codepoints) in utf-8 take only one byte.

                So almost every existing website didn't get bigger at all. More than one byte is used only where necessary, like for Chinese.
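                A sketch of that variable-length scheme, assuming Python 3: ASCII stays at one byte, and other scripts take two to four.

```python
# utf-8 is variable length: ASCII is 1 byte, most European
# letters 2, most CJK characters 3, and emoji 4.
for ch in ("A", "\u00e9", "\u4e2d", "\U0001F609"):  # A, é, 中, 😉
    print(f"U+{ord(ch):04X} -> {len(ch.encode('utf-8'))} byte(s)")
```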

                With emoticons you still see the remnants of all of this.
                Old fonts were literally drawn dot by dot. That's the reason you had a different font for every font size.
                Modern fonts are not drawn. They are mathematical formulas that draw a character live. You only need one font for every font size, because the browser can simply adjust the formula if the font size changes.

                The newest fonts, variable fonts, even have italics, small capitals, weight, ... built in: you only need one font for every style. And you can specify exactly not only italic, but how italic. Etc.

                When fonts were actually drawings, you got the first emoticon fonts. Those were drawings too: drawn pixel by pixel. If the receiver hadn't installed that font, the emoticon didn't show.
                A different method was embedding a little image. That always worked, but you had to make an extra call to the server. And in those days that sometimes took a looooooong time.
                Another approach: including it with 'base64' inside the document. That's a kind of 'language' for describing an image, among other things, as text. But - of course - not every browser was able to 'read' base64.

                utf-8 changed all this. Now there were enough codepoints for everything you could ever dream of. And if over a million is not enough, it's possible to expand it.
                A lot of emoticons are not just one 'drawing', but two or more codepoints combined. That's the reason you sometimes see two (or more) boxes: the first codepoint, then a special code that tells the browser (editor, ...) that the next codepoint should be combined with the former one into a single glyph. Or three codepoints combined.
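                A sketch of such a multi-codepoint emoticon, assuming Python 3: the 'family' emoji is several codepoints glued together with U+200D ZERO WIDTH JOINER, and a renderer that doesn't know the sequence falls back to showing the pieces (or boxes) separately.

```python
# man + ZWJ + woman + ZWJ + boy: ideally rendered as ONE family glyph.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F466"

print(len(family))  # 5 code points behind what looks like one emoji
for cp in family:
    print(f"U+{ord(cp):04X}")
```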

                But you still need to have a font installed to render the emoticon. If the font you use doesn't have that emoticon (and of course most text fonts don't have a lot of emoticons), the browser/operating system still has to look in other fonts, using the translation table.
                In a few years probably every browser/operating system will have all the emoticons built in, so they can be rendered even if the font in use doesn't have them.
                But I think this will never work completely without problems. There will always be people building editors/browsers/... who refuse to support some emoticons with a more explicit meaning than a smile or something like that.

                Modern emoticons are not drawings either. Just like normal characters, they are mathematical formulas. Because of that, emoticons can be resized without any problem, just like normal text.

                But personally I'm quite happy with how it works now. I'm old enough to remember - waking up screaming - the 'good old days' when you had to write the strangest sentences to avoid using diacritics, because an Apple user couldn't read them. Or vice versa.
                Last edited by Goeroeboeroe; Jun 10, 2020, 09:58 AM. Reason: to honor the tradition of that $))^$% typos

                Comment


                  #23
                  Now that was interesting, and informative, and I didn't even need any beer, or wine (kept in the fridge until this evening :-) )

                  A lot of emoticons are not just one 'drawing', but two or more. That's the reason you sometimes see two (or more) boxes. The first codepoint, a special code to tell the browser (editor, ...) that the next codepoint should be placed OVER the former codepoint. Or three codepoints over each other.
                  Ahhh! That happened recently, and I couldn't figure out how one emoji generated two "places" in the text. I thought I had accidentally overwritten one with the other!

                  Thank you for taking time to write this out, Goeroeboeroe.
                  An intellectual says a simple thing in a hard way. An artist says a hard thing in a simple way. Charles Bukowski

                  Comment


                    #24
                    Originally posted by Goeroeboeroe View Post
                    ...interesting explanation...
                    Then utf came along.
                    You conflate UTF, and UTF-8, with Unicode, aka ISO 10646. It's Unicode you're talking about, mostly; UTF-8 is an encoding, and there are lots of others. Windows seems to be mostly settling on little-endian UTF-16, and ICU4C, "International Components for Unicode for C", the only sane way to handle Unicode properly, uses UTF-16 internally. UCS-2, a strictly 2-byte encoding, has been subsumed by UTF-16. UCS-4, aka UTF-32, is the wchar_t ("wide character type") of gcc. The PRC uses GB 18030. There's even WTF-8, I kid you not.

                    (UTF-8 is beloved of C programmers because it never has a null byte, which C uses to terminate strings, and is ASCII compatible. Lots of C code, like much of Linux, can run blissfully unaware of the encoding. Thanks to Linux's domination of internet servers, the internet and the WWW mostly use UTF-8. UTF-8 has lots of weaknesses and gotchas, and can be inefficient for scripts that don't mostly use Latin letters.)
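                    A sketch of those two properties, assuming Python 3: UTF-8 output never contains a 0x00 byte (except for U+0000 itself), and pure ASCII text encodes to exactly the same bytes in both.

```python
# 1) No stray null bytes: every non-NUL character's utf-8 encoding
#    avoids 0x00, so C-style string handling keeps working.
sample = "h\u00e9llo \U0001F609"  # héllo 😉
assert 0 not in sample.encode("utf-8")

# 2) ASCII compatibility: ASCII text is byte-identical in both encodings.
assert "hello".encode("ascii") == "hello".encode("utf-8")
print("both properties hold")
```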
                    Regards, John Little

                    Comment


                      #25
                      I thought my response was long enough without going into all the details of utf, the body that decides what to put into it, the internal workings of JavaScript with regard to utf, etc., etc.
                      utf stands for Unicode Transformation Format, defined by the Unicode Standard. When building sites, working with editors, etc., every program uses utf-8 (except things like the internals of JavaScript, for example). So I think it's really no problem if I simply call utf-8 Unicode. I've really never ever seen another encoding of Unicode than utf-8 in practice.
                      The internet mostly using utf-8 has nothing to do with Linux, servers, or whatever. It's simply the best encoding for websites. Using, for example, utf-16 would mean every character uses two bytes instead of one, making every text twice as big. There's simply no use for another encoding; utf-8 is the most efficient.

                      (And I'm no expert at all. Since I have only been building websites for years, I've only met utf-8. Whatever I might have known about utf-16, ..., left my head ages ago. Because when talking about emoticons etc., only utf-8 is important. How Windows internally handles text is of no importance for websites.)

                      Slight correction: it should be something like 'using two bytes for the most used characters in western languages'. A lot of characters use two (or more) bytes in utf-8 too, like Chinese.
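                      A sketch of that trade-off, assuming Python 3: for ASCII-heavy text utf-16 doubles the size, but for a non-Latin script the two come out the same.

```python
# Compare encoded sizes (utf-16-le avoids the 2-byte BOM
# that plain utf-16 prepends).
ascii_text = "plain ascii text"
greek_text = "\u03b1\u03b2\u03b3"  # αβγ

print(len(ascii_text.encode("utf-8")), len(ascii_text.encode("utf-16-le")))  # 16 vs 32
print(len(greek_text.encode("utf-8")), len(greek_text.encode("utf-16-le")))  # 6 vs 6
```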
                      Last edited by Goeroeboeroe; Jun 11, 2020, 08:34 AM.

                      Comment
