Inserting Special Characters Into Forum Posts

Black Hole

May contain traces of nut
Robobunny said:
Hello Black Hole, in the "media mistakes" you posted "38m³" (I just copied and pasted from it). I originally wrote "38m^3" as I couldn't find the superscript option (I even tried copy/paste from MS Word but it pastes plain text). I expect I'm missing the obvious, but where is it?
If you typed "3" into Word with superscript style, the character is still just "3" even though it displays as a superscript 3. All that is happening is the character is displayed in a smaller font and displaced vertically, and copy&pasting that into a non-Word-styles-aware program removes the styling. There are however the proper superscript numerals (and many other special characters) available as characters in their own right (but with reservations – see later).
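A minimal Python sketch of that difference, using only the standard library. The variable names are just for illustration; the point is that the styled "3" and the true superscript are two different characters with different code points.

```python
import unicodedata

styled_three = "3"            # what Word stores for a "3" shown in superscript style
real_superscript = "\u00b3"   # the superscript three as a character in its own right

# Print code point and official name only (ASCII output, safe on any console).
for ch in (styled_three, real_superscript):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Different characters, so a plain-text copy&paste keeps them distinct.
assert styled_three != real_superscript
```

The first line printed names DIGIT THREE and the second SUPERSCRIPT THREE, which is why the pasted Word text loses its superscript but the dedicated character does not.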

In my case, the answer is I mostly use an iPad for general web browsing and the forum, and there is a useful app for accessing "unusual" characters: Character Pad, which also seems to be available for Android.

The equivalent in Windows is the Character Map applet (type that into the Windows search), or in many text programs (including Word) there is a Special Character tool on (usually) the Insert menu. These display most of* the characters available in any particular font and insert the ones you select into your text... but it can be quite hard to find the character/glyph you want. No doubt similar tools are available for Linux, if not already included in your distro (there is one in Mint).

* For some reason, the Win7 version of Character Map does not provide access to code points 0-32 (0-20 hex) or 127-159 (7F-9F hex), even though there are characters assigned to those positions in the relevant code pages.

There are a number of other options in the iOS/iPadOS world: In Settings >> General >> Keyboards >> Text Replacement, I have (for example) defined the typed string "omega" as generating Ω. Also, holding an on-screen keypad button (not a bluetooth keyboard key) pops up a list of alternative characters – try ' (apostrophe) for example. Ditto Android.

In Windows and Linux there are direct entry methods for typing a code to insert any specific character, but there are a variety of "ifs and buts" which make the whole thing quite complex with no one-size-fits-all solution. If you want to avoid any of that complexity and simply copy&paste "special" characters, skip to the tables below NOW.

The tables include codes for direct entry, with the following notes and reservations:
  1. Historically, direct entry in Windows is restricted to "Alt Codes", and these only work through the currently-active code pages (without a leading zero for the DOS code page, or with a leading zero for the Windows code page – there is a difference!). The alt codes in the table must be entered as shown (including any leading zero), and are only valid for system code pages 850 and 1252 (ie Western, appropriate to UK). The procedure is: press and hold the Alt key; enter the code on the numeric keypad (not the top row of the keyboard); release Alt. Numlock must be on to avoid random results!

  2. Historically, alt codes are only valid as decimal numbers in the range 1-255 (ie the extent of the code pages), and numbers greater than 255 wrap around modulo 256 (this applies to Win7). To provide access to supported characters outside the code page, later versions of Windows interpret numbers greater than 255 as Unicode, and even provide a hack for hex entry. Support may vary. For details see this useful Wikipedia article and/or this other Wikipedia article.

  3. Linux only supports Unicode direct entry (which I find a pain because I have a selection of alt codes memorised, but you can see why the Linux world wouldn't want to go down the code pages rabbit hole!). The procedure is (Linux Mint): press and hold Shift+Ctrl; type "u" and then the required Unicode (in hex); release Shift+Ctrl.

  4. See the Unicode website for an exhaustive listing of characters available, but you might get on better with a downloadable summary listing in plain text here.

Be aware that characters entered by alt codes, unless they've been internally translated into Unicode, won't necessarily display as the same character on a system using a different code page.
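
The notes above can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the function names are made up for illustration, the code pages are 1252 (leading zero) and 850 (no leading zero) as in note 1, and the modulo-256 wrap-around is the historical behaviour from note 2. Python's cp850 codec maps codes below 32 to control characters, so it will not reproduce the arrow glyphs some systems give for alt codes 24-27.

```python
def alt_code_to_char(code: str) -> str:
    """Approximate historical Alt-code entry on a Western (UK) system.

    Assumptions: code page 1252 when the code is typed with a leading zero,
    code page 850 without, and wrap-around modulo 256.
    """
    codepage = "cp1252" if code.startswith("0") else "cp850"
    value = int(code) % 256
    # errors="replace" covers the few byte values cp1252 leaves unassigned.
    return bytes([value]).decode(codepage, errors="replace")


def unicode_hex_to_char(hex_code: str) -> str:
    """The character a Unicode hex entry (e.g. Shift+Ctrl+u, then hex) refers to."""
    return chr(int(hex_code, 16))


assert alt_code_to_char("0189") == "½"   # leading zero: Windows code page
assert alt_code_to_char("171") == "½"    # no leading zero: DOS code page
assert alt_code_to_char("0171") == "«"   # same number, other code page, other character
assert alt_code_to_char("8304") == "p"   # 8304 % 256 == 112, not superscript 0
assert unicode_hex_to_char("2070") == "⁰"
```

The third and fourth assertions show the two traps: the leading zero changes which table the number is looked up in, and a Unicode decimal typed as an old-style alt code lands on something unrelated.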

Be aware that the more specialised the character is, the more likely it is some people's systems are not set up to display it the way you see it on your system, and support may depend on a particular installed font.


Neither of the above should apply to content on the Web, because characters get converted to HTML representation.
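
One form such an HTML representation can take is a numeric character reference. The Python sketch below assumes that form for illustration; whether any particular forum software stores text this way is not something the post states.

```python
import html

text = "38m\u00b3"  # "38m" followed by the superscript three character

# Encode to pure ASCII, replacing anything else with a numeric character reference.
as_html = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(as_html)  # 38m&#179;

# A browser (or html.unescape here) turns the reference back into the character.
assert html.unescape(as_html) == text
assert html.unescape("&sup3;") == "\u00b3"  # the named form of the same character
```

Because the reference is plain ASCII, it survives whatever code page the sending or receiving system happens to use.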

Once upon a time, when digital electronic communication was in its infancy, there were various telegraphy systems, and for economy it was necessary to represent each character in the message in as few bits as possible. Ignoring encodings such as Morse, where common characters are assigned fewer bits than less common ones, 6 bits is plenty for the English alphabet plus numbers and some punctuation (and all of this was happening in the English-speaking nations). Eventually this expanded to 7-bit ASCII used for teleprinters.

Once computers standardised on an 8-bit byte as the unit of data storage, there was room for 256 characters and control codes in the mapping table. Computers were expanding into foreign language markets, with the need to support accented Latin characters or even non-Latin characters such as Cyrillic, so this was achieved by "code pages" which assigned different character maps according to the system locale, and a means to access them even without a dedicated keyboard button by entering a numeric code ("alt codes").

Thus the actual character obtained by pressing a keyboard key or entering its code would depend on the code page currently in effect, and also whether that code was supported by the installed font (be that on a graphics display or a text-driven VDU).
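
To make that concrete, here is a small Python sketch decoding one stored byte through three code pages. The choice of byte (decimal 230) and of code pages is arbitrary, picked only because Python ships codecs for all three.

```python
byte = bytes([0xE6])  # one stored value, decimal 230

# Decode the same byte under three different code pages.
interpretations = {cp: byte.decode(cp) for cp in ("cp850", "cp1252", "cp866")}

assert interpretations == {
    "cp850": "\u00b5",   # DOS Western European: micro sign
    "cp1252": "\u00e6",  # Windows Western European: ae ligature
    "cp866": "\u0446",   # DOS Cyrillic: letter tse
}
```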

Various attempts were made to expand the character encoding and disambiguate character mappings, but the one with the most traction now is Unicode. The aim is to represent every single character, punctuation mark, accent, glyph, symbol, emoticon... for any language each with a unique code (not necessarily restricted to 8 or even 16 bits).
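
A short Python sketch of the "not restricted to 8 or even 16 bits" point. UTF-8 and UTF-16 are not mentioned in the post; they are used here only as two common ways of storing Unicode code numbers, and the sample characters are arbitrary.

```python
# Letter A, superscript 3, superscript 4, and an emoji beyond the 16-bit range.
samples = ["A", "\u00b3", "\u2074", "\U0001F600"]

for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")
    print(f"U+{ord(ch):04X}: {len(utf8)} byte(s) in UTF-8, {len(utf16)} in UTF-16")

# The emoji's code number does not fit in 16 bits.
assert ord("\U0001F600") > 0xFFFF
```

The code number identifies the character; how many bytes it occupies depends on the storage encoding, which is why the same character can take one, two, three or four bytes.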

The implementation in all major modern operating systems is to map whatever internal encoding is used to the equivalent Unicode, and use the Unicode to access the required character in the font file.

Therefore, if the user has the means to input Unicode directly, they can enter any character in use worldwide... but that does not mean their system supports that character, or that it is included in the current font. More particularly, even if their system displays it, there is no certainty that somebody else viewing the same document will be able to see it. Missing characters may be displayed as a box with an X in it, or some other random character entirely.

The following tables are (what I regard as) the most useful special characters. It's only a subset – if there are obviously useful omissions let me know.

Superscript and Subscript Numerals

| Character | Interpretation | Alt Code | Unicode (hex) | Unicode (decimal) |
|---|---|---|---|---|
| ⁰ | Superscript 0 | | 2070 | 8304 |
| ¹ | Superscript 1 | 0185 or 251 | B9 | * |
| ² | Superscript 2 | 0178 or 253 | B2 | * |
| ³ | Superscript 3 | 0179 or 252 | B3 | * |
| ⁴ | Superscript 4 | | 2074 | 8308 |
| ⁵ | Superscript 5 | | 2075 | 8309 |
| ⁶ | Superscript 6 | | 2076 | 8310 |
| ⁷ | Superscript 7 | | 2077 | 8311 |
| ⁸ | Superscript 8 | | 2078 | 8312 |
| ⁹ | Superscript 9 | | 2079 | 8313 |
| ₀ | Subscript 0 | | 2080 | 8320 |
| ₁ | Subscript 1 | | 2081 | 8321 |
| ₂ | Subscript 2 | | 2082 | 8322 |
| ₃ | Subscript 3 | | 2083 | 8323 |
| ₄ | Subscript 4 | | 2084 | 8324 |
| ₅ | Subscript 5 | | 2085 | 8325 |
| ₆ | Subscript 6 | | 2086 | 8326 |
| ₇ | Subscript 7 | | 2087 | 8327 |
| ₈ | Subscript 8 | | 2088 | 8328 |
| ₉ | Subscript 9 | | 2089 | 8329 |

Fractions

| Character | Interpretation | Alt Code | Unicode (hex) | Unicode (decimal) |
|---|---|---|---|---|
| ½ | One half | 0189 or 171 | BD | * |
| ⅓ | One third | | 2153 | 8531 |
| ⅔ | Two thirds | | 2154 | 8532 |
| ¼ | One quarter | 0188 or 172 | BC | * |
| ¾ | Three quarters | 0190 or 243 | BE | * |
| ⅕ | One fifth | | 2155 | 8533 |
| ⅖ | Two fifths | | 2156 | 8534 |
| ⅗ | Three fifths | | 2157 | 8535 |
| ⅘ | Four fifths | | 2158 | 8536 |
| ⅙ | One sixth | | 2159 | 8537 |
| ⅚ | Five sixths | | 215A | 8538 |
| ⅛ | One eighth | | 215B | 8539 |
| ⅜ | Three eighths | | 215C | 8540 |
| ⅝ | Five eighths | | 215D | 8541 |
| ⅞ | Seven eighths | | 215E | 8542 |

Miscellaneous Mathematical & Engineering Symbols

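| Character | Interpretation | Alt Code | Unicode (hex) | Unicode (decimal) |
|---|---|---|---|---|
| ± | Plus or minus | 0177 or 241 | B1 | * |
| × | Multiply | 0215 or 158 | D7 | * |
| · | Dot product | 0183 or 250 | B7 | * |
| ÷ | Divide | 0247 or 246 | F7 | * |
| ≠ | Not equal | | 2260 | 8800 |
| ≈ | Roughly equal | | 2248 | 8776 |
| ≤ | Less than or equal | | 2264 | 8804 |
| ≥ | Greater than or equal | | 2265 | 8805 |
| ≡ | Identical | | 2261 | 8801 |
| ∴ | Therefore | | 2234 | 8756 |
| … | Ellipsis | | 2026 | 8230 |
| ∞ | Infinity | | 221E | 8734 |
| √ | Root | | 221A | 8730 |
| ∫ | Integrate | | 222B | 8747 |
| ∑ | Sum of series | | 2211 | 8721 |
| ∆ | Difference | | 2206 | 8710 |
| ∏ | Product of series | | 220F | 8719 |
| ° | Degrees | 0176 or 248 | B0 | * |
| ′ | Minutes / Feet | | 2032 | 8242 |
| ″ | Seconds / Inches | | 2033 | 8243 |
| Ω | Ohms (omega) | | 3A9 | 937 |
| μ | Micro (mu) | 0181 or 230 | 3BC | 956 |
| π | Pi | | 3C0 | 960 |
| ✓ | Tick | | 2713 | 10003 |
| ✘ | Cross | | 2718 | 10008 |
| ← | Left arrow | 27 | 2190 | 8592 |
| → | Right arrow | 26 | 2192 | 8594 |
| ↑ | Up arrow | 24 | 2191 | 8593 |
| ↓ | Down arrow | 25 | 2193 | 8595 |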

Typographical Characters

| Character | Interpretation | Alt Code | Unicode (hex) | Unicode (decimal) |
|---|---|---|---|---|
| ‘ | Open quote | 0145 | 2018 | 8216 |
| ’ | Close quote | 0146 | 2019 | 8217 |
| “ | Open speech mark | 0147 | 201C | 8220 |
| ” | Close speech mark | 0148 | 201D | 8221 |
| – | En dash | 0150 | 2013 | 8211 |
| — | Em dash | 0151 | 2014 | 8212 |

* Although clearly there is a decimal equivalent of the stated hex Unicode, it is of no use because the Alt entry method will interpret the number as to be translated via the respective code page and therefore not as Unicode.
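
For anyone wanting to check or extend the tables, the columns can be derived in Python along these lines. It is a sketch, not how the tables were actually produced: `table_row` is a made-up helper, and it assumes the same two code pages as the tables (1252 for the leading-zero alt code, 850 for the other).

```python
import unicodedata


def table_row(ch: str) -> tuple:
    """Return (official name, alt codes, Unicode hex, Unicode decimal) for one character."""
    alt_codes = []
    for codepage, leading_zero in (("cp1252", "0"), ("cp850", "")):
        try:
            alt_codes.append(f"{leading_zero}{ch.encode(codepage)[0]}")
        except UnicodeEncodeError:
            pass  # the character is not in this code page, so no alt code
    return unicodedata.name(ch), alt_codes, f"{ord(ch):X}", ord(ch)


# One half: present in both code pages, so it has two alt codes.
assert table_row("\u00bd") == ("VULGAR FRACTION ONE HALF", ["0189", "171"], "BD", 189)
# One third: in neither code page, so Unicode entry is the only route.
assert table_row("\u2153") == ("VULGAR FRACTION ONE THIRD", [], "2153", 8531)
```

An empty alt-code list corresponds to a blank Alt Code cell in the tables.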
 
Character pad for Android seems to only be available for older versions of Android. There is something called Unicode Pad - claims to contain ads.
Also, holding an on-screen keypad button (not a bluetooth keyboard key) pops up a list of alternative characters – try ' (apostrophe) for example.
Seems to work on an Android standard Gboard keyboard.
 