Inserting Special Characters Into Forum Posts

Black Hole · Jun 24, 2024

Robobunny said:
Hello Black Hole, in the "media mistakes" you posted "38m³" (I just copied and pasted from it). I originally wrote "38m^3" as I couldn't find the superscript option (I even tried copy/paste from MS Word but it pastes plain text). I expect I'm missing the obvious, but where is it?

If you typed "3" into Word with superscript style, the character is still just "3" even though it displays as a superscript 3. All that is happening is the character is displayed in a smaller font and displaced vertically, and copy&pasting that into a non-Word-styles-aware program removes the styling. There are however the proper superscript numerals (and many other special characters) available as characters in their own right (but with reservations – see later).
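The difference shows up if you look at the code points rather than the appearance. A minimal Python sketch (the variable names are only for illustration):

```python
import unicodedata

plain = "3"        # what Word stores; the superscript look is only styling
superscript = "³"  # a separate character in its own right

for ch in (plain, superscript):
    # ord() gives the code point, unicodedata.name() the official Unicode name
    print(f"{ch!r}: U+{ord(ch):04X} {unicodedata.name(ch)}")

# NFKC compatibility normalisation folds the superscript back to a plain digit,
# which is roughly what happens to the meaning when styling is lost
folded = unicodedata.normalize("NFKC", superscript)
print(folded == plain)
```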

In my case, the answer is I mostly use an iPad for general web browsing and the forum, and there is a useful app for accessing "unusual" characters: Character Pad, which also seems to be available for Android.

The equivalent in Windows is the Character Map applet (type that into the Windows search), or in many text programs (including Word) there is a Special Character tool on (usually) the Insert menu. These display most of* the characters available in any particular font and insert the ones you select into your text... but it can be quite hard to find the character/glyph you want. No doubt similar tools are available for Linux, if not already included in your distro (there is one in Mint).

* For some reason, the Win7 version of Character Map does not provide access to code points 0-32 (0-20 hex) or 127-159 (7F-9F hex), even though there are characters assigned to those positions in the relevant code pages.

There are a number of other options in the iOS/iPadOS world: In Settings >> General >> Keyboards >> Text Replacement, I have (for example) defined the typed string "omega" as generating Ω. Also, holding an on-screen keypad button (not a bluetooth keyboard key) pops up a list of alternative characters – try ' (apostrophe) for example. Ditto Android.

In Windows and Linux there are direct entry methods for typing a code to insert any specific character, but there are a variety of "ifs and buts" which make the whole thing quite complex with no one-size-fits-all solution. If you want to avoid any of that complexity and simply copy&paste "special" characters, skip to the tables below NOW.

The tables include codes for direct entry, with the following notes and reservations:

Historically, direct entry in Windows is restricted to "Alt Codes", and these only work through the currently-active code pages (without a leading zero for the DOS code page, or with a leading zero for the Windows code page – there is a difference!). The alt codes in the table must be entered as shown (including any leading zero), and are only valid for system code pages 850 and 1252 (ie Western, appropriate to UK). The procedure is: press and hold the Alt key; enter the code on the numeric keypad (not the top row of the keyboard); release Alt. Numlock must be on to avoid random results!
Historically, alt codes are only valid as decimal numbers in the range 1-255 (ie the extent of the code pages), and a number greater than 255 wraps around modulo 256 (this applies to Win7). To provide access to supported characters outside the code page, later versions of Windows interpret numbers greater than 255 as Unicode, and even provide a hack for hex entry. Support may vary. For details see this useful Wikipedia article and/or this other Wikipedia article.
Linux only supports Unicode direct entry (which I find a pain because I have a selection of alt codes memorised, but you can see why the Linux world wouldn't want to go down the code pages rabbit hole!). The procedure is (Linux Mint): press and hold Shift+Ctrl; type "u" and then the required Unicode (in hex); release Shift+Ctrl.
See the Unicode website for an exhaustive listing of characters available, but you might get on better with a downloadable summary listing in plain text here.
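The entry rules in the notes above can be modelled in a few lines of Python. This is only a sketch of the historic (Win7-style) behaviour, assuming code pages 850 and 1252 as stated. The function names alt_code and unicode_entry are made up for the example and are not part of any Windows or Linux API:

```python
def alt_code(typed: str, oem: str = "cp850", ansi: str = "cp1252") -> str:
    """Return the character a classic alt code would produce.

    Assumes the two code pages named in the text: 850 (DOS) and 1252 (Windows).
    """
    number = int(typed) % 256  # historic behaviour: wrap around modulo 256
    # A leading zero selects the Windows code page, otherwise the DOS one
    codec = ansi if typed.startswith("0") else oem
    return bytes([number]).decode(codec)


def unicode_entry(hex_digits: str) -> str:
    """Character for Linux-style Shift+Ctrl+u entry of a hex code point."""
    return chr(int(hex_digits, 16))


# Three routes to the same superscript 3
print(alt_code("0179"), alt_code("252"), unicode_entry("b3"))

# Drop the leading zero from 0178 and a different character comes out
print(alt_code("0178"), alt_code("178"))
```

The second print line shows why the leading zero matters: the same number is looked up in a different code page.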

Be aware that characters entered by alt codes, unless they've been internally translated into Unicode, won't necessarily display as the same character on a system using a different code page.

Be aware that the more specialised the character is, the more likely it is some people's systems are not set up to display it the way you see it on your system, and support may depend on a particular installed font.

Neither of the above should apply to content on the Web, because characters get converted to HTML representation.
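One form of "HTML representation" is the numeric character reference, which names a character by its Unicode number and so does not depend on any code page. Whether a given forum stores posts this way or simply serves Unicode text is an assumption here. The sketch only shows the round trip:

```python
import html

text = "38m³ ±½"

# Encode to pure ASCII, replacing everything else with &#NNN; references
as_ascii = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
print(as_ascii)

# A browser (or html.unescape) turns the references back into characters
restored = html.unescape(as_ascii)
print(restored == text)
```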

Once upon a time, when digital electronic communication was in its infancy, there were various telegraphy systems, and for economy it was necessary to represent each character in the message in as few bits as possible. Ignoring encodings such as Morse, where common characters are assigned fewer bits than less common ones, 6 bits is plenty for the English alphabet plus numbers and some punctuation (and all of this was happening in the English-speaking nations). Eventually this expanded to 7-bit ASCII used for teleprinters.

Once computers standardised on an 8-bit byte as the unit of data storage, there was room for 256 characters and control codes in the mapping table. Computers were expanding into foreign language markets, with the need to support accented Latin characters or even non-Latin characters such as Cyrillic, so this was achieved by "code pages" which assigned different character maps according to the system locale, and a means to access them even without a dedicated keyboard button by entering a numeric code ("alt codes").

Thus the actual character obtained by pressing a keyboard key or entering its code would depend on the code page currently in effect, and also whether that code was supported by the installed font (be that on a graphics display or a text-driven VDU).
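To illustrate, here is the same numeric code looked up in a handful of code pages, using the codecs that ship with Python. The choice of pages is arbitrary:

```python
code = 171  # the no-leading-zero alt code for one half on code page 850

pages = ("cp850", "cp437", "cp1252", "cp866")

# Decode the single byte under each code page in turn
seen = {page: bytes([code]).decode(page) for page in pages}

for page, ch in seen.items():
    print(f"{page}: {ch} (U+{ord(ch):04X})")
```

The two DOS Western pages agree on a half, while the Windows Western page and the DOS Cyrillic page each give something else.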

Various attempts were made to expand the character encoding and disambiguate character mappings, but the one with the most traction now is Unicode. The aim is to represent every single character, punctuation mark, accent, glyph, symbol, emoticon... for any language each with a unique code (not necessarily restricted to 8 or even 16 bits).
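The "not restricted to 8 or even 16 bits" point can be seen by measuring a few code points. UTF-8 is used below as one common way of storing Unicode; the text above does not name a particular encoding, so treat that choice as an assumption:

```python
samples = ("3", "\u00b3", "\u2713", "\U0001F600")  # 3, superscript 3, tick, emoji

# Number of bytes UTF-8 needs for each sample character
sizes = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch in samples:
    cp = ord(ch)
    print(f"U+{cp:04X} needs {cp.bit_length()} bits; UTF-8 uses {sizes[ch]} byte(s)")
```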

The implementation in all major modern operating systems is to map whatever internal encoding is used to the equivalent Unicode, and use the Unicode to access the required character in the font file.

Therefore, if the user has the means to input Unicode directly, they can enter any character in use worldwide... but that does not mean their system supports that character, or that it is included in the current font. More particularly, even if their system displays it, there is no certainty that somebody else viewing the same document will be able to see it. Missing characters may be displayed as a box with an X in it, or some other random character entirely.

The following tables are (what I regard as) the most useful special characters. It's only a subset – if there are obviously useful omissions let me know.

Superscript and Subscript Numerals

Character	Interpretation	Alt Code	Unicode (hex)	Unicode (decimal)
⁰︎	Superscript 0		2070	8304
¹	Superscript 1	0185 or 251	B9	*
²	Superscript 2	0178 or 253	B2	*
³	Superscript 3	0179 or 252	B3	*
⁴︎	Superscript 4		2074	8308
⁵︎	Superscript 5		2075	8309
⁶︎	Superscript 6		2076	8310
⁷︎	Superscript 7		2077	8311
⁸︎	Superscript 8		2078	8312
⁹︎	Superscript 9		2079	8313
₀︎	Subscript 0		2080	8320
₁︎	Subscript 1		2081	8321
₂︎	Subscript 2		2082	8322
₃︎	Subscript 3		2083	8323
₄︎	Subscript 4		2084	8324
₅︎	Subscript 5		2085	8325
₆︎	Subscript 6		2086	8326
₇︎	Subscript 7		2087	8327
₈︎	Subscript 8		2088	8328
₉︎	Subscript 9		2089	8329

Fractions

Character	Interpretation	Alt Code	Unicode (hex)	Unicode (decimal)
½	One half	0189 or 171	BD	*
⅓︎	One third		2153	8531
⅔︎	Two thirds		2154	8532
¼	One quarter	0188 or 172	BC	*
¾	Three quarters	0190 or 243	BE	*
⅕︎	One fifth		2155	8533
⅖︎	Two fifths		2156	8534
⅗︎	Three fifths		2157	8535
⅘︎	Four fifths		2158	8536
⅙︎	One sixth		2159	8537
⅚︎	Five sixths		215A	8538
⅛︎	One eighth		215B	8539
⅜︎	Three eighths		215C	8540
⅝︎	Five eighths		215D	8541
⅞︎	Seven eighths		215E	8542

Miscellaneous Mathematical & Engineering Symbols

Character	Interpretation	Alt Code	Unicode (hex)	Unicode (decimal)
±	Plus or minus	0177 or 241	B1	*
×	Multiply	0215 or 158	D7	*
·	Dot product	0183 or 250	B7	*
÷	Divide	0247 or 246	F7	*
≠︎	Not equal		2260	8800
≈︎	Roughly equal		2248	8776
≤︎	Less than or equal		2264	8804
≥︎	Greater than or equal		2265	8805
≡︎	Identical		2261	8801
∴︎	Therefore		2234	8756
…	Ellipsis		2026	8230
∞︎	Infinity		221E	8734
√︎	Root		221A	8730
∫︎	Integrate		222B	8747
∑	Sum of series		2211	8721
∆	Difference		2206	8710
∏	Product of series		220F	8719
°	Degrees	0176 or 248	B0	*
′	Minutes / Feet		2032	8242
″	Seconds / Inches		2033	8243
Ω	Ohms (omega)		3A9	937
μ	Micro (mu)	0181 or 230	3BC	956
π	Pi		3C0	960
✓︎	Tick		2713	10003
✘︎	Cross		2718	10008
←︎	Left arrow	27	2190	8592
→︎	Right arrow	26	2192	8594
↑	Up arrow	24	2191	8593
↓	Down arrow	25	2193	8595

Typographical Characters

Character	Interpretation	Alt Code	Unicode (hex)	Unicode (decimal)
‘	Open quote	0145	2018	8216
’	Close quote	0146	2019	8217
“	Open speech mark	0147	201C	8220
”	Close speech mark	0148	201D	8221
–	En dash	0150	2013	8211
—	Em dash	0151	2014	8212

* Although clearly there is a decimal equivalent of the stated hex Unicode, it is of no use because the Alt entry method will interpret the number as to be translated via the respective code page and therefore not as Unicode.
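The table columns can be cross-checked by going the other way, from a character to its codes. This is a rough sketch assuming code pages 1252 and 850, as the tables do. The helper name table_row is made up for the example, and it will not reproduce every row (the arrow alt codes 24-27, for instance, come from the control-code positions and are not covered):

```python
def table_row(ch: str) -> tuple[str, str, int]:
    """Return (alt codes, Unicode hex, Unicode decimal) for one character.

    Characters absent from both code pages get an empty alt-code entry.
    """
    codes = []
    # Leading zero for the Windows code page, none for the DOS code page
    for page, prefix in (("cp1252", "0"), ("cp850", "")):
        try:
            codes.append(f"{prefix}{ch.encode(page)[0]}")
        except UnicodeEncodeError:
            pass  # not in this code page, so no alt code via it
    return " or ".join(codes), f"{ord(ch):X}", ord(ch)


print(table_row("±"))
print(table_row("≠"))
```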

EEPhil · Jun 24, 2024

Character Pad for Android seems to be available only for older versions of Android. There is something called Unicode Pad - claims to contain ads.

Black Hole said:
Also, holding an on-screen keypad button (not a bluetooth keyboard key) pops up a list of alternative characters – try ' (apostrophe) for example.

Seems to work on an Android standard Gboard keyboard.

Black Hole · Jun 24, 2024

EEPhil said:
Seems to work on an Android standard Gboard keyboard.

Ditto on Microsoft SwiftKey Keyboard (Android).

EEPhil · Jun 25, 2024

EEPhil said:
There is something called Unicode Pad - claims to contain ads.

Another app is Unicode ( https://play.google.com/store/apps/details?id=vadiole.unicode&hl=en_US ). This one doesn't have ads.
