Talk:Control character

From Wikipedia, the free encyclopedia

Excessive duplication of the "C0 and C1 control codes" article

I've just moved a rather large chunk of ASCII-specific details out of the introduction and into the ASCII section, and I'm now wondering whether I should simply have deleted it as it's largely duplicating what's written in the C0 and C1 control codes article. I've also added an "about" header, but somehow I doubt this will be sufficient to prevent future accretion of ASCII-specific details, as by far the majority of people looking for "Control Code" or "Control Character" will be looking for the specifics of ASCII. Perhaps we should re-arrange the articles:

Martin Kealey (talk) 04:05, 9 August 2022 (UTC)

C0 is ASCII but C1 is not. I believe that these days 8859-1, 8859-15 and UTF-8 are much more common than ASCII. The proposal appears to violate WP:NPOV. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:45, 9 August 2022 (UTC)
Given that the term "control character" is used for character encodings other than ASCII and ASCII-derived encodings, and that pages for two of those encodings, namely Baudot code and EBCDIC, use the phrase and link it to this page, my inclination is not to have control character link to a page specific to ASCII and ASCII-derived encodings.
Baudot and Baudot-derived encodings might now be purely historical, but EBCDIC is still around, so I don't think mention of control characters in non-ASCII and non-ASCII-derived character encodings in a page with "historical" in its name would be appropriate.
My inclination would be to have this page talk about the general notion of control characters, with most of the stuff in the "In ASCII" section moved to the C0 part of C0 and C1 control codes (but with the one-section mention of EBCDIC moved somewhere else in the article, as that says nothing about control characters in ASCII).
Note that there's both ASCII § Control characters, with ASCII control characters as a redirect to that section, which has a lot of text, even though it also has a "main article" hatnote pointing to C0 and C1 control codes, and C0 and C1 control codes § C0 controls. The former section has 17464 characters in its source text, and the latter section has 30785 characters in its source text, so the former has more than half as many characters as the latter. Guy Harris (talk) 22:01, 9 August 2022 (UTC)
Fair points about "historical" not being appropriate for systems that are still operational, and that (of course) only the C0 half of the C0 & C1 article is actually about ASCII. I agree that the current "ASCII" section of this article is badly inflated and should be heavily pruned.
I'm trying to find a way to meet the expectations of the majority of people searching for "control characters" or "control codes" when they simply want practical answers about ASCII (and have no idea that other coding systems exist), while also having a neutral title for this article describing the general concept of control codes, and reducing the temptation for future editors to accrete further ASCII-specific details onto this article.
Until recently the "Control Codes" section of the ASCII article had a "main article" link to this article; it now links to the C0 & C1 article. Hopefully that will (somewhat) reduce the temptation to add more ASCII-specific information to this article.
Rather than renaming the C0 and C1 control codes article (which you rightly point out exceeds the scope of just ASCII), perhaps we could have ASCII control character and ASCII control code link to the C0 anchor within it, and then put links to one of those here? I know it's generally frowned on to link to a redirection (or disambiguation) page, but this seems like a case where it might actually make it more obvious what's going on.
Thoughts?
Martin Kealey (talk) 13:56, 12 August 2022 (UTC)
Yes, people using Wikipedia as a manual is an issue.
Your suggestion sounds good. Some may not like links to redirects, but I don't think it's generally frowned on. In fact, MOS:REDIR seems to favor linking to redirects in at least some cases, and WP:NOTBROKEN starts out saying "There is usually nothing wrong with linking to redirects to articles." and proceeds to tell people not to "fix" links to redirects by replacing them with piped links to the ultimate target.
It appears you've already redirected ASCII control character and created ASCII control code. The old target of ASCII control character, ASCII § ASCII control characters, has a lot of text; should that be moved to C0 and C1 control codes § C0 controls, so 0x00 through 0x1F, and 0x7F, have their characteristics and life story fully described in one place? Guy Harris (talk) 19:09, 12 August 2022 (UTC)

Merge control character with non-printing character?

Control characters are non-printable characters but is the reverse true? Historically control characters were the first 32 ASCII characters, back when teletypes were all upper case, IIRC. (Or did we just always lock the shift key down? I don't think so.) Then characters like DEL came along that could not be generated with the control key, not to mention modern unicode characters like Zero-Width Non-Joiner. (For the record I don't recall if the old teletypes all had a control key.) Although these new characters cannot be made by using the control key, they certainly control the receiving device. Certainly in common usage all the non-printing ASCII characters are control characters. I can't say about the newer unicode characters. --Kop 18:59, 29 Aug 2004 (UTC)

Teletype Corp's ASR series all used ASCII (IIRC), with or without an actual lower case, and so DEL was an original builtin ASR character. Many of these machines came with a paper tape reader/punch, so DEL could actually serve its original ordained meaning. As for non-printing ASCII characters being all controls, there are DEL (not in the control range) and SP (also not in the control range), both of which don't print. For that matter, the national usage characters might, in some national usage, not print either. These things, even simple old ASCII, are not so transparent in their meanings and relations and implications. ww 08:17, 10 November 2005 (UTC)
DEL was commonly Ctrl-/ , and it's shown in the article --A194 44 217 5 (talk) 19:33, 11 May 2010 (UTC)

missing example

This somehow duplicates stuff from ASCII code#ASCII control characters, but what I miss here is the "sucker^H^H^H^H^H^H customer" example. MFH: Talk 00:23, 25 May 2005 (UTC)
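
For readers unfamiliar with the joke, here is a rough sketch of what the example relies on. In caret notation, ^H stands for the character whose code is that of "H" with bit 0x40 flipped, i.e. backspace (0x08), and a display that honours backspace erases the preceding character. The function names `caret_to_char` and `apply_backspaces` are made up for this illustration, and real terminals handle backspace in more varied ways than this.

```python
def caret_to_char(letter: str) -> str:
    """Map the character after the caret to its control character.

    Caret notation flips bit 0x40, so '^H' is 0x08 and '^?' is 0x7F.
    """
    return chr(ord(letter) ^ 0x40)


def apply_backspaces(text: str) -> str:
    """Simulate a display on which each backspace erases one character."""
    shown = []
    for ch in text:
        if ch == "\b":
            if shown:          # nothing to erase at the start of the line
                shown.pop()
        else:
            shown.append(ch)
    return "".join(shown)


# Six backspaces erase the six letters of "sucker".
message = "sucker" + caret_to_char("H") * 6 + " customer"
print(repr(apply_backspaces(message)))
```

When the backspaces are not interpreted and are shown in caret notation instead, the reader sees both the "erased" word and its replacement, which is the point of the example.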

examples at beginning

I think the examples at the beginning (SYN, ) are not very representative; LF, CR, TAB, NUL would be much more common. B.t.w., NUL would deserve a further study. — MFH:Talk 14:20, 13 March 2006 (UTC)

BEL

Is its escape sequence \g or \a? If I remember correctly, despite the key combination being Ctrl+G, the escape sequence itself is \a (alert)... Medinoc (talk) 08:46, 10 September 2008 (UTC)

  • After consulting the n1256 ISO/IEC 9899:TC3 document, I can confirm it's \a. Medinoc (talk) 08:48, 10 September 2008 (UTC)
  • We really need to label those escape sequences somehow anyway. Ultimately the codes are not related to the control characters themselves, but to C. CrispMuncher (talk) 21:28, 10 September 2008 (UTC)
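
A quick way to check the point made in this thread, sketched in Python rather than C (Python happens to use the same \a escape): the escape sequence names the character BEL (code 7), while Ctrl+G is just one keyboard route to the same code. The masking with 0x1F is the usual description of what the Control key does on ASCII keyboards and is given here as an illustration only.

```python
# The escape sequence \a ("alert") denotes BEL, code 0x07.
bel = "\a"
print(ord(bel))  # 7

# Ctrl+G: the Control key conventionally keeps only the low five bits
# of the letter's code, so 'G' (0x47) becomes 0x07.
ctrl_g = chr(ord("G") & 0x1F)

# Both routes name the same character; the letter in the escape ('a')
# and the letter on the key ('G') are unrelated.
print(ctrl_g == bel)
```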

ANSI X.64 and VT100

I've snipped a statement that the ANSI X.64 standard is based on the VT100 terminal; it is the other way around. The VT100 manual is online at http://vt100.net/docs/vt100-ug/ and clearly states this. I also removed the year of the standard because this reference gives the date as 1977, not 1979. I'm not sure if 1979 was plain wrong or if there was a revised version. CrispMuncher (talk) 19:32, 15 September 2008 (UTC)

wiki formatting: caret notation

This is not the right place, but how does one write caret notation in italics? — Preceding unsigned comment added by 2A02:8422:1191:6E00:56E6:FCFF:FEDB:2BBA (talk) 20:17, 27 February 2013 (UTC)

Link to dab page

I have removed the link to a dab page; I'm not sure why it was re-added. Mo ainm~Talk 15:04, 3 October 2013 (UTC)

Base58 Rafiq34 (talk) 11:41, 20 September 2019 (UTC)

External link broken

http://kikaku.itscj.ipsj.or.jp/ISO-IR/001.pdf

DNS error. — Preceding unsigned comment added by 2003:E8:23D6:CBC1:C81A:2992:3136:F6E7 (talk) 19:39, 23 July 2017 (UTC)

Fixed. (It's currently at https://www.itscj.ipsj.or.jp/iso-ir/001.pdf.) Guy Harris (talk) 20:26, 23 July 2017 (UTC)

Organization of article

There are several issues with the current organization of the article:

  1. Many control characters exist in multiple character sets, at different code points.
  2. There should be a section on characters that condition the interpretation of following displayable code points, e.g., FIGS, GE, LTRS, SI, SO.
  3. There are control characters without stand-alone articles, for which anchors would be appropriate.
  4. Control character#How control characters map to keyboards has WP:NPOV issues; it is ASCII-specific and PC-specific.

I propose that the article be restructured into generic descriptions, with anchors, of all of the control characters and tables for at least ASCII, Baudot, EBCDIC and Unicode encoding. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:42, 13 August 2021 (UTC)

ASCII DLE description inverted

DLE was normally used to indicate that the following character is a payload character and not to be interpreted as a control, which is the opposite of the description given.

The usual convention was that a control byte in the payload would have bit 6 inverted and be prefaced by DLE, for example, DLE in the payload would become DLE 'P' in the transmission. This is easily and unambiguously reversed upon reception, and avoids other control bytes being inadvertently included in the transmitted message.

Sadly this wasn't universally understood, and many manufacturers failed to note the significance of "invert bit 6", and a few did indeed create equipment that required the DLE STX sequence at the start of a packet.

I would like the purpose of DLE to be clarified, and the varied implementations highlighted. Martin Kealey (talk) 01:49, 16 April 2022 (UTC)
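
To make the convention described above concrete, here is a minimal sketch of that kind of DLE escaping, assuming that "bit 6" means the bit with value 0x40 (which is what turns DLE, 0x10, into 'P', 0x50) and that "control byte" means 0x00 through 0x1F plus DEL. The function names `dle_stuff` and `dle_unstuff` are hypothetical, and actual link protocols differed in which bytes they escaped.

```python
DLE = 0x10


def is_control(byte: int) -> bool:
    # Assumption for this sketch: C0 controls plus DEL count as controls.
    return byte < 0x20 or byte == 0x7F


def dle_stuff(payload: bytes) -> bytes:
    """Prefix each control byte with DLE and invert its 0x40 bit."""
    out = bytearray()
    for byte in payload:
        if is_control(byte):
            out += bytes([DLE, byte ^ 0x40])
        else:
            out.append(byte)
    return bytes(out)


def dle_unstuff(data: bytes) -> bytes:
    """Reverse dle_stuff: the byte after each DLE gets its 0x40 bit restored."""
    out = bytearray()
    stream = iter(data)
    for byte in stream:
        if byte == DLE:
            escaped = next(stream, None)
            if escaped is None:
                raise ValueError("truncated input: DLE with no following byte")
            out.append(escaped ^ 0x40)
        else:
            out.append(byte)
    return bytes(out)


print(dle_stuff(b"\x10"))  # a DLE in the payload is sent as DLE 'P'
```

With this scheme the only control byte that ever appears inside the stuffed payload is DLE itself, so framing bytes such as STX or ETX cannot show up there by accident.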

LTRS and FIGS in Baudot?

Should the article mention FIGS and LTRS in the 5-bit Baudot code? Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:42, 1 November 2022 (UTC)

@Chatul I'm inclined to agree; they change the internal state of the receiver rather than printing anything, which would appear to meet the definition of a control character. Martin Kealey (talk) 15:15, 1 November 2022 (UTC)
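
As an illustration of the "changes the internal state of the receiver" point, here is a small sketch of a shift-state decoder. The values used for LTRS and FIGS are the ones I believe ITA2 assigns (all five bits set, and 11011 respectively), but `TOY_TABLE` is a made-up three-entry table for demonstration only, not a faithful Baudot or ITA2 code chart, and the assumption that the receiver starts in letters case is mine.

```python
LTRS = 0x1F  # shift to letters case
FIGS = 0x1B  # shift to figures case

# Illustrative only: code -> (character in letters case, in figures case).
TOY_TABLE = {
    0x01: ("E", "3"),
    0x03: ("A", "-"),
    0x04: (" ", " "),
}


def decode(codes):
    """Decode 5-bit codes, tracking the current shift state."""
    figures = False  # assumed initial state: letters case
    out = []
    for code in codes:
        if code == LTRS:
            figures = False      # prints nothing, only changes state
        elif code == FIGS:
            figures = True       # prints nothing, only changes state
        else:
            out.append(TOY_TABLE[code][1 if figures else 0])
    return "".join(out)


print(decode([0x01, FIGS, 0x01, 0x03, LTRS, 0x03]))
```

The same code (0x01 here) prints differently depending on which shift code was received last, and the shift codes themselves produce no output.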

Side effects

Control characters have been used for their side effects, e.g., rattling the print element on an IBM 2741. Would a section on such uses be TMI? -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 22:45, 15 January 2024 (UTC)