Talk:Second-generation programming language

From Wikipedia, the free encyclopedia

The idea of "generations" of programming languages appears to have arisen as a bit of marketing jargon particularly around the epoch of the so-called "fourth-generation" languages. The proposed distinctions imply that trends in language popularity are progressive rather than being driven by a combination of marketing fads and shifting requirements.

It is increasingly obvious, however, that this implication is false: while there is a broad general trend towards greater abstraction from the hardware, it is not monotonic. For instance, see the decline in popularity of the more-abstract language Lisp in favor of the closer-to-hardware C and C++ in the 1980s and '90s. Nor is there a determined trend towards application specificity; see, for instance, the demise of special-purpose COBOL in favor of general-purpose Java in business applications.

Of course, changes in language popularity are not driven entirely by marketing. COBOL lacks standard libraries to talk to Internet clients; Java has them. As talking to the Internet becomes more important for the problem domain, usage migrates to a language where it is natural: Java. Likewise in other domains: biological science programming, once dominated by Fortran, acquires a need for text processing due to the rising importance of genomics, and begins to migrate to Perl. These changes are not toward greater application specificity, but rather toward closer fit to changing application requirements.

(Indeed, the newly adopted languages often lack underlying application-specific features the old ones have: Java does not have fixed-point decimal numbers, a COBOL feature valuable for business applications.)
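
To illustrate why exact decimal arithmetic matters for business applications, here is a minimal sketch in Python. Python's decimal module is only a stand-in for COBOL's fixed-point decimal fields (it is a decimal arithmetic library, not COBOL's fixed-point type), and the amounts are made up for the example.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 or 0.20 exactly,
# so the sum is not exactly 0.30.
float_total = 0.10 + 0.20

# Decimal arithmetic keeps the cents exact (stand-in for a COBOL decimal field).
decimal_total = Decimal("0.10") + Decimal("0.20")

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
```

A language without a built-in decimal type can still get this behaviour from a library, but the programmer has to ask for it explicitly rather than getting it from an ordinary numeric declaration.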

What's my point? The idea of successive "generations" of programming languages replacing one another at higher levels of abstraction and application specificity is not historically accurate after, say, 1960. (COBOL, Fortran, and Lisp all existed in 1960.) Wikipedia should not present it uncritically, but rather note it wherever it appears as folk-history and marketing jargon rather than historical reality. --FOo 15:23, 8 Dec 2003 (UTC)

I'll add another confuser -- I don't think you can separate language from era. We could also look at the term being languages of second-generation programming; where both the knowledge and the tools are advancing which enables and changes how everything is done. First generation programming is wires or toggle switches and the language is thus machine language; second generation goes to keyboards and memory and thus assembly can be and is done; third generation has punched cards or tape and computers with discs that can compile so compiled languages; then we get video editing and floppies in the 80s and languages come with callable libraries extending the language with basis of programming books like Knuth so there's common algorithm and architecture ideas to procedural work, and linguistic concept languages like LISP, APL, ADA, C++ object-oriented and so on. Then 5th generation programming could be considered the cell phone and app store era, with collaboration and multi-language CM tools in play and more about glue code between or extending large foundation products. 71.88.51.20 (talk) 13:31, 21 July 2013 (UTC)

Scope/accuracy of this article

This article, the other articles in the programming language generation series, and the first comment on this discussion page, all characterize language generations in a way that seems strange to me.

In particular, the idea of calling C a second generation language is bizarre. Its relatively low level notwithstanding, this flies in the face of how the term has always been used. Just the fact that it is possible (important!) to write optimizers for C, and that assembly language programmers shake (or rather, shook) their heads about the inefficiencies of C code generation, would seem to bear this out.

My understanding of the history of this term – but one for which I am as yet unable to find good sources – is as follows:

  • First generation languages are simply the numerical machine code of a particular processor. In general-purpose computing, machine code was only used on the very first computers. Those computers themselves were termed "first generation" hardware, and were based on vacuum tubes or mechanical relays.
  • Second generation languages are symbolic assembly languages that provide a mnemonic sugaring for machine language. By the 1950s, as second-generation (transistorized) hardware became dominant, essentially all processors were shipped with assemblers.
  • Third generation languages were first developed in the 1950s. They originally were just called "higher level languages", in contrast to "assembly language"; "generations" was still primarily a term applied to hardware. Important early HLL efforts, as everybody knows, included FORTRAN, COBOL, PL/I, ALGOL, and LISP.
  • The term fourth generation language became popular in the 1970s to describe nonprocedural languages – languages with significant processing that occurred "behind the scenes", instead of being directly specified by the programmer. [update: Though according to a citation on the 4GL article, James Martin asserts that the term was in use in the 1960s.] These included constraint-based languages, output-oriented languages (e.g. report writers), application generators, and database languages, among others. A few random names include RAMIS, NOMAD, FOCUS, QUEL, SQL [update: some object to considering SQL and other query languages as 4GLs, but these were in fact archetypes of the concept at the time; note that they were called fourth-generation languages, not fourth-generation programming languages], Smalltalk, and the programming languages associated with most database systems of the day, as well as domain-oriented systems like SPSS. Most were attempts to allow end users to specify processing requirements in the language of their fields – bringing the computer to a subject matter expert, rather than forcing such users to become computer experts. Many of these languages were interpreters. The distinction between fourth generation and third generation languages increased the sense that assemblers were second-generation languages. Note that powerful earlier languages such as LISP, Simula, and even FORTRAN and Algol (when considered alongside their function libraries) were capable of great complexity and sophistication; but that the term 4GL was generally reserved for languages that were geared to an end-user problem space.
  • The term fifth generation language was invented to describe various approaches to language design after – well, after the term 4GL came into wide use. It emerged around the time of Japan's famous fifth generation computer project of 1982. It has been variously applied to the language concept du jour since then. [update: Its value would seem to be in identifying languages that aren't like the 3GL procedural languages nor the 4GL nonprocedural/domain-oriented languages.]
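
The "mnemonic sugaring" in the first two bullets can be sketched in a few lines of Python. This is only an illustration: the three-instruction accumulator machine and its opcode numbers are invented here and correspond to no real processor.

```python
# Hypothetical opcode table for an imaginary accumulator machine (not a real CPU).
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3}

def assemble(source):
    """Translate symbolic lines like 'ADD 11' into (opcode, operand) number pairs."""
    program = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        program.append((OPCODES[mnemonic], int(operand)))
    return program

# "Second generation": symbolic mnemonics a person can read.
second_generation = """
LOAD 10
ADD 11
STORE 12
"""

# "First generation": the purely numeric form the machine would execute.
first_generation = assemble(second_generation)
print(first_generation)  # [(1, 10), (2, 11), (3, 12)]
```

Each source line yields exactly one numeric instruction, which is the one-to-one relationship that distinguishes an assembler from a compiler for a higher-level language.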

I present this for what it's worth. Having programmed using languages in all generations, fought the good fight to introduce higher-level languages into second-generation shops, and participated in the development of so-called fourth-generation and fifth-generation languages, I would hope this description isn't too far off.

I agree with the comment above that we need to distinguish the historical use of nth-generation language from neologisms based on programming complexity/hierarchy. However, since the latter seems to be the current trend in practice, perhaps both views need to be presented, on these pages and on a new article about programming language generations that all of these should use as a main article. Trevor Hanson 18:18, 7 October 2007 (UTC)

I'm not sure exactly when these terms were coined. I'm quite certain that machine code and assembler language were both developed without the inventors realising that these terms applied (a bit like World War One not being known by that name until after WWII) and that they have come about from an historical perspective, probably when people started looking at languages like COBOL and needed a way to distinguish them from what had come before. As far as I am concerned, these are abstract terms which attempt to characterise a language by the proximity of its syntax to the underlying machine code. I can see the point of view that compares C with a 2GL because there is often a one to one relationship between a command and a machine instruction - but I think the portability of C definitely makes it a 3GL. I'm sure that a case can be made for specific languages falling between generations, or having attributes which make it difficult to say which it truly belongs to. However, I don't think this is the point of the series of 1GL, 2GL, 3GL ... articles because they are, or should be, discussing abstract concepts, rather than categorising individual programming languages.
Potentially you could merge this whole series into a single programming language generations article. SilentC 03:33, 8 October 2007 (UTC)

Something worth considering: It may well be that these terms are ill-defined; that they are inconsistent, self-undermining, and ultimately nonsense.

As I understand it, these terms were largely advanced by software marketing folks. They wanted their own companies' software tools considered to be of a higher "generation" than other people's tools -- particularly database companies pushing "fourth generation languages", meaning "report generators" ... with the implication of higher generation being, of course, that these were newer, more advanced, more powerful, more productive tools.

Naturally, these terms caught on with people who like to think of themselves as expert in the newest, coolest tools. But that's precisely because they aren't very descriptive terms. They're mostly wind. --FOo 05:23, 8 October 2007 (UTC)

Nevertheless, these terms have been used for a long time – if Martin is correct, the term 4GL dates from the 60s – and not just for marketing purposes, and NOT simply to describe the proximity of a language to the hardware. They largely paralleled the generations of hardware, which really DID have generations in a measurable sense; though this no doubt made it harder to be precise about what exactly constituted a 3GL versus a 4GL. There are assertions here and elsewhere that these were primarily marketing terms – and "mostly wind" – and there is no doubt there was plenty of wind on the topic. Yet I don't think the terms are devoid of meaning; and they certainly are not devoid of history. Our primary focus here should probably be on historical usage, with a nod to whatever currency they may have in contemporary academic taxonomy of programming languages. Does somebody have Jean Sammet's book? What does she say about language generations? Trevor Hanson 07:53, 8 October 2007 (UTC)
And just to remind everybody, the term is 'nth generation language' rather than, say 'nth-ring meta-level language'. This would intrinsically seem to refer to an evolutionary process of language design over time, rather than a measure of complexity or abstraction. Later generations have built on their predecessors. In this view, a later language that is radically different from its predecessors would be described as a different generation; if it is not different, then it's the same generation. I think this is how the term has primarily been used. But again, we probably need to defer to how contemporary educators teach computer languages and computer science history. It doesn't matter so much what we may individually remember or believe about this hierarchy – or its non-existence. Trevor Hanson 08:05, 8 October 2007 (UTC)
Well sorry but no I disagree with that. The terms might have been derived from the terminology used in hardware, but every definition of language generation that I have ever seen discusses the languages encompassed in terms of the relationship to machine code. Yes each generation builds on the one before, but the building has little to do with the underlying hardware and everything to do with the layers of abstraction between the machine and the programmer. 1GL = machine code. 2GL = human readable mnemonics for machine code. 3GL = abstraction from the machine layer. 4GL = syntax understood by non-programmers. 5GL = visual tools which create underlying code. You can waffle and pontificate as much as you like, but that is the simple definition of the concept that these articles should describe. I'm not debating that the evolution of hardware has enabled these latter generations to come to be, but that is not the point. As a pure abstract concept, language generation is simply about how detailed a knowledge the programmer has to have of the actual machine instructions that underlie his or her code. SilentC 23:38, 8 October 2007 (UTC)
Sorry, I didn't make myself clear. Of course what you're describing is how we tend to use these terms today (and your summary is good); and moreover, the terms are probably only useful in terms of that hierarchy. And I also agree that the focus of these articles should be on levels of abstraction. However, I submit that a Wikipedia article about these terms needs to be very clear on their historical use; and I don't believe that their reflection of levels of abstraction away from the hardware became explicit until relatively recently, i.e. the 80s. I think some languages that today we'd consider fourth-generation languages were originally considered third-generation, strictly because of the time at which they were developed. APL and Simula, and perhaps Lucid, Clu, SNOBOL, and Smalltalk, might be examples. At any rate, any taxonomy that might call C a 2nd generation language would seem to fly in the face of historical usage, and this fact needs to be clear. Trevor Hanson 00:56, 9 October 2007 (UTC)
Fair enough. Apologies if I came on a bit strong. This is why I think it would be better if the articles were all merged into an article on generations of programming language: there's obviously some historical context that needs to be wrapped around the whole topic and it probably wouldn't be out of place to mention that the terms have been 'misappropriated' by marketing departments. So it could start out by explaining what we mean by a language generation, and how the term came into common use. Then it could expand on the 5 generations. I think any statement along the lines of 'some people consider C to be a 2GL' can either be omitted, or if kept, require an academic reference. I believe it is generally accepted that C is a 3GL. SilentC 02:25, 9 October 2007 (UTC)
Yes, I think we're actually on the same page, or would be if I could get some sleep. My starting point was the condition of the current articles, and the need to get some eyeballs on them. (Though there is one thing I like about them: that nice little template on the bottom. I have to agree with you about merging the five into a single article. But the template looks so...logical. :) Maybe we could keep the template, if we cut them all down to little stublets that link to the main article, mostly existing for the purpose of animating the template. Each could show a code snippet of the same algorithm, programmed in languages with different levels of abstraction. James Martin did that in a 4GL book back in the 70s.) Anyway, I would argue that your crisp description of language abstraction levels (modulo a little hand waving about how to describe 4GL and 5GL) is a good counterargument to the position that these terms are either meaningless or not useful. Abstraction seems like a very helpful way to view language relationships. Trevor Hanson 09:16, 9 October 2007 (UTC)
"Each could show a code snippet of the same algorithm, programmed in languages with different levels of abstraction." Now that is a nice idea. But please, not Hello World! I don't see an issue with keeping the sub-articles. Some editors have a thing about stubs but I believe they often serve a purpose. Bit busy at the moment, but can probably help out in the next couple of weeks if someone else doesn't get to it first. SilentC 23:01, 9 October 2007 (UTC)

My handful of Eurocents: I agree with the characterization of these terms; they were invented in the 1970s and popular in the 1980s, they haven't really been updated since, and they were never more than a very rough, informal characterization of the languages (over 5000) that already existed at the time. I think it would be best to merge all of these articles into a single one, ("Generations of programming languages"?) and include the above text there. Rp (talk) 14:57, 28 July 2008 (UTC)

Historical Accuracy

I'm finally getting a bit of time to deal with this material, which worries me because it appears to be a modern attempt to re-define old terminology rather than an accurate description of the way the terms were defined and used in their time. I've been working with programming languages since the early 1960s and lived through most of this. It is my recollection that the terms "first-generation" and "second-generation" programming languages were not in any real use until the "marketing" need arose to indicate that "third-generation" languages were an improvement. Prior to that time, the terms "assembler language" and "high-level language" were in use, with assembler language having the usual meaning in which the programmer wrote in a symbolic but more-or-less one-to-one equivalent of the machine's language. I say "more or less" because assemblers like FAP even then started to include macro facilities. "Higher level" languages included FORTRAN, COBOL, MAD, IAL etc. Contrary to some claims (occasionally made elsewhere), these languages had constructs for weakly enforced data type declaration, subroutines/functions, flow of control, and shared (global) storage, and were automatically mapped to assembler or machine code without the programmer having to manage issues like: general layout of memory for data and program, choice of machine-specific instructions and sequences, use of machine registers and storage, and establishment of machine-level linkage conventions.
"Third-generation" languages appeared with ALGOL and PL/I, and were seen as a great improvement because they could be described with context-free grammars, they automatically managed dynamic storage (especially through the use of a "stack"), and therefore permitted recursive programs, they enforced a stronger type-checking discipline (although one that generally allowed for explicit, planned exposure of memory layout) AND introduced many, varied enhancements of capabilities like: array operations, complex interplays of automatic type conversions, task-management, and references to explicitly managed data objects. This generation of languages reached its "peak" with ADA.

LISP, APL, and other "non-commercial" languages were never really included within the "generational" terminology, although LISP was sometimes talked about as "second-generation" because of the era of its introduction. Many practitioners of the main-line saw the way forward as languages that would include data-structure choice, like SETL, and those were called "very-high-level languages".

After that, the story meanders. "Fourth generation" language was a term used in a catch-all attempt to market languages that integrated general algorithmic languages with other kinds of language, such as data base manipulation languages. But it never caught on as a useful technical or marketing term, and "fifth-generation" languages encoded the promise that non-determinism would be practical, emerging in languages like PROLOG and its descendants and synergizing with the AI-oriented "fifth-generation computing" of its time. And since then, languages seem to be "functional", "oriented", or domain-specific rather than "generational".

That's history as I saw it pass, and I have programmed in all those languages, (well, ... except IAL), and written compilers or interpreters for many. I've worked with and listened to stories of many of the people involved in creating Fortran and provided an occasional listening ear to Jean Sammet while she was writing her book on the history of programming languages. I know of no-one working in the field before 1970 who would have considered “second-generation” programming languages to be assembler languages. So, I would like to undertake to revise this page (and related pages) along these lines. I'm fairly sure I can find contemporary material to document the progression, but memories are always individual so if someone has a reference (other than to a new article attempting to guess what happened - and I've seen a couple), I'm happy to try to find a wider view.

I'm not sure that separate pages for the "generations" is the right way to do things, especially with the meandering away from "generations" since 1980, but I don't myself propose to change this overall structure, just the content. CSProfBill (talk) 14:28, 13 August 2009 (UTC)

So, it seems the issues are settled ... who is going to do the editing? I can have a go, but not in the coming weeks. Rp (talk) 18:29, 14 August 2009 (UTC)
As I recall from an early 1980s magazine (I think either Practical Computing or Personal Computer World), the terms first-, second- and third-generation were used in the 1970s by a computer historian to describe what had already happened regarding both hardware advancement and programming language sophistication. These terms were later noticed by some advertising/marketing types, who then rushed to describe their product as "fourth-generation" or even "fifth-generation" when they represented no significant advance over third-gen. In my view, 1-3 have historic credibility; 4 & 5 have not. Afraid that's POV though.
Second-generation languages described in that magazine included the autocodes; these are not mentioned in this talk or the article. They resembled assembly languages in that for many instructions there was a 1-1 relationship between the source code and object code; however they also had an expression parser which could produce several object code instructions from one source line. --Redrose64 (talk) 16:41, 15 August 2009 (UTC)
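
The one-line-to-many-instructions behaviour described for the autocodes can be sketched roughly as follows. This is a hedged illustration only: it borrows Python's own ast parser instead of a period expression parser, and the stack-machine instruction names (PUSH, MUL, ADD, POP) are invented rather than taken from any real autocode or machine.

```python
import ast

def compile_expression(target, expression):
    """Emit hypothetical stack-machine instructions for 'target = expression'."""
    ops = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}
    code = []

    def emit(node):
        if isinstance(node, ast.Name):
            # A variable reference becomes a single load onto the stack.
            code.append(f"PUSH {node.id}")
        elif isinstance(node, ast.BinOp) and type(node.op) in ops:
            # Operands first, then the operator (postfix order).
            emit(node.left)
            emit(node.right)
            code.append(ops[type(node.op)])
        else:
            raise ValueError("unsupported construct in this sketch")

    emit(ast.parse(expression, mode="eval").body)
    code.append(f"POP {target}")
    return code

# One source line expands into six object-code instructions.
object_code = compile_expression("x", "a + b * c")
print(object_code)
# ['PUSH a', 'PUSH b', 'PUSH c', 'MUL', 'ADD', 'POP x']
```

A plain assembler would have required the programmer to write each of those six lines by hand, which is the distinction being drawn between autocodes and assembly languages.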

Reorganization

Pursuing the suggestions to reorganize this material, I have created a page called Programming language generations, using this and additional material, as explained on its discussion page. If interested, please go there and make further improvements. Thanks CSProfBill (talk) 14:33, 23 September 2009 (UTC)

I would much prefer to have all pages on n-th generation programming language redirect to that page and use Trevor Hanson's excellent summary above as the core of the article text. If nobody objects I'm going to issue a merge proposal to do it. Rp (talk) 08:33, 8 August 2011 (UTC)