Talk:Code generation (compiler)

This is the talk page for discussing improvements to the Code generation (compiler) article.
This is not a forum for general discussion of the article's subject.

Untitled

Code Generation is not just done from source code to machine code as stated in the introduction! It is rather about transforming data (e.g. models or other source code) written in one language (or conforming to one metamodel; cf. Code Generation by Model Transformation. A Case Study in Transformation Modularity) into code conforming to another language! If this isn't too detailed, could somebody please add this? 141.76.178.215 (talk) 09:50, 23 September 2008 (UTC)

Please don't mindlessly revert the page:

"automatical" is not a word;
yacc is not generated by a compiler-compiler; it is a compiler-compiler itself. I could see it on a list of compiler-compilers on the compiler-compiler page but I don't see it worthy of special mention on this page.
JITs don't necessarily work from bytecode; in any event, this point should be discussed in just-in-time compiler.
A preprocessor is a simple compiler, but it is not a code generator.

-Eric

Fixing spelling and grammatical errors is never a problem.
I agree that yacc is a compiler-compiler; that was an error. There is no need to remove the remark altogether, though.
It is not strange to think of a JIT as a code generator, since it generates code.
A preprocessor generates code.

If you don't want your edits to be mindlessly reverted, please don't erase text mindlessly in the first place. -- Taku 05:53, Apr 4, 2005 (UTC)

Relationship to Metaprogramming?

This article doesn't even mention the term "metaprogramming", though it would seem that it should. —Preceding unsigned comment added by 65.0.192.82 (talk) 11:02, 25 June 2010 (UTC)

Product links

If nobody can justify keeping the product links at the end of the article, then I plan to remove them. There are just too many products that do code generation to allow for any reasonably short and neutral list here. There are ten pages of code generation products listed at http://www.codegeneration.net/generators.php, for example. Is there some non-arbitrary rationale for choosing a few? --Ds13 22:17, 3 January 2006 (UTC)

Layman use of "code generation" vs compiler code generation

This article currently confuses code generation as the compiler phase after lexing, parsing, and before assembly, with layman use of the phrase "code generation" for anything that generates source code (perhaps that can be called "source code generation"?). I am planning on splitting these and creating a disamb. —Quarl ^(talk) 2006-01-14 01:59Z

Please do, if I don't beat you to it. I think of code generation as an IDE feature which creates the framework of a method. For example, you double-click on a control (widget) and Visual Studio inserts the method declaration for the control's default event into the form's source code. --Uncle Ed 17:21, 16 November 2006 (UTC)

Done, created a disambiguation page. Starting off with the three connotations discussed so far here. If there are any others, you now have a place to hang your hat. dr.ef.tymac 02:00, 23 November 2006 (UTC)

I don't like the decision, and in particular the way it was carried out, which removed some general discussion. I see the confusion in the article, but I don't think splitting is the way to go. There is a good amount of gray area, like JIT, which belongs to both fields: compiler construction and layman use. Also, the old article discussed, albeit briefly, why it is crucial that code generation is fast, etc. Anyway, I am reorganizing the original article to clarify the general and the specifics. -- Taku 00:49, 2 March 2007 (UTC)

Slow down please Taku. You should state clearly what you are doing before you go tagging articles for speedy deletion. Multiple editors expressed a view supporting the article you just summarily chose to delete. There were reasons for that. Moreover, your "layman use" distinction could use some clarification also. So far, there does not seem to be anything in your supporting rationale (other than personal preference) that justifies unilateral obliteration of the current article structure. dr.ef.tymac 01:21, 2 March 2007 (UTC)

Follow-up: For starters, you can define precisely what you are talking about when you say "layman use" and clear up that ambiguity, which seems to be a fundamental assumption of your particular viewpoint. dr.ef.tymac 01:27, 2 March 2007 (UTC)

Sorry for the speedy deletion; I just didn't expect an objection. Like I said, I thought I was restoring the points that were lost when the old article was split. As for the term "layman use", I didn't think the article needed to clarify what it is. Instead, I chose to edit the article so as to emphasize the distinction between the use in compilers and other uses (by putting in a header for now). Like I said, I still cannot see how it is possible to have the points I restored without having the structure in the current article, and I haven't seen any rationale for deleting them in the first place. The problem raised, as I understand it, is that the article confuses different uses of code generation (so naturally it was proposed to split the article). But I still think that the restored points are important. -- Taku 10:27, 2 March 2007 (UTC)

Destroyed article content

Taku: you obliterated a disambiguation page when you created a redirect. That's even worse than speedy deletion because it was done without an admin. You also seem to have a different understanding of what the "problem" was, which is why I requested clarification in the first place. Please consider these points before continuing:

1) Your proposed clarifications are not clearly stated. What "old article" points exactly do you intend to restore? Can you please post them on the discussion page first, so others can understand what you are even talking about?
2) So far you have given no rationale for why the disambiguation page needed to be deleted. You said you want to restore some older content; that's fine, but that is a separate issue. You can easily restore old content without deleting a disambiguation page. Just review the article history and copy what you want to restore. That is, unless the article history has been obliterated, which is apparently what you have done by making the redirect, even despite objections. dr.ef.tymac 15:10, 2 March 2007 (UTC)

Follow-up: I was mistaken; the page was actually speedily deleted, which is how the content was removed, not by the redirect. dr.ef.tymac 15:50, 2 March 2007 (UTC)

I was clearly too hasty in deleting a disambig page. Also, I created a redirect because otherwise there would be a red link to code generation, which is not good. I didn't state what I wanted to restore because I had already restored the material and made modifications so that you (and others) could see what I intended. Also, this is not a separate issue; I am not insisting on deleting a disambig page. My point is that the new structure (having two articles) doesn't work, and I wanted to have the old structure restored with some modification. Since the disambig page and the newly created article don't have much content (by which I mean we can simply recreate them if there is an objection), I didn't think deleting them was a big deal, which was clearly my mistake. -- Taku 22:47, 2 March 2007 (UTC)

I mean, I am still in the dark, because you keep saying that I destroyed something meaningful, which is really not the case. I just wanted to make my point, and the best and simplest way to do this is to make the article the way I intend. If there is an objection, it is a matter of seconds to undo the edit, for example by recreating a disambig page and so forth. Since you haven't countered the points I made earlier, I still can't see what your problem is. No sarcasm intended. -- Taku 22:53, 2 March 2007 (UTC)

rewrite

I like Quuxplusone's rewrite (in particular that it kept information that was lost in the original split). But it shows, I think, that we really don't need source code generation. If that article, like he said, is simply a place for spam links, and we can simply discuss this matter in this article, there is really no point in having two articles about code generation. The structure I am now proposing is to let this article mainly talk about compiler code generation and discuss other uses at the end of the article. -- Taku 00:30, 7 March 2007 (UTC) In fact, this can also make it easier to elaborate on the use of compiler code generation techniques outside compiler construction, and to compare compiler code generation with non-compiler code generation, which was hinted at in his rewrite (optimization issues, maybe?). -- Taku 00:41, 7 March 2007 (UTC)

If you want to rewrite this article, ok, but don't delete source code generation. The two articles are separate. If there is spam in the other article it can be removed without deleting the whole article. The two articles are separate. There's no reason to delete the separate article, at least not yet anyway. If you can rewrite this article so it is clear, authoritative, and obviates the need for another article, that's fine, but so far that hasn't happened, so deletion is not appropriate. The two articles are separate. dr.ef.tymac 00:57, 7 March 2007 (UTC)

Taku, I'm glad you like my rewrite, because I wrote it by first reverting to the revision before yours, and then building on that. :) I think that source code generation does deserve its own article, because:

That term as I understand it covers things like snippets that clearly aren't related to compiler code generation.
Source code generation is pretty much the opposite of compilation, so it doesn't make sense to have an article on it in Category:Compiler theory.
I think we all may be missing knowledge of some niche in which "source code generation" is used, unrelated to its mainstream "yacc and snippets" meanings. See Generative programming, which seems to be talking about the same thing (hence my merge suggestion); and see Program transformation and Data transformation, which are talking about some software-engineering niche I've never run into, but seems kind of related. Therefore, I'd like to leave Source code generation alone until someone merges it into Generative programming... and then, if you like, we can suggest a merge between Code generation (compiler) and Generative programming, but that seems far-fetched to me. --Quuxplusone 02:14, 7 March 2007 (UTC)

Hey! How about merging all symbolic-transformation-related articles under Formal language? That way, anytime a new "niche" is discovered, there's no ambiguity as to where it should go! ... *pause* ... Seriously, the apparent "anytime X is similar to Y, merge the two" strategy seems counter-productive here; "See also" works well, no? dr.ef.tymac 15:10, 7 March 2007 (UTC)

It really doesn't matter who makes a rewrite. I didn't like the first split because it (excuse me for repeating this) lost some information, so the idea of reverting naturally occurred to me, that's all. Anyway, I think my problem has to do with the (current) content of source code generation. For example, I don't think "C preprocessor" is often discussed in the same context as a db programmer using yacc to write a program, etc. So, in other words, source code generation (in its current form) feels more like just non-compiler code generation. I guess what I am trying to say is that we can make the scope of this article unambiguous, but that just makes the scope of the other half more ambiguous. It's just pushing the problem from here to there. Of course, I must realize that moving the problem from over there to here doesn't solve it either. Enough ranting. I (or we) will collect more examples first; then maybe I (or we) can make a better proposal. -- Taku 22:38, 7 March 2007 (UTC)

Benoit.dinechin 20:22, 10 March 2007 (UTC): Hi there!

I'm concerned by the complete lack of relevance of this page with regard to modern code generation as performed, e.g., in the compass code generator that I worked with at Cray Research in the late 90's, or the Open64 code generator I used for two production compilers whose development I led at STMicroelectronics between 2000 and 2004. Nowadays, with a team, I develop a JIT code generator for embedded media processors, but it is a proof of concept (not for production).

Modern code generation can be defined as: transform a processor-independent program representation into a processor-dependent low-level program representation, then optimize it into native machine code. This includes the following tasks (I have first-hand experience with each of them):

Instruction selection and ABI lowering

Basic flow analyses (liveness, loop nesting, dominance).

IF-conversion and predication.

Loop unrolling and tail duplication.

Extended peephole optimizations.

Control-flow simplifications and basic block alignment.

Prepass instruction scheduling and software pipelining.

Register allocation and stack frame layout.

Postpass instruction scheduling.

Instruction bundling (if VLIW) and encoding.
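To make one item from the list above concrete, here is a minimal sketch of liveness analysis as a backward data-flow fixed point. It is only an illustration, not how any of the code generators mentioned above implement it. The function name `liveness`, the block names and the variables `i` and `s` are all hypothetical.

```python
def liveness(blocks, successors):
    """Iterate the backward data-flow equations to a fixed point.

    blocks:     block name -> (use, defs), both sets of variable names
                (use = read before any write in the block).
    successors: block name -> list of successor block names.
    Returns (live_in, live_out), two dicts of sets.
    """
    live_in = {name: set() for name in blocks}
    live_out = {name: set() for name in blocks}
    changed = True
    while changed:
        changed = False
        for name, (use, defs) in blocks.items():
            # live_out = union of live_in over all successors
            out = set()
            for succ in successors.get(name, ()):
                out |= live_in[succ]
            # live_in = use + (live_out minus what the block defines)
            new_in = use | (out - defs)
            if out != live_out[name] or new_in != live_in[name]:
                live_out[name], live_in[name] = out, new_in
                changed = True
    return live_in, live_out


# Hypothetical CFG: entry -> loop -> (loop | exit)
blocks = {
    "entry": (set(), {"i", "s"}),        # i = 0; s = 0
    "loop": ({"i", "s"}, {"i", "s"}),    # s = s + i; i = i + 1
    "exit": ({"s"}, set()),              # return s
}
successors = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}
live_in, live_out = liveness(blocks, successors)
```

In this example both `i` and `s` are live out of `entry` and across the loop back edge, which is the kind of information a register allocator needs before it can decide which values must stay in registers.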

Modern code generation, in particular JIT compilation, adds to this picture SSA construction, optimization, and destruction. SSA is much more difficult to handle in a code generator than in the machine-independent scalar optimizer, because the code generator representation exposes the register pinning constraints (from the ISA and ABI) and also predicated code (the non-kill definitions have an impact everywhere).

If previous authors of this page agree, I am willing to reorganize the Compiler Code Generation page along these guidelines.

My recent edit

To the reverter of my edit: First of all "assembler code" is an improper term. The proper term is assembly language or assembly code. This is a common error.

The output from a code generator can take a variety of forms. It could be C (I know of several compilers for other languages that treat C as their assembly language), a sequence of assembler mnemonics that then gets assembled (gcc works this way, as do most Unix compilers), or the binary form of object code. So the "e.g." is appropriate, as it is an example of one form of output. Derek farn 21:46, 1 July 2007 (UTC)
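The point about output forms can be illustrated with a toy sketch: one expression tree and two interchangeable back ends, one emitting C source text and one emitting stack-machine mnemonics. Everything here (`emit_c`, `emit_stack_asm`, the `PUSH`/`ADD`/`SUB`/`MUL` mnemonics, the tuple-based tree) is a hypothetical illustration and not taken from any real compiler.

```python
# An expression is either a leaf (a name or a number) or a tuple (op, left, right).
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL"}


def emit_c(node):
    """Back end 1: return the tree as a fully parenthesized C expression."""
    if isinstance(node, tuple):
        op, left, right = node
        return "(" + emit_c(left) + " " + op + " " + emit_c(right) + ")"
    return str(node)


def emit_stack_asm(node):
    """Back end 2: return a list of stack-machine mnemonics (post-order walk)."""
    if isinstance(node, tuple):
        op, left, right = node
        return emit_stack_asm(left) + emit_stack_asm(right) + [MNEMONICS[op]]
    return ["PUSH " + str(node)]


tree = ("+", "a", ("*", "b", 2))
c_text = emit_c(tree)            # "(a + (b * 2))"
asm_lines = emit_stack_asm(tree)  # ["PUSH a", "PUSH b", "PUSH 2", "MUL", "ADD"]
```

Both functions consume the same intermediate form; only the target language differs, which is why the choice of output format can be treated as a back-end detail.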

I suggest you read the article on assembly language. C cannot possibly be used as an assembly language. An assembly language directly represents machine code instructions as mnemonics, whereas C has many more levels of abstraction. It's all right to list assembly as a possible output, surely; changing that was not the spirit of my edit. However, the article on compilers, which is better sourced than this one, says in its section about code generation: "the transformed intermediate language is translated into the output language, usually the native machine language of the system." So I don't believe my change was incorrect. And the reason I moved to change it in the first place was that "assembler code" is considered incorrect usage.

And the current wording makes the statement incorrect, as assembly code cannot be, as the sentence states, "readily executed by the machine"; it has to undergo another transformation before that is possible.

Also bytecode is not a subset of assembly. It is an abstraction of machine code that works with a virtual machine instead of a computer processor.

All machine codes are abstractions. At the code generation level there is no difference between a so-called virtual machine and a computer processor. Your Java byte code may be interpreted or it may be executed on the Sun Jini chip. Your x86 code may be executed on an Intel/AMD chip or it may be executed on one of the many x86 emulators. Derek farn 21:46, 1 July 2007 (UTC)

Any human concept is an abstraction. The only reason I bothered to explain was because you seemed to have a fundamental misunderstanding of what assembly language was based on your edit summary. You said that "Byte code is a subset of assembler code" which from my perspective was untrue. I would have to concede here, because most of my encounters with this word have been in mathematical contexts, so I misunderstood your meaning at first. I still think it is misleading, however.

Those points are ancillary to the reason for my edit in the first place. In its old state, the sentence was wrong and also misleading. The code generator of a compiler does not produce assembly code; it produces machine code or bytecode. We can leave out bytecode if you wish, because it IS a subset of machine code. I got rid of the wording "machine(often a computer)" because compilers and code generators don't exist for machines besides computers. The closest example of such would be punch card type machines, but those didn't use compilers or code generators for their output.

It is a mistake to think the term compiler was invented when computers were invented. There do/did exist compilers for things like punched card weaving machines. Derek farn 21:46, 1 July 2007 (UTC)

Those cards were made on punch card cutters, which could hardly be considered a compiler or a code generator. Indeed, the word compiler isn't new, but its only meaning before its current computing-related one is "one who compiles". Per wp:cite it is your burden to provide proof that a device existed that transformed some form of source code into punched cards, AND, for it to be included in this article, that it was called a code generator or considered to be the predecessor of the modern concept of a code generator. If you can provide evidence to substantiate this, I will concede.

Compilers weren't even invented until after the invention of the electronic computer. I figured my long(ish) edit summary was enough, but apparently not. I am going to change it back. If it gets reverted without responding to my points here, I am going to delete that sentence and anything else here that is factually incorrect per wikipedia policies, because this entire article is uncited.--Shadowdrak 19:56, 1 July 2007 (UTC)

This threatening behavior is completely uncalled for and will only lead to you being banned. Derek farn 21:46, 1 July 2007 (UTC)

I wasn't threatening, I was stating my intentions. I would be within my rights under Wikipedia:Verifiability to do what I said, and I was merely stating this. My change agreed better with the rest of the computer science articles in its field, and I believe it provided better context. In the future, I would prefer it if you would stick to bottom posting style rather than breaking up my post.--Shadowdrak 23:33, 1 July 2007 (UTC)

False statements

There are technical problems with this article.

Some compilers do not work in phases. They do not have separate lexing, parsing, optimizing, etc. phases.

It is quite simple to parse and generate stack machine code as you go. META II works by emitting stack machine assembly directly out of the parser.

EXP = TERM 
      $('+' TERM .OUT('ADD')
       /'-' TERM .OUT('SUB'));
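For readers who do not know META II, the same idea can be sketched in Python: a recursive-descent parser for the rule above that appends stack-machine instructions while it parses, with no separate tree-building or code generation phase. This is only a rough analogue, not META II itself. The function `compile_expr`, the tokenizer, and the `LD` mnemonic used for terms are assumptions made for the illustration; only `ADD` and `SUB` come from the rule above.

```python
import re


def compile_expr(source):
    """Parse  EXP = TERM $('+' TERM / '-' TERM)  and emit code while parsing."""
    # Assumed tokenizer: integers, identifiers, '+' and '-'; anything else is skipped.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[+-]", source)
    out = []   # emitted instructions, in order
    pos = 0    # index of the next unread token

    def term():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in ("+", "-"):
            raise SyntaxError("expected a term")
        out.append("LD " + tokens[pos])   # emit as soon as the term is recognized
        pos += 1

    term()
    while pos < len(tokens):              # the $( ... ) repetition
        op = tokens[pos]
        if op not in ("+", "-"):
            raise SyntaxError("expected '+' or '-'")
        pos += 1
        term()
        out.append("ADD" if op == "+" else "SUB")   # the .OUT('ADD') / .OUT('SUB') actions
    return out


code = compile_expr("a + b - 2")
# code == ["LD a", "LD b", "ADD", "LD 2", "SUB"]
```

The operator is emitted only after its right-hand term has been parsed, which is what produces postfix (stack machine) order without any intermediate representation.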

Steamerandy (talk) 12:14, 10 October 2018 (UTC)