Modifying generated code - language-agnostic

Modifying generated code - language-agnostic

I'm wrapping a C++ library in PHP using SWIG and there have been some occasions where I want to modify the generated code (both generated C++ and PHP):
Fix code-generation errors
Add code that makes sense in PHP, but not in C++ (e.g. type checking)
Add documentation tags (e.g. phpDoc)
I'm currently automating these modifications with patch. This approach works, but it seems high-maintenance and fragile. Is there a better way of doing this?

The best bet is to have your code generator generate correct code for your needs. Hand-tweaking generated output is unsustainable. You'll have to tweak it again any time the input changes.
If a tool is producing flatly erroneous output, it's ideal to repair it and submit patches back to the maintainer. If the output is correct for some circumstances but wrong for yours, I'd suggest to add an option that changes the behavior to what you need.
Sometimes, you can use a short program that automatically does an intelligent job of patching your generated code, so that you don't need a manual process to make patches.
Alternatively, you could write your own code generator, but I suspect that's much more work than you want. It also depends on what you're doing. Sometimes code-generation is really just macro-expansion, and there are plenty of good tools for that in the wild.
Good luck!

You may end up having a maintenance nightmare later on. Instead of SWIG you might consider using another generative approach that:
Let you add your custom code directly on the model (so that you won't need to add it post-generation)
Let you define your own generator. This feature alone could take out the need to add custom code all along.
The problem of using third-party generators is that they never really generate what you want. The problem of writing your own code generators is that it's much more work. You choose.
But correcting an automation with another automation...

Code generation is quite a wide topic and there are definitely many other approaches, which might be more interresting to you as mentioned above.
But if you do not want to use other tool, depending on what code is generated and on the PHP OO capabilities, you might use the Generation Gap pattern.

Related

how to write and parse an a2l file

I need some help from the community. I'm trying to write a macro in vba that allows me to generate an a2l file. I found some guides on the net but it is not enough. It's my very first approaching and I find them unclear.
I ask you if anyone can help me find a detailed guide on what are the characteristics of an a2l file and how to interpret it.
Thanks for any help.

Do you mean that A2L = ASAM 2MC? If yes,
Please check (PyA2L) - looks like some people use it even I haven't tried yet. Maybe you can do something with it.
Personally I made my own A2L parser (my own project AutoExtractGui) but I realized that it takes quite big efforts, still having some bugs/issues/... not easy. I am using C# and even C# is one of most high-level language it needs very long code for A2L parsing especially.
Even you try to make your own parser, still you need to understand the A2L's format, meaning, how to use the contents, ... this is additional task you need to study/understand/look inside deeply. Good to study, it is true, but it also needs your big efforts. ASAM standard is still being updated and tools (INCA/CANape/...) are also being updated, A2L contents are also updated time by time. If you make your own parser then you should be ready to consume efforts for those topics.
Maybe such already-existing tools/projects might help your job I guess.

What is the best solution to obfuscate Angular JS controller code

I want to Obfuscate + Minify my Angular JS code in order to not make it public and if someone tries to decode it, then make it a hurdle. Code is running up on the server.
Note: In future we are planning to shift http to https.
I have seen a lot of options like Gulp, Google Closure Compiler, UglifyJS etc and many tool which a user can download and obfuscate the code like jsob, javascript obfuscate etc.
I need a suggestion and have few questions.
What is the more better approach apart from encryption?
If I shift to https shall I still require obfuscations?
What are the better and easy approaches with pros and cons?
If I use a tool like JavaScript obfuscate, then what will be its pros and cons? Am I able to get It back, I mean decode?
Or If someone is able to look into gulp file will it be easy to get my code?

1 - It really depends on what you are trying to achieve. If you really want to protect your code to hide your business logic, you should go for a resilient solution, instead of relying on a minification or obfuscation tool per se which is far too easy to defeat.
2 - Https simply means that the communication between your browser and website is encrypted. Https can also be decrypted, so it would make sense to apply other protection mechanisms
4 - JavaScript Obfuscator and several other tools do not protect the code, they are simple obfuscators and so they can be easily reversed in minutes and that's why some people think it's not worth protecting code on the client-side. In fact, you can get most of the original code using a simple JS optimizer. ClosureCompiler and UglifyJS have precisely this different approach, they reduce the size of the code and optimize it, they do not offer code protection.
3, 5 - I found this blog post from js13kGames competition creator quite useful for my case. He suggests a solution that seems to be more appropriate - Jscrambler. IMO you should give it a try as it combines code transformations with anti-debugging and anti-tampering features. You can also lock your code to a predefined list of domains or set an expiry date to deliver expirable demos, for example. Maybe it could be a fit for your case too as it supports Angular.

I've found a nice solution using gulp-uglify.
If you use implicit anotation, first use gulp-ng-annotate for not losing service names on uglify process.

Minify ceylon-sdk and ceylon-language when compiling to javascript

For an in-browser application written in ceylon-js it would be desirable to reduce the size of the ceylon.language-1.2.0.js file to only that what is actually needed.
This question was answered already.
How to use ceylon js (also with google closure compiler)
But the given solution involves manually editing javascript code resulting from compilation. This is not desirable since a compiler should produce code that hasn´t to be edited manually after compilation (abstraction).
And it is not clear to me if google closure compiler can cope with the ceylon flavour of it in a reliable way.
Is it instead a solution to copy ceylon.language source in ceylon into the project and import only those parts of ceylon.language into the project that are required by it? Then compile to javascript. And then leave away ceylon.language-1.2.0.js at all from the client / in-browser application.
Now my questions:
What parts are needed in the most simple browser application? I think of something like Array(String) and the like.
Has that solution a chance to work absolutely reliable?
Will there be a better solution coming from the authors of ceylon that make this attempt obsolete?

The compilation of the language module to JS is a tricky process, because of the native stuff involved and because there are a couple of declarations that have to be in a certain order for things to work.
Minification is still pending, we are going to do it but it's not the highest priority right now, and we have to determine the best way to solve this problem; one option that has been discussed is to have a version of the language module without any metamodel info, for example.

Do you use the debugger of the language to understand code?

Do you use the debugger of the language that you work in to step through code to understand what the code is doing, or do you find it easy to look at code written by someone else to figure out what is going on? I am talking about code written in C#, but it could be any language.

I use the unit tests for this.

Yes, but generally only to investigate bugs that prove resistant to other methods.
I write embedded software, so running a debugger normally involves having to physically plug a debug module into the PCB under test, add/remove links, solder on a debug socket (if not already present), etc - hence why I try to avoid it if possible. Also, some older debugger hardware/software can be a bit flaky.

I will for particularly complex sections of code, but I hope that generally my fellow developers would have written code that is clear enough to follow without it.

Depending on who wrote the code, even a debugger doesn't help to understand how it works: I have a co-worker who prides himself on being able to get as much as possible done in every single line of code. This can lead to code that is often hard to read, let alone understand what it does in the long run.
Personally I always hope to find code as readable as the code I try to write.

I will mostly use debugger to setup breakpoints on exceptions.
That way I can execute any test or unit test I wrote, and still be able to be right where the code fails if any exception should occur.

I won't say I used all the time, but I do use it fairly often. The domain I work in is automation and controls. You often need the debugger to see the various internal states of the system. It is usually difficult to impossible to determine these simply from looking at code.

Yes, but only as a last resort when there's no unit test coverage and the code is particularly hard to follow. Using the debugger to step through code is a time consuming process and one I don't find too fun. I tend to find myself using this technique a lot when trying to follow VBA code.

Does generated code need to be human readable?

I'm working on a tool that will generate the source code for an interface and a couple classes implementing that interface. My output isn't particularly complicated, so it's not going to be hard to make the output conform to our normal code formatting standards.
But this got me thinking: how human-readable does auto-generated code need to be? When should extra effort be expended to make sure the generated code is easily read and understood by a human?
In my case, the classes I'm generating are essentially just containers for some data related to another part of the build with methods to get the data. No one should ever need to look at the code for the classes themselves, they just need to call the various getters the classes provide. So, it's probably not too important if the code is "clean", well formatted and easily read by a human.
However, what happens if you're generating code that has more than a small amount of simple logic in it?

I think it's just as important for generated code to be readable and follow normal coding styles. At some point, someone is either going to need to debug the code or otherwise see what is happening "behind the scenes".

Yes!, absolutely!; I can even throw in a story for you to explain why it is important that a human can easily read the auto generated code...
I once got the opportunity to work on a new project. Now, one of the first things you need to do when you start writing code is to create some sort of connection and data representation to and from the database. But instead of just writing this code by hand, we had someone who had developed his own code generator to automatically build base classes from a database schema. It was really neat, the tedious job of writing all this code was now out of our hands... The only problem was, the generated code was far from readable for a normal human.
Of course we didn't about that, because hey, it just saved us a lot of work.
But after a while things started to go wrong, data was incorrectly read from the user input (or so we thought), corruptions occurred inside the database while we where only reading. Strange.. because reading doesn't change any data (again, so we thought)...
Like any good developer we started to question our own code, but after days of searching.. even rewriting code, we could not find anything... and then it dawned on us, the auto generated code was broken!
So now an even bigger task awaited us, checking auto generated code that no sane person could understand in a reasonable amount of time... I'm talking about non indented, really bad style code with unpronounceable variable and function names... It turned out that it would even be faster to rewrite the code ourselves, instead of trying to figure out how the code actually worked.
Eventually the developer who wrote the code generator remade it later on, so it now produces readable code, in case something went wrong like before.
Here is a link I just found about the topic at hand; I was acctually looking for a link to one of the chapters from the "pragmatic programmer" book to point out why we looked in our code first.

I think that depends on how the generated code will be used. If the code is not meant to be read by humans, i.e. it's regenerated whenever something changes, I don't think it has to be readable. However, if you are using code generation as an intermediate step in "normal" programming, the generated could should have the same readability as the rest of your source code.
In fact, making the generated code "unreadable" can be an advantage, because it will discourage people from "hacking" generated code, and rather implement their changes in the code-generator instead—which is very useful whenever you need to regenerate the code for whatever reason and not lose the changes your colleague did because he thought the generated code was "finished".

Yes it does.
Firstly, you might need to debug it -- you will be making it easy on yourself.
Secondly it should adhere to any coding conventions you use in your shop because someday the code might need to be changed by hand and thus become human code. This scenario typically ensues when your code generation tool does not cover one specific thing you need and it is not deemed worthwhile modifying the tool just for that purpose.

Look up active code generation vs. passive code generation. With respect to passive code generation, absolutely yes, always. With regards to active code generation, when the code achieves the goal of being transparent, which is acting exactly like a documented API, then no.

I would say that it is imperative that the code is human readable, unless your code-gen tool has an excellent debugger you (or unfortunate co-worker) will probably by the one waist deep in the code trying to track that oh so elusive bug in the system. My own excursion into 'code from UML' left a bitter tast in my mouth as I could not get to grips with the supposedly 'fancy' debugging process.

The whole point of generated code is to do something "complex" that is easier defined in some higher level language. Due to it being generated, the actual maintenance of this generated code should be within the subroutine that generates the code, not the generated code.
Therefor, human readability should have a lower priority; things like runtime speed or functionality are far more important. This is particularly the case when you look at tools like bison and flex, which use the generated code to pre-generate speedy lookup tables to do pattern matching, which would simply be insane to manually maintain.

You will kill yourself if you have to debug your own generated code. Don't start thinking you won't. Keep in mind that when you trust your code to generate code then you've already introduced two errors into the system - You've inserted yourself twice.
There is absolutely NO reason NOT to make it human parseable, so why in the world would you want to do so?
-Adam

One more aspect of the problem which was not mentioned is that the generated code should also be "version control-friendly" (as far as it is feasible).
I found it useful many times to double-check diffs in generated code vs the source code.
That way you could even occasionally find bugs in tools which generate code.

It's quite possible that somebody in the future will want to go through and see what your code does. So making it somewhat understandable is a good thing.
You also might want to include at the top of each generated file a comment saying how and why this file was generated and what it's purpose is.

Generally, if you're generating code that needs to be human-modified later, it needs to be as human-readable as possible. However, even if it's code that will be generated and never touched again, it still needs to be readable enough that you (as the developer writing the code generator) can debug the generator - if your generator spits out bad code, it may be hard to track down if it's difficult to understand.

I would think it's worth it to take the extra time to make it human readable just to make it easier to debug.

Generated code should be readable, (format etc can usually be handled by a half decent IDE). At some stage in the codes lifetime it is going to be viewed by someone and they will want to make sense of it.

I think for data containers or objects with very straightforward workings, human readability is not very important.
However, as soon as a developer may have to read the code to understand how something happens, it needs to be readable. What if the logic has a bug? How will anybody ever discover it if no one is able to read and understand the code? I would go so far as generating comments for the more complicated logic sections, to express the intent, so it's easier to determine if there really is a bug.

Logic should always be readable. If someone else is going to read the code, try to put yourself in their place and see if you would fully understand the code in high (and low?) level without reading that particular piece of code.
I wouldn't spend too much time with code that never would be read, but if it's not too much time i would go through the generated code. If not, at least make comment to cover the loss of readability.

If this code is likely to be debugged, then you should seriously consider to generate it in a human readable format.

There are different types of generated code, but the most simple types would be:
Generated code that is not meant to be seen by the developer. e.g., xml-ish code that defines layouts (think .frm files, or the horrible files generated by SSIS)
Generated code that is meant to be a basis for a class that will be later customized by your developer, e.g., code is generated to reduce typing tedium
If you're making the latter, you definitely want your code to be human readable.
Classes and interfaces, no matter how "off limits" to developers you think they should be, would almost certainly fall under generated code type number 2. They will be hit by the debugger at one point of another -- applying code formatting is the least you can do the ease that debugging process when the compiler hits those generated classes

Like virtually everybody else here, I say make it readable. It costs nothing extra in your generation process and you (or your successor) will appreciate it when they go digging.
For a real world example - look at anything Visual Studio generates. Well formatted, with comments and everything.

Generated code is code, and there's no reason any code shouldn't be readable and nicely formatted. This is cheap especially in generated code: you don't need to apply formatting yourself, the generator does it for you everytime! :)
As a secondary option in case you're really that lazy, how about piping the code through a beautifier utility of your choice before writing it to disk to ensure at least some level of consistency. Nevertheless, almost all good programmers I know format their code rather pedantically and there's a good reason for it: there's no write-only code.

Absolutely yes for tons of good reasons already said above. And one more is that if your code need to be checked by an assesor (for safety and dependability issues), it is pretty better if the code is human redeable. If not, the assessor will refuse to assess it and your project will be refected by authorities. The only solution is then to assess... the code generator (that's usually much more difficult ;))

It depends on whether the code will only be read by a compiler or also by a human. In addition, it matters whether the code is supposed to be super-fast or whether readability is important. When in doubt, put in the extra effort to generate readable code.

I think the answer is: it depends.
*It depends upon whether you need to configure and store the generated code as an artefact. For example, people very rarely keep or configure the object code output from a c-compiler, because they know they can reproduce it from the source every time. I think there may be a similar analogy here.
*It depends upon whether you need to certify the code to some standard, e.g. Misra-C or DO178.
*It depends upon whether the source will be generated via your tool every time the code is compiled, or if it will you be stored for inclusion in a build at a later time.
Personally, if all you want to do is build the code, compile it into an executable and then throw the intermediate code away, then I can't see any point in making it too pretty.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008