For learning purposes I would like to write Ethereum Smart Contracts directly in Assembler. By that I mean I want to write the opcodes from the yellow paper somewhere, which then gets converted to bytecode.
However there doesn't seem to be a pure Assembler for the EVM, only inline-assembly in various languages or LLL, which comes close to Assembler.
Is the only way to do this to write my own Assembler? It seems really surprising, since there are so many online disassemblers for it.
You can use ./evm compile from the Go Ethereum project (make sure to download the Geth & Tools archive):
https://geth.ethereum.org/downloads/
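If you do end up writing your own assembler for the learning experience, the core is small. The following is only a toy sketch, not how the evm tool works: the `assemble` function and the `OPCODES` table are names made up for this illustration, and only a handful of opcodes from the yellow paper are listed. It assumes whitespace-separated mnemonics and a hex immediate after each PUSHn.

```python
# Toy EVM assembler: a learning sketch, not a replacement for a real tool.
# Only a few opcodes are listed; extend OPCODES from the yellow paper.
OPCODES = {
    "STOP": 0x00,
    "ADD": 0x01,
    "MUL": 0x02,
    "MSTORE": 0x52,
    "RETURN": 0xF3,
}


def assemble(source: str) -> str:
    """Translate whitespace-separated mnemonics into hex bytecode."""
    tokens = source.split()
    out = bytearray()
    i = 0
    while i < len(tokens):
        mnemonic = tokens[i].upper()
        i += 1
        if mnemonic.startswith("PUSH"):
            width = int(mnemonic[4:])        # PUSH1..PUSH32
            if not 1 <= width <= 32:
                raise ValueError(f"bad push width: {mnemonic}")
            value = int(tokens[i], 16)       # immediate operand, written in hex
            i += 1
            out.append(0x60 + width - 1)     # PUSH1 is 0x60, PUSH32 is 0x7f
            out += value.to_bytes(width, "big")
        elif mnemonic in OPCODES:
            out.append(OPCODES[mnemonic])
        else:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
    return out.hex()


print(assemble("PUSH1 0x02 PUSH1 0x03 ADD"))  # 6002600301
```

Labels and jump destinations are deliberately left out; a real assembler needs a second pass to resolve them.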
There's a contract on BSC that isn't verified and I am really keen to get the code behind it. I have both the full bytecode and the ABI. Is it possible to obtain readable source code using these?
Thanks!
BSCScan has an integrated decompiler that produces pseudocode from the input binary bytecode.
It's not perfect: some of the resulting code performs overly complicated operations that could be written on one line in Solidity, and some functions cannot be decompiled at all. Still, it can help with manually reconstructing the source code.
There are other decompilers available online as well. Usually it helps to decompile the binary using multiple tools so that you get a better sense of what the source code should do.
There are several packages out there that help in automating the task of writing bindings between C/C++ and other languages.
In my case, I'd like to bind Python; some options for such packages are SWIG, Boost.Python and Robin.
It seems that the straightforward process is to use these packages to create C/C++ linkable libraries (with mostly static functions) and have the higher-level language be extended using them.
However, my situation is that I already have a developed, working system in C++ and therefore plan to embed Python into it so that future development will be in Python.
It's not clear to me how, and if at all possible, to use these packages in helping to extend embedded Python in such a way that the Python code would be able to interact with the various Singleton instances already running in the system, and instantiate C++ classes and interact with them.
What I'm looking for is an insight regarding the design best fitted for this situation.
Boost.Python lets you do a lot of those things right out of the box, especially if you use smart pointers. You can even inherit from C++ classes in Python, then pass instances of those back to your C++ code and have everything still work. My favorite resource on how to do various things is this (especially check out the "How To" section): http://wiki.python.org/moin/boost.python/ .
Boost.Python is especially good if you're using smart pointers or intrusive pointers, as those translate transparently into PyObject reference counting. Also, it's very good at making factory functions look like Python constructors, which makes for very clean Python APIs.
If you're not using smart pointers, it's still possible to do all the things you want, but you have to mess with various return and lifetime policies, which can give you a headache.
To make it short: There is the modern alternative pybind11.
Long version: I also had to embed Python. The C++ Python interface is small, so I decided to use the C API. That turned out to be a nightmare. Exposing classes makes you write tons of complicated boilerplate code. Boost.Python largely avoids this by using readable interface definitions. However, I found that Boost lacks thorough documentation, and for some things you still have to call the Python API. Further, its build system seems to give people trouble; I can't tell, since I use packages provided by the system. Finally I tried pybind11, which is modeled on Boost.Python, and have to say that it is really convenient and addresses some shortcomings of Boost: it removes the need to call the Python API directly, supports lambdas, has easily comprehensible documentation, and translates exceptions automatically. Further, it is header-only and does not pull in the huge Boost dependency on deployment, so I can definitely recommend it.
I'm creating a game in XNA and was thinking of creating my own scripting language (extremely simple, mind you). I know there are better ways to go about this (and that I'm reinventing the wheel), but I want the learning experience more than to be productive and fast.
When confronted with code at run time, from what I understand, the usual approach is to parse into a machine code or byte code or something else that is actually executable and then execute that, right? But, for instance, when Chrome first came out they said their JavaScript engine was fast because it compiles the JavaScript into machine code. This implies other engines weren't compiling into machine code.
I'd prefer not compiling to a lower language, so are there any known modern techniques for parsing and executing code without compiling to low level? Perhaps something like parsing the code into some sort of tree, branching through the tree, and comparing each symbol and calling some function that handles that symbol? (Wild guessing and stabbing in the dark)
I personally wouldn't roll your own lexer (turning the input into tokens) or parser (checking the input tokens against your language grammar). Take a look at ANTLR for parsing/lexing: it's a great framework and has full source code if you want to dig into the guts of it.
For executing code that you've parsed, I'd look at running a simple virtual machine or, even better, look at LLVM, which is an open-source(ish) attempt to standardise the virtual machine byte code format and provide nice features like JITing (turning your script's compiled byte code into native machine code).
I wouldn't discourage you from the more advanced options that you mention, such as native machine code execution, but bear in mind that this is a very specialist area and gets real complex, real fast!
Earlz pointed out that my reply might seem to imply 'don't bother doing this yourself'. Re-reading my post, it does sound a bit that way. The reason I mentioned ANTLR and LLVM is that they both have heaps of source code and tutorials, so I feel they are a good reference source. Take it as a base and play
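The tree-walking approach the question guesses at (parse into a tree, walk it, call a handler per symbol) is a real technique and needs no compilation to a lower level. Here is a minimal sketch for arithmetic expressions only, written in Python for brevity; the function names and the tuple-based tree layout are choices made for this illustration, not taken from ANTLR or any other tool.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Lexer: split the input string into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parser: build a tree of nested tuples such as ('+', 1, ('*', 2, 3))."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():                       # expr := term (('+' | '-') term)*
        node = term()
        while peek() in ("+", "-"):
            node = (advance(), node, term())
        return node

    def term():                       # term := atom (('*' | '/') atom)*
        node = atom()
        while peek() in ("*", "/"):
            node = (advance(), node, atom())
        return node

    def atom():                       # atom := NUMBER | '(' expr ')'
        token = advance()
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        return int(token)

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


# One handler per symbol; the interpreter just dispatches on the node's operator.
HANDLERS = {"+": operator.add, "-": operator.sub,
            "*": operator.mul, "/": operator.truediv}


def evaluate(node):
    """Walk the tree recursively: leaves are numbers, inner nodes are operators."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return HANDLERS[op](evaluate(left), evaluate(right))


print(evaluate(parse(tokenize("1 + 2 * 3"))))  # 7
```

Variables, statements and function calls follow the same pattern: add a node kind to the parser and a matching case to the evaluator.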
You can try this framework for building languages (it works well with XNA):
http://www.meta-alternative.net/mbase.html
There are some tutorials:
http://www.meta-alternative.net/calc.pdf
http://www.meta-alternative.net/pfront.pdf
Python is great as a scripting language. I would recommend you make a C# binding for its C API and use that. Embedding Python is easy. Your application can define functions, types/classes and variables inside modules which the Python interpreter can access. The application can also call functions in Python scripts and get a result back. These two features combined give you a two-way communication scheme.
Basically, you get the Python syntax and semantics for free. What you would need to implement is the API your application exposes to Python. An example could be access to game logic functions and render functions. Python scripts would then define functions which call these, and the host application would invoke the Python functions (with parameters) to get work done.
EDIT: Seems like IronPython can save you even more work. It's a C# implementation of Python, and has its own embedding API: http://www.ironpython.net/
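The shape of that two-way scheme can be shown without any embedding API at all. In the sketch below the "host" is itself written in Python and stands in for the C# game; `spawn_enemy`, `on_level_start` and `load_script` are hypothetical names for this illustration, and the real CPython or IronPython hosting calls are not shown. Note that `exec` is not a sandbox, so this only suits scripts you trust.

```python
# Host side: functions the application exposes to scripts (hypothetical API).
log = []


def spawn_enemy(kind, x, y):
    """Stand-in for a game logic function; it only records the call."""
    log.append(("spawn", kind, x, y))


HOST_API = {"spawn_enemy": spawn_enemy}

# Script side: defines a function that calls back into the host API.
SCRIPT = '''
def on_level_start(level):
    for i in range(level):
        spawn_enemy("goblin", i * 10, 0)
    return level * 100
'''


def load_script(source, api):
    """Run the script in a namespace pre-filled with the host API."""
    namespace = dict(api)
    exec(source, namespace)
    return namespace


script = load_script(SCRIPT, HOST_API)
bonus = script["on_level_start"](2)   # host -> script call, with a parameter
print(bonus, log)
```

The host calls into the script, the script calls back into the host, and a result comes back: that is the whole contract you need to design.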
I am in the process of building interactive front-ends to a distributed application which to date has been used to run workloads that had a batch-job-like structure and needed no UI at all. The application is mostly written in Perl and C and runs on a mix of Unix and Windows machines, but I think this isn't relevant to the UI.
The first such frontend is going to have a command-line user interface; currently, I envision something similar to the CLIs of the Procurve switches and Cisco routers that I have worked with.
Like modern network gear CLIs, commands are going to resemble simple sentences (e.g. show vlans ports 1-4), and the CLI will have some implicit state, much in the way that Unix shells and cmd.exe in Windows have environment variables and current working directories. Moreover, I'd like to implement great tab completion that is aware of the application's state as much as possible, and I want to be able to do that with as little application-specific code as possible.
The low-level functionality (terminal I/O) seems easy to implement on top of GNU Readline or similar libraries, but that's only where the real fun starts. So far I have looked at the Perl modules Term::Shell and Term::ShellUI, but I'm not convinced that I want to use either of them. I am still considering rolling my own solution and at the moment I am primarily looking for inspiration.
Can you recommend any application or library, regardless of implementation language, that implements a good CLI from which I can borrow ideas?
I suggest you take a look at the philosophy underlying Microsoft PowerShell. From the idea of piping typed objects between commands to the consistency of its commands and argument syntax, I think it can be a source of inspiration.
You could try having a look at libcli:
"Libcli provides a shared library for including a Cisco-like command-line interface into other software."
http://code.google.com/p/libcli/
BTW - I forgot to mention that it is GNU Lesser GPL and actually used by Cisco in some products.
As for your last sentence/question, I'm particularly fond of zsh completion and line editing (zle).
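On the question's goal of state-aware completion with little application-specific code, one possible design is to describe the commands as a nested table and let a single generic function walk it. The sketch below is an assumption-laden illustration in Python, not taken from libcli, zsh or PowerShell: `GRAMMAR`, `state` and `complete` are hypothetical names, and a callable entry is how the table asks the application for candidates at completion time.

```python
# Hypothetical application state that completions depend on.
state = {"vlans": ["10", "20", "30"]}

# Command grammar: each key is a word, each value the grammar for what follows.
# A callable value is evaluated at completion time, which makes it state-aware.
GRAMMAR = {
    "show": {
        "vlans": {"ports": {}},
        "version": {},
    },
    "delete": {
        "vlan": lambda: {vlan_id: {} for vlan_id in state["vlans"]},
    },
}


def complete(line):
    """Return the candidate next words for a partially typed command line."""
    words = line.split()
    # A trailing space means the last word is finished and a new one begins.
    prefix = "" if not words or line.endswith(" ") else words.pop()
    node = GRAMMAR
    for word in words:
        if callable(node):
            node = node()
        if word not in node:
            return []
        node = node[word]
    if callable(node):
        node = node()
    return sorted(w for w in node if w.startswith(prefix))


print(complete("show v"))        # ['version', 'vlans']
print(complete("delete vlan "))  # ['10', '20', '30']
```

Only the table is application-specific. In Python such a function could be hooked up to GNU Readline through `readline.set_completer`; the Perl modules have their own mechanisms, which are not shown here.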
Say there is some functionality needed for an application under development which could be achieved by making a system call to either a command line program or utilizing a library. Assuming efficiency is not an issue, is it bad practice to simply make a system call to a program instead of utilizing a library? What are the disadvantages of doing this?
To make things more concrete, an example of this scenario would be an application which needs to download a file from a web server: either the cURL program or the libcurl library could be used for this.
Unless you are writing code for only one OS, there is no way of knowing if your system call will even work. What happens when there is a system update or an OS upgrade?
Never use a system call if there is a library to do the same function.
I prefer libraries because of the dependency issue, namely the executable might not be there when you call it, but the library will be (assuming external library references get taken care of when the process starts on your platform). In other words, using libraries would seem to guarantee a more stable, predictable outcome in more environments than system calls would.
There are several factors to take into account. One key one is the reliability of whether the external program will be present on all systems where your software is installed. If there is a possibility that it will be missing, then maybe it is better to do it inside your program.
Weighing against that, you might consider that the extra code loaded into your program is prohibitive - you don't need the code bloat for such a seldom-used part of your application.
The system() function is convenient, but dangerous, not least because it usually invokes a shell. You may be better off calling the program more directly; on Unix, via the fork() and exec() system calls. [Note that a system call is very different from calling the system() function, incidentally!] OTOH, you may need to worry about ensuring all open file descriptors in your program are closed, especially if your program is some sort of daemon running on behalf of other users; that is less of a problem if you are not using special privileges, but it is still a good idea not to give the invoked program access to anything you did not intend. You may need to look at the fcntl() system call and the FD_CLOEXEC flag.
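As a rough illustration of both points, here is a Unix-only sketch in Python rather than C. `subprocess.run` with an argument list starts the program directly, without a shell, and the standard `fcntl` module exposes the same F_GETFD/F_SETFD operations and FD_CLOEXEC flag as the C call. The helper names are made up for this example.

```python
import fcntl
import os
import subprocess
import sys


def run_without_shell(argv):
    """Run argv directly; no shell ever parses the arguments."""
    result = subprocess.run(argv, capture_output=True, text=True,
                            close_fds=True, check=True)
    return result.stdout


def set_cloexec(fd):
    """Mark fd close-on-exec so an exec'd program does not inherit it."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def is_cloexec(fd):
    """Report whether fd currently has the close-on-exec flag set."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


# The "; echo b" part reaches the child as literal text, not as a second command.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])", "a; echo b"]
print(run_without_shell(child))
```

In C the equivalent is fork() followed by one of the exec() functions, with fcntl(fd, F_SETFD, FD_CLOEXEC) applied to any descriptor the child should not see.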
Generally, it is easier to keep control of things if you build the functionality into your program, but it is not a trivial decision.
Security is one concern. A malicious cURL could cause havoc in your program. It depends if this is a personal program where coding speed is your main focus, or a commercial application where things like security play a factor.
System calls are much harder to make safely.
All sorts of funny characters need to be correctly encoded to pass arguments in, and the types of encoding may vary by platform or even version of the command. So making a system call that contains any user data at all requires a lot of sanity-checking and it's easy to make a mistake.
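When a shell really cannot be avoided, the encoding has to be done for every argument. A small sketch in Python, assuming a POSIX shell (`shlex.quote` does not cover cmd.exe); `shell_command` is a hypothetical helper name:

```python
import shlex


def shell_command(program, *args):
    """Build one POSIX shell command string with every part quoted."""
    return " ".join(shlex.quote(part) for part in (program, *args))


command = shell_command("grep", "-e", "it's; rm -rf ~", "my file.txt")
print(command)
# Splitting the string the way a POSIX shell would gives back the original
# arguments, so the quote and the semicolon stay inside a single argument.
print(shlex.split(command))
```

Passing an argument list to the process-spawning call and skipping the shell altogether avoids this step entirely, which is usually the simpler choice.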
Yeah, as mentioned above, keep in mind the difference between system calls (like fcntl() and open()) and system() calls. :)
In the early stages of prototyping a C program, I often make external calls to programs like grep and sed for manipulation of files using popen(). It's not safe, it's not secure, and it's certainly not portable. But it can allow you to get going quickly. That's valuable to me. It lets me focus on the really important core of the program, usually the reason I used C in the first place.
In high level languages, you'd better have a pretty good reason. :)
Instead of doing either, I'd Unix it up and build a script framework around your app, using the command line arguments and stdin.
Others have mentioned good points (reliability, security, safety, portability, etc.), but I'll throw out another: performance. Generally it is many times faster to call a library function or even spawn a new thread than it is to start an entire new process (and then you still have to correctly check/verify its execution and parse its output!).