multiple java agent proxy the same class - javaagents

If I use a sequence of java agents to to make bytecode enhancement for the same class . Will the input of the latter be the output of the former? For example, agent A adds to two local variables to methods M, the input of agent B will contain the same two local vairables in method M?

Yes, if the transformer for agent B will be called after the the transformation of agent A:
When there are multiple transformers, transformations are composed by chaining the transform calls. That is, the byte array returned by one call to transform becomes the input (via the classfileBuffer parameter) to the next call.
(Source)
The order in which the transformers are called is also specified there:
Transformations are applied in the following order:
Retransformation incapable transformers
Retransformation incapable native transformers
Retransformation capable transformers
Retransformation capable native transformers
For retransformations, the retransformation incapable transformers are not called, instead the result of the previous transformation is reused. In all other cases, this method is called. Within each of these groupings, transformers are called in the order registered. Native transformers are provided by the ClassFileLoadHook event in the Java Virtual Machine Tool Interface).

Related

Chisel Output with SystemVerilog Interfaces/Structs

I'm finding when generating Verilog output from the Chisel framework, all of the 'structure' defined in the chisel framework is lost at the interface.
This is problematic for instantiating this work in larger SystemVerilog designs.
Are there any extensions or features in Chisel to support this better? For example, automatically converting Chisel "Bundle" objects into SystemVerilog 'struct' ports.
Or creating SV enums, when the Chisel code is written using the Enum class.
Currently, no. However, both suggestions sound like very good candidates for discussion for future implementation in Chisel/FIRRTL.
SystemVerilog Struct Generation
Most Chisel code instantiated inside Verilog/SystemVerilog will use some interface wrapper that deals with converting the necessary signal names that the instantiator wants to use into Chisel-friendly names. As one example of doing this see AcceleratorWrapper. That instantiates a specific accelerator and does the connections to the Verilog names the instantiator expects. You can't currently do this with SystemVerilog structs, but you could accomplish the same thing with a SystemVerilog wrapper that maps the SystemVerilog structs to deterministic Chisel names. This is the same type of problem/solution that most people encounter/solve when integrating external IP in their project.
Kludges aside, what you're talking about is possible in the future...
Some explanation is necessary as to why this is complex:
Chisel is converted to FIRRTL. FIRRTL is then lowered to a reduced subset of FIRRTL called "low" FIRRTL. Low FIRRTL is then mapped to Verilog. Part of this lowering process flattens all bundles using uniquely determined names (typically a.b.c will lower to a_b_c but will be uniquified if a namespace conflict due to the lowering would result). Verilog has no support for structs, so this has to happen. Additionally, and more critically, some optimizations happen at the Low FIRRTL level like Constant Propagation and Dead Code Elimination that are easier to write and handle there.
However, SystemVerilog or some other language that a FIRRTL backend is targeting that supports non-flat types benefits from using the features of that language to produce more human-readable output. There are two general approaches for rectifying this:
Lowered types retain information about how they were originally constructed via annotations and the SystemVerilog emitter reconstructs those. This seems inelegant due to lowering and then un-lowering.
The SystemVerilog emitter uses a different sequence of FIRRTL transforms that does not go all the way to Low FIRRTL. This would require some of the optimizing transforms run on Low FIRRTL to be rewritten to work on higher forms. This is tractable, but hard.
If you want some more information on what passes are run during each compiler phase, take a look at LoweringCompilers.scala
Enumerated Types
What you mention for Enum is planned for the Verilog backend. The idea here was to have Enums emit annotations describing what they are. The Verilog emitter would then generate localparams. The preliminary work for annotation generation was added as part of StrongEnum (chisel3#885/chisel3#892), but the annotations portion had to be later backed out. A solution to this is actively being worked on. A subsequent PR to FIRRTL will then augment the Verilog emitter to use these. So, look for this going forward.
On Contributions and Outreach
For questions like this with (currently) negative answers, feel free to file an issue on the respective Chisel3 or FIRRTL repository. And even better than that is an RFC followed by an implementation.

Chisel code translating into Verilog/C++

So, I have a theoretical question about the Chisel code transformation.
I already know that the Chisel code is compiled to Java bytecodes, it then runs in the JVM and it emits equivalent Verilog and C++ source codes (for older versions of Chisel).
But I'm having a lot of trouble in understanding that process.
For instance, in the Chisel source code, I can see that there is a Reg class, for example, that creates a definition of a register. I can then import and use this class in the design of the hardware. But I cannot understand where the separation between the description of the Reg class itself and the actual usage of it lies. It's so confusing.
For example, suppose I'm developing a project that USES a Reg object, where there's a source code called whatever.scala, and inside this source code there are Reg objects. As I understand it, the description of the register itself (the Reg.scala) and the source code that uses it (whatever.scala) are all compiled at the same time, and that's precisely the point a cannot get.
To make it short, in my point of view, there is a separation between describing a library, and actually using this library after it was built. You must first compile the library, then you import it into your project and use it. But in Chisel, these two steps seem to happen at the same time.
Is there any intermediate process between the JVM code emission and the creation of the Chisel AST?
Chisel is a high level highly parameterized embedded DSL for generating hardware design.
A chisel program typically consists of several steps:
A chisel3 program first constructs an internal representations of an idealized circuit as an abstact syntax tree (AST). At the end of generation, the AST is serialized in to FIRRTL (an intermediate representation) representation. See: chisel3
The firrtl transformation engine process the high level FIRRTL produced with some number of transformation passes. These passes can optimize the code, do width inferences, and finally emit verilog or low firrtl. See: firrtl
Typically during development the circuit is then unit tested. There are two simple ways to do this.
The verilog emitted can be converted into an executable simulation via verilator and a c++ compiler. The simulation can be executed with a test harness that validates the circuit. See: chisel-testers
Or, the emitted firrtl can be simulated using the firrtl-interpreter a lightweight scala program, capable of running the same unit tests used with the chisel-testers. See: firrtl-interpreter
These steps can be run together, using chisel-tester can execute all the above steps automatically. Or done individually, each step can produce output files for the user to add custom integration or to target the verilog for FPGA or a chip tape-out.
The JVM is simply the execution environment used to run scala programs and is not necessary to understand or interact with in order to build circuits using Chisel.
To address the Chisel vs. your project question:
Chisel is a Scala library that is compiled to JVM bytecode. A project that uses Chisel is a Scala program that links against Chisel. This project is also compiled to JVM bytecode, but includes calls to the separately compiled Chisel library*. This project using Chisel is then executed, running on the JVM. The execution of this program constructs a hardware AST that is ultimately emitted as Verilog.
* Many projects (like rocket-chip) do include the Chisel source code as a subproject. Chisel is usually compiled first and then linked against. However, it should make no difference if it were compiled all at once--it's just Scala code that other Scala code invokes.

Given a pointer to a __global__ function, can I retrieve its name?

Suppose I have a pointer to a __global__ function in CUDA. Is there a way to programmatically ask CUDART for a string containing its name?
I don't believe this is possible by any public API.
I have previously tried poking around in the driver itself, but that doesn't look too promising. The compiler emitted code for <<< >>> kernel invocation clearly registers the mangled function name with the runtime via __cudaRegisterFunction, but I couldn't see any obvious way to perform a lookup by name/value in the runtime library. The driver API equivalent cuModuleGetFunction leads to an equally opaque type from which it doesn't seem possible to extract the function name.
Edited to add:
The host compiler itself doesn't support reflection, so there are no obvious fancy language tricks that could be pulled at runtime. One possibility would be to add another preprocessor pass to the compilation trajectory to build a static kernel function lookup table before the final build. That would be rather a lot of work, but it could be done, at least for "classic" compilation where everything winds up in a single translation unit.

Callback functions: what are they in computer programming languages?

I see a lot of callback functions in low-level API's like Win32. But I am confused on what a callback function or callback subroutine is. Is an event in c# considered a callback function?
A callback function is a function that is passed to something else, which will later call the function to notify the user of something. This implies that there must be a way to pass a reference to a function to another, for instance a type of function pointer. In .NET, delegates are used.
An event handler method is an example of a callback function.
In .NET a delegate is the closest match to a Win32 API type callback, though a delegate is far more functional. Events themselves are based on underlying delegates.
The most common use for a callback in the Win32 API is to enumerate resource or something similar. For example the EnumChildWindows API will kick off the enumeration of all the child windows of a specific window and call your custom callback routine for each child window found. Within that callback you can perform any actions that are relevant to your requirement that relate to the specific child window, for example you might be trying to enumerate the windows to programatically find a specific window based on some custom criteria that relates to that window, and once you find the window you can force the termination of the enumeration by returning false from the callback.
In .NET this pattern of using a callback is not required because a more formalized solution is available using the IEnumerable interface.
Callbacks are a specific case of continuations. To quote PFPL, ch 30:
[first class] continuations ... are ordinary values with an indefinite lifetime that
can be passed and returned at will in a computation. Continuations never
“expire”, and it is always sensible to reinstate a continuation without compromising safety. Thus continuations support unlimited “time travel” — we can go back to a previous point in the computation and then return to
some point in its future, at will.
Why are continuations useful? Fundamentally, they are representations
of the control state of a computation at a given point in time. Using continuations we can “checkpoint” the control state of a program, save it in a
data structure, and return to it later
Thus callbacks are just yet another example of continuations. Their use for asynchronous event processing follows from the ability to restore execution to some state via the continuation.
Continuations are particularly easy to use in languages with first class functions, and higher-order functions.
References: Practical Foundations for Programming
Languages, Robert Harper, 2011.

What is the difference between message-passing and method-invocation?

Is there a difference between message-passing and method-invocation, or can they be considered equivalent? This is probably specific to the language; many languages don't support message-passing (though all the ones I can think of support methods) and the ones that do can have entirely different implementations. Also, there are big differences in method-invocation depending on the language (C vs. Java vs Lisp vs your favorite language). I believe this is language-agnostic. What can you do with a passed-method that you can't do with an invoked-method, and vice-versa (in your favorite language)?
Using Objective-C as an example of messages and Java for methods, the major difference is that when you pass messages, the Object decides how it wants to handle that message (usually results in an instance method in the Object being called).
In Java however, method invocation is a more static thing, because you must have a reference to an Object of the type you are calling the method on, and a method with the same name and type signature must exist in that type, or the compiler will complain. What is interesting is the actual call is dynamic, although this is not obvious to the programmer.
For example, consider a class such as
class MyClass {
void doSomething() {}
}
class AnotherClass {
void someMethod() {
Object object = new Object();
object.doSomething(); // compiler checks and complains that Object contains no such method.
// However, through an explicit cast, you can calm the compiler down,
// even though your program will crash at runtime
((MyClass) object).doSomething(); // syntactically valid, yet incorrect
}
}
In Objective-C however, the compiler simply issues you a warning for passing a message to an Object that it thinks the Object may not understand, but ignoring it doesn't stop your program from executing.
While this is very powerful and flexible, it can result in hard-to-find bugs when used incorrectly because of stack corruption.
Adapted from the article here.
Also see this article for more information.
as a first approximation, the answer is: none, as long as you "behave normally"
Even though many people think there is - technically, it is usually the same: a cached lookup of a piece of code to be executed for a particular named-operation (at least for the normal case). Calling the name of the operation a "message" or a "virtual-method" does not make a difference.
BUT: the Actor language is really different: in having active objects (every object has an implicit message-queue and a worker thread - at least conceptionally), parallel processing becones easier to handle (google also "communicating sequential processes" for more).
BUT: in Smalltalk, it is possible to wrap objects to make them actor-like, without actually changing the compiler, the syntax or even recompiling.
BUT: in Smalltalk, when you try to send a message which is not understoof by the receiver (i.e. "someObject foo:arg"), a message-object is created, containing the name and the arguments, and that message-object is passed as argument to the "doesNotUnderstand" message. Thus, an object can decide itself how to deal with unimplemented message-sends (aka calls of an unimplemented method). It can - of course - push them into a queue for a worker process to sequentialize them...
Of course, this is impossible with statically typed languages (unless you make very heavy use of reflection), but is actually a VERY useful feature. Proxy objects, code load on demand, remote procedure calls, learning and self-modifying code, adapting and self-optimizing programs, corba and dcom wrappers, worker queues are all built upon that scheme. It can be misused, and lead to runtime bugs - of course.
So it it is a two-sided sword. Sharp and powerful, but dangerous in the hand of beginners...
EDIT: I am writing about language implementations here (as in Java vs. Smalltalk - not inter-process mechanisms.
IIRC, they've been formally proven to be equivalent. It doesn't take a whole lot of thinking to at least indicate that they should be. About all it takes is ignoring, for a moment, the direct equivalence of the called address with an actual spot in memory, and consider it simply as a number. From this viewpoint, the number is simply an abstract identifier that uniquely identifies a particular type of functionality you wish to invoke.
Even when you are invoking functions in the same machine, there's no real requirement that the called address directly specify the physical (or even virtual) address of the called function. For example, although almost nobody ever really uses them, Intel protected mode task gates allow a call to be made directly to the task gate itself. In this case, only the segment part of the address is treated as an actual address -- i.e., any call to a task gate segment ends up invoking the same address, regardless of the specified offset. If so desired, the processing code can examine the specified offset, and use it to decide upon an individual method to be invoked -- but the relationship between the specified offset and the address of the invoked function can be entirely arbitrary.
A member function call is simply a type of message passing that provides (or at least facilitates) an optimization under the common circumstance that the client and server of the service in question share a common address space. The 1:1 correspondence between the abstract service identifier and the address at which the provider of that service reside allows a trivial, exceptionally fast, mapping from one to the other.
At the same time, make no mistake about it: the fact that something looks like a member function call doesn't prevent it from actually executing on another machine or asynchronously, or (frequently) both. The typical mechanism to accomplish this is proxy function that translates the "virtual message" of a member function call into a "real message" that can (for example) be transmitted over a network as needed (e.g., Microsoft's DCOM, and CORBA both do this quite routinely).
They really aren't the same thing in practice. Message passing is a way to transfer data and instructions between two or more parallel processes. Method invocation is a way to call a subroutine. Erlang's concurrency is built on the former concept with its Concurrent Oriented Programing.
Message passing most likely involves a form of method invocation, but method invocation doesn't necessarily involve message passing. If it did it would be message passing. Message passing is one form of performing synchronization between to parallel processes. Method invocation generally means synchronous activities. The caller waits for the method to finish before it can continue. Message passing is a form of a coroutine. Method-invocation is a form of subroutine.
All subroutines are coroutines, but all coroutines are not subroutines.
Is there a difference between message-passing and method-invocation, or can they be considered equivalent?
They're similar. Some differences:
Messages can be passed synchronously or asynchronously (e.g. the difference between SendMessage and PostMessage in Windows)
You might send a message without knowing exactly which remote object you're sending it to
The target object might be on a remote machine or O/S.