Equivalent of C++ std::vector, std::deque and std::map in FreePascal - containers

What's the equivalent of std::vector, std::deque and std::map in ObjectPascal (FreePascal compiler)?
In brief:
(vector) is an auto-resizing contiguous array
(deque) is an auto-sizing hybrid array of arrays giving near O(1) random
access while allowing O(1) push/pop from either end
(map, unordered_map) is an associative array

In general it is not logical to assume there are direct substitutes in some different language.
Currently FPC generics are a mix of old-school C++-like generics (based on token replay) and Delphi's more .NET-styled generics (fully declarative, but more limited for value types in languages without autoboxing).
Anyway, I'll give it a try:
vector: TList or its generic variants (TList<> in Delphi, or fgl.Tf*List in unit fgl).
deque: No standard type. I have an array-of-arrays generic class, but it is optimized to avoid some of the problems of ordered lists (insertion performance) while still being an ordered type. I've put it on http://www.stack.nl/~marcov/genlight.pas; maybe it gives you some ideas on how to approach the problem.
map: None yet. TDictionary, once http://bugs.freepascal.org/view.php?id=27206 is committed. Currently TAVLTree is usually used.
There are also some generics, including a simple deque, in packages/fcl-stl; I suggest you check them out.

Is everything really a string in TCL?

And what is it, if it isn't?
Everything I've read about TCL states that everything is just a string in it. There can be some other types and structures inside of an interpreter (for performance), but at TCL language level everything must behave just like a string. Or am I wrong?
I'm using an IDE for FPGA programming called Vivado. TCL automation is actively used there. (TCL version is still 8.5, if it helps)
Vivado's TCL scripts rely on some kind of "object oriented" system. Web search doesn't show any traces of this system elsewhere.
In this system objects are usually obtained from internal database with "get_*" commands. I can manipulate properties of these objects with commands like get_property, set_property, report_property, etc.
But these objects seem to be something more than just a string.
I'll try to illustrate:
> set vcu [get_bd_cells /vcu_0]
/vcu_0
> puts "|$vcu|"
|/vcu_0|
> report_property $vcu
Property Type Read-only Value
CLASS string true bd_cell
CONFIG.AXI_DEC_BASE0 string false 0
<...>
> report_property "$vcu"
Property Type Read-only Value
CLASS string true bd_cell
CONFIG.AXI_DEC_BASE0 string false 0
<...>
But:
> report_property "/vcu_0"
ERROR: [Common 17-58] '/vcu_0' is not a valid first class Tcl object.
> report_property {/vcu_0}
ERROR: [Common 17-58] '/vcu_0' is not a valid first class Tcl object.
> report_property /vcu_0
ERROR: [Common 17-58] '/vcu_0' is not a valid first class Tcl object.
> puts |$vcu|
|/vcu_0|
> report_property [string range $vcu 0 end]
ERROR: [Common 17-58] '/vcu_0' is not a valid first class Tcl object.
So, my question is: what exactly is this "valid first class Tcl object"?
Clarification:
This question might seem like asking for help with Vivado scripting, but it is not. (I was even in doubt about adding [vivado] to tags.)
I can just live and script with these mystic objects.
But it would be quite useful (for me, and maybe for others) to better understand their inner workings.
Is this "object system" a dirty hack? Or is it a perfectly valid TCL usage?
If it's valid, where can I read about it?
If it is a hack, how is it (or can it be) implemented? Where exactly does the string end and the object start?
Related:
A part of this answer can be considered as an opinion in favor of the "hack" version, but it is quite shallow in a sense of my question.
A first class Tcl value is a sequence of characters, where those characters are drawn from the Basic Multilingual Plane of the Unicode specification. (We're going to relax that BMP restriction in a future version, but that's not yet in a version we'd recommend for use.) All other values are logically considered to be subtypes of that. For example, binary strings have the characters come from the range [U+000000, U+0000FF], and integers are ASCII digit sequences possibly preceded by a small number of prefixes (e.g., - for a negative number).
In terms of implementation, there's more going on. For example, integers are usually implemented using 64-bit binary values in the endianness that your system uses (but can be expanded to bignums when required) inside a value boxing mechanism, and the string version of the value is generated on demand and cached while the integer value doesn't change. Floating point numbers are IEEE double-precision floats. Lists are internally implemented as an array of values (with smartness for handling allocation). Dictionaries are hash tables with linked lists hanging off each of the hash buckets. And so on. THESE ARE ALL IMPLEMENTATION DETAILS! As a programmer, you can and should typically ignore them totally. What you need to know is that if two values are the same, they will have the same string, and if they have the same string, they are the same in the other interpretation. (Values with different strings can also be equal for other reasons: for example, 0xFF is numerically equal to 255 — hex vs decimal — but they are not string equal. Tcl's true natural equality is string equality.)
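To make the "string generated on demand and cached" idea concrete, here is a toy model in Python. It is only a sketch of the concept, not Tcl's actual implementation: the class BoxedValue and its methods are names invented for this illustration.

```python
class BoxedValue:
    """Toy value with a string form and an optional cached internal form."""

    def __init__(self, string=None, internal=None):
        self._string = string      # canonical string form, may be built lazily
        self._internal = internal  # cached "fast" form, may be built lazily

    @classmethod
    def from_int(cls, number):
        # Created from a native integer: no string exists yet.
        return cls(internal=number)

    def as_string(self):
        # Generate the string on demand and cache it.
        if self._string is None:
            self._string = str(self._internal)
        return self._string

    def as_int(self):
        # Parse the string on demand and cache the result (base 0 accepts 0x..).
        if not isinstance(self._internal, int):
            self._internal = int(self._string, 0)
        return self._internal


a = BoxedValue(string="0xFF")
b = BoxedValue.from_int(255)
print(a.as_int() == b.as_int())        # numeric comparison: True
print(a.as_string() == b.as_string())  # string comparison: False
```

The two values are numerically equal but not string equal, which is the distinction drawn above between numeric equality and Tcl's natural string equality.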
True mutable entities are typically represented as named objects: only the name is a Tcl value. This is how Tcl's procedures, classes, I/O system, etc. all work. You can invoke operations on them, but you can only see inside to a limited extent.
Vivado TCL is not TCL. Vivado will not really document the language they call TCL, but will refer you to the real TCL language documentation. Where Vivado TCL and TCL differ, you are left on your own without help. TCL was a poor choice for a scripting language given the very large databases, so they had to bastardize it to get it half functional. You are better off getting help on the Xilinx forums than in general TCL forums. Why they went with TCL rather than Python is beyond anyone's comprehension.

Clojure practice - use functions of complex datatypes or their elements?

It is idiomatic in lisps such as Clojure to use simple data-structures and lots of functions. Still, there are many times when we must work with complex data-structures composed of many simpler ones.
My question is about a matter of good style/practice. In general, should we create functions that take the entire complex object, and within that extract what we need, or should they take exactly and only what they need?
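For concreteness, I compare these two options in the following pseudocode:
(defrecord Thing
  [a b])
;; Option 1: the function takes only the element it needs.
(defn one-option
  [a]
  .. a .. )
(one-option (:a a-thing))
;; ==============
;; Option 2: the function takes the whole record and extracts the element itself.
(defn another-option
  [a-thing]
  .. (:a a-thing) .. )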
The advantage of one-option is that the function is simpler and has fewer responsibilities. It is also more compositional: it can be used in more places.
The downside is that we may end up repeating ourselves. If, for example, we have many functions which transform one Thing into another Thing, many of which need to use one-option, we will find ourselves repeating the extraction code within each one. Of course, another option is to create both, but this also adds a certain code overhead.
I think the answer is "It depends" and "It doesn't matter as much in Clojure as it does in object systems". Clojure has a number of largely orthogonal mechanisms designed to express modes of composition. The granularity of function arguments will tend to fall out of how the structure of the program is conceived of in the large.
So much for the hot air. There are some specifics that can affect argument granularity:
Since Clojure data structures are immutable, data hiding and access
functions/methods have little relevance. So the repetition caused by
accessing parts of a complex structure is trifling.
What appear in object-design as associations are rendered by small
collections of typed object pointers in each object. In Clojure,
these tend to become single(ton) global maps. There are standard
Clojure functions for accessing and manipulating hierarchical
structures of this kind (get-in and assoc-in, for example).
Where we are looking for dynamically-bound compliance with an
interface, Clojure protocols and datatypes are cleaner and more
adaptable than most object systems. In this case, the whole object is
passed.
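For readers who don't know get-in and assoc-in, here is a rough Python sketch of what they do on nested maps. The functions get_in and assoc_in below are hand-written stand-ins for illustration, not part of any Python library, and the sample data is made up. Like the Clojure originals, assoc_in returns a new structure and leaves the old one untouched.

```python
def get_in(mapping, keys, default=None):
    """Follow a path of keys into nested dicts; return default if a step is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def assoc_in(mapping, keys, value):
    """Return a copy of mapping with the value at the key path replaced.

    Expects at least one key. Only the dicts along the path are copied;
    the original mapping is not modified.
    """
    first, *rest = keys
    updated = dict(mapping)
    if rest:
        updated[first] = assoc_in(mapping.get(first, {}), rest, value)
    else:
        updated[first] = value
    return updated


db = {"user": {"name": "sam", "address": {"city": "Leiden"}}}
db2 = assoc_in(db, ["user", "address", "city"], "Delft")
print(get_in(db, ["user", "address", "city"]))   # Leiden
print(get_in(db2, ["user", "address", "city"]))  # Delft
```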
Do both. To begin with, provide functions that transform the simplest structures, then add convenience functions for handling more complex structures which compose the functions that handle the simple ones.
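A minimal sketch of the "do both" approach, written in Python rather than Clojure purely for illustration. Thing mirrors the record from the question; double, double_a and with_doubled_a are hypothetical names, and doubling stands in for whatever the real computation is.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Thing:
    a: int
    b: int


def double(a):
    """Core function: works on the simplest structure (a plain number)."""
    return a * 2


def double_a(thing):
    """Convenience wrapper: extracts the element, then delegates to double."""
    return double(thing.a)


def with_doubled_a(thing):
    """Thing -> Thing transformation built from the same core function."""
    return replace(thing, a=double(thing.a))


thing = Thing(a=3, b=10)
print(double(3))              # 6
print(double_a(thing))        # 6
print(with_doubled_a(thing))  # Thing(a=6, b=10)
```

The extraction code lives in one place per wrapper, while double stays reusable anywhere a plain value is available.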

Performance comparison: one argument or a list of arguments?

I am defining a new TCL command whose implementation is C++. The command is to query a data stream and the syntax is something like this:
mycmd <arg1> <arg2> ...
The idea is this command takes a list of arguments and returns a list which has the corresponding data for each argument.
My colleague commented that it is best to use just a single argument and, when multiple values are needed, to call the command multiple times.
There are some other discussions, but the one thing we cannot agree on is performance.
I think my version, with a list of arguments, should be quicker because when we want multiple values, going through the TCL interpreter is a one-time cost.
His comment is new to me -
function implementation is cached
accessing TCL function is quicker than accessing TCL data
Is this reasoning sound?
If you use Tcl_EvalObjv to invoke the command, you won't go through the Tcl interpreter. The cost will be one hash-table lookup (or less, if you reuse the Tcl_Obj* containing the command name) and then you'll be in the implementation of the command. Otherwise, constructing a list Tcl_Obj* (e.g., with Tcl_NewListObj) and then calling Tcl_EvalObj is nearly as cheap, as that's a special case because the list construction code is guaranteed to produce lists that are also substitution-free commands.
Building a normal string and passing that through Tcl_Eval (or Tcl_EvalObj) is significantly slower, as that has to be parsed. (OTOH, passing the same Tcl_Obj* through Tcl_EvalObj multiple times in a row will be faster as it will be compiled internally to bytecode.)
Accessing into values (i.e., into Tcl_Obj* references) is pretty fast, provided the internal representation of those values matches the type that the access function requires. If there's a mismatch, an internal type conversion function may be called and they're often relatively expensive. To understand internal representations, here's a few for you to think about:
string — array of unicode characters
integer — a C long (except when you spill over into arbitrary precision work)
list — array of Tcl_Obj* references
dict — hash table that maps Tcl_Obj* to Tcl_Obj*
script — bytecoded version
command — pointer to the implementation function
OK, those aren't the exact types (there's often other bookkeeping data too) but they're what you should think of as the model.
As to “which is fastest”, the only sane way to answer the question is to try it and see which is fastest for real: the answer will depend on too many factors for anyone without the actual code to predict it. If you're calling from Tcl, the time command is perfect for this sort of performance analysis work (it is what it is designed for). If you're calling from C or C++, use that language's performance measurement idioms (which I don't know, but would search Stack Overflow for).
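The "try it and see" approach looks roughly like this. The sketch is in Python with the standard timeit module, only to show the shape of such a measurement; it says nothing about how the real Tcl command will behave. DATA, query_one and query_many are made-up stand-ins for the data stream and the two API styles.

```python
import timeit

# Hypothetical stand-in for the data stream being queried.
DATA = {f"signal{i}": i * i for i in range(200)}
NAMES = list(DATA)


def query_one(name):
    """Single-argument style: one call per value."""
    return DATA[name]


def query_many(*names):
    """Multi-argument style: one call returns a list of values."""
    return [DATA[name] for name in names]


def run_single():
    return [query_one(name) for name in NAMES]


def run_batched():
    return query_many(*NAMES)


def measure(number=1000):
    """Time both styles over the same workload and return seconds per style."""
    return {
        "single": timeit.timeit(run_single, number=number),
        "batched": timeit.timeit(run_batched, number=number),
    }


if __name__ == "__main__":
    for label, seconds in measure().items():
        print(f"{label}: {seconds:.4f} s")
```

Whatever the numbers turn out to be, check first that both styles return the same data, and measure with a realistic number of arguments.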
Myself? I advise writing the API to be as clear and clean as possible. Describe the actual operations, and don't distort everything to try to squeeze an extra 0.01% of performance out.

Why all the functions from object oriented language allows to return only one value (General)

I am curious to know about this.
Whenever I write a function that has to return multiple values, I either have to pass arguments by reference or create an array, store the values in it and return that.
Why aren't functions in object-oriented languages allowed to return multiple values, the same way we pass multiple parameters as input? Is there something in the built-in structure of these languages that prevents this?
Don't you think it would be fun and easy if we were allowed to do so?
It's not true that all Object-Oriented languages follow this paradigm.
e.g. in Python (from here):
def quadcube(x):
    return x**2, x**3
a, b = quadcube(3)
a will be 9 and b will be 27.
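It may help to see what actually comes back. In Python the function still returns one object, a tuple, and the a, b = ... form unpacks it. A small sketch reusing the function above:

```python
def quadcube(x):
    return x**2, x**3


result = quadcube(3)   # no unpacking: the single returned object
print(type(result))    # <class 'tuple'>
print(result)          # (9, 27)

a, b = result          # unpacking is a separate step on that one object
print(a, b)            # 9 27
```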
The difference between the traditional
OutTypeA SomeFunction(out OutTypeB, TypeC someOtherInputParam)
and your
{ OutTypeA, OutTypeB } SomeFunction(TypeC someOtherInputParam)
is just syntactic sugar. Also, the tradition of returning a single value allows writing in the easily readable, natural form result = SomeFunction(...). It's just convenience and ease of use.
And yes, as others said, you have tuples in some languages.
This is likely because of the way processors have been designed and hence carried over to modern languages such as Java or C#. The processor can load multiple things (pointers) into parameter registers but only has one return value register that holds a pointer.
I do agree that not all OOP languages only support returning one value, but for the ones that "apparently" do, this I think is the reason why.
Also for returning a tuple, pair or struct for that matter in C/C++, essentially, the compiler is returning a pointer to that object.
First answer: They don't. Many OOP languages allow you to return a tuple. This is true, for instance, in Python; in C++ you have pair<>, and a fully fledged tuple<> is available in TR1 and in C++0x.
Second answer: Because that's the way it should be. A method should be short and do only one thing and thus can be argued, only need to return one thing.
In PHP, it is like that because the only way you can receive a value is by assigning the function to a variable (or putting it in place of a variable). Although I know array_map allows you to do return something & something;
To return multiple values, you return a single object that contains both of those values.
public MyResult GetResult(double x)
{
    return new MyResult { Squared = Math.Pow(x, 2), Cubed = Math.Pow(x, 3) };
}
For some languages you can create anonymous types on the fly. For others you have to specify a return object as a concrete class. One observation with OO is you do end up with a lot of little classes.
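For comparison, the same "one small result class" idea can be sketched in Python with typing.NamedTuple. PowerResult and get_result are illustrative names mirroring MyResult and GetResult above, not anything from a library.

```python
from typing import NamedTuple


class PowerResult(NamedTuple):
    """Small result type holding both values under descriptive names."""
    squared: float
    cubed: float


def get_result(x: float) -> PowerResult:
    return PowerResult(squared=x ** 2, cubed=x ** 3)


result = get_result(3.0)
print(result.squared, result.cubed)  # access by name: 9.0 27.0

squared, cubed = result              # still unpacks like a plain tuple
```

The caller gets named fields without losing the option of unpacking, at the cost of one more small class.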
The syntactic niceties of Python (see Cowan's answer) are up to the language designer. The compiler/runtime could create an anonymous class to hold the result for you, even in a strongly typed environment like the .NET CLR.
Yes, it can be easier to read in some circumstances, and yes, it would be nice. However, if you read Eric Lippert's blog, you'll often see him explain that there are many nice features that could be implemented, but a lot of effort goes into every feature, and some things just don't make the cut because in the end they can't be justified.
It's not a restriction; it is just the architecture of the object-oriented and structured programming paradigms. I don't know if it would be more fun if functions returned more than one value, but it would surely be messier and more complicated. I think the designers of these paradigms thought about it, and they probably had good reasons not to implement that "feature": it is unnecessary, since you can already return multiple values by packing them in some kind of collection. Programming languages are designed to be compact, so unnecessary features are usually not implemented.

"Necessary" Uses of Recursion in Imperative Languages

I've recently seen in a couple of different places comments along the lines of, "I learned about recursion in school, but have never used it or felt the need for it since then." (Recursion seems to be a popular example of "book learning" amongst a certain group of programmers.)
Well, it's true that in imperative languages such as Java and Ruby[1], we generally use iteration and avoid recursion, in part because of the risk of stack overflows, and in part because it's the style most programmers in those languages are used to.
Now I know that, strictly speaking, there are no "necessary" uses of recursion in such languages: one can always somehow replace recursion with iteration, no matter how complex things get. By "necessary" here, I'm talking about the following:
Can you think of any particular examples of code in such languages where recursion was so much better than iteration (for reasons of clarity, efficiency, or otherwise) that you used recursion anyway, and converting to iteration would have been a big loss?
Recursively walking trees has been mentioned several times in the answers: what was it exactly about your particular use of it that made recursion better than using a library-defined iterator, had it been available?
[1]: Yes, I know that these are also object-oriented languages. That's not directly relevant to this question, however.
There are no "necessary" uses of recursion. All recursive algorithms can be converted to iterative ones. I seem to recall a stack being necessary, but I can't recall the exact construction off the top of my head.
Practically speaking, if you're not using recursion for the following (even in imperative languages) you're a little mad:
Tree traversal
Graphs
Lexing/Parsing
Sorting
When you are walking any kind of tree structure, for example
parsing a grammar using a recursive-descent parser
walking a DOM tree (e.g. parsed HTML or XML)
also, every toString() method that calls the toString() of the object members can be considered recursive, too. All object serializing algorithms are recursive.
In my work recursion is very rarely used for anything algorithmic. Things like factorials etc are solved much more readably (and efficiently) using simple loops. When it does show up it is usually because you are processing some data that is recursive in nature. For example, the nodes on a tree structure could be processed recursively.
If you were to write a program to walk the nodes of a binary tree, for example, you could write a function that processed one node and called itself to process each of its children. This would be more effective than trying to maintain all the different states for each child node as you looped through them.
The most well-known example is probably the quicksort algorithm developed by C.A.R. Hoare.
Another example is traversing a directory tree for finding a file.
In my opinion, recursive algorithms are a natural fit when the data structure is also recursive.
def traverse(node, function):
    # Apply the function to this node, then recurse into each child.
    function(node)
    for childnode in node.children:
        traverse(childnode, function)
I can't see why I'd want to write that iteratively.
It's all about the data you are processing.
I wrote a simple parser to convert a string into a data structure, it's probably the only example in 5 years' work in Java, but I think it was the right way to do it.
The string looked like this:
"{ index = 1, ID = ['A', 'B', 'C'], data = {" +
"count = 112, flags = FLAG_1 | FLAG_2 }}"
The best abstraction for this was a tree, where all leaf nodes are primitive data types, and branches could be arrays or objects. This is the typical recursive problem, a non-recursive solution is possible but much more complex.
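To give an idea of the shape of such a parser, here is a small recursive-descent sketch in Python (not the original Java code). The grammar is guessed from the single sample string above, and reading "FLAG_1 | FLAG_2" as a list of flag names is an assumption. All function names are made up, and error handling for malformed input is minimal.

```python
import re

# One token per match: number, quoted string, identifier, or single punctuation char.
TOKEN = re.compile(r"\s*(?:(\d+)|'([^']*)'|([A-Za-z_]\w*)|(.))")
KINDS = {1: "num", 2: "str", 3: "name", 4: "punct"}


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        kind = KINDS[match.lastindex]
        value = match.group(match.lastindex)
        tokens.append((kind, int(value) if kind == "num" else value))
        pos = match.end()
    return tokens


def parse_value(tokens, pos):
    """Parse one value starting at pos; return (value, next position)."""
    kind, value = tokens[pos]
    if (kind, value) == ("punct", "{"):
        return parse_object(tokens, pos + 1)   # recursion: nested object
    if (kind, value) == ("punct", "["):
        return parse_array(tokens, pos + 1)    # recursion: nested array
    if kind == "name":                         # assumed: NAME | NAME | ...
        flags, pos = [value], pos + 1
        while pos < len(tokens) and tokens[pos] == ("punct", "|"):
            flags.append(tokens[pos + 1][1])
            pos += 2
        return flags, pos
    if kind in ("num", "str"):
        return value, pos + 1
    raise ValueError(f"unexpected token {value!r}")


def parse_object(tokens, pos):
    result = {}
    while tokens[pos] != ("punct", "}"):
        key = tokens[pos][1]
        if tokens[pos + 1] != ("punct", "="):
            raise ValueError(f"expected '=' after {key!r}")
        result[key], pos = parse_value(tokens, pos + 2)
        if tokens[pos] == ("punct", ","):
            pos += 1
    return result, pos + 1


def parse_array(tokens, pos):
    items = []
    while tokens[pos] != ("punct", "]"):
        item, pos = parse_value(tokens, pos)
        items.append(item)
        if tokens[pos] == ("punct", ","):
            pos += 1
    return items, pos + 1


def parse(text):
    tokens = tokenize(text)
    value, pos = parse_value(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing input")
    return value


SAMPLE = ("{ index = 1, ID = ['A', 'B', 'C'], data = {"
          "count = 112, flags = FLAG_1 | FLAG_2 }}")
print(parse(SAMPLE))
```

Each grammar rule becomes one function, and nesting in the input becomes nesting in the call stack, which is why the recursive version stays short.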
Recursion can always be rewritten as iteration with an external stack. However, if you're sure that you don't risk very deep recursion that would lead to a stack overflow, recursion is a very convenient thing.
One good example is traversing a directory structure on a known operating system. You usually know how deep it can be (maximum path length is limited) and therefore will not get a stack overflow. Doing the same via iteration with an external stack is not so convenient.
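For comparison, here are both forms side by side on a small in-memory tree, as a Python sketch. The nested-dict tree and the function names are invented for this example; a real directory walk would read children from the file system instead.

```python
def walk_recursive(node, visit):
    """Pre-order traversal using the call stack."""
    visit(node["name"])
    for child in node["children"]:
        walk_recursive(child, visit)


def walk_iterative(root, visit):
    """The same pre-order traversal with an explicit, external stack."""
    stack = [root]
    while stack:
        node = stack.pop()
        visit(node["name"])
        # Push children in reverse so the first child is popped first.
        stack.extend(reversed(node["children"]))


def leaf(name):
    return {"name": name, "children": []}


TREE = {
    "name": "root",
    "children": [
        {"name": "src", "children": [leaf("main.c"), leaf("util.c")]},
        leaf("README"),
    ],
}

recursive_order, iterative_order = [], []
walk_recursive(TREE, recursive_order.append)
walk_iterative(TREE, iterative_order.append)
print(recursive_order == iterative_order)  # True
```

The iterative version is not much longer here, but the bookkeeping grows quickly once work has to happen after the children have been visited (post-order), which is where recursion tends to stay simpler.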
It was said "anything tree". I may be too cautious, and I know that stacks are big nowadays, but I still won't use recursion on a typical tree. I would, however, do it on a balanced tree.
I have a List of reports. I am using indexers on my class that contains this list. The reports are retrieved by their screen names using the indexers. In the indexer, if the report for that screen name doesn't exist it loads the report and recursively calls itself.
public class ReportDictionary
{
    private static List<Report> _reportList = null;

    public ReportColumnList this[string screenName]
    {
        get
        {
            Report rc = _reportList.Find(delegate(Report obj) { return obj.ReportName == screenName; });
            if (rc == null)
            {
                this.Load(screenName);
                return this[screenName]; // Recursive call
            }
            else
                return rc.ReportColumnList.Copy();
        }
        private set
        {
            this.Add(screenName, value);
        }
    }
}
This can be done without recursion using some additional lines of code.
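A rough Python analogue of the two variants, to show what the non-recursive form looks like. ReportCache, its loader argument and the method names are hypothetical and only mirror the idea of the indexer above; they are not a translation of the real class.

```python
class ReportCache:
    """Looks reports up by screen name, loading them on first access."""

    def __init__(self, loader):
        self._reports = {}
        self._loader = loader  # hypothetical function: screen name -> column list

    def get_recursive(self, screen_name):
        # Mirrors the indexer: load on a miss, then call itself again.
        if screen_name not in self._reports:
            self._reports[screen_name] = self._loader(screen_name)
            return self.get_recursive(screen_name)
        return list(self._reports[screen_name])  # return a copy

    def get(self, screen_name):
        # Non-recursive form: load on a miss, then fall through to the lookup.
        if screen_name not in self._reports:
            self._reports[screen_name] = self._loader(screen_name)
        return list(self._reports[screen_name])


loads = []


def fake_loader(screen_name):
    loads.append(screen_name)
    return [f"{screen_name}.col1", f"{screen_name}.col2"]


cache = ReportCache(fake_loader)
print(cache.get_recursive("orders"))
print(cache.get("orders"))  # served from the cache, no second load
```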