Adding user defined functions to a simple calculator YACC [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I've been searching all over the internet for a comprehensible example of how you can define and call a function in a simple calculator interpreter. Maybe I've found the answer, but since I'm not familiar with YACC I couldn't see it.
So the question is, how do you set up a symbol table for user defined functions and how do you store/call these functions in a calculator interpreter?
I'm basically looking to achieve something like this:
def sum(a,b) { a + b }
sum(5,5)
result:
10
Any pointers or examples would be appreciated

That's definitely diving in to the concepts required to interpret (or compile) a programming language, which makes it difficult to provide an answer in a format suitable for StackOverflow. Here's a quick outline:
You need a symbol table which can hold both functions and variables. Or two symbol tables. In the first case, the mapped value will be some kind of variant type, such as a discriminated union; you might need that anyway if you have more than one type of variable. In the second case, you can use a specific type for the mapped value of function names. I'd go for the first option, because it allows functions to be first-class objects.
You need some kind of type which represents the "value" of a function definition. The obvious type is the Abstract Syntax Tree (AST) of an expression (or a program), and doing that will generally simplify your code so I'd highly recommend it. That means that the calculator/parser will not actually evaluate 5+5 (even if that is the literal input) or a+b, but rather will return an AST to whoever called the parser. Consequently, you will need:
A function which can evaluate an AST. That's usually straightforward to write, since it's just a depth-first tree walk. But now you need to worry about variable scope, because when you evaluate the body of your function sum, you probably want the parameter values to be set only locally.
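The outline above could be sketched roughly as follows. This is only an illustration in Python rather than YACC/C, and it assumes the parser has already produced the AST. All node names (Num, Var, BinOp, FuncDef, Call) and the evaluate function are made up for this sketch. It uses one symbol table for both variables and functions and a fresh local scope per call.

```python
from dataclasses import dataclass

# AST node types; all names here are made up for this sketch.
@dataclass
class Num:
    value: float

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str
    left: object
    right: object

@dataclass
class FuncDef:
    name: str
    params: list
    body: object

@dataclass
class Call:
    name: str
    args: list

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def evaluate(node, symbols, local_vars=None):
    """Depth-first walk over the AST, returning the value of `node`."""
    local_vars = local_vars or {}
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        # Parameters shadow globals of the same name.
        if node.name in local_vars:
            return local_vars[node.name]
        return symbols[node.name]
    if isinstance(node, BinOp):
        left = evaluate(node.left, symbols, local_vars)
        right = evaluate(node.right, symbols, local_vars)
        return OPS[node.op](left, right)
    if isinstance(node, FuncDef):
        # The "value" of a function is its AST, kept in the same table as variables.
        symbols[node.name] = node
        return None
    if isinstance(node, Call):
        func = symbols[node.name]
        if not isinstance(func, FuncDef):
            raise TypeError(f"{node.name} is not a function")
        if len(node.args) != len(func.params):
            raise TypeError(f"{node.name} expects {len(func.params)} arguments")
        # Arguments are evaluated in the caller's scope, then bound in a fresh local scope.
        args = [evaluate(arg, symbols, local_vars) for arg in node.args]
        return evaluate(func.body, symbols, dict(zip(func.params, args)))
    raise TypeError(f"unknown node: {node!r}")

# Hand-built ASTs for: def sum(a,b) { a + b }  followed by  sum(5,5)
symbols = {}
evaluate(FuncDef("sum", ["a", "b"], BinOp("+", Var("a"), Var("b"))), symbols)
result = evaluate(Call("sum", [Num(5), Num(5)]), symbols)
print(result)
```

In a real bison grammar the semantic actions would build these nodes instead of computing values, and the driver would call the evaluator on each completed statement.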
If you manage all that, you will have gone several steps beyond the usual "let's build a calculator with flex and bison" project, and I definitely encourage you to do so. You may want to take a look at the classic text Structure and Interpretation of Computer Programs (Abelson & Sussman, 1996; often referred to simply as SICP).


How to create quasi-copy of a file [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 months ago.
I would like to create quasi-copy of my directory with sensitive data.
Then I would like to share this quasi-copy with others to provide so called 'real data'.
This 'real data' would allow others to do tests in matters related to storage performance.
My question is how to create copy of any file ( text, jpeg, sqlite.db, ... ) that will not contain any of its original data, but from point of view of compression, de-duplication and so on would be very similar.
I appreciate any pointers to tools, libs that helps with creating such quasi copy.
I appreciate any pointers what to measure and how to measure similarity of original file and its quasi copy.
I don't know whether a "quasi-copy" is an established notion and whether there are accepted rules and procedures. But here is a crude take on how to "mask" data for protection: replace words by equal-length sequences of (perhaps adjusted) random characters. One cannot then do a very accurate storage analysis of real data but that has to suffer after any data scrambling.
One way to build such a "quasi-word," wrapped in a program for convenience:
use warnings;
use strict;
use feature 'say';
use Scalar::Util qw(looks_like_number);

my $word = shift // die "Usage: $0 word\n";
my @alphabet = 'a'..'z';
my $quasi_word = '';
# Replace each character: digits by a random digit, anything else by a random letter
foreach my $c (split '', $word) {
    if (looks_like_number($c)) {
        $quasi_word .= int rand 10;
    }
    else {
        $quasi_word .= $alphabet[int rand 26];
    }
}
say $quasi_word;
This doesn't cut it at all for a de-duplication analysis. For that one can replace repeated words by the same random sequence, for example as follows.
First make a pass over the words from the file and build a frequency hash, of how many times each word appears. Then as each word is processed it is first checked whether it repeats, and if it does a random replacement is built only the first time and later that is used every time.
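A rough sketch of that idea, in Python for brevity. The names mask_word and mask_text are made up here. As a simplification it skips the separate frequency pass and simply caches the replacement for every word, so any word that repeats gets the same random sequence each time.

```python
import random
import re

LETTERS = "abcdefghijklmnopqrstuvwxyz"

def mask_word(word, rng):
    """Random letter for each letter, random digit for each digit, other characters kept."""
    out = []
    for c in word:
        if c.isdigit():
            out.append(str(rng.randrange(10)))
        elif c.isalpha():
            out.append(rng.choice(LETTERS))
        else:
            out.append(c)
    return "".join(out)

def mask_text(text, seed=None):
    """Mask every word, reusing the same replacement whenever a word repeats."""
    rng = random.Random(seed)
    replacements = {}

    def replace(match):
        word = match.group(0)
        if word not in replacements:
            # Built only the first time the word is seen.
            replacements[word] = mask_word(word, rng)
        return replacements[word]

    return re.sub(r"\w+", replace, text)

original = "the cat saw the cat 42 times"
masked = mask_text(original, seed=1)
print(masked)
```

Whitespace and punctuation are left alone, so the masked text has the same length and the same pattern of repeats as the original.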
Further adjustments for specific needs should be easy to add.
Any full masking (scrambling/tokenization...) of data of course cannot allow a precise analysis of compression of real data using such a mangled set.
If you know specific sensitive parts then only those can be masked and that would improve the accuracy of the following analyses considerably.
This will be slow but if a set of files need be built once in a while it shouldn't matter.
A question was raised of the "criteria" for the "similarity" of masked data, and the question itself was closed for lack of detail. Here is a comment on that.
It seems that the only "measure" of "similarity" is simply whether the copy behaves the same in the storage performance analysis as the real data would. But one can't tell without using the real data for that analysis! (Which clearly would reveal that data.)
The one way I can think of is to build a copy using a candidate approach and then use it (yourself) for individual components of that analysis. Does it compress (roughly) the same? How about de-duplication? How about ...? Etc. Then make your choices.
If the used approach is flexible enough the masking can then be adjusted for whichever part of analysis "failed" -- the copy behaved substantially differently. (If compression was very different perhaps refine your algorithm to study words and produce more similar obfuscation, etc.)
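For the compression part of such a check, a minimal sketch could look like this. It assumes zlib is an acceptable stand-in for whatever compressor the storage system really uses, and the function names are made up for illustration.

```python
import zlib

def compression_ratio(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    if not data:
        return 1.0
    return len(zlib.compress(data, level)) / len(data)

def ratio_gap(original: bytes, masked: bytes) -> float:
    """Absolute difference between the two compression ratios."""
    return abs(compression_ratio(original) - compression_ratio(masked))

original = b"the cat saw the cat " * 50
masked = b"qxv zjw kde qxv zjw " * 50
print(compression_ratio(original), compression_ratio(masked), ratio_gap(original, masked))
```

What counts as "roughly the same" is a judgment call. Run it on the original and the candidate copy yourself, then adjust the masking if the gap is too large.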

How do I know which is a function and which is an operator? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
MySQL has both functions and operators. However, it is not always clear whether an arbitrary keyword is a function or an operator.
For example, I believe ASCII() is a function (it appears in the string functions section of the manual). However, LIKE appears there as well, and it does not appear to be a function: the syntax does not force (...) after the LIKE keyword, and the docs mention that
By default, there must be no whitespace between a function name and the parenthesis following it.
In some cases it is not that clear. For example, the IN keyword appears in the Comparison Functions and Operators section of the manual (a non-disclosing section name), and it appears there with the name IN() (as if it were a function), but the examples show SELECT 2 IN (0,3,5,7);, which hints that this is an operator (watch the space after the keyword).
In the same section there is INTERVAL(). Reading carefully shows the following line in the description of this keyword:
It is required that N1 < N2 < N3 < ... < Nn for this function to work correctly.
which hints that this is indeed a function, and not an operator. LEAST(), which also appears there, does not mention whether it is a function or an operator.
My questions are as follows:
Are there any internal differences between the concepts of function and of operator in MySQL?
Is there a way to figure out, given a keyword, whether it is a function or an operator?
Can a keyword be both, depending on context? I know that some keywords can be both a function and a type, for example.
I wish to know that both in order to understand the abstract structure of MySQL, and in order to use it for syntax highlighting.
MySQL provides a list of "non-typed" operators here.
Basically, a function is followed by a list of arguments enclosed in parentheses. Even functions that don't take arguments, such as now(), require the parentheses.
An operator, on the other hand, is part of the syntax of the MySQL query language. These are known to the parser, which recognizes them. Operators often use "infix" notation, where the operator appears between the arguments. However, this is not required (just consider the unary minus operator).
A cursory look at the list of operators shows that something can be both an operator and a function. An example is mod().
The most important difference to me is that users can define functions. But users cannot (yet) define operators. Unlike object oriented languages, SQL does not offer a way to provide additional definitions for operators.
And, for your purpose, you should peruse the manual pages to get the lists of things that you care about.

Multiple logics (N number of clients) handled with only one function. All call the same function, HOW TO? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have a corporate application written in Python/Django (no Python experience is required to answer this). It's basically SaaS.
A few of the clients seem to have different requirements for a few modules.
Let's say there is a URL
www.xyz.com/groups
which is used by all clients, but a few of the clients want different output when the same URL is called.
I want to know how I can do that without writing a new function for each client or writing conditions in a single function.
It's a silly question, I know, but there must be some solution to it.
If your code is required to do "A" for case "a" and "B" for case "b" and "C" for case "c", then regardless of what solution you pick, somewhere in the code something has to exist that decides whether or not case 'a/b/c' occurs, and something must exist that will dispatch the correct 'A/B/C' action for that case, and of course those A/B/C actions have to be written somewhere in the code, too.
Step outside of the code and think about it. If it is specified and must happen - it has to be coded somewhere. You cannot escape that. Now, if the cases/actions are trivial and typical, you might find some more-or-even-more configurable library that accidentally allows you to configure such cases and actions, and off you go, you have it "with no code" and no "clutter". But looking formally, the code is deep there in the library. So, the decider, dispatcher and actions are coded. Just not by you - by someone that guessed your needs.
But if your needs are nontrivial and highly specific, for example, if deciding which a/b/c case applies requires various conditions of your own, then most probably you will have to code the 'decider' part yourself. That means lots of tree-of-IFs, nested-switches, rules-n-loops, or whatever you like or feel adequate. After this, you are left with the dispatch/execute phase, and this can be realized in a multitude of ways, e.g. the strategy pattern, which is exactly that: dispatch (by the concrete class related to the case) and execute (the concrete strategy has the concrete code for the case).
Let's try something-like-OO approach:
For example, if you have cases a/b/c for UserTypes U1,U2,U3, you could introduce three classes:
UserType1 inherits from abstract UserType or implements "DoAThing" interface
UserType2 inherits from abstract UserType or implements "DoAThing" interface
UserType3 inherits from abstract UserType or implements "DoAThing" interface
UserType1 implements virtual method 'doTheThing' that executes actionA
UserType2 implements virtual method 'doTheThing' that executes actionB
UserType3 implements virtual method 'doTheThing' that executes actionC
your Users stop keeping "UserType" of type "int" equal to '1/2/3' - now their type is an object: UserType1, UserType2 or UserType3
whenever you must do the thing for a given user, you now just:
result = user.getType().doTheThing( ..params..)
So, instead of iffing/switching, you use OO: tell, don't ask rule. If the action-to-do is dependent solely on UserType, then let the UserType perform it. The resulting code is as short as possible - but at the cost of number of classes to create and, well, ...
... the decider, dispatcher and actions are still in the code. Actions - obvious - in the various usertype classes. Dispatch - obvious - virtual call by common abstract base method. And decider..? Well: someone at some point had to choose and construct the correct UserType object for the user. If the user was stored in the database, and "usertype" is just an integer 1/2/3, then somewhere in your ORM layer those 1/2/3 numbers had to be decoded and translated into UserType1/2/3 classes/objects. That means you'd need a tree-of-ifs or a switch or the like there. Or, if you have an ORM smart enough, you just set up a bunch of rules and it did it for you, but that's again just delegating part of the job to a more-or-even-more configurable library. Not to mention that your UserType1/2/3 classes in fact became somewhat .. strategies.
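Since the question is about Python, here is a minimal sketch of that OO variant. The class names follow the outline above. The return values and the load_user helper (standing in for whatever the ORM layer does) are made up for illustration.

```python
from abc import ABC, abstractmethod

class UserType(ABC):
    @abstractmethod
    def do_the_thing(self, params):
        """Execute the action that belongs to this user type."""

class UserType1(UserType):
    def do_the_thing(self, params):
        return f"action A with {params}"

class UserType2(UserType):
    def do_the_thing(self, params):
        return f"action B with {params}"

class UserType3(UserType):
    def do_the_thing(self, params):
        return f"action C with {params}"

class User:
    def __init__(self, name, user_type):
        self.name = name
        self._type = user_type  # an object now, no longer an int

    def get_type(self):
        return self._type

def load_user(name, type_code):
    """The 'decider': translate the stored integer into a UserType object."""
    if type_code == 1:
        user_type = UserType1()
    elif type_code == 2:
        user_type = UserType2()
    elif type_code == 3:
        user_type = UserType3()
    else:
        raise ValueError(f"unknown user type: {type_code}")
    return User(name, user_type)

# Dispatch and execute: tell, don't ask.
user = load_user("ann", 2)
result = user.get_type().do_the_thing("groups")
print(result)
```

The if/elif chain has not gone away. It has only moved into load_user, which is the point made above.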
Ok, let's attack the 'choose' part.
You can build a tree of ifs or switches somewhere to decide and assign, but imperative seems to smell. Or, with OO, you can try to polymorphize something so that "it will just do the right thing", but it will not solve anything since again you will have to choose the object type somewhere. So, let's try data-driven: let's use lookups.
we've got five implementations of an action
create a hash/dictionary/map
add usertype1->caseA to the map
add usertype2->caseC to the map
add usertype3->caseB to the map
add usertype4->caseA to the map
add usertype5->caseE to the map
....
now, whenever you have a user and need to decide, just look it up. Instead of a "case" you may hold a ready to use object of a strategy. Or a callable method. Or a typename. Or whatever you need. The point is that instead of writing
if( user.type == 1) { ... }
else if( user.type == 2) ...
or switching, you just look it up:
thing = map[ user.type ]
if ( thing is null ) ???
but mind that, without some care, you might sometimes NOT find a match in the map. Also, the map must be PREDEFINED for ALL CASES. So a simple if X < 100 may turn into a hundred entries 0..99 inside the map.
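In Python the lookup variant might be sketched like this. The case functions and their return strings are placeholders, and raising an error for a missing entry is just one possible answer to the "???" above.

```python
def case_a(user):
    return f"A for {user}"

def case_b(user):
    return f"B for {user}"

def case_c(user):
    return f"C for {user}"

def case_e(user):
    return f"E for {user}"

# usertype -> action; the map holds callables instead of "cases".
ACTIONS = {
    1: case_a,
    2: case_c,
    3: case_b,
    4: case_a,
    5: case_e,
}

def dispatch(user_type, user):
    action = ACTIONS.get(user_type)
    if action is None:
        # No entry in the map; what to do here is a design decision.
        raise LookupError(f"no action registered for user type {user_type}")
    return action(user)

print(dispatch(2, "ann"))
```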
Of course, instead of a map, you may use some rule-engine and you could define a mapping like
X<100 -> caseA
X>=100 -> caseB
and then 'run' the rules against your usertype and obtain a 'result' that will tell you "caseA".
And so on.
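A toy version of such rules, not a real rule engine, could be an ordered list of (predicate, result) pairs in Python. The names RULES and run_rules are made up for this sketch.

```python
# Each rule is (predicate, result); the first matching predicate wins.
RULES = [
    (lambda x: x < 100, "caseA"),
    (lambda x: x >= 100, "caseB"),
]

def run_rules(x, rules=RULES):
    for predicate, result in rules:
        if predicate(x):
            return result
    return None  # no rule matched

print(run_rules(42), run_rules(100))
```

Two rules now cover what would otherwise be a hundred map entries, at the price of evaluating the predicates in order on every call.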
Each of the parts - decide, dispatch, execute - you may implement in various ways, shorter or longer, more or less extensible, more or less configurable, as OO/imperative/datadriven/functional/etc - but you cannot escape them:
you have to define the discriminant of the cases
you have to define the implementation of the actions
you have to define the mapping case-to-action
How to do them is a matter of your aesthetics, language features, frameworks, libraries and .. the time you want to spend on creating and maintaining it.

Homework - Differences in accessing values

So I had to write a program in Pascal (a bubble sort, it was pretty simple), and at the end my professor asked a question about our code. He had us write two separate print procedures. The first, printArray, took in an Array of Integers as its parameter, whereas printArray2 took in a type called arrayType, which is defined as such:
TYPE
arrayType = ARRAY[1..20] OF INTEGER;
I'm kind of rambling now, but his question was "What was the difference in how the values are accessed when using the different print procedures?"
Just wondering if someone could maybe give me a hint. My original thought was it had something to do with how the memory locations are accessed, but I don't really know how to word it correctly.
Well, hopefully one of you fine people can help me out.
I assume your teacher has introduced you to the concepts of pass by value and pass by reference. I believe your teacher is trying to get you to think about those concepts as they apply to a primitive array declaration vs declaring your own arrayType. That should at least give you a hint on your homework assignment.
This depends a bit on the Pascal dialect and compiler, but I assume it's the difference between a typed array and an open array, the latter of which has a different range (0..number_of_elements-1) than the former (1..number_of_elements).

Law of Demeter violation proves useful. Am I missing something? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I have some code like this in my application. It writes out some XML:-
public void doStuff( Business b, XMLElement x)
{
    Foo f = b.getFoo();
    // Code doing stuff with f
    // b is not mentioned again.
}
As I understand it, the Law of Demeter would say this is bad. "Code Complete" says this is increasing coupling. This method should take "f" in the first place.
public void doStuff( Foo f, XMLElement x)
{
    // Code doing stuff with f
}
However, now I have come to change this code, and I do actually need to access a different method on b.
public void doStuff( Business b, XMLElement x)
{
    Foo f = b.getFoo();
    // Code doing stuff with f
    // A different method is called on b.
}
This interface has made life easier as the change is entirely inside the method. I do not have to worry about the many places it is called from around the application.
This suggests to me that the original design was correct. Do you agree? What am I missing?
PS. I do not think the behaviour belongs in b itself, as domain objects do not know about the external representation as XML in this system.
First thing is that this isn't necessarily a Law of Demeter violation, unless you are actually calling methods on the Foo object f in doStuff. If you are not, then you are probably fine; you're only using the interface of the Business object b. So I'll assume that you are calling at least one method on 'f'.
One thing you might be "missing" is testability, specifically unit tests. If you have:
public void doStuff( Business b, XMLElement x)
{
    Foo f = b.getFoo();
    // stuff using f.someMethod
    // business stuff with b
    // presumably something with x
}
... then if you want to test that doStuff does the right thing for different Foo, you have to first create (or mock) a new Business object with each Foo 'f' that you want, then plug that object into doStuff (even if the rest of the Business-specific stuff is identical). You're testing at one remove from your method, and while your source code may stay simple, your test code gets messier. So, if you really need both f and b in doStuff, then arguably they should both be parameters.
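To make the testing point concrete, here is a small sketch, written in Python only to keep it short. Foo, Business and the two do_stuff variants are toy stand-ins for the classes in the question, and a plain list plays the role of the XMLElement.

```python
class Foo:
    def __init__(self, label):
        self.label = label

    def some_method(self):
        return self.label.upper()

class Business:
    def __init__(self, name, foo):
        self.name = name
        self._foo = foo

    def get_foo(self):
        return self._foo

def do_stuff_via_business(business, out):
    """Variant from the question: Foo is reached through Business."""
    foo = business.get_foo()
    out.append(foo.some_method())
    out.append(business.name)

def do_stuff(foo, business, out):
    """Variant with both collaborators passed in as parameters."""
    out.append(foo.some_method())
    out.append(business.name)

def test_via_business():
    # A new Business has to be built (or mocked) for every Foo under test.
    for label in ("x", "y"):
        out = []
        do_stuff_via_business(Business("acme", Foo(label)), out)
        assert out == [label.upper(), "acme"]

def test_with_both_parameters():
    # One Business is reused; only the Foo varies.
    business = Business("acme", Foo("unused"))
    for label in ("x", "y"):
        out = []
        do_stuff(Foo(label), business, out)
        assert out == [label.upper(), "acme"]

test_via_business()
test_with_both_parameters()
```

With a Business this small the difference is minor. It grows with how expensive a Business is to construct or mock.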
For more reading on it, this person is one of the most emphatic Law of Demeter campaigners I've come across; frequently provides rationales for it.
I think it's difficult to give you a clear-cut answer because
1) the problem statement is pretty abstract, and
2) there is no "absolute" good design - it depends on what's around your classes too, and what was a good design initially might evolve into something you want to refactor as your system grows and evolves, and your understanding of the domain becomes more refined.
I don't see the first example as a "massive" violation of the Demeter principle, but again everything is in the details; it depends on how much is going on in your commented section. You can always add more indirection if you need to. You could for instance have your method "DoStuff" on a WriteBusinessObjectToXmlService class, and if the amount of work you were doing involving f was growing, you could extract it into its own method "DoStuffWithF(f, x)", or even create a separate class WriteFToXmlService, with DoStuff(f, x).
If we follow this logic further, we will come up with the idea that a global-everything-objects-repository object (or service locator) should be used, which contains links to everything in the system. Then we will not need to change method signatures at all, because this repository is all we need.
The problem is that the purpose of the method has changed, but the signature has not. If Foo is everything the method needs, then it should accept Foo only. This way we can tell that it operates solely on Foo. This will communicate the purpose of the method more clearly. If it suddenly needs Business too, we need to change the method signature, because it should indicate the method's new purpose and requirements.
Maybe it is now justified to pass in Business or the method needs a third parameter: the return type of the other method called on the Business object. It depends on the rest of the body of the doStuff method.