How do you make mathematical equations readable and maintainable? - language-agnostic

Maths is not my strongest point, and I'm implementing a Bezier curve for 3D animation.
The formula is shown here, and as you can see it is quite nasty. In my code I use descriptive names and like to break complex lines down into smaller, manageable ones.
What is the best way to handle a scenario like this?
Is it to ignore programming best practices and stick with variable names such as x, y, and t?

In my opinion when you have a predefined mathematical equation it is perfectly acceptable to use short variable names: x, y, t, P_0 etc. which correspond to the equation. Make sure to reference the formula clearly though.

If the formula is extracted into its own function, I'd certainly use the canonical maths representation, and maybe add the wiki page URL in a comment.
If it's embedded in code with a specific usage of the function, then keeping the domain names from your code might be better.
It depends.

Seeing as only the mathematician in you is actually going to understand the formula, my advice would be to go with a style that a mathematician would be most comfortable with (so letters as variables etc...)
I would also definitely put a comment in there somewhere that clearly states what the formula is and what it does, for example "This method returns a series of points along a quadratic Bezier curve". That way, whenever the programmer in you revisits the code, you can safely ignore the mathematical complexity on the assumption that your inner mathematician has already checked that it's all OK.

I'd encourage you to follow mathematical best practices and denote variables with letters. Just provide an explanation of the variables above the formula. And if you can split the formula into smaller subformulas, even better.

Don't bother. Just reference the documentation (the Wikipedia page in this case, or even better your own documentation) and make sure the variable names match your documentation. Code comments are just not well suited to describing mathematical formulations, nor do they need to be.
Sometimes a reference is better than 40 lines of comments or even suggestive variable names.

Make the formula in C# (or other language of preference) resemble the mathematical formula as closely as possible, and include a reference to the formula, including a description of the variables. The idea in coding is to be readable, and if you're dealing with mathematical formulae the most readable representation is the one that looks most like mathematics.
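As a minimal sketch of that advice, here is the quadratic case written in Python so that the code reads like the formula. The function name `quadratic_bezier` and the tuple-based point type are assumptions made for illustration; only the formula itself is the standard quadratic Bezier definition.

```python
from typing import Tuple

Point = Tuple[float, float, float]


def quadratic_bezier(P0: Point, P1: Point, P2: Point, t: float) -> Point:
    """Point on a quadratic Bezier curve at parameter t in [0, 1].

    Formula (see the Wikipedia article "Bezier curve", quadratic case):

        B(t) = (1 - t)^2 * P0 + 2 * (1 - t) * t * P1 + t^2 * P2

    P0 and P2 are the end points, P1 is the control point.
    """
    u = 1.0 - t  # shorthand for (1 - t), used in every term of the formula
    # Apply the formula to the x, y and z coordinates independently.
    return tuple(u * u * a + 2.0 * u * t * b + t * t * c
                 for a, b, c in zip(P0, P1, P2))


start, control, end = (0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (2.0, 0.0, 0.0)
midpoint = quadratic_bezier(start, control, end, 0.5)
```

The short names P0, P1, P2 and t are kept on purpose so the body can be checked against the formula in the docstring term by term.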

You could key the formula into wolfram alpha ... it will try to simplify for you.
It'll also output in a mathematica friendly style ... funnily enough ;)

I tend to break an equation down into its root parts.
def sum(array)
  array.inject(0) { |result, item| result + item }
end

def average(array)
  # to_f avoids integer division when the array holds integers
  sum(array).to_f / array.length
end

def sum_squared_error(array)
  avg = average(array)
  array.inject(0) { |result, item| result + (item - avg) ** 2 }
end

# Sample variance (divides by n - 1)
def variance(array)
  sum_squared_error(array) / (array.length - 1)
end

def standard_deviation(array)
  Math.sqrt(variance(array))
end

You might consider using a domain-specific language to handle this. Mathematica would allow you to write out the equation just as it appears in mathematical notation.
The more your final form resembles the original equation, the more maintainable it will be in the long run (otherwise you have to interpret the code every time you see it).


How can I calculate such an integral in Maple?

The link below shows a set of orthogonal functions.
I have tried different commands in Maple, but I can't evaluate these integral expressions.
How can I evaluate conditional integrals like these (for example, the integral with the red box around it)?
Orthogonal Functions
(This is more of a math Question than a programming Question, so it probably should've gone to math.stackexchange.com.)
You need to use an assuming clause to tell Maple that m and n are integer, and you need to use option AllSolutions to int to tell it to do a case-by-case analysis of the parameters. For example,
int(sin(n*Pi*x/L)*sin(m*Pi*x/L), x= 0..L, AllSolutions)
assuming n::posint, m::posint, L>0;
I've assumed positivity of all parameters simply to reduce the number of cases presented in Maple's answer.
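For a quick sanity check of the case analysis outside Maple, the same integral can be estimated numerically. The sketch below is in plain Python rather than Maple, and the helper name `sine_overlap` is made up for illustration. For positive integers n and m, the integral should come out as L/2 when n = m and 0 otherwise.

```python
import math


def sine_overlap(n: int, m: int, L: float, steps: int = 2000) -> float:
    """Midpoint-rule estimate of the integral over [0, L] of
    sin(n*pi*x/L) * sin(m*pi*x/L)."""
    h = L / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += math.sin(n * math.pi * x / L) * math.sin(m * math.pi * x / L)
    return total * h


L = 3.0
same = sine_overlap(2, 2, L)       # n = m: expect a value close to L / 2
different = sine_overlap(2, 3, L)  # n != m: expect a value close to 0
```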

What is an Abstract Syntax Tree/Is it needed?

I've been interested in compiler/interpreter design/implementation for as long as I've been programming (only 5 years now) and it's always seemed like the "magic" behind the scenes that nobody really talks about (I know of at least 2 forums for operating system development, but I don't know of any community for compiler/interpreter/language development). Anyways, recently I've decided to start working on my own, in hopes to expand my knowledge of programming as a whole (and hey, it's pretty fun :). So, based off the limited amount of reading material I have, and Wikipedia, I've developed this concept of the components for a compiler/interpreter:
Source code -> Lexical Analysis -> Abstract Syntax Tree -> Syntactic Analysis -> Semantic Analysis -> Code Generation -> Executable Code.
(I know there's more to code generation and executable code, but I haven't gotten that far yet :)
And with that knowledge, I've created a very basic lexer (in Java) to take input from a source file, and output the tokens into another file. A sample input/output would look like this:
Input:
int a := 2
if(a = 3) then
print "Yay!"
endif
Output (from lexer):
INTEGER
A
ASSIGN
2
IF
L_PAR
A
COMP
3
R_PAR
THEN
PRINT
YAY!
ENDIF
Personally, I think it would be really easy to go from there to syntactic/semantic analysis, and possibly even code generation, which leads me to question: Why use an AST, when it seems that my lexer is doing just as good a job? However, 100% of my sources I use to research this topic all seem adamant that this is a necessary part of any compiler/interpreter. Am I missing the point of what an AST really is (a tree that shows the logical flow of a program)?
TL;DR: I'm currently en route to developing a compiler and have finished the lexer. It seems to me that its output would make syntactic and semantic analysis easy without building an AST. So why use one? Am I missing the point of one?
Thanks!
First off, one thing about your list of components does not make sense. Building an AST is (pretty much) the syntactic analysis, so syntactic analysis either shouldn't be listed as a separate step, or should at least come before the AST.
What you got there is a lexer. All it gives you are individual tokens. In any case, you will need an actual parser, because regular languages aren't any fun to program in. You can't even (properly) nest expressions. Heck, you can't even handle operator precedence. A token stream doesn't give you:
An idea where statements and expressions start and end.
An idea how statements are grouped into blocks.
An idea which part of the expression has which precedence, associativity, etc.
A clear, uncluttered view of the actual structure of the program.
A structure which can be passed through a myriad of transformations, without every single pass knowing and having code to accommodate that the condition in an if is enclosed by parentheses.
... more generally, any kind of comprehension above the level of a single token.
Suppose you have two passes in your compiler which optimize certain kinds of operators applied to certain arguments (say, constant folding and algebraic simplifications like x - x -> 0). If you hand them tokens for the expression x - x * 1, these passes are cluttered with figuring out that the x * 1 part comes first. And they have to know that, lest the transformation be incorrect (consider 1 + 2 * 3).
These things are tricky enough to get right as it is, so you don't want to be pestered by parsing problems as well. That's why you solve the parsing problem first, in a separate parsing step. Then you can, say, replace a function call with its definition, without worrying about adding parentheses so the meaning remains the same. You save time, you separate concerns, you avoid repetition, you enable simpler code in many other places, etc.
A parser figures all that out, and builds an AST which consequently holds all that information. Without any further data on the nodes, the shape of the AST alone gives you the first three items in the list above, and much more, for free. None of the bazillion passes that follow have to worry about it anymore.
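To make the precedence point concrete, here is a minimal sketch of a parser that turns a flat token list into a nested tree. It is written in Python rather than the asker's Java, handles only identifiers, integers, parentheses and the four arithmetic operators, and uses precedence climbing, which is just one of several possible parsing techniques. The names `tokenize`, `parse` and `PRECEDENCE` are made up for this example, and error handling is omitted.

```python
import re

# Higher number binds tighter; all four operators are left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    """Split source text into integer, identifier, operator and parenthesis tokens."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)


def parse(tokens):
    """Build a nested (operator, left, right) tuple tree from a token list."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def primary():
        tok = advance()
        if tok == "(":
            node = expression(1)
            advance()  # consume the closing ")"
            return node
        return int(tok) if tok.isdigit() else tok

    def expression(min_prec):
        left = primary()
        # Keep absorbing operators that bind at least as tightly as min_prec.
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = advance()
            right = expression(PRECEDENCE[op] + 1)
            left = (op, left, right)
        return left

    return expression(1)


tree = parse(tokenize("x - x * 1"))
# tree == ('-', 'x', ('*', 'x', 1))
```

The shape of the result already records that x * 1 is evaluated first, so later passes never need to look at precedence again.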
That's not to say you always have to have an AST. For sufficiently simple languages, you can do a single-pass compiler. Instead of generating an AST or some other intermediate representation during parsing, you emit code as you go. However, this becomes harder for less simple languages and you can't reasonably do a lot of stuff (such as 70% of all optimizations and diagnostics -- and yes I just made that number up). Generally, I wouldn't advise you to do this. There are good reasons single-pass compilers are mostly dead. Even languages which permit them (e.g. C) are nowadays implemented with multiple passes and ASTs. It's a simple way to get started, but will severely limit you (and the language, if you design it) later.
You've got the AST at the wrong point in your flow diagram. Typically, the output of the lexer is a series of tokens (as you have in your output), and these are fed to the parser/syntactic analyzer, which generates the AST. So the output of your lexer is different from an AST because they are used at different points in the compilation process and fulfill different purposes.
The next logical question is: What, then, is an AST? Well, the purpose of parsing/syntactic analysis is to turn the series of tokens generated by the lexer into an AST (or parse tree). The AST is an intermediate representation that captures the relationship between syntactical elements in a way that is easier to work with programmatically. One way of thinking about this is that a text program is a one dimensional construct, and can only represent ideas as a sequence of elements, while the AST is freed from this constraint, and can represent the underlying relationships between those elements in 2 dimensions (as typically drawn), or any higher dimension space if you so choose to think about it that way.
For instance, a binary operator has two operands, let's call them A and B. In code, this may be spelled 'A * B' (assuming an infix operator - another advantage of an AST is to hide such distinctions that may be important syntactically, but not semantically), but for the compiler to "understand" this expression, it must read 5 characters sequentially, and this logic can quickly become cumbersome, given the many possibilities in even a small language. In an AST representation, however, we have a "binary operator" node whose value is '*', and that node has two children, values 'A' and 'B'.
As your compiler project progresses, I think you will begin to see the advantages of this representation.
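The "binary operator" node described above, and the kind of pass that benefits from it, might look roughly like this in Python. The class names `Num`, `Var` and `BinaryOp` and the function `simplify` are illustrative assumptions, not taken from any particular compiler, and the rewrite rules assume integer arithmetic.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinaryOp:
    op: str          # the operator symbol, e.g. "*"
    left: "Node"     # first operand
    right: "Node"    # second operand


Node = Union[Num, Var, BinaryOp]

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def simplify(node: Node) -> Node:
    """Constant folding plus two algebraic rewrites: e * 1 -> e and e - e -> 0."""
    if not isinstance(node, BinaryOp):
        return node
    left, right = simplify(node.left), simplify(node.right)
    if isinstance(left, Num) and isinstance(right, Num) and node.op in FOLDABLE:
        return Num(FOLDABLE[node.op](left.value, right.value))
    if node.op == "*" and right == Num(1):
        return left
    if node.op == "-" and left == right:
        return Num(0)
    return BinaryOp(node.op, left, right)


# x - x * 1: the tree shape already says that x * 1 is the right operand of "-".
expr = BinaryOp("-", Var("x"), BinaryOp("*", Var("x"), Num(1)))
result = simplify(expr)  # Num(value=0)
```

Note that `simplify` contains no parsing logic at all; it only inspects node types and children.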

algorithm to solve related equations

I am working on a project to create a generic equation solver. I envision it holding 25-30 equations saved in a table, with the variable names stored along with the operators.
I would then look up this table to solve any equation with one missing variable, and the solver would move the operators and other terms to the other side to isolate that variable.
E.g. given 2x + 3y = z with x as the missing variable, I would call the equation with values for y and z, and it would rearrange it to x = (z - 3y)/2.
The equations could be linear, polynomial, binary (yes/no result)...
I am not sure whether a lightweight library is available or whether this needs to be built from scratch. Any pointers or guidance will be appreciated.
See Maxima.
I rather like it for my symbolic computation needs.
If such a general black-box algorithm could be made accurate, robust and stable, pigs could fly. Solutions can be nonexistent, multiple, parametrized, etc.
Even for linear equations it gets tricky to do it right.
Your best bet is some form of Newton algorithm, but generally you tailor it to your problem at hand.
EDIT: I didn't see you wanted something symbolic, rather than numerical. It's another bag of worms.
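For the numerical route mentioned above, a bare-bones Newton iteration for one unknown could be sketched as follows. This is Python, the function name `newton_solve` and its parameters are assumptions for illustration, and the derivative is approximated by a central difference. As the answer warns, a scheme like this can fail to converge or find only one of several solutions.

```python
def newton_solve(residual, guess=0.0, tol=1e-12, max_iter=50, h=1e-6):
    """Find x with residual(x) close to 0 by Newton's method.

    residual: function of the single unknown, written as left side minus right side.
    """
    x = guess
    for _ in range(max_iter):
        fx = residual(x)
        if abs(fx) < tol:
            return x
        # Central-difference estimate of the derivative at x.
        slope = (residual(x + h) - residual(x - h)) / (2.0 * h)
        if slope == 0.0:
            raise ValueError("zero slope; Newton step is undefined")
        x -= fx / slope
    raise RuntimeError("Newton iteration did not converge")


# The example from the question: 2x + 3y = z with x unknown, y and z given.
y, z = 1.0, 7.0
x = newton_solve(lambda x: 2.0 * x + 3.0 * y - z)
# x is approximately (z - 3y) / 2 = 2.0
```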

What's that CS "big word" term for the same action always having the same effect

There's a computer science term for this that escapes my head, one of those words that ends with "-icity".
It means something like a given action will always produce the same result, IE there won't be any hysteresis, or the action will not alter the functioning of the system...
Ring a bell, anyone? Thanks.
Apologies for the tagging, I'm only tagging it Java b/c I learned about this in a Java class back in school and I figure that crowd tends to have more CS background...
This could mean two different things:
deterministic - meaning that given the same initial state, the same operation (with exactly the same data) will always produce the same resulting state (and optional output.) - http://en.wikipedia.org/wiki/Deterministic_algorithm
i.e. same action has the same effect - assuming you start from the same place in the same system. (Nothing random about it, nothing fed in from the outside that could affect the result...)
idempotent - meaning applying a function to a value once e.g. f(x) = v produces the same result as applying the function multiple times e.g. f(f(f(x))) = v - http://en.wikipedia.org/wiki/Idempotence
i.e. one or more function applications yields the same value given the same initial value
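A small Python sketch may help separate the two terms. The helper names `is_idempotent_on` and `increment` are made up for this example, and checking a handful of samples only illustrates the property; it does not prove it.

```python
def increment(n):
    """Deterministic (same input, same output) but not idempotent."""
    return n + 1


def is_idempotent_on(f, samples):
    """Check f(f(x)) == f(x) for every sample value."""
    return all(f(f(x)) == f(x) for x in samples)


samples = [-3, 0, 5]

# abs is deterministic and idempotent: abs(abs(x)) == abs(x).
abs_is_idempotent = is_idempotent_on(abs, samples)              # True

# increment is deterministic, but applying it twice differs from applying it once.
increment_is_idempotent = is_idempotent_on(increment, samples)  # False
```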
you mean idempotent ??
Referential transparency is also used in some CS circles.
Nullipotent?
deterministic
Are you looking for invariant?
http://en.wikipedia.org/wiki/Invariant_%28computer_science%29
"In computer science, a predicate is called an invariant to a sequence of operations if the predicate always evaluates at the end of the sequence to the same value as before starting the sequence."
side effect-free?
In math, a function 'f' is idempotent if multiple applications do not change the result.
you mean idempotence?
"or the action will not alter the functioning of the system..."
Are you looking for ‘idempotence’?
The "ends with -icity" part of your question makes me think you might be looking for monotonicity, even though it does not quite match description/definition of the word. From the Wikipedia article:
In mathematics, a monotonic function (or monotone function) is a function which preserves the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of order theory.
In the following illustrations (also borrowed from the Wikipedia article) three functions are drawn:
[Figures A, B and C: plots of the three functions]
A and B are both monotonic (increasing and decreasing respectively), while C is not monotonic.
You mean an atomic block of code?
The A in ACID.
Atomicity - states that database modifications must follow an “all or nothing” rule. Each transaction is said to be “atomic.” If one part of the transaction fails, the entire transaction fails.
It sounds like what you're describing would be a memoryless function. Although the term memorylessness is usually used for stochastic distributions, I don't quite remember if it has a programming equivalent...

One line functions in C?

What do you think about one-line functions? Are they bad?
One advantage I can think of is that they make the code more comprehensible (if you choose a good name for the function). For example:
void addint(Set *S, int n)
{
    (*S)[n/CHAR_SIZE] |= (unsigned char) pow(2, (CHAR_SIZE - 1) - (n % CHAR_SIZE));
}
One disadvantage I can think of is that it slows the code (pushing parameters to stack, jumping to a function, popping the parameters, doing the operation, jumping back to the code - and only for one line?)
Is it better to put such lines in functions or to just put them inline in the code, even if we use them only once?
BTW, I haven't found any question about that, so forgive me if such question had been asked before.
Don't be scared of 1-line functions!
A lot of programmers seem to have a mental block about 1-line functions; you shouldn't.
If it makes the code clearer and cleaner, extract the line into a function.
Performance probably won't be affected.
Any decent compiler made in the last decade (and perhaps further) will automatically inline a simple 1-line function. Also, 1-line of C can easily correspond to many lines of machine code. You shouldn't assume that even in the theoretical case where you incur the full overhead of a function call that this overhead is significant compared to your "one little line". Let alone significant to the overall performance of your application.
Abstraction Leads to Better Design. (Even for single lines of code)
Functions are the primary building blocks of abstract, componentized code, they should not be neglected. If encapsulating a single line of code behind a function call makes the code more readable, do it. Even in the case where the function is called once. If you find it important to comment one particular line of code, that's a good code smell that it might be helpful to move the code into a well-named function.
Sure, that code may be 1-line today, but how many different ways of performing the same function are there? Encapsulating code inside a function can make it easier to see all the design options available to you. Maybe your 1-line of code expands into a call to a webservice, maybe it becomes a database query, maybe it becomes configurable (using the strategy pattern, for example), maybe you want to switch to caching the value computed by your 1-line. All of these options are easier to implement and more readily thought of when you've extracted your 1-line of code into its own function.
Maybe Your 1-Line Should Be More Lines.
If you have a big block of code it can be tempting to cram a lot of functionality onto a single line, just to save on screen real estate. When you migrate this code to a function, you reduce these pressures, which might make you more inclined to expand your complex 1-liner into more straightforward code taking up several lines (which would likely improve its readability and maintainability).
I am not a fan of having all sort of logic and functionality banged into one line. The example you have shown is a mess and could be broken down into several lines, using meaningful variable names and performing one operation after another.
I strongly recommend, in every question of this kind, to have a look (buy it, borrow it, (don't) download it (for free)) at this book: Robert C. Martin - Clean Code. It is a book every developer should have a look at.
It will not make you a good coder right away and it will not stop you from writing ugly code in the future, it will however make you realise it when you are writing ugly code. It will force you to look at your code with a more critical eye and to make your code readable like a newspaper story.
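The suggestion above to split the one-liner into named steps could look like the sketch below. It is written in Python rather than C, assumes that CHAR_SIZE in the question is 8 bits, uses a bytearray as a stand-in for the question's Set type, and replaces pow(2, k) with an integer shift. The names `add_int` and `BITS_PER_BYTE` are made up for this example.

```python
BITS_PER_BYTE = 8  # assumed value of CHAR_SIZE in the question


def add_int(bit_set: bytearray, n: int) -> None:
    """Mark integer n as present in a bit set stored one bit per member."""
    byte_index = n // BITS_PER_BYTE
    bit_in_byte = n % BITS_PER_BYTE
    # Bits are numbered from the most significant end, as in the original line.
    mask = 1 << (BITS_PER_BYTE - 1 - bit_in_byte)
    bit_set[byte_index] |= mask


members = bytearray(2)  # room for the integers 0..15
add_int(members, 0)     # sets the top bit of byte 0
add_int(members, 9)     # sets the second-highest bit of byte 1
```

Each intermediate name documents one piece of the original expression, which is the readability gain that answer is arguing for.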
If used more than once, definitely make it a function, and let the compiler do the inlining (possibly adding "inline" to the function definition). (<Usual advice about premature optimization goes here>)
Since your example appears to be using C(++) syntax, you may want to read up on inline functions, which eliminate the overhead of calling a simple function. This keyword is only a recommendation to the compiler, though: it may not inline every function that you mark, and it may choose to inline unmarked functions.
In .NET the JIT will inline methods where it judges this appropriate, but you have no control over why or when it does so, though (as I understand it) debug builds will never inline, since that would stop the source code matching the compiled application.
What language? If you mean C, I'd also use the inline qualifier. In C++, I have the option of inline, Boost.Lambda or, moving forward, C++0x's native support for lambdas.
There is nothing wrong with one line functions. As mentioned it is possible for the compiler to inline the functions which will remove any performance penalty.
Functions should also be preferred over macros as they are easier to debug, modify, read and less likely to have unintended side effects.
If it is used only once then the answer is less obvious. Moving it to a function can make the calling function simpler & clearer by moving some of the complexity into the new function.
If you use the code within that function 3 times or more, then I would recommend to put that in a function. Only for maintainability.
Sometimes it's not a bad idea to use the preprocessor:
#define addint(S, n) (*S)[n/CHAR_SIZE] |= (unsigned char) pow(2, (CHAR_SIZE - 1) - (n % CHAR_SIZE));
Granted, you don't get any kind of type checking, but in some cases this can be useful. Macros have their disadvantages and their advantages, and in a few cases their disadvantages can become advantages. I'm a fan of macros in appropriate places, but it's up to you to decide when is appropriate. In this case, I'm going to go out on a limb and say that, whatever you end up doing, that one line of code is quite a bit.
#define addint(S, n) do { \
    unsigned char c = (unsigned char) pow(2, (CHAR_SIZE - 1) - ((n) % CHAR_SIZE)); \
    (*(S))[(n) / CHAR_SIZE] |= c; \
} while(0)