What should I call a class that contains a sequence of states - language-agnostic

I have a GUI tool that manages state sequences. One component is a class that contains a set of states, your typical DFA state machine. For now, I'll call this a StateSet (I have a more specific name in mind for the actual class that makes sense, but this name I think will suffice for the purpose of this question.)
However, I have another class that has a collection (possibly partially unordered) of those state sets, and lists them in a particular order. and I'm trying to come up with a good name for it - not just for internal code, but for customers to refer to it.
The role of this particular second collection is to encapsulate the entire currently used/available collection of StateSets that the user has created. All of the StateSets will be used eventually in the application. A good analogy would be a hand of cards versus the entire table: The 'table' contains all of the currently available hands, while the 'hand' contains a particular collection of cards.
I've got these as starter ideas I could throw out for the class name; I'm not comfortable with either at the moment:
Sequence (maybe...with something else tacked on to the name)
StateSetSet (reasonable for code, but not for customers)
And as ewernli mentions, these are really technical terms, which don't really convey a the idea well. Any other suggestions or ideas?

Sequence - Definitely NOT. It's too generic, and doesn't have any real semantic meaning.
StateSetSet - While more semantically correct, this is confusing. You have a sequence, which implies order, which is different from a set, which does not.
That being said, the best option, IMO, is StateSetSequence, as it implies you have a sequence of StateSet instances.

What is the role/function of you StateSetSet?
StateSetSet or Sequence are technical terms.
Prefer a term that convey the role/function of the class.
That could well be something like History, Timeline, WorldSnapshot,...
EDIT
According to your updated description, StateSet looks to me like StateSpace (the space of all possible states). If the user can then interactively create something, it might be appropriate to speak of a Workspace. If the user creates various state spaces of interest, I would then go for StateSpaceWorkspace. Isn't that a cool name :)

"StateSets" may be sufficient.
Others:
StateSetList
StateSetLister
StateSetListing
StateSetSequencer

I like StateSetArrangement, implying an ordering without implying anything about the underlying storage mechanisms.

Related

Is there a programming term for selecting or inserting similar to "upsert"?

Is there a programming term for a widget/service that has the ability to either select an existing item or create/insert a new item? Maybe like how "upsert" is a combination of updating existing or inserting new?
Edit
Sorry, maybe my use of "upsert" as an example term is causing confusion. I'm looking for a term that describes (selecting or creating), not (creating or updating). For example, a widget that can query existing contacts to assign to a company, but also allows ad-hoc creation of contacts as well. I'm hoping to use the term in the name of such widgets to succinctly describe their function.
In web services based on representational state transfer (REST), the PUT method has semantics which can mean create or modify. Based on this, I would propose put as a possibility.
This also happens to match Java usage of the word: consider the put method on Hashtable. It maps a key to a value and returns the old value corresponding to the key, if it existed. This implies it can either exist or not already, and the action will succeed. Mozilla uses the same semantics for put on IDBObjectStore accessed in JavaScript.
It seems like where put is used for updating a collection, it typically has the meaning of create or update.
Based on the clarification - something like provide, specify, define or identify might be acceptable options for communicating the contents of a collection created by giving existing instances and/or creating new ones on the spot. That is, I can provide, specify, define or identify a real number by making reference to some well-known ones (pi, sqrt(2), etc.) or by creating a new one (e.g., by listing some digits or describing how it is computed).

(Somewhat) complicated database structure vs. simple — with null fields

I'm currently choosing between two different database designs. One complicated which separates data better then the more simple one. The more complicated design will require more complex queries, while the simpler one will have a couple of null fields.
Consider the examples below:
Complicated:
Simpler:
The above examples are for separating regular users and Facebook users (they will access the same data, eventually, but login differently). On the first example, the data is clearly separated. The second example is way simplier, but will have at least one null field per row. facebookUserId will be null if it's a normal user, while username and password will be null if it's a Facebook-user.
My question is: what's prefered? Pros/cons? Which one is easiest to maintain over time?
First, what Kirk said. It's a good summary of the likely consequences of each alternative design. Second, it's worth knowing what others have done with the same problem.
The case you outline is known in ER modeling circles as "ER specialization". ER specialization is just different wording for the concept of subclasses. The diagrams you present are two different ways of implementing subclasses in SQL tables. The first goes under the name "Class Table Inheritance". The second goes under the name "Single Table Inheritance".
If you do go with Class table inheritance, you will want to apply yet another technique, that goes under the name "shared primary key". In this technique, the id fields of facebookusers and normalusers will be copies of the id field from users. This has several advantages. It enforces the one-to-one nature of the relationship. It saves an extra foreign key in the subclass tables. It automatically provides the index needed to make the joins run faster. And it allows a simple easy join to put specialized data and generalized data together.
You can look up "ER specialization", "single-table-inheritance", "class-table-inheritance", and "shared-primary-key" as tags here in SO. Or you can search for the same topics out on the web. The first thing you will learn is what Kirk has summarized so well. Beyond that, you'll learn how to use each of the techniques.
Great question.
This applies to any abstraction you might choose to implement, whether in code or database. Would you write a separate class for the Facebook user and the 'normal' user, or would you handle the two cases in a single class?
The first option is the more complicated. Why is it complicated? Because it's more extensible. You could easily include additional authentication methods (a table for Twitter IDs, for example), or extend the Facebook table to include... some other facebook specific information. You have extracted the information specific to each authentication method into its own table, allowing each to stand alone. This is great!
The trade off is that it will take more effort to query, it will take more effort to select and insert, and it's likely to be messier. You don't want a dozen tables for a dozen different authentication methods. And you don't really want two tables for two authentication methods unless you're getting some benefit from it. Are you going to need this flexibility? Authentication methods are all similar - they'll have a username and password. This abstraction lets you store more method-specific information, but does that information exist?
Second option is just the reverse the first. Easier, but how will you handle future authentication methods and what if you need to add some authentication method specific information?
Personally I'd try to evaluate how important this authentication component is to the system. Remember YAGNI - you aren't gonna need it - and don't overdesign. Unless you need that extensibility that the first option provides, go with the second. You can always extract it at a later date if necessary.
This depends on the database you are using. For example Postgres has table inheritance that would be great for your example, have a look here:
http://www.postgresql.org/docs/9.1/static/tutorial-inheritance.html
Now if you do not have table inheritance you could still create views to simplify your queries, so the "complicated" example is a viable choice here.
Now if you have infinite time than I would go for the first one (for this one simple example and prefered with table inheritance).
However, this is making things more complicated and so will cost you more time to implement and maintain. If you have many table hierarchies like this it can also have a performance impact (as you have to join many tables). I once developed a database schema that made excessive use of such hierarchies (conceptually). We finally decided to keep the hierarchies conceptually but flatten the hierarchies in the implementation as it had gotten so complex that is was not maintainable anymore.
When you flatten the hierarchy you might consider not using null values, as this can also prove to make things a lot harder (alternatively you can use a -1 or something).
Hope these thoughts help you!
Warning bells are ringing loudly with the presence of two the very similar tables facebookusers and normalusers. What if you get a 3rd type? Or a 10th? This is insane,
There should be one user table with an attribute column to show the type of user. A user is a user.
Keep the data model as simple as you possibly can. Don't build it too much kung fu via data structure. Leave that for the application, which is far easier to alter than altering a database!
Let me dare suggest a third. You could introduce 1 (or 2) tables that will cater for extensibility. I personally try to avoid designs that will introduce (read: pollute) an entity model with non-uniformly applicable columns. Have the third table (after the fashion of the EAV model) contain a many-to-one relationship with your users table to cater for multiple/variable user related field.
I'm not sure what your current/short term needs are, but re-engineering your app to cater for maybe, twitter or linkedIn users might be painful. If you can abstract the content of the facebookUserId column into an attribute table like so
user_attr{
id PK
user_id FK
login_id
}
Now, the above definition is ambiguous enough to handle your current needs. If done right, the EAV should look more like this :
user_attr{
id PK
user_id FK
login_id
login_id_type FK
login_id_status //simple boolean flag to set the validity of a given login
}
Where login_id_type will be a foreign key to an attribute table listing the various login types you currently support. This gives you and your users flexibility in that your users can have multiple logins using different external services without you having to change much of your existing system

Database design: Custom data layout and rights management

I'm workling on a project for my University and im a little curious about my current database design:
First of all, what is the common naming practice for a database table: singular or plural? I once read something about i but i cannot remember it?! (e.g. table: user or users)
The second question is a little mor specific to the project:
The users can login into the website an have to choose 10 elements out of a list and attach each of the elements a priority from 1 to 4. My first try was to save the choice of the user in a single row as a CSV (e.g. 1,2,3;4,5,6;7,8;9,10 which represents the choice of element 1,2,3 with priority 1 etc.). The second attempt was to save each choice as a single entry like: [user_id]|[choosen_element]|[choosen_priority]. What do you think is the better variant or is there a even better one that i havent thought of?
The third question is more about the login and rights management:
The elements that the users can choose from are in groups. Each element can be in multiple groups. There are moderators who have the same groups that the elements have or a subset of it, and they can edit all elements of the group they are in posession of. Besides the groups there are also the rights for the users e.g. user, moderator, admin etc.
In my last design i defined the rights of the users as part of the groups table so that every user that is in the moderators group can edit items of the groups that he is also in.
In my first attempt i had the groups and the rights in a seperate table with a seperate logic in my application!
Is it better to seperate the aplication rights from the groups?
Here is a plot of my current layout if i missed something, or if somebody just likes to look at pictures ;)
http://screens.rofln.de/2012-06-19-4f3o3A.png
Thanks!
Btw.: Im working with PHP and a MySql if someone whants to know!
This is subjective, but if you go by conventions supplied by popular ORMs, it seems pluralized is pretty common. I don't think which you chose matters, only that you are consistent once you have chosen.
A record representing each choice makes most sense. This allows for ordering as well as queries to find highly rated elements, etc. Finally, reading the data in your application, and varying how it is displayed, will be easier since you'll be working with a list of items rather than a packed value.
This is hard to answer, since I'm not familiar with your problem domain. I'd recommend developing use cases and then applying them to your proposed model to see where the cracks are.
It does not matter whether you use singular or plural, what matters is that you are consistent in your use of the standard.
Comma separated values in MySQL are bad, mostly because it is not a congruent way of using a relational database. A standard database relationship, or a many-to-many table is a better idea.
When you make your rights management system more flexible, it becomes more complex. A good heuristic in this case is to build the simplest system that satisfies your requirements, but no simpler.
Speaking of simplicity, why do you have a separate table for userdata? Do you expect some user to have two sets of names and details?

Namespaces and records in erlang

Erlang obviously has a notion of namespace, we use things like application:start() every day.
I would like to know if there is such a thing as namespace for records. In my application I have defined record user. Everything was fine until I needed to include rabbit.hrl from RabbitMQ which also defines user, which is conflicting with mine.
Online search didn't yield much to resolve this. I have considered renaming my user record and prefixing it with something, say "myapp_user". This will fix this particular issue, until I suspect I hit another conflict say with my record "session".
What are my options here? Is adding a prefix myapp_ to all my records a good practice, or is there a real support for namespaces with records and I am just not finding it?
EDIT: Thank you everyone for your answers. What I've learned is that the records are global. The accepted answer made it very clear. I will go with adding prefixes to all my records, as I have expected.
I would argue that Erlang has no namespaces whatsoever. Modules are global (with the exception of a very unpopular extension to the language), names are global (either to the node or the cluster), pids are global, ports are global, references are global, etc.
Everything is laid flat. The namespacing in Erlang is thus done by convention rather than any other mean. This is why you have <appname>_app, <appname>_sup, etc. as module names. The registered processes also likely follow that pattern, and ETS tables, and so on.
However, you should note that records themselves are not global things: as JUST MY correct OPINION has put it, records are simply a compiler trick over tuples. Because of this, they're local to a module definition. Nobody outside of the module will see a record unless they also include the record definition (either by copying it or with a header file, the later being the best way to do it).
Now I could argue that because you need to include .hrl files and record definitions on a per-module basis, there is no such thing as namespacing records; they're rather scoped in the module, like a variable would be. There is no reason to ever namespace them: just include the right one.
Of course, it could be the case that you include record definitions from two modules, and both records have the same name. If this happens, renaming the records with a prefix might be necessary, but this is a rather rare occurrence in my experience.
Note that it's also generally a bad idea to expose records to other modules. One of the problems of doing so is that all modules depending on yours now get to include its .hrl file. If your module then change the record definition, you will have to recompile every other module that depends on it. A better practice should be to implement functions to interact with the data. Note that get(Key, Struct) isn't always a good idea. If you can pick meaningful names (age, name, children, etc.), your code and API should make more sense to readers.
You'll either need to name all of your records in a way that is unlikely to conflict with other records, or you need to just not use them across modules. In most circumstances I'll treat records as opaque data structures and add functionality to the module that defines the record to access it. This will avoid the issue you've experienced.
I may be slapped down soundly by I GIVE TERRIBLE ADVICE here with his deeper knowledge of Erlang, but I'm pretty sure there is no namespaces for records in Erlang. The record name is just an atom grafted onto the front of the tuple that the compiler builds for you behind the scenes. (Records are pretty much just a hack on tuples, you see.) Once compiled there is no meaningful "namespace" for a record.
For example, let's look at this record.
-record(branch, {element, priority, left, right}).
When you instantiate this record in code...
#branch{element = Element, priority = Priority, left = nil, right = nil}.
...what comes out the other end is a tuple like this:
{branch, Element, Priority, nil, nil}
That's all the record is at this point. There is no actual "record" object and thus namespacing doesn't really make any sense. The name of the record is just an atom tacked onto the front. In Erlang it's perfectly acceptable for me to have that tuple and another that looks like this:
{branch, Twig, Flower}
There's no problem at the run-time level with having both of these.
But...
Of course there is a problem having these in your code as records since the compiler doesn't know which branch I'm referring to when I instantiate. You'd have to, in short, do the manual namespacing you were talking about if you want the records to be exposed in your API.
That last point is the key, however. Why are you exposing records in your API? The code I took my branch record from uses the record as a purely opaque data type. I have a function to build a branch record and that is what will be in my API if I want to expose a branch at all. The function takes the element, priority, etc. values and returns a record (read: a tuple). The user has no need to know about the contents. If I had a module exposing a (biological) tree's structure, it too could return a tuple that happens to have the atom branch as its first element without any kind of conflict.
Personally, to my tastes, exposing records in Erlang APIs is code smell. It may sometimes be necessary, but most of the time it should remain hidden.
There is only one record namespace and unlike functions and macros there can only be one record with a name. However, for record fields there is one namespace per record, which means that there is no problems in having fields with the same name in different records. This is one reason why the record name must always be included in every record access.

Do you use particular conventions for naming complementary variables?

I often find myself trying to come up with good names for complementary pairs of variables; where two variables denote opposing concepts, two participants in some sort of duologue, and so on.
This might be better explained by a counter-example - I maintain an app that prints two graphics as part of a print advertisement. They're stored in the database as TopLogo and LowerLogo, which I have to stop and double-check every time I use them because I'm expecting top to complement bottom, and lower should complement upper.
There's some obvious examples that I think work well:
client / server
source / target for copying/moving data or files from one variable to another
minimum / maximum
but there's some concepts that just don't lend themselves to such neat naming schemes. For example, when paging through records, does 'last' mean 'final' or 'previous' ? I recently saw some code that used firstPage, previousPage, nextPage and finalPage to avoid the ambiuous lastPage completely, which I thought was very beat, hence this question.
Do you have any particularly neat variable name pairs you'd care to share with us? (Bonus points if they're the same length, which makes the code so much neater in monospaced fonts.)
Like with all kinds of code style conventions, consistency is what you should strive for.
I would have the development team agree on "standard" pairs of prefixes for common scenarios like "source/destination" or "from/to" and then stick with them for the whole project. As long as every developer is aware of what is meant with a particular prefix in the codebase, it is easier to avoid misunderstandings.
Exceptions to the rule should be clarified in the documentation if the variable is part of a public API, or in comments within the code, if it's visibility is restricted to a single class or method.
In my databases you'll find many valid-state temporal ("history") tables containing a pair of columns named start_date and end_date. No bonus points for me, then, because I'd rather use the commonly used 'end' than try to come up with an intuitive alternative with the same number of characters as the word 'start'.
I tend to prefer these generic terms even when more context-specific terms may be viable e.g. preferring employee_start_date over employee_hire_date (what if their employment started for a reason other than being formally hiring e.g. their company was the subject of an acquisition). That said, I'd prefer person_birth_date over person_start_date :)
While one does try to be semantically coherent in obvious cases -- e.g., maximum goes with minimum, and not "lowest" -- in well-structured OO code (which isn't all code, I know) the problem disappears with a good IDE. Classes are short, methods are short, and variables are few in each method. So it doesn't matter what you call the variable pairs so long as they're clear. Your code might not look professional, but real quality is in the code, not in the look of your code.
The problem further disappears if there is good JavaDoc or whatever the documentation system is, and if have good Class names that go with them. For instance, if you have an instance of a Connection class and it has a method a method called setDestination, that's okay, but if you know that setDestination takes one parameter called destination and it's of the Server class, you're cool... even though you might prefer to call it target, aimHere, placeToSendTheData, or whatever (and the corresponding names, source, comingFromHere, and placeToGetTheDataFrom). Plus the doc system says what the thing is for, and that is priceless.
This next thing might sound stupid and I'm sure I'll get voted down here on StackOverflow, but unique non-professional sounding variable names have a great advantage: I know that my variables have names like placeWeWantTheDataToGo (and the IDE takes care of typing it), but the "serious" guys who do the JDK would never use such silly names. So I know immediately that the variable is one of mine. Incidentally, when I worked with developers in Spain and Italy, they write code with Spanish variable names (not always, but usually). This causes the same effect: we can quickly see that the Conexion class is ours, but the Connection class is not.
[Also, instead of typing your variable names, assign them a constant String somewhere in your code and use that, so if they called it lower or downer instead of low, you're still okay.]
Yes, I do try to name complementary sets of variables systematically so that the symmetry is clear. It is not always easy; sometimes, not even possible. Well, not possible using the rules I lay down for myself - which means I usually try to have the names the same length. The 'top' and 'lower' example would drive me batty (assuming I'm not batty already, which is far from certain); I'd probably use 'upper' and 'lower' because those are the same length; 'top' and 'bottom' would frustrate me too because of the difference in length.