Working with split secret key - hsm

I need to import splits of a secret key into an HSM. A 3DES Key Encryption Key (KEK) has been split for transport and needs to be recombined in the destination HSM.
How can this be done? Are the splits actually recombined inside the HSM itself, or are they recombined outside of the HSM, with the result then imported into the HSM?
Thank you !

If all the key parts are available outside of the HSM then you'd normally just XOR the values together and set the key. You could use CKM_XOR_BASE_AND_DATA or possibly a proprietary command as well.
CKM_XOR_BASE_AND_DATA, however, requires at least one key to be already present. You could, of course, use it to combine the parts sequentially if you don't want the holders of the different parts to be able to view the other parts.
Note that I assume here that the key has been split using T = N key sharing using XOR, so every part is needed to rebuild it. In principle, any secret sharing scheme could have been used.
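To make the XOR arithmetic concrete, here is a minimal sketch in plain Python of splitting a key into N components and recombining them. It is not PKCS#11 code and does not talk to an HSM; the 24-byte key length and the helper names (split_key, combine) are assumptions for illustration, and DES parity bits are ignored. Recombining in software like this exposes the clear key outside the HSM, which is exactly the trade-off the question asks about.

```python
import secrets
from functools import reduce

KEY_LEN = 24  # assumed: a triple-length 3DES key, in bytes


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("components must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, n: int) -> list[bytes]:
    """Split key into n XOR components; all n are needed to rebuild it."""
    parts = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    # The last component is the key XORed with every random component.
    last = reduce(xor_bytes, parts, key)
    return parts + [last]


def combine(parts: list[bytes]) -> bytes:
    """XOR all components together to recover the original key."""
    return reduce(xor_bytes, parts)


kek = secrets.token_bytes(KEY_LEN)   # stand-in for the real KEK
components = split_key(kek, 3)       # one component per key custodian
recombined = combine(components)     # equals kek only when all parts are present
```

Any subset of fewer than N components is statistically independent of the key, which is why each custodian can hold one part without learning anything about the others.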


MySQL: is there a point to having a primary key on a lookup table which refers to a primary key on another table which is indexed?

I'm just doing some basic normalisation but I don't have the answer for this, wondering if you guys can give me some info on right/wrong, do's/dont's etc.
So if I have:
I've always set a primary key (a unique auto-incrementing column) on lookup tables. In the image, the lookup tables would be "page_downloads" and "page_includes", but I can guarantee those columns will never get used, as the tables will only be queried via the page_id. The same goes for many other definition tables.
So my question is: "Is there any point? What is the best practice? Always create the primary key even though it will never be used, or don't bother creating it because it is fine to use the indexed int column which refers to a primary key in another table, e.g. the relationship in the picture (page_id to page_id)? Thoughts?"
Thanks
D
No. While every table should have a PRIMARY KEY, it need not be a surrogate. In this instance, (page_id,file_id) is a valid compound PRIMARY KEY (as is (file_id,page_id)).
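A minimal sketch of that compound key, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name page_downloads and the columns page_id and file_id come from the thread; the pages and files parent tables and the helper name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (page_id INTEGER PRIMARY KEY);
    CREATE TABLE files (file_id INTEGER PRIMARY KEY);
    -- No surrogate id column: the pair of foreign keys is the primary key.
    CREATE TABLE page_downloads (
        page_id INTEGER NOT NULL REFERENCES pages(page_id),
        file_id INTEGER NOT NULL REFERENCES files(file_id),
        PRIMARY KEY (page_id, file_id)
    );
""")
conn.execute("INSERT INTO pages VALUES (1)")
conn.execute("INSERT INTO files VALUES (10)")
conn.execute("INSERT INTO page_downloads VALUES (1, 10)")


def link_rejected(page_id: int, file_id: int) -> bool:
    """Try to insert a link; return True if the key constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO page_downloads VALUES (?, ?)", (page_id, file_id)
        )
    except sqlite3.IntegrityError:
        return True
    return False


# The same (page_id, file_id) pair a second time violates the compound key.
duplicate_rejected = link_rejected(1, 10)
```

The compound key also gives an index whose leading column is page_id, so lookups by page_id need no extra index in this sketch.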
To add some info to Strawberry's valid observations.
There's no absolute answer or best practice regarding the surrogate keys and usually this boils down to individual preference. There are both advantages and disadvantages to using surrogate keys. Among the advantages, one could consider:
Immutability: Surrogate keys do not change while the row exists. This has the following advantages:
Applications cannot lose their reference to a row in the database (since the identifier never changes).
The primary or natural key data can always be modified, even with databases that do not support cascading updates across related foreign keys.
Requirement changes: Attributes that uniquely identify an entity might change, which might invalidate the suitability of natural keys. Consider the following example:
An employee's network user name is chosen as a natural key. Upon merging with another company, new employees must be inserted. Some of the new network user names create conflicts because their user names were generated independently (when the companies were separate). In these cases, generally a new attribute must be added to the natural key (for example, an original_company column). With a surrogate key, only the table that defines the surrogate key must be changed. With natural keys, all tables (and possibly other, related software) that use the natural key will have to change.
Some problem domains do not clearly identify a suitable natural key. Surrogate keys avoid choosing a natural key that might be incorrect.
Performance: Surrogate keys tend to be a compact data type, such as a four-byte integer. This allows the database to query the single key column faster than it could multiple columns. Furthermore, a non-redundant distribution of keys causes the resulting b-tree index to be completely balanced. Surrogate keys are also less expensive to join (fewer columns to compare) than compound keys.
Compatibility: While using several database application development systems, drivers, and object-relational mapping systems, such as Ruby on Rails or Hibernate, it is much easier to use integer or GUID surrogate keys for every table instead of natural keys in order to support database-system-agnostic operations and object-to-row mapping.
Uniformity: When every table has a uniform surrogate key, some tasks can be easily automated by writing the code in a table-independent way.
Validation: It is possible to design key-values that follow a well-known pattern or structure which can be automatically verified. For instance, the keys that are intended to be used in some column of some table might be designed to "look differently from" those that are intended to be used in another column or table, thereby simplifying the detection of application errors in which the keys have been misplaced. However, this characteristic of the surrogate keys should never be used to drive any of the logic of the applications themselves, as this would violate the principles of database normalization.

Resources ID in REST Api = primary key in database

I am creating a RESTful API.
My table of e.g. users has a primary key 1,2,3, ...
Now to name my resources in the API I want some more complex name. A hash of something which also will be a unique identifier but a little more difficult to guess.
Should I save this hash in an extra column in my user table, or drop the 1,2,3, ... primary key and use the unique hash as the global ID (database and API)?
Why the complexity?
REST API URLs are meant to be discoverable. Obfuscating the resource identifier is anything but discoverable. If you want to keep people from accessing certain data, then secure that data through authentication and authorization. If you're really creating a RESTful API, part of that is discoverability.
So, from the API perspective, the only sane reason I can imagine for doing something like that is avoiding a strong coupling between the URIs and the PK. Like, for instance, you expect to change storage in the future and you don't want to be stuck with a sequential PK forever. If that's the case, I'd say to use a random UUID Version 4, store as a binary value in the database, and use the hex representation to construct the URI. That's what I did in this situation and it works fine.
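A minimal sketch of that approach with Python's standard uuid module: generate a random version 4 UUID, keep its 16 raw bytes for the database column, and use the hex form in the URI. The /users/ path and the function names are assumptions for illustration.

```python
import uuid


def new_resource_id() -> uuid.UUID:
    """Generate a random (version 4) UUID for a new resource."""
    return uuid.uuid4()


def to_db_value(resource_id: uuid.UUID) -> bytes:
    """16 raw bytes, suitable for a BINARY(16)-style column."""
    return resource_id.bytes


def to_uri(resource_id: uuid.UUID) -> str:
    """Build the public URI from the 32-character hex representation."""
    return f"/users/{resource_id.hex}"


def from_uri_segment(segment: str) -> uuid.UUID:
    """Parse the hex segment back into a UUID; raises ValueError if malformed."""
    return uuid.UUID(hex=segment)


rid = new_resource_id()
stored = to_db_value(rid)
uri = to_uri(rid)
```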
Now, from the database perspective, I would recommend checking how your database deals with random values as a primary key before adopting that. For instance, MySQL insert performance degrades terribly with random values in the clustered index, and it's better to have a unique index on the hash/UUID column and an auto-increment column as the PK.
Other than that, if all you want is to obfuscate the URI, I wouldn't change the database, and simply apply some reversible encoding to the integer value, to use it as part of the URI.
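One possible reversible encoding, sketched under assumptions: multiply the integer by an odd constant modulo 2**32 (which is invertible), XOR with a mask, and print the result as hex. The constants and function names are hypothetical, and this only obscures the sequence; it provides no security against someone who wants to enumerate IDs.

```python
MODULUS = 2**32
MULTIPLIER = 2654435761                 # hypothetical; any odd number is invertible mod 2**32
INVERSE = pow(MULTIPLIER, -1, MODULUS)  # modular inverse (Python 3.8+)
MASK = 0x5BD1E995                       # hypothetical XOR mask


def encode_id(n: int) -> str:
    """Map a 32-bit primary key to an 8-character hex token."""
    if not 0 <= n < MODULUS:
        raise ValueError("id out of range for this encoding")
    scrambled = ((n * MULTIPLIER) % MODULUS) ^ MASK
    return format(scrambled, "08x")


def decode_id(token: str) -> int:
    """Invert encode_id; raises ValueError if the token is not valid hex."""
    scrambled = int(token, 16) ^ MASK
    return (scrambled * INVERSE) % MODULUS


token_for_user_1 = encode_id(1)
user_id = decode_id(token_for_user_1)   # back to 1
```

The database keeps its plain integer key; only the code that builds and parses URIs needs to know about the encoding.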

Unique keys in distributed RDBMS

Imagine there's a Relational Database System (let's say MySQL) that is clustered in many servers (maybe 100 servers). In this Database System there is a table called "users", and "users" contains a primary key (UINT for instance).
This user ID must be unique among all the servers. This user ID may be auto incrementing.
So how does a distributed database system handle these types of problems? How does an RDBMS generate an ID that is unique across all the servers?
I don't want any SQL code of how to do so in MySQL, I just need to know how it is done in such a case.
[Edit]
Both answers sound OK.
Here is another case; let's take StackOverflow as an example. This question's URL is http://stackoverflow.com/questions/18359434. Another URL is http://stackoverflow.com/questions/18359435, which points to the question that was asked after this question. Obviously StackOverflow has multiple database servers, but the IDs for questions are auto-incrementing.
So what approach is StackOverflow using?
StackOverflow gets a huge amount of traffic; it ranks around 100 on both Alexa and Quantcast.
The canonical solution is to use uuid() rather than an integer for such a unique identifier. It is designed to be unique in space as well as time.
A more "hacked" solution is to use two-part primary keys. Have the first be an identifier of "what system am I on" and the second be an auto-incremented number, unique to that system.
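A minimal sketch of the two-part key idea, with each server modeled as an object that owns its own counter. The class and field names (Node, UserKey, node_id, local_id) are assumptions; in a real deployment the local counter would be the table's auto-increment column on that server.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class UserKey:
    node_id: int   # "what system am I on"
    local_id: int  # auto-incremented number, unique only within that system


class Node:
    """One database server that hands out keys without asking anyone else."""

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        self._counter = itertools.count(1)

    def next_key(self) -> UserKey:
        return UserKey(self.node_id, next(self._counter))


node_a, node_b = Node(1), Node(2)
keys = [node_a.next_key(), node_a.next_key(), node_b.next_key()]
```

Both nodes can issue local_id 1 without conflict, because the pair (node_id, local_id) is what has to be unique.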
Another "hacked" solution is to give each system ranges. Say you are using big integers, then 1,000,000,000 might start the value on one system, 2,000,000,000 on another, and so on.
I would not recommend that you actually try to implement an auto-incremented number across a distributed system. This would basically entail having a single system that maintained the most recent number, and having the other systems ask it for the next number. However you implement this, you will introduce a bottleneck into the system.
In this case I'd use a GUID primary key and I wouldn't have this issue (not sure MySQL knows this though).
The alternative old-fashioned way is to use primary key ranges - that is have one instance use keys from 1.000.000 to 1.999.999, the next use range 2.000.000 to 2.999.999, etc, thus ensuring each instance cannot use the keys of another.
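The range scheme can be sketched as follows, assuming each instance is configured with an index and a fixed range size of one million keys. The class name and parameters are hypothetical; the point is only that an instance never leaves its own range and fails loudly when the range is used up.

```python
class RangeAllocator:
    """Hands out integer keys from a range reserved for one instance."""

    def __init__(self, instance_index: int, range_size: int = 1_000_000) -> None:
        self.low = instance_index * range_size
        self.high = self.low + range_size - 1
        self._next = self.low

    def next_id(self) -> int:
        if self._next > self.high:
            # Running into the neighbour's range would break uniqueness.
            raise OverflowError("key range exhausted for this instance")
        value = self._next
        self._next += 1
        return value


instance_1 = RangeAllocator(1)  # keys 1,000,000 to 1,999,999
instance_2 = RangeAllocator(2)  # keys 2,000,000 to 2,999,999
first_ids = (instance_1.next_id(), instance_2.next_id())
```

The drawback visible in the sketch is that ranges must be sized up front: an instance that exhausts its range needs a new, unused range assigned to it.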

Why should primary keys of DB not be shown in html code, e.g. in select fields?

I read somewhere that values in select boxes (or anything else in the HTML code) should not be the primary key of the database table. For example:
<select>
<option value="1">Value 1</option>
<option value="2">Value 2</option>
</select>
In the database there are lookup tables with these values as primary key (1, 2, 3,....). So the data from the select box I store in a table which references this lookup table is a number like 1, 2, 3.... (as the value of the options fields).
I read that, for security reasons, it is better not to use the same values in the HTML as in the key, but what's the problem with that? I don't understand why this should be a security concern.
Sounds like security-through-obscurity, aka no security at all to me.
A good primary key in a database is purely for uniqueness in the system and shouldn't be related to the meaning of the data. If the primary key was related to the data (say people's social security numbers, stuff like that) then you've got a security issue in exposing the keys, as they are exposing information that could be used maliciously. In that case, whilst you could argue that the best approach from a technical point of view might be to change the application to stop it using those meaningful keys, it may be a more palatable approach to map the keys to some other meaningless key to overcome the issue.
Another scenario that springs to mind where exposing the keys might be interpreted as a security issue is where inadequate authentication and authorisation is in place for writable data in your application/data layer, allowing someone with knowledge of those keys to interfere with the data in the application. Again, securing the system is the better approach.
Aside from security, I can't think of a specific issue if the keys really do identify the data being interacted with and your application is looking up the keys when it generates the page.
I would be concerned about how the information is processed from the URL. What happens if I post content using value="does_this_break_the_code" or value="can_I_read_secret_info"?
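A minimal sketch of handling such input defensively, using Python's sqlite3 module as a stand-in database: parse the submitted value as an integer, look it up with a parameterized query, and treat anything that does not match a row as invalid. The options table and the function name are assumptions for illustration.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE options (id INTEGER PRIMARY KEY, label TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO options VALUES (?, ?)", [(1, "Value 1"), (2, "Value 2")]
)


def resolve_option(raw_value: str) -> Optional[str]:
    """Return the label for a submitted option value, or None if it is invalid."""
    try:
        option_id = int(raw_value)
    except (TypeError, ValueError):
        return None  # not a number at all
    if not 0 < option_id < 2**63:
        return None  # outside the range the column can hold
    # Parameterized query: the value is bound, never spliced into the SQL text.
    row = conn.execute(
        "SELECT label FROM options WHERE id = ?", (option_id,)
    ).fetchone()
    return row[0] if row else None


valid = resolve_option("1")
tampered = resolve_option("does_this_break_the_code")
```

With this kind of check in place, exposing the key in the option value gives a user nothing beyond the ability to pick one of the rows they were already offered.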
It would be wise to exercise caution in using surrogate keys in URLs or in HTML or application code. I wouldn't say the same thing about keys in general.
A surrogate key is not supposed to have business meaning or to have dependencies in application code or external processes. That's often an important consideration for example if key values need to change as a result of the database design evolving or data sets being merged. By using surrogate keys as "magic numbers" in code or in URLs you could compromise the very thing that makes surrogate keys useful. Also surrogate keys are much less convenient to users (and possibly developers) because the values are meaningless to them and therefore less readable than using a natural key.
I suggest you use natural keys in your URLs and persistent code. Keep surrogate keys internal to the database, which is where they are supposed to be.
Primary keys should be used as a unique identifier for each item in the DB; chances are the PK isn't a part number or anything that relates to the actual item. Generally speaking the PK doesn't MEAN anything, and in the world of semantics, everything should mean something. If there is a better unique identifier, by all means use it, because your PK isn't helpful to anything but your database.
Say you have a database of cars. All cars have a unique identifier called a VIN (Vehicle Identification Number), which encodes a bunch of info about each specific car, down to the plant that made it. The VIN identifies only that one specific car. The PK on the item could be anything; if the car gets dropped from the DB, the PK no longer exists, but that VIN is still out there somewhere. It's a much better unique ID than the PK, so that's what should probably be displayed to the users.

Understanding keys in databases

This question is geared towards MySQL, since that is what I'm using -- but I think that it's probably the same or similar for almost every major database implementation.
How do keys work in a database? By that I mean, when you set a field to 'primary key', 'unique key' or an 'index' -- what do each of these do, and when should I use each one?
Right now I have a table containing a few fields, one of them being a GUID (minus the { and } around it). I set the GUID field to the primary key and I see that it created a binary tree. So it improves search performance -- but what differentiates that from other types of keys?
I realize this may not really be programming related (although it is development related). I wasn't sure where exactly to ask this, but SO is what I use the most, so I'll ask here. Migrate as necessary.
There are probably hundreds of references for this elsewhere on the web, so a bit of Googling will help you get deep into understanding DB design. That said, the basic gist is:
primary key: a field or combination of fields which must be unique for each row, and which is/are indexed to provide rapid lookup of a row given a key value; cannot contain NULL, and a table can only have one primary key. Generally indexed in a clustered index, which means that the data in the table is reordered to match the order of the index, a process that greatly improves serial data retrieval. (This is the main reason a table can only have one primary key -- the order of the data can't match the order of more than one index!)
unique key: same as a primary key, but on some DB platforms, can contain NULL values so long as they don't violate the uniqueness constraint. (In other words, if the unique key contains a single column, there can only be one row in the table with NULL in that column; if the key contains more than one column, then the table can only contain rows with NULLs in the columns such that there's no non-unique duplication of NULL values across the columns in the key.) On other platforms (including MySQL), unique constraints can contain multiple NULLs; the uniqueness constraint only applies to non-NULL values of the referenced columns. There can be more than one of these per table. Indexed in a non-clustered index.
index: a field or combination of fields which are pre-indexed for more rapid retrieval given a value for the field(s) in the index. A table can have more than one index.
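The three kinds of key can be tried out with a small experiment. This sketch uses Python's built-in sqlite3 module as a stand-in, so details may differ from MySQL; the users table and its columns are assumptions. SQLite, like the MySQL behaviour described above, accepts multiple NULLs in a UNIQUE column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        guid  TEXT NOT NULL PRIMARY KEY,  -- primary key: unique, no NULLs
        email TEXT UNIQUE,                -- unique key: no duplicate non-NULL values
        city  TEXT
    );
    CREATE INDEX idx_users_city ON users (city);  -- plain index: duplicates allowed
""")


def try_insert(guid, email, city) -> bool:
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO users VALUES (?, ?, ?)", (guid, email, city))
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "first row": try_insert("a1", "x@example.com", "Oslo"),
    "duplicate primary key": try_insert("a1", "y@example.com", "Oslo"),
    "duplicate unique value": try_insert("b2", "x@example.com", "Oslo"),
    "repeated city, NULL email": try_insert("c3", None, "Oslo"),
    "second NULL email": try_insert("d4", None, "Oslo"),
}
```

Only the two duplicate inserts are rejected; the repeated city value and the repeated NULL email are both accepted.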
When you define a primary key, the database creates an index based on that key. It needs to be unique. In general, you can also create an index to speed up access to data based on non-unique query data. The indexed retrieval time for uniquely keyed data should be better than for non-uniquely keyed indexes, so I try to use unique indexes where possible.
At the most basic, primary keys represent how the records will be physically stored in memory / on disk, you would want the unique field you're going to search on the most to be this as it will greatly reduce searching.
Unique keys are fields that can only contain unique values.
An index is a specialized "map" to the database file that queries can reference.
These are extremely simplified answers, but I think that's the gist of it.
One more thing: any key is essentially a separate table, sorted by the key, whose entries point directly to the row(s) that match the key.
A BTree-style index is stored in a balanced tree: a tree structure where traveling left leads to smaller keys and traveling right leads to larger keys.
    5
  3   7
 2 4 6 8
This is an example of a balanced tree. The other major type is a hash index, where a mathematical function (a hash function) turns the key into the relative memory location of the entry.
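The two lookup styles can be sketched in a few lines of Python. The tree below mirrors the seven-key example; it is a plain binary search tree rather than a real B-tree (whose nodes hold many keys each), and the names TreeNode, search, and hash_slot are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    key: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


# The example tree: 5 at the root, 3 and 7 below it, then 2, 4, 6, 8.
root = TreeNode(
    5,
    TreeNode(3, TreeNode(2), TreeNode(4)),
    TreeNode(7, TreeNode(6), TreeNode(8)),
)


def search(node: Optional[TreeNode], key: int) -> Optional[int]:
    """Return how many nodes were visited to find key, or None if absent."""
    steps = 0
    while node is not None:
        steps += 1
        if key == node.key:
            return steps
        # Smaller keys live to the left, larger keys to the right.
        node = node.left if key < node.key else node.right
    return None


def hash_slot(key: int, n_slots: int = 8) -> int:
    """Hash-style lookup: compute the slot directly instead of walking a tree."""
    return hash(key) % n_slots


steps_to_find_6 = search(root, 6)  # visits 5, then 7, then 6
```

Because the tree is balanced, no lookup among the seven keys visits more than three nodes. A hash index jumps straight to a slot, but it cannot answer range queries such as "all keys between 3 and 6" the way a sorted tree can.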
In order to really understand keys, you have to understand them at three levels: conceptual, logical, and physical. I'm going to reverse my habitual order, and discuss physical first.
Most programmers tend to think at the physical level. At the physical level, a key is a surrogate (stand-in) for the address of a row. When a row is to be referenced, a copy of the key can be used to specify the row. When a reference to a row is made in another row, the copy is known as a foreign key.
Most experienced programmers have a thorough understanding of pointers and addresses, and would understand exactly how the data structure worked if only it used pointers and addresses. Before the relational databases became dominant, there were in fact databases that used pointers to records embedded in other records to tie the data together.
A disadvantage to using keys instead of pointers is that the DBMS has to use an index to translate a key reference back to a pointer in order to retrieve the row in question. An advantage is that the level of indirection allows the DBMS to shuffle all the rows in a table for whatever purpose, as long as the DBMS updates all the relevant indexes accordingly.
Viewed at this level, keys might as well be simple, integer, and autoincremented. These work faster than other kinds of keys, and they sidestep certain data management issues that arise when user supplied data is missing or inconsistent. However, sidestepping data management issues at this level can create a minefield at the two higher levels.
At the logical level, a key is a minimal subset of the data in a tuple (row) that allows a single matching tuple to be specified, and when the DBMS retrieves the container for that tuple, all the attributes in the tuple are now available. Every relation has at least one candidate key. In the worst case, the entire tuple is the only candidate key. When multiple candidate keys exist for a single relation (table), common practice is to choose one candidate key as the primary key, and to make all references via this primary key.
(Actually, relation and table are not synonymous, but I'm simplifying here. Likewise, tuple and row are not synonymous, although they look identical at first glance.)
The primary reason to declare a primary key is to rule out duplicate keys or missing keys.
Sometimes database people choose to leave duplicate and missing key avoidance up to the programmers whose applications write to the database. More commonly, a primary key constraint serves to reflect an error back to a program that violates a primary key constraint.
When a DBMS sets up a primary key constraint, it also builds an index on the primary key. This allows the DBMS to find duplicates quickly, and it also speeds up certain queries that use the key column(s).
At the conceptual level, keys are the means by which the user community identifies instances of entities, whether those entities are persons (employees, travellers, etc.), things (bank accounts, hotel rooms, etc.) or whatever. The key is data and the entity identified by the key is not data. The key can thus be seen as a surrogate for the entity in the database.
At the conceptual level, keys are always natural, and never automatically supplied by the system. However, in the real world, keys are often mismanaged, and the consequences of mismanagement are overcome by what is called "common sense". Instilling common sense into an automated system is generally not feasible.
I never really described an index in the above, but it's implicit in what I said. An index is a data structure that serves to map from a key to a pointer. In all the databases you are likely to use, indexes are declared by the database builder (or perhaps a DBA) and managed by the DBMS.
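As a toy illustration of "an index maps a key to a pointer", the sketch below stores rows in a list (the position in the list plays the role of the pointer) and keeps a dictionary from key value to position. The data, the names, and the use of a dictionary instead of a B-tree are all assumptions made to keep the example small.

```python
# Row storage: the list position stands in for the row's physical address.
heap = [
    {"user_name": "carol", "city": "Oslo"},
    {"user_name": "alice", "city": "Lima"},
    {"user_name": "bob", "city": "Oslo"},
]

# The index: key value -> pointer (list position).
primary_index = {row["user_name"]: pos for pos, row in enumerate(heap)}


def fetch(key: str):
    """Translate the key to a pointer via the index, then read the row."""
    pos = primary_index.get(key)
    return None if pos is None else heap[pos]


def insert(row: dict) -> None:
    """Insert a row, using the index to reject duplicate keys quickly."""
    key = row["user_name"]
    if key in primary_index:
        raise KeyError(f"duplicate key: {key}")
    heap.append(row)
    primary_index[key] = len(heap) - 1


alice = fetch("alice")
```

The same structure serves both purposes mentioned above: the primary key constraint uses it to detect duplicates without scanning every row, and queries use it to find a row from its key.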