What happens to composite and multi-valued attributes in 1NF? - relational-database

I have a normalization problem like the following:
R={
att1(nb1, nb2, nb3),
att2, val1, val2, def1, class1,
class2{notion1, notion2},
def2,col1
}
Here, att1 is a multi-valued attribute and class2 is a composite attribute.
How do I convert R to 1NF?
Is it like the following?
R={
nb1, nb2, nb3,
att2, val1, val2, def1, class1,
notion1, notion2,
def2,col1
}

Yes, your answer is correct. As it is said in Wikipedia:
A relation is in first normal form if the domain of each attribute contains only atomic values, and the value of each attribute contains only a single value from that domain.
In other words, you cannot have attributes that:
are structured, that is contain components, or
are repeated (or both).
So, att1 and class2 must be substituted by their components.
Note that in the resulting relation, different rows can have the same values for every attribute other than nb1, nb2 and nb3: each value of att1 produces its own row, and the remaining attributes are repeated in it.
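As a rough illustration of that flattening, here is a minimal Python sketch. It assumes, as stated in the question, that att1 is multi-valued (a list of nb1/nb2/nb3 groups) and class2 is composite (a group of notion1/notion2). The function name to_1nf and the sample values are hypothetical and not part of the original problem.

```python
def to_1nf(record):
    """Flatten one nested record into 1NF rows of atomic values."""
    # Copy every attribute that is already atomic.
    base = {k: v for k, v in record.items() if k not in ("att1", "class2")}
    # A composite attribute is replaced by its components.
    base.update(record["class2"])
    # A multi-valued attribute yields one row per value.
    return [{**base, **occurrence} for occurrence in record["att1"]]


# Hypothetical sample record (only some attributes of R are shown).
nested = {
    "att1": [
        {"nb1": 1, "nb2": 2, "nb3": 3},
        {"nb1": 4, "nb2": 5, "nb3": 6},
    ],
    "att2": "x",
    "class2": {"notion1": "a", "notion2": "b"},
}
rows = to_1nf(nested)
```

The two resulting rows differ only in nb1, nb2 and nb3; att2, notion1 and notion2 are repeated in both.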
This normal form was first introduced in a 1971 paper by E. F. Codd: E. F. Codd, Further Normalization of the Data Base Relational Model, Courant Institute: Prentice-Hall, ISBN 013196741X:
A relation is in first normal form if it has the property that none of its domains has elements which are themselves sets.
(see Wikipedia citation).
Nowadays this normal form is presented in books on relational theory only for historical reasons, since this property is now considered part of the Relational Database Model. See for instance the book Fundamentals of Database Systems, by R. Elmasri, S. Navathe, Addison Wesley (p. 519 of the 6th edition, ISBN: 978-0-13-608620-8):
First Normal Form
First normal form (1NF) is now considered to be part of the formal definition of a relation in the basic (flat) relational model; historically, it was defined to disallow multivalued attributes, composite attributes, and their combinations. It states that the domain of an attribute must include only atomic (simple, indivisible) values and that the value of any attribute in a tuple must be a single value from the domain of that attribute. Hence, 1NF disallows having a set of values, a tuple of values, or a combination of both as an attribute value for a single tuple. In other words, 1NF disallows relations within relations or relations as attribute values within tuples. The only attribute values permitted by 1NF are single atomic (or indivisible) values.

What is the difference between findBy with underscore and findBy without it?

Example: What is the difference between :
List<UserCompany> findByCompany_IdAndCompany_IsActivated(params)
and
List<UserCompany> findByCompanyIdAndCompanyIsActivated(params)
There is no difference if your model is unambiguous with respect to field names.
With
List<UserCompany> findByCompanyIdAndCompanyIsActivated(params)
Spring Data first assumes that companyId and companyIsActivated are properties of UserCompany itself and looks for them there.
If that fails, it assumes that UserCompany has a field company of another class, Company, which has the fields id and isActivated, and looks for those.
In contrast,
List<UserCompany> findByCompany_IdAndCompany_IsActivated(params)
assumes directly that UserCompany has a field company of another class, Company, which has the fields id and isActivated, and looks for those.
From the Spring documentation, under "Property expressions":
Property expressions can refer only to a direct
property of the managed entity, as shown in the preceding example. At
query creation time you already make sure that the parsed property is
a property of the managed domain class. However, you can also define
constraints by traversing nested properties. Assume Persons have
Addresses with ZipCodes. In that case a method name of
List findByAddressZipCode(ZipCode zipCode); creates the
property traversal x.address.zipCode. The resolution algorithm starts
with interpreting the entire part (AddressZipCode) as the property and
checks the domain class for a property with that name (uncapitalized).
If the algorithm succeeds it uses that property. If not, the algorithm
splits up the source at the camel case parts from the right side into
a head and a tail and tries to find the corresponding property, in our
example, AddressZip and Code. If the algorithm finds a property with
that head it takes the tail and continue building the tree down from
there, splitting the tail up in the way just described. If the first
split does not match, the algorithm move the split point to the left
(Address, ZipCode) and continues.
Although this should work for most cases, it is possible for the
algorithm to select the wrong property. Suppose the Person class has
an addressZip property as well. The algorithm would match in the first
split round already and essentially choose the wrong property and
finally fail (as the type of addressZip probably has no code
property). To resolve this ambiguity you can use _ inside your method
name to manually define traversal points. So our method name would end
up like so:
List findByAddress_ZipCode(ZipCode zipCode);
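The resolution steps quoted above can be sketched roughly as follows. This is a simplified Python illustration of the description, not Spring Data's actual implementation. The resolve function and the dictionary-based models are hypothetical, and the sketch follows the quoted text in not backtracking after a wrong early match.

```python
import re


def uncapitalize(name):
    return name[:1].lower() + name[1:]


def resolve(source, owner, model):
    """Resolve a fragment such as 'AddressZipCode' against type `owner`.

    `model` maps a type name to {property name: property type name}.
    Returns the list of property names traversed, or None on failure.
    """
    if "_" in source:
        # An underscore is an explicit traversal point: resolve the part
        # before it, then resolve the rest against the type reached.
        head, tail = source.split("_", 1)
        path = resolve(head, owner, model)
        if path is None:
            return None
        for prop in path:
            owner = model[owner][prop]
        rest = resolve(tail, owner, model)
        return None if rest is None else path + rest

    words = re.findall(r"[A-Z][a-z0-9]*", source)
    # Try the whole fragment first, then move the split point to the left.
    for split in range(len(words), 0, -1):
        head = uncapitalize("".join(words[:split]))
        tail = "".join(words[split:])
        if head in model.get(owner, {}):
            if not tail:
                return [head]
            rest = resolve(tail, model[owner][head], model)
            # No backtracking: a wrong early match makes the lookup fail.
            return None if rest is None else [head] + rest
    return None


# Hypothetical models mirroring the Person/Address/ZipCode example.
plain = {
    "Person": {"address": "Address"},
    "Address": {"zipCode": "ZipCode"},
}
ambiguous = {
    "Person": {"address": "Address", "addressZip": "String"},
    "Address": {"zipCode": "ZipCode"},
}

unambiguous_path = resolve("AddressZipCode", "Person", plain)
wrong_match = resolve("AddressZipCode", "Person", ambiguous)
with_underscore = resolve("Address_ZipCode", "Person", ambiguous)
```

With the plain model both spellings resolve to address.zipCode. With the ambiguous model only the underscore form does, because addressZip matches first and has no code property.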
The underscore is a reserved character that lets you point at the right object when the JPA query is constructed. It is used only with nested objects, for example if you want to query by ZipCode inside Address inside your Company object.

Same domain members present in multiple dimensions in XBRL taxonomy

While going through the definition linkbase of a taxonomy, I found that a few domain members were present in two separate dimensions. For example, Dim A contains domain Dom1 with members m1, m2, m3, m4, and Dim B contains domain Dom2 with members m2, m3, m4. The issue is that this may lead to conflicting context names (even though the segment part of the contexts will be different).
The format of the context name is 'periodInformation_domainMember'. I need to use different dimensions for different sections of my report. So my basic question is: how do I form context names?
I hope I have explained myself properly.
Appreciate any help... :)
Use "Period Information + Dimension + member name" to make context names unique.
Check uniqueness against the child nodes of the <period> tag and of the <segment> tag. If a segment is present, each xbrldi:explicitMember carries the dimension in its dimension attribute and the member as its value.
...more: http://www.xbrl.org/Specification/XBRL-RECOMMENDATION-2003-12-31+Corrected-Errata-2005-04-25.htm#_4.7
What if there were multiple dimensions? Playing devil's advocate, what if you have dimensions with the same local name but in different namespaces? The only way to guarantee a unique name is to use the whole content of the context - which is ridiculous.
I've seen recommendations by regulators requiring filings in XBRL that 'Semantics SHOULD NOT be expressed...' in a context id and it is '..recommended to keep it as short as possible...'
The simplest solution is to pick unique names that have nothing to do with the contents, for example c-1, c-2 etc.
The syntax of the XBRL is unimportant, it's just an implementation detail.
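One way to picture the opaque-id suggestion is the following minimal Python sketch. It assumes each context can be reduced to a period plus a set of (dimension, member) pairs. The function assign_context_ids and the sample values are hypothetical, and a real filing would also have to include the entity and any other context content in the key.

```python
def assign_context_ids(contexts):
    """Map each distinct context to an opaque id such as 'c-1'.

    `contexts` is an iterable of (period, members) pairs, where `members`
    is an iterable of (dimension, member) pairs taken from the segment.
    """
    ids = {}
    for period, members in contexts:
        # Uniqueness is decided by the content, not by the name.
        key = (period, frozenset(members))
        if key not in ids:
            ids[key] = f"c-{len(ids) + 1}"
    return ids


# Hypothetical contexts: member m2 appears under two different dimensions.
contexts = [
    ("2023", [("DimA", "m2")]),
    ("2023", [("DimB", "m2")]),
    ("2023", [("DimA", "m2")]),  # repeat of the first context
]
context_ids = assign_context_ids(contexts)
```

The two contexts that share member m2 get different ids because their dimensions differ, and the repeated context reuses the id it was given first.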

Definition of domains in MySQL?

I'm working on a college exercise and have the following question:
What is the domain of the "country" table?
My understanding of domain is that it defines the possible values of an attribute.
This means that the table "country" doesn't have a domain, but the various attributes in the table "country" have their own domains.
For example the attribute "SurfaceArea" has the domain FLOAT(10,2) and the attribute "Name" has the domain CHAR(52).
Is this correct?
http://en.wikipedia.org/wiki/Relational_database#Domain
Here is one source and yes, you are correct, domain describes the possible values of an attribute.
Your understanding of domains is correct but not complete. Typically, a domain is implemented as a data type like you specified. However, a domain can be more specific than the data type. Consider the attribute Grade: the possible values could be {A, B, C, D, F, I}. If you specify the data type as CHAR(1), you are not eliminating invalid letters like G, H, J, etc. Consequently, it is important to understand what the original domain is and how it differs from the domain of the data type. Any remaining constraints, such as eliminating those invalid values, are typically implemented as application constraints. A good reference on this is Elmasri and Navathe.
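To make the gap between the data type and the domain concrete, here is a small Python sketch of an application-level check for the Grade example. The names GRADE_DOMAIN, fits_char1 and is_valid_grade are hypothetical.

```python
# The domain of Grade as given in the example above.
GRADE_DOMAIN = frozenset("ABCDFI")


def fits_char1(value):
    """What the data type CHAR(1) alone accepts: any single character."""
    return isinstance(value, str) and len(value) == 1


def is_valid_grade(value):
    """The stricter check the domain calls for."""
    return isinstance(value, str) and value in GRADE_DOMAIN
```

The value "G" passes fits_char1 but fails is_valid_grade, which is exactly the kind of value the data type alone does not rule out.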

Which relational model is better for this example?

This is a relational model for an OOP database. Which of these two versions is better?
Note: the -> operator denotes a foreign key, written as field->table(reference).
First
attribute (id:auto, attribute_name)
type (id:auto, type_name)
type_attribute (id:auto, type_code->type(id), attribute_id->attribute(id), default_value)
object (id:auto, name, object_type->type(id))
object_property (id:auto, object_id->object(id), attribute_id->attribute(id), my_value)
Second
attribute (id:auto, attribute_name)
type (id:auto, type_name)
type_attribute (id:auto, type_code->type(id), attribute_code->attribute(id), default_value)
object (id:auto, name, object_type->type(id))
object_property (id:auto, (object_id, object_type)->object(id,object_type), (object_type, attribute_id)->type_attribute(id, attribute_id), my_value)
The difference is clearly visible in the object_property table.
In the first model you define a property using only the object code and the attribute code. The problem is that you can then store a property whose attribute is not defined for the object's type. However, this model is the easiest to use, because defining an object_property needs only two codes:
INSERT INTO object_property(object_code, attribute_code, my_value)
VALUES (3,4,'myvalue')
In the second model you define a property with more consistent data, using the object_code, the object_type and the attribute_code. However, you need three codes and an additional query, like this:
INSERT INTO object_property(object_code, object_type, attribute_code, my_value)
VALUES (3, (select object_type from object where code = 3), 4, 'my_value')
Which is better?
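The consistency guarantee of the second model can be tried out with a small sketch. The following Python example uses SQLite's in-memory database rather than any particular production DBMS. It assumes that type_attribute is keyed by (type_code, attribute_code) and that object has a unique (id, object_type) pair, which the composite foreign keys need. The helper add_property and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE type (id INTEGER PRIMARY KEY, type_name TEXT);
CREATE TABLE attribute (id INTEGER PRIMARY KEY, attribute_name TEXT);
CREATE TABLE type_attribute (
    type_code INTEGER REFERENCES type(id),
    attribute_code INTEGER REFERENCES attribute(id),
    default_value TEXT,
    PRIMARY KEY (type_code, attribute_code)
);
CREATE TABLE object (
    id INTEGER PRIMARY KEY,
    name TEXT,
    object_type INTEGER REFERENCES type(id),
    UNIQUE (id, object_type)
);
CREATE TABLE object_property (
    object_id INTEGER,
    object_type INTEGER,
    attribute_id INTEGER,
    my_value TEXT,
    FOREIGN KEY (object_id, object_type) REFERENCES object(id, object_type),
    FOREIGN KEY (object_type, attribute_id)
        REFERENCES type_attribute(type_code, attribute_code)
);
INSERT INTO type VALUES (1, 'car'), (2, 'person');
INSERT INTO attribute VALUES (4, 'color'), (5, 'age');
INSERT INTO type_attribute VALUES (1, 4, 'black'), (2, 5, '0');
INSERT INTO object VALUES (3, 'my car', 1);
""")


def add_property(object_id, attribute_id, value):
    """Insert a property; return False if a foreign key rejects it."""
    try:
        # The object's type is looked up by the statement itself.
        conn.execute(
            "INSERT INTO object_property "
            "SELECT id, object_type, ?, ? FROM object WHERE id = ?",
            (attribute_id, value, object_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = add_property(3, 4, "red")  # 'color' is defined for type 'car'
rejected = add_property(3, 5, "42")   # 'age' is not defined for type 'car'
```

In this sketch the database itself refuses the second insert, which is the check the first model would leave to the application.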
Did you mean to say "relational model"? There is only one relational model:
we've never changed the axioms for the relational model. We have made
a number of changes over the years to the model itself—for example,
we've added relational comparisons—but the axioms (which are basically
those of classical predicate logic) have remained unchanged ever since
Codd's first papers. Moreover, what changes have occurred have all
been, in my view, evolutionary, not revolutionary, in nature. Thus, I
really do claim there's only one relational model, even though it has
evolved over time and will presumably continue to do so.
SQL and Relational Theory: How to Write Accurate SQL Code By C. J. Date

How can I store an array of boolean values in a MySQL database?

In my case, every "item" either has a property or not. There can be some hundreds of properties, so I will need, say, max 1000 true/false bits per item.
Is there a way to store those bits in one field of the item?
If you're looking for a way to do this in a way that's searchable, then no.
A couple searchable methods (involving more than 1 column and/or table):
Use a bunch of SET columns. You're limited to 64 items (on/offs) in a set, but you can probably figure out a way to group them.
Use 3 tables: Items (id, ...), FlagNames(id, name), and a pivot table ItemFlags(item_id, flag_id). You can then query for items with joins.
If you don't need it to be searchable, then all you need is a method to serialize your data before you put it in the database and to unserialize it when you pull it out. Then use a char or varchar column.
Use facilities built in to your language (PHP's serialize/unserialize).
Concatenate a series of "y" and "n" characters together.
Bit-pack your values into a string (8 bits per character) in the client before making a call to the MySQL database, and unpack them when retrieving data out of the database. This is the most efficient storage mechanism (if all rows are the same length, use CHAR(x), not VARCHAR(x)), at the expense of the data not being searchable and slightly more complicated code.
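The bit-packing option above can be sketched in a few lines of Python. The function names pack_flags and unpack_flags are hypothetical, and the bit order within each byte (lowest bit first) is an arbitrary choice. Storing the result in a fixed-width binary column such as BINARY(125) is an assumption on my part, since the answer itself speaks of char columns.

```python
def pack_flags(flags):
    """Pack an iterable of booleans into bytes, 8 flags per byte."""
    flags = list(flags)
    packed = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            packed[i // 8] |= 1 << (i % 8)  # lowest bit first
    return bytes(packed)


def unpack_flags(packed, count):
    """Recover the first `count` booleans from packed bytes."""
    return [bool(packed[i // 8] & (1 << (i % 8))) for i in range(count)]


flags = [False, True, True, False, True, False, False, True]
packed = pack_flags(flags)
restored = unpack_flags(packed, len(flags))
```

With this layout 1000 flags take 125 bytes per item.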
I would rather go with something like:
Properties
ID, Property
1, FirstProperty
2, SecondProperty
ItemProperties
ID, Property, Item
1021, 1, 10
1022, 2, 10
Then it would be easy to retrieve which properties are set or not with a query for any particular item.
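A quick way to try the two-table layout above is the following Python sketch, which uses SQLite's in-memory database as a stand-in for MySQL. The table and column names follow the example. The ThirdProperty row and the helper functions properties_of and has_property are hypothetical additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Properties (ID INTEGER PRIMARY KEY, Property TEXT);
CREATE TABLE ItemProperties (
    ID INTEGER PRIMARY KEY,
    Property INTEGER REFERENCES Properties(ID),
    Item INTEGER
);
INSERT INTO Properties VALUES
    (1, 'FirstProperty'), (2, 'SecondProperty'), (3, 'ThirdProperty');
INSERT INTO ItemProperties VALUES (1021, 1, 10), (1022, 2, 10);
""")


def properties_of(item):
    """Names of the properties that are set for one item."""
    rows = conn.execute(
        "SELECT p.Property FROM ItemProperties ip "
        "JOIN Properties p ON p.ID = ip.Property "
        "WHERE ip.Item = ? ORDER BY p.ID",
        (item,),
    )
    return [row[0] for row in rows]


def has_property(item, property_id):
    """True if the item has a row for the given property."""
    row = conn.execute(
        "SELECT 1 FROM ItemProperties WHERE Item = ? AND Property = ?",
        (item, property_id),
    ).fetchone()
    return row is not None
```

A property counts as "false" for an item simply by having no row in ItemProperties.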
At worst you would have to use a char(1000) [ynnynynnynynnynny...] or the like. If you're willing to pack it (for example, into hex isn't too bad) you could do it with a char(250) [hexadecimal chars], since each hexadecimal character holds 4 bits.
If there are at most 64 properties, then the SET type will work, but it seems like that's not enough.
You could use a binary type, but that's designed more for stuff like movies, etc., so I wouldn't.
So yeah, it seems like your best bet is to pack it into a string, and then store that.
It should be noted that a VARCHAR would be wasting space, since you do know precisely how much space your data will take, and can allocate it exactly. (Having fixed-width rows is a good thing)
Strictly speaking you can accomplish this using the following:
$bools = array(0,1,1,0,1,0,0,1);
$for_db = serialize($bools);
// Insert the serialized $for_db string into the database. You could use a text type
// to make certain it can hold the entire string.
// To get it back out ($from_db is the string read back from that column):
$bools = unserialize($from_db);
That said, I would strongly recommend looking at alternative solutions.
Depending on the use case you might try creating an "item" table that has a many-to-many relationship with values from an "attributes" table. This would be a standard implementation of the common Entity Attribute Value database design pattern for storing variable points of data about a common set of objects.