Which database schema should I use for multi-value columns - mysql

I don't know much about databases. I have a table on paper that has 4 or 5 columns (one of them for the id). One column is the primary one and holds a single value, while the other columns are secondary and can hold multiple values. Now I have some values, and I have to search for them in the secondary columns and return the primary value with the most matches, e.g.:
Id  Primary  Secondary_1  Secondary_2  Secondary_3
1   ABCD     12,11,9      51,52        77
2   ABCE     9,15,17      12,14,7      71,77
3   ABEF     8,9,14,12    51,7         77,71
4   ABEG     7,9,15       52,14        77,78
Secondary columns can be of string type. Now suppose I search for (8,9,14,77): it should return ABEF. If I search for (9,51,77), it should return (ABCD, ABEF), and so on. So my problem is how to store this in a database: which schema should I use for this type of problem?

You should create 3 tables in addition to the primary table, and use foreign keys to connect their rows to the primary table:
primary table: Id, Primary
secondary table 1: ForeignId, Value
secondary table 2: ForeignId, Value
secondary table 3: ForeignId, Value
Then create a foreign key on each "ForeignId" column that references the "Id" column of the primary table.
Of course, these column names are only placeholders. Don't actually name them "Value"; be more precise.
The goal is to have a single value in each field, not multiple ones. With such a normalized design, you can query your secondary tables with simple string comparisons and join the matched rows back to the rows of your primary table.
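The design above can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the names item, secondary_1 to secondary_3, item_id, value and best_matches are assumptions made for illustration, not part of the original answer. A search counts how many search terms hit each row and keeps the rows with the highest count:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE secondary_1 (item_id INTEGER NOT NULL REFERENCES item(id),
                          value TEXT NOT NULL, PRIMARY KEY (item_id, value));
CREATE TABLE secondary_2 (item_id INTEGER NOT NULL REFERENCES item(id),
                          value TEXT NOT NULL, PRIMARY KEY (item_id, value));
CREATE TABLE secondary_3 (item_id INTEGER NOT NULL REFERENCES item(id),
                          value TEXT NOT NULL, PRIMARY KEY (item_id, value));
""")

# Sample rows from the question; each comma list becomes one row per value.
SAMPLE = {
    1: ("ABCD", "12,11,9", "51,52", "77"),
    2: ("ABCE", "9,15,17", "12,14,7", "71,77"),
    3: ("ABEF", "8,9,14,12", "51,7", "77,71"),
    4: ("ABEG", "7,9,15", "52,14", "77,78"),
}
for item_id, (name, *lists) in SAMPLE.items():
    conn.execute("INSERT INTO item (id, name) VALUES (?, ?)", (item_id, name))
    for n, csv in enumerate(lists, start=1):
        conn.executemany(
            f"INSERT INTO secondary_{n} (item_id, value) VALUES (?, ?)",
            [(item_id, v) for v in csv.split(",")],
        )

def best_matches(terms):
    """Return the names of the rows matching the most search terms."""
    marks = ",".join("?" * len(terms))  # one placeholder per term
    sql = f"""
    WITH hits AS (
        SELECT item_id FROM secondary_1 WHERE value IN ({marks})
        UNION ALL
        SELECT item_id FROM secondary_2 WHERE value IN ({marks})
        UNION ALL
        SELECT item_id FROM secondary_3 WHERE value IN ({marks})
    ), counts AS (
        SELECT item_id, COUNT(*) AS n FROM hits GROUP BY item_id
    )
    SELECT item.name
    FROM counts JOIN item ON item.id = counts.item_id
    WHERE counts.n = (SELECT MAX(n) FROM counts)
    ORDER BY item.name
    """
    # The term list is bound three times, once per secondary table.
    return [row[0] for row in conn.execute(sql, list(terms) * 3)]

print(best_matches(["8", "9", "14", "77"]))
print(best_matches(["9", "51", "77"]))
```

Note that this sketch counts a term once per secondary table it appears in. If a term should only be looked up in one particular secondary column, each branch of the UNION ALL would need its own term list.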

Related

Unique value in field for specific field in sql

I have to create a new table in SQL, but I have a problem.
I want a field to be unique, but only within a specific value of another field in the same table, similar to a one-to-many relationship.
Table:
ID_Order
Supplier
ID_Supplier_Order
And now I want ID_Order to be unique across the whole table, and ID_Supplier_Order to be unique only within a specific Supplier. Can I do it in one table, or do I have to create a second one for suppliers?
Taken from "Add unique constraint to combination of two columns":
CREATE UNIQUE INDEX uq_yourtablename
ON dbo.yourtablename(column1, column2);
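Applied to the table from the question, the two-column unique index could look like the following sketch. It runs against SQLite through Python's sqlite3 module rather than SQL Server, and the table name orders, the index name and the helper try_insert are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id_order          INTEGER PRIMARY KEY,   -- unique across the table
    supplier          TEXT    NOT NULL,
    id_supplier_order INTEGER NOT NULL
);
-- Unique only per supplier: the pair must not repeat.
CREATE UNIQUE INDEX uq_orders_supplier_order
    ON orders (supplier, id_supplier_order);
""")

def try_insert(id_order, supplier, id_supplier_order):
    """Insert a row; return False if a uniqueness rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)",
            (id_order, supplier, id_supplier_order),
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(1, "Acme", 100),
    try_insert(2, "Globex", 100),  # same order number, different supplier
    try_insert(3, "Acme", 100),    # same supplier and order number again
    try_insert(3, "Acme", 101),
]
print(results)
```

So a single table is enough here; a separate suppliers table is only needed if suppliers carry data of their own.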

normalization - 1NF clarification

I have a question about first normal form (1NF) and will explain it with an example.
Let's imagine that we have a set of students working on a set of projects, where a student may work on more than one project (a many-to-many relation). We have one table where the students' information is recorded, and one for the projects, but we need to link them together. Since 1NF says no redundancy and only one value per tuple, how would you do it?
Both fields are primary keys here.
Illustration 1:
student_ID project_ID
1 7
2 7,1
3 4,1,9
4 1,3
5 1
Illustration 2:
student_ID project_ID
1 7
2 1
2 7
3 4
3 1
3 9
4 1
4 3
5 1
Illustration 1: I know that if this were the resulting table, it would violate 1NF because there is more than one value per tuple.
Illustration 2: since they are primary keys, they are not allowed to be duplicated; even if I remove the primary key from student_ID, it would still be redundant.
How can I fix this issue?
Thanks in advance :)
The primary key of this table will be a composite of the two fields. They must both together be unique. Both fields are foreign keys to their respective tables and they will be unique in their respective tables.
What you have here is basically a junction table, and your second illustration shows the correct way to normalize it.
Note that, as is typical for junction tables, the primary key for your table will consist of both of the columns together. Together, each unique combination of values in these columns specifies a distinct student–project pairing.
Edit: In MySQL, you would define this table e.g. as:
CREATE TABLE student_projects (
student_id INTEGER NOT NULL,
project_id INTEGER NOT NULL,
PRIMARY KEY (student_id, project_id)
);
To enforce relational consistency, you may also want to add explicit foreign key constraints to each of the columns.
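As a rough sketch of how the junction table behaves once the foreign keys are added, the following uses Python's sqlite3 module in place of MySQL. The parent tables students and projects and the helpers rejected and projects_of are assumed names; the rows are the ones from Illustration 2:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY);
CREATE TABLE projects (project_id INTEGER PRIMARY KEY);
CREATE TABLE student_projects (
    student_id INTEGER NOT NULL REFERENCES students(student_id),
    project_id INTEGER NOT NULL REFERENCES projects(project_id),
    PRIMARY KEY (student_id, project_id)   -- the pair is the key
);
""")
conn.executemany("INSERT INTO students VALUES (?)", [(i,) for i in range(1, 6)])
conn.executemany("INSERT INTO projects VALUES (?)", [(i,) for i in (1, 3, 4, 7, 9)])

# One row per student/project pairing, exactly as in Illustration 2.
pairs = [(1, 7), (2, 1), (2, 7), (3, 4), (3, 1), (3, 9), (4, 1), (4, 3), (5, 1)]
conn.executemany("INSERT INTO student_projects VALUES (?, ?)", pairs)

def rejected(student_id, project_id):
    """Return True if the database refuses this pairing."""
    try:
        conn.execute(
            "INSERT INTO student_projects VALUES (?, ?)",
            (student_id, project_id),
        )
        return False
    except sqlite3.IntegrityError:
        return True

def projects_of(student_id):
    rows = conn.execute(
        "SELECT project_id FROM student_projects "
        "WHERE student_id = ? ORDER BY project_id",
        (student_id,),
    )
    return [row[0] for row in rows]

print(rejected(2, 7))    # pairing already exists
print(rejected(99, 1))   # no such student
print(projects_of(3))
```

A student id or a project id may repeat in the junction table as often as needed; only the combination of the two has to be unique, which is why Illustration 2 does not violate the key.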

Query with primary key obtained from primary keys of underlying tables

I have a table in Access which I'd like to replace with a query that gathers data from the table and from other, new tables. The table is used by many queries that rely on a primary key (autonumber) in the table, so the new query must expose a primary key that is a unique combination of the primary keys of the tables used by the query. What can I do?
--EDIT--
Solution found: Since I want to "merge" tables with a query, and since each pk is an autonumber, I can define the new pk (of the query) by "expanding the numbering": I multiply both pkeys by 2 (because I have two tables) and add or subtract 1 on one of the two (or add 1 for the first table and 2 for the second, and so on).
For example:
PK1 = 1,2,3,4,5,6
PK2 = 1,3,4,5,8,9,10 (some records may have been deleted, so the number is skipped)
new PK = (2*PK1, (2*PK2 + 1)) = (2,4,6,8,10,12),(3,7,9,11,17,19,21)
As you can see, they will never overlap: every key derived from PK1 is even and every key derived from PK2 is odd (because of the "+1"), so no value obtained from PK2 can equal one obtained from PK1.
Hope this helps somebody.
Use a composite key (a multiple-field primary key).
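The doubling scheme from the question's edit can be checked with a few lines of Python. This is only a sketch: merged_key, split_key and the table_count parameter are hypothetical names, and treating the offset as a table index (0 for the first table, 1 for the second) is one assumed way to read the "and so on" generalization. A composite key is the same information kept as an explicit (table, pk) pair instead of being packed into one number:

```python
def merged_key(table_index, pk, table_count=2):
    """Pack (table_index, pk) into one integer; table_index starts at 0."""
    return table_count * pk + table_index

def split_key(key, table_count=2):
    """Recover (table_index, pk) from a merged key."""
    pk, table_index = divmod(key, table_count)
    return table_index, pk

pk1 = [1, 2, 3, 4, 5, 6]
pk2 = [1, 3, 4, 5, 8, 9, 10]  # gaps come from deleted records

keys1 = [merged_key(0, pk) for pk in pk1]  # 2 * PK1, always even
keys2 = [merged_key(1, pk) for pk in pk2]  # 2 * PK2 + 1, always odd

print(keys1)
print(keys2)
print(set(keys1) & set(keys2))  # empty: even and odd numbers never collide
print(split_key(17))            # back to (table_index, pk)
```

The packed form keeps a single numeric column for existing queries, at the cost of having to decode it; the composite form is easier to read and to join on.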

MySQL unique index by multiple fields

We have a special kind of table in our DB that stores the history of its changes in itself, a so-called "self-archived" table:
CREATE TABLE coverages (
id INT, # primary key, auto-increment
subscriber_id INT,
current CHAR, # - could be "C" or "H".
record_version INT,
# etc.
);
It stores "coverages" of our subscribers. Field "current" indicates if this is a current/original record ("C") or history record ("H").
We could only have one current "C" coverage for the given subscriber, but we can't create a unique index with 2 fields (*subscriber_id and current*) because for any given "C" record there could be any number of "H" records - history of changes.
So the index should only be unique for current == 'C' and any subscriber_id.
That could be done in Oracle DB using something like materialized views: we could create a materialized view that only includes records with current = 'C' and create a unique index on these 2 fields: *subscriber_id, current*.
The question is: how can this be done in MySQL?
You can do this using NULL values. If you use NULL instead of "H", MySQL will ignore the row when evaluating the UNIQUE constraint. From the MySQL manual:
"A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. This constraint does not apply to NULL values except for the BDB storage engine. For other engines, a UNIQUE index permits multiple NULL values for columns that can contain NULL."
Now, this is cheating a bit, and it means that you can't have your data exactly as you want it. So this solution may not fit your needs. But if you can rework your data in this way, it should work.
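A small sketch of the NULL approach follows. It runs on SQLite through Python's sqlite3 module, which, like the MySQL engines described in the quote, allows multiple NULLs under a UNIQUE constraint. The column is renamed to is_current here (an assumption, chosen to stay clear of keyword clashes), and the helper add is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE coverages (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    subscriber_id  INTEGER NOT NULL,
    is_current     TEXT,              -- 'C' for current, NULL for history
    record_version INTEGER NOT NULL,
    UNIQUE (subscriber_id, is_current)
);
""")

def add(subscriber_id, is_current, record_version):
    """Insert a coverage row; return False if the UNIQUE rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO coverages (subscriber_id, is_current, record_version) "
            "VALUES (?, ?, ?)",
            (subscriber_id, is_current, record_version),
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    add(1, "C", 3),    # current record for subscriber 1
    add(1, None, 1),   # history rows: NULLs never collide
    add(1, None, 2),
    add(2, "C", 1),    # another subscriber may have its own current record
    add(1, "C", 4),    # second current record for subscriber 1
]
print(results)
```

Queries that used to filter on current = 'H' would have to use IS NULL instead, which is the reworking of the data mentioned above.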

Translating a MySQL data/query-set into the equivalent Cassandra representation

Consider a 500 million row MySQL table with the following table structure ...
CREATE TABLE foo_objects (
id int NOT NULL AUTO_INCREMENT,
foo_string varchar(32),
metadata_string varchar(128),
lookup_id int,
PRIMARY KEY (id),
UNIQUE KEY (foo_string),
KEY (lookup_id)
);
... which is being queried using only the following two queries ...
# lookup by unique string key, maximum of one row returned
SELECT * FROM foo_objects WHERE foo_string = ?;
# lookup by numeric lookup key, may return multiple rows
SELECT * FROM foo_objects WHERE lookup_id = ?;
Given those queries, how would you represent the given data-set using Cassandra?
You have two options:
(1) is sort of traditional: have one CF (column family) with your foo objects, one row per foo, one column per field. Then create two index CFs, where the row key in one is the foo_string value, and the row key in the other is the lookup_id. Columns in the index rows are foo ids. So you do a GET on the index CF, then a MULTIGET on the ids returned.
Note that if you can make id the same as lookup_id then you have one less index to maintain.
High-level clients like Digg's lazyboy (http://github.com/digg/lazyboy) will automate maintaining the index CFs for you. Cassandra itself does not do this automatically (yet).
(2) is like (1), but you duplicate the entire foo objects into subcolumns of the index rows (that is, the index top-level columns are supercolumns). If you're not actually querying by the foo id itself, you don't need to store it in its own CF at all.
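The read path of option (1) can be illustrated with a toy in-memory model. This is plain Python dictionaries standing in for column families, not Cassandra client code, and every name in it (foo_cf, by_string_cf, by_lookup_cf, insert_foo, multiget and so on) is an assumption made for illustration:

```python
from collections import defaultdict

foo_cf = {}                       # row key: foo id -> {column name: value}
by_string_cf = defaultdict(set)   # row key: foo_string -> foo ids (columns)
by_lookup_cf = defaultdict(set)   # row key: lookup_id  -> foo ids (columns)

def insert_foo(foo_id, foo_string, metadata_string, lookup_id):
    """Write the object row and keep both index rows up to date by hand."""
    foo_cf[foo_id] = {
        "foo_string": foo_string,
        "metadata_string": metadata_string,
        "lookup_id": lookup_id,
    }
    by_string_cf[foo_string].add(foo_id)
    by_lookup_cf[lookup_id].add(foo_id)

def multiget(foo_ids):
    """Fetch several object rows by id, like a MULTIGET on the main CF."""
    return [foo_cf[foo_id] for foo_id in sorted(foo_ids)]

def get_by_string(foo_string):
    # GET on the index row, then MULTIGET on the ids found there.
    return multiget(by_string_cf.get(foo_string, set()))

def get_by_lookup(lookup_id):
    return multiget(by_lookup_cf.get(lookup_id, set()))

insert_foo(1, "alpha", "first", 10)
insert_foo(2, "beta", "second", 10)
insert_foo(3, "gamma", "third", 20)

print([row["foo_string"] for row in get_by_lookup(10)])
print(get_by_string("gamma"))
```

In option (2) the index rows would hold the full objects instead of just the ids, so the second step (the MULTIGET) disappears at the cost of storing each object more than once.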