Database "pointers" to rows? - mysql

Is there a way to have "pointers to rows" in a database?
For example, I have X product rows. All of these rows represent distinct products, but many have the same field values; only their "id" and "color_id" differ.
I thought of just duplicating the rows, but this could be error-prone, and a small change would have to be made on several rows, which again invites bugs.
Question: Is there a way to fill some rows fully, then use a special value to "point to" certain field values?
For example:
id | field1 | field2 | field3 | color_id
-----------------------------------------------
1 | value1 | value2 | value3 | blue
2 | point[1] | point[1] | point[1] | red (same as row 1, except id and color)
3 | point[1] | point[1] | point[1] | green (same as row 1, except id and color)
4 | valueA | valueB | valueC | orange
5 | point[4] | point[4] | point[4] | brown (same as row 4, except id and color)
6 | valueX | valueY | valueZ | pink
7 | point[6] | point[6] | point[6] | yellow (same as row 6, except id and color)
I'm using MySQL, but this is more of a general question. Also, if this goes completely against database theory, some explanation of why this is bad would be appreciated.

This does go against database design. Look for descriptions of normalization and relational algebra. It is bad mainly because of the comment you have made "duplicating the rows but this could be error prone, plus making a small change would have to be done on several rows, again buggy."
The idea of relational databases is to act on sets of data and find things by matching on primary and foreign keys and absolutely not to use or think of pointers at all.
If you have common data for each product, then create a product table
create table product (
product_id int,
field1 ...,
field2 ...,
field3
)
with a primary key on product_id.
The main table would have the fields id, color_id and product_id.
If the product table looks like
product_id | field1 | field2 | field3
-----------------------------------------------
1 | value1 | value2 | value3
2 | valueA | valueB | valueC
3 | valueX | valueY | valueZ
The main table would look like
id | product_id | color_id
--------------------------------
1 | 1 | blue
2 | 1 | red
3 | 1 | green
4 | 2 | orange
5 | 2 | brown
6 | 3 | pink
7 | 3 | yellow
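A minimal sketch of the two-table layout above in MySQL. The column types, the name product_color for the "main table", and storing color_id as text are assumptions made for illustration; the answer leaves them unspecified.
```sql
-- Shared attributes are stored once per product (types are assumed).
CREATE TABLE product (
    product_id INT PRIMARY KEY,
    field1 VARCHAR(50),
    field2 VARCHAR(50),
    field3 VARCHAR(50)
);
-- "Main table": one row per colour variant, referencing its product.
-- The name product_color is hypothetical.
CREATE TABLE product_color (
    id INT PRIMARY KEY,
    product_id INT NOT NULL,
    color_id VARCHAR(20) NOT NULL,
    FOREIGN KEY (product_id) REFERENCES product (product_id)
);
INSERT INTO product VALUES
    (1, 'value1', 'value2', 'value3'),
    (2, 'valueA', 'valueB', 'valueC'),
    (3, 'valueX', 'valueY', 'valueZ');
INSERT INTO product_color VALUES
    (1, 1, 'blue'), (2, 1, 'red'), (3, 1, 'green'),
    (4, 2, 'orange'), (5, 2, 'brown'),
    (6, 3, 'pink'), (7, 3, 'yellow');
-- Rebuild the wide rows from the question with a join.
SELECT pc.id, p.field1, p.field2, p.field3, pc.color_id
FROM product_color AS pc
JOIN product AS p ON p.product_id = pc.product_id
ORDER BY pc.id;
-- A change to shared data now touches exactly one row.
UPDATE product SET field1 = 'new value' WHERE product_id = 1;
```
After the UPDATE, the blue, red and green variants all report the new field1 value, which is the "small change on several rows" problem from the question handled in one place.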

Sure there is a way to have pointers to rows in a database. Just don't use a relational DBMS. In the 1960s and 1970s, there were several very successful DBMS products that were based entirely on linking records together by embedding pointers to records inside other records. Perhaps the most well known of these was IMS.
The downside of having pointers to records in other records was that the resulting database was far less flexible than relational databases ended up being. For predetermined access paths, a database built on a network of pointers is actually faster than a relational database. But when you want to combine the data in multiple ways, the lack of flexibility will kill you.
That is why relational DBMSes took over the field in the 1980s and 1990s, although hierarchical and network databases still survive for fairly specialized work.
As others have suggested, you should learn normalization. When you do, you will learn how to decompose tables into smaller tables with fewer columns (fields) in each table. When you need to use the data in joined fashion, you can use a relational join to put the data back together. Relational joins can be almost as fast as navigating by pointers, especially if you have the right indexes built.
Normalization will help you avoid harmful redundancy, which is the problem you highlighted in your question.

One way of doing this is to separate the columns that seem to have repeated data and put that in a separate table. Give each of the rows in this new table a unique id. Add a column to the original table which contains the id in the new table. Then use a FOREIGN KEY relationship between the original table and the new table's id column.
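The steps above could be sketched roughly as follows in MySQL. The starting table products, the new table product_details, the column details_id and all column types are hypothetical names and assumptions, not something given in the question.
```sql
-- Assumed starting point: shared values duplicated across rows.
CREATE TABLE products (
    id INT PRIMARY KEY,
    field1 VARCHAR(50), field2 VARCHAR(50), field3 VARCHAR(50),
    color_id VARCHAR(20)
);
INSERT INTO products VALUES
    (1, 'value1', 'value2', 'value3', 'blue'),
    (2, 'value1', 'value2', 'value3', 'red'),
    (4, 'valueA', 'valueB', 'valueC', 'orange');
-- 1. New table holding each distinct combination once, with its own id.
CREATE TABLE product_details (
    id INT AUTO_INCREMENT PRIMARY KEY,
    field1 VARCHAR(50), field2 VARCHAR(50), field3 VARCHAR(50)
);
INSERT INTO product_details (field1, field2, field3)
SELECT DISTINCT field1, field2, field3 FROM products;
-- 2. Column in the original table that stores the new table's id.
--    <=> is MySQL's NULL-safe equality, so NULL fields still match.
ALTER TABLE products ADD COLUMN details_id INT;
UPDATE products AS p
JOIN product_details AS d
  ON d.field1 <=> p.field1 AND d.field2 <=> p.field2 AND d.field3 <=> p.field3
SET p.details_id = d.id;
-- 3. Enforce the relationship, then drop the duplicated columns.
ALTER TABLE products
    ADD CONSTRAINT fk_products_details
    FOREIGN KEY (details_id) REFERENCES product_details (id);
ALTER TABLE products
    DROP COLUMN field1, DROP COLUMN field2, DROP COLUMN field3;
```
On real data it would be prudent to check that no row is left with a NULL details_id before dropping the old columns.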

Well, this would be called normalization under normal circumstances. The whole point of it is to deal with this kind of scenario, so no, it can't be done the way you want to do it. You will need to normalize the data properly.

Create separate tables for the field1, field2 and field3 values.
Put the existing values there, and reference them by putting their ids into your current table.

If you're using common string values, it's good to store the strings in a separate table and refer to them with foreign keys. If you're storing anything like an integer, it wouldn't be worth it - the size of the pointer would be comparable to the size of the data itself.

It does go against database theory because you're throwing the relational part of databases out the window.
The way to do it is to add an object_id column that contains the key of the row you want to point to.
id | field1 | field2 | field3 | color_id | object_id |
------------------------------------------------------------
1 | value1 | value2 | value3 | blue
2 | null | null | null | red | 1 |
3 | null | null | null | green | 1 |
4 | valueA | valueB | valueC | orange
5 | null | null | null | brown | 4 |
6 | valueX | valueY | valueZ | pink
7 | null | null | null | yellow | 6 |
But remember: This is a bad idea. Don't do it. If you did want to do it, that would be how.
There are instances where it's required; but after dealing with a system where this was pervasive, I'd always try to find another way, even if it means duplicating data and letting your business layer keep everything straight.
I work in a system where this was done throughout the system, and it's maddening when you have to recreate the functionality of relationships because someone wanted to be clever.
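To make the cost concrete, here is a hedged sketch of what reading such a table involves. The table name products and the column types are assumptions; only the column layout comes from the example above.
```sql
CREATE TABLE products (
    id INT PRIMARY KEY,
    field1 VARCHAR(50), field2 VARCHAR(50), field3 VARCHAR(50),
    color_id VARCHAR(20),
    object_id INT NULL,
    FOREIGN KEY (object_id) REFERENCES products (id)
);
-- The fully populated row goes in first, then the rows that point at it.
INSERT INTO products VALUES (1, 'value1', 'value2', 'value3', 'blue', NULL);
INSERT INTO products VALUES
    (2, NULL, NULL, NULL, 'red', 1),
    (3, NULL, NULL, NULL, 'green', 1);
-- Every read has to follow the pointer by hand with a self-join:
-- take the row's own value if present, otherwise the pointed-to row's value.
SELECT p.id,
       COALESCE(p.field1, src.field1) AS field1,
       COALESCE(p.field2, src.field2) AS field2,
       COALESCE(p.field3, src.field3) AS field3,
       p.color_id
FROM products AS p
LEFT JOIN products AS src ON src.id = p.object_id
ORDER BY p.id;
```
This only resolves one level of indirection, and every query, report and index has to repeat the same logic, which is the relationship functionality being recreated by hand.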

The way you would want to implement this in a database would be to create two tables:
object_id | field1 | field2 | field3
and
instance_id | object_id | colour
The rows of the second table would then point to the first, and you could generate the full table of data you want on the fly (calling the first table t1 and the second t2) with
select t1.*, t2.colour from t1 join t2 on (t1.object_id=t2.object_id)

You should probably have two tables with a foreign key relationship.
Example
Products:
Id
field1
field2
field3
ProductColors:
Id
ProductId
Color

Related

Dynamic value to display numbers of entries in second table

I've got multiple entries in table A and would like to display the number of entries in a column of table B. Is there a way to create dynamic cell content that displays the number of entries in a table?
I'm a beginner in MySQL and did not find a way to do it so far.
Example table A:
+----+------+------------+
| id | name | birthday |
+----+------+------------+
| 1 | john | 1976-11-18 |
| 2 | bill | 1983-12-21 |
| 3 | abby | 1991-03-11 |
| 4 | lynn | 1969-08-02 |
| 5 | jake | 1989-07-29 |
+----+------+------------+
What I'd like in table B:
+----+------+----------+
| id | name | numusers |
| 1 | tblA | 5 |
+----+------+----------+
In my actual database there is no incrementing ID, so just taking the last value would not work, if that would even have been a solution.
If MySQL can't handle this the option would be to create some kind of cronjob on my server reading the number of rows and writing them into that cell. I know how to do this - just checking if there's another way.
I'm not looking for a command to run on the mysql-console. What I'm trying to figure out is if there's some option which dynamically changes the cell's value to what I've described above.

You can create a view that will give you this information. The SQL for this view is inspired by an answer to a similar question:
CREATE VIEW table_counts AS
SELECT table_name, table_rows
FROM information_schema.tables
WHERE table_schema = '{your_db}';
The view will have the cells you speak of. As you can see, it is just a filter on an already existing table, so you might consider the table information_schema.tables itself to be the answer to your question. Note that for InnoDB tables, table_rows is an estimate rather than an exact count.

You can do that directly with COUNT(), for example SELECT COUNT(*) FROM TblA. That gives you the number of rows in the table. If your indexes are OK, it is very fast. If you write the count to another table, you still have to make a request to read it from that second table, so I think you can just do it directly.
If you have performance problems, there are other possibilities, such as triggers or stored procedures that calculate the result and save it in a MEMORY table for better performance.
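If an exact, always-current value is wanted without a cron job, a view over COUNT(*) is one option. This is a sketch only: tblA stands in for "table A", the view name tblB_counts and the column types are assumptions.
```sql
CREATE TABLE tblA (id INT PRIMARY KEY, name VARCHAR(50), birthday DATE);
INSERT INTO tblA VALUES
    (1, 'john', '1976-11-18'), (2, 'bill', '1983-12-21'),
    (3, 'abby', '1991-03-11'), (4, 'lynn', '1969-08-02'),
    (5, 'jake', '1989-07-29');
-- The view is evaluated on every read, so numusers always reflects
-- the current number of rows in tblA.
CREATE VIEW tblB_counts AS
SELECT 1 AS id, 'tblA' AS name, COUNT(*) AS numusers
FROM tblA;
SELECT * FROM tblB_counts;  -- numusers is 5 for the rows above
```
The count is computed at query time rather than stored, so nothing has to be kept in sync; the trade-off is that each read of the view runs the COUNT.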

SQL table logic, which way is correct

I am having an argument with myself over the best way to do this.
Here is the scenario.
I have a table that has a list of users in it.
For simplicity let's say it looks like this:
Mike | male
Amy | female
Andy | male
and so on.
I then have a list of colours, let's say 4 colours:
red
blue
green
yellow
any of the users can have one or more of the colours assigned to them.
Do I add a new column to the users table called assignedColours, create a new table called colours that looks like this:
id | colour
1 | red
2 | blue
3 | green
4 | yellow
and then store an array in the users' assignedColours column like
Mike | Male | 2,3
Amy | Female | 1,3,4
Or do I create a colour table with columns of the colours and assign the users to that column like:
Red | Blue | Green | Yellow
| Mike | Mike |
Amy | | Amy | Amy
Or is there a better way of doing this altogether?
I am looking for an answer as to which one is the preferred way and why.

Your first solution will give you problems if you need to search by colours.
Your second would give you extra work when adding more colours.
An additional table joining people and colours would be a good way to go. Check out information on many-to-many relationships: http://www.singingeels.com/Articles/Understanding_SQL_Many_to_Many_Relationships.aspx.
You would want a UserColours table, as well as the users and colours tables.
It has two columns: UserId and ColourId.
Put the primary key on (UserId, ColourId) so that there are no duplicates.

Why not create an adjacency table? It will allow for easier joins and setting foreign keys.
Example:
Users
id| name | gender
-----------------
1 | Mike | male
2 | Amy | female
3 | Andy | male
Colors
id | name
1 | red
2 | blue
3 | green
4 | yellow
UserColors
user_id | color_id
1 | 3
1 | 2
2 | 1
3 | 2
The UserColors table allows you to associate the Users with the Colors.
The concept is known as an adjacency table or join table and is used to map one-to-many and many-to-many relations.
You can find a more developed example here: http://www.joinfu.com/2005/12/managing-many-to-many-relationships-in-mysql-part-1/

Well, the idea behind an RDS (Relational Data Store) such as MySQL is to have the data normalized and thus easily searchable.
Because of that, your best bet is to keep a table of colors, a table of users and a many-to-many table storing the user-color pairs. Their definitions would be something along these lines:
users table
id | int
name | varchar
gender | varchar
colors table
id | int
name | varchar
users_colors table
id | int
user_id | int
color_id | int
This way, you can easily find all users having a certain color, whereas with an un-normalized schema you would run into a problem: how would you query for users with a certain color and not another?
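A sketch of the three tables and the "one color but not another" query in MySQL. The column types are assumptions, and this version uses the composite primary key suggested earlier in the thread instead of a separate id column on users_colors.
```sql
CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(50), gender VARCHAR(10));
CREATE TABLE colors (id INT PRIMARY KEY, name VARCHAR(20));
CREATE TABLE users_colors (
    user_id INT NOT NULL,
    color_id INT NOT NULL,
    PRIMARY KEY (user_id, color_id),  -- rejects duplicate assignments
    FOREIGN KEY (user_id) REFERENCES users (id),
    FOREIGN KEY (color_id) REFERENCES colors (id)
);
INSERT INTO users VALUES
    (1, 'Mike', 'male'), (2, 'Amy', 'female'), (3, 'Andy', 'male');
INSERT INTO colors VALUES
    (1, 'red'), (2, 'blue'), (3, 'green'), (4, 'yellow');
-- Assignments from the question: Mike has 2,3 and Amy has 1,3,4.
INSERT INTO users_colors VALUES (1, 2), (1, 3), (2, 1), (2, 3), (2, 4);
-- Users who have green but not blue.
SELECT u.name
FROM users AS u
JOIN users_colors AS uc ON uc.user_id = u.id
JOIN colors AS c ON c.id = uc.color_id
WHERE c.name = 'green'
  AND NOT EXISTS (
      SELECT 1
      FROM users_colors AS uc2
      JOIN colors AS c2 ON c2.id = uc2.color_id
      WHERE uc2.user_id = u.id AND c2.name = 'blue'
  );
```
With this data the query returns Amy: Mike also has green, but he is excluded because he has blue. The same question against a comma-separated assignedColours column would need string matching on every row.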

Several separated tables vs one integrated table with an additional column?

I have 3 tables, all of which have the same structure:
// table1 // table2 // table3
+----+------+ +----+------+ +----+------+
| id | name | | id | name | | id | name |
+----+------+ +----+------+ +----+------+
| 1 | jack | | 1 | ali | | 1 | peter|
+----+------+ +----+------+ +----+------+
I want to know whether my current structure is better, or a single integrated table with one additional column, something like this:
+----+------+-------+
| id | name | which |
+----+------+-------+
| 1 | jack | table1|
| 2 | ali | table2|
| 3 | peter| table3|
+----+------+-------+
Note: with the current structure (several tables), my query is something like this:
select id, name from table1
union all
select id, name from table2
union all
select id, name from table3
Now I want to know whether converting those several tables into one table and adding a new column is better or not. (I think that new column is kind of an overhead; is that true?)

This has practical consequences and also philosophical consequences. From a practical point of view, it's very hard to say without knowing a lot more about how the data is going to be used. What's the read-to-write ratio for this data? How often is data from two or more tables going to be selected in a single query? If you have to do a UNION to gather all the data, it's both slower and more cumbersome.
I prefer the philosophical approach, starting with the subject matter. Is there only one kind of entity here, or are there three different entities that all happen to have the same attributes? That nearly always tells me whether to put them in the same table or not, and also turns out to give the right answer to the practical issue as well, most of the time.
I will say that I would be looking around for some better name for the values of the extra attribute. "table1", "table2" and "table3" seem terribly opaque to me. The subject matter should provide a clue here as well.
Edit:
Now that I get the subject matter, I'm going to opine in favor of a single table. It is an opinion rather than a hard and fast rule. So it would be something like:
+----+-----------+----------+--------------+
| id | word | language |translation |
+----+-----------+----------+--------------+
| 1 | butterfly | Spanish | mariposa |
| 2 | butterfly | French | papillon |
| 3 | butterfly | Italian | farfalla |
| 4 | chair | Spanish | silla |
+----+-----------+----------+--------------+

If you are sure that all three tables will continue to have the same attributes, then the single-table option is fine; if that may not hold, don't go down that route.
This thread may help you more.
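If the single-table route is chosen, the merge itself could look something like this in MySQL. The table name combined, the column types and the index are assumptions; the which values are kept as in the question, although more descriptive values would be preferable.
```sql
CREATE TABLE table1 (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE table2 (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE table3 (id INT PRIMARY KEY, name VARCHAR(50));
INSERT INTO table1 VALUES (1, 'jack');
INSERT INTO table2 VALUES (1, 'ali');
INSERT INTO table3 VALUES (1, 'peter');
-- New ids are generated, because the old ids collide across tables.
CREATE TABLE combined (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50),
    which VARCHAR(20) NOT NULL,
    INDEX idx_which (which)
);
INSERT INTO combined (name, which)
SELECT name, 'table1' FROM table1
UNION ALL
SELECT name, 'table2' FROM table2
UNION ALL
SELECT name, 'table3' FROM table3;
-- The UNION ALL query becomes a plain select...
SELECT id, name FROM combined;
-- ...and what used to be one table is now a filter.
SELECT id, name FROM combined WHERE which = 'table2';
```
Anything that referenced the old per-table ids would need remapping, since the combined table assigns fresh ones.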

MySQL Table structure: Multiple attributes for each item

I wanted to ask what the best approach would be for my MySQL database structure in the following case.
I've got a table with items, which I don't need to describe, as the only important field here is the ID.
Now, I'd like to be able to assign some attributes to each item - by its ID, of course. But I don't know exactly how to do it, as I'd like to keep it dynamic (so, I do not have to modify the table structure if I want to add a new attribute type).
What I think
I think (and this is in fact the structure I have right now) that I can make a table items_attributes with the following structure:
+----+---------+----------------+-----------------+
| id | item_id | attribute_name | attribute_value |
+----+---------+----------------+-----------------+
| 1 | 1 | place | Barcelona |
| 2 | 2 | author_name | Matt |
| 3 | 1 | author_name | Kate |
| 4 | 1 | pages | 200 |
| 5 | 1 | author_name | John |
+----+---------+----------------+-----------------+
I included example data so you can see that those attributes can be repeated (it's not a 1-to-1 relation).
The problem with this approach
I need to run some queries, some of them for statistical purposes, and if I have a lot of attributes for a lot of items, this can be a bit slow.
Furthermore, maybe because I'm not an expert in MySQL, every time I want to search for "those items that have 'place' = 'Barcelona' AND 'author_name' = 'John'", I end up having to make multiple JOINs, one for every condition.
Repeating the example before, my query would end up like:
SELECT *
FROM items its
JOIN items_attributes attr
ON its.id = attr.item_id
AND attr.attribute_name = 'place'
AND attr.attribute_value = 'Barcelona'
AND attr.attribute_name = 'author_name'
AND attr.attribute_value = 'John';
As you can see, this will return nothing, as an attribute_name cannot have two values at once in the same row, and an OR condition would not be what I'm looking for, as the items MUST have both attribute values as stated.
So the only possibility is to JOIN the same table again for every condition in the search, which I think is very slow when there are a lot of terms to search for.
What I'd like
As I said, I'd like to keep the attribute types dynamic, so that adding a new value in 'attribute_name' is enough, without having to add a new column to a table. Also, as this is a 1-N relationship, the attributes cannot be put in the 'items' table as new columns.
If this structure is, in your opinion, the only one that can achieve what I want, it would be great if you could suggest some ideas so that the search queries are not a ton of JOINs.
I don't know if this is hard to do; I've been racking my brain over it and haven't come up with a solution. I hope you can help me with that!
In any case, thank you for your time and attention!
Kind regards.

You're thinking in the right direction, the direction of normalization. The normal form you would like to have in your database is the fifth normal form (or even the sixth). Stackoverflow on this matter.
Table Attribute:
+----+----------------+
| id | attribute_name |
+----+----------------+
| 1 | place |
| 2 | author name |
| 3 | pages |
+----+----------------+
Table ItemAttribute
+--------+----------------+
| item_id| attribute_id |
+--------+----------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
+--------+----------------+
So for each property of an object (item in this case) you create a new table and name it accordingly. It requires lots of joins, but your database will be highly flexible and organized. Good luck!

In my opinion it should be something like this. I know there are a lot of tables, but it actually normalizes your DB.
That may be because I can't understand where you get your att_value column from, and what this column should contain.
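For the "must have both attribute values" search from the question, one way to avoid a JOIN per condition is to filter on any of the wanted pairs and then count matches per item. This is a sketch against the question's items_attributes table; the column types and the index are assumptions, and COUNT(DISTINCT a, b) with two expressions is MySQL-specific syntax.
```sql
CREATE TABLE items_attributes (
    id INT PRIMARY KEY,
    item_id INT NOT NULL,
    attribute_name VARCHAR(50) NOT NULL,
    attribute_value VARCHAR(100) NOT NULL,
    INDEX idx_name_value (attribute_name, attribute_value, item_id)
);
INSERT INTO items_attributes VALUES
    (1, 1, 'place', 'Barcelona'),
    (2, 2, 'author_name', 'Matt'),
    (3, 1, 'author_name', 'Kate'),
    (4, 1, 'pages', '200'),
    (5, 1, 'author_name', 'John');
-- Keep rows matching ANY wanted pair, then require that ALL pairs
-- were found for the item (2 = number of pairs searched for).
SELECT item_id
FROM items_attributes
WHERE (attribute_name, attribute_value) IN
      (('place', 'Barcelona'), ('author_name', 'John'))
GROUP BY item_id
HAVING COUNT(DISTINCT attribute_name, attribute_value) = 2;
```
For the sample rows this returns item_id 1. Adding a third condition means adding one pair to the IN list and changing the 2 to 3, rather than adding another JOIN; the result can be joined back to items if the full rows are needed.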

mysql select from 2 other columns in the same table

I have a table which looks like this but much longer...
| CategoryID | Category | ParentCategoryID |
+------------+----------+------------------+
| 23 | Screws | 3 |
| 3 | Packs | 0 |
I am aiming to retrieve one column from this which in this instance would give me the following...
| Category |
+--------------+
| Packs/Screws |
Please excuse me for not knowing exactly how to word this. So far I can only think of splitting the whole table into multiple tables and using LEFT JOIN; however, this seems like a very good learning opportunity.
I realise that CONCAT() will come into play when combining the two retrieved Category names but beyond that I am stumped.

-- x is the child row, y is its parent; the parent name goes first
SELECT CONCAT(y.category,'/',x.category) Category
FROM my_table x
JOIN my_table y
ON y.categoryid = x.parentcategoryid
[WHERE y.parentcategoryid = 0]
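As a hedged variation on the self-join, the sketch below also keeps top-level categories in the result. The table name my_table follows the answer; the column types and the aliases child and parent are assumptions.
```sql
CREATE TABLE my_table (
    CategoryID INT PRIMARY KEY,
    Category VARCHAR(50) NOT NULL,
    ParentCategoryID INT NOT NULL
);
INSERT INTO my_table VALUES (23, 'Screws', 3), (3, 'Packs', 0);
-- child = the row being displayed, parent = the row it points to.
-- LEFT JOIN keeps rows whose parent id (0) matches nothing;
-- CONCAT_WS skips the resulting NULL parent name instead of returning NULL.
SELECT CONCAT_WS('/', parent.Category, child.Category) AS Category
FROM my_table AS child
LEFT JOIN my_table AS parent
       ON parent.CategoryID = child.ParentCategoryID;
```
For the two sample rows this yields 'Packs/Screws' and 'Packs'. It resolves a single level only; deeper hierarchies would need one more join per level or a recursive query.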