Inner Join Question


CREATE TABLE college
(
id SERIAL PRIMARY KEY,
SCHOOL VARCHAR(100),
CColor VARCHAR(100),
CCmascot VARCHAR(100)
);
CREATE TABLE mats
(
id SERIAL PRIMARY KEY,
CColor VARCHAR(100),
CCNAME VARCHAR(100)
);
MYSQL
OK, so here is the problem. I think it's pretty simple, but I am not getting it right. The SCHOOL name is passed to me through the URL and I read it with $_GET. Now I need to write the query:
Using the SCHOOL name, I need to get the CCOLOR and the CCNAME.

Your question is unclear so an answer can only be approximated.
You need columns in both tables that can be used to join them, that is, columns whose values identify when a record in the parent table (college) matches one or more records in the child table (mats). Ideally you would have a foreign key in the child table mats, which could be named college_id (a naming convention that references the parent table).
Given a foreign key like the one mentioned above, your query would become
select
college.ccolor
from
college inner join mats
on college.id = mats.college_id
where
mats.ccname = "<<COLOUR_DESIRED>>";
assuming ccname is the name of ccolor.

You have the college name and you wish to find out the colour name, if I understand correctly.
The linking attribute is CColor.
Your query should look a little bit like this:
select
m.ccname, m.ccolor
from
mats m
inner join
college c
on
c.ccolor = m.ccolor
where
c.school = ? -- placeholder: bind the school name here
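Here is a minimal, runnable sketch of that join. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and it assumes CColor holds a colour code shared by both tables and CCNAME holds the colour's name. The sample rows and the helper name colour_for_school are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE college (id INTEGER PRIMARY KEY, SCHOOL TEXT, CColor TEXT, CCmascot TEXT);
CREATE TABLE mats (id INTEGER PRIMARY KEY, CColor TEXT, CCNAME TEXT);
INSERT INTO college (SCHOOL, CColor, CCmascot) VALUES
    ('State U', 'C01', 'Tiger'),
    ('Tech College', 'C02', 'Owl');
INSERT INTO mats (CColor, CCNAME) VALUES
    ('C01', 'Crimson'),
    ('C02', 'Navy');
""")


def colour_for_school(school):
    """Return (CColor, CCNAME) rows for one school (hypothetical helper)."""
    # The school name is bound as a parameter, never concatenated into the SQL.
    query = """
        SELECT m.CColor, m.CCNAME
        FROM mats m
        INNER JOIN college c ON c.CColor = m.CColor
        WHERE c.SCHOOL = ?
    """
    return conn.execute(query, (school,)).fetchall()


rows = colour_for_school("State U")
print(rows)
```

In PHP the same idea would be a prepared statement with the $_GET value bound as the parameter, rather than pasted into the query string.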

Database Tip of the Day: Use Foreign Key constraints, or you will have data corruption problems, and people on SO will have no idea how your columns relate to each other.
When you know the whys and the whatfors of relational modeling, you might find it necessary to go without them (although it's not recommended unless you have a really good reason), but for now, use them to explicitly define how the tables relate to each other.
Otherwise your question is kind of like asking a chef, "I have some unlabeled jars of food and what I think is oregano. How do I cook a romantic dinner for two?" (Umm.. what's in the jars??)
Foreign key documentation: http://dev.mysql.com/doc/refman/5.1/en/ansi-diff-foreign-keys.html
Join documentation: http://dev.mysql.com/doc/refman/5.1/en/join.html

SELECT college.CColor FROM college
INNER JOIN mats ON college.CColor = mats.CColor
AND mats.CColor = 'your query'

Related

What is the point of providing a JOIN condition when there are foreign keys?

TL;DR: Why do we have to add ON table1.column = table2.column?
This question asks roughly why do we need to have foreign keys if joining works just fine without them. Here, I'd like to ask the reverse. Given the simplest possible database, like this:
CREATE TABLE class (
class_id INT PRIMARY KEY,
class_name VARCHAR(40)
);
CREATE TABLE student (
student_id INT PRIMARY KEY,
student_name VARCHAR(40),
class_id INT,
FOREIGN KEY(class_id) REFERENCES class(class_id) ON DELETE SET NULL
);
… and a simple join, like this:
SELECT student_id, student_name, class_name
FROM student
JOIN class
ON student.class_id = class.class_id;
… why can't we just omit the ON clause?
SELECT student_id, student_name, class_name
FROM student
JOIN class;
To me, the line FOREIGN KEY(class_id) REFERENCES class(class_id) … in the definition of student already includes all the necessary information for the FROM student JOIN class to have an implicit ON student.class_id = class.class_id condition; but we still have to add it. Why is that?

To see why, consider what the JOIN operation does. It doesn't check whether your two tables have a relation or not. So a simple join without a condition (ON) gives you a big result with every possible combination of rows (MySQL accepts this form and treats it as a cross join).
The ON clause filters that down to the result you expect.
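To make the difference concrete, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL (SQLite also treats a JOIN without ON as a cross join). The tables are the ones from the question; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE class (
    class_id INTEGER PRIMARY KEY,
    class_name TEXT
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    student_name TEXT,
    class_id INTEGER REFERENCES class(class_id) ON DELETE SET NULL
);
INSERT INTO class VALUES (1, 'Maths'), (2, 'Physics');
INSERT INTO student VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', 1);
""")

# No ON clause: every student is paired with every class (3 x 2 rows).
unfiltered = conn.execute(
    "SELECT student_name, class_name FROM student JOIN class"
).fetchall()

# With ON: only the pairs whose class_id values match survive.
filtered = conn.execute("""
    SELECT student_name, class_name
    FROM student
    JOIN class ON student.class_id = class.class_id
    ORDER BY student.student_id
""").fetchall()

print(len(unfiltered), filtered)
```

The foreign key declaration plays no part in either query; it only restricts which class_id values may be stored.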

Reposting Damien_The_Unbeliever's comment as an answer
you don't have to join on foreign keys;
sometimes multiple foreign keys exist between the same pair of tables.
Also, SQL is a crusty language without many shortcuts for the most common use case.

A JOIN condition is an expression that specifies the matching criteria, and it is checked during the JOIN process. It can cause a failure only if a syntax error occurs.
A FOREIGN KEY is a rule for the data consistency checking subsystem, and it is checked when data changes. It will cause a failure if the data state (intermediate and/or final) does not match the rule.
In other words, there is nothing in common between them, they are completely different and unrelated things.
I feel like I have to reiterate parts of the question. Please, give it a second read - Dima Parzhitsky
Imagine that your proposal is accepted. I have tables:
CREATE TABLE users (userid INT PRIMARY KEY);
CREATE TABLE messages (sender INT REFERENCES users (userid),
receiver INT REFERENCES users (userid));
I write SELECT * FROM users JOIN messages.
Which reference should be used for the join condition? And justify your assumption...
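A quick sketch of that ambiguity, using Python's sqlite3 module in place of MySQL. The name column and the sample rows are additions for illustration only. Both foreign keys are equally valid join paths, and they give different answers:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT);  -- name is a made-up column
CREATE TABLE messages (
    sender INTEGER REFERENCES users (userid),
    receiver INTEGER REFERENCES users (userid)
);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO messages VALUES (1, 2);  -- alice wrote to bob
""")

# Joining through the sender foreign key...
by_sender = conn.execute(
    "SELECT u.name FROM users u JOIN messages m ON m.sender = u.userid"
).fetchall()

# ...and through the receiver foreign key: same tables, different result.
by_receiver = conn.execute(
    "SELECT u.name FROM users u JOIN messages m ON m.receiver = u.userid"
).fetchall()

print(by_sender, by_receiver)
```

An implicit join condition would have to pick one of these, or combine them somehow, and there is no obviously right choice.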

Why are multiple indexes being generated for a table?

I've just realized that one of my tables, "pclass", has multiple instances of several foreign keys. In the Structure tab, #2-5 are the foreign keys. I have no idea why multiple instances are being generated.
Could they be generated by the JOINS? Please let me know if I need to provide other information.
$brother_id = (int) $_GET["brother_id"]; // cast to int: htmlspecialchars() does not make a value safe to embed in SQL
$selected = $brother_id;
$query_brotherId = "SELECT b.id, b.firstname, b.lastname, b.pname, b.country, b.street01, b.street02, b.city, usStates.abv AS us_state, b.intl_state, b.postalcode, b.zipcode, b.phone, b.email, pclass.id AS pclass_id, greekAlphabet.name AS pclass01, prepclass.name AS prepclass, pclassSuffix.name AS pclass02, semester.name AS pclass_sem, pclass.year AS pclass_year, b.bigbrother_id AS bbID, bb.firstname AS bbFirst, bb.lastname AS bbLast, b.status, b.comments
FROM brothers AS b
LEFT JOIN pclass ON b.pclass_id = pclass.id
LEFT JOIN prepclass ON pclass.prepclass_id = prepclass.id
LEFT JOIN greekAlphabet ON pclass.greekAlphabet_id = greekAlphabet.id
LEFT JOIN pclassSuffix ON pclass.suffix_id = pclassSuffix.id
LEFT JOIN semester ON pclass.semester_id = semester.id
LEFT JOIN usStates ON b.us_state = usStates.id
LEFT JOIN brothers AS bb ON b.bigbrother_id = bb.id
WHERE b.id = $brother_id";
$result_brotherId = mysqli_query($link, $query_brotherId);

First your question:
Could they be generated by the JOINS?
No. Foreign Keys are generated by data definition statements like CREATE TABLE, ALTER TABLE and so on.
I have no idea why multiple instances are being generated.
The person who created the database must have thought they would be useful. Or, if the database was created with some SQL tool (I don't know which), the tool created the foreign keys because it was told there is a relation between those fields.
Why it is probably not bad to have the keys:
Foreign keys are created to express the relations between your different tables.
They also enforce a specific behaviour when you perform actions that could disrupt the integrity of your data. You can change this behaviour in your last screenshot.
For each foreign key you can give a name, which will be shown in error messages when you try to act against the constraint. You can also define how the foreign key acts if you change or delete the parent field.
For example
You have the following tables displaying which tool belongs to which person.
persons
personid
firstname
lastname
...
tools
toolid
personid (foreign key to persons)
name
....
So in the tools table you have a foreign key to the persons table, this field defines the owner of the tool.
Now let's define some use cases
Assumption: for some reason Peter is no longer able to wield any tools, so he no longer fits into the database.
What should happen to his tools? It depends on what your database represents!
1) Your database lists anyone who ever owned a tool.
This means that even if the person no longer exists, the data should remain. In practice you would probably enforce this in some other way, but it works in our current case to show what the foreign key can do.
So the action we choose for ON DELETE is RESTRICT. (It is also the default action.)
Now let's try to call: DELETE FROM persons WHERE firstname = 'Peter'
Result: the foreign key constraint gives you an error message, because there are rows that depend on this entry in the persons table.
2) The database lists persons and some tools; tools don't have to have an owner.
In this case we again want to delete the person Peter. His tools can remain in the database; instead of the personid, this field will hold a NULL value.
So we choose the action ON DELETE: SET NULL
This one is pretty straightforward. Important: the field with the foreign key must not have a NOT NULL constraint.
3) The database lists the people and the tools in a building or something similar.
So if Peter and his tools leave the building, we don't care about them anymore.
The action for ON DELETE: CASCADE.
If you now run the DELETE statement, the foreign key will take care of deleting all the other entries (the tools) connected to Peter.
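The three cases can be tried side by side. This is only a sketch: it uses Python's sqlite3 module instead of MySQL (SQLite needs PRAGMA foreign_keys = ON before it enforces foreign keys), the tables are cut down to the columns that matter, and the function name delete_peter is made up.

```python
import sqlite3


def delete_peter(on_delete):
    """Build persons/tools with the given ON DELETE action, delete Peter,
    and return what is left in tools (or a marker if the delete is refused)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    conn.executescript(f"""
    CREATE TABLE persons (personid INTEGER PRIMARY KEY, firstname TEXT);
    CREATE TABLE tools (
        toolid INTEGER PRIMARY KEY,
        personid INTEGER REFERENCES persons(personid) ON DELETE {on_delete},
        name TEXT
    );
    INSERT INTO persons VALUES (1, 'Peter');
    INSERT INTO tools VALUES (1, 1, 'hammer');
    """)
    try:
        conn.execute("DELETE FROM persons WHERE firstname = 'Peter'")
    except sqlite3.IntegrityError:
        return "delete rejected"
    return conn.execute("SELECT toolid, personid, name FROM tools").fetchall()


results = {action: delete_peter(action) for action in ("RESTRICT", "SET NULL", "CASCADE")}
print(results)
```

With RESTRICT the delete is refused, with SET NULL the hammer stays but loses its owner, and with CASCADE the hammer disappears together with Peter.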

How to store a graph of sql tables

Let's say we have quite a few tables (T1, T2... T50), and we would like to have n-to-n relations between all of them.
What would be a proper way of implementing that?
Having a relations table for each pair of Tx and Ty would not be practical if the number of tables goes up to 100 or more.
The current solution I have is
relationships_table
id_x, table_name_x, id_y, table_name_y
for storing all the relationships. This way adding new tables is trivial, but what are the disadvantages?
1) What is a better way of supporting such a use case, if we're limited to sql?
2) How to efficiently solve this if we're not limited to sql?

The solution you proposed is the most reasonable solution to the stated problem. But the problem seems somewhat unreasonable.
If you need a graph, then you only need two tables, one for the nodes and another one for the edges.
If some nodes are of specific types then you can have extra specialization tables for them.
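As a rough sketch of the two-table layout, here it is in Python's sqlite3 module (standing in for whatever SQL database you use). The table names node and edge, the node_type column, and the sample data are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- One row per vertex, whatever kind of thing it represents.
CREATE TABLE node (
    node_id INTEGER PRIMARY KEY,
    node_type TEXT NOT NULL,
    label TEXT
);
-- One row per directed edge between two vertices.
CREATE TABLE edge (
    from_id INTEGER NOT NULL REFERENCES node(node_id),
    to_id INTEGER NOT NULL REFERENCES node(node_id),
    PRIMARY KEY (from_id, to_id)
);
INSERT INTO node VALUES (1, 'person', 'Ann'), (2, 'place', 'Oslo'), (3, 'company', 'Acme');
INSERT INTO edge VALUES (1, 2), (1, 3);
""")


def neighbours(node_id):
    """Return (type, label) of every node reachable by one outgoing edge."""
    return conn.execute("""
        SELECT n.node_type, n.label
        FROM edge e
        JOIN node n ON n.node_id = e.to_id
        WHERE e.from_id = ?
        ORDER BY n.node_id
    """, (node_id,)).fetchall()


print(neighbours(1))
```

Type-specific attributes would then live in specialization tables whose primary key is also a foreign key to node.node_id.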

Add only the essential Relation tables. tblA relates to tblB, and tblB relates to tblC. So, usually that implies that you can get from A to C via
FROM tblA
JOIN tblB ON ...
JOIN tblC ON ...
Won't this do? It should need not much more than 50 extra tables, and it would be a lot cleaner.

I ran into the same problem and took a slightly different approach. I added a table called relationable, storing only an id, and every table appearing in the graph has a reference to this table. I make sure on my own that only one element in the whole database references a given relationable entry (this is actually what bothers me the most, but in practice it is not much of a problem, it just doesn't look nice). Then there is a relation table for the n-to-n relationship between relationable entries.
To make my point, here is an example I made in MySQL.
CREATE TABLE relationable
(
relationable_id INT AUTO_INCREMENT PRIMARY KEY
) ENGINE=INNODB;
In the relation table I added a name, because my edges have names; there might even be multiple edges between two nodes with different names.
CREATE TABLE relation
(
from_id INT NOT NULL,
to_id INT NOT NULL,
name VARCHAR(255) NOT NULL,
FOREIGN KEY (from_id) REFERENCES relationable(relationable_id) ON DELETE CASCADE,
FOREIGN KEY (to_id) REFERENCES relationable(relationable_id) ON DELETE CASCADE
)ENGINE=INNODB;
Finally, a table which appears in the graph would look like the following:
CREATE TABLE place
(
place_id INT NOT NULL PRIMARY KEY,
name VARCHAR(255),
FOREIGN KEY (place_id) REFERENCES relationable(relationable_id)
ON DELETE CASCADE
) ENGINE=INNODB;
Now obviously this has pros and cons,
cons
You need to make sure yourself that a relationable is only referenced once. Inside one table this is taken care of by PRIMARY KEY but over all tables this is not done.
You might need a huge int for the id of relationable.
The table relation might get quite big.
pros
To erase an entry and all its relations, deleting the relationable entry suffices; all entries in relation and in the respective table will be deleted.
When joining two tables there is no need for the relationable table.
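Both pros can be checked with a small sketch. It uses Python's sqlite3 module rather than MySQL, assumes place_id is the primary key of place, and adds a second graph table, person, purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
CREATE TABLE relationable (relationable_id INTEGER PRIMARY KEY);
CREATE TABLE relation (
    from_id INTEGER NOT NULL REFERENCES relationable(relationable_id) ON DELETE CASCADE,
    to_id INTEGER NOT NULL REFERENCES relationable(relationable_id) ON DELETE CASCADE,
    name TEXT NOT NULL
);
CREATE TABLE place (
    place_id INTEGER PRIMARY KEY REFERENCES relationable(relationable_id) ON DELETE CASCADE,
    name TEXT
);
-- person is a hypothetical second table taking part in the graph.
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY REFERENCES relationable(relationable_id) ON DELETE CASCADE,
    name TEXT
);
INSERT INTO relationable VALUES (1), (2);
INSERT INTO person VALUES (1, 'Ann');
INSERT INTO place VALUES (2, 'Oslo');
INSERT INTO relation VALUES (1, 2, 'lives_in');
""")

# Joining two graph tables goes through relation only, not relationable.
linked = conn.execute("""
    SELECT pe.name, r.name, pl.name
    FROM person pe
    JOIN relation r ON r.from_id = pe.person_id
    JOIN place pl ON pl.place_id = r.to_id
""").fetchall()

# Deleting the relationable entry removes the place and its relations.
conn.execute("DELETE FROM relationable WHERE relationable_id = 2")
remaining_places = conn.execute("SELECT COUNT(*) FROM place").fetchone()[0]
remaining_relations = conn.execute("SELECT COUNT(*) FROM relation").fetchone()[0]

print(linked, remaining_places, remaining_relations)
```

Nothing here stops a person row and a place row from sharing one relationable_id, which is exactly the first con listed above.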

MySQL - Very simple Join is taking too long

This is my first question in stackoverflow and I am delighted to be part of this community because it has helped me many times.
I'm not an expert in SQL and MySQL but I'm working in a project that needs large tables (million rows). I have a problem when doing a join and I don't understand why it takes so long. Thanks in advance:)
Here are the tables:
CREATE TABLE IF NOT EXISTS tabla_maestra(
id int UNIQUE,
codigo_alta char(1),
nombre varchar(100),
empresa_apellido1 varchar(150),
apellido2 varchar(50),
tipo_via varchar(20),
nombre_via varchar(100),
numero_via varchar(50),
codigo_via char(5),
codigo_postal char(5),
nombre_poblacion varchar(100),
codigo_ine char(11),
nombre_provincia varchar(50),
telefono varchar(250) UNIQUE,
actividad varchar(100),
estado char(1),
codigo_operadora char(3)
);
CREATE TABLE IF NOT EXISTS tabla_actividades_empresas(
empresa_apellido1 varchar(150),
actividad varchar(100)
);
Here is the query I want to do:
UPDATE tabla_maestra tm
INNER JOIN tabla_actividades_empresas tae
ON (tm.nombre!='' AND tae.empresa_apellido1=tm.empresa_apellido1)
SET tm.actividad=tae.actividad;
This query takes too long, so before executing it I tried to test how long this simpler query takes:
SELECT COUNT(*) FROM tabla_maestra tm
INNER JOIN tabla_actividades_empresas tae
ON (tm.nombre!='' AND tae.empresa_apellido1=tm.empresa_apellido1);
It is still taking too long, and I don't understand why. Here are the indexes I use:
CREATE INDEX cruce_nombre
USING HASH
ON tabla_maestra (nombre);
CREATE INDEX cruce_empresa_apellido1
USING HASH
ON tabla_maestra (empresa_apellido1);
CREATE INDEX index_actividades_empresas
USING HASH
ON tabla_actividades_empresas(empresa_apellido1);
If I use the EXPLAIN statement, these are the results:
http://oi59.tinypic.com/2zedoy0.jpg
I would be so grateful to receive any answer that could help me. Thanks a lot,
Dani.

A join involving half a million rows -- as your query plan shows -- is bound to take some time. The count(*) query is quicker because it doesn't need to read the tabla_maestra table itself, but it still needs to scan all the rows of index cruce_empresa_apellido1.
It might help some if you made index index_actividades_empresas a unique index (supposing that that's indeed appropriate) or if instead you drop that index and make column empresa_apellido1 a primary key of table tabla_actividades_empresas.
If even that does not give you sufficient performance, then the only other thing I see to do is to give table tabla_actividades_empresas a synthetic primary key of integer type, and to change the corresponding column of tabla_maestra to match. That should help because comparing an integer to an integer is faster than comparing a string to a string, even when you can filter out (most) mismatches via a hash.

I agree with the others (see John Bollinger, for example) about the lack of primary keys. A primary key is highly advised for IDs (I noticed you worry about repeated values, but a primary key handles that smoothly too; I mean MySQL's AUTO_INCREMENT).
Why do you join on tabla_actividades_empresas.empresa_apellido1 instead of referencing tabla_maestra's ID?
If you did, you could define a foreign key to it, for example tabla_actividades_empresas.maestra_id.
Joins generally perform better when the tables are associated through non-string types.
You can also filter a table in a subquery before the JOIN. Here is an example:
UPDATE (SELECT * FROM tabla_maestra WHERE nombre != '') AS tm
INNER JOIN tabla_actividades_empresas AS tae
ON tae.empresa_apellido1 = tm.empresa_apellido1
SET tm.actividad = tae.actividad;
I have not tested it, but it seems like a reasonable pattern to follow.
Oh... do you need to update all the rows every time? If not, you can update only the missing ones: use a LEFT JOIN to determine which rows need updating, then apply the UPDATE with the INNER JOIN to just those. Does that make sense? I'm no expert, but it may be useful to think about.
EDIT
You may also test the multi-table form:
UPDATE tabla_maestra AS main, tabla_actividades_empresas AS aggr
SET main.actividad = aggr.actividad
WHERE main.empresa_apellido1 = aggr.empresa_apellido1
AND main.nombre <> ''
Don't forget to try adjusting the relationship.

Thank you so much for your answers.
The fact is that 'tabla_maestra' is a table that contains information about enterprises, but doesn't contain the values for the 'actividad' field (the activity of the enterprise). Moreover, the 'id' field is still empty (I will fill it in later. It is difficult to explain why, but it has to be done this way).
I need to add the activity of each enterprise by joining with an auxiliary table, 'tabla_actividades_empresas', which contains the activity for each enterprise name. I only have to do it once. After that I will be able to drop 'tabla_actividades_empresas' because I won't need it.
And the only way to join them is by the field 'empresa_apellido1', that is to say, the name of the enterprise.
I have made the field 'tabla_actividades_empresas.empresa_apellido1' unique, but it doesn't improve the performance.
And it doesn't make sense to define a foreign key on 'tabla_actividades_empresas', because 'empresa_apellido1' is unique only in 'tabla_actividades_empresas', not in 'tabla_maestra' (in that table an enterprise name can appear many times, because enterprises can have different offices in different places). That is to say, 'tabla_actividades_empresas' doesn't contain repeated enterprises, but 'tabla_maestra' has repeated enterprise names.
By the way, what do you mean by "adjusting the relationship"? I have tried your subqueries with the EXPLAIN statement; they don't use the indexes correctly and the performance is worse.

Can a foreign key act as a primary key?

I'm currently designing a database structure for our team's project. I have this very question in mind currently: Is it possible to have a foreign key act as a primary key on another table?
Here are some of the tables of our system's database design:
user_accounts
students
guidance_counselors
What I wanted to happen is that the user_accounts table should contain the IDs (supposedly the login credential to the system) and passwords of both the student users and guidance counselor users. In short, the primary keys of both the students and guidance_counselors table are also the foreign key from the user_accounts table. But I am not sure if it is allowed.
Another question is: a student_rec table also exists, which requires a student_number (which is the user_id in the user_accounts table) and a guidance_counselor_id (which is also the user_id in the user_accounts table) for each of its records. If both the IDs of a student and a guidance counselor come from the user_accounts table, how would I design the student_rec table? And for future reference, how do I manually write it as SQL code?
This has been bugging me and I can't find any specific or sure answer to my questions.

Of course. This is a common technique known as supertyping tables. As in your example, the idea is that one table contains a superset of entities and has common attributes describing a general entity, and other tables contain subsets of those entities with specific attributes. It's not unlike a simple class hierarchy in object-oriented design.
For your second question, one table can have two columns which are separately foreign keys to the same other table. When the database builds the query, it joins that other table twice. To illustrate in a SQL query (not sure about MySQL syntax, I haven't used it in a long time, so this is MS SQL syntax specifically), you would give that table two distinct aliases when selecting data. Something like this:
SELECT
student_accounts.name AS student_name,
counselor_accounts.name AS counselor_name
FROM
student_rec
INNER JOIN user_accounts AS student_accounts
ON student_rec.student_number = student_accounts.user_id
INNER JOIN user_accounts AS counselor_accounts
ON student_rec.guidance_counselor_id = counselor_accounts.user_id
This essentially takes the student_rec table and combines it with the user_accounts table twice, once on each column, and assigns two different aliases when combining them so as to tell them apart.

Yes, there should be no problem. Foreign keys and primary keys are orthogonal to each other, it's fine for a column or a set of columns to be both the primary key for that table (which requires them to be unique) and also to be associated with a primary key / unique constraint in another table.
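For the "how do I write it as SQL" part, here is one possible sketch, run through Python's sqlite3 module instead of MySQL. The columns password_hash, name and rec_id, and the choice to point student_rec at students and guidance_counselors rather than directly at user_accounts, are assumptions, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
CREATE TABLE user_accounts (
    user_id INTEGER PRIMARY KEY,
    password_hash TEXT NOT NULL,
    name TEXT
);
-- The primary key of each subtype table is also a foreign key to user_accounts.
CREATE TABLE students (
    user_id INTEGER PRIMARY KEY REFERENCES user_accounts(user_id)
);
CREATE TABLE guidance_counselors (
    user_id INTEGER PRIMARY KEY REFERENCES user_accounts(user_id)
);
-- Two separate foreign keys, one per role.
CREATE TABLE student_rec (
    rec_id INTEGER PRIMARY KEY,
    student_number INTEGER NOT NULL REFERENCES students(user_id),
    guidance_counselor_id INTEGER NOT NULL REFERENCES guidance_counselors(user_id)
);
INSERT INTO user_accounts VALUES (1, 'x', 'Sam Student'), (2, 'y', 'Gina Counselor');
INSERT INTO students VALUES (1);
INSERT INTO guidance_counselors VALUES (2);
INSERT INTO student_rec VALUES (1, 1, 2);
""")

# user_accounts is joined twice, under two aliases, as in the query above.
names = conn.execute("""
    SELECT student_accounts.name, counselor_accounts.name
    FROM student_rec
    INNER JOIN user_accounts AS student_accounts
        ON student_rec.student_number = student_accounts.user_id
    INNER JOIN user_accounts AS counselor_accounts
        ON student_rec.guidance_counselor_id = counselor_accounts.user_id
""").fetchall()

# A student row without a matching account is refused by the foreign key.
try:
    conn.execute("INSERT INTO students VALUES (99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

print(names, orphan_allowed)
```

In MySQL the same design would use INT columns with explicit PRIMARY KEY and FOREIGN KEY clauses on InnoDB tables.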
