MySQL: join not finding data, even though it exists

I have a couple of tables in a mySQL database. For simplicity I'll just show some basic fields:
Table: sources:
sourceID int not null unique primary key
trigger int not null
<other stuff>
Table: sourceBS
id not null unique primary key
sourceID int not null,
name varchar(20),
sourceID in the sourceBS table is a foreign key referencing its namesake in sources, with the cascade option. I have tested this: if I delete an entry in sources, the corresponding entry in sourceBS also vanishes. Good.
I want to select some stuff from a join of sources and sourceBS, filtering based on a "sources" property. This should be easy, via a join which, I think, the foreign key should render pretty efficient, so:
SELECT sources.sourceID, sourceBS.*
FROM sources
LEFT JOIN sourceBS ON sources.sourceID = sourceBS.sourceID
WHERE trigger=1;
But when this runs, each row has "NULL" for the values returned from sourceBS, even though sourceBS contains entries matching the condition. I can verify this:
SELECT *
FROM sourceBS
WHERE sourceID IN (
SELECT sourceID
FROM sources
WHERE trigger=1
);
Here I get a proper set of results, i.e. non-null values. But, while this works as a proof of concept, it's no good in real life because I want to return a bunch of stuff from the "sources" table as well, and I don't want to have to run multiple queries in order to get what I want.
Returning to the join, if I replace the left join with an inner join, then no results are returned. It is as if, somehow, the "join" is simply not finding any matches in the sourceBS table, and yet they are there as the second query shows.
Why is this happening? I know that this join has a 1:M relationship, sourceBS could have multiple entries for a given entry in sources, but that should be OK. I can test exactly this type of join on other DBs, and it works.

OK, so I've solved this. It wasn't a transaction issue in the end: when I tried it on the original machine, it failed again. It was the order of the join. It appears that in my terminal I had the "ON" clause the other way round from the one above, that is, I was doing:
... LEFT JOIN sourceBS ON (sourceBS.blockSourceID=sources.sourceID)
which returns all the nulls. If I do it as in the code I pasted above:
... LEFT JOIN sourceBS ON (sources.sourceID=sourceBS.sourceID)
it works. When I tried it the second time last night on a new machine, I'd used the second formulation.
Guess I'd better read up on joins to understand why this happened!
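For anyone landing here later, a quick check of the "order of the join" theory. Equality in an ON clause is symmetric, so swapping the two operands should not change the result. What does differ between the two snippets above is the column: one compares blockSourceID, the other sourceID. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL. The blockSourceID values are made up, and the trigger column is renamed trig because TRIGGER is a keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sources (sourceID INTEGER PRIMARY KEY, trig INTEGER NOT NULL);
    CREATE TABLE sourceBS (
        id            INTEGER PRIMARY KEY,
        sourceID      INTEGER NOT NULL REFERENCES sources(sourceID),
        blockSourceID INTEGER,  -- hypothetical column holding unrelated values
        name          TEXT
    );
    INSERT INTO sources  VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO sourceBS VALUES (10, 1, 901, 'a'), (11, 1, 902, 'b'), (12, 2, 903, 'c');
""")

def names(on_clause):
    """Run the LEFT JOIN from the question with the given ON condition."""
    sql = f"""
        SELECT sources.sourceID, sourceBS.name
        FROM sources
        LEFT JOIN sourceBS ON {on_clause}
        WHERE sources.trig = 1
        ORDER BY sources.sourceID, sourceBS.name
    """
    return conn.execute(sql).fetchall()

forward = names("sources.sourceID = sourceBS.sourceID")
swapped = names("sourceBS.sourceID = sources.sourceID")           # operands swapped
other_column = names("sourceBS.blockSourceID = sources.sourceID")  # different column

print(forward)       # matched rows
print(swapped)       # same rows as forward
print(other_column)  # one NULL-padded row per source
```

If this holds on your data too, the NULLs came from joining on a column whose values never match sources.sourceID, not from which side of the = each column sat on.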

Related

Should I save the same value (id) into tables or join them to retrieve the value (id)?

Let's say I have a table posts that contains the user_id of the user who posted and the post_id of the post. And I have another table comments that contains only the post that it belongs to, child_of_post.
The problem is here: I need to select only from comments but at the same time get the user_id of the post that the comment belongs to.
So should I use a join like:
SELECT user_id FROM comments INNER JOIN posts ON child_of_post = post_id
Reading this confused me even more. I don't really know how to explain it, but in general, if I need to use the same value, like an id, should I save that value in every table where I need it? Or should I save it only in one table and use joins to retrieve it?
Is using a join better than adding one more column to a table ?
Is using a join better than adding one more column to a table ?
In general : Yes.
Your database design looks good. As a general principle, avoid duplicating data across tables. This is inefficient in terms of storage, and also can quickly turn into a maintenance nightmare when it comes to modifying data, which ultimately threatens the integrity of your data.
Instead of duplicating data, the usual approach is to store a reference to the table row where the original data is stored ; this is called a foreign key, and it offers various functionalities that help maintain data integrity (prevent inserts of orphan records in the child table, delete child records when the parent is deleted, ...).
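To make those two behaviours concrete, here is a minimal sketch using Python's built-in sqlite3 rather than MySQL (with InnoDB the idea is the same, only the DDL differs slightly). The comment_id and body columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
    CREATE TABLE comments (
        comment_id    INTEGER PRIMARY KEY,  -- hypothetical column
        child_of_post INTEGER NOT NULL
                      REFERENCES posts(post_id) ON DELETE CASCADE,
        body          TEXT                  -- hypothetical column
    );
    INSERT INTO posts    VALUES (1, 42);
    INSERT INTO comments VALUES (100, 1, 'first');
""")

# 1) An orphan comment (no such post) is rejected by the foreign key.
try:
    conn.execute("INSERT INTO comments VALUES (101, 999, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# 2) Deleting the parent post removes its comments through the cascade.
conn.execute("DELETE FROM posts WHERE post_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]

print(orphan_rejected, remaining)
```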
In your use case, you indeed would need to JOIN to recover the user that created the original post, like :
SELECT p.user_id, c.*
FROM comments c
INNER JOIN posts p ON c.child_of_post = p.post_id
Assuming that post_id is the primary key of table posts, such JOIN with an equality condition referencing the primary key of another table, is very efficient, especially if you create an index on referencing column comments.child_of_post.
PS : it is a good practice to give aliases to table names and use them to qualify the fields in the query ; it avoids subtle bugs caused by column name clashes (when both tables have fields with the same name), and makes the query easier to read.
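Here is a small runnable sketch of that join, including the index on the referencing column. It uses Python's built-in sqlite3 instead of MySQL; the comment_id and body columns and the index name are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
    CREATE TABLE comments (
        comment_id    INTEGER PRIMARY KEY,  -- hypothetical column
        child_of_post INTEGER NOT NULL,
        body          TEXT                  -- hypothetical column
    );
    -- Index on the referencing column, as suggested above.
    CREATE INDEX idx_comments_post ON comments (child_of_post);

    INSERT INTO posts    VALUES (1, 42), (2, 7);
    INSERT INTO comments VALUES (100, 1, 'hi'), (101, 2, 'hello'), (102, 1, 'again');
""")

# Select from comments, pulling user_id from posts instead of duplicating it.
rows = conn.execute("""
    SELECT c.comment_id, p.user_id
    FROM comments c
    INNER JOIN posts p ON c.child_of_post = p.post_id
    ORDER BY c.comment_id
""").fetchall()

print(rows)
```

The user_id lives only in posts, so changing the owner of a post is a single-row update and every comment sees the new value through the join.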

How do MySQL indexes work in case of a JOIN?

Suppose I have two tables, patient and person.
The MySQL query is like below.
select fname, lname
from patient p
left join person per on (per.person_id = p.person_id)
where p.account_id = 2 and (per.fname like 'will%' or per.lname like 'will%')
For this query, how will MySQL use the index created on (p.account_id, p.person_id)?
person_id in the patient table is a foreign key referencing the person table.
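I suspect you do not want LEFT. A LEFT JOIN keeps the rows for account #2 even when no matching per row exists, but the WHERE conditions on per.fname and per.lname then discard those NULL-padded rows, so the LEFT buys you nothing here.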
SELECT fname, lname
FROM patient p
JOIN person per ON per.person_id = p.person_id
WHERE p.account_id = 2
AND (per.fname LIKE 'will%' OR per.lname LIKE 'will%')
will find the full name of account #2 if it is a 'will', else return nothing.
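A quick way to see how LEFT JOIN and plain JOIN behave for this particular query is to run both side by side. This is only a sketch, using Python's built-in sqlite3 rather than MySQL; the patient_id column and all the data are invented. Because the WHERE clause tests per.fname and per.lname, any row that LEFT JOIN pads with NULLs fails the LIKE tests and drops out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (person_id INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,  -- hypothetical column
        account_id INTEGER,
        person_id  INTEGER
    );
    CREATE INDEX idx_patient_account ON patient (account_id, person_id);

    INSERT INTO person  VALUES (1, 'William', 'Stone'), (2, 'Ann', 'Willis'), (3, 'Bob', 'Jones');
    -- Patient 13 has no person row at all; patient 14 belongs to another account.
    INSERT INTO patient VALUES (10, 2, 1), (11, 2, 2), (12, 2, 3), (13, 2, NULL), (14, 5, 1);
""")

QUERY = """
    SELECT per.fname, per.lname
    FROM patient p
    {join} person per ON per.person_id = p.person_id
    WHERE p.account_id = 2
      AND (per.fname LIKE 'will%' OR per.lname LIKE 'will%')
    ORDER BY per.person_id
"""

left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()

print(left_rows)
print(inner_rows)
```

With this data both variants return the same two people, which is why writing the plain JOIN states the intent more honestly.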
You have not said what indexes you have, so we cannot explain your existing indexes. Please provide SHOW CREATE TABLE for each table.
For either version of the query, these indexes are the only useful ones:
p: INDEX(account_id) -- if it is not already the PRIMARY KEY
per: INDEX(person_id) -- again, if it is not already the PRIMARY KEY.
A PRIMARY KEY is a UNIQUE index.
The first index (or PK) would be a quick lookup to find the row(s) with account_id=2. The second would make the join work well. No index is useful for "will" because of the OR.
The query will look at patient first, then per, using "Nested Loop Join".
Please also provide EXPLAIN SELECT ..., so we can discuss the things I am guessing about.

SQL Query to populate table based on PK of Main Table being joined

Here is my Database structure (basic relations):
I'm attempting to formulate a one-line query that will populate the clients_ID, Job_id, tech_id, & Part_id and return back all the work orders present. Nothing more nothing less.
Thus far I've struggled to generate this Query:
SELECT cli.client_name, tech.tech_name, job.Job_Name, w.wo_id, w.time_started, w.part_id, w.job_id, w.tech_id, w.clients_id, part.Part_name
FROM work_orders as w, technicians as tech, clients as cli, job_types as job, parts_list as part
LEFT JOIN technicians as techy ON tech_id = techy.tech_name
LEFT JOIN parts_list party ON part.part_id = party.Part_Name
LEFT JOIN job_types joby ON job_id = joby.Job_Name
LEFT JOIN clients cliy ON clients_id = cliy.client_name
Apparently, once all the joining happens, it does not even populate the correct values for the foreign keys. Some values came out as the actual foreign key id rather than the corresponding name.
The rows also repeat about 20-30 times, depending on the size of the largest of the tables above.
I only have two work orders created, so ideally it should return just two records, with the correct columns and values. What could I be doing wrong? I haven't been using MySQL for long but am learning as much as I can.
Your join conditions are wrong. Join on tech_id = tech_id, not tech_id = tech_name. Looks like you do this for all your joins, so they all need to be fixed.
I really don't follow the text of your question, so I am basing my answer solely on your query.
Edit
Replying to your comment here. You said you want to "load up" the tech name column. I assume you mean you want tech name to be part of your result set.
The SELECT part of the query is what determines the columns that are in the result set. As long as the table where the column lives is referenced in the FROM/JOIN clauses, you can SELECT any column from that table.
Think of a JOIN statement as a way to "look up" a value in one table based on a value in another table. This is a very simplified definition, but it's a good way to start thinking about it. You want tech name in your result set, so you look it up in the Technicians table, which is where it lives. However, you want to look it up by a value that you have in the Work Orders table. The key (which is actually called a foreign key) that you have in the Work Orders table that relates it to the Technicians table is the tech_id. You use the tech_id to look up the related row in the Technicians table, and by doing so can include any column in that table in your result set.
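Putting that together, here is a rough sketch of what the corrected query could look like. It runs on Python's built-in sqlite3 rather than MySQL, and the key column names of the lookup tables (tech_id, part_id, job_id, clients_id) are my guess, since the schema picture is not visible; the sample data is invented. Note that work_orders is the only table in FROM. Listing all five tables separated by commas, as in the original query, produces every combination of their rows, which would explain the repeated results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Key column names below are assumptions, not taken from the real schema.
    CREATE TABLE technicians (tech_id INTEGER PRIMARY KEY, tech_name TEXT);
    CREATE TABLE clients     (clients_id INTEGER PRIMARY KEY, client_name TEXT);
    CREATE TABLE job_types   (job_id INTEGER PRIMARY KEY, Job_Name TEXT);
    CREATE TABLE parts_list  (part_id INTEGER PRIMARY KEY, Part_name TEXT);
    CREATE TABLE work_orders (
        wo_id INTEGER PRIMARY KEY, time_started TEXT,
        part_id INTEGER, job_id INTEGER, tech_id INTEGER, clients_id INTEGER
    );

    INSERT INTO technicians VALUES (1, 'Ana'), (2, 'Raj'), (3, 'Lee');
    INSERT INTO clients     VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO job_types   VALUES (1, 'Repair'), (2, 'Install');
    INSERT INTO parts_list  VALUES (1, 'Valve'), (2, 'Pump'), (3, 'Filter');
    INSERT INTO work_orders VALUES
        (1, '2020-01-01 09:00', 2, 1, 3, 1),
        (2, '2020-01-02 10:00', 1, 2, 1, 2);
""")

# One row per work order: each lookup table is joined on its id, not its name.
rows = conn.execute("""
    SELECT w.wo_id, cli.client_name, tech.tech_name, job.Job_Name, part.Part_name
    FROM work_orders AS w
    LEFT JOIN technicians AS tech ON tech.tech_id    = w.tech_id
    LEFT JOIN parts_list  AS part ON part.part_id    = w.part_id
    LEFT JOIN job_types   AS job  ON job.job_id      = w.job_id
    LEFT JOIN clients     AS cli  ON cli.clients_id  = w.clients_id
    ORDER BY w.wo_id
""").fetchall()

print(rows)
```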

MySQL sub-query AND 'NOT IN'

sorry if this has been answered before; I'm a little unsure how best to describe this problem never mind search for it. But here goes...
Basically I have a 'projects' table, which holds an 'id' and a 'title'. I also have a 'projects_history' table which holds information when (IF) the project is set to archived (a boolean value). It has a 'pid' key that references the project - there can be more than one record for each project as it is updated (in order to track who sets the value to what).
There's also a 'project_enquiries' table that holds information on enquiries that have been raised for the project, so there is a 'pid' key that references 'projects'. Similarly, there's a 'project_enquiry_history' table that records when (IF) the enquiry is set to closed (a boolean value). It has a 'eid' key that references the project_enquiry - there can be more than one record for each enquiry as it is updated (in order to track who sets the value to what).
My query aims to pull out the projects that haven't been archived (so either there is no record in 'project_history' or the most recent record for the project has 'archived' = 0), which have enquiries that are still open (so either there is no record in 'project_enquiry_history' or the most recent record for the enquiry has 'open' = 1).
I'm really struggling on where to start with the query.
The first thing is that if there is no record in project_history, then you will need to do a left join of your projects table on the project_history table using your project id to discover this. You will have entries in the results that are null in the columns for project_history where there is no corresponding entry. You can then chain that process, using the result to do another left join on the 'project_enquiry_history' to find the items that have no entries.
You can also use these results as normal, to do selects of the data.
It will look something like:
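SELECT *
FROM projects p
LEFT OUTER JOIN project_history ph ON ph.pid = p.id
JOIN project_enquiries pe ON pe.pid = p.id -- assumes the enquiry table's key is called id
LEFT OUTER JOIN project_enquiry_history peh ON peh.eid = pe.id
WHERE (ph.archived IS NULL OR ph.archived = 0)
AND (peh.open IS NULL OR peh.open = 1)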
You may have to tweak this to find the most recent if you are using timestamps and such.
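As one possible way to do that tweak, the sketch below keeps only the latest history row per project and per enquiry by matching on the highest history id. It assumes each history table has an auto-incrementing id, that the enquiry table's key is called id, and that enquiry history links to enquiries through eid as the question describes. It runs on Python's built-in sqlite3 rather than MySQL, with invented data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE project_history (id INTEGER PRIMARY KEY, pid INTEGER, archived INTEGER);
    CREATE TABLE project_enquiries (id INTEGER PRIMARY KEY, pid INTEGER);
    CREATE TABLE project_enquiry_history (id INTEGER PRIMARY KEY, eid INTEGER, open INTEGER);

    INSERT INTO projects VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma'), (4, 'Delta');
    -- Alpha: archived, then un-archived.  Beta: archived last.  Others: no history.
    INSERT INTO project_history VALUES (1, 1, 1), (2, 1, 0), (3, 2, 0), (4, 2, 1);
    INSERT INTO project_enquiries VALUES (1, 1), (2, 2), (3, 3), (4, 4);
    -- Enquiry 1: closed, then reopened.  Enquiry 3: closed.  Others: no history.
    INSERT INTO project_enquiry_history VALUES (1, 1, 0), (2, 1, 1), (3, 3, 0);
""")

open_projects = conn.execute("""
    SELECT DISTINCT p.id, p.title
    FROM projects p
    JOIN project_enquiries pe ON pe.pid = p.id
    LEFT JOIN project_history ph
           ON ph.pid = p.id
          AND ph.id = (SELECT MAX(h2.id) FROM project_history h2 WHERE h2.pid = p.id)
    LEFT JOIN project_enquiry_history peh
           ON peh.eid = pe.id
          AND peh.id = (SELECT MAX(e2.id) FROM project_enquiry_history e2 WHERE e2.eid = pe.id)
    WHERE (ph.archived IS NULL OR ph.archived = 0)
      AND (peh.open IS NULL OR peh.open = 1)
    ORDER BY p.id
""").fetchall()

print(open_projects)
```

If the history tables carry a timestamp instead, the same pattern works with MAX on that column, provided the timestamps are unique per project or enquiry.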

Storing Friends in Database for Social Network

For storing friend relationships in social networks, is it better to have another table with columns relationship_id, user1_id, user2_id, time_created, pending, or should the confirmed friends' user_ids be serialized/imploded into a single long string and stored alongside the other user details like user_id, name, dateofbirth, address, with a limit of something like 5000 friends, similar to Facebook?
Are there any better methods? The first method will create a huge table! The second one has one column with really long string...
On the profile page of each user, all his friends need to be retrieved from the database to show something like 30 friends, similar to Facebook, so I think the first method of using a separate table will cause a huge amount of database queries?
The most proper way to do this would be to have the table of Members (obviously), and a second table of Friend relationships.
You should never ever store foreign keys in a string like that. What's the point? You can't join on them, sort on them, group on them, or any other things that justify having a relational database in the first place.
If we assume that the Member table looks like this:
MemberID int Primary Key
Name varchar(100) Not null
--etc
Then your Friendship table should look like this:
Member1ID int Foreign Key -> Member.MemberID
Member2ID int Foreign Key -> Member.MemberID
Created datetime Not Null
--etc
Then, you can join the tables together to pull a list of friends
SELECT m.*
FROM Member m
RIGHT JOIN Friendship f ON f.Member2ID = m.MemberID
WHERE f.Member1ID = @MemberID
(This is specifically SQL Server syntax, but I think it's pretty close to MySQL. @MemberID is a parameter.)
This is always going to be faster than splitting a string and making 30 extra SQL queries to pull the relevant data.
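A rough runnable version of the same idea, using Python's built-in sqlite3 rather than SQL Server or MySQL. It goes one step beyond the query above: assuming each friendship is stored only once, it looks for the member on either side of the row. The composite primary key, the extra index, the friends_of helper and the sample data are all assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (MemberID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Friendship (
        Member1ID INTEGER NOT NULL REFERENCES Member(MemberID),
        Member2ID INTEGER NOT NULL REFERENCES Member(MemberID),
        Created   TEXT NOT NULL,
        PRIMARY KEY (Member1ID, Member2ID)  -- also serves lookups by Member1ID
    );
    CREATE INDEX idx_friendship_member2 ON Friendship (Member2ID);

    INSERT INTO Member VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
    INSERT INTO Friendship VALUES
        (1, 2, '2020-01-01'), (3, 1, '2020-01-02'), (2, 4, '2020-01-03');
""")

def friends_of(member_id, limit=30):
    """Return up to `limit` friends of a member in a single query."""
    sql = """
        SELECT m.MemberID, m.Name
        FROM Friendship f
        JOIN Member m
          ON m.MemberID = CASE WHEN f.Member1ID = :me
                               THEN f.Member2ID ELSE f.Member1ID END
        WHERE :me IN (f.Member1ID, f.Member2ID)
        ORDER BY m.MemberID
        LIMIT :limit
    """
    return conn.execute(sql, {"me": member_id, "limit": limit}).fetchall()

ann_friends = friends_of(1)
dee_friends = friends_of(4)
print(ann_friends, dee_friends)
```

The profile page then needs one query for its 30 friends, not one query per friend.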
Separate table, as in method 1.
Method 2 is bad because you would have to unserialize it each time and won't be able to do JOINs on it; plus UPDATEs will be a nightmare if a user changes his name, email or other properties.
Sure, the table will be huge, but you can index it on Member1ID, set the foreign key back to your user table, and could have static row sizes and maybe even limit the number of friends a single user can have. I think it won't be an issue with MySQL if you do it right, even if you hit a few million rows in your relationship table.