PHP / MySQL selecting records from multiple tables

I have a MySQL statement that works: I can get the records requested (movies.* and groups.name).
$stmt= $mysqli->query("SELECT DISTINCT ebooks.*, groups.name FROM ebooks
INNER JOIN ebooks_groups ON ebooks.uuid = ebooks_groups.ebookuuid
INNER JOIN groups_users ON ebooks_groups.groupuuid = groups_users.groupuuid
INNER JOIN groups ON groups_users.groupuuid = groups.uuid
WHERE useruuid=".$get_useruuid."
ORDER BY groups.name");
1/ However, I need to grab another column from the groups table, namely groups.uuid.
I tried
SELECT DISTINCT movies.*, groups.* FROM movies, groups
&
SELECT DISTINCT movies.*, groups.name, groups.uuid FROM movies, groups
but it retrieved no records.
2/ Then I had another look at my original code (... FROM movies ...): how is this even working if I'm not selecting FROM movies, groups?

AFAIK, this is pure MySQL. PHP or not doesn't come into play.
The first thing to understand is the implicit join:
Explicit vs implicit SQL joins
That understanding should solve at least half of your problem.
Secondly, I'd never code a SELECT * without a very good reason (and there are few). It makes much more sense to select just the columns you need instead of getting them all. Even if you need all the columns that are currently there, the database model may change later and gain (or lose!) columns, and it will be much harder to detect that your code needs updating if the columns aren't explicitly listed.
For the rest, I build my SQL queries slowly, step by step. That helps a lot with debugging your queries, especially as you have the actual tables and some sample data ...
[That should solve your other half of the question]
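To make the point concrete, here is a minimal sketch, assuming the table and column names from the question's ebooks query and some made-up sample rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it can be run anywhere. Because groups is already joined in, any of its columns (including groups.uuid) can simply be added to the SELECT list; the FROM clause does not need to change. The column aliases group_name and group_uuid are my own, and the user id is bound as a parameter instead of being concatenated into the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample schema and rows are invented for illustration only.
# "groups" is quoted because it is a keyword (reserved in MySQL 8.0+,
# where you would use backticks instead).
conn.executescript("""
    CREATE TABLE ebooks        (uuid TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE "groups"      (uuid TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE ebooks_groups (ebookuuid TEXT, groupuuid TEXT);
    CREATE TABLE groups_users  (groupuuid TEXT, useruuid TEXT);

    INSERT INTO ebooks VALUES ('e1', 'Dune'), ('e2', 'Emma');
    INSERT INTO "groups" VALUES ('g1', 'Classics'), ('g2', 'Sci-Fi');
    INSERT INTO ebooks_groups VALUES ('e1', 'g2'), ('e2', 'g1');
    INSERT INTO groups_users VALUES ('g1', 'u1'), ('g2', 'u1'), ('g2', 'u2');
""")

# Explicit column list: two columns from ebooks, two from the joined groups table.
QUERY = """
    SELECT DISTINCT e.uuid, e.title, g.name AS group_name, g.uuid AS group_uuid
    FROM ebooks e
    INNER JOIN ebooks_groups eg ON e.uuid = eg.ebookuuid
    INNER JOIN groups_users gu  ON eg.groupuuid = gu.groupuuid
    INNER JOIN "groups" g       ON gu.groupuuid = g.uuid
    WHERE gu.useruuid = ?
    ORDER BY g.name
"""

def ebooks_for_user(useruuid):
    # The user id is passed as a bound parameter, not spliced into the SQL.
    return conn.execute(QUERY, (useruuid,)).fetchall()

rows_u1 = ebooks_for_user("u1")
rows_u2 = ebooks_for_user("u2")
```

In PHP the same idea would go through a mysqli prepared statement rather than string concatenation; the SQL itself stays the same apart from the identifier quoting.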

Related

SQL Query relationship between 3 tables

I have a scenario where I have the following tables:
Inventories
delivery_items
deliveries
I seek a query where, having the inventory id, I get the delivery_item(fk_inventory),
which then I get the delivery from the (fk_delivery).
Manually, I go to the delivery_items table, then I search for the fk_inventory that matches the id from the inventory that I'm looking for,
then I get the fk_delivery, and get the delivery.
But I need to run a report on 15k+ items.
How to write a query where from a list of inventory ids I can get to the delivery following the relationship that I mentioned above?
There are many sites on writing SQL queries that differentiate between a normal (inner) join, outer join, left join, right join, subqueries, etc. What you are looking to do is probably best done (since all the inventory items in question are involved) with simple joins.
Try to think of it this way, and maybe do it this way. Have a sheet of paper, one representing each table and write the columns on it.
Now, visually looking at the available tables, put them next to each other based on how they are related. Note the column in table A that is the foreign key to the next table. Then again, from the second to the third.
Once you have this done (or even if just mentally), you can SEE how they are related. This is the basis of the FROM clause:
select *
from
YourFirstTable yft
JOIN YourSecondTable yst
on yft.WhateverKey = yst.MatchingKeyColumn
JOIN YourThirdTable ytt
on yst.KeyToThirdTable = ytt.KeyInThisTable
Now that you have all your relationships established, you can always declare the individual columns you want from those respective tables. Easier to use with the aliases such as I provided here via yft, yst, ytt representing the first, second and third tables. Use aliases appropriate to your tables such as i=inventories, di = delivery_items, d = deliveries.
Then add whatever FILTERING conditions you want. If the condition is based on the FIRST Table such as yft above, that would go into the WHERE clause such as
where
yft.SomeColumn = 'blah'
If the filtering criteria is specific to your second or third table, just add that to the JOIN / ON condition so it stays with the table and you know contextually it is associated HERE. It makes it easier when you are getting into LEFT JOINs.
from
YourFirstTable yft
JOIN YourSecondTable yst
on yft.WhateverKey = yst.MatchingKeyColumn
AND yst.SecondTableColumn = 'someOtherValue'
AND yst.SomeOtherColumn = 'somethingElse'
So now the engine can go through all inventory items, to the corresponding details, to the actual deliveries, without having to do individual searches each time, which would be painful to trace and run and would perform poorly.
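Applied to the tables in the question, the pattern above might look like the sketch below. It runs against an in-memory SQLite database through Python's sqlite3 module, purely so it is runnable. The fk_inventory and fk_delivery columns come from the question; the id, name and delivered_on columns and all sample rows are assumptions of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only fk_inventory / fk_delivery are named in the question.
conn.executescript("""
    CREATE TABLE inventories    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE deliveries     (id INTEGER PRIMARY KEY, delivered_on TEXT);
    CREATE TABLE delivery_items (id INTEGER PRIMARY KEY,
                                 fk_inventory INTEGER, fk_delivery INTEGER);

    INSERT INTO inventories VALUES (1, 'bolt'), (2, 'nut'), (3, 'washer');
    INSERT INTO deliveries VALUES (10, '2020-01-05'), (11, '2020-02-09');
    INSERT INTO delivery_items VALUES (100, 1, 10), (101, 2, 11), (102, 3, 11);
""")

def deliveries_for(inventory_ids):
    # One "?" placeholder per id, so the values are bound rather than concatenated.
    placeholders = ", ".join("?" for _ in inventory_ids)
    sql = f"""
        SELECT i.id, i.name, d.id, d.delivered_on
        FROM inventories i
        JOIN delivery_items di ON di.fk_inventory = i.id
        JOIN deliveries d      ON d.id = di.fk_delivery
        WHERE i.id IN ({placeholders})
        ORDER BY i.id
    """
    return conn.execute(sql, list(inventory_ids)).fetchall()

report = deliveries_for([1, 3])
```

For a list of 15k+ ids, binding every value may run into driver limits on the number of parameters; loading the ids into a temporary table and joining against it is a common alternative.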

Inner joining within an inner join

I tried to find out whether this has already been answered but couldn't find anything. I'm trying to join together four tables, but one of the joins is not on the table that the other two joins are from. I've successfully joined three of the tables; I'm just not sure of the syntax for joining the fourth.
SELECT * FROM
nc_booking
INNER JOIN
nc_customer ON nc_booking.c_id = nc_customer.id
INNER JOIN
nc_propertys ON nc_booking.p_id = nc_propertys.id
How would I now join nc_propertys to another table, nc_owner?
Building on the code from @GordonLinoff, to add your extra table you need to do something like:
SELECT *
FROM nc_booking b INNER JOIN
nc_customer c
ON b.c_id = c.id INNER JOIN
nc_propertys p
ON b.p_id = p.id INNER JOIN
nc_owner o
ON o.id = p.o_id;
You haven't shared the column names we need to use to connect the extra table, so the last line might not be right. A few things to note ...
(1) The SELECT * is not ideal. If you only need particular columns here, list them. I've stuck with your * because I don't know what you want from the tables. Where a column with the same name exists in more than one table, you'll have to "fully qualify" the field name as follows ...
SELECT c.id as customer_id,
-- more fields can go here, with a comma after each
...
Several of the joined tables have an id field, so the c. is necessary to tell the database which one we want. Notice that as with the tables, we can also give the fields an 'alias', which in this case is 'customer_id'. This can be very helpful for presentation, and is often essential when using the output from a query as part of a larger piece of code.
(2) Since all the joins are INNER JOINS it makes little (if any) difference what order the tables are listed as long as the connections between them remain the same.
(3) For MySQL, it technically shouldn't matter whether you have lots of new-lines or none at all. SQL is designed to ignore "white space" (except within data). What matters is simply laying out your code so it is easy to read ... especially for other users who later might need to figure out what you were doing (although in my experience also for you, when you return to a piece of code several years later and can't remember it at all).
(4) In each ON clause it doesn't actually matter whether you write, say, a = b or b = a. That's because you aren't setting one to equal the other; you are requiring that they already be equal, so it amounts to the same thing either way.
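Points (1), (2) and (4) can be checked with a small experiment. This is only a sketch: it uses SQLite through Python's sqlite3 module instead of MySQL, the o_id column is the same guess as in the query above, and the name and address columns and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns: o_id, name and address are assumptions for illustration.
conn.executescript("""
    CREATE TABLE nc_owner    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE nc_customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE nc_propertys(id INTEGER PRIMARY KEY, o_id INTEGER, address TEXT);
    CREATE TABLE nc_booking  (id INTEGER PRIMARY KEY, c_id INTEGER, p_id INTEGER);

    INSERT INTO nc_owner VALUES (1, 'Olga');
    INSERT INTO nc_customer VALUES (1, 'Carl'), (2, 'Cleo');
    INSERT INTO nc_propertys VALUES (1, 1, '1 High St');
    INSERT INTO nc_booking VALUES (1, 1, 1), (2, 2, 1);
""")

# Every table has an id column, so each one is qualified and given an alias.
COLUMNS = """b.id AS booking_id, c.id AS customer_id, c.name AS customer_name,
             p.id AS property_id, o.name AS owner_name"""

query_a = f"""
    SELECT {COLUMNS}
    FROM nc_booking b
    INNER JOIN nc_customer c  ON b.c_id = c.id
    INNER JOIN nc_propertys p ON b.p_id = p.id
    INNER JOIN nc_owner o     ON o.id = p.o_id
    ORDER BY b.id
"""

# Same joins, but tables listed in a different order and ON operands swapped.
query_b = f"""
    SELECT {COLUMNS}
    FROM nc_owner o
    INNER JOIN nc_propertys p ON p.o_id = o.id
    INNER JOIN nc_booking b   ON p.id = b.p_id
    INNER JOIN nc_customer c  ON c.id = b.c_id
    ORDER BY b.id
"""

cur = conn.execute(query_a)
column_names = [d[0] for d in cur.description]  # the aliases become the result column names
rows_a = cur.fetchall()
rows_b = conn.execute(query_b).fetchall()
```

Both queries return the same rows, and the aliases are what the calling code sees as column names.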
My advice to a SQL beginner would be when you are writing a SELECT query (which only reads and doesn't change any data): if you aren't too sure then write some code and set it to run. If it's completely invalid, your software should give you some idea of what is wrong and no harm will be done. If it's valid but wrong, the very worst that can happen is that you put some unnecessary load on your database server ... if it takes a long time to run and you weren't expecting it to, then you should be able to cancel the query. As long as you have some idea of what you expect the results to look like, and roughly how many rows to expect, you won't go too far wrong. If you get completely stuck come back here to Stack Overflow.
Things get a bit different if you are writing code which DELETEs or UPDATEs data. Then you want to know exactly what you're up to. Normally you can write a closely related SELECT statement first to make sure you're going to be making all and only the changes you were expecting. It's also best to make sure you've got a way to undo your changes should the worst happen. Backups are obviously good, and you can often create your own backup copy of a table before you make any alterations. You don't necessarily need to rely on backup software or your in house IT guys for that ... in my experience they don't like databases anyway.
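A rough sketch of that routine, again using SQLite via Python's sqlite3 module rather than MySQL: preview with a SELECT that shares the DELETE's WHERE clause, take your own copy of the table, then delete and only commit if the affected row count matches the preview. The status column, the nc_booking_backup table name and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table contents; "status" is an invented column.
conn.executescript("""
    CREATE TABLE nc_booking (id INTEGER PRIMARY KEY, c_id INTEGER, status TEXT);
    INSERT INTO nc_booking VALUES (1, 1, 'cancelled'), (2, 2, 'active'),
                                  (3, 1, 'cancelled');
""")

CONDITION = "status = ?"
params = ("cancelled",)

# 1. Preview: the SELECT uses exactly the WHERE clause the DELETE will use.
expected = conn.execute(
    f"SELECT COUNT(*) FROM nc_booking WHERE {CONDITION}", params
).fetchone()[0]

# 2. Keep your own copy of the table before altering it.
conn.execute("CREATE TABLE nc_booking_backup AS SELECT * FROM nc_booking")

# 3. Delete, and only commit if the row count matches the preview.
cur = conn.execute(f"DELETE FROM nc_booking WHERE {CONDITION}", params)
if cur.rowcount == expected:
    conn.commit()
else:
    conn.rollback()

remaining = conn.execute("SELECT id FROM nc_booking ORDER BY id").fetchall()
backed_up = conn.execute("SELECT COUNT(*) FROM nc_booking_backup").fetchone()[0]
```

MySQL also supports CREATE TABLE ... AS SELECT, but note that a copy made this way does not carry over the original table's indexes.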
Also there are some great books out there. For a beginner, I'd recommend anything by Ben Forta, including his SQL in 10 Minutes (that's a per chapter figure), or his MySQL Crash Course (the latter is a little old though, so won't have anything on the more recently added features of MySQL).
Your syntax looks okay. I am providing an answer because you really should learn to use table aliases. They make a query easier to write and to read:
SELECT *
FROM nc_booking b INNER JOIN
nc_customer c
ON b.c_id = c.id INNER JOIN
nc_propertys p
ON b.p_id = p.id;

Right way to phrase MySQL query across many (possible empty) tables

I'm trying to do what I think is a set of simple set operations on a database table: several intersections and one union. But I don't seem to be able to express that in a simple way.
I have a MySQL table called Moment, which has many millions of rows. (It happens to be a time-series table but that doesn't impact on my problem here; however, these data have a column 'source' and a column 'time', both indexed.) Queries to pull data out of this table are created dynamically (coming in from an API), and ultimately boil down to a small pile of temporary tables indicating which 'source's we care about, and maybe the 'time' ranges we care about.
Let's say we're looking for
(source in Temp1) AND (
((source in Temp2) AND (time > '2017-01-01')) OR
((source in Temp3) AND (time > '2016-11-15'))
)
Just for excitement, let's say Temp2 is empty --- that part of the API request was valid but happened to include 'no actual sources'.
If I then do
SELECT m.* from Moment as m,Temp1,Temp2,Temp3
WHERE (m.source = Temp1.source) AND (
((m.source = Temp2.source) AND (m.time > '2017-01-01')) OR
((m.source = Temp3.source) AND (m.time > '2016-11-15'))
)
... I get a heaping mound of nothing, because the empty Temp2 gives an empty Cartesian product before we get to the WHERE clause.
Okay, I can do
SELECT m.* from Moment as m
LEFT JOIN Temp1 on m.source=Temp1.source
LEFT JOIN Temp2 on m.source=Temp2.source
LEFT JOIN Temp3 on m.source=Temp3.source
WHERE (m.source = Temp1.source) AND (
((m.source = Temp2.source) AND (m.time > '2017-01-01')) OR
((m.source = Temp3.source) AND (m.time > '2016-11-15'))
)
... but this takes >70ms even on my relatively small development database.
If I manually eliminate the empty table,
SELECT m.* from Moment as m,Temp1,Temp3
WHERE (m.source = Temp1.source) AND (
((m.source = Temp3.source) AND (m.time > '2016-11-15'))
)
... it finishes in 10ms. That's the kind of time I'd expect.
I've also tried putting a single unmatchable row in the empty table and doing SELECT DISTINCT, and it splits the difference at ~40ms. Seems an odd solution though.
This really feels like I'm just conceptualizing the query wrong, that I'm asking the database to do more work than it needs to. What is the Right Way to ask the database this question?
Thanks!
--UPDATE--
I did some actual benchmarks on my actual database, and came up with some really unexpected results.
For the scenario above, all tables indexed on the columns being compared, with an empty table,
doing it with left joins took 3.5 minutes (!!!)
doing it without joins (just 'FROM...WHERE') and adding a null row to the empty table, took 3.5 seconds
even more striking, when there wasn't an empty table, but rather ~1000 rows in each of the temporary tables,
doing the whole thing in one query took 28 minutes (!!!!!), but,
doing each of the three AND clauses separately and then doing the final combination in the code took less than a second.
I still feel I'm expressing the query in some foolish way, since again, all I'm trying to do is one set union (OR) and a few set intersections. It really seems like the DB is making this gigantic Cartesian product when it seriously doesn't need to. All in all, as pointed out in the answer below, keeping some of the intelligence up in the code seems to be the better approach here.
There are various ways to tackle the problem. Needless to say it depends on
how many queries are sent to the database,
the amount of data you are processing in a time interval,
how the database backend is configured to manage it.
For your use case, a little more information would be helpful. The optimization of your query by using CASE/COUNT(*) or CASE/LIMIT combinations in queries to sort out empty tables would be one option. However, if-like queries cost more time.
You could split the SQL code to downgrade the scaling of the problem from 1*N^x to y*N^z, where z should be smaller than x.
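You said that an API is involved; maybe you are able to handle the temporary "no data" tables differently, or even avoid storing them at all?
Another option would be to enable query caching (note that the query cache was removed in MySQL 8.0, so this applies only to older versions):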
https://dev.mysql.com/doc/refman/5.5/en/query-cache-configuration.html
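One way to read "split the SQL code" is the approach the asker's update describes: run one small query per OR branch and combine the results in application code. The sketch below is my interpretation, not the answerer's code. It uses SQLite through Python's sqlite3 module, a tiny made-up Moment table with a hypothetical id column, and the helper name branch is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample data; Moment.id is a hypothetical key column.
conn.executescript("""
    CREATE TABLE Moment (id INTEGER PRIMARY KEY, source TEXT, time TEXT);
    CREATE TABLE Temp1 (source TEXT);
    CREATE TABLE Temp2 (source TEXT);   -- deliberately left empty
    CREATE TABLE Temp3 (source TEXT);

    INSERT INTO Moment VALUES (1, 'a', '2016-12-01'), (2, 'a', '2016-10-01'),
                              (3, 'b', '2017-02-01'), (4, 'c', '2017-03-01');
    INSERT INTO Temp1 VALUES ('a'), ('b');
    INSERT INTO Temp3 VALUES ('a'), ('c');
""")

# The single comma-join query: the empty Temp2 empties the whole product.
all_in_one = conn.execute("""
    SELECT m.id FROM Moment AS m, Temp1, Temp2, Temp3
    WHERE m.source = Temp1.source AND (
        (m.source = Temp2.source AND m.time > '2017-01-01') OR
        (m.source = Temp3.source AND m.time > '2016-11-15'))
""").fetchall()

def branch(temp_table, after):
    # temp_table comes from the fixed calls below, never from user input.
    sql = f"""
        SELECT m.id FROM Moment AS m
        JOIN Temp1 ON m.source = Temp1.source
        JOIN {temp_table} AS t ON m.source = t.source
        WHERE m.time > ?
    """
    return {row[0] for row in conn.execute(sql, (after,))}

# One small query per OR branch; the union (OR) happens in application code.
matching_ids = branch("Temp2", "2017-01-01") | branch("Temp3", "2016-11-15")
```

An empty temporary table now just yields an empty set for its branch instead of wiping out the whole result, and each branch is a plain two-way intersection that the optimizer can drive from the indexes.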

Access 2007 - Left Join returns correct results, Inner Join returns nothing

I have a query that the only way I could get it to work was to left join, on three fields. If I did an ordinary inner join on these three fields the query returned nothing. But if I try each individual join separately, they all join as I would expect, e.g. Bob to Bob, Bookshop to Bookshop, Bread to Bread etc.
So for these two sets of query results...
1.Manager  1.Shop    1.Product  1.Cost    2.Manager  2.Shop    2.Product  2.Quantity
Bob        Hardware  Spanners   15        Bob        Hardware  Spanners   3
Terry      Food      Bread      12        Terry      Food      Bread      4
Sue        Bookshop  Books      18        Sue        Bookshop  Books      7
...this query returns no results:
SELECT 1.Manager, 1.Shop, 1.Product, 1.Cost, 2.Quantity
FROM 1 INNER JOIN 2 ON 1.Manager = 2.Manager AND 1.Shop = 2.Shop AND 1.Product = 2.Product;
I know joining on text isn't ideal, but I have similar queries that join on these three fields without problem, so wondered whether it was a 'feature' of Access that I had encountered, or whether it's likely to be a problem in the data?
-edit-
By putting the JOIN conditions into the WHERE clause instead, I found that, if I have WHERE 1.Manager = "Bob" AND 2.Manager = "Bob":
WHERE 1.Product = "Spanners"
works on its own, and:
WHERE 2.Product = "Spanners"
works on its own, but combining the two:
WHERE 1.Product = "Spanners" AND 2.Product = "Spanners"
again returns nothing!
-edit 2-
The main query does indeed behave properly when it is referencing the data in tables. So there may be something odd about the way the base queries return their results.
-edit 3-
This is the link to an example of the problem: [link removed]
01 Top Level Queries: both of these are the same, except that one refers to tables, and works, and the other refers to queries, and does not work. I want to find out why the query version doesn't work.
02 2nd Level Queries and Tables: there are two versions of each set of data - one is a query, and the other is a table made using a Make Table version of the query. Both are identical as far as I can tell.
03 and 04 Level Queries: these are lower level queries that go to make up the 2nd level queries
Tables: these are the base tables that all other queries are built on.
OK, so I downloaded your db and took a look. I got as far as finding that if you put the NumStores query first in your inner join then it would return records, then abandoned ship. I don't want to sound harsh, but you are so far down the road of poor database design that you have no hope of going further. Among the many issues that will continue to cause you problems are:
No primary keys in your tables (no indexes of any kind).
Incomprehensible naming convention for your objects (queries and tables).
Data is duplicated in many different tables (normalization violations).
Embedded subqueries in your main queries.
If you want to use Access to help you, you need to learn how to use it.
For the record, if anyone looks at this question having a similar problem - one of the queries that fed into the main query was grouping on a field that didn't appear anywhere in that particular query. Once I'd removed that field from the Group By clause the main query returned the results I expected.
Odd that a query was essentially returning exactly the same results with different behaviour, but there you go.
Had the same problem here in the future (year 2017, Access 2010).
For some reason, Left Join would work, bringing the exact same result that Inner Join used to bring before it mysteriously stopped working.
After reading the "Feb 11 '13 at 9:54" message, I noticed that one of the joined queries had duplicated, hidden Group By fields (for no reason), so I deleted them. It worked. Access recreated the hidden Group By fields, but no longer duplicated, and that was the (bug?) problem.

Leverage the work done in one SQL query to simplify a second one?

I have a database with, among others, the following two tables:
classes is a straightforward table that has one row per class in a class schedule.
sessions is a table that characterizes the days and times that each class meets, where each row is capable of expressing a notion like:
"Tuesdays | Jan 22-Mar 5 | 6-9pm"
"Tuesdays & Thursdays | Jan 22-Mar 7 | 6-9pm"
"Monday-Thursday | Jan 21-24 | 3-6pm"
"Saturday | Mar 9 | 9am-4pm"
and so on.
There is guaranteed to be at least one row in sessions for each row in classes, and for certain classes there may be two or more associated session rows.
At present, I'm using two different queries to get the class and session information for the classes that match a particular set of criteria, like this:
select c.class_id, c.title, c.instructor, c.num_seats, c.price
from classes c
join classes_by_department cbd
on (cbd.class_id = c.class_id)
join /* several other tables */
on /* several other join conditions */
where cbd.department_id = '{$dept_id}'
and /* several other qualifying conditions */
;
and this:
select s.class_id, s.start_date, s.end_date, s.day_bits, s.start_time, s.end_time
from sessions s
join classes c
on (c.class_id = s.class_id)
join classes_by_department cbd
on (cbd.class_id = s.class_id)
join /* the same other tables */
on /* the same other join conditions */
where cbd.department_id = '{$dept_id}'
and /* the same other qualifying conditions */
;
This works fine, and -- at least in the current application -- the tables aren't big enough, and the traffic isn't heavy enough, for two queries to be a problem. Nevertheless, it strikes me as a bit wasteful, and I'm wondering if there isn't a way to better leverage the work already done by the first query to perform the second one (rather than what amounts to running the same query twice and just selecting different columns).
Of course I realize that I could just select all the relevant columns from classes and sessions in a single query (the second one), but I like the fact that in the current approach, the first query delivers exactly one row per qualifying class, rather than as many rows as the class has session records. I would need to restructure the existing logic that processes the query results if I merged the queries. (Yeah, I know, waah...)
One solution that occurred to me is to collect all the class_ids returned by the first query into a vector (since I have to iterate through those results anyway) and then format the contents of that vector as the content of a value-list for an IN clause, so that the second query would simply become:
select s.class_id, s.start_date, s.end_date, s.day_bits, s.start_time, s.end_time
from sessions s
where s.class_id in (/* value-list */);
I'm not too worried about the scalability of such a solution, as I understand that huge SQL queries are no big deal. Plus, it could take advantage of an index defined over sessions.class_id.
But... well... it's just not very satisfying to someone who's looking to improve his SQL chops, which I'll freely admit are pretty rudimentary. It feels inelegant, and not very "SQL-ish," or whatever the SQL equivalent to the term Pythonic is.
Can anyone suggest something more appropriate?
The canonical way to do what you want is to use views. Define your first query as:
create view vw_MyClasses as
select c.class_id, c.title, c.instructor, c.num_seats, c.price, cbd.department_id
from classes c
join classes_by_department cbd
on (cbd.class_id = c.class_id)
join /* several other tables */
on /* several other join conditions */
where /* several other qualifying conditions */
Then your class query would be:
select *
from vw_MyClasses
where department_id = '{$dept_id}'
Then, your second query can be:
select s.class_id, s.start_date, s.end_date, s.day_bits, s.start_time, s.end_time
from sessions s
where s.class_id in (select class_id from vw_MyClasses
where department_id = '{$dept_id}');
Or, what may be more efficient in MySQL:
select s.class_id, s.start_date, s.end_date, s.day_bits, s.start_time, s.end_time
from sessions s
where exists (select 1 from vw_MyClasses mc
              where mc.class_id = s.class_id and mc.department_id = '{$dept_id}' limit 1)
There is a very good reason for doing this. Repeating such logic in multiple queries becomes a maintenance nightmare. As you modify the logic in one place, it is very easy to forget to make the modifications in all places. Sometimes, views are not sufficient, so you may need to use user defined functions, as explained here.
Also, if the criteria are so useful, you might want to put flags in the class table to identify them. This requires maintaining them in some way, such as nightly updates or using triggers.
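Here is a runnable sketch of the view approach, using SQLite via Python's sqlite3 module in place of MySQL and a cut-down, partly invented schema (only class_id, title, department_id and the session dates are kept; the sample rows are made up). The helper names classes_in and sessions_in are my own. The department id is bound as a parameter rather than interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down schema and invented sample rows, for illustration only.
conn.executescript("""
    CREATE TABLE classes (class_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE classes_by_department (class_id INTEGER, department_id TEXT);
    CREATE TABLE sessions (class_id INTEGER, start_date TEXT, end_date TEXT);

    INSERT INTO classes VALUES (1, 'Welding'), (2, 'Pottery'), (3, 'Algebra');
    INSERT INTO classes_by_department VALUES (1, 'shop'), (2, 'art'), (3, 'math');
    INSERT INTO sessions VALUES (1, '2013-01-22', '2013-03-05'),
                                (1, '2013-03-09', '2013-03-09'),
                                (3, '2013-01-21', '2013-01-24');

    -- The shared join logic lives in one place.
    CREATE VIEW vw_MyClasses AS
        SELECT c.class_id, c.title, cbd.department_id
        FROM classes c
        JOIN classes_by_department cbd ON cbd.class_id = c.class_id;
""")

def classes_in(dept_id):
    # First query: exactly one row per qualifying class.
    return conn.execute(
        "SELECT class_id, title FROM vw_MyClasses WHERE department_id = ? "
        "ORDER BY class_id", (dept_id,)).fetchall()

def sessions_in(dept_id):
    # Second query: reuses the view instead of repeating the joins.
    return conn.execute("""
        SELECT s.class_id, s.start_date, s.end_date
        FROM sessions s
        WHERE EXISTS (SELECT 1 FROM vw_MyClasses mc
                      WHERE mc.class_id = s.class_id
                        AND mc.department_id = ?)
        ORDER BY s.class_id, s.start_date
    """, (dept_id,)).fetchall()

shop_classes = classes_in("shop")
shop_sessions = sessions_in("shop")
```

The class query still returns one row per class and the session query one row per session, so the existing result-processing logic would not need restructuring.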
In all honesty I wouldn't bother. Firstly, it works just fine and seems fairly elegant to me from what you've told us. Secondly, if there's no reason to bring back extra data on the second query then don't do it. Thirdly, and by far the most important, as it currently stands it is fairly easy to understand what's happening. You may not always be the only person trying to decipher this, and it is important that the code is readable by someone else. Overcomplicated SQL queries are not nice.
I think it is just fine as is and its SQL-ishness is good.