Subquery or a double query? - mysql

For a client I have built a travel agency website. Now they have asked me to optimize and automate some things. One of these is that the accommodations shown in the search list must be bookable.
Simplified, this is the structure:
I have an Accommodation object; an Accommodation has one or more PricePeriods.
What I need to do is pick out the last PricePeriod connected to a given accommodation. The PricePeriod has a field 'date_until', which is a timestamp. The query must check that the current timestamp is less than the latest 'date_until' timestamp of the accommodation's PricePeriods.
This is easy to do with a second query which just gets the last timestamp from the priceperiod table from all the priceperiod-rows connected to the given accommodation.
I'm wondering if this is the best case, or if I should use a subquery for this. And if so, how would a query like this look? I don't have much experience with subqueries.
Update
Table structure (simple)
Accommodation ->ID
PricePeriod -> ID | AccommodationID | DateUntil
Simplified:
SELECT fieldlist FROM Accommodation WHERE ID = $id
SELECT MAX(DateUntil) FROM PricePeriod WHERE AccommodationID = $id
But I would like to do this in one query. I hope it's clear this way.

It depends upon a number of things, but what you should do is try a few different alternatives.
Personally, I would do:
SELECT fieldlist,
(SELECT MAX(DateUntil) FROM PricePeriod WHERE AccommodationID = a.id) AS last_date
FROM Accommodation AS a
WHERE a.id = $id
You could also do this (though it may not be as efficient):
SELECT fieldlist, MAX(b.DateUntil)
FROM Accommodation AS a
JOIN PricePeriod AS b ON a.id = b.AccommodationID
WHERE a.id = $id
GROUP BY a.id
The big thing is to try them all (be sure to use SQL_NO_CACHE so that the query cache does not skew the timings). Run them through EXPLAIN to see how "heavy" they are.
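As a rough sketch of the two single-query variants above, the following uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example is self-contained). The `Name` column, the sample rows, and the fixed `now` value are made up for illustration. It also shows one way the "bookable" check from the question could be folded into the same query.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Accommodation (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PricePeriod (
        ID INTEGER PRIMARY KEY,
        AccommodationID INTEGER,
        DateUntil INTEGER  -- Unix timestamp (hypothetical sample values)
    );
    INSERT INTO Accommodation VALUES (1, 'Villa'), (2, 'Cabin');
    INSERT INTO PricePeriod VALUES (1, 1, 100), (2, 1, 300), (3, 2, 150);
""")

# Variant 1: correlated subquery in the select list.
SUBQUERY = """
    SELECT a.ID,
           (SELECT MAX(p.DateUntil) FROM PricePeriod AS p
             WHERE p.AccommodationID = a.ID) AS last_date
    FROM Accommodation AS a
    WHERE a.ID = ?
"""

# Variant 2: join plus GROUP BY.
JOIN = """
    SELECT a.ID, MAX(p.DateUntil) AS last_date
    FROM Accommodation AS a
    JOIN PricePeriod AS p ON p.AccommodationID = a.ID
    WHERE a.ID = ?
    GROUP BY a.ID
"""

# Search-list filter: keep accommodations whose last period ends after "now".
BOOKABLE = """
    SELECT a.ID
    FROM Accommodation AS a
    WHERE (SELECT MAX(p.DateUntil) FROM PricePeriod AS p
            WHERE p.AccommodationID = a.ID) > ?
    ORDER BY a.ID
"""

via_subquery = conn.execute(SUBQUERY, (1,)).fetchall()
via_join = conn.execute(JOIN, (1,)).fetchall()
now = 200  # hypothetical "current timestamp"
bookable_ids = [row[0] for row in conn.execute(BOOKABLE, (now,))]
```

Both variants return the same last date for accommodation 1; which one is faster on real data is something EXPLAIN on the actual MySQL server would have to show.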

Understanding an SQL Alias

I'm new to SQL.
I've read the book called 'Sams Teach Yourself Oracle PL/SQL in 10 minutes'.
I found it very interesting and easy to understand. There was some information on aliases but when I started doing exercises I came across an alias I don't know the purpose of.
Here is the site http://www.sql-ex.ru/ and here is the database schema http://www.sql-ex.ru/help/select13.php#db_1, just in case. I'm working with the computer firm database, i.e. database number 1. The task is:
To find the makers producing PCs but not laptops.
Here is one of the solutions:
SELECT DISTINCT maker
FROM Product AS pcproduct
WHERE type = 'PC'
AND NOT EXISTS (SELECT maker
FROM Product
WHERE type = 'laptop'
AND maker = pcproduct.maker
);
The question is: Why do we need to alias Product as pcproduct and make the comparison 'maker = pcproduct.maker' in the subquery?
Because in the inner query there are columns, which are named exactly the same as those in outer query (due to the fact that you use the same table there).
Since the outer query's columns are visible in the inner query, you need a way to say which column you mean. Without the alias, you would write maker = maker in the inner query, which is always true (for non-NULL values).
You are accessing the table twice, once in the main query, one in the sub query.
In the main query you say: Look at each record. Dismiss it, if the type doesn't equal 'PC'. Dismiss it, if you find a record in the table for the same maker with type 'laptop'.
In order to ask for the same maker, you must compare the maker of the main query's record with the records of the subquery. Both stem from the same table, so where product.maker = product.maker would be ambiguous. (Or rather the DBMS would assume you are talking about the subquery record, because the expression is inside the subquery. where product.maker = product.maker would hence be true, and you'd end up checking only whether there is at least one laptop in the table, regardless of the maker.)
So when dealing with the same table twice in a query, give at least one of them an alias in order to tell one record from the other.
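The difference can be seen on a small data set. This is only a sketch: it uses Python's built-in sqlite3 module rather than the sql-ex.ru database, and the sample makers and models are invented. In SQLite, as described above, an unqualified column inside the subquery resolves to the subquery's own table first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (maker TEXT, model INTEGER, type TEXT);
    -- Hypothetical rows: A makes both, B only PCs, C only laptops.
    INSERT INTO Product VALUES
        ('A', 1, 'PC'), ('A', 2, 'laptop'),
        ('B', 3, 'PC'),
        ('C', 4, 'laptop');
""")

# Correlated: the inner maker is compared with the OUTER row's maker.
WITH_ALIAS = """
    SELECT DISTINCT maker
    FROM Product AS pcproduct
    WHERE type = 'PC'
      AND NOT EXISTS (SELECT maker FROM Product
                      WHERE type = 'laptop'
                        AND maker = pcproduct.maker)
    ORDER BY maker
"""

# Not correlated: both sides of maker = maker refer to the inner row,
# so the subquery only asks "does any laptop exist at all?".
WITHOUT_ALIAS = """
    SELECT DISTINCT maker
    FROM Product
    WHERE type = 'PC'
      AND NOT EXISTS (SELECT maker FROM Product
                      WHERE type = 'laptop'
                        AND maker = maker)
    ORDER BY maker
"""

with_alias = [row[0] for row in conn.execute(WITH_ALIAS)]
without_alias = [row[0] for row in conn.execute(WITHOUT_ALIAS)]
```

With the alias only maker B is returned; without it the result is empty, because at least one laptop exists somewhere in the table.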
Anyway for the given query I'd also qualify the other column in the expression for readability:
AND product.maker = pcproduct.maker
or even
FROM Product laptopproduct
WHERE type = 'laptop'
AND laptopproduct.maker = pcproduct.maker
On a side note: The query looks for makers that produce PCs, but no laptops. I'd prefer asking for this with aggregation:
select maker
from product
group by maker
having sum(type = 'PC') > 0
and sum(type = 'laptop') = 0;
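Summing a comparison like `type = 'PC'` relies on the DBMS treating a boolean result as 0 or 1, which MySQL and SQLite do but which is not portable to every DBMS. A more portable spelling uses CASE. The sketch below (Python's sqlite3 module, invented sample rows) runs both forms side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (maker TEXT, model INTEGER, type TEXT);
    -- Hypothetical rows: A makes both, B only PCs, C only laptops.
    INSERT INTO Product VALUES
        ('A', 1, 'PC'), ('A', 2, 'laptop'),
        ('B', 3, 'PC'),
        ('C', 4, 'laptop');
""")

# Form from the answer: the comparison evaluates to 0 or 1 and is summed.
BOOLEAN_SUM = """
    SELECT maker FROM Product
    GROUP BY maker
    HAVING SUM(type = 'PC') > 0
       AND SUM(type = 'laptop') = 0
    ORDER BY maker
"""

# Same logic with an explicit CASE expression.
CASE_SUM = """
    SELECT maker FROM Product
    GROUP BY maker
    HAVING SUM(CASE WHEN type = 'PC' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN type = 'laptop' THEN 1 ELSE 0 END) = 0
    ORDER BY maker
"""

boolean_result = [row[0] for row in conn.execute(BOOLEAN_SUM)]
case_result = [row[0] for row in conn.execute(CASE_SUM)]
```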
That query can be understood as :
Give me the makers that have products of the 'PC' type, but for which a
product of the 'laptop' type doesn't exist.
Including the tablename or alias name is sometimes needed when the same column names are used in more than 1 table.
So that the optimizer will know from which table the column is used.
It's not some smart AI that could guess that a criterion such as
WHERE x = x
would actually mean
WHERE table1.x = table2.x
But more often, shorter alias names are used.
To increase readability and make the SQL more concise.
For example. The following two queries are equivalent.
Without aliases:
SELECT myawesometableone.id, mysecondevenmoreawesometable.id,
mysecondevenmoreawesometable.col1
FROM myawesometableone
JOIN mysecondevenmoreawesometable on mysecondevenmoreawesometable.one_id = myawesometableone.id
With aliases:
SELECT t1.id, t2.id, t2.col1
FROM myawesometableone AS t1
JOIN mysecondevenmoreawesometable AS t2 on t2.one_id = t1.id
Which SQL do you think looks better?
As for why maker = pcproduct.maker is used inside the EXISTS:
That's how the syntax for EXISTS works.
You establish a link between the query in the EXISTS and the outer query.
And in that case, that link is the "maker" column.
This doesn't detract from the other (correct) answers, but an easier to follow example might be:
SELECT DISTINCT pcproduct.maker
FROM Product AS pcproduct
WHERE pcproduct.type = 'PC'
AND NOT EXISTS (SELECT internalproduct.maker
FROM Product AS internalproduct
WHERE internalproduct.type = 'laptop'
AND internalproduct.maker = pcproduct.maker
);

Join two tables and String Compare (large data set)

I am very new to SQL and don't really know much about what I'm doing. I'm trying to figure out how to get a list of leads and owners whose corresponding campaign record types are stated as "inter".
So far I have tried joining the two tables and running a string compare I found on a different Stack Overflow page. Separately they work fine, but together everything breaks. I only get the error "You have an error in your SQL syntax; check the manual".
select a.LeadId, b.OwnerId from
(select * from CampaignMember as a
join
select * from Campaign as b
on b.id = a.CampaignId)
where b.RecordTypeId like "inter%"
Schema:
Campaign CampaignMember
------------- ----------------
Id CampaignId
OwnerId LeadId
RecordTypeId ContactId
The string compare is also very slow. I am looking at a table of 600M values. Is there a faster alternative?
Is there also a way to get more specific errors in MySQL?
If you format your code properly, it will be very easy to see why it's not working.
select a.LeadId, b.OwnerId
from (
select *
from CampaignMember as a
join select *
from Campaign as b on b.id = a.CampaignId
)
where b.RecordTypeId like "inter%"
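That is not valid JOIN syntax: a JOIN takes a table (or a parenthesized subquery with an alias), not a bare SELECT. Also, standard SQL uses single quotes (') for string literals rather than double quotes (").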
Probably what you want is something like this
SELECT a.LeadId, b.OwnerId
FROM CampaignMember a
JOIN Campaign b ON b.id = a.CampaignId
WHERE b.RecordTypeId LIKE 'inter%'
Try this:
select CampaignMember.LeadId, Campaign.OwnerId from
Campaign
inner join
CampaignMember
on CampaignMember.CampaignId= Campaign.id
where Campaign.RecordTypeId like "inter%"
MySQL is generally pretty poor at handling sub-selects, so you should avoid them when possible. Also, your sub-select isn't filtering any rows, so it has to evaluate every row before applying the LIKE filter. This is sometimes "intelligently" handled by the query engine, but you should try to minimize reliance on the engine to optimize the query.
Additionally, you really should only return the columns that you care about; SELECT * is ok for confirming things, but slows queries down.
Therefore, the query posted by Eric (above) is actually the best choice.
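For reference, here is a small runnable sketch of the plain-join form, using Python's built-in sqlite3 module in place of MySQL (an assumption for the sake of a self-contained example). The sample rows and the index names are invented. On the speed question: in MySQL a pattern with a constant prefix such as 'inter%' can use an ordinary index on the column, so indexing RecordTypeId and CampaignId is a reasonable first thing to check with EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Campaign (
        Id INTEGER PRIMARY KEY, OwnerId INTEGER, RecordTypeId TEXT);
    CREATE TABLE CampaignMember (
        CampaignId INTEGER, LeadId INTEGER, ContactId INTEGER);
    -- Hypothetical index names; the filter column and the join column
    -- are the ones an optimizer would want indexed.
    CREATE INDEX idx_campaign_recordtype ON Campaign (RecordTypeId);
    CREATE INDEX idx_member_campaign ON CampaignMember (CampaignId);
    INSERT INTO Campaign VALUES
        (1, 10, 'internal'), (2, 20, 'external'), (3, 30, 'interim');
    INSERT INTO CampaignMember VALUES
        (1, 100, 900), (1, 101, 901), (2, 102, 902), (3, 103, 903);
""")

# Join the tables directly, select only the needed columns,
# and pass the pattern as a bound parameter.
QUERY = """
    SELECT a.LeadId, b.OwnerId
    FROM CampaignMember AS a
    JOIN Campaign AS b ON b.Id = a.CampaignId
    WHERE b.RecordTypeId LIKE ?
    ORDER BY a.LeadId
"""

rows = conn.execute(QUERY, ("inter%",)).fetchall()
```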

Using keys on JOIN

I want to get data that is separated on three tables:
app_android_devices:
id | associated_user_id | registration_id
app_android_devices_settings:
owner_id | is_user_id | notifications_receive | notifications_likes_only
app_android_devices_favorites:
owner_id | is_user_id | image_id
owner_id is either the id from app_android_devices or the associated_user_id, indicated by is_user_id.
That is because the user of my app should be able to login to their account or use the app anonymously. If the user logged in he will have the same settings and likes on all devices.
associated_user_id is 0 if the device is used anonymously or the user ID from another table.
Now I've got the following query:
SELECT registration_id
FROM app_android_devices d
JOIN app_android_devices_settings s
ON ((d.id=s.owner_id AND
s.is_user_id=0)
OR (
d.associated_user_id=s.owner_id AND
s.is_user_id=1))
JOIN app_android_devices_favorites f
ON (((d.id=f.owner_id AND
f.is_user_id=0)
OR
d.associated_user_id=f.owner_id AND
f.is_user_id=1)
AND f.image_id=86)
WHERE s.notifications_receive=1
AND (s.notifications_likes_only=0 OR f.image_id=86);
I use it to decide whether a device should receive a push notification for a new comment. I've set the following keys:
app_android_devices: id PRIMARY, associated_user_id
app_android_devices_settings: (owner_id, is_user_id) UNIQUE, notifications_receive, notifications_likes_only
app_android_devices_favorites: (owner_id, is_user_id, image_id) UNIQUE
I've noticed that the above query is really slow. If I run EXPLAIN on that query I see that MySQL is using no keys at all, although there are possible_keys listed.
What can I do to speed this query up?
Having such complicated JOIN conditions makes life hard for everyone. It makes life hard for the developer who wants to understand your query, and for the query optimizer that wants to give you exactly what you ask for while preferring more efficient operations.
So the first thing that I want to do, when you tell me that this query is slow and not using any index, is to take it apart and put it back together with simpler JOIN conditions.
From the way you describe this query, it sounds like the is_user_id column is a sort of state variable telling you whether the user is or is not logged in to your app. This is awkward to say the least; what happens if s.is_user_id != f.is_user_id? Why store this in both tables? For that matter, why store this in your database at all, instead of in a cookie?
Perhaps there's something I'm not understanding about the functionality you're going for here. In any case, the first thing I see that I want to get rid of is the OR in your JOIN conditions. I'm going to try to avoid making too many assumptions about which values in your query represent user input; here's a slightly generic example of how you might be able to rewrite these JOIN conditions as a UNION of two SELECT statements:
SELECT ... FROM
app_android_devices d
JOIN
app_android_devices_settings s ON d.id = s.owner_id
JOIN
app_android_devices_favorites f ON d.id = f.owner_id
WHERE s.is_user_id = 0 AND f.is_user_id = 0 AND ...
UNION ALL
SELECT ... FROM
app_android_devices d
JOIN
app_android_devices_settings s ON d.associated_user_id = s.owner_id
JOIN
app_android_devices_favorites f ON d.associated_user_id = f.owner_id
WHERE s.is_user_id = 1 AND f.is_user_id = 1 AND ...
If these two queries hit your indexes and are very selective, you might not notice the additional overhead (creation of a temporary table) required by the UNION operation. It looks as though one of your result sets may even be empty, in which case the cost of the UNION should be nil.
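A minimal sketch of this rewrite, run against SQLite through Python's sqlite3 module (a stand-in for MySQL; the sample rows are invented). It fills the elided parts of the generic example with the conditions from the question, and it assumes, as the rewrite above does, that settings and favorites are always matched the same way (both by device id, or both by user id). On this data the original OR-based query and the UNION ALL form return the same registration ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_android_devices (
        id INTEGER PRIMARY KEY, associated_user_id INTEGER,
        registration_id TEXT);
    CREATE TABLE app_android_devices_settings (
        owner_id INTEGER, is_user_id INTEGER,
        notifications_receive INTEGER, notifications_likes_only INTEGER,
        UNIQUE (owner_id, is_user_id));
    CREATE TABLE app_android_devices_favorites (
        owner_id INTEGER, is_user_id INTEGER, image_id INTEGER,
        UNIQUE (owner_id, is_user_id, image_id));
    -- Hypothetical data: devices 1 and 4 are anonymous, 2 and 3 share user 7.
    INSERT INTO app_android_devices VALUES
        (1, 0, 'reg-anon'), (2, 7, 'reg-user-a'),
        (3, 7, 'reg-user-b'), (4, 0, 'reg-muted');
    INSERT INTO app_android_devices_settings VALUES
        (1, 0, 1, 0), (7, 1, 1, 1), (4, 0, 0, 0);
    INSERT INTO app_android_devices_favorites VALUES
        (1, 0, 86), (7, 1, 86), (4, 0, 86);
""")

# The query from the question, with OR inside the JOIN conditions.
ORIGINAL = """
    SELECT d.registration_id
    FROM app_android_devices d
    JOIN app_android_devices_settings s
      ON (d.id = s.owner_id AND s.is_user_id = 0)
      OR (d.associated_user_id = s.owner_id AND s.is_user_id = 1)
    JOIN app_android_devices_favorites f
      ON ((d.id = f.owner_id AND f.is_user_id = 0)
          OR (d.associated_user_id = f.owner_id AND f.is_user_id = 1))
     AND f.image_id = 86
    WHERE s.notifications_receive = 1
      AND (s.notifications_likes_only = 0 OR f.image_id = 86)
"""

# One branch per ownership mode, each with simple equality joins.
UNION_ALL = """
    SELECT d.registration_id
    FROM app_android_devices d
    JOIN app_android_devices_settings s ON d.id = s.owner_id
    JOIN app_android_devices_favorites f ON d.id = f.owner_id
    WHERE s.is_user_id = 0 AND f.is_user_id = 0
      AND f.image_id = 86 AND s.notifications_receive = 1
    UNION ALL
    SELECT d.registration_id
    FROM app_android_devices d
    JOIN app_android_devices_settings s ON d.associated_user_id = s.owner_id
    JOIN app_android_devices_favorites f ON d.associated_user_id = f.owner_id
    WHERE s.is_user_id = 1 AND f.is_user_id = 1
      AND f.image_id = 86 AND s.notifications_receive = 1
"""

original_ids = sorted(row[0] for row in conn.execute(ORIGINAL))
union_ids = sorted(row[0] for row in conn.execute(UNION_ALL))
```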
But, maybe this doesn't work for you; here's another suggestion for an optimization you might pursue. In your original query, you have the following condition:
WHERE s.notifications_receive=1
AND (s.notifications_likes_only=0 OR f.image_id=86);
This isn't too cryptic - you want results only when the notifications_receive setting is true, and only if the notifications_likes_only setting is false or the requested image is a "favorite" image. Depending on the state of notifications_likes_only, it looks like you may not even care about the favorites table - wouldn't it be nice to avoid even reading from that table unless absolutely necessary?
This looks like a good case for EXISTS(). Instead of joining app_android_devices_favorites, try using a condition like this:
WHERE s.notifications_receive = 1
AND (s.notifications_likes_only = 0
OR EXISTS(SELECT 1 FROM app_android_devices_favorites
WHERE image_id = 86 AND owner_id = s.owner_id))
It doesn't matter what you try to SELECT in an EXISTS() subquery; some people prefer *, I like 1, but even if you gave specific columns it wouldn't affect the execution plan.
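Here is a small sketch of the EXISTS form, again using Python's sqlite3 module as a stand-in for MySQL with invented rows. To keep it short it only covers devices matched by device id (is_user_id = 0). The extra `f.is_user_id = s.is_user_id` condition is my own assumption, added because the unique key on favorites includes is_user_id. Note that, unlike an inner join on the favorites table, this form also returns devices with no favorites row at all when notifications_likes_only is 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_android_devices (
        id INTEGER PRIMARY KEY, associated_user_id INTEGER,
        registration_id TEXT);
    CREATE TABLE app_android_devices_settings (
        owner_id INTEGER, is_user_id INTEGER,
        notifications_receive INTEGER, notifications_likes_only INTEGER,
        UNIQUE (owner_id, is_user_id));
    CREATE TABLE app_android_devices_favorites (
        owner_id INTEGER, is_user_id INTEGER, image_id INTEGER,
        UNIQUE (owner_id, is_user_id, image_id));
    -- Hypothetical anonymous devices with different settings.
    INSERT INTO app_android_devices VALUES
        (1, 0, 'reg-all'), (2, 0, 'reg-liked'),
        (3, 0, 'reg-other'), (4, 0, 'reg-muted');
    INSERT INTO app_android_devices_settings VALUES
        (1, 0, 1, 0),  -- wants every notification
        (2, 0, 1, 1),  -- likes only, has favorited image 86
        (3, 0, 1, 1),  -- likes only, favorited a different image
        (4, 0, 0, 0);  -- notifications switched off
    INSERT INTO app_android_devices_favorites VALUES (2, 0, 86), (3, 0, 99);
""")

# The favorites table is only consulted through EXISTS, never joined.
QUERY = """
    SELECT d.registration_id
    FROM app_android_devices AS d
    JOIN app_android_devices_settings AS s
      ON d.id = s.owner_id AND s.is_user_id = 0
    WHERE s.notifications_receive = 1
      AND (s.notifications_likes_only = 0
           OR EXISTS (SELECT 1 FROM app_android_devices_favorites AS f
                      WHERE f.image_id = ?
                        AND f.owner_id = s.owner_id
                        AND f.is_user_id = s.is_user_id))
    ORDER BY d.registration_id
"""

notified = [row[0] for row in conn.execute(QUERY, (86,))]
```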

SQL Code to order by data from related table

Okay, this SQL query is giving me a headache; I'm hoping there's someone who's done something like this before.
I have two tables (truncated)
tblTickets: tblNotes:
ticketno (int) noteid (int)
firmid (int) ticketno (int)
ticket_desc (text) datecreated (datetime)
... ...
They are related in that a Ticket can have many Notes
What I need to do is create a query that searches by firmid (i.e. 32) and orders the "Tickets" by their latest "Note" using tblNotes.datecreated (ordered newest first)
Thanks!
NB. MySQL server (5.5.32)
EDIT: To those who've marked the question down: I have tried, and the furthest successful SQL I got was to list all tickets and notes joined by using JOIN on ticketno. I didn't add this code to the question because I guessed I was going about it all the wrong way, and maybe I needed to use a UNION, something I've always found tricky to use.
I need it to only search by the latest note for each ticket. That's what I needed help on.
You need to use a sub-query (a derived table in the FROM clause) to identify the last note date for each ticket, and then join to that sub-query to limit the notes that are returned.
The following should be enough to get you started.
SELECT ...
FROM tblTickets T
INNER JOIN
tblNotes N
ON N.ticketno = T.ticketno
INNER JOIN
(SELECT N1.ticketno
,MAX(N1.datecreated) AS last_note_date
FROM tblNotes N1
GROUP BY
N1.ticketno
)SQ
ON N.ticketno = SQ.ticketno
AND N.datecreated = SQ.last_note_date
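To round this out, here is one possible complete version with the firmid filter and the newest-first ordering from the question added. It is a sketch run against SQLite through Python's sqlite3 module rather than MySQL 5.5, the sample rows are invented, and dates are stored as ISO-formatted text so that they sort chronologically. Because this simplified form only needs the latest date, it joins the tickets straight to the derived table; tickets without any notes are dropped by the inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblTickets (
        ticketno INTEGER PRIMARY KEY, firmid INTEGER, ticket_desc TEXT);
    CREATE TABLE tblNotes (
        noteid INTEGER PRIMARY KEY, ticketno INTEGER, datecreated TEXT);
    -- Hypothetical rows: firm 32 owns tickets 1 and 2.
    INSERT INTO tblTickets VALUES
        (1, 32, 'printer'), (2, 32, 'email'), (3, 99, 'other firm');
    INSERT INTO tblNotes VALUES
        (1, 1, '2013-01-01 10:00:00'),
        (2, 1, '2013-03-01 10:00:00'),
        (3, 2, '2013-02-01 10:00:00'),
        (4, 3, '2013-04-01 10:00:00');
""")

# Derived table SQ holds one row per ticket with its latest note date.
QUERY = """
    SELECT T.ticketno, SQ.last_note_date
    FROM tblTickets T
    INNER JOIN
        (SELECT N1.ticketno, MAX(N1.datecreated) AS last_note_date
         FROM tblNotes N1
         GROUP BY N1.ticketno
        ) SQ
      ON SQ.ticketno = T.ticketno
    WHERE T.firmid = ?
    ORDER BY SQ.last_note_date DESC
"""

tickets = conn.execute(QUERY, (32,)).fetchall()
```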

Advanced Access Query

I have two tables. One contains Potential Customer information along with their Vehicle requirements (Vehicle Type, Vehicle Colour) etc. The other table contains a list of the Vehicles. This includes data such as NumberOfSeats, Max Speed, Price etc.
I need a query that will list Vehicles (from the Vehicles table) that satisfy the Potential Customers requirements (Vehicle Type) etc.
There's a few things I'd like to avoid in the query. I want to list these by ONLY specifying the Potential Customer's ID (Cust ID). I.E I don't want to have to do something like WHERE Cust ID = 1 AND ... AND ... AND ...
I thought about this and concluded that a JOIN or UNION is most likely needed to be used. But when I was trying to put a JOIN statement together, I found that I'd have to list loads of JOIN ON fields:
SELECT *
FROM [Potential Customer] INNER JOIN [Vehicles] AS Matches
ON Matches.[Number of Seats] >= [Potential Customer].[Min Seats] AND
Matches.[Color] = [Potential Customer].[Preferred Color] AND
...
WHERE [Potential Customer].[Cust No] = 3
Is there a better way to do this?
But you already have several ... AND ... statements. So I think that a good way to do it is:
SELECT Cars.* FROM Cars, Customer WHERE
Customer.ID = 1 AND
Cars.Whatever >= Customer.Whatever AND
...
I, personally, would do it that way because it's easy and understandable. Also, for about 8 years of marginal database experience, I never bothered to learn anything about joins (ashamed). And, BTW, this is not such an advanced query :P
You may be able to get what you are after by using a simple query like this:
SELECT Customer.Id, Vehicle.Id FROM Customer, Vehicle
WHERE Vehicle.criteria_1 >= Customer.Criteria_1 AND... AND Customer.Id = 3
That should give you a list of Vehicle.Id (or whatever else you select from Vehicle) for a specific customer.
BTW, how is the query going to be created? Ad-hoc in code? Stored proc?
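As an illustration of the approach in these answers, here is a sketch using Python's sqlite3 module instead of Access. The table layout, the column names (`MinSeats`, `PreferredColor`, `NumberOfSeats`, `Color`) and the rows are all hypothetical. The matching rules are written once in the join condition, so the caller only supplies the customer id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the two tables.
    CREATE TABLE Customer (
        Id INTEGER PRIMARY KEY, MinSeats INTEGER, PreferredColor TEXT);
    CREATE TABLE Vehicle (
        Id INTEGER PRIMARY KEY, NumberOfSeats INTEGER, Color TEXT);
    INSERT INTO Customer VALUES (1, 2, 'red'), (3, 5, 'blue');
    INSERT INTO Vehicle VALUES
        (10, 4, 'blue'), (11, 5, 'blue'), (12, 7, 'red'), (13, 7, 'blue');
""")

# Every requirement is a column-to-column comparison in the ON clause;
# the only literal input is the customer id.
QUERY = """
    SELECT v.Id
    FROM Customer AS c
    JOIN Vehicle AS v
      ON v.NumberOfSeats >= c.MinSeats
     AND v.Color = c.PreferredColor
    WHERE c.Id = ?
    ORDER BY v.Id
"""

def matching_vehicle_ids(customer_id):
    """Return the ids of vehicles that satisfy the customer's requirements."""
    return [row[0] for row in conn.execute(QUERY, (customer_id,))]

matches_for_3 = matching_vehicle_ids(3)
```

Adding another requirement then means adding one more comparison to the join condition, not another literal value to every call.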