SELECT with JOIN limited - mysql

I have the following query:
SELECT pro.*
FROM tb_AutProposta pro, tb_AutParcelamento par
WHERE pro.id = par.id
However, I want to limit the result to one tb_AutParcelamento row per contract. I tried a subselect, but without success.
The table pro holds contracts and par holds the parcels (installments) of each contract. Each contract generates n parcels, each parcel has a due date, and I need to know the last due date for each contract.
Any ideas?

I made up some field names because we don't know your exact table structure.
This query works under the following assumptions:
You want the latest ("highest") parcel due date for each related contract.
You have fields for due_date and pro_id in your parcel table (tb_AutParcelamento.pro_id being a foreign key to tb_AutProposta.id).
I made up pro_id because I assume the condition pro.id = par.id is wrong when the id fields in both tables are auto-increment values and the primary key of each table.
SELECT pro.*, MAX(par.due_date) as latest_due_date
FROM tb_AutProposta pro
LEFT JOIN tb_AutParcelamento par
ON pro.id = par.pro_id
GROUP BY pro.id
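To check the query above against sample data, here is a minimal sketch that runs the same statement through Python's built-in sqlite3 module rather than MySQL. The pro_id and due_date columns are the made-up names from this answer, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb_AutProposta (id INTEGER PRIMARY KEY);
    CREATE TABLE tb_AutParcelamento (
        id INTEGER PRIMARY KEY,
        pro_id INTEGER REFERENCES tb_AutProposta(id),  -- assumed foreign key
        due_date TEXT                                  -- assumed column name
    );
    -- Hypothetical data: contract 3 has no parcels at all.
    INSERT INTO tb_AutProposta (id) VALUES (1), (2), (3);
    INSERT INTO tb_AutParcelamento (pro_id, due_date) VALUES
        (1, '2014-01-10'), (1, '2014-03-10'), (1, '2014-02-10'),
        (2, '2014-05-01');
""")

# One row per contract, carrying the latest due date among its parcels.
latest = conn.execute("""
    SELECT pro.id, MAX(par.due_date) AS latest_due_date
    FROM tb_AutProposta pro
    LEFT JOIN tb_AutParcelamento par ON pro.id = par.pro_id
    GROUP BY pro.id
    ORDER BY pro.id
""").fetchall()

for row in latest:
    print(row)
```

Because of the LEFT JOIN, the contract without parcels is still returned, with a NULL (None in Python) due date.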

Without really knowing your data schema or what you're trying to do, you could try to limit the number of par records returned, for example by taking the lowest parcel id per contract.
SELECT pro.*, MIN(par.id) AS FirstParcel
FROM tb_AutProposta pro, tb_AutParcelamento par
WHERE pro.id = par.id
GROUP BY pro.id

This is a question involving the retrieval of the group-wise maximum/minimum of a set of records. The method I like to use is as follows:
SELECT
a.*,
b.due_date
FROM
tb_AutProposta a
INNER JOIN
tb_AutParcelamento b ON a.id = b.id
INNER JOIN
(
SELECT id, MAX(due_date) AS last_due_date
FROM tb_AutParcelamento
GROUP BY id
) c ON b.id = c.id AND b.due_date = c.last_due_date
Replace due_date with the actual name of the date field in tb_AutParcelamento.

Related

MySQL SELECT queries without LIMIT

I am doing a course on relational databases, MySQL to be more specific. We need to create some SELECT queries for a project. The project is related to music. It has tables to represent musicians (musician), bands (band), and a musician's ability to perform a certain task, like singing or playing the guitar (act).
Table musician contains :
id
name
stagename
startyear
Table band contains :
code
name
type ("band" or "solo")
startyear
And finally, table act contains :
band (foreign key to code of "band" table)
musician (foreign key to id of "musician" table)
hability (guitarist, singer, like that... and a foreign key to another table)
earnings
I have doubts about two exercises. The first one asks to select the id and stagename of the musicians who participate in the most acts in bands whose type is solo.
My solution for the first one is this:
SELECT ma.id, ma.stagename
FROM musician ma, act d, band ba
WHERE ma.id = d.musician
AND ba.code = d.band
AND ba.type = "solo"
GROUP BY ma.id, ma.stagename
HAVING COUNT(ma.id) = (SELECT COUNT(d2.musician) AS count
FROM act d2, band ba2
WHERE d2.band = ba2.code
AND ba2.type = "solo"
GROUP BY d2.musician
ORDER BY count DESC
LIMIT 1);
The second one is very similar to the first. For every startyear, we need to select the id and stagename of the musician who performs the most acts, with the corresponding number of acts and the maximum and minimum of his earnings. This is my solution:
SELECT ma.startyear, ma.id, ma.stagename, COUNT(ma.id) AS NumActs, MIN(d.earnings), MAX(d.earnings)
FROM musician ma, act d, band ba
WHERE ma.id = d.musician
AND ba.code = d.band
AND ba.type = "solo"
GROUP BY ma.startyear, ma.id, ma.stagename
HAVING COUNT(ma.id) = (SELECT COUNT(d2.musician) AS count
FROM act d2, band ba2
WHERE d2.band = ba2.code
AND ba2.type = "solo"
GROUP BY d2.musician
ORDER BY count DESC
LIMIT 1);
The results with my dummy data are perfect, but my teacher told us to avoid the LIMIT option. It is the only way we know to get the highest number, at least with what we have learned so far.
I've seen a lot of solutions that put a subquery in the FROM clause, but for this project we can't use subqueries inside FROM. Is this really possible without LIMIT?
Thanks in advance.
It is possible, but much worse than using a subquery in FROM or LIMIT, so I'd never use it in real life :)
Long story short, you can do something like this:
SELECT
m.id
, m.stagename
FROM
musician m
INNER JOIN act a ON (
a.musician = m.id
)
INNER JOIN band b ON (
b.code = a.band
AND b.type = 'solo'
)
GROUP BY
m.id
, m.stagename
HAVING
NOT EXISTS (
SELECT
*
FROM
act a2
INNER JOIN band b2 ON (
b2.code = a2.band
AND b2.type = 'solo'
)
WHERE
a2.musician != a.musician
GROUP BY
a2.musician
HAVING
COUNT(a2.musician) > COUNT(a.musician)
)
;
I think you can understand the idea from the query itself, as it's pretty straightforward. However, let me know if you need an explanation.
It is possible that your restriction was slightly different and you were only forbidden from using a subquery in the FROM part of your main query.
P.S. I also use the INNER JOIN ... ON syntax, as it makes it easier to see which conditions are join conditions and which are WHERE conditions.
P.P.S. There might be mistakes in the query, as I do not have your data structure and so cannot execute the query and check it. I only checked that the idea works with my test table.
EDIT I just re-read the question; my initial reading missed that inline views are disallowed.
We can avoid the ORDER BY ... DESC LIMIT 1 construct by making the subquery into an inline view (or, a "derived table" in the MySQL parlance), and using a MAX() aggregate.
As a trivial demonstration, this query:
SELECT b.foo
FROM bar b
ORDER
BY b.foo DESC
LIMIT 1
can be emulated with this query:
SELECT MAX(c.foo) AS foo
FROM (
SELECT b.foo
FROM bar b
) c
An example re-write of the first query in the question
SELECT ma.id
, ma.stagename
FROM musician ma
JOIN act d
ON d.musician = ma.id
JOIN band ba
ON ba.code = d.band
WHERE ba.type = 'solo'
GROUP
BY ma.id
, ma.stagename
HAVING COUNT(ma.id)
= ( SELECT MAX(c.count)
FROM (
SELECT COUNT(d2.musician) AS count
FROM act d2
JOIN band ba2
ON ba2.code = d2.band
WHERE ba2.type = 'solo'
GROUP
BY d2.musician
) c
)
NOTE: this is a demonstration of a rewrite of the query in the question; it makes no guarantee that this query (or the query in the question) returns a result that satisfies any particular specification. The specification given in the question is not at all clear.
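As a quick check that the MAX() rewrite behaves like the ORDER BY ... DESC LIMIT 1 original, the sketch below runs both forms side by side. It assumes Python's built-in sqlite3 module instead of MySQL, trims the tables to the columns the query touches, uses hypothetical sample rows, and renames the count alias to n.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE musician (id INTEGER PRIMARY KEY, stagename TEXT);
    CREATE TABLE band (code INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE act (band INTEGER, musician INTEGER);
    -- Hypothetical data: musicians 1 and 2 tie with two solo acts each.
    INSERT INTO musician VALUES (1, 'Ana'), (2, 'Rui'), (3, 'Eva');
    INSERT INTO band VALUES (10, 'solo'), (11, 'solo'), (12, 'band');
    INSERT INTO act VALUES (10, 1), (11, 1), (10, 2), (11, 2), (12, 3), (10, 3);
""")

# Shared outer query; only the scalar subquery after "=" differs.
OUTER = """
    SELECT ma.id, ma.stagename
    FROM musician ma
    JOIN act d ON d.musician = ma.id
    JOIN band ba ON ba.code = d.band
    WHERE ba.type = 'solo'
    GROUP BY ma.id, ma.stagename
    HAVING COUNT(ma.id) = ({subquery})
    ORDER BY ma.id
"""

# Per-musician count of solo acts, reused by both variants.
COUNTS = """
    SELECT COUNT(d2.musician) AS n
    FROM act d2
    JOIN band ba2 ON ba2.code = d2.band
    WHERE ba2.type = 'solo'
    GROUP BY d2.musician
"""

limit_sql = OUTER.format(subquery=COUNTS + " ORDER BY n DESC LIMIT 1")
max_sql = OUTER.format(subquery="SELECT MAX(c.n) FROM (" + COUNTS + ") c")

with_limit = conn.execute(limit_sql).fetchall()
with_max = conn.execute(max_sql).fetchall()
print(with_limit)
print(with_max)
```

Both variants return every musician tied for the highest count, since the HAVING clause compares against a single maximum value.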

Speeding up mysql query

I have a MySQL query that joins four tables. I thought joining the tables was the best approach, but now that the data is getting bigger, the query seems to make the application stop executing.
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM `purchase_order`
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY id ORDER BY `purchase_order`.`po_date` DESC LIMIT 0, 20
My problem is that the query takes a long time to finish. Is there a way to speed it up, or to change it for faster retrieval of the data?
Here's the EXPLAIN EXTENDED as requested in the comments.
Thanks in advance, I really hope this is the right channel for me to ask. If not please let me know.
Will this give you the correct list of ids?
SELECT id
FROM purchase_order
ORDER BY `po_date` DESC
LIMIT 0, 20
If so, then start with that before launching into the JOIN. You can also (I think) get rid of the GROUP BY that is causing an "explode-implode" of rows.
SELECT ...
FROM ( SELECT id ... (as above) ...) AS ids
JOIN purchase_order po ON po.id = ids.id
JOIN ... (the other tables)
GROUP BY ... -- (this may be problematic, especially with the LIMIT)
ORDER BY po.po_date DESC -- yes, this needs repeating
-- no LIMIT
Something like this
SELECT
`purchase_order`.`id`,
`purchase_order`.`po_date` AS po_date,
`purchase_order`.`po_number`,
`purchase_order`.`customer_id` AS customer_id ,
`customer`.`name` AS customer_name,
`purchase_order`.`status` AS po_status,
`purchase_order_items`.`product_id`,
`purchase_order_items`.`po_item_name`,
`product`.`weight` as product_weight,
`product`.`pending` as product_pending,
`product`.`company_owner` as company_owner,
`purchase_order_items`.`uom`,
`purchase_order_items`.`po_item_type`,
`purchase_order_items`.`order_sequence`,
`purchase_order_items`.`pending_balance`,
`purchase_order_items`.`quantity`,
`purchase_order_items`.`notes`,
`purchase_order_items`.`status` AS po_item_status,
`purchase_order_items`.`id` AS po_item_id
FROM (SELECT id, po_date, po_number, customer_id, status
FROM purchase_order
ORDER BY `po_date` DESC
LIMIT 0, 5) as purchase_order
INNER JOIN customer ON `customer`.`id` = `purchase_order`.`customer_id`
INNER JOIN purchase_order_items
ON `purchase_order_items`.`po_id` = `purchase_order`.`id`
INNER JOIN product ON `purchase_order_items`.`product_id` = `product`.`id`
GROUP BY purchase_order.id DESC
LIMIT 0, 5
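The idea of limiting purchase_order first and joining afterwards can be tried on a small scale. This is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; the tables are cut down to a few columns, the index names and the generated rows are hypothetical, and it says nothing about timings on the real data.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase_order (id INTEGER PRIMARY KEY, po_date TEXT);
    CREATE TABLE purchase_order_items (
        id INTEGER PRIMARY KEY, po_id INTEGER, po_item_name TEXT
    );
    -- Hypothetical index names; po_date and the join column are indexed.
    CREATE INDEX idx_po_date ON purchase_order (po_date);
    CREATE INDEX idx_poi_po_id ON purchase_order_items (po_id);
""")

# Hypothetical data: 50 orders on consecutive days, 3 items per order.
for po_id in range(1, 51):
    po_date = (date(2020, 1, 1) + timedelta(days=po_id)).isoformat()
    conn.execute("INSERT INTO purchase_order VALUES (?, ?)", (po_id, po_date))
    for n in range(3):
        conn.execute(
            "INSERT INTO purchase_order_items (po_id, po_item_name) VALUES (?, ?)",
            (po_id, f"item-{po_id}-{n}"),
        )

# The derived table picks the 5 newest orders before any join happens,
# so only the items of those 5 orders are read.
rows = conn.execute("""
    SELECT po.id, po.po_date, poi.po_item_name
    FROM (SELECT id, po_date
          FROM purchase_order
          ORDER BY po_date DESC
          LIMIT 5) AS po
    JOIN purchase_order_items poi ON poi.po_id = po.id
    ORDER BY po.po_date DESC, poi.id
""").fetchall()

order_ids = sorted({row[0] for row in rows})
print(order_ids, len(rows))
```

Note that the LIMIT here counts orders, not joined rows, so the result holds every item of the five newest orders.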
You need to be sure that purchase_order.po_date and all id columns are indexed. You can check this with the query below.
SHOW INDEX FROM yourtable;
Since you mentioned that the data is getting bigger, I would suggest sharding so that you can run multiple queries in parallel. Please refer to the following article:
Parallel Query for MySQL with Shard-Query
First, I cleaned up readability a bit. You don't need tick marks around every table.column reference. Also, for short-hand, using aliases works well. Ex: "po" instead of "purchase_order", "poi" instead of "purchase_order_items". The only time I would use tick marks is around reserved words that might cause a problem.
Second, you don't have any aggregations (sum, min, max, count, avg, etc.) in your query so you should be able to strip the GROUP BY clause.
As for indexes, I would have to assume you have an index on your reference tables on their respective "id" key columns.
For your purchase_order table, I would have an index with po_date in the first field position, in case you already have an index that includes it. Since your ORDER BY is on that column, the engine can jump directly to the most recent records and your descending order is resolved.
SELECT
po.id,
po.po_date,
po.po_number,
po.customer_id,
c.`name` AS customer_name,
po.`status` AS po_status,
poi.product_id,
poi.po_item_name,
p.weight as product_weight,
p.pending as product_pending,
p.company_owner,
poi.uom,
poi.po_item_type,
poi.order_sequence,
poi.pending_balance,
poi.quantity,
poi.notes,
poi.`status` AS po_item_status,
poi.id AS po_item_id
FROM
purchase_order po
INNER JOIN customer c
ON po.customer_id = c.id
INNER JOIN purchase_order_items poi
ON po.id = poi.po_id
INNER JOIN product p
ON poi.product_id = p.id
ORDER BY
po.po_date DESC
LIMIT
0, 20

What's wrong with this group by query in MySQL?

My DB structure:
Each "wp_custombill" has many "wp_customuser" rows.
I want to query all the custombill rows, each with the sum of its customuser amounts. My query is:
select cb.ID, cb.Description, cb.amount, SUM(cu.amount)
from wp_customuser as cu
inner join wp_custombill as cb
on cu.email = cb.Email
group by cb.ID, cb.Description, cb.amount
But the result looks as if SUM(cu.amount) sums all of the wp_customuser records.
My guess is that your wp_custombill ids (1, 2, 3) have the same email.
But without a data sample it is hard to test.
Otherwise your query looks OK.
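The guess can be reproduced on made-up data. This is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; the column set and the sample rows are assumptions, since the real table structure was not posted as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wp_custombill (
        ID INTEGER PRIMARY KEY, Description TEXT, amount INTEGER, Email TEXT
    );
    CREATE TABLE wp_customuser (
        ID INTEGER PRIMARY KEY, email TEXT, amount INTEGER
    );
    -- Hypothetical data: bills 1 and 2 share an email, bill 3 has its own.
    INSERT INTO wp_custombill VALUES
        (1, 'Bill A', 100, 'a@example.com'),
        (2, 'Bill B', 200, 'a@example.com'),
        (3, 'Bill C', 300, 'b@example.com');
    INSERT INTO wp_customuser (email, amount) VALUES
        ('a@example.com', 10), ('a@example.com', 20), ('b@example.com', 5);
""")

# Same join and grouping as the query in the question.
totals = conn.execute("""
    SELECT cb.ID, SUM(cu.amount)
    FROM wp_customuser AS cu
    INNER JOIN wp_custombill AS cb ON cu.email = cb.Email
    GROUP BY cb.ID, cb.Description, cb.amount
    ORDER BY cb.ID
""").fetchall()

print(totals)
```

Bills 1 and 2 both report the full total for the shared email, because the join matches every user with that email to every bill with that email.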
In the GROUP BY clause we have to name the field over which the aggregation is done. In your situation that field is the customer, not the bill: for each customer we have to sum the bill amount. So the query would look like this:
select cu.payentId, cb.ID, cb.Description, cb.amount, SUM(cu.amount)
from wp_customuser as cu
inner join wp_custombill as cb
on cu.email = cb.Email
group by cu.payentId
This query finds all the bills associated with a single customer by joining on the user table by email. You can also use cu.email or cb.email in the GROUP BY.
You don't need to group by all columns. If you group by the description or amount field, you may get duplicate records.
select cb.ID, cb.Description, cb.amount, SUM(cu.amount)
from wp_customuser as cu
inner join wp_custombill as cb
on cu.email = cb.Email
group by cb.email

Count, Group By, Subquery, Left Join not working as expected

This is puzzling me and no amount of Googling is helping, so I'm hoping someone can point me in the right direction.
Please note that I have omitted some fields from the tables that don't relate to the question just to simplify things.
contacts
contact_id
name
email
contact_uuids
uuid
contact_id
visitor_activity
uuid
event
contact_communications
comm_id
contact_id
employee_id
Query
SELECT
`c`.*,
(SELECT COUNT(`log_id`) FROM `contact_communications` `cc` WHERE `cc`.`contact_id` = `c`.`contact_id`) as `num_comms`,
(SELECT MAX(`date`) FROM `contact_communications` `cc` WHERE `cc`.`contact_id` = `c`.`contact_id`) as `latest_date`,
(SELECT MIN(`date`) FROM `contact_communications` `cc` WHERE `cc`.`contact_id` = `c`.`contact_id`) as `first_date`,
(SELECT COUNT(`vaid`) FROM `visitor_activity` `va` WHERE `va`.`uuid` = `cu`.`uuid`) as `num_act`
FROM `contacts` `c`
LEFT JOIN `contact_uuids` `cu` ON `c`.`contact_id` = `cu`.`contact_id`
GROUP BY `c`.`contact_id`
ORDER BY `c`.`name` ASC
Some contacts have multiple UUIDs (upwards of 20 or 30).
When I perform the query WITHOUT the GROUP BY statement, I get the results I expect - a row returned for each UUID that exists for that contact, with the correct "num_comms" and "num_act" numbers.
However, when I add the GROUP BY statement, "num_comms" is a lot smaller than expected and "num_act" returns only the value from the first row of the result without the GROUP BY statement.
I tried doing a "WHERE NOT IN" in the subquery, however that simply crashed the server as it was far too intense.
So - how do I get this to add up all the COUNT values from the LEFT JOIN and not just return the first value?
Also if anyone can help me optimize this that would be great.
Two problems:
GROUP BY c.contact_id does not include all the non-aggregate columns. This is a MySQL extension. What you get is arbitrary values for the columns other than contact_id.
The JOIN adds confusion. Your only use for visitor_activity is the COUNT(*) on it. But that does not make sense, since the count is limited to one UUID, whereas the row is limited to one contact_id. Rethink the purpose of that.
Remove this line:
(SELECT COUNT(`vaid`) FROM `visitor_activity` `va` WHERE `va`.`uuid` = `cu`.`uuid`) as `num_act`
and the rest may work ok.
I will continue with the assumption that you want the COUNT of all rows in visitor_activity for all the uuids associated with the one contact_id.
See if this:
( SELECT COUNT(*)
FROM `contact_uuids` cu2
JOIN `visitor_activity` USING(uuid)
WHERE cu2.contact_id = c.contact_id ) AS num_act
will work for the last subquery. At the same time, remove the JOIN:
LEFT JOIN `contact_uuids` `cu` ON `c`.`contact_id` = `cu`.`contact_id`
Now, back to the other problem (the non-standard usage of GROUP BY). Assuming that contact_id is the PRIMARY KEY, then simply remove the
GROUP BY `c`.`contact_id`
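A minimal sketch of the per-contact activity count, assuming that uuid links contact_uuids to visitor_activity as in the table listing in the question. It uses Python's built-in sqlite3 module rather than MySQL, keeps only the columns needed for the count, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (contact_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contact_uuids (uuid TEXT, contact_id INTEGER);
    CREATE TABLE visitor_activity (uuid TEXT, event TEXT);
    -- Hypothetical data: Alice has two UUIDs, Cara has none.
    INSERT INTO contacts VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Cara');
    INSERT INTO contact_uuids VALUES ('u1', 1), ('u2', 1), ('u3', 2);
    INSERT INTO visitor_activity VALUES
        ('u1', 'view'), ('u1', 'click'), ('u2', 'view'), ('u3', 'view');
""")

# No outer JOIN and no GROUP BY: the correlated subquery walks all UUIDs
# of the contact and counts their activity rows.
counts = conn.execute("""
    SELECT c.contact_id, c.name,
           (SELECT COUNT(*)
            FROM contact_uuids cu2
            JOIN visitor_activity va ON va.uuid = cu2.uuid
            WHERE cu2.contact_id = c.contact_id) AS num_act
    FROM contacts c
    ORDER BY c.name
""").fetchall()

print(counts)
```

Each contact appears exactly once, and a contact without any UUID gets a count of 0 instead of disappearing from the result.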

MySQL ORDER BY ignored with GROUP BY (Efficient work around?)

I feel there is a simple solution to this -- I've looked at other questions on Stack Overflow, but they seem to be inefficient, or perhaps I'm doing them wrong.
Here are simplified versions of tables I'm working with.
CREATE TABLE object_values (
object_value_id INT PRIMARY KEY,
object_id INT,
value FLOAT,
date_time DATETIME,
INDEX (object_id)
);
CREATE TABLE object_relations (
object_group_id INT,
object_id INT,
INDEX (object_group_id, object_id)
);
There is a many-to-one relationship -- many object values per one object (150 object values on average per object)
I want to get the last object value (determined by date_time field) for each object_id based on the object_group_id.
Current Query:
SELECT a.`object_id`, a.`value`
FROM `object_values` AS a
LEFT JOIN `object_relations` AS b
ON ( a.`object_id` = b.`object_id` )
WHERE b.`object_group_id` = 105
ORDER BY a.`date_time` DESC
This pulls back all the records. If I do a GROUP BY a.object_id, then the ORDER BY gets ignored.
I have tried a series of variations, and I apologize as I know this is a common question, but trying the other solutions hasn't quite worked so far. Any help would be greatly appreciated.
Some of the solutions came back with results about 70 seconds later -- this is a large system so needs to be a bit faster.
SELECT a.object_id, a.`value`
FROM object_values a JOIN object_relations b
ON a.object_id = b.object_id
JOIN (
SELECT a.object_id, MAX(a.date_time) MaxTime
FROM object_relations b JOIN object_values a
ON a.object_id = b.object_id
WHERE b.`object_group_id` = 105
GROUP BY a.object_id
) g ON a.object_id = g.object_id AND a.date_time = g.MaxTime
WHERE b.`object_group_id` = 105
GROUP BY a.object_id
You can omit the final GROUP BY if there will never be duplicate date_time in a single group.
Try this version without a group by, using a sub-query to pull the max date for each object:
SELECT a.`object_id`, a.`value`
FROM `object_relations` b
JOIN `object_values` a
ON b.`object_id` = a.`object_id`
AND b.`object_group_id` = 105
WHERE a.`date_time` = (select MAX(date_time) from object_values where object_id = a.object_id)
The left join was unneeded since you specified the group ID in the where clause, so I made the group the main table in the join. This could be done several ways. The sub query could also be moved to the join clause if WHERE is just changed to AND.
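The correlated-subquery version can be tried against the schema from the question. This is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; the composite index on (object_id, date_time) and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object_values (
        object_value_id INTEGER PRIMARY KEY,
        object_id INTEGER,
        value REAL,
        date_time TEXT
    );
    CREATE TABLE object_relations (object_group_id INTEGER, object_id INTEGER);
    CREATE INDEX idx_rel ON object_relations (object_group_id, object_id);
    -- Assumed extra index so MAX(date_time) per object can use an index.
    CREATE INDEX idx_val ON object_values (object_id, date_time);

    -- Hypothetical data: objects 1 and 2 are in group 105, object 3 is not.
    INSERT INTO object_relations VALUES (105, 1), (105, 2), (200, 3);
    INSERT INTO object_values (object_id, value, date_time) VALUES
        (1, 1.0, '2020-01-01 00:00:00'), (1, 2.0, '2020-01-02 00:00:00'),
        (2, 5.0, '2020-01-03 00:00:00'), (2, 4.0, '2020-01-01 00:00:00'),
        (3, 9.0, '2020-01-04 00:00:00');
""")

# Keep only the row whose date_time equals the newest one for its object.
latest_values = conn.execute("""
    SELECT a.object_id, a.value
    FROM object_relations b
    JOIN object_values a
      ON b.object_id = a.object_id
     AND b.object_group_id = 105
    WHERE a.date_time = (SELECT MAX(date_time)
                         FROM object_values
                         WHERE object_id = a.object_id)
    ORDER BY a.object_id
""").fetchall()

print(latest_values)
```

If two rows of the same object share the newest date_time, both are returned, so a tie-breaker would be needed in that case.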