can someone help me simplify this syntax - mysql

can someone help me make it simple and fast?
select k1.sto, k1.mm,k1.dd, k1.yy, k1.sto_name, k1.trannum,
count(k2.barcode)
from trans012020 as k1 , trans012020 as k2
where k1.barcode=123456789
and k1.mm=k2.mm
and k1.dd=k2.dd
and k1.yy=k2.yy
and k1.sto=k2.sto
and k1.trannum=k2.trannum
group by k1.trannum
having count(k2.barcode)=1;
if I run it, it should display all the details I need where count(barcode)=1.
it displays what I need but it took 6mins to display 5 rows of data.
also it took 6mins to display me an empty data

Using is a convient way to join where the column names are identical. As using and on are incompatible options I've put the barcode criteria in where and because its an inner join this has the same effect as a join criteria.
MySQL allows a count alias like k2count to be used later. So a partial simplification is:
select k1.sto, k1.mm,k1.dd, k1.yy, k1.sto_name, k1.trannum,
count(k2.barcode) as k2count
from trans012020 as k1
join trans012020 as k2
using (mm,dd,yy,sto,trannum)
where k1.barcode=123456789
group by k1.trannum
having k2count=1;
Some more work here is required as if you are grouping by trannum, assuming that isn't the primary key, which k1.{sto,mm,dd,yy,sto_name} fields do you expect it to display. (see Don't disable only_full_group_by.
Correct indexing will help with the query. See Rick's ROT, indexing, or add the show create table trans012020 into your question.

Keep in mind how JOIN and GROUP BY work together. First the JOINs are done. This explodes into a big temp table. Second the GROUP BY shrinks id down to essentially what you started with. Let's avoid that "inflate-deflate":
select k1.sto, k1.mm,k1.dd, k1.yy, k1.sto_name, k1.trannum,
( SELECT count(k2.barcode) FROM trans012020 as k2
WHERE k1.mm=k2.mm
and k1.dd=k2.dd
and k1.yy=k2.yy
and k1.sto=k2.sto
and k1.trannum=k2.trannum
) as k2count
from trans012020 as k1
where k1.barcode=123456789
having k2count=1;
If barcode is NOT NULL, change count(k2.barcode) to simply COUNT(*).
k1 needs an INDEX (or the PRIMARY KEY) beginning with barcode.
If k2count can never be more than 1, then there is an even better way:
select k1.sto, k1.mm,k1.dd, k1.yy, k1.sto_name, k1.trannum
from trans012020 as k1
where k1.barcode=123456789
AND EXISTS ( SELECT 1 FROM FROM trans012020 as k2
WHERE k1.mm=k2.mm
and k1.dd=k2.dd
and k1.yy=k2.yy
and k1.sto=k2.sto
and k1.trannum=k2.trannum )
And, yes, this is desirable for k2:
INDEX(mm, dd, yy, sto, trannum)
(The order of the columns is not important for this query.)
Note that the GROUP BY went away.

So, here are my thoughts, didn't do a POC, but from documentation
The select is evaluating everything until the group by, this mean is grouping everything in your database regardless of the having condition.
my suggestion is for you to try to do it like
select k1.sto,k1.mm,k1.dd,k1.yy,k1.sto_name,k1.trannum,count(k2.barcode) from trans012020 as k1 , trans012020 as k2
where
count(k2.barcode)=1 and
k1.barcode=123456789 and k1.mm=k2.mm and k1.dd=k2.dd and k1.yy=k2.yy and k1.sto=k2.sto and k1.trannum=k2.trannum
group by k1.trannum;
Hopefully that gets you the desire result
Here is some documentation on how these are evaluated
https://www.mysqltutorial.org/mysql-having.aspx

Related

Group rows but keep values where not null

I am trying to group rows in MySQL but end up with a wrong result.
My DB looks like this:
I'm using this query:
SELECT
r_id, va_id,va_klasse,va_periode,
1va_mer,1va_hjem,1va_mot,1va_bil,1va_fit,1va_hand,1va_med,1va_fra,
2va_mer,2va_hjem,2va_trae,2va_bil,2va_sty,2va_mus,2va_med,2va_fra,
3va_mer,3va_hjem,3va_mot,3va_bil,3va_pima,3va_nat,3va_med,3va_fra,
va_lock, va_update
FROM o6hxd_valgfag
WHERE va_klasse IN('7A','7B','7C','8A','8B','8C','9A','9B','9C')
GROUP BY va_id
ORDER BY va_klasse,va_name
This produces a wrong result, where one row is returned with only the first three numbers 123 and not the ones from row two and three.
What I would like is a result where the numbers 123, 321 and 132 are gathered in one line.
I can explain more detailed if this isn't sufficient.
If across those fields there should only be ever one value, you should really have them all in the same record and go about fixing it to insert and update the same record.
Ie I am aware that you database isn't designed correctly
However
To dig you out, you could give this a crack, I suppose.
SELECT
r_id, va_id,va_klasse,va_periode,
MAX(1va_mer),MAX(1va_hjem),MAX(1va_mot),MAX(1va_bil),MAX(1va_fit),MAX(1va_hand),MAX(1va_med),MAX(1va_fra),
MAX(2va_mer),MAX(2va_hjem),MAX(2va_trae),MAX(2va_bil),MAX(2va_sty),MAX(2va_mus),MAX(2va_med),MAX(2va_fra),
MAX(3va_mer),MAX(3va_hjem),MAX(3va_mot),MAX(3va_bil),MAX(3va_pima),MAX(3va_nat),MAX(3va_med),MAX(3va_fra),
va_lock, va_update
FROM o6hxd_valgfag
WHERE va_klasse IN('7A','7B','7C','8A','8B','8C','9A','9B','9C')
GROUP BY va_id
ORDER BY va_klasse,va_name
Your query will not work as intended. Think about this use-case:
what if for row1 (r_id =9), the fields 2va_sty, 2va_mus, 2va_med are not empty and has values?
In such case what should your desired output be? It certainly cannot be the numbers 123, 321 and 132 gathered in one line. Group by is usually used if you want to use aggregate functions executed against a certain field value, in your case va_id.
Not a solution to your problem but i think a better query would be like this (because of the not named columns in the group by clause https://dev.mysql.com/doc/refman/5.5/en/group-by-handling.html):
SELECT
aa.r_id, aa.va_id, aa.va_klasse, aa.va_periode,
aa.1va_mer, aa.1va_hjem, aa.1va_mot, aa.1va_bil, aa.1va_fit, aa.1va_hand, aa.1va_med, aa.1va_fra,
aa.2va_mer, aa.2va_hjem, aa.2va_trae, aa.2va_bil, aa.2va_sty,2va_mus, aa.2va_med, aa.2va_fra,
aa.3va_mer, aa.3va_hjem, aa.3va_mot, aa.3va_bil, aa.3va_pima, aa.3va_nat, aa.3va_med, aa.3va_fra,
aa.va_lock, aa.va_update
FROM o6hxd_valgfag AS aa
INNER JOIN (
SELECT va_id
FROM o6hxd_valgfag
GROUP BY va_id
) AS _aa
ON aa.va_id = _aa.va_id
WHERE aa.va_klasse IN ('7A','7B','7C','8A','8B','8C','9A','9B','9C')
ORDER BY aa.va_klasse, aa.va_name;

How to ORDER a LEFT JOIN for 2 tables

I've (on reflection ridiculously) stored our 'punters' in 2 tables depending on whether they registered or paid through the express checkout form.
My SQL looks like this:
SELECT
DISTINCT(sale_id), sale_punter_type, sale_comment, sale_refund, sale_timestamp,
punter_surname, punter_firstname,
punter_checkout_surname, punter_checkout_firstname,
punter_compo_surname, punter_compo_firstname,
sale_random, sale_scanned, sale_id
FROM
sale
LEFT JOIN
punter ON punter_id = sale_punter_no
LEFT JOIN
punter_checkout ON punter_checkout_id = sale_punter_no
LEFT JOIN
punter_compo ON punter_compo_id = sale_punter_no
WHERE
sale_event_no = :id
ORDER BY
punter_surname, punter_firstname,
punter_checkout_surname, punter_checkout_firstname
It returns results BUT lists the registered users alphabetically first, then the checkout punters alphabetically.
My question is there a way to get all of the users (registered or checkout) all together sorted alphabetically in 1 sorted list instead of 2 joined sorted lists.
I thought maybe I could use something like punter_checkout_surname AS punter_surname but that didn't work.
Any thoughts? I know now that I shouldn't have used 2 separate tables but but I'm stuck with it now.
Thank you.
I think you just want to use coalesce().
ORDER BY COALESCE(punter_surname, punter_checkout_surname)
COALESCE(punter_firstname, punter_checkout_firstname)
Other comments:
I doubt that DISTINCT is necessary. Why would this generate multiple rows for a single sale_id?
When a query has multiple tables, qualify all the column names (that is, include table aliases so you and others know where the table comes from).
Your data has three sets of names. That seems overkill.
You might want to put the COALESCE() in the SELECT so you don't have quite so many names generated by the query.
Here's my answer:
ORDER BY
COALESCE( UCASE( punter_surname) , UCASE( punter_checkout_surname ), UCASE( punter_compo_surname ) ) ,
COALESCE( UCASE( punter_firstname ) , UCASE( punter_checkout_firstname ), UCASE( punter_compo_firstname) )

MySQL Select within another select

I have a query as follows
select
Sum(If(departments.vat, If(weeklytransactions.weekendingdate Between
'2011-01-04' And '2099-12-31', weeklytransactions.takings / 1.2,
If(weeklytransactions.weekendingdate Between '2008-11-30' And '2010-01-01',
weeklytransactions.takings / 1.15, weeklytransactions.takings / 1.175)),
weeklytransactions.takings)) As Total,
weeklytransactions.weekendingdate,......
and another that returns a vat rate as follows
select format(Max(Distinct vat_rates.Vat_Rate),3) From vat_rates Where
vat_rates.Vat_From <= '2011-01-03'
I want to replace the hard coded if statement with the lower query, replacing the date in the lower query with weeklytransactions.weekendingdate.
After Kevin's comments, here is the full query I'm trying to get to work;
Select Max(vat_rates.vat_rate) As r,
If(departments.vat, weeklytransactions.takings / r, weeklytransactions.takings) As Total,
weeklytransactions.weekendingdate,
Week(weeklytransactions.weekendingdate),
round(datediff(weekendingdate, (if(month(weekendingdate)>5,concat(year(weekendingdate),'-06-01'),concat(year(weekendingdate)-1,'-06-01'))))/7,0)+1 as fyweek,
cast((Case When Month(weeklytransactions.weekendingdate) >5 Then Concat(Year(weeklytransactions.weekendingdate), '-',Year(weeklytransactions.weekendingdate) + 1) Else Concat(Year(weeklytransactions.weekendingdate) - 1, '-',Year(weeklytransactions.weekendingdate)) End) as char) As fy,
business_units.business_unit
From departments Inner Join (business_units Inner Join weeklytransactions On business_units.buID = weeklytransactions.businessUnit) On departments.deptid = weeklytransactions.departmentId
Where (vat_rates.vat_from <= weeklytransactions.weekendingdate and business_units.Active = true and business_units.sales=1)
Group By weeklytransactions.weekendingdate, business_units.business_unit Order By fy desc, business_unit, fyweek
Regards
Pete
Assuming I read your question correctly, your problem is about having the result of another SELECT used to be returned by the result of your main query (plus depending on how acquainted you are with SQL, maybe you haven't had the occasion to learn about JOINs?).
You can have subqueries you extract data from within a SELECT, provided you define it within the FROMclause. The following query will work, for example:
SELECT A.a, B.b
FROM A
JOIN (SELECT aggregate(c) FROM C) AS B
Notice that there is no reference to table A within the subquery. Thing is, you cannot just add it like that to the query, as the subquery doesn't know it is a subquery. So the following won't work:
SELECT A.a, B.b
FROM A
JOIN (SELECT aggregate(c) FROM C WHERE C.someValue = A.someValue) AS B
Back to basics. What you want to do here visibly, is to aggregate some data associated to each of the records of another table. For that, you will need merge your SELECT queries and use GROUP BY:
SELECT A.a, aggregate(C.c)
FROM A, C
WHERE C.someValue = A.someValue
GROUP BY A.a
Back to your tables, the following should work:
SELECT w.weekendingdate, FORMAT(MAX(v.Vat_Rate, 3)
FROM weeklytransactions AS w, vat_rates AS v
WHERE v.Vat_From <= w.weekendingdate
GROUP BY w.weekendingdate
Feel free to add and remove fields and conditions as you see fit (I wouldn't be surprised that you'd also want to use a lower bound when filtering the records from vat_rates, since the way I have written it above, for a given weekendingdate, you get records from that week + the weeks before!).
So it looks like my first try did not address the actual problem. With the additional information provided in the comments, as well as the new complete query, let's see how this goes.
We are still missing error messages, but normally the query as written should result in MySQL having the following complaint:
ERROR 1109 (42S02): Unknown table 'vat_rates' in field list
Why? Because the vat_rates table does not appear in the FROM clause, whereas it should. Let's make that more obvious by simplifying the query, removing all references to the business_units table as well as the fields, calculations and order that do not add or remove anything to the problem, leaving us with the following:
SELECT MAX(vat_rates.vat_rate) AS r,
IF(d.vat, w.takings / r, w.takings) AS Total
FROM departments AS d
INNER JOIN weeklytransactions AS w ON w.departmentId = d.deptid
WHERE vat_rates.vat_from <= w.weekendingdate
GROUP BY w.weekendingdate
That cannot work, and will produce the error mentioned above. It looks like there is no FOREIGN ID between the weeklytransactions and vat_rates tables, so we have no choice but to do a CROSS JOIN for the moment, hoping that the condition in the WHERE clause and the aggregate function used to get r are enough to fit the business logic at hand here. The following query should return the expected data instead of an error message (I also remove r since that seems to be an intermediate value judging by the comments that were written):
SELECT IF(d.vat, w.takings / MAX(v.vat_rate), w.takings) AS Total
FROM vat_rates AS v, departments AS d
INNER JOIN weeklytransactions AS w ON w.departmentId = d.deptid
WHERE v.vat_from <= w.weekendingdate
GROUP BY w.weekendingdate
From there, assuming it works, you will only need to put back all the parts I removed to get your final query. I am a tad doubtful about the way the VAT rate is gotten here, but I have no idea what your requirements are in that regard so I leave it up to you to make sure that works as expected.

MYSQL GROUP_CONCAT and IN

I have a little query, it goes like this:
It's slightly more complex than it looks, the only issue is using the output of one subquery as the parameter for an IN clause to generate another. It works to some degree - but it only provides the results from the first id in the "IN" clause. Oddly, if I manually insert the record ids "00003,00004,00005" it does give the proper results.
What I am seeking to do is get second level many to many relationship - basically tour_stops have items, which in turn have images. I am trying to get all the images from all the items to be in a JSON string as 'item_images'. As stated, it runs quickly, but only returns the images from the first related item.
SELECT DISTINCT
tour_stops.record_id,
(SELECT
GROUP_CONCAT( item.record_id ) AS in_item_ids
FROM tour_stop_item
LEFT OUTER JOIN item
ON item.record_id = tour_stop_item.item_id
WHERE tour_stop_item.tour_stops_id = tour_stops.record_id
GROUP BY tour_stops.record_id
) AS rel_items,
(SELECT
CONCAT('[ ',
GROUP_CONCAT(
CONCAT('{ \"record_id\" : \"',record_id,'\",
\"photo_credit\" : \"',photo_credit,'\" }')
)
,' ]')
FROM images
WHERE
images.attached_to IN(rel_items) AND
images.attached_table = 'item'
ORDER BY img_order ASC) AS item_images
FROM tour_stops
WHERE
tour_stops.attached_to_tour = $record_id
ORDER BY tour_stops.stop_order ASC
Both of these below answers I tried, but it did not help. The second example (placing the entire first subquery inside he "IN" statement) not only produced the same results I am already getting, but also increased query time exponentially.
EDIT: I replaced my IN statement with
IN(SELECT item_id FROM tour_stop_item WHERE tour_stops_id = tour_stops.record_id)
and it works, but it brutally slow now. Assuming I have everything indexed correctly, is this the best way to do it?
using group_concat in PHPMYADMIN will show the result as [BLOB - 3B]
GROUP_CONCAT in IN Subquery
Any insights are appreciated. Thanks
I am surprised that you can use rel_items in the subquery.
You might try:
concat(',', images.attached_to, ',') like concat('%,', rel_items, ',%') and
This may or may not be faster. The original version was fast presumably because there are no matches.
Or, you can try to change your in clause. Sometimes, these are poorly optimized:
exists (select 1
from tour_stop_item
where tour_stops_id = tour_stops.record_id and images.attached_to = item_id
)
And then be sure you have an index on tour_stop_item(tour_stops_id, item_id).

How to avoid filesort for that mysql query?

I'm using this kind of queries with different parameters :
EXPLAIN SELECT SQL_NO_CACHE `ilan_genel`.`id` , `ilan_genel`.`durum` , `ilan_genel`.`kategori` , `ilan_genel`.`tip` , `ilan_genel`.`ozellik` , `ilan_genel`.`m2` , `ilan_genel`.`fiyat` , `ilan_genel`.`baslik` , `ilan_genel`.`ilce` , `ilan_genel`.`parabirimi` , `ilan_genel`.`tarih` , `kgsim_mahalleler`.`isim` AS mahalle, `kgsim_ilceler`.`isim` AS ilce, (
SELECT `ilanresimler`.`resimlink`
FROM `ilanresimler`
WHERE `ilanresimler`.`ilanid` = `ilan_genel`.`id`
LIMIT 1
) AS resim
FROM (
`ilan_genel`
)
LEFT JOIN `kgsim_ilceler` ON `kgsim_ilceler`.`id` = `ilan_genel`.`ilce`
LEFT JOIN `kgsim_mahalleler` ON `kgsim_mahalleler`.`id` = `ilan_genel`.`mahalle`
WHERE `ilan_genel`.`ilce` = '703'
AND `ilan_genel`.`durum` = '1'
AND `ilan_genel`.`kategori` = '1'
AND `ilan_genel`.`tip` = '9'
ORDER BY `ilan_genel`.`id` DESC
LIMIT 225 , 15
and this is what i get in explain section:
these are the indexes that i already tried to use:
any help will be deeply appreciated what kind of index will be the best option or should i use another table structure ?
You should first simplify your query to understand your problem better. As it appears your problem is constrained to the ilan_gen1 table, the following query would also show you the same symptoms.:
SELECT * from ilan_gene1 WHERE `ilan_genel`.`ilce` = '703'
AND `ilan_genel`.`durum` = '1'
AND `ilan_genel`.`kategori` = '1'
AND `ilan_genel`.`tip` = '9'
So the first thing to do is check that this is the case. If so, the simpler question is simply why does this query require a file sort on 3661 rows. Now the 'hepsi' index sort order is:
ilce->mahelle->durum->kategori->tip->ozelik
I've written it that way to emphasise that it is first sorted on 'ilce', then 'mahelle', then 'durum', etc. Note that your query does not specify the 'mahelle' value. So the best the index can do is lookup on 'ilce'. Now I don't know the heuristics of your data, but the next logical step in debugging this would be:
SELECT * from ilan_gene1 WHERE `ilan_genel`.`ilce` = '703'`
Does this return 3661 rows?
If so, you should be able to see what is happening. The database is using the hepsi index, to the best of it's ability, getting 3661 rows back then sorting those rows in order to eliminate values according to the other criteria (i.e. 'durum', 'kategori', 'tip').
The key point here is that if data is sorted by A, B, C in that order and B is not specified, then the best logical thing that can be done is: first a look up on A then a filter on the remaining values against C. In this case, that filter is performed via a file sort.
Possible solutions
Supply 'mahelle' (B) in your query.
Add a new index on 'ilan_gene1' that doesn't require 'mahelle', i.e. A->C->D...
Another tip
In case I have misdiagnosed your problem (easy to do when I don't have your system to test against), the important thing here is the approach to solving the problem. In particular, how to break a complicated query into a simpler query that produces the same behaviour, until you get to a very simple SELECT statement that demonstrates the problem. At this point, the answer is usually much clearer.