MariaDB select from inner joined query - mysql

I am not able to further select from a joined subquery.
I have data in three tables: "events", "records" and "work_list". Each table has one piece of the puzzle where work_list is the shortest and contains top-level data, and the events table tracks many tiny frequent events.
I need to calculate many statistical variables from the events based on some key variables defined in work_list like weighted moving average etc. I have those metrics ready and working, but I have problems filtering the data in events based on selected parameters stored in work_list.
Here is code that does not work. The SELECT * is not important, I will change it to be more meaningful later, it is for clarity. However, I have tried many selections in place of the * without success.
What is wrong with this query from subquery?
Query example 1:
SELECT * FROM
(SELECT events.id, events.type,events.timestamp, work_list.task
FROM
( events
INNER JOIN records ON events.record_id = records.id
INNER JOIN work_list ON records.work_list_id = work_list.id
)
WHERE work_list.customer_number = '1234' AS subquery
);
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'as subquery ) LIMIT 0, 25' at line 8
The inner joined subquery works and it returns a normal table.
Query example 2:
SELECT events.id, events.type,events.timestamp, work_list.task
FROM (
events
INNER JOIN records ON events.record_id = records.id
INNER JOIN work_list ON records.work_list_id = work_list.id
)
WHERE work_list.customer_number = '1234';
I tried using parentheses in different orders, and I changed the selected columns in SELECT events.id, events.type,events.timestamp, work_list.task. I wonder if this is a poor way of doing this. I already have the calculation part, so even if there might be better structures for this, I am interested in solutions that maintain this structure.
The goal of this phase is to filter the events table for further queries that are coded on top of it replacing the SELECT *.
These are the final calculations made earlier which I plan to use when I figure out the problem with Query example 1.
Query example 3:
SELECT *, ((SUM(rate * diff) OVER (ORDER BY startTime
ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)) /
(SUM(diff) OVER(ORDER BY startTime
ROWS BETWEEN 4 PRECEDING AND CURRENT ROW))) as rate_WMA
FROM (
SELECT id, startTime, counts, diff, (counts / diff)*3600 as rate
FROM (
SELECT id, TIMESTAMPDIFF(SECOND, MIN(timestamp), MAX(timestamp))AS diff, SUM(change) as counts, MIN(timestamp) as startTime
FROM `the filtered subquery here`
GROUP BY id
) AS subquery
WHERE diff > 0
) AS totaltotal;

You have extra parentheses around the joins (they are not needed), and the alias for the subquery must come after the subquery's closing parenthesis, not inside it after the WHERE clause:
SELECT *
FROM (
SELECT events.id, events.type,events.timestamp, work_list.task
FROM events
INNER JOIN records ON events.record_id = records.id
INNER JOIN work_list ON records.work_list_id = work_list.id
WHERE work_list.customer_number = '1234'
) AS subquery;
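To show how the aliased subquery can then feed the later calculations, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MariaDB, and every row in it is invented for illustration; only the table and column names come from the question. The alias name filtered is also an assumption.

```python
import sqlite3

# In-memory SQLite database standing in for MariaDB; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work_list (id INTEGER PRIMARY KEY, customer_number TEXT, task TEXT);
CREATE TABLE records (id INTEGER PRIMARY KEY, work_list_id INTEGER);
CREATE TABLE events (id INTEGER, record_id INTEGER, type TEXT, timestamp TEXT);
INSERT INTO work_list VALUES (1, '1234', 'assembly'), (2, '9999', 'packing');
INSERT INTO records VALUES (10, 1), (20, 2);
INSERT INTO events VALUES
  (1, 10, 'count', '2024-01-01 08:00:00'),
  (1, 10, 'count', '2024-01-01 08:00:30'),
  (2, 20, 'count', '2024-01-01 09:00:00');
""")

# The filtered join is a derived table; its alias follows the closing parenthesis.
rows = conn.execute("""
SELECT id, COUNT(*) AS n_events, MIN(timestamp) AS startTime
FROM (
    SELECT events.id, events.type, events.timestamp, work_list.task
    FROM events
    INNER JOIN records ON events.record_id = records.id
    INNER JOIN work_list ON records.work_list_id = work_list.id
    WHERE work_list.customer_number = '1234'
) AS filtered
GROUP BY id
""").fetchall()

print(rows)
```

Only the events of customer '1234' reach the outer query, so the aggregation on top never sees the other customer's rows.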

Related

Is there a way to create an SQL query faster than this one?

I have a MySQL table which stores the data of a hotel's reservations.
I need a query to see the amount of guests who stayed in the hotel for each date.
I was able to create a query (using a subquery) but it performs very slowly. Is there a better way to get the requested data? (For example join the table to itself, or whatever.)
My query is:
SELECT CheckOutDate AS Date,
(SELECT SUM(NrOfGuests) FROM tblGuests tG
WHERE tG.CheckInDate <= tblGuests.CheckOutDate
AND tG.CheckOutDate > tblGuests.CheckOutDate
AND tG.IsCancelled = False AND tG.NoShow = False)
AS NrOfGestsStaying
FROM tblGuests
GROUP BY CheckOutDate
What is the best way to make it perform faster?
In the original query, the SELECT returns a SUM on every row of the table using a subquery. The duplicates are removed afterwards using a group by CheckOutDate. So, in other words, this is the SUM(NrOfGuests) for distinct CheckOutDate.
You can remove duplicate CheckOutDate in advance by subquerying distinct CheckOutDate. So in the receiving query the SUM is applied just one time for distinct CheckOutDate:
SELECT dT.CheckOutDate
,(SELECT SUM(NrOfGuests)
FROM tblGuests tG
WHERE tG.CheckInDate <= dT.CheckOutDate
AND tG.CheckOutDate > dT.CheckOutDate
AND tG.IsCancelled = 0
AND tG.NoShow = 0
) AS NrOfGuests
FROM (
SELECT DISTINCT CheckOutDate
FROM tblGuests
) AS dT
ORDER BY dT.CheckOutDate
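The distinct-date approach can be tried on a small data set. This is only a sketch: it assumes SQLite via Python's sqlite3 in place of MySQL, the four reservations are invented, and it uses the question's strict comparison (CheckOutDate > date), so guests leaving on a date are not counted as staying on it.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the reservations are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblGuests (
    id INTEGER PRIMARY KEY, CheckInDate TEXT, CheckOutDate TEXT,
    NrOfGuests INTEGER, IsCancelled INTEGER, NoShow INTEGER);
INSERT INTO tblGuests VALUES
  (1, '2024-01-01', '2024-01-03', 2, 0, 0),
  (2, '2024-01-02', '2024-01-04', 3, 0, 0),
  (3, '2024-01-02', '2024-01-03', 1, 1, 0),  -- cancelled, must be ignored
  (4, '2024-01-03', '2024-01-05', 4, 0, 0);
""")

# The correlated SUM runs once per distinct check-out date, not once per row.
rows = conn.execute("""
SELECT dT.CheckOutDate,
       (SELECT SUM(NrOfGuests)
        FROM tblGuests tG
        WHERE tG.CheckInDate <= dT.CheckOutDate
          AND tG.CheckOutDate > dT.CheckOutDate
          AND tG.IsCancelled = 0
          AND tG.NoShow = 0) AS NrOfGuests
FROM (SELECT DISTINCT CheckOutDate FROM tblGuests) AS dT
ORDER BY dT.CheckOutDate
""").fetchall()

print(rows)
```

A date on which nobody is staying yields NULL (None in Python) rather than 0, because SUM over no rows is NULL.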

SQL query needs optimization

SELECT LM.user_id,LM.users_lineup_id, min( LM.total_score ) AS total_score
FROM vi_lineup_master LM JOIN
vi_contest AS C
ON C.contest_unique_id = LM.contest_unique_id join
(SELECT min( total_score ) as total_score
FROM vi_lineup_master
GROUP BY group_unique_id
) as preq
ON LM.total_score = preq.total_score
WHERE LM.contest_unique_id = 'iledhSBDO' AND
C.league_contest_type = 1
GROUP BY group_unique_id
The query above finds the loser in each group of a game. It returns accurate results, but it stops responding on large data sets. How can I optimize it?
You can try moving your JOINs into subqueries. Also, pay attention to the "wrong" GROUP BY usage in the outer query. In MySQL you can group by some columns and select others that are neither in the GROUP BY clause nor wrapped in an aggregate function, but then the database gives no guarantee about which row's values it returns. For the sake of consistency in your application, wrap them in an aggregate function.
Check if this one helps:
SELECT
MIN(LM.user_id) AS user_id,
MIN(LM.users_lineup_id) AS users_lineup_id,
MIN(LM.total_score) AS total_score
FROM vi_lineup_master LM
WHERE 1=1
-- check that this "contest_unique_id" equals
-- 'iledhSBDO' and has "league_contest_type" = 1
AND LM.contest_unique_id IN
(
SELECT C.contest_unique_id
FROM vi_contest AS C
WHERE 1=1
AND C.contest_unique_id = 'iledhSBDO'
AND C.league_contest_type = 1
)
-- check if this "total_score" is one of the
-- "min(total_score)" from each "group_unique_id"
AND LM.total_score IN
(
SELECT MIN(total_score)
FROM vi_lineup_master
GROUP BY group_unique_id
)
GROUP BY LM.group_unique_id
;
Some pieces of this query may seem redundant, but that is because I did not want to change the filters you wrote; I only moved them.
Your query logic also seems a bit strange to me, judging by the table and column names and how you wrote it. Please check the comments in my query, which reflect what I understood of your implementation.
Hope it helps.
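One part of the logic that may deserve a second look is that both queries compare total_score against the minimum of any group, not the minimum of the row's own group. The sketch below illustrates the difference. It assumes SQLite via Python's sqlite3 instead of MySQL, a simplified hypothetical table lineup_master without the contest filter, and invented rows.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; table and rows are simplified and invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lineup_master (user_id TEXT, group_unique_id TEXT, total_score INTEGER);
INSERT INTO lineup_master VALUES
  ('u1', 'g1', 50), ('u2', 'g1', 30),
  ('u3', 'g2', 30), ('u4', 'g2', 20);
""")

# Join on the score alone: u3 matches because 30 is the minimum of another group.
score_only = conn.execute("""
SELECT LM.group_unique_id, LM.user_id, LM.total_score
FROM lineup_master LM
JOIN (SELECT MIN(total_score) AS total_score
      FROM lineup_master GROUP BY group_unique_id) AS preq
  ON LM.total_score = preq.total_score
ORDER BY LM.group_unique_id, LM.user_id
""").fetchall()

# Join on group and score: only the lowest scorer of each group remains.
per_group = conn.execute("""
SELECT LM.group_unique_id, LM.user_id, LM.total_score
FROM lineup_master LM
JOIN (SELECT group_unique_id, MIN(total_score) AS min_score
      FROM lineup_master GROUP BY group_unique_id) AS preq
  ON preq.group_unique_id = LM.group_unique_id
 AND preq.min_score = LM.total_score
ORDER BY LM.group_unique_id, LM.user_id
""").fetchall()

print(score_only)
print(per_group)
```

Joining on both columns also lets an index on (group_unique_id, total_score) serve the lookup, which may help on large tables, though that depends on the actual schema.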

Incorrect group by and order by merge

I have a couple of tables joined in MySQL in a one-to-many relationship.
I am trying to select items from one table, ordered by the minimum values from the other.
Without grouping, it looks like this:
Code:
select `catalog_products`.id
, `catalog_products`.alias
, `tmpKits`.`minPrice`
from `catalog_products`
left join `product_kits` on `product_kits`.`product_id` = `catalog_products`.`id`
left join (
SELECT MIN(new_price) AS minPrice, id FROM product_kits GROUP BY id
) AS tmpKits on `tmpKits`.`id` = `product_kits`.`id`
where `category_id` in ('62')
order by product_kits.new_price ASC
Result:
But when I add group by, I get this:
Code:
select `catalog_products`.id
, `catalog_products`.alias
, `tmpKits`.`minPrice`
from `catalog_products`
left join `product_kits` on `product_kits`.`product_id` = `catalog_products`.`id`
left join (
SELECT MIN(new_price) AS minPrice, id FROM product_kits GROUP BY id
) AS tmpKits on `tmpKits`.`id` = `product_kits`.`id`
where `category_id` in ('62')
group by `catalog_products`.`id`
order by product_kits.new_price ASC
Result:
And this is incorrect sorting!
Somehow, when I group these results, I get id 280 before 281!
But I need to get:
281|1600.00
280|2340.00
So, grouping breaks existing ordering!
First, when you apply GROUP BY to only one column, there is no guarantee which row's values the other, non-aggregated columns will take. MySQL allows this kind of SELECT/GROUP BY; most other products do not. Second, using ORDER BY in a subquery, while allowed in MySQL, is not allowed in other database products, including SQL Server. You should use a solution that returns the proper result each time it is executed.
So the query will be:
select CP.`id`, CP.`alias`, MIN(TK.`minPrice`) AS `minPrice`
from catalog_products CP
left join `product_kits` PK on PK.`product_id` = CP.`id`
left join (
SELECT MIN(`new_price`) AS `minPrice`, `id` FROM product_kits GROUP BY `id`
) AS TK on TK.`id` = PK.`id`
where CP.`category_id` IN ('62')
-- GROUP BY must precede ORDER BY; sort on the aggregated price
group by CP.`id`, CP.`alias`
order by `minPrice` ASC
The thing is that GROUP BY does not take ORDER BY into account in MySQL: rows are grouped first and sorted afterwards.
Actually, what I was doing is really bad practice.
In this case you should use DISTINCT and select catalog_products.*.
In my opinion, GROUP BY is really useful when you need to group the results of aggregate functions.
Otherwise you should not use it to get unique values.
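The ordering the question asks for (281 before 280) can be reproduced by aggregating the kit price and sorting on that aggregate. This is a sketch only, assuming SQLite via Python's sqlite3 in place of MySQL; the minimum prices 1600.00 and 2340.00 come from the question, while the aliases and the other kit prices are invented.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; aliases and non-minimum prices are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE catalog_products (id INTEGER PRIMARY KEY, alias TEXT, category_id INTEGER);
CREATE TABLE product_kits (id INTEGER PRIMARY KEY, product_id INTEGER, new_price REAL);
INSERT INTO catalog_products VALUES (280, 'product-a', 62), (281, 'product-b', 62);
INSERT INTO product_kits VALUES
  (1, 280, 2340.00), (2, 280, 2500.00),
  (3, 281, 1600.00), (4, 281, 3000.00);
""")

# One row per product, sorted by the cheapest kit of that product.
rows = conn.execute("""
SELECT CP.id, MIN(PK.new_price) AS minPrice
FROM catalog_products CP
LEFT JOIN product_kits PK ON PK.product_id = CP.id
WHERE CP.category_id = 62
GROUP BY CP.id
ORDER BY minPrice ASC
""").fetchall()

print(rows)
```

Because the sort key is itself an aggregate, the result does not depend on which kit row the engine happens to pick for each group.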

`ORDER BY` before `GROUP BY` in a request with INNER JOIN?

I need to build a query that fetches sites from one table, ordered by the date of each site's newest article (articles are stored in a separate table).
I built the following query:
SELECT *
FROM `sites`
INNER JOIN `articles` ON `articles`.`site_id` = `sites`.`id`
ORDER BY `articles`.`date` DESC
GROUP BY `sites`.`id`
I assumed that SELECT and INNER JOIN would fetch all posts and associate a site with each one, then ORDER BY would sort the result by descending post date, then GROUP BY would take the very first post for each site, and I would get the result I need.
But I'm receiving MySQL error #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'GROUP BY `sites`.`id` LIMIT 0, 30' at line 7
If I place GROUP BY before the ORDER BY clause, the query works, but it does not give me the newest post for each site. Instead the result is sorted after the grouping, which is not what I need (actually I might prefer to order in another way after grouping).
I read several pretty similar questions but they all related to the data stored in a single table making it possible to use MAX and MIN functions.
What should I do to implement what I need?
You can use either a subquery / derived-table / inline-view or a self-exclusion join, e.g.:
SELECT s.*, a1.*
FROM `sites` s
INNER JOIN `articles` a1 ON a1.`site_id` = s.`id`
LEFT OUTER JOIN `articles` a2 ON a2.`site_id` = a1.`site_id`
AND a2.`date` > a1.`date`
WHERE
a2.`site_id` IS NULL
ORDER BY
a1.`date` DESC
The principle is that you keep, for each site, only the article for which no other article of the same site has a later date: the LEFT JOIN finds no such a2 row, so a2.site_id is NULL.
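The self-exclusion join can be checked on a few rows. A minimal sketch, assuming SQLite via Python's sqlite3 in place of MySQL; the url column and all rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sites (id INTEGER PRIMARY KEY, url TEXT);
CREATE TABLE articles (id INTEGER PRIMARY KEY, site_id INTEGER, date TEXT);
INSERT INTO sites VALUES (1, 'a.example'), (2, 'b.example');
INSERT INTO articles VALUES
  (1, 1, '2024-01-01'),
  (2, 1, '2024-03-01'),
  (3, 2, '2024-02-01');
""")

# a2 looks for a later article of the same site; keeping rows where none exists
# leaves exactly the newest article per site.
rows = conn.execute("""
SELECT s.url, a1.date
FROM sites s
INNER JOIN articles a1 ON a1.site_id = s.id
LEFT OUTER JOIN articles a2
       ON a2.site_id = a1.site_id AND a2.date > a1.date
WHERE a2.site_id IS NULL
ORDER BY a1.date DESC
""").fetchall()

print(rows)
```

If two articles of one site share the same latest date, both survive the filter, so a tie-breaker such as the article id would be needed to guarantee one row per site.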
Rewrite the SQL using the following syntax (identifiers take backticks; single quotes would make them string literals):
SELECT `articles`.`article_name`, `sites`.`id`, `articles`.`site_id`
FROM `sites`, `articles`
WHERE `articles`.`site_id` = `sites`.`id`
ORDER BY `sites`.`id`, `articles`.`date` DESC;
Do something like this in the SELECT statement. GROUP BY requires every selected column to be either grouped or aggregated, so SELECT * cannot be used with it. A window function avoids that restriction (ROW_NUMBER() needs MySQL 8.0+ or MariaDB 10.2+):
SELECT * FROM (
SELECT S.<col1>, S.<col2>, A.<col1>, A.<col2>,
ROW_NUMBER()
OVER (PARTITION BY S.`id`
ORDER BY A.`date` DESC)
AS RID
FROM `sites` S
INNER JOIN `articles` A ON A.`site_id` = S.`id`
) AS ranked
WHERE RID = 1;
Can you try this?
Finally I came to the solution.
First of all, I changed the main query from querying the sites table to querying the articles table. Next, I added the MAX(date) column to the result.
So the resulting query implementing the thing I need is the following:
SELECT `sites`.`url`,MAX(`articles`.`date`) AS `last_article_date`
FROM `articles`
INNER JOIN `sites` ON `sites`.`id` = `articles`.`site_id`
GROUP BY `site_id`
ORDER BY `last_article_date` ASC
Thanks to all of you for giving me hints and right search directions!

Getting the number of rows with a GROUP BY query within a subquery

I'm building a MySQL query with subqueries. The query requires, as described in Getting the number of rows with a GROUP BY query, the number of records returned by a group-by query, because I want the number of days with records in the database. So I'm using the following:
SELECT
COUNT(*)
FROM
(
SELECT
cvdbs2.dateDone
FROM
cvdbStatistics cvdbs2
WHERE
cvdbs2.mediatorId = 123
GROUP BY
DATE_FORMAT( cvdbs2.dateDone, "%Y-%d-%m" )
) AS activityTempTable
Now, I want this as a subquery, because I need some more data with different WHERE statements. So my query becomes:
SELECT
x,
y,
z,
(
SELECT
COUNT(*)
FROM
(
SELECT
cvdbs2.dateDone
FROM
cvdbStatistics cvdbs2
WHERE
cvdbs2.mediatorId = mediators.id
GROUP BY
DATE_FORMAT( cvdbs2.dateDone, "%Y-%d-%m" )
) AS activityTempTable
) AS activeDays
FROM
mediators
LEFT JOIN
cvdbStatistics
ON
mediators.id = cvdbStatistics.mediatorId
WHERE
mediators.recruiterId = 409
GROUP BY
mediators.email
(I left out some irrelevant WHERE-statements from my queries. 409 is just an example id, this is inserted by PHP).
Now, I'm getting the following error:
#1054 - Unknown column 'mediators.id' in 'where clause'
MySQL forgot about the mediators.id in the deepest subquery. How can I build a query, with the number of results of a GROUP-BY query, which requires a value from the main query, as one of the results? Why isn't the deepest query aware of 'mediators.id'?
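A derived table (a subquery in the FROM clause) cannot reference columns of the outer query, which is why mediators.id is unknown two levels down. Counting the distinct days directly removes the need for the derived table, so the reference sits only one level deep. Try the following: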
SELECT
x,
y,
z,
(
SELECT
COUNT(distinct DATE_FORMAT( cvdbs2.dateDone, "%Y-%d-%m" ))
FROM
cvdbStatistics cvdbs2
WHERE
cvdbs2.mediatorId = mediators.id
) AS activeDays
FROM
mediators
LEFT JOIN
cvdbStatistics
ON
mediators.id = cvdbStatistics.mediatorId
WHERE
mediators.recruiterId = 409
GROUP BY
mediators.email
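The COUNT(DISTINCT ...) form can be tried on a small data set. A minimal sketch, assuming SQLite via Python's sqlite3 in place of MySQL: SQLite has no DATE_FORMAT, so date() is used here to strip the time part, and the LEFT JOIN and the x, y, z columns of the original are left out. All rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mediators (id INTEGER PRIMARY KEY, email TEXT, recruiterId INTEGER);
CREATE TABLE cvdbStatistics (mediatorId INTEGER, dateDone TEXT);
INSERT INTO mediators VALUES (1, 'a@example.com', 409), (2, 'b@example.com', 409);
INSERT INTO cvdbStatistics VALUES
  (1, '2024-01-01 10:00:00'),
  (1, '2024-01-01 15:00:00'),
  (1, '2024-01-02 09:00:00'),
  (2, '2024-01-05 12:00:00');
""")

# The scalar subquery references m.id directly, one level down, so the
# correlation resolves; DISTINCT collapses several records on the same day.
rows = conn.execute("""
SELECT m.email,
       (SELECT COUNT(DISTINCT date(s.dateDone))
        FROM cvdbStatistics s
        WHERE s.mediatorId = m.id) AS activeDays
FROM mediators m
WHERE m.recruiterId = 409
ORDER BY m.email
""").fetchall()

print(rows)
```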
Did you also try putting the "mediators" table in the FROM clause of the deepest subquery? They are two different queries, and the tables of the outer one are not visible in the subquery. I'm not sure about this, but I think the only relation between the query and the subquery is the result returned by the subquery.