This is my code:
SELECT *
FROM Event_list
WHERE interest in
(
SELECT Interest_name
from Interest
where Interest_id in
(
SELECT Interest_id
FROM `User's Interests`
where P_id=Pid and is_canceled=0
)
)
order by count(Eid) desc
I don't use any GROUP BY clause but still get only one row. When I remove the ORDER BY clause, I get all the correct rows (but not in the right order).
I'm trying to return a view (named Event_list) sorted by most common Eid (Event id), but I want to see every row without any grouping.
COUNT() is a group function, so using it will automatically result in grouping of rows. This is why you get only one row in your result when you use it in your ORDER BY clause.
Unfortunately, it's not clear what you're trying to do, so I can't tell you how to rewrite your query to get your desired results.
I suspect the query you want is more like this:
SELECT el.*,
(select count(*)
from interest i join
UserInterests ui
on ui.is_canceled = 0 and ui.p_id = i.id
where el.interest = i.interest_name
) as cnt
FROM Event_list el
ORDER BY cnt desc;
It is a bit hard to tell without sample data and a better formed query. Some notes:
Don't use special characters in table and column names. Having to escape the names merely leads to queries that are harder to read, write, and understand.
Qualify column names, so you know what tables columns come from.
Use table aliases -- so queries are easier to write and to read.
The WHERE clause only does filtering. Your description of the problem doesn't seem to involve filtering, only ordering.
Any time you use an aggregation function, the query automatically becomes an aggregation query. Without a group by, exactly one row is returned.
Give foreign keys the same names as primary keys, where possible.
You may try:
SELECT L.* , C.Cnt
FROM Event_list L
LEFT JOIN (
SELECT E.EID, COUNT(*) AS Cnt
FROM Event_List E
JOIN Interest I
ON E.Interest = I.Interest_name
JOIN `User's Interests` U
ON U.Interest_id = I.Interest_id
Where U.P_id=Pid and U.is_canceled=0
GROUP BY E.EID
) C
ON L.Eid = C.Eid
Order By Cnt DESC
I don't have the tables to test against, so you may need to correct column names and other conditions. This is just to give you the idea.
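The idea in the answers above (count per Eid in a derived table, then join the counts back so no rows collapse) can be tried out with a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL, and the row_id column and sample data are hypothetical, not taken from the question.

```python
import sqlite3

# SQLite in-memory stand-in for MySQL; row_id and the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Event_list (row_id INTEGER PRIMARY KEY, Eid INTEGER, interest TEXT);
    INSERT INTO Event_list (Eid, interest) VALUES
        (1, 'music'), (2, 'sport'), (2, 'art'), (3, 'music'), (2, 'music'), (3, 'art');
""")

# Count each Eid once in a derived table, then join the counts back to every row.
# The aggregate lives only inside the derived table, so the outer query keeps all rows.
rows = conn.execute("""
    SELECT l.Eid, l.interest, c.cnt
    FROM Event_list AS l
    JOIN (SELECT Eid, COUNT(*) AS cnt FROM Event_list GROUP BY Eid) AS c
      ON c.Eid = l.Eid
    ORDER BY c.cnt DESC, l.Eid, l.row_id
""").fetchall()

for row in rows:
    print(row)
```

All six rows come back, with the three rows for Eid 2 first because that Eid occurs most often.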
Related
I run this complicated query on Spring JPA Repository.
My goal is to get all info from the site table, ordering it by events severity on each site.
This is my query:
SELECT alls.* FROM sites AS alls JOIN
(
SELECT distinct ets.id FROM
(
SELECT s.id, et.`type`, et.severity_level, COUNT(et.`type`) FROM sites AS s
JOIN users_sites AS us ON (s.id=us.site_id)
JOIN users AS u ON (us.user_id=u.user_id)
JOIN areas AS a ON (s.id=a.site_id)
JOIN panels AS p ON (a.id=p.area_id)
JOIN events AS e ON (p.id=e.panel_id)
JOIN event_types AS et ON (e.event_type_id=et.id)
WHERE u.user_id="98765432-123a-1a23-123b-11a1111b2cd3"
GROUP BY s.id , et.`type`, et.severity_level
ORDER BY et.severity_level, COUNT(et.`type`) DESC
) AS ets
) as etsd ON alls.id = etsd.id
The second select (the one with "distinct") returns site_ids ordered correctly by severity.
Note that there are different event_types + severity in each site, and I use pagination on the answer, so I need the distinct.
The problem is - the main select doesn't keep this order.
Is there any way to keep the order in one complicated query?
Another related question - one of my ideas was making two queries:
The "select distinct" query that will return me the order --> saved in a list "order list"
The main "sites" query (which becomes very simple) with "where id in {order list}".
Order the second query in code by "order list".
I run the query every 10 seconds, so it is very performance-sensitive.
What seems to be faster in this case - original complicated query or those 2?
Any insight will be appreciated.
Thanks a lot.
A quirk of SQL's declarative set-oriented syntax for us procedural programmers: ORDER BY clauses in subqueries are not carried through to the outer query, except sometimes by accident. If you want ordering at any query level, you must specify it at that level or you will get unpredictable results. The query optimizers are usually smart enough to avoid wasting sort operations.
Your requirement: give at most one sites row for each sites.id value, ordered by the worst event. Worst: lowest event severity, and if there is more than one event with the lowest severity, the largest count.
Use this sort of thing to get the "worst" for each id, in place of DISTINCT.
SELECT id, MIN(severity_level) severity_level, MAX(num) num
FROM (
/* your inner query */
) ets
GROUP BY id
This gives at most one row per sites.id value. Then your outer query is
SELECT alls.*
FROM sites alls
JOIN (
SELECT id, MIN(severity_level) severity_level, MAX(num) num
FROM (
/* your inner query */
) ets
GROUP BY id
) worstevents ON alls.id = worstevents.id
ORDER BY worstevents.severity_level, worstevents.num DESC, alls.id
Putting it all together:
SELECT alls.*
FROM sites alls
JOIN (
SELECT id, MIN(severity_level) severity_level, MAX(num) num
FROM (
SELECT s.id, et.severity_level, COUNT(et.`type`) num
FROM sites AS s
JOIN users_sites AS us ON (s.id=us.site_id)
JOIN users AS u ON (us.user_id=u.user_id)
JOIN areas AS a ON (s.id=a.site_id)
JOIN panels AS p ON (a.id=p.area_id)
JOIN events AS e ON (p.id=e.panel_id)
JOIN event_types AS et ON (e.event_type_id=et.id)
WHERE u.user_id="98765432-123a-1a23-123b-11a1111b2cd3"
GROUP BY s.id , et.`type`, et.severity_level
) ets
GROUP BY id
) worstevents ON alls.id = worstevents.id
ORDER BY worstevents.severity_level, worstevents.num DESC, alls.id
An index on users.user_id will help performance for these single-user queries.
If you still have performance trouble, please read this and ask another question.
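To see the shape of the result, here is a minimal sketch of the "worst event per site" query on toy data. It assumes an in-memory SQLite database instead of MySQL, and a hypothetical ets table that stands in for the grouped inner query; the names and values are made up for illustration.

```python
import sqlite3

# SQLite in-memory stand-in for MySQL; all data here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO sites VALUES (1, 'north'), (2, 'south'), (3, 'east');
    -- Stand-in for the grouped inner query: one row per site and event type.
    CREATE TABLE ets (id INTEGER, severity_level INTEGER, num INTEGER);
    INSERT INTO ets VALUES (1, 2, 5), (1, 3, 9), (2, 1, 1), (3, 1, 4), (3, 2, 2);
""")

# Collapse to one row per site, then order at the outermost level,
# because ordering inside a derived table is not guaranteed to survive.
rows = conn.execute("""
    SELECT alls.id, alls.name, worst.severity_level, worst.num
    FROM sites AS alls
    JOIN (
        SELECT id, MIN(severity_level) AS severity_level, MAX(num) AS num
        FROM ets
        GROUP BY id
    ) AS worst ON worst.id = alls.id
    ORDER BY worst.severity_level, worst.num DESC, alls.id
""").fetchall()

for row in rows:
    print(row)
```

One thing the toy data makes visible: MAX(num) is taken over all of a site's rows, so site 1 reports 9 even though that count belongs to severity 3 rather than to its lowest severity. If the count should come only from the lowest-severity rows, the inner aggregation would need to be restricted accordingly.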
I have this query I need to optimize further since it requires too much cpu time and I can't seem to find any other way to write it more efficiently. Is there another way to write this without altering the tables?
SELECT category, b.fruit_name, u.name
, r.count_vote, r.text_c
FROM Fruits b, Customers u
, Categories c
, (SELECT * FROM
(SELECT *
FROM Reviews
ORDER BY fruit_id, count_vote DESC, r_id
) a
GROUP BY fruit_id
) r
WHERE b.fruit_id = r.fruit_id
AND u.customer_id = r.customer_id
AND category = "Fruits";
This is your query re-written with explicit joins:
SELECT
category, b.fruit_name, u.name, r.count_vote, r.text_c
FROM Fruits b
JOIN
(
SELECT * FROM
(
SELECT *
FROM Reviews
ORDER BY fruit_id, count_vote DESC, r_id
) a
GROUP BY fruit_id
) r on r.fruit_id = b.fruit_id
JOIN Customers u ON u.customer_id = r.customer_id
CROSS JOIN Categories c
WHERE c.category = 'Fruits';
(I am guessing here that the category column belongs to the categories table.)
There are some parts that look suspicious:
Why do you cross join the Categories table, when you don't even display a column of the table?
What is ORDER BY fruit_id, count_vote DESC, r_id supposed to do? Sub query results are considered unordered sets, so an ORDER BY is superfluous and can be ignored by the DBMS. What do you want to achieve here?
SELECT * FROM [ reviews ] GROUP BY fruit_id is invalid. If you group by fruit_id, what count_vote and what r.text_c do you expect to get for the ID? You don't tell the DBMS (which would be something like MAX(count_vote) and MIN(r.text_c), for instance). MySQL should throw an error, but silently replaces count_vote, r.text_c by ANY_VALUE(count_vote), ANY_VALUE(r.text_c) instead. This means you get arbitrarily picked values for a fruit.
The answer hence to your question is: Don't try to speed it up, but fix it instead. (Maybe you want to place a new request showing the query and explaining what it is supposed to do, so people can help you with that.)
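If the intent behind the ORDER BY plus GROUP BY was "the top-voted review per fruit" (an assumption, since the question does not say so), one deterministic way to express that is a NOT EXISTS filter. The sketch below uses an in-memory SQLite database as a stand-in for MySQL, with hypothetical sample rows; ties on count_vote are broken by the lowest r_id.

```python
import sqlite3

# SQLite in-memory stand-in for MySQL; the sample reviews are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Reviews (
        r_id INTEGER PRIMARY KEY, fruit_id INTEGER, customer_id INTEGER,
        count_vote INTEGER, text_c TEXT
    );
    INSERT INTO Reviews VALUES
        (1, 10, 100, 3, 'ok'),
        (2, 10, 101, 7, 'great'),
        (3, 10, 102, 7, 'tasty'),
        (4, 20, 100, 1, 'meh');
""")

# Keep a review only if no other review of the same fruit beats it:
# more votes wins, and on equal votes the lower r_id wins.
rows = conn.execute("""
    SELECT r.fruit_id, r.r_id, r.count_vote, r.text_c
    FROM Reviews AS r
    WHERE NOT EXISTS (
        SELECT 1
        FROM Reviews AS better
        WHERE better.fruit_id = r.fruit_id
          AND (better.count_vote > r.count_vote
               OR (better.count_vote = r.count_vote AND better.r_id < r.r_id))
    )
    ORDER BY r.fruit_id
""").fetchall()

for row in rows:
    print(row)
```

Unlike GROUP BY without aggregates, every returned column here comes from one well-defined row per fruit.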
Your Categories table does not seem to be joined/related to the others; this produces a Cartesian product between all the rows.
If you want distinct results, don't use GROUP BY but DISTINCT, so you can avoid an unnecessary subquery,
and you don't need an ORDER BY in a subquery.
SELECT category
, b.fruit_name
, u.name
, r.count_vote
, r.text_c
FROM Fruits b
INNER JOIN (
SELECT distinct fruit_id, count_vote, text_c, customer_id
FROM Reviews
) r ON b.fruit_id = r.fruit_id
INNER JOIN Customers u ON u.customer_id = r.customer_id
INNER JOIN Categories c ON ?????? /* Your Categories table seems not joined/related to the others */
WHERE category = "Fruits";
For better readability, you should use explicit join syntax and avoid the old join syntax based on comma-separated table names and WHERE conditions.
The next time you want help optimizing a query, please include the table/index structure, an indication of the cardinality of the indexes and the EXPLAIN plan for the query.
There appears to be absolutely no reason for a single sub-query here, let alone 2. Using sub-queries mostly prevents the DBMS optimizer from doing its job. So your biggest win will come from eliminating these sub-queries.
The CROSS JOIN creates a deliberate cartesian join - it's also unclear if any attributes from this table are actually required for the result, if it is there to produce multiples of the same row in the output, or just an error.
The attribute category in the last line of your query is not attributed to any of the tables (but I suspect it comes from the categories table).
Further, your code uses a GROUP BY clause with no aggregation function. This will produce non-deterministic results and is a bug. Assuming that you are not exploiting a side-effect of that, the query can be re-written as:
SELECT
category, b.fruit_name, u.name, r.count_vote, r.text_c
FROM Fruits b
JOIN Reviews r
ON r.fruit_id = b.fruit_id
JOIN Customers u ON u.customer_id = r.customer_id
ORDER BY r.fruit_id, count_vote DESC, r_id;
Since there are no predicates other than joins in your query, there is no scope for further optimization beyond ensuring there are indexes on the join predicates.
As all too frequently, the biggest benefit may come from simply asking the question of why you need to retrieve every single row in the tables in a single query.
I'm running two queries.
The first one gets unique IDs. This executes in ~350ms.
select parent_id
from duns_match_sealed_air_072815
group by duns_number
Then I paste those IDs into this second query. With >10k ids pasted in, it also executes in about ~350ms.
select term, count(*) as count
from companies, business_types, business_types_to_companies
where
business_types.id = business_types_to_companies.term_id
and companies.id = business_types_to_companies.company_id
and raw_score > 25
and diversity = 1
and company_id in (paste,ten,thousand,ids,here)
group by term
order by count desc;
When I combine these queries into one it takes a long time to execute. I don't know how long because I stopped it after minutes.
select term, count(*) as count
from companies, business_types, business_types_to_companies
where
business_types.id = business_types_to_companies.term_id
and companies.id = business_types_to_companies.company_id
and raw_score > 25
and diversity = 1
and company_id in (
select parent_id
from duns_match_sealed_air_072815
group by duns_number
)
group by term
order by count desc;
What is going on?
It's down to the way it processes the query - I believe it has to run your embedded query once for each row, whereas using two queries allows you to store the result.
Hope this helps!
The query has been rewritten using JOIN, and in particular I've used EXISTS instead of IN. This is a shot in the dark. It is possible that the sub-query generates many values, causing the outer query to struggle while it goes through matching each item returned from the sub-query.
select term, count(*) as count
from companies c
inner join business_types_to_companies bc on bc.company_id = c.id
inner join business_types b on b.id = bc.term_id
where
raw_score > 25
and diversity = 1
and exists (
select 1
from duns_match_sealed_air_072815
where parent_id = c.id
)
group by term
order by count desc;
First, with respect, your subquery doesn't use GROUP BY in a sensible way.
select parent_id /* wrong GROUP BY */
from duns_match_sealed_air_072815
group by duns_number
In fact, it misuses the pernicious MySQL extension to GROUP BY. Read this. http://dev.mysql.com/doc/refman/5.6/en/group-by-handling.html . I can't tell what your application logic intends from this query, but I can tell you that it actually returns an unpredictably selected parent_id value associated with each distinct duns_number value.
Do you want
select MIN(parent_id) parent_id
from duns_match_sealed_air_072815
group by duns_number
or something like that? That one selects the lowest parent ID associated with each given number.
Sometimes MySQL has a hard time optimizing the WHERE .... IN () query pattern. Try a join instead. Like this:
select term, count(*) as count
from companies
join (
select MIN(parent_id) parent_id
from duns_match_sealed_air_072815
group by duns_number
) idlist ON companies.id = idlist.parent_id
join business_types_to_companies ON companies.id = business_types_to_companies.company_id
join business_types ON business_types.id = business_types_to_companies.term_id
where raw_score > 25
and diversity = 1
group by term
order by count desc
To optimize this further we'll need to see the table definitions and the output from EXPLAIN.
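As a sanity check on the rewrites above, the following minimal sketch compares IN, EXISTS, and a join on toy data. It assumes an in-memory SQLite database rather than MySQL, and the simplified company_terms and duns_match tables are hypothetical. It says nothing about speed; it only shows that the three forms agree when the joined id list is duplicate-free, and that a plain join on a list with repeated ids inflates the counts.

```python
import sqlite3

# SQLite in-memory stand-in for MySQL; the simplified tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company_terms (company_id INTEGER, term TEXT);
    INSERT INTO company_terms VALUES (1, 'retail'), (1, 'food'), (2, 'retail'), (3, 'food');
    CREATE TABLE duns_match (parent_id INTEGER, duns_number TEXT);
    INSERT INTO duns_match VALUES (1, 'A'), (1, 'B'), (2, 'C');
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

with_in = run("""
    SELECT term, COUNT(*) AS n FROM company_terms
    WHERE company_id IN (SELECT parent_id FROM duns_match)
    GROUP BY term ORDER BY n DESC, term
""")
with_exists = run("""
    SELECT ct.term, COUNT(*) AS n FROM company_terms AS ct
    WHERE EXISTS (SELECT 1 FROM duns_match AS m WHERE m.parent_id = ct.company_id)
    GROUP BY ct.term ORDER BY n DESC, ct.term
""")
# Joining against a de-duplicated id list matches the IN / EXISTS semantics.
with_join = run("""
    SELECT ct.term, COUNT(*) AS n FROM company_terms AS ct
    JOIN (SELECT DISTINCT parent_id FROM duns_match) AS ids
      ON ids.parent_id = ct.company_id
    GROUP BY ct.term ORDER BY n DESC, ct.term
""")
# Joining the raw table counts company 1 twice, because it appears twice in duns_match.
naive_join = run("""
    SELECT ct.term, COUNT(*) AS n FROM company_terms AS ct
    JOIN duns_match AS m ON m.parent_id = ct.company_id
    GROUP BY ct.term ORDER BY n DESC, ct.term
""")

print(with_in, with_exists, with_join, naive_join)
```

So when replacing IN with a join, it is worth checking that the derived id list cannot contain the same id twice.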
I have to get the names of the Departments and the number of Employees in it. Test is my schema.
So I come up with two queries that give me the same result -
First
SELECT Department.Departmentname,
(
SELECT COUNT(*)
FROM test.Employee
WHERE Employee.Departmentid = Department.idDepartment
) AS NumberOfEmployees
FROM test.Department;
Second
SELECT Department.Departmentname AS NAme,COUNT(Employee.idEmployee) AS Employee_COUNT
FROM test.Department
LEFT JOIN test.Employee
ON Employee.Departmentid = Department.idDepartment
GROUP BY Employee.Departmentid ;
Which of the two is the better and more efficient way to get the required result? Any other solution is welcome.
Please explain why a particular solution is better.
My preference for expressing the logic is the second query, which I would write as:
SELECT d.Departmentname AS Name, COUNT(e.idEmployee) AS Employee_COUNT
FROM test.Department d LEFT JOIN
test.Employee e
ON e.Departmentid = d.idDepartment
GROUP BY d.Departmentname;
Note the use of table aliases and the fact that the GROUP BY uses the same columns as the SELECT. However, in MySQL, this query will not use an index on DepartmentName for the group by. That means that the GROUP BY is doing a file sort, a relatively expensive operation.
When you write the query like this:
SELECT d.Departmentname,
(SELECT COUNT(*)
FROM test.Employee e
WHERE e.Departmentid = d.idDepartment
) AS NumberOfEmployees
FROM test.Department d;
No explicit group by is needed. With an index on Employee(DepartmentId) this will use the index for the count(*), so this version would normally perform better in MySQL.
The difference in performance is probably negligible until you start having thousands or tens of thousands of rows.
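Both forms can be checked against each other on a small data set. The sketch below assumes an in-memory SQLite database as a stand-in for MySQL and hypothetical sample rows, including a department with no employees; it compares results only, not performance.

```python
import sqlite3

# SQLite in-memory stand-in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (idDepartment INTEGER PRIMARY KEY, Departmentname TEXT);
    CREATE TABLE Employee (idEmployee INTEGER PRIMARY KEY, Departmentid INTEGER);
    INSERT INTO Department VALUES (1, 'Sales'), (2, 'Legal'), (3, 'Ops');
    INSERT INTO Employee VALUES (1, 1), (2, 1), (3, 3);
    -- Index that the correlated COUNT(*) can use.
    CREATE INDEX idx_employee_department ON Employee (Departmentid);
""")

# Correlated subquery: one count per department row, no GROUP BY needed.
by_subquery = conn.execute("""
    SELECT d.Departmentname,
           (SELECT COUNT(*) FROM Employee AS e
            WHERE e.Departmentid = d.idDepartment) AS NumberOfEmployees
    FROM Department AS d
    ORDER BY d.Departmentname
""").fetchall()

# LEFT JOIN plus GROUP BY on the department side; COUNT(e.idEmployee)
# skips the NULLs produced for departments without employees.
by_join = conn.execute("""
    SELECT d.Departmentname, COUNT(e.idEmployee) AS Employee_COUNT
    FROM Department AS d
    LEFT JOIN Employee AS e ON e.Departmentid = d.idDepartment
    GROUP BY d.idDepartment, d.Departmentname
    ORDER BY d.Departmentname
""").fetchall()

print(by_subquery)
print(by_join)
```

Grouping on the department side matters here: grouping by Employee.Departmentid, as in the question's second query, would put every department without employees into the same NULL group.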
I have two tables, one for downloads and one for uploads. They are almost identical, apart from a few columns that differ. I want to generate a list of stats for each date for each item in the table.
I use these two queries but have to merge the data in PHP after running them. I would like to instead run them in a single query, where it would return the columns from both queries in each row grouped by the date. Sometimes there isn't any download data, only upload data, and in all my previous tries it skipped the row if it couldn't find log data in both tables.
How do I merge these two queries into one, where it would display data even if it's just available in one of the tables?
SELECT DATE(upload_date_added) as upload_date, SUM(upload_size) as upload_traffic, SUM(upload_files) as upload_files
FROM packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY upload_date
ORDER BY upload_date DESC
SELECT DATE(download_date_added) as download_date, SUM(download_size) as download_traffic, SUM(download_files) as download_files
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY download_date
ORDER BY download_date DESC
I want to get result rows like this:
date, upload_traffic, upload_files, download_traffic, download_files
All help appreciated!
Your two queries can be executed and then combined with the UNION clause along with an extra field to identify Uploads and Downloads on separate lines:
SELECT
'Uploads' AS TransmissionType,
DATE(upload_date_added) AS TransmissionDate,
SUM(upload_size) AS TransmissionTraffic,
SUM(upload_files) AS TransmittedFileCount
FROM
packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY DATE(upload_date_added)
UNION
SELECT
'Downloads',
DATE(download_date_added),
SUM(download_size),
SUM(download_files)
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY DATE(download_date_added)
-- A single ORDER BY after the last SELECT sorts the combined result.
ORDER BY TransmissionDate DESC;
Give it a Try !!!
What you're asking can only work for rows that have the same add date for upload and download. In this case I think this SQL should work:
SELECT
DATE(u.upload_date_added) as date,
SUM(u.upload_size) as upload_traffic,
SUM(u.upload_files) as upload_files,
SUM(d.download_size) as download_traffic,
SUM(d.download_files) as download_files
FROM
packages_uploads u, packages_downloads d
WHERE u.upload_date_added = d.download_date_added
AND u.upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY date
ORDER BY date DESC
Without knowing the schema it is hard to give an exact answer, so please treat the following as a concept, not a direct answer.
You could try a left join. I'm not sure whether the table package exists, but the following may be food for thought:
SELECT
p.id,
up.date as upload_date,
dwn.date as download_date
FROM
package p
LEFT JOIN package_uploads up ON
( up.package_id = p.id AND up.upload_date = 'etc' )
LEFT JOIN package_downloads dwn ON
( dwn.package_id = p.id AND dwn.download_date = 'etc' )
The above will select all the packages and attempt to join and where the value does not join it will return null.
There are a number of ways to do this. You can join using a primary key and foreign key. If you do not have a relationship between the tables,
you can use:
LEFT JOIN / LEFT OUTER JOIN
Returns all records from the left table and the matched
records from the right table. The result is NULL from the
right side when there is no match.
RIGHT JOIN / RIGHT OUTER JOIN
Returns all records from the right table and the matched
records from the left table. The result is NULL from the left
side when there is no match.
FULL OUTER JOIN
Returns all records when there is a match in either the left or the right table.
UNION
Combines the result sets of two or more SELECT statements.
Each SELECT statement within a UNION must have the same number of columns.
The columns must also have similar data types, and the columns in
each SELECT statement must be in the same order.
INNER JOIN
Selects records that have matching values in both tables. This is good for your situation.
INTERSECT
Not supported by MySQL (before version 8.0.31).
NATURAL JOIN
Joins on all columns that have the same name in both tables.
Since you don't need to update these rows, you can create a view from the joined tables and then use a shorter query in your PHP. Note that such a view cannot be updated. You did not mention any relationship between the tables, so I have to go with UNION.
Like this,
CREATE VIEW checkStatus
AS
SELECT
DATE(upload_date_added) as upload_date,
SUM(upload_size) as upload_traffic,
SUM(upload_files) as upload_files
FROM packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY upload_date
UNION
SELECT
DATE(download_date_added) as download_date,
SUM(download_size) as download_traffic,
SUM(download_files) as download_files
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY download_date
ORDER BY upload_date DESC
Then anywhere you want to select you just need one line:
SELECT * FROM checkStatus
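For the row layout asked for in the question (date, upload_traffic, upload_files, download_traffic, download_files), one option is to stack both tables with UNION ALL, zero-filling the columns that belong to the other table, and then group by date. This is a minimal sketch under stated assumptions: it uses an in-memory SQLite database as a stand-in for MySQL, the sample rows are hypothetical, the date column is named day here, and the date-range filter from the question is left out for brevity.

```python
import sqlite3

# SQLite in-memory stand-in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE packages_uploads (
        upload_date_added TEXT, upload_size INTEGER, upload_files INTEGER);
    CREATE TABLE packages_downloads (
        download_date_added TEXT, download_size INTEGER, download_files INTEGER);
    INSERT INTO packages_uploads VALUES
        ('2011-11-01 09:00:00', 100, 1),
        ('2011-11-01 17:30:00', 50, 2),
        ('2011-11-02 08:00:00', 10, 1);
    INSERT INTO packages_downloads VALUES
        ('2011-11-01 10:00:00', 300, 3),
        ('2011-11-03 12:00:00', 70, 1);
""")

# Stack both tables into one shape, padding the other side's columns with 0,
# then aggregate per day. Days present in only one table still produce a row.
rows = conn.execute("""
    SELECT day,
           SUM(upload_size)    AS upload_traffic,
           SUM(upload_files)   AS upload_files,
           SUM(download_size)  AS download_traffic,
           SUM(download_files) AS download_files
    FROM (
        SELECT DATE(upload_date_added) AS day, upload_size, upload_files,
               0 AS download_size, 0 AS download_files
        FROM packages_uploads
        UNION ALL
        SELECT DATE(download_date_added), 0, 0, download_size, download_files
        FROM packages_downloads
    ) AS combined
    GROUP BY day
    ORDER BY day DESC
""").fetchall()

for row in rows:
    print(row)
```

UNION ALL is used rather than UNION so that two identical log rows are both counted instead of being merged before the sums are taken.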