I'm getting an Unknown column 'sites.id' in 'where clause' error on the following query:
SELECT id, COUNT( returning_visitors.per_ip ) as readers, AVG( returning_visitors.per_ip ) as avg_visits_pr
FROM sites
JOIN (
SELECT COUNT( * ) AS per_ip
FROM site_hits_unique
WHERE site_id = sites.id
AND date >= CURDATE( ) - INTERVAL 30 DAY
GROUP BY site_id, ip
HAVING per_ip > 1
) AS returning_visitors
WHERE id IN (162888, 42705, 11412)
I want to run the inner query for every sites.id (the example just uses a few IDs for testing purposes).
The correlated subquery is only one level deep, so I'm not quite sure why it's not getting sites.id.
Any ideas how to fix?
I found the reason why from http://dev.mysql.com/doc/refman/5.6/en/subquery-restrictions.html:
Subqueries in the FROM clause cannot be correlated subqueries. They
are materialized in whole (evaluated to produce a result set) during
query execution, so they cannot be evaluated per row of the outer
query. Before MySQL 5.6.3, materialization takes place before
evaluation of the outer query. As of 5.6.3, the optimizer delays
materialization until the result is needed, which may permit
materialization to be avoided. See Section 8.2.1.18.3, “Optimizing
Derived Tables (Subqueries) in the FROM Clause”.
I still need to figure out how to rewrite my query so that it works the way I want, though. Is a function necessary or feasible here?
You should rewrite your query along these lines, joining the derived table on site_id instead of correlating it:
SELECT id, COUNT( returning_visitors.per_ip ) as readers, AVG( returning_visitors.per_ip ) as avg_visits_pr
FROM sites
JOIN (
SELECT COUNT( * ) AS per_ip, site_id
FROM site_hits_unique
WHERE site_id IN (162888, 42705, 11412)
AND date >= CURDATE( ) - INTERVAL 30 DAY
GROUP BY site_id, ip
HAVING per_ip > 1
) AS returning_visitors
ON id = returning_visitors.site_id
WHERE id IN (162888, 42705, 11412)
-- one result row per site; without this the aggregates collapse into a single row
GROUP BY id
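The join-on-site_id idea can be tried on a tiny data set. This is only a sketch under loud assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are invented, and the 30-day date filter is left out because SQLite's date functions differ from MySQL's.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sites (id INTEGER PRIMARY KEY);
CREATE TABLE site_hits_unique (site_id INTEGER, ip TEXT, date TEXT);
INSERT INTO sites VALUES (1), (2);
-- invented sample hits: site 1 has ips a (3 hits), b (2 hits), c (1 hit)
INSERT INTO site_hits_unique VALUES
  (1, 'a', '2024-01-01'), (1, 'a', '2024-01-02'), (1, 'a', '2024-01-03'),
  (1, 'b', '2024-01-01'), (1, 'b', '2024-01-02'),
  (1, 'c', '2024-01-01'),
  (2, 'a', '2024-01-01'), (2, 'a', '2024-01-02');
""")

# The derived table is not correlated; it exposes site_id so the
# outer query can join on it and group per site.
rows = conn.execute("""
SELECT s.id, COUNT(rv.per_ip) AS readers, AVG(rv.per_ip) AS avg_visits_pr
FROM sites s
JOIN (
  SELECT site_id, COUNT(*) AS per_ip
  FROM site_hits_unique
  GROUP BY site_id, ip
  HAVING COUNT(*) > 1
) AS rv ON rv.site_id = s.id
GROUP BY s.id
ORDER BY s.id
""").fetchall()

print(rows)
```

With these rows, site 1 has two returning visitors (3 and 2 hits) and site 2 has one, so each site gets its own reader count and average.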
Help needed. Could someone help me write a query that returns only the second IncurredAmount value (the one after the first) for each policyid?
SELECT claims.claimid, claims.policyid, claims.IncurredAmount
FROM claims
GROUP BY claims.claimid, claims.policyid, claims.IncurredAmount
HAVING (((claims.policyid)=62));
That's what I have so far. I restricted it to one policyid (62) to get fewer rows, but that's where I'm stuck. I have no clue what clause can be used to take only the second entry for every policyid.
Try this, though whether it will work depends on the version of your database:
SELECT claimid, policyid, IncurredAmount
FROM (
SELECT *,
row_number() over (partition by policyid order by claimid) rn
FROM claims
) t
WHERE t.rn = 2
A solution also exists for older MySQL versions (pre-8.0), which have no window functions:
select *
from claims t
where exists (
select 1
from claims t2
where t2.policyid = t.policyid
and t2.claimid <= t.claimid
having count(distinct t2.claimid) = 2
)
order by policyid, claimid
Note that this behaves more like DENSE_RANK than ROW_NUMBER: if several rows share the second-lowest claimid, it returns more than one row.
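The counting idea behind the pre-8.0 approach can be checked on a small example. The sketch below is a variation, not the query above: it compares a correlated COUNT directly to 2 instead of using EXISTS with HAVING, it runs on SQLite through Python's sqlite3 module rather than MySQL, and the claims rows are invented.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (claimid INTEGER, policyid INTEGER, IncurredAmount INTEGER);
-- invented sample data: policy 62 has three claims, 70 has two, 80 has one
INSERT INTO claims VALUES
  (1, 62, 100), (2, 62, 200), (3, 62, 300),
  (4, 70, 50),  (5, 70, 75),
  (6, 80, 10);
""")

# A row is the "second" claim of its policy when exactly two distinct
# claimids of that policy are less than or equal to its own claimid.
second_claims = conn.execute("""
SELECT t.claimid, t.policyid, t.IncurredAmount
FROM claims t
WHERE (
  SELECT COUNT(DISTINCT t2.claimid)
  FROM claims t2
  WHERE t2.policyid = t.policyid
    AND t2.claimid <= t.claimid
) = 2
ORDER BY t.policyid, t.claimid
""").fetchall()

print(second_claims)
```

Policy 80 has only one claim, so it contributes no row.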
As the title indicates, I am trying to find the maximum summed value in column C for an object in column A based on a subset of column B over a period of time (let's say column D). My current query looks something like this in which I return the summed values greater than 10,000.
select id_a, id_b, sum(column_c) from master_table where id_b in (1,2,3,4,5)
and ymdh >= '2017-11-01' group by 1,2 having sum(column_c) > 10000 order by 2,3
desc;
What I'm trying to get returned is the greatest value of sum(column_c). I tried using both the max() and distinct() functions, specifically max(sum(imps)), but aggregate function calls may not be nested. Would anyone be able to provide guidance here?
You can wrap the query in a derived table, FROM ( select ... ) T:
select max(my_sum)
from (
select id_a
, id_b
, sum(column_c) my_sum
from master_table
where id_b in (1,2,3,4,5)
and ymdh >= '2017-11-01'
group by 1,2 having my_sum > 10000
) T
Does this do what you want?
select id_a, id_b, sum(column_c)
from master_table
where id_b in (1,2,3,4,5) and
ymdh >= '2017-11-01'
group by id_a, id_b
having sum(column_c) > 10000
order by sum(column_c) desc
limit 1;
That is, use order by and limit to get the value you want. (This query includes the group by keys as well, but that is not necessary.)
scaisEdge has the answer (and my +1) - but I just wanted to add a bit about the thought process when designing an SQL statement like you're working on.
Don't feel you need to compose the whole thing at once, as one big statement or one single query.
Instead, you'll often need to break up the problem into steps, solve the individual steps, and then use those steps as sources for a query - because you don't have to use tables in the FROM clause; you can use your own subqueries instead.
So for this problem? You've got the first step done - you figured out how to write the query that gets the Sum over a particular grouping:
select someCol, sum(otherCol) as groupSum from myTable
group by someCol
Great! Now, you can effectively use this like it's a table:
select someCol, groupSum
from
(
select someCol, sum(otherCol) as groupSum from myTable
group by someCol
) mySubquery
And in your case, you want to get the maximum sum?
select max(groupSum)
from
(
select someCol, sum(otherCol) as groupSum from myTable
group by someCol
) mySubquery
Not only will this help while composing the full SQL statement, it'll actually help the person trying to read/debug it down the line, especially if you name your subqueries/columns well:
select max(totalHitsForWeek) as maxWeeklyUsage
from
(
select week, sum(hits) as totalHitsForWeek
from requestsTable
group by week
) hitsPerWeekSubquery
Hope that helps add to scaisEdge's answer! :-)
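Both approaches from this thread (MAX over a derived table, and ORDER BY with LIMIT 1) can be compared side by side. This is a minimal sketch that assumes SQLite via Python's sqlite3 module in place of MySQL; the rows are invented, and the id_b, ymdh and HAVING filters from the question are omitted to keep it short.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_table (id_a INTEGER, id_b INTEGER, column_c INTEGER);
-- invented sample data
INSERT INTO master_table VALUES
  (1, 1, 6000), (1, 1, 7000),
  (2, 1, 20000),
  (3, 2, 4000);
""")

# Approach 1: aggregate in a derived table, then take MAX of the sums.
max_via_subquery = conn.execute("""
SELECT MAX(my_sum)
FROM (
  SELECT id_a, id_b, SUM(column_c) AS my_sum
  FROM master_table
  GROUP BY id_a, id_b
) AS t
""").fetchone()[0]

# Approach 2: sort the grouped sums and keep only the top row,
# which also tells you which group produced the maximum.
top_row = conn.execute("""
SELECT id_a, id_b, SUM(column_c) AS my_sum
FROM master_table
GROUP BY id_a, id_b
ORDER BY my_sum DESC
LIMIT 1
""").fetchone()

print(max_via_subquery, top_row)
```

The derived-table form returns only the value, while the ORDER BY form also carries the grouping keys.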
I am trying to make a reporting system where I need to display a report
for each date.
This is my table schema for selected_items
This is stock_list
I am using PHP in the back end and Java in the front end to display
the data. I tried a couple of queries to get the desired output, but so
far I have not been able to get it. These are some of the queries I used.
SELECT
COALESCE(stock_list.date, selected_items.date) AS date,
SUM( stock_list.qty ) AS StockSum,
SUM( stock_list.weight ) AS Stockweight,
COUNT( selected_items.barcode ) AS BilledItems,
SUM( selected_items.weight ) AS Billedweight
FROM stock_list join selected_items
ON stock_list.date = selected_items.date
GROUP BY COALESCE(stock_list.date, selected_items.date)
ORDER BY COALESCE(stock_list.date, selected_items.date);
This gives me the first five columns but the output gives me wrong values.
Then I also tried Union.
SELECT SUM( qty ) AS StockSum, SUM( weight ) AS Stockweight
FROM `stock_list`
WHERE DATE LIKE '08-Jan-2016'
UNION SELECT COUNT( barcode ) AS BilledItems, SUM( weight ) AS Billedweight
FROM `selected_items`
WHERE DATE LIKE '08-Jan-2016'
UNION SELECT SUM( qty ) AS TotalStock, SUM( weight ) AS TotalWeight
FROM `stock_list`;
Here I get the correct values for four columns, but the problem is that the result is displayed in two columns when I would like it to be in four.
Can anyone guide me, please? I have figured out the Java part of it, but I am not good at PHP and MySQL.
Thank you
Unfortunately, SQL Fiddle crashed while I was trying to execute this query:
-- DISTINCT: stock_list has several rows per date, but each date should appear once
SELECT DISTINCT sl.date AS date, B.qtySum AS StockSum, B.weightSum AS Stockweight,
C.barcodeCount AS BilledItems, C.weightSum AS Billedweight
FROM stock_list sl
-- each subquery must select date so the join condition can reference it
JOIN (SELECT date, SUM(qty) AS qtySum, SUM(weight) AS weightSum
FROM stock_list GROUP BY date) AS B
ON B.date = sl.date
JOIN (SELECT date, SUM(weight) AS weightSum, COUNT(barcode) AS barcodeCount
FROM selected_items GROUP BY date) AS C
ON C.date = sl.date;
The problem with plain joins is that rows are matched multiple times, so the sums go awry. For example, if four rows are joined from the second table, the sum is four times higher than it should be. With subqueries you avoid this problem because you count and sum before joining, so the numbers should fit. Alas, I couldn't run the query, so I'm not 100% sure it works, but it should be the right approach.
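The inflation caused by joining before aggregating is easy to reproduce. The following sketch assumes SQLite via Python's sqlite3 module as a stand-in for MySQL, with invented rows and a simplified two-table schema (only the columns used in the queries above).

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock_list (date TEXT, qty INTEGER, weight INTEGER);
CREATE TABLE selected_items (date TEXT, barcode TEXT, weight INTEGER);
-- invented sample data: two stock rows and two billed items on the same date
INSERT INTO stock_list VALUES ('08-Jan-2016', 1, 10), ('08-Jan-2016', 2, 20);
INSERT INTO selected_items VALUES ('08-Jan-2016', 'x', 5), ('08-Jan-2016', 'y', 7);
""")

# Naive: join first, aggregate afterwards. Each stock row is paired with
# each billed item (2 x 2 = 4 rows), so every sum and count is inflated.
naive = conn.execute("""
SELECT sl.date, SUM(sl.qty), COUNT(si.barcode)
FROM stock_list sl
JOIN selected_items si ON si.date = sl.date
GROUP BY sl.date
""").fetchall()

# Pre-aggregated: reduce each table to one row per date, then join.
pre_aggregated = conn.execute("""
SELECT b.date, b.qtySum, b.weightSum, c.barcodeCount, c.weightSum
FROM (SELECT date, SUM(qty) AS qtySum, SUM(weight) AS weightSum
      FROM stock_list GROUP BY date) AS b
JOIN (SELECT date, COUNT(barcode) AS barcodeCount, SUM(weight) AS weightSum
      FROM selected_items GROUP BY date) AS c
  ON c.date = b.date
""").fetchall()

print(naive)
print(pre_aggregated)
```

The naive query reports a quantity of 6 and 4 billed items, while the true values are 3 and 2.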
I have four queries that run on one web page. I use them for statistics and they are taking too long to load.
Here are my current configurations
Use the text wrapping button on pastebin to make it easier to read.
I have a lot of RAM dedicated to MySQL but it still takes a long time. I have also indexed most of the columns.
I'm just trying to see what other options I have.
I put "show create table" and total count(*) in here. I'm going to rename everything and paste in SO. I agree that someone in the future may use it.
QUERY ONE
SELECT SQL_NO_CACHE
DATE_FORMAT(DateActioned,'%M-%Y') as val1,
COUNT(*) AS total_count
FROM
db.statisticsresults
WHERE
DID = 28
AND ActionTypeID = 1
AND DateActioned IS NOT NULL
GROUP BY
DATE_FORMAT(DateActioned, '%m-%y')
ORDER BY
YEAR( DateActioned ) DESC,
MONTH( DateActioned ) DESC
For this one, I would have a covering index based on your key elements so the engine does not have to go back to the raw data. Based on this and your following queries, I would put the column they all filter on (DID) in the first index position, such as:
StatisticsResults -- index ( DID, ActionTypeID, DateActioned )
Ordering by year() descending and month() descending does the same thing as your hard-coded references to find the field in the list.
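One way to see whether a composite index like this gets picked up is to compare the query plan before and after creating it. The sketch below is an illustration only: it uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, whose output format differs from MySQL's EXPLAIN; the index name idx_did_action_date and the extra column Other are hypothetical; and the query is simplified to group on the raw DateActioned value.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statisticsresults "
    "(DID INTEGER, ActionTypeID INTEGER, DateActioned TEXT, Other TEXT)"
)

query = """
SELECT DateActioned, COUNT(*)
FROM statisticsresults
WHERE DID = 28 AND ActionTypeID = 1 AND DateActioned IS NOT NULL
GROUP BY DateActioned
"""

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_before = plan(query)

# Hypothetical index name; the column order follows the suggestion above
conn.execute(
    "CREATE INDEX idx_did_action_date "
    "ON statisticsresults (DID, ActionTypeID, DateActioned)"
)
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

Because every column the query touches is part of the index, the engine can typically answer it from the index alone. On MySQL the equivalent check would be running EXPLAIN on the real query and looking at the key and Extra columns.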
QUERY TWO
-- 381.812
SELECT SQL_NO_CACHE
DATE_FORMAT(DateActioned,'%M-%Y') as val1,
COUNT(*) AS total_count
FROM
db.statisticsdivision
WHERE
DID = 28
AND ActionTypeID = 9
AND DateActioned IS NOT NULL
GROUP BY
DATE_FORMAT(DateActioned, '%m-%y')
ORDER BY
YEAR( DateActioned ) DESC,
MONTH( DateActioned ) DESC
On this one, I changed DID = '28' to DID = 28. If the column is numeric, don't confuse the engine by making it convert one type to the other. The same index layout from query 1 would apply here too.
QUERY THREE
-- 33.899
SELECT SQL_NO_CACHE DISTINCT
AID,
COUNT(*) AS acount
FROM
db.statisticsresults
JOIN db.division_id USING(AID)
WHERE
DID = 28
GROUP BY
AID
ORDER BY
count(*) DESC
LIMIT
19
This one looks like a bit of a waste. You are joining to the division table based on an "AID" column in the stats table. Why do the join unless you actually expect some invalid "AID" values that are not in the division table? Again, compare your "DID" column to 28 instead of '28'. Ensure your division table has an index on "AID" for the join. The SECOND index from query 1 appears to be your better option.
QUERY FOUR
-- 21.403
SELECT SQL_NO_CACHE DISTINCT
TID,
tax,
agent,
COUNT(*) AS t_count
FROM
db.statisticsresults sr
JOIN db.tax_id USING(TID)
JOIN db.agent_id ai ON(ai.AID = sr.AID)
WHERE
DID = 28
GROUP BY
TID,
sr.AID
ORDER BY
COUNT(*) DESC
LIMIT 19
Again, change the "DID" comparison from '28' to 28.
For your TAX_ID table, have a covering index on that too so it can handle the join to the agent table without going to the raw page data:
Tax_ID -- index ( tid, aid )
Finally, if you are only looking for things from Jan 2012 to Dec 2013, as in your original list, you can avoid querying the ENTIRE stats table by adding to your WHERE clause:
AND DateActioned >= '2012-01-01'
So you completely skip over anything prior to 2012 (old data I presume?)
I have two tables, news and news_views. Every time an article is viewed, the news id, IP address and date is recorded in news_views.
I'm using a query with a subquery to fetch the most viewed titles from news, by getting the total count of views in the last 24 hours for each one.
It works fine except that it takes between 5-10 seconds to run, presumably because there's hundreds of thousands of rows in news_views and it has to go through the entire table before it can finish. The query is as follows, is there any way at all it can be improved?
SELECT n.title
, nv.views
FROM news n
LEFT
JOIN (
SELECT news_id
, count( DISTINCT ip ) AS views
FROM news_views
WHERE datetime >= SUBDATE(now(), INTERVAL 24 HOUR)
GROUP
BY news_id
) AS nv
ON nv.news_id = n.id
ORDER
BY views DESC
LIMIT 15
I don't think you need to calculate the count of views as a derived table:
SELECT n.id, n.title, count( DISTINCT nv.ip ) AS views
FROM news n
LEFT JOIN news_views nv
ON nv.news_id = n.id
AND nv.datetime >= SUBDATE(now(), INTERVAL 24 HOUR)
GROUP BY n.id, n.title
ORDER BY views DESC LIMIT 15
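One detail worth checking with a LEFT JOIN is where the date filter on news_views sits, because the placement changes which news rows survive. The sketch below assumes SQLite via Python's sqlite3 module in place of MySQL; the rows are invented, and a fixed cutoff string is a hypothetical stand-in for SUBDATE(now(), INTERVAL 24 HOUR).

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE news_views (news_id INTEGER, ip TEXT, datetime TEXT);
-- invented sample data: 'fresh' has two recent views, 'stale' only an old one
INSERT INTO news VALUES (1, 'fresh'), (2, 'stale');
INSERT INTO news_views VALUES
  (1, 'a', '2024-01-02 10:00:00'), (1, 'b', '2024-01-02 11:00:00'),
  (2, 'a', '2023-12-01 09:00:00');
""")

# Hypothetical fixed cutoff standing in for "24 hours ago"
cutoff = "2024-01-02 00:00:00"

# Filter in WHERE: rows without a recent view are discarded after the join,
# so the LEFT JOIN effectively behaves like an inner join.
filter_in_where = conn.execute("""
SELECT n.title, COUNT(DISTINCT nv.ip) AS views
FROM news n
LEFT JOIN news_views nv ON nv.news_id = n.id
WHERE nv.datetime >= ?
GROUP BY n.id, n.title
ORDER BY views DESC
""", (cutoff,)).fetchall()

# Filter in ON: every news row is kept, with a count of 0 when nothing matches.
filter_in_on = conn.execute("""
SELECT n.title, COUNT(DISTINCT nv.ip) AS views
FROM news n
LEFT JOIN news_views nv ON nv.news_id = n.id AND nv.datetime >= ?
GROUP BY n.id, n.title
ORDER BY views DESC
""", (cutoff,)).fetchall()

print(filter_in_where)
print(filter_in_on)
```

For a top-15 list the difference rarely matters, but it decides whether articles with no recent views appear with a count of 0 or not at all.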
The best advice here is to run these queries through EXPLAIN (MySQL supports it directly) to see what the query will actually do: index scans, table scans, estimated costs, etc. Avoid full table scans.