I have two tables, one that stores product information and one that stores reviews for the products.
I am trying to get the number of reviews submitted for each product between two dates, but for some reason I get the same results regardless of the dates I put in.
This is my query:
SELECT
productName,
COUNT(*) as `count`,
avg(rating) as `rating`
FROM `Reviews`
LEFT JOIN `Products` using(`productID`)
WHERE `date` BETWEEN '2015-07-20' AND '2015-07-30'
GROUP BY
`productName`
ORDER BY `count` DESC, `rating` DESC;
This returns:
+-------------+-------+-----------+
| productName | count | rating    |
+-------------+-------+-----------+
| productA    | 23    | 4.3333333 |
| productB    | 17    | 4.25      |
| productC    | 10    | 3.5       |
+-------------+-------+-----------+
Products table:
+-----------+-------------+
| productID | productName |
+-----------+-------------+
| 1         | productA    |
| 2         | productB    |
| 3         | productC    |
+-----------+-------------+
Reviews table
+---------+-----------+--------+---------------------+
|reviewID | productID | rating | date |
+---------+-----------+--------+---------------------+
| 1 | 1 | 4.5 | 2015-07-27 17:47:01|
| 2 | 1 | 3.5 | 2015-07-27 18:54:22|
| 3 | 3 | 2 | 2015-07-28 13:28:37|
| 4 | 1 | 5 | 2015-07-28 18:33:14|
| 5 | 2 | 1.5 | 2015-07-29 11:58:17|
| 6 | 2 | 3.5 | 2015-07-30 15:04:25|
| 7 | 2 | 2.5 | 2015-07-30 18:11:11|
| 8 | 1 | 3 | 2015-07-30 18:26:23|
| 9 | 1 | 3 | 2015-07-30 21:35:05|
| 10 | 1 | 4.5 | 2015-07-31 14:25:47|
| 11 | 3 | 0.5 | 2015-07-31 14:47:48|
+---------+-----------+--------+---------------------+
When I put in two random dates that I know for sure are not in the date column, I still get the same results. Even when I try to retrieve records for a single day only, I get the same results.
You should not use a left join, because it keeps every row from one table whether or not it has a match in the other. What you should use is something like:
select
productName,
count(*) as `count`,
avg(rating) as `rating`
from
products p,
reviews r
where
p.productID = r.productID
and `date` between '2015-07-20' and '2015-07-30'
group by productName
order by count desc, rating desc;
If the result you're looking for, given your sample data, is:
| productName | count | rating |
|-------------|-------|--------|
| productA | 5 | 4 |
| productB | 3 | 3 |
| productC | 1 | 2 |
That is the count and average of reviews made on any date between 2015-07-20 and 2015-07-30, inclusive.
If so, there are two issues with your query. First, you need to change the join to an inner join instead of a left join. More importantly, you need to change the date condition, because it currently excludes reviews that fall on the last date of the range but after midnight.
This happens because your between clause compares datetime values with date values, so the comparison ends up being date between '2015-07-20 00:00:00' and '2015-07-30 00:00:00', which excludes every review made later on the last day.
The fix is to either change the date condition so that the end is a day later:
where date >= '2015-07-20' and date < '2015-07-31'
or cast the date column to a date value, which will remove the time part:
where date(date) between '2015-07-20' and '2015-07-30'
Sample SQL Fiddle
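To see the boundary effect on the sample data, here is a minimal sketch in Python using the built-in sqlite3 module as a stand-in for MySQL. That substitution is an assumption: SQLite compares the dates as text rather than as datetime values, but for these ISO-formatted strings it cuts off at the same point. The helper name count_reviews is made up for the illustration.

```python
import sqlite3

# Review rows copied from the question: (reviewID, productID, rating, date).
REVIEWS = [
    (1, 1, 4.5, "2015-07-27 17:47:01"),
    (2, 1, 3.5, "2015-07-27 18:54:22"),
    (3, 3, 2.0, "2015-07-28 13:28:37"),
    (4, 1, 5.0, "2015-07-28 18:33:14"),
    (5, 2, 1.5, "2015-07-29 11:58:17"),
    (6, 2, 3.5, "2015-07-30 15:04:25"),
    (7, 2, 2.5, "2015-07-30 18:11:11"),
    (8, 1, 3.0, "2015-07-30 18:26:23"),
    (9, 1, 3.0, "2015-07-30 21:35:05"),
    (10, 1, 4.5, "2015-07-31 14:25:47"),
    (11, 3, 0.5, "2015-07-31 14:47:48"),
]
PRODUCTS = [(1, "productA"), (2, "productB"), (3, "productC")]


def count_reviews(where_clause):
    """Return {productName: review count} for the given WHERE clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Products (productID INTEGER, productName TEXT)")
    conn.execute(
        "CREATE TABLE Reviews "
        "(reviewID INTEGER, productID INTEGER, rating REAL, date TEXT)"
    )
    conn.executemany("INSERT INTO Products VALUES (?, ?)", PRODUCTS)
    conn.executemany("INSERT INTO Reviews VALUES (?, ?, ?, ?)", REVIEWS)
    rows = conn.execute(
        "SELECT productName, COUNT(*) FROM Reviews "
        "INNER JOIN Products USING (productID) "
        "WHERE " + where_clause + " GROUP BY productName"
    ).fetchall()
    conn.close()
    return dict(rows)


# BETWEEN stops at midnight of the 30th, so reviews later that day are lost.
closed = count_reviews("date BETWEEN '2015-07-20' AND '2015-07-30'")
# The half-open range keeps everything up to the end of the 30th.
half_open = count_reviews("date >= '2015-07-20' AND date < '2015-07-31'")
print(closed)
print(half_open)
```

With the between version productA drops from 5 reviews to 3 and productB from 3 to 1, which is exactly the set of reviews timestamped during 2015-07-30.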
You are using a LEFT JOIN between your reviews and your products tables. This will result in all the rows of reviews being shown with some rows having all product columns left empty.
You should use INNER JOIN, as this will filter only the wanted results.
(In the end I can only guess, since I don't even know which column belongs to which table ...)
The full query (very similar to Angelo Giannis's solution):
select
productName,
count(*) as `count`,
avg(rating) as `rating`
from
products INNER JOIN reviews USING(productId)
where date between '2015-07-20' and '2015-07-30'
group by productName
order by count desc, rating desc;
Here is a fiddle with my solution and Angelo's (they both work).
Related
After searching a lot on this forum and the web, I have an issue that I cannot solve without your help.
The requirement looks simple, but the code does not :-(
Basically, I need to make a report on cumulative sales by product by week.
I have a table with the calendar (including all the weeks) and a view which gives me all the cumulative values by product, sorted by week. What I need the query to do is give me all the weeks for each product and then add a column with the cumulative values from the view. If this value does not exist, it should give me the last known record.
Can you help?
Thanks,
The principle is to establish all the weeks in which a product could have had sales, sum the sales grouped by week, add the missing weeks, and use the SUM ... OVER window function to get a cumulative sum.
DROP TABLE IF EXISTS T;
CREATE TABLE T
(PROD INT, DT DATE, AMOUNT INT);
INSERT INTO T VALUES
(1,'2022-01-01', 10),(1,'2022-01-01', 10),(1,'2022-01-20', 10),
(2,'2022-01-10', 10);
WITH CTE AS
(SELECT MIN(YEARWEEK(DT)) MINYW, MAX(YEARWEEK(DT)) MAXYW FROM T),
CTE1 AS
(SELECT DISTINCT YEARWEEK(DTE) YW ,PROD
FROM DATES
JOIN CTE ON YEARWEEK(DTE) BETWEEN MINYW AND MAXYW
CROSS JOIN (SELECT DISTINCT PROD FROM T) C
)
SELECT CTE1.YW,CTE1.PROD
,SUMAMT,
SUM(SUMAMT) OVER(PARTITION BY CTE1.PROD ORDER BY CTE1.YW) CUMSUM
FROM CTE1
LEFT JOIN
(SELECT YEARWEEK(DT) YW,PROD ,SUM(AMOUNT) SUMAMT
FROM T
GROUP BY YEARWEEK(DT),PROD
) S ON S.PROD = CTE1.PROD AND S.YW = CTE1.YW
ORDER BY CTE1.PROD,CTE1.YW
;
+--------+------+--------+--------+
| YW | PROD | SUMAMT | CUMSUM |
+--------+------+--------+--------+
| 202152 | 1 | 20 | 20 |
| 202201 | 1 | NULL | 20 |
| 202202 | 1 | NULL | 20 |
| 202203 | 1 | 10 | 30 |
| 202152 | 2 | NULL | NULL |
| 202201 | 2 | NULL | NULL |
| 202202 | 2 | 10 | 10 |
| 202203 | 2 | NULL | 10 |
+--------+------+--------+--------+
8 rows in set (0.021 sec)
Your calendar table may be slightly different from mine, but you should get the general idea.
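If the logic is easier to follow outside SQL, the same fill-and-accumulate idea can be sketched in plain Python. This is only an illustration, not a replacement for the query: the weekly totals are copied from the SUMAMT column above, the all_weeks list stands in for the calendar table, and the function name cumulative_by_week is made up.

```python
# Weekly totals per (product, week), copied from the SUMAMT column above.
weekly_sales = {(1, 202152): 20, (1, 202203): 10, (2, 202202): 10}
# Every week the report must show; stands in for the calendar table.
all_weeks = [202152, 202201, 202202, 202203]


def cumulative_by_week(weekly_sales, all_weeks):
    """Return (week, product, weekly_sum, running_total) rows with gaps filled."""
    products = sorted({prod for prod, _ in weekly_sales})
    rows = []
    for prod in products:
        running = None  # stays None until the product's first sale
        for week in sorted(all_weeks):
            amount = weekly_sales.get((prod, week))
            if amount is not None:
                running = (running or 0) + amount
            # A week without sales repeats the last known running total.
            rows.append((week, prod, amount, running))
    return rows


report = cumulative_by_week(weekly_sales, all_weeks)
for row in report:
    print(row)
```

The rows come out in the same order as the result table, with None where MySQL shows NULL.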
I'm having a hard time wrapping my mind around what seems like it should be a simple query.
So let's say we have a table that keeps track of amount of widgets/balloons in each store by date. How would you get a list of stores and their latest widget/balloons count?
i.e.
mysql> SELECT * FROM inventory;
+----+------------+-------+---------+---------+
| id | invDate | store | widgets | balloons|
+----+------------+-------+---------+---------+
| 1 | 2011-01-01 | 3 | 50 | 35 |
| 2 | 2011-01-04 | 2 | 50 | 35 |
| 3 | 2013-07-04 | 3 | 12 | 78 |
| 4 | 2020-07-04 | 2 | 47 | 18 |
| 5 | 2020-08-06 | 2 | 16 | NULL |
+----+------------+-------+---------+---------+
5 rows in set (0.00 sec)
I would like the result table to list all stores and their latest inventory of widgets/balloons:
store, latest widgets, latest balloons
+-------+---------+----------+
| store | widgets | balloons |
+-------+---------+----------+
| 2     | 16      | NULL     |
| 3     | 12      | 78       |
+-------+---------+----------+
or grab the latest non-NULL value for balloons.
This works for all versions of MySQL
select i.*
from inventory i
join
(
select store, max(invDate) as maxDate
from inventory
group by store
) tmp on tmp.store = i.store
and tmp.maxDate = i.invDate
With MySQL 8+ you can do window functions:
with cte as
(
select store, widgets, balloons,
ROW_NUMBER() OVER(PARTITION BY store ORDER BY invDate desc) AS rn
from inventory
)
select * from cte where rn = 1
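As a quick check of the join-on-max approach against the sample rows, here is a small sketch using Python's built-in sqlite3 module in place of MySQL (an assumption made so that it runs anywhere). The second query is my own guess at one way to handle the "latest non-NULL balloons" variant from the question, using a correlated subquery; it is not taken from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (id INTEGER, invDate TEXT, store INTEGER, "
    "widgets INTEGER, balloons INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2011-01-01", 3, 50, 35),
        (2, "2011-01-04", 2, 50, 35),
        (3, "2013-07-04", 3, 12, 78),
        (4, "2020-07-04", 2, 47, 18),
        (5, "2020-08-06", 2, 16, None),
    ],
)

# Latest row per store: join each row to its store's maximum invDate.
latest = conn.execute(
    """
    SELECT i.store, i.widgets, i.balloons
    FROM inventory i
    JOIN (SELECT store, MAX(invDate) AS maxDate
          FROM inventory GROUP BY store) tmp
      ON tmp.store = i.store AND tmp.maxDate = i.invDate
    ORDER BY i.store
    """
).fetchall()

# Variant: most recent non-NULL balloons value per store.
latest_non_null = conn.execute(
    """
    SELECT s.store,
           (SELECT b.balloons FROM inventory b
            WHERE b.store = s.store AND b.balloons IS NOT NULL
            ORDER BY b.invDate DESC LIMIT 1) AS balloons
    FROM (SELECT DISTINCT store FROM inventory) s
    ORDER BY s.store
    """
).fetchall()
conn.close()

print(latest)
print(latest_non_null)
```

For store 2 the first query returns NULL balloons from the 2020-08-06 row, while the variant falls back to 18 from 2020-07-04.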
You can use a correlated subquery to get the latest record for each store:
select i.*
from inventory i
where i.invDate = (
select max(invDate)
from inventory
where i.store = store
)
order by i.store
For every ID_Number, there is a bill_date and then two types of bills that happen. I want to return the latest date (max date) for each ID number and then add together the two types of bill amounts. So, based on the table below, it should return:
| 1 | 201604 | 10.00 |
| 2 | 201701 | 28.00 |
tbl_charges
+-----------+-----------+-----------+--------+
| ID_Number | Bill_Date | Bill_Type | Amount |
+-----------+-----------+-----------+--------+
| 1 | 201601 | A | 5.00 |
| 1 | 201601 | B | 7.00 |
| 1 | 201604 | A | 4.00 |
| 1 | 201604 | B | 6.00 |
| 2 | 201701 | A | 15.00 |
| 2 | 201701 | B | 13.00 |
+-----------+-----------+-----------+--------+
Then, if possible, I want to be able to do this in a join in another query, using ID_Number as the column for the join. Would that change the query here?
Note: I am initially only wanting to run the query for about 200 distinct ID_Numbers out of about 10 million. I will be adding an 'IN' clause for those IDs. When I do the join for the final product, I will need to know how to get those latest dates out of all the other join possibilities. (ie, how do I get ID_Number 1 to join with 201604 and not 201601?)
I would use NOT EXISTS and GROUP BY
select t1.id_number, max(t1.bill_date), sum(t1.amount)
from tbl_charges t1
where not exists (
select 1
from tbl_charges t2
where t1.id_number = t2.id_number and
t1.bill_date < t2.bill_date
)
group by t1.id_number
NOT EXISTS filters out the rows that are not on the latest bill date for their ID, and GROUP BY does the sum.
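Here is a small runnable sketch of that query against the sample rows, using Python's sqlite3 module instead of MySQL (an assumption for portability). It also shows one possible way to use the result in a join, as the question asks: the accounts table and its columns are hypothetical and only there to have something to join to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_charges "
    "(id_number INTEGER, bill_date INTEGER, bill_type TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO tbl_charges VALUES (?, ?, ?, ?)",
    [
        (1, 201601, "A", 5.00), (1, 201601, "B", 7.00),
        (1, 201604, "A", 4.00), (1, 201604, "B", 6.00),
        (2, 201701, "A", 15.00), (2, 201701, "B", 13.00),
    ],
)
# Hypothetical table to join against; not part of the question.
conn.execute("CREATE TABLE accounts (id_number INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [(1, "alpha"), (2, "beta")]
)

# Keep only rows with no later bill for the same ID, then sum per ID.
LATEST_SQL = """
    SELECT t1.id_number, MAX(t1.bill_date) AS bill_date,
           SUM(t1.amount) AS total
    FROM tbl_charges t1
    WHERE NOT EXISTS (
        SELECT 1 FROM tbl_charges t2
        WHERE t2.id_number = t1.id_number
          AND t2.bill_date > t1.bill_date
    )
    GROUP BY t1.id_number
"""

totals = conn.execute(LATEST_SQL + " ORDER BY t1.id_number").fetchall()

# The same query used as a derived table in a join on id_number.
joined = conn.execute(
    "SELECT a.name, latest.bill_date, latest.total "
    "FROM accounts a JOIN (" + LATEST_SQL + ") latest "
    "ON latest.id_number = a.id_number ORDER BY a.id_number"
).fetchall()
conn.close()

print(totals)
print(joined)
```

Because the derived table already holds one row per ID on its latest date, the outer join never sees 201601 for ID 1. An IN list restricting the IDs could be added to the inner where clause.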
I would be inclined to filter in the where:
select id_number, sum(c.amount)
from tbl_charges c
where c.bill_date = (select max(c2.bill_date)
from tbl_charges c2
where c2.id_number = c.id_number and c2.bill_type = c.bill_type
)
group by id_number;
Or, another fun way is to use in with tuples:
select id_number, sum(c.amount)
from tbl_charges c
where (c.id_number, c.bill_type, c.bill_date) in
(select c2.id_number, c2.bill_type, max(c2.bill_date)
from tbl_charges c2
group by c2.id_number, c2.bill_type
)
group by id_number;
I want to join 2 tables:
source_table
----------------------------------
| source_id label |
|----------------------------------|
| 1 Contact Form |
| 2 E-Mail |
| 3 Inbound Call |
| 4 Referral |
----------------------------------
related_table
---------------------------------------
| id created_at source |
|---------------------------------------|
| 1 2013-12-26 2 |
| 2 2013-12-26 2 |
| 3 2013-12-26 4 |
| 4 2013-12-25 1 |
| 5 2013-12-18 2 |
| 6 2013-12-16 4 |
| 7 2013-11-30 2 |
---------------------------------------
So that it looks like this:
---------------------------------------
| created_at source amount |
|---------------------------------------|
| 2013-12-26 E-Mail 2 |
| 2013-12-26 Referral 1 |
| 2013-12-25 Contact Form 1 |
| 2013-12-18 E-Mail 1 |
| 2013-12-16 Referral 1 |
---------------------------------------
I want to count the occurrences of each source in related_table grouped by the source for each date in the range.
But I'm not sure how to write the query.
Here's what I have so far:
SELECT DISTINCT
source_table.source_id,
source_table.label AS source,
related_table.created_at,
COUNT(*) AS amount
FROM source_table
INNER JOIN related_table
ON related_table.source=source_table.source_id AND
related_table.created_at>='2013-12-01' AND
related_table.created_at<='2013-12-31'
GROUP BY `source`
ORDER BY `created_at` ASC
I'm not very good with SQL, so the above query might be far off from what I need to have. All I know is that it doesn't work as expected.
My implementation:
select created_at, s.label, amount
from
(
select count(r.Source) as amount, r.source, r.created_at
from related_table r
group by r.source, r.created_at) a inner join source_table s
on a.source = s.source_id
where created_at between '2013-12-01' and '2013-12-31'
order by amount desc, created_at desc
http://sqlfiddle.com/#!2/841bd/2
adjusted demo to your example...
SELECT
created_at
,label as source
,COUNT(*) AS amount
FROM source_table
INNER JOIN related_table
ON source_table.source_id = related_table.source
GROUP BY label, created_at
ORDER BY created_at DESC
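To check the grouping against the expected output, including the December date range from the question, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption). The half-open date range and the secondary sort on label are my additions to make the result deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_table (source_id INTEGER, label TEXT)")
conn.execute(
    "CREATE TABLE related_table (id INTEGER, created_at TEXT, source INTEGER)"
)
conn.executemany(
    "INSERT INTO source_table VALUES (?, ?)",
    [(1, "Contact Form"), (2, "E-Mail"), (3, "Inbound Call"), (4, "Referral")],
)
conn.executemany(
    "INSERT INTO related_table VALUES (?, ?, ?)",
    [
        (1, "2013-12-26", 2),
        (2, "2013-12-26", 2),
        (3, "2013-12-26", 4),
        (4, "2013-12-25", 1),
        (5, "2013-12-18", 2),
        (6, "2013-12-16", 4),
        (7, "2013-11-30", 2),
    ],
)

# One output row per (date, source) pair inside December 2013.
amounts = conn.execute(
    """
    SELECT r.created_at, s.label, COUNT(*) AS amount
    FROM related_table r
    INNER JOIN source_table s ON s.source_id = r.source
    WHERE r.created_at >= '2013-12-01' AND r.created_at < '2014-01-01'
    GROUP BY r.created_at, s.label
    ORDER BY r.created_at DESC, s.label
    """
).fetchall()
conn.close()

for row in amounts:
    print(row)
```

Grouping by both the date and the label is what produces one row per source per day; grouping by source alone collapses all dates into a single row per source.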
Guys, I want to get the top 3 diseases and their counts from the following table for a particular year. What query should I run?
mysql> select id,dname,d_id,entrydate from patient_master;
+----+------------------+------+------------+
| id | dname | d_id | entrydate |
+----+------------------+------+------------+
| 1 | Root Canal | 1 | 2012-08-02 |
| 2 | Cosmetic Filling | 3 | 2012-05-10 |
| 3 | Root Canal | 1 | 2012-05-25 |
| 4 | High BP | 6 | 2012-07-09 |
| 5 | Root Canal | 1 | 2012-07-10 |
| 6 | Normal Filling | 2 | 2012-05-10 |
| 7 | Maleria | 4 | 2012-07-10 |
| 8 | Maleria | 4 | 2012-07-12 |
| 9 | Typhoid | 5 | 2012-07-12 |
+----+------------------+------+------------+
9 rows in set (0.00 sec)
Use a group by clause to combine results by disease, and count(*) to count the number of records for each disease. You can then order from largest to smallest count and use limit 3 to get only the top 3. I have also included a where clause to filter for only records in 2012.
select count(*), dname
from patient_master
where entrydate >= '2012-01-01' and entrydate < '2013-01-01'
group by dname
order by count(*) desc
limit 3
Demo: http://www.sqlfiddle.com/#!2/89c06/6
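A runnable sketch of the same idea on the sample rows, using Python's sqlite3 module rather than MySQL (an assumption). Four diseases tie with one record each, so I added dname as a tie-breaker in the ordering to make the third row predictable; without it the choice among the tied rows is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient_master "
    "(id INTEGER, dname TEXT, d_id INTEGER, entrydate TEXT)"
)
conn.executemany(
    "INSERT INTO patient_master VALUES (?, ?, ?, ?)",
    [
        (1, "Root Canal", 1, "2012-08-02"),
        (2, "Cosmetic Filling", 3, "2012-05-10"),
        (3, "Root Canal", 1, "2012-05-25"),
        (4, "High BP", 6, "2012-07-09"),
        (5, "Root Canal", 1, "2012-07-10"),
        (6, "Normal Filling", 2, "2012-05-10"),
        (7, "Maleria", 4, "2012-07-10"),
        (8, "Maleria", 4, "2012-07-12"),
        (9, "Typhoid", 5, "2012-07-12"),
    ],
)

# Count records per disease in 2012 and keep the three largest groups.
top3 = conn.execute(
    """
    SELECT dname, COUNT(*) AS cnt
    FROM patient_master
    WHERE entrydate >= '2012-01-01' AND entrydate < '2013-01-01'
    GROUP BY dname
    ORDER BY cnt DESC, dname
    LIMIT 3
    """
).fetchall()
conn.close()

print(top3)
```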
SELECT d_id, dname, count(d_id) as `count`, year(entrydate) as `year`
FROM patient_master
GROUP by `year`, d_id
ORDER BY `year` DESC, `count` DESC
Note that I didn't put a limit here: if you want both year and count in the same query, getting the top 3 per year would require a much more complex query.
This will sort by year descending and then by disease count descending within each year. You will not be able to get the row id in this query, nor should you care about that value given what you are trying to do.