I have the following query:
SELECT DATE(utimestamp) as utimestamp, name, data*2000000 from tData
where utimestamp BETWEEN '2016-01-01 00:00:00' AND '2016-04-16 00:00:00'
AND name = 'Valor2' and data>20
group by YEAR(utimestamp), MONTH(utimestamp), name
union
SELECT DATE(utimestamp) as utimestamp, name, data*0.1 from tData
where utimestamp BETWEEN '2016-01-01 00:00:00' AND '2016-04-16 00:00:00'
AND name = 'Valor1' and data>20
group by YEAR(utimestamp), MONTH(utimestamp), name
order by utimestamp asc
Is there a more efficient way of operating with 'data'? Is there a way of doing this without using UNION?
You can try a CASE WHEN ... THEN expression:
SELECT DATE(utimestamp) as utimestamp, name,
case when name = 'Valor1' then data*0.1
when name = 'Valor2' then data*2000000
end as data
from tData
where utimestamp BETWEEN '2016-01-01 00:00:00' AND '2016-04-16 00:00:00'
and name in ('Valor1', 'Valor2')
and data>20
group by YEAR(utimestamp), MONTH(utimestamp), name
order by utimestamp asc
The query in your question is strange because it does a calculation on data without an aggregation function. You are also aggregating by year and month, but not including them in the select list.
I would be inclined to put the values in two separate columns, with the year and month explicitly defined in the query:
select year(utimestamp), month(utimestamp),
sum(case when name = 'Valor1' then data*0.1 end) as valor1,
sum(case when name = 'Valor2' then data*2000000 end) as valor2
from tData
where utimestamp between '2016-01-01' and '2016-04-16' and
name in ('Valor1', 'Valor2') and
data > 20
group by year(utimestamp), month(utimestamp)
order by max(utimestamp);
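As a rough check of the two-column approach, here is a minimal sketch that runs the same conditional aggregation against an in-memory SQLite database from Python. SQLite stands in for MySQL here, so strftime replaces YEAR() and MONTH(); the sample rows are made up, and the factors 0.1 and 2000000 are the ones from the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tData (utimestamp TEXT, name TEXT, data REAL)")
conn.executemany(
    "INSERT INTO tData VALUES (?, ?, ?)",
    [
        # Hypothetical sample rows, not taken from the question.
        ("2016-01-05 10:00:00", "Valor1", 30.0),
        ("2016-01-20 10:00:00", "Valor1", 50.0),
        ("2016-01-07 10:00:00", "Valor2", 25.0),
        ("2016-02-03 10:00:00", "Valor1", 10.0),  # dropped: data <= 20
        ("2016-02-09 10:00:00", "Valor2", 40.0),
        ("2016-02-11 10:00:00", "Other", 99.0),   # dropped: other name
    ],
)

# One pass over the table; each CASE feeds only its own column.
monthly = conn.execute(
    """
    SELECT strftime('%Y', utimestamp) AS yr,
           strftime('%m', utimestamp) AS mon,
           SUM(CASE WHEN name = 'Valor1' THEN data * 0.1 END)     AS valor1,
           SUM(CASE WHEN name = 'Valor2' THEN data * 2000000 END) AS valor2
    FROM tData
    WHERE utimestamp >= '2016-01-01' AND utimestamp < '2016-04-16'
      AND name IN ('Valor1', 'Valor2')
      AND data > 20
    GROUP BY yr, mon
    ORDER BY yr, mon
    """
).fetchall()

for row in monthly:
    print(row)
```

A month with no qualifying rows for one of the names gets NULL (None in Python) in that column, because the CASE has no ELSE branch.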
I need to get the number of distinct parent_ids that satisfy one of the conditions below, grouped by day:
parent_ids that have both status = pending and status = processing
OR
parent_ids that have both status = canceled and status = processing.
I've tried something similar to:
SELECT count(parent_id) as pencan, created_at, DATE_FORMAT(a.created_at, '%Y') AS year_key, DATE_FORMAT(a.created_at, '%m-%d') as day_key
FROM sales_flat_order_status_history a
where created_at BETWEEN '2010-01-01 00:00:00' AND '2013-04-30 23:59:59'
GROUP BY created_at ,parent_id
HAVING SUM(status = 'processing')
AND SUM(status IN ('pending', 'cancelling'))
I think you just need to fix the group by:
SELECT DATE(created_at), count(parent_id) as pencan
FROM sales_flat_order_status_history
where created_at >= '2010-01-01' AND
created_at < '2013-05-01'
GROUP BY DATE(created_at) , parent_id
HAVING SUM(status = 'processing') AND
SUM(status IN ('pending', 'cancelling'))
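Note that grouping by both the day and parent_id returns one row per parent per day. If the goal is a single count of qualifying parents per day, one option is a second level of aggregation over that result. The sketch below is an assumption about the intent, not the answerer's query: it runs on an in-memory SQLite database from Python, uses made-up rows, and uses the status value 'canceled' from the question's prose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE sales_flat_order_status_history "
    "(parent_id INTEGER, status TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO sales_flat_order_status_history VALUES (?, ?, ?)",
    [
        # Hypothetical sample rows.
        (1, "pending", "2013-01-01 09:00:00"),
        (1, "processing", "2013-01-01 10:00:00"),
        (2, "processing", "2013-01-01 11:00:00"),  # only processing: excluded
        (3, "canceled", "2013-01-01 12:00:00"),
        (3, "processing", "2013-01-01 13:00:00"),
        (4, "pending", "2013-01-02 09:00:00"),
        (4, "processing", "2013-01-02 09:30:00"),
        (4, "processing", "2013-01-02 10:00:00"),  # repeat status, same parent
    ],
)

# Inner query: one row per (day, parent) that has both kinds of status.
# Outer query: count those parents per day.
per_day = conn.execute(
    """
    SELECT day, COUNT(*) AS pencan
    FROM (
        SELECT DATE(created_at) AS day, parent_id
        FROM sales_flat_order_status_history
        WHERE created_at >= '2010-01-01' AND created_at < '2013-05-01'
        GROUP BY DATE(created_at), parent_id
        HAVING SUM(status = 'processing') > 0
           AND SUM(status IN ('pending', 'canceled')) > 0
    ) AS per_parent
    GROUP BY day
    ORDER BY day
    """
).fetchall()

print(per_day)
```

The same nested query should carry over to MySQL largely unchanged, since DATE() and boolean sums behave the same way there.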
Suppose I have a table like this:
How can I count, for each person, the number of rows that fall on the day 2018-09-07 and the number of rows that fall in the month 2018-09?
I mean I want to create a table like this:
I know that
SELECT
name,
COUNT(*) AS day_count_2018_09_07
FROM data_table
WHERE
arrive_time >= '2018-09-07 00:00:00'
AND
arrive_time <= '2018-09-07 23:59:59'
GROUP BY name;
gives the number of rows on 2018-09-07 for each person, and
SELECT
name,
COUNT(*) AS month_count_2018_09
FROM data_table
WHERE
arrive_time >= '2018-09-01 00:00:00'
AND
arrive_time <= '2018-09-30 23:59:59'
GROUP BY name;
gives the number of rows in the month 2018-09 for each person.
But I don't know how to combine the two queries so that the day_count_2018_09_07 and month_count_2018_09 columns are produced by one query.
Here's the SQL fiddle where you can directly get the data in my question.
You can use conditional aggregation to get both results from the same query:
SELECT name,
SUM(CASE WHEN SUBSTR(DATE(arrive_time),1,7)='2018-09' THEN 1 ELSE 0 END) AS month_count_2018_09,
SUM(CASE WHEN DATE(arrive_time)='2018-09-07' THEN 1 ELSE 0 END) AS day_count_2018_09_07
FROM data_table
GROUP BY name
Output:
name month_count_2018_09 day_count_2018_09_07
Ben 3 0
Jane 1 1
John 3 2
Try combining them like this. A LEFT JOIN from the month counts keeps people who have rows in the month but none on that day:
Select MonthCounter.name, COALESCE(DayCounter.day_count_2018_09_07, 0) AS day_count_2018_09_07, MonthCounter.month_count_2018_09
from
(SELECT
name,
COUNT(*) AS month_count_2018_09
FROM data_table
WHERE
arrive_time >= '2018-09-01 00:00:00'
AND
arrive_time <= '2018-09-30 23:59:59'
GROUP BY name) as MonthCounter
Left Join
(SELECT
name,
COUNT(*) AS day_count_2018_09_07
FROM data_table
WHERE
arrive_time >= '2018-09-07 00:00:00'
AND
arrive_time <= '2018-09-07 23:59:59'
GROUP BY name) as DayCounter
On DayCounter.name = MonthCounter.name
What about something like this:
SELECT
name,
SUM(CASE WHEN (arrive_time BETWEEN '2018-09-07 00:00:00' AND '2018-09-07 23:59:59') THEN 1 ELSE 0 END) AS day_count_2018_09_07,
SUM(CASE WHEN (arrive_time BETWEEN '2018-09-01 00:00:00' AND '2018-09-30 23:59:59') THEN 1 ELSE 0 END) AS month_count_2018_09
FROM
data_table
GROUP BY
name;
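A variation on the conditional aggregation above uses half-open ranges (>= start, < next start) instead of BETWEEN ... 23:59:59, so a row with fractional seconds at the very end of the day is still counted. This is a minimal sketch on an in-memory SQLite database from Python; the rows are hypothetical, chosen only so the result matches the output table shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute("CREATE TABLE data_table (name TEXT, arrive_time TEXT)")
conn.executemany(
    "INSERT INTO data_table VALUES (?, ?)",
    [
        # Hypothetical sample rows.
        ("Ben", "2018-09-01 08:00:00"),
        ("Ben", "2018-09-15 12:00:00"),
        ("Ben", "2018-09-30 23:59:59"),
        ("Jane", "2018-09-07 00:00:00"),
        ("Jane", "2018-10-01 00:00:00"),      # outside the month
        ("John", "2018-09-07 10:00:00"),
        ("John", "2018-09-07 23:59:59.500"),  # missed by <= '... 23:59:59'
        ("John", "2018-09-20 09:00:00"),
    ],
)

# Half-open ranges: include the lower bound, exclude the upper bound.
counts = conn.execute(
    """
    SELECT name,
           SUM(CASE WHEN arrive_time >= '2018-09-01'
                     AND arrive_time <  '2018-10-01' THEN 1 ELSE 0 END)
               AS month_count_2018_09,
           SUM(CASE WHEN arrive_time >= '2018-09-07'
                     AND arrive_time <  '2018-09-08' THEN 1 ELSE 0 END)
               AS day_count_2018_09_07
    FROM data_table
    GROUP BY name
    ORDER BY name
    """
).fetchall()

print(counts)
```

Because there is no WHERE clause, every name appears in the result, including names whose counts are zero.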
I get the number of tests taken by a unit :
select
date(START_DATE_TIME), product_id, BATCH_SERIAL_NUMBER, count(*)
from
( select START_DATE_TIME, product_id, uut_serial_number, BATCH_SERIAL_NUMBER
from uut_result
where START_DATE_TIME >= '2016-07-01 00:00:00'
and START_DATE_TIME <= '2016-07-07 23:59:59') as passtbl
group by date(START_DATE_TIME), product_id, batch_serial_number;
I fetch the number of tests a unit passed broken down by day:
select
date(START_DATE_TIME), product_id, BATCH_SERIAL_NUMBER, count(*)
from
( select START_DATE_TIME, product_id, uut_serial_number, BATCH_SERIAL_NUMBER
from uut_result
where START_DATE_TIME >= '2016-07-01 00:00:00'
and START_DATE_TIME <= '2016-07-07 23:59:59'
and uut_status = 'passed' ) as passtbl
group by date(START_DATE_TIME), product_id, batch_serial_number;
What I'm finding is that some units don't have any pass records at all, so the second query returns fewer records than the first. This breaks post-processing. Is there a way to catch the absence of a record and replace it with null or some other dummy value?
select date(START_DATE_TIME),
product_id,
BATCH_SERIAL_NUMBER,
status,
count(*)
from (select *,
case when uut_status = 'passed' then uut_status
else 'other statuses'
end status
from uut_result) as t
where START_DATE_TIME >= '2016-07-01 00:00:00'
and START_DATE_TIME <= '2016-07-07 23:59:59'
group by date(START_DATE_TIME),
status,
product_id,
batch_serial_number;
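Another way to keep groups that have no pass records is to leave the status out of the WHERE clause and count passes with a conditional sum, so every group from the first query appears with a pass count of 0 or more. The sketch below is one possible shape, run on an in-memory SQLite database from Python; the rows and the 'failed' status value are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE uut_result (START_DATE_TIME TEXT, product_id TEXT, "
    "uut_serial_number TEXT, BATCH_SERIAL_NUMBER TEXT, uut_status TEXT)"
)
conn.executemany(
    "INSERT INTO uut_result VALUES (?, ?, ?, ?, ?)",
    [
        # Hypothetical sample rows; 'failed' is an assumed status value.
        ("2016-07-01 08:00:00", "P1", "U1", "B1", "passed"),
        ("2016-07-01 09:00:00", "P1", "U2", "B1", "failed"),
        ("2016-07-01 10:00:00", "P2", "U3", "B2", "failed"),
        ("2016-07-02 08:00:00", "P2", "U3", "B2", "failed"),
        ("2016-07-02 09:00:00", "P2", "U4", "B2", "passed"),
    ],
)

# Total tests and passed tests in one query; groups with no passes get 0.
summary = conn.execute(
    """
    SELECT DATE(START_DATE_TIME) AS day,
           product_id,
           BATCH_SERIAL_NUMBER,
           COUNT(*) AS tests,
           SUM(CASE WHEN uut_status = 'passed' THEN 1 ELSE 0 END) AS passed
    FROM uut_result
    WHERE START_DATE_TIME >= '2016-07-01' AND START_DATE_TIME < '2016-07-08'
    GROUP BY DATE(START_DATE_TIME), product_id, BATCH_SERIAL_NUMBER
    ORDER BY day, product_id, BATCH_SERIAL_NUMBER
    """
).fetchall()

for row in summary:
    print(row)
```

Both counts come back on the same row, so post-processing no longer has to line up two result sets of different lengths.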
My standard answer to everything like this is to use a common table expression and window functions, instead of using group by where you lose the details and have to struggle to recover them.
To get a dummy row you might use a union like this:
;with myCTE (unitId, otherDetail, startDate, passed)
as (
select unitId, otherDetail, startDate, Sum(1) Over (partition by unitId) as passed
from sourceTable
)
SELECT unitId, otherDetail, passed
from myCTE
where startDate >= lowerBound and startDate < upperBound
UNION
SELECT unitId, otherDetail, 0 as passed
from sourceTable S
where not exists (select 1 from myCTE where myCTE.unitId = S.unitId
and startDate >= lowerBound and startDate < upperBound)
I think that's a pretty good rough sketch of what you need.
Also, I would use a half-open interval to compare times, on the off chance that the start time falls between 23:59:59 and 0:00 the next day.
You never mentioned what db engine [Duh, it's in the title; I was looking for a tag]. CTEs are available on SQL Server and Oracle, but not on MySQL before version 8.0.
For most uses you can substitute a correlated subquery, but you have to repeat yourself. The ';' before WITH is a quirk of SQL Server.
Since you are on MySQL, on versions without CTE support you have to duplicate the CTE as a subquery wherever it is referenced. Or maybe you have table-valued functions??
SELECT *,
DATE(date) AS post_day
FROM notes
WHERE MONTH(date) = '08'
AND userid = '2'
AND YEAR(date) = '2016'
ORDER BY post_day DESC,
timestamp ASC
In this query I'm grouping my posts by day.
What I'm struggling with is calculating the total word count of all notes for each day. There is a word count column which contains the word count for each post. Is it possible to calculate this sum in the same query or does it need to be made separately?
By table columns:
NoteID UserID Date Note WordCount
SELECT date, SUM(wordCount) AS monthWordCount
FROM Notes
WHERE userID = 2
AND MONTH(date) = '08'
AND YEAR(date) = '2016'
GROUP BY userID, date
Check out this demo using the above code.
If you want to return the notes as well you can do a subquery like below:
SELECT noteID, userID, date, note, wordCount,
(SELECT SUM(wordCount)
FROM Notes
WHERE userID = a.userID
AND date = a.date
GROUP BY userID) AS dayTotalWordCount
FROM Notes a
WHERE a.userID = 2
AND MONTH(date) = '08'
AND YEAR(date) = '2016'
Here's a demo using the above code.
First, don't use select * with a group by query. select * just doesn't make sense with aggregation . . . you need to apply aggregation functions.
I assume that you want something like this:
SELECT DATE(date) as post_day, SUM(WordCount) as day_word_count
FROM notes
WHERE MONTH(date)= '08' AND userid = '2' AND YEAR(date)= '2016'
GROUP BY DATE(date)
ORDER BY post_day DESC
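To see the per-day totals concretely, here is a minimal sketch of the grouped query on an in-memory SQLite database from Python. SQLite has no MONTH() or YEAR(), so a half-open date range is used for August 2016 instead; the rows are made up, and the column names follow the list given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE notes (noteID INTEGER, userID INTEGER, date TEXT, "
    "note TEXT, wordCount INTEGER)"
)
conn.executemany(
    "INSERT INTO notes VALUES (?, ?, ?, ?, ?)",
    [
        # Hypothetical sample rows.
        (1, 2, "2016-08-01 09:00:00", "a", 100),
        (2, 2, "2016-08-01 18:00:00", "b", 50),
        (3, 2, "2016-08-02 09:00:00", "c", 70),
        (4, 3, "2016-08-02 10:00:00", "d", 999),  # other user
        (5, 2, "2016-09-01 00:00:00", "e", 10),   # outside August
    ],
)

# One row per day with the summed word count and the number of notes.
daily_totals = conn.execute(
    """
    SELECT DATE(date) AS post_day,
           SUM(wordCount) AS day_word_count,
           COUNT(*) AS note_count
    FROM notes
    WHERE userID = 2
      AND date >= '2016-08-01' AND date < '2016-09-01'
    GROUP BY DATE(date)
    ORDER BY post_day DESC
    """
).fetchall()

print(daily_totals)
```

A range predicate on the date column can also use an index on that column, which wrapping the column in MONTH() and YEAR() generally prevents.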
I have a query with subqueries for a timeline widget of participants, leads and customers.
For example, with 15k rows in the table but only 2k in this date range (January 1st to January 28th), this takes about 40 seconds!
SELECT created_at as date,
(
SELECT COUNT(id)
FROM participant
WHERE created_at <= date
) as participants,
(
SELECT COUNT(DISTINCT id)
FROM participant
WHERE participant_type = "lead"
AND created_at <= date
) as leads,
(
SELECT COUNT(DISTINCT id)
FROM participant
WHERE participant_type = "customer"
AND created_at <= date
) as customer
FROM participant
WHERE created_at >= '2016-01-01 00:00:00'
AND created_at <= '2016-01-28 23:59:59'
GROUP BY date(date)
How can I improve the performance?
The table fields are declared as follows:
id => primary_key, INT 10, auto increment
participant_type => ENUM "lead,customer", NULLABLE, utf8_unicode_ci
created_at => TIMESTAMP, default '0000-00-00 00:00:00'
Possibly try cross joining the table to itself and putting the conditions inside the counts to get the values you want. The counts are DISTINCT so that a participant is not counted once per matching row of the day:
SELECT DATE(a.created_at) as date,
COUNT(DISTINCT IF(b.created_at <= a.created_at, b.id, NULL)) AS participants,
COUNT(DISTINCT IF(b.participant_type = "lead" AND b.created_at <= a.created_at, b.id, NULL)) AS leads,
COUNT(DISTINCT IF(b.participant_type = "customer" AND b.created_at <= a.created_at, b.id, NULL)) AS customer
FROM participant a
CROSS JOIN participant b
WHERE a.created_at >= '2016-01-01 00:00:00'
AND a.created_at <= '2016-01-28 23:59:59'
GROUP BY DATE(a.created_at)
Or maybe move the date check into the join:
SELECT DATE(a.created_at) as date,
COUNT(DISTINCT b.id) AS participants,
COUNT(DISTINCT IF(b.participant_type = "lead", b.id, NULL)) AS leads,
COUNT(DISTINCT IF(b.participant_type = "customer", b.id, NULL)) AS customer
FROM participant a
LEFT OUTER JOIN participant b
ON b.created_at <= a.created_at
WHERE a.created_at >= '2016-01-01 00:00:00'
AND a.created_at <= '2016-01-28 23:59:59'
GROUP BY DATE(a.created_at)
I don't fully understand what you want to do with this query, but maybe I can suggest a way to optimize it.
Try this one:
SELECT
participants.day as day,
participants.total_count,
leads.lead_count,
customer.customer_count
FROM
(
SELECT DATE(created_at) as day, COUNT(id) as total_count
FROM participant
WHERE created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
GROUP BY DATE(created_at)
) as participants
LEFT JOIN
(
SELECT DATE(created_at) as day, COUNT(DISTINCT id) as lead_count
FROM participant
WHERE participant_type = "lead"
AND created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
GROUP BY DATE(created_at)
) as leads ON (participants.day = leads.day)
LEFT JOIN
(
SELECT DATE(created_at) as day, COUNT(DISTINCT id) as customer_count
FROM participant
WHERE participant_type = "customer"
AND created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
GROUP BY DATE(created_at)
) as customer ON (participants.day = customer.day)
Add an index on the columns the query filters on. You can run EXPLAIN on this query.
With the help of EXPLAIN, you can see where you should add indexes to tables so that the statement executes faster by using indexes to find rows.
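If the widget needs running totals (everyone created up to each day), one option that avoids a subquery per row is to aggregate once per day and accumulate in application code. This is a swapped-in technique rather than any of the answers above, sketched on an in-memory SQLite database from Python; the index name and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE participant "
    "(id INTEGER PRIMARY KEY, participant_type TEXT, created_at TEXT)"
)
# Hypothetical index name; it supports the range filters on created_at.
conn.execute("CREATE INDEX idx_participant_created_at ON participant (created_at)")
conn.executemany(
    "INSERT INTO participant (participant_type, created_at) VALUES (?, ?)",
    [
        # Hypothetical sample rows.
        ("lead", "2015-12-30 10:00:00"),      # before the range
        (None, "2015-12-31 10:00:00"),        # before the range
        ("lead", "2016-01-01 09:00:00"),
        ("customer", "2016-01-01 15:00:00"),
        ("customer", "2016-01-03 11:00:00"),
        (None, "2016-01-03 12:00:00"),
        ("lead", "2016-02-01 08:00:00"),      # after the range
    ],
)

START, END = "2016-01-01", "2016-01-29"  # half-open range [START, END)

# Shared column list: total, leads, customers. COALESCE covers all-NULL sums.
COUNTS = """
    COUNT(*),
    COALESCE(SUM(participant_type = 'lead'), 0),
    COALESCE(SUM(participant_type = 'customer'), 0)
"""

# Totals for everything created before the range starts.
baseline = conn.execute(
    f"SELECT {COUNTS} FROM participant WHERE created_at < ?", (START,)
).fetchone()

# New rows per day inside the range, in a single pass.
daily = conn.execute(
    f"""
    SELECT DATE(created_at) AS day, {COUNTS}
    FROM participant
    WHERE created_at >= ? AND created_at < ?
    GROUP BY day
    ORDER BY day
    """,
    (START, END),
).fetchall()

# Accumulate the daily counts on top of the baseline.
timeline = []
total, leads, customers = baseline
for day, n, n_lead, n_cust in daily:
    total += n
    leads += n_lead
    customers += n_cust
    timeline.append((day, total, leads, customers))

print(timeline)
```

Days without any new rows do not appear in the result; the widget would carry the previous day's totals forward for those.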