How to optimize this query with some subqueries in it - mysql

My query is like this:
SELECT date_format( created_at, '%Y-%m-%d' ) AS the_date,
COUNT(s.id) AS total,
(SELECT COUNT(ks.id) FROM kc_shares ks WHERE site = 'facebook' AND date_format( created_at, '%Y-%m-%d' ) = the_date ) AS total_facebook,
(SELECT COUNT(ks.id) FROM kc_shares ks WHERE site = 'twitter' AND date_format( created_at, '%Y-%m-%d' ) = the_date ) AS total_twitter
FROM `kc_shares` s
GROUP BY `the_date`
What I want to get is the number of daily shares, broken down into the overall total, total shares to facebook (thus site = 'facebook') and total shares to twitter. That's why I need the GROUP BY.
When it had only a few thousand rows, there was no problem. But the table currently has almost 200,000 rows, and the query is very slow, taking about 20-30 seconds, possibly more.
I've tried adding indices to site and created_at fields but to no avail.
Thanks

I think the subqueries are eating up performance. So maybe you can do something like this:
SELECT
date_format( created_at, '%Y-%m-%d' ) AS the_date,
COUNT(s.id) AS total,
SUM(CASE WHEN s.site='facebook' THEN 1 ELSE 0 END) AS total_facebook,
SUM(CASE WHEN s.site='twitter' THEN 1 ELSE 0 END) AS total_twitter
FROM
`kc_shares` s
GROUP BY
`the_date`
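As a minimal sketch of the conditional aggregation above, the following Python script runs a query of the same shape against an in-memory SQLite database. The sample rows are made up, and SQLite's date() is an assumption standing in for MySQL's date_format(created_at, '%Y-%m-%d'); on MySQL the original expression would be kept.

```python
import sqlite3

# Hypothetical sample rows: (id, site, created_at)
rows = [
    (1, "facebook", "2013-05-01 09:15:00"),
    (2, "twitter", "2013-05-01 10:30:00"),
    (3, "facebook", "2013-05-01 18:45:00"),
    (4, "email", "2013-05-02 08:00:00"),
    (5, "twitter", "2013-05-02 12:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kc_shares (id INTEGER PRIMARY KEY, site TEXT, created_at TEXT)")
conn.executemany("INSERT INTO kc_shares VALUES (?, ?, ?)", rows)

# One pass over the table: each CASE contributes 1 only for the matching site.
# date() is the SQLite stand-in for MySQL's date_format(..., '%Y-%m-%d').
daily_totals = conn.execute(
    """
    SELECT date(created_at) AS the_date,
           COUNT(id) AS total,
           SUM(CASE WHEN site = 'facebook' THEN 1 ELSE 0 END) AS total_facebook,
           SUM(CASE WHEN site = 'twitter' THEN 1 ELSE 0 END) AS total_twitter
    FROM kc_shares
    GROUP BY the_date
    ORDER BY the_date
    """
).fetchall()

for row in daily_totals:
    print(row)
```

Each row is read once, instead of the table being rescanned by two subqueries for every output row.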

Move the subselects so that you join against them, rather than running a subselect for every returned row.
Something like this (untested):-
SELECT date_format( created_at, '%Y-%m-%d' ) AS the_date,
COUNT(s.id) AS total,
Sub1.total_facebook, Sub2.total_twitter
FROM `kc_shares` s
LEFT OUTER JOIN (SELECT date_format( created_at, '%Y-%m-%d' ) AS sub_date, COUNT(ks.id) AS total_facebook FROM kc_shares ks WHERE site = 'facebook' GROUP BY sub_date ) Sub1 ON date_format( created_at, '%Y-%m-%d' ) = Sub1.sub_date
LEFT OUTER JOIN (SELECT date_format( created_at, '%Y-%m-%d' ) AS sub_date, COUNT(ks.id) AS total_twitter FROM kc_shares ks WHERE site = 'twitter' GROUP BY sub_date ) Sub2 ON date_format( created_at, '%Y-%m-%d' ) = Sub2.sub_date
GROUP BY `the_date`
Finding a way to join on a non-derived column (i.e. not the date part computed from the date/time) would also help. This is possibly a good case for a bit of denormalisation: adding a field that holds just the date, in addition to the date/time currently stored.
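The denormalisation idea could look roughly like the sketch below, again using Python with an in-memory SQLite database as a stand-in for MySQL. The column name created_date and the index name are hypothetical, and the application (or a trigger) would have to keep the new column filled for new rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kc_shares (id INTEGER PRIMARY KEY, site TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO kc_shares (site, created_at) VALUES (?, ?)",
    [
        ("facebook", "2013-05-01 09:15:00"),
        ("twitter", "2013-05-01 10:30:00"),
        ("facebook", "2013-05-02 08:00:00"),
    ],
)

# Hypothetical date-only column, backfilled from the existing date/time.
conn.execute("ALTER TABLE kc_shares ADD COLUMN created_date TEXT")
conn.execute("UPDATE kc_shares SET created_date = date(created_at)")

# An index on the stored date can be used for grouping and joining,
# which is not possible for an expression computed per row.
conn.execute("CREATE INDEX idx_shares_date_site ON kc_shares (created_date, site)")

per_day = conn.execute(
    "SELECT created_date, COUNT(*) FROM kc_shares GROUP BY created_date ORDER BY created_date"
).fetchall()
print(per_day)
```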

An alternative would be to change the way the query works. The following would provide rows for each day/site rather than having the two sites on the same row.
SELECT
date_format( created_at, '%Y-%m-%d' ) AS the_date,site,
count(id)
FROM
kc_shares s
where
(site="facebook" or site="twitter")
group by
created_at, site
I'm assuming that created_at is a date field.
This should provide the same data (I think, I haven't tried it) but in a different format.
Try an index on (created_at,site).

Related

MySQL Sub Query difficulty

I'm trying to get a count of how many times a user has triggered the following query. And I've concluded that a Sub Query is required.
The below (admittedly indelicate) query works, as far as it goes, without the Sub Query. And the Sub Query works as a standalone query. But after three days of trying, I cannot get the two to work combined. I don't know if I have a glaring syntax error, or whether I'm getting it all wrong in principle. I need help!
SELECT id, status, FirstName, LastName, Track, KeyChange, Version,
DATE_FORMAT(CONVERT_TZ(Created,'+00:00','+1:00'), '%l:%i %p') AS Created_formatted,
TIME_FORMAT(SEC_TO_TIME(TIMESTAMPDIFF(SECOND, pinknoise.Created, CURRENT_TIMESTAMP() - INTERVAL '0' HOUR)),'%Hh %im') AS elapsed,
(SELECT `FirstName`, Count(*) AS 'CountRequests' FROM `pinknoise` GROUP by `FirstName`)
FROM pinknoise
WHERE status = 'incoming'
ORDER BY Created DESC
I don't really understand what your query should achieve, but well formatted it looks like:
SELECT
id,
status,
FirstName,
LastName,
Track,
KeyChange,
Version,
DATE_FORMAT(
CONVERT_TZ(
Created,
'+00:00',
'+1:00'
),
'%l:%i %p'
) AS Created_formatted,
TIME_FORMAT(
SEC_TO_TIME(
TIMESTAMPDIFF(
SECOND,
pinknoise.Created,
CURRENT_TIMESTAMP() - INTERVAL '0' HOUR
)
),
'%Hh %im'
) AS elapsed,
(
SELECT
`FirstName`,
Count(*) AS 'CountRequests'
FROM
`pinknoise`
GROUP by
`FirstName`
)
FROM
pinknoise
WHERE
status = 'incoming'
ORDER BY
Created DESC
What I imagine: you want the number of total entries for this particular firstname in the same table. The dirty way would be:
SELECT
id,
status,
FirstName,
LastName,
Track,
KeyChange,
Version,
DATE_FORMAT(
CONVERT_TZ(
Created,
'+00:00',
'+1:00'
),
'%l:%i %p'
) AS Created_formatted,
TIME_FORMAT(
SEC_TO_TIME(
TIMESTAMPDIFF(
SECOND,
pinknoise.Created,
CURRENT_TIMESTAMP() - INTERVAL '0' HOUR
)
),
'%Hh %im'
) AS elapsed,
(
SELECT
Count(*)
FROM
`pinknoise` AS tb
WHERE
tb.FirstName = pinknoise.FirstName
) AS CountRequests
FROM
pinknoise
WHERE
status = 'incoming'
ORDER BY
Created DESC
A join would perform much better:
SELECT
pinknoise.id,
pinknoise.status,
pinknoise.FirstName,
pinknoise.LastName,
pinknoise.Track,
pinknoise.KeyChange,
pinknoise.Version,
DATE_FORMAT(
CONVERT_TZ(
pinknoise.Created,
'+00:00',
'+1:00'
),
'%l:%i %p'
) AS Created_formatted,
TIME_FORMAT(
SEC_TO_TIME(
TIMESTAMPDIFF(
SECOND,
pinknoise.Created,
CURRENT_TIMESTAMP() - INTERVAL '0' HOUR
)
),
'%Hh %im'
) AS elapsed,
tabA.CountRequests
FROM
pinknoise
INNER JOIN
(
SELECT
Count(*) AS 'CountRequests',
FirstName
FROM
`pinknoise`
GROUP BY
FirstName
) tabA
ON
pinknoise.FirstName = tabA.FirstName
WHERE
status = 'incoming'
ORDER BY
Created DESC
Your subselect is returning 2 values in the select portion where it only expects one value. I'm guessing you are getting the FirstName with the intent of doing a join. If so, then try this:
SELECT
p.id,
p.status,
p.FirstName,
p.LastName,
p.Track,
p.KeyChange,
p.Version,
DATE_FORMAT(CONVERT_TZ(p.Created,'+00:00','+1:00'), '%l:%i %p') AS Created_formatted,
TIME_FORMAT(SEC_TO_TIME(TIMESTAMPDIFF(SECOND, p.Created, CURRENT_TIMESTAMP() - INTERVAL '0' HOUR)),'%Hh %im') AS elapsed,
cnt.CountRequests
FROM
pinknoise p
inner join (SELECT p.FirstName, Count(*) AS CountRequests FROM pinknoise p GROUP by p.FirstName) cnt on p.FirstName = cnt.FirstName
WHERE
p.status = 'incoming'
ORDER BY
p.Created DESC;
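The join against a grouped derived table can be tried out with the following sketch (Python with an in-memory SQLite database; the rows are invented and only three of the columns are kept). Note that, as in the answers above, the count covers all of a person's rows, not only the 'incoming' ones, because the derived table is not filtered by status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pinknoise (id INTEGER PRIMARY KEY, status TEXT, FirstName TEXT)")
conn.executemany(
    "INSERT INTO pinknoise VALUES (?, ?, ?)",
    [
        (1, "incoming", "Ann"),
        (2, "played", "Ann"),
        (3, "incoming", "Bob"),
        (4, "incoming", "Ann"),
    ],
)

# The derived table cnt is computed once and joined on FirstName,
# so every outer row picks up the request count for its person.
requests = conn.execute(
    """
    SELECT p.id, p.FirstName, cnt.CountRequests
    FROM pinknoise p
    INNER JOIN (
        SELECT FirstName, COUNT(*) AS CountRequests
        FROM pinknoise
        GROUP BY FirstName
    ) cnt ON p.FirstName = cnt.FirstName
    WHERE p.status = 'incoming'
    ORDER BY p.id
    """
).fetchall()
print(requests)
```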

Group by help ( grouping by multiple, have duplicates)

So I have a task where I need to group my results by date and by provider_name, but currently my code lists each date and provider multiple times. I need one row per provider per day (25 days in all), so that my table shows how many messages the provider got that day and how much they earned.
This needs to be my result. Result table
But this is what i'm currently getting
This is my code currently
SELECT date_format( time, '%Y-%m-%d' ) AS Date, provider_name, COUNT( message_id ) AS Messages_count, SUM( price ) AS Total_price
FROM mobile_log_messages_sms
INNER JOIN service_instances ON service_instances.service_instance_id = mobile_log_messages_sms.service_instance_id
INNER JOIN mobile_providers ON mobile_providers.network_code = mobile_log_messages_sms.network_code
WHERE time
BETWEEN '2017-02-26 00:00:00'
AND time
AND '2017-03-22 00:00:00'
AND price IS NOT NULL
AND price <> ''
AND service IS NOT NULL
AND service <> ''
AND enabled IS NOT NULL
AND enabled >=1
GROUP BY provider_name, time
ORDER BY time DESC
Can you tell me where I've messed up? I really can't figure out the answer.
Try like this:
....
GROUP BY provider_name, date_format( time, '%Y-%m-%d' )
ORDER BY time DESC
You are grouping by time, which includes the hour, minute and second, so each distinct timestamp becomes its own group. That is why you get several rows with different counts for the same day. Try grouping by day instead.
The time column is a datetime, so the rows are grouped by date and time together rather than just by date.
Change the GROUP BY clause to
GROUP BY provider_name, date_format( time, '%Y-%m-%d' )
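The difference between grouping by the full datetime and grouping by the day can be seen in this small sketch. It uses Python with an in-memory SQLite database, a simplified hypothetical table sms_log instead of the three joined tables, and SQLite's date() in place of MySQL's date_format(time, '%Y-%m-%d').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sms_log (provider_name TEXT, time TEXT, price REAL)")
conn.executemany(
    "INSERT INTO sms_log VALUES (?, ?, ?)",
    [
        ("Alpha", "2017-03-01 08:00:00", 0.5),
        ("Alpha", "2017-03-01 17:30:00", 0.5),
        ("Beta", "2017-03-01 09:00:00", 1.0),
        ("Alpha", "2017-03-02 10:00:00", 0.5),
    ],
)

# Grouping by the raw datetime: every distinct timestamp is its own group.
by_timestamp = conn.execute(
    "SELECT provider_name, time, COUNT(*) FROM sms_log GROUP BY provider_name, time"
).fetchall()

# Grouping by the date part: one row per provider per day.
by_day = conn.execute(
    """
    SELECT date(time) AS the_date, provider_name, COUNT(*) AS messages, SUM(price) AS total_price
    FROM sms_log
    GROUP BY provider_name, date(time)
    ORDER BY the_date, provider_name
    """
).fetchall()

print(len(by_timestamp), "groups by timestamp")
print(by_day)
```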

Reformat sql to show rows where main query returns null

Consider this sql:
SELECT DATE_FORMAT( Orders.Timestamp, '%Y%m' ) AS Period,
SUM(Price) AS 'Ordersum per month and organisation', Orders.Organisation,
(
SELECT SUM(Amount) AS Returns
FROM Returns
WHERE DATE_FORMAT( Returns.Timestamp, '%Y%m' ) = Period
AND Returns.Organisation = Orders.Organisation
) Returns
FROM Orders
GROUP BY Period, Organisation
Whenever the subquery has rows whose period has no equivalent in the main query, that period isn't displayed. The reason is that the query takes its periods from the Orders table, so a period that exists only in the Returns table never appears in the result.
Is there a way to reformat this query to achieve what I want?
Sqlfiddle here http://sqlfiddle.com/#!9/ace715/1
You can use a left and a right join combined with UNION, like this:
SELECT
ifnull(DATE_FORMAT( Orders.Timestamp,'%Y%m' ),DATE_FORMAT(Returns.Timestamp,'%Y%m' )) AS Period,
SUM(Price) AS 'Ordersum per month and organisation',
ifnull(Orders.Organisation,Returns.Organisation) as 'Organisation',
SUM(Amount) AS 'Returns'
FROM Orders
left JOIN Returns
on DATE_FORMAT( Orders.Timestamp,'%Y%m' ) = DATE_FORMAT(Returns.Timestamp, '%Y%m' )
and Returns.Organisation = Orders.Organisation
GROUP BY Period, Returns.Organisation, Orders.Organisation
union
select ifnull(DATE_FORMAT( Orders.Timestamp, '%Y%m' ),DATE_FORMAT(Returns.Timestamp,'%Y%m' )) AS Period,
SUM(Price) AS 'Ordersum per month and organisation',
ifnull(Orders.Organisation,Returns.Organisation),
SUM(Amount) AS 'Returns'
FROM Orders
right JOIN Returns
on DATE_FORMAT( Orders.Timestamp, '%Y%m' ) = DATE_FORMAT(Returns.Timestamp, '%Y%m' )
and Returns.Organisation = Orders.Organisation
GROUP BY Period, Returns.Organisation, Orders.Organisation
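One thing to watch with a join on the raw rows is that an order may be counted once per matching return in the same period. A possible variation, sketched below as an assumption rather than as the answer's method, sums each table per period and organisation first and only then combines the two results. The sketch uses Python with an in-memory SQLite database, invented rows, strftime('%Y%m', ...) in place of DATE_FORMAT(..., '%Y%m'), and two LEFT JOINs with the tables swapped instead of a RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Timestamp TEXT, Organisation TEXT, Price INTEGER)")
conn.execute("CREATE TABLE Returns (Timestamp TEXT, Organisation TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [("2017-01-10", "A", 100), ("2017-01-20", "A", 50), ("2017-02-05", "B", 70)],
)
conn.executemany(
    "INSERT INTO Returns VALUES (?, ?, ?)",
    [("2017-01-15", "A", 30), ("2017-01-25", "A", 5), ("2017-03-02", "A", 20)],
)

# o and r hold one row per (Period, Organisation), so joining them cannot multiply rows.
# The UNION of both join directions keeps periods that exist in only one table.
report = conn.execute(
    """
    WITH o AS (
        SELECT strftime('%Y%m', Timestamp) AS Period, Organisation, SUM(Price) AS OrderSum
        FROM Orders GROUP BY Period, Organisation
    ),
    r AS (
        SELECT strftime('%Y%m', Timestamp) AS Period, Organisation, SUM(Amount) AS ReturnSum
        FROM Returns GROUP BY Period, Organisation
    )
    SELECT o.Period, o.Organisation, o.OrderSum, r.ReturnSum
    FROM o LEFT JOIN r ON r.Period = o.Period AND r.Organisation = o.Organisation
    UNION
    SELECT r.Period, r.Organisation, o.OrderSum, r.ReturnSum
    FROM r LEFT JOIN o ON o.Period = r.Period AND o.Organisation = r.Organisation
    ORDER BY 1, 2
    """
).fetchall()

for row in report:
    print(row)
```

With these sample rows, period 201703 has returns but no orders and still shows up, with NULL (None) in the order sum.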

Inaccurate New User Query

SELECT COUNT( uid ) AS `Records` , DATE( FROM_UNIXTIME( 'since` ) ) AS `Date`
FROM `accounts` WHERE FROM_UNIXTIME(since) >= FROM_UNIXTIME($tstamp)
GROUP BY WEEK( FROM_UNIXTIME( `since` ) )
LIMIT 200
I was using this to try to get the daily new user signups from a specified date, but it's turning out to be incredibly inaccurate. That means either my query is off or possibly there is some issue involving timezones? Below is an example result I got from an example data set I loaded in, as well as a page worth of timestamps so you can see what the results should be.
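The query selects a daily date but groups by WEEK(...), so each returned row covers a whole week; group by DATE(...) to get one row per day.
The date filter can stay in WHERE, which removes rows before they are grouped. If you move it to HAVING, it has to refer to the grouped expression, because HAVING is evaluated after grouping.
Also, the backtick (`) is not used properly in this code: since is opened with a single quote and closed with a backtick.
So change this query:
SELECT COUNT( uid ) AS `Records` , DATE( FROM_UNIXTIME( 'since` ) ) AS `Date`
FROM `accounts` WHERE FROM_UNIXTIME(since) >= FROM_UNIXTIME($tstamp)
GROUP BY WEEK( FROM_UNIXTIME( `since` ) )
LIMIT 200
to this one:
SELECT COUNT(`uid`) AS Records , DATE( FROM_UNIXTIME(`since`) ) AS `Date`
FROM accounts
GROUP BY DATE( FROM_UNIXTIME( `since` ) )
HAVING DATE( FROM_UNIXTIME(`since`) ) >= DATE( FROM_UNIXTIME($tstamp) )
LIMIT 200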
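A daily signup count from Unix timestamps can be sketched as follows, using Python with an in-memory SQLite database and made-up rows. SQLite's date(since, 'unixepoch') is an assumption standing in for MySQL's DATE(FROM_UNIXTIME(since)). It always converts in UTC, whereas FROM_UNIXTIME() in MySQL uses the session time zone, which may be worth checking if the daily boundaries look shifted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (uid INTEGER PRIMARY KEY, since INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [
        (1, 1609459200),  # 2021-01-01 00:00:00 UTC
        (2, 1609462800),  # 2021-01-01 01:00:00 UTC
        (3, 1609545600),  # 2021-01-02 00:00:00 UTC
        (4, 1609549200),  # 2021-01-02 01:00:00 UTC
        (5, 1609632000),  # 2021-01-03 00:00:00 UTC
    ],
)

tstamp = 1609545600  # hypothetical start of the reporting window

# Rows are filtered first, then grouped by calendar day (not by week).
signups = conn.execute(
    """
    SELECT COUNT(uid) AS records, date(since, 'unixepoch') AS signup_date
    FROM accounts
    WHERE since >= ?
    GROUP BY signup_date
    ORDER BY signup_date
    """,
    (tstamp,),
).fetchall()
print(signups)
```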

How to resolve issue in order by date

SELECT COUNT(patient_id) AS idpateint,
patient_id
FROM patient
WHERE STR_TO_DATE(date_enter,'%d/%m/%Y' )
BETWEEN STR_TO_DATE( '$repeat','%d/%m/%Y' ) AND STR_TO_DATE('$to','%d/%m/%Y')
AND patient_type='opd'
AND patient_id ='$idpatient1'
ORDER BY STR_TO_DATE(date_enter,'%d/%m/%Y' )
Actually, I saved the date in dd/mm/yyyy format in the db. ORDER BY is not working: there is no SQL error, but the dates are not coming out in order.
Here is your query:
select count(patient_id) as numpatients, patient_id
from patient
where STR_TO_DATE(date_enter, '%d/%m/%Y' ) between STR_TO_DATE('$repeat', '%d/%m/%Y' ) and
STR_TO_DATE('$to', '%d/%m/%Y') and
patient_type = 'opd' and
patient_id = '$idpatient1'
order by STR_TO_DATE(date_enter, '%d/%m/%Y' )
You have a count() in the select. This turns the query into an aggregation query, so it only returns one row. In addition, you are selecting only a single patient id. I could imagine that you want the counts by date, because you are so focused on date.
The following will give counts by date, regardless of the patient:
select STR_TO_DATE(date_enter, '%d/%m/%Y' ), count(patient_id) as numpatients
from patient
where STR_TO_DATE(date_enter, '%d/%m/%Y' ) between STR_TO_DATE('$repeat', '%d/%m/%Y' ) and
STR_TO_DATE('$to', '%d/%m/%Y') and
patient_type = 'opd'
group by STR_TO_DATE(date_enter, '%d/%m/%Y' )
order by STR_TO_DATE(date_enter, '%d/%m/%Y' );
By the way, don't you think the query looks ugly with all those calls to STR_TO_DATE()? They also make the query less efficient. Store dates in the database using the database's date data types. That is what they are there for.
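To illustrate the last point, here is a rough sketch of converting the dd/mm/yyyy strings once, at load time, so that the query itself needs no conversion calls. It uses Python with an in-memory SQLite database and invented rows. SQLite has no dedicated date type, so ISO yyyy-mm-dd text stands in for what would be a DATE column in MySQL.

```python
import sqlite3
from datetime import datetime

# Hypothetical legacy rows: (patient_id, patient_type, date_enter as dd/mm/yyyy text)
legacy_rows = [
    (1, "opd", "15/03/2014"),
    (2, "opd", "02/11/2013"),
    (3, "opd", "15/03/2014"),
    (4, "ipd", "01/01/2014"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (patient_id INTEGER, patient_type TEXT, date_enter TEXT)")

# Parse each string once and store it in ISO form, which sorts chronologically.
for patient_id, patient_type, raw_date in legacy_rows:
    iso_date = datetime.strptime(raw_date, "%d/%m/%Y").date().isoformat()
    conn.execute("INSERT INTO patient VALUES (?, ?, ?)", (patient_id, patient_type, iso_date))

# Filtering, grouping and ordering now work directly on the stored value.
counts = conn.execute(
    """
    SELECT date_enter, COUNT(patient_id) AS numpatients
    FROM patient
    WHERE patient_type = 'opd' AND date_enter BETWEEN ? AND ?
    GROUP BY date_enter
    ORDER BY date_enter
    """,
    ("2013-01-01", "2014-12-31"),
).fetchall()
print(counts)
```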