sql query to count twitter comments by month in 2016 - mysql

I want to list the number of Tweets made in each month of 2016. I am new to SQL but have tried different ways to do this. Below is my latest attempt. I keep getting a message that I am not using datelogged properly. I am also not sure how to format Total and Tweet_Cnt.
The date format in the Twitter data is as follows: MESSAGE_POSTED_TIME: 2015-08-06 21:48:34. For reference: column name = MESSAGE_POSTED_TIME; table name = DTrumpCampaign_Tweets
Select
Year(DATELOGGED),
Sum(Case When Month(DATELOGGED) = 1 Then 1 Else 0 End) Jan,
Sum(Case When Month(DATELOGGED) = 2 Then 1 Else 0 End) Feb,
Sum(Case When Month(DATELOGGED) = 3 Then 1 Else 0 End) Mar
From
DTrumpCampaign_Tweets
Group By
Year(DATELOGGED);
I would like my output to look like this:
Month(2016) Tweet_Cnt
Jan 25
Feb 100
Mar 200
total 325
I greatly appreciate your help.
Thanks.

Assuming that you want data for 2016 only:
SELECT
(CASE WHEN t.`month` IS NULL THEN 'total' ELSE t.monthName END) AS 'Month(2016)',
t.Tweet_Cnt
FROM
(
SELECT
MONTHNAME(DATELOGGED) AS monthName,
YEAR (DATELOGGED) `year`,
MONTH (DATELOGGED) `month`,
COUNT(*) Tweet_Cnt
FROM DTrumpCampaign_Tweets
WHERE YEAR (DATELOGGED) = '2016'
GROUP BY `year`,`month` WITH ROLLUP
LIMIT 13
) t;
Demo with some sample data
You will get an output structure like below:
| Month(2016) | Tweet_Cnt |
|-------------|-----------|
| January | 1 |
| February | 2 |
| March | 1 |
| April | 1 |
| May | 1 |
| June | 1 |
| July | 1 |
| August | 1 |
| September | 1 |
| October | 1 |
| November | 1 |
| December | 1 |
| total | 13 |
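To make the shape of this result concrete, here is a minimal Python sketch of what the grouped query with its rollup row computes: one count per month, followed by a final total. The sample timestamps and the helper name monthly_counts are made up for illustration; only the timestamp format comes from the question.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows in the MESSAGE_POSTED_TIME format from the question.
posted_times = [
    "2015-08-06 21:48:34",  # outside 2016, so it is filtered out
    "2016-01-03 09:15:00",
    "2016-02-11 12:00:00",
    "2016-02-20 18:30:10",
    "2016-03-01 07:45:59",
]

MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def monthly_counts(timestamps, year):
    """Return [(month_abbrev, count), ..., ('total', n)] for one year."""
    parsed = [datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for ts in timestamps]
    in_year = [d for d in parsed if d.year == year]      # WHERE YEAR(...) = year
    per_month = Counter(d.month for d in in_year)        # GROUP BY month
    rows = [(MONTH_ABBR[m - 1], per_month[m]) for m in sorted(per_month)]
    rows.append(("total", len(in_year)))                 # the rollup row
    return rows


report = monthly_counts(posted_times, 2016)
for label, count in report:
    print(f"{label:<6}{count}")
```

Like the SQL, this only lists months that actually have rows; months with no Tweets do not appear.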
More:
If you want only the first three letters of each month name, change the corresponding line in the query above:
Change this line: SELECT MONTHNAME(DATELOGGED) AS monthName,
To this: SELECT DATE_FORMAT(DATELOGGED, '%b') AS monthName,
Demo of this modified query

Here is a solution, with a sqlfiddle demo:
http://sqlfiddle.com/#!9/786acb/5
SELECT
COALESCE(`Month(2016)`, 'Total') AS `Month(2016)`, Tweet_cnt
FROM
(
Select
DATE_FORMAT(datelogged, '%b') AS `Month(2016)`,
COUNT(*) AS Tweet_cnt
From dtrumpcampaign_tweets
WHERE YEAR(datelogged) = '2016'
Group BY `Month(2016)` WITH ROLLUP
) t;
The output is like:
+-------------+-----------+
| Month(2016) | Tweet_cnt |
+-------------+-----------+
| Feb | 1 |
| Jan | 2 |
| Mar | 3 |
| Total | 6 |
+-------------+-----------+
4 rows in set (0.00 sec)

Related

How to sum non-consecutive values when using group by in MySQL

I have a data set representing alarms' state at a given timestamp (every 15 minutes). When the value is 1 the alarm is ON, 0 when OFF. I am trying to count the number of times the alarm has been triggered per hour (non-consecutive 1).
I took a look at Count max number of consecutive occurrences of a value in SQL Server but couldn't manage to adapt the answer.
Basically the data set for one alarm looks like this:
| id | value | registered_at |
| -- | ---------|---------------------|
| 1 | 1 | 2012-07-15 06:00 |
| 2 | 0 | 2012-07-15 06:15 |
| 3 | 1 | 2012-07-15 06:30 |
| 4 | 0 | 2012-07-15 06:45 |
| 5 | 1 | 2012-07-15 07:00 |
| 6 | 1 | 2012-07-15 07:15 |
| 7 | 1 | 2012-07-15 07:30 |
| 8 | 0 | 2012-07-15 07:45 |
| 9 | 0 | 2012-07-15 08:00 |
The results I am looking for is the following
| registered_at | alarm_triggered |
|--------------------|-----------------|
| 2012-07-15 06 | 2 |
| 2012-07-15 07 | 1 |
| 2012-07-15 08 | 0 |
To create groups I use EXTRACT(DAY_HOUR from registered_at).
Can you help me create the query?
(First time poster on SO, any feedback about the form of this post would be greatly appreciated as well)
Use the LAG() window function to read value from the previous row within the same hour. If it differs from the current row and the current row is 1, count it:
SELECT registered_at,
SUM(value * flag) alarm_triggered
FROM (
SELECT value,
DATE_FORMAT(registered_at, '%Y-%m-%d %H') registered_at,
value <> LAG(value, 1, 0) OVER (PARTITION BY DATE_FORMAT(registered_at, '%m-%d-%Y %H') ORDER BY registered_at) flag
FROM tablename
) t
GROUP BY registered_at
See the demo.
Results:
| registered_at | alarm_triggered |
|---------------|-----------------|
| 2012-07-15 06 | 2 |
| 2012-07-15 07 | 1 |
| 2012-07-15 08 | 0 |
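The counting rule behind the LAG() query can be sketched in plain Python: within each hour, a reading counts as a trigger when it is 1 and the previous reading in that hour was 0, with the first reading of each hour compared against a default of 0. The function name triggers_per_hour is hypothetical; the readings are the sample rows from the question.

```python
# Sample readings from the question: (registered_at, value).
readings = [
    ("2012-07-15 06:00", 1),
    ("2012-07-15 06:15", 0),
    ("2012-07-15 06:30", 1),
    ("2012-07-15 06:45", 0),
    ("2012-07-15 07:00", 1),
    ("2012-07-15 07:15", 1),
    ("2012-07-15 07:30", 1),
    ("2012-07-15 07:45", 0),
    ("2012-07-15 08:00", 0),
]


def triggers_per_hour(rows):
    """Count 0 -> 1 transitions per hour.

    The first reading of each hour is compared against 0, which mirrors
    LAG(value, 1, 0) with the window partitioned by hour.
    """
    counts = {}
    previous = {}
    for ts, value in sorted(rows):       # ORDER BY registered_at
        hour = ts[:13]                   # 'YYYY-MM-DD HH'
        prev = previous.get(hour, 0)     # default 0 at the start of a partition
        counts.setdefault(hour, 0)       # keep hours with no trigger in the result
        if value == 1 and prev == 0:
            counts[hour] += 1
        previous[hour] = value
    return counts


result = triggers_per_hour(readings)
for hour, n in result.items():
    print(hour, n)
```

Because the comparison restarts every hour, an alarm that stays on across an hour boundary is counted again in the new hour. Whether that is desirable depends on how a trigger is defined.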
I assume the registered_at field is a DATETIME, so you need to use a datetime function.
Here is a query for this. It sums every reading of 1 within the hour, so consecutive readings are each counted:
SELECT DATE_FORMAT(registered_at, "%Y-%m-%d %H:00:00") AS registered_at, SUM(VALUE) AS alarm_triggered
FROM ALARMS
GROUP BY DATE_FORMAT(registered_at, "%Y-%m-%d %H:00:00")
An SQL Fiddle example accompanies this answer.
If you need only the hours in which the alarm was on:
select count(value), date_format(registered_at, '%m-%d-%Y %H') as c_at
from notifications
where value = 1
group by date_format(registered_at, '%m-%d-%Y %H');
Or all hours:
select sum(value), date_format(registered_at, '%m-%d-%Y %H') as c_at
from notifications
group by date_format(registered_at, '%m-%d-%Y %H');
Try this!
You can select it like this:
SELECT CONCAT(YEAR(registered_at), '-', MONTH(registered_at), '-', DAYOFMONTH(registered_at), ' ', HOUR(registered_at)), count(*)
FROM alarms
WHERE value = 1
GROUP BY YEAR(registered_at), MONTH(registered_at), DAYOFMONTH(registered_at), HOUR(registered_at);
Explanation
First, we find the records whose value is 1, then group them by year, month, day of month and hour and finally we find out their count.

Group Rows in group by though it contains NULL value in mysql / postgres

I have a table from where I am getting month names and some quantity measures.
Table Name = Month_Name
SELECT month_name,q1,q2 FROM month_name;
mysql> SELECT * FROM MONTH;
+------------+------+------+
| month_name | q1 | q2 |
+------------+------+------+
| January | 10 | 20 |
| March | 30 | 40 |
| March | 10 | 5 |
+------------+------+------+
Expected Output:
mysql> SELECT month_name ,SUM(q1),SUM(q2) FROM MONTH GROUP BY month_name;
+------------+---------+---------+
| month_name | sum(q1) | sum(q2) |
+------------+---------+---------+
| January | 10 | 20 |
| February | 0 | 0 |
| March | 40 | 45 |
| April | 0 | 0 |
+------------+---------+---------+
GROUP BY month_name will not print February and April, since these two months are not present in the base table. I do not want to use UNION ALL because of performance concerns. Is there a more optimised approach?
You can use a calendar table which keeps track of all the month names which you want to appear in your report.
SELECT
m1.month_name,
COALESCE(SUM(m2.q1), 0) AS q1_sum, -- 0 instead of NULL for months with no rows
COALESCE(SUM(m2.q2), 0) AS q2_sum
FROM
(
SELECT 'January' AS month_name UNION ALL
SELECT 'February' UNION ALL
SELECT 'March' UNION ALL
...
SELECT 'December'
) m1
LEFT JOIN month m2
ON m1.month_name = m2.month_name
GROUP BY
m1.month_name;
Note that while this solves your immediate problem, it is still not ideal, because there is no easy way to sort the months. A much better table design would maintain a date column, from which the month name is easily derived.
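As a runnable illustration of the calendar-table idea, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The month_no column is a hypothetical addition that gives the report a sortable key, and only four months are listed to match the expected output in the question. Months without rows come back as 0 because the sums are wrapped in COALESCE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE month (month_name TEXT, q1 INTEGER, q2 INTEGER);
    INSERT INTO month VALUES ('January', 10, 20), ('March', 30, 40), ('March', 10, 5);
""")

# Inline calendar table; month_no is an assumed extra column used for sorting.
query = """
    SELECT m1.month_name,
           COALESCE(SUM(m2.q1), 0) AS q1_sum,
           COALESCE(SUM(m2.q2), 0) AS q2_sum
    FROM (
        SELECT 1 AS month_no, 'January' AS month_name UNION ALL
        SELECT 2, 'February' UNION ALL
        SELECT 3, 'March' UNION ALL
        SELECT 4, 'April'
    ) m1
    LEFT JOIN month m2
           ON m1.month_name = m2.month_name
    GROUP BY m1.month_no, m1.month_name
    ORDER BY m1.month_no
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same statement should carry over to MySQL largely unchanged, although that has not been verified here.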

Is there a way to count occurrence of values of multiple column in SQL?

Let's say I have a table storing survey results, and the schema looks something like this:
id | q1 | ..... | q30 | created_at
created_at is a timestamp column and all the others are integer fields.
Now I want to get the survey results by month. To do that for one question, I have:
SELECT YEAR(created_at) as year, MONTH(created_at) as month, q1, count(*) as occurrence
FROM survey_table
GROUP BY YEAR(created_at), MONTH(created_at), q1
The return will be something like:
year | month| q1 | occurrence
2016 | 11 | 1 | 10
2016 | 11 | 2 | 15
2016 | 11 | 3 | 2
2016 | 10 | 1 | 12
2016 | 10 | 2 | 2
2016 | 10 | 3 | 50
The data will be passed to my PHP script for further calculation and finally some data-display.
To do the calculation on 30 columns, one way is to run this query 30 times, once per question. I am wondering if there is a way to do that in a single query so that the output will be something like this:
year | month| q1_1 | q1_2 | q1_3 | q2_1 | q2_2 | q2_3 | ... | q30_1 | q30_2 | q30_3
2016 | 11 | 10 | 15 | 2 | 2 | 20 | 5 | ... | 5 | 15 | 7
2016 | 10 | 12 | 2 | 50 | 25 | 27 | 12 | ... | 20 | 24 | 20
Is there a way to do this in one query? If yes, does it perform better?
This is how your query would look:
select
year(created_at) as year,
month(created_at) as month,
sum(q1 = 1) as q1_1,
sum(q1 = 2) as q1_2,
sum(q1 = 3) as q1_3,
sum(q2 = 1) as q2_1,
...
sum(q30 = 3) as q30_3
from survey_table
group by year(created_at), month(created_at);
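When pivoting on boolean expressions like this, the choice of aggregate matters. COUNT(expr) counts every row where expr is not NULL, so a comparison that evaluates to 0 is still counted, whereas SUM(expr) adds up the 1s and therefore counts only the matches. The sketch below shows the difference with Python's built-in sqlite3 module, used here as a stand-in because it also evaluates comparisons to 1 or 0. The four sample answers are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey_table (q1 INTEGER)")
# Hypothetical answers to q1: one '1', two '2's and one '3'.
conn.executemany("INSERT INTO survey_table VALUES (?)", [(1,), (2,), (2,), (3,)])

count_result, sum_result = conn.execute(
    "SELECT COUNT(q1 = 1), SUM(q1 = 1) FROM survey_table"
).fetchone()

print("COUNT(q1 = 1):", count_result)  # counts all non-NULL comparisons
print("SUM(q1 = 1):  ", sum_result)    # counts only rows where q1 is 1
```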
It seems, however, it would be much better to change your table design:
q_type | q_value |created_at
-------+---------+----------
1 | 1 | 2016-10-05
2 | 3 | 2016-10-05
3 | 1 | 2016-10-05
4 | 2 | 2016-10-05
...
30 | 1 | 2016-10-05
...
29 | 1 | 2016-10-08
30 | 2 | 2016-10-08
And your query would simply be:
select
year(created_at) as year,
month(created_at) as month,
q_type,
q_value,
count(*)
from survey_table
group by year(created_at), month(created_at), q_type, q_value;
You'd do the formatting, i.e. putting the data in a grid, in PHP. This is more flexible, as your query no longer has to know how many q types and how many q values exist.
Here is the UNION ALL query I mentioned in a comment to my other answer. You don't have to know what q values exist when writing the query, and due to UNION ALL it's just one query that gets executed (so as to avoid unnecessary round trips).
It is still not a particularly fast query, due to the current table design.
select year(created_at) as year, month(created_at) as month, 'Q1', q1, count(*) as cnt
from survey_table
group by year(created_at), month(created_at), q1
UNION ALL
select year(created_at) as year, month(created_at) as month, 'Q2', q2, count(*) as cnt
from survey_table
group by year(created_at), month(created_at), q2
UNION ALL
...
UNION ALL
select year(created_at) as year, month(created_at) as month, 'Q30', q30, count(*) as cnt
from survey_table
group by year(created_at), month(created_at), q30;

MySQL Group By Counting for Two fields

mysql> SELECT date_format(FROM_UNIXTIME(timestamp), '%M') AS month,
YEAR(FROM_UNIXTIME(timestamp)) as year, left(content_value, 12) AS status,
count(*) AS count FROM gg_groom_content
WHERE content_value LIKE '%created ofi%' OR
content_value LIKE '%ofi rejected%'
GROUP BY MONTH(from_unixtime(timestamp)), YEAR(from_unixtime(timestamp));
Result:
+-----------+------+--------------+-------+
| month | year | status | count |
+-----------+------+--------------+-------+
| January | 2014 | OFI Rejected | 861 |
| February | 2014 | Created OFI: | 777 |
| March | 2014 | Created OFI: | 537 |
| April | 2014 | OFI Rejected | 285 |
| May | 2014 | OFI Rejected | 198 |
| September | 2011 | (06:32:40 PM | 1 |
| November | 2013 | Created OFI: | 86 |
| December | 2013 | Created OFI: | 561 |
+-----------+------+--------------+-------+
8 rows in set (0.91 sec)
However, I am trying to get a count for each status in each month.
For example, May should have a total count of OFI Rejected and a total count of Created OFI. How can I accomplish this?
There may be a better solution, but you can try combining two separate queries with UNION ALL, as below:
SELECT date_format(FROM_UNIXTIME(timestamp), '%M') AS month,
YEAR(FROM_UNIXTIME(timestamp)) as year, left(content_value, 12) AS status,
count(*) AS count FROM gg_groom_content
WHERE content_value LIKE '%created ofi%'
GROUP BY MONTH(from_unixtime(timestamp)), YEAR(from_unixtime(timestamp))
UNION ALL
SELECT date_format(FROM_UNIXTIME(timestamp), '%M') AS month,
YEAR(FROM_UNIXTIME(timestamp)) as year, left(content_value, 12) AS status,
count(*) AS count FROM gg_groom_content
WHERE content_value LIKE '%ofi rejected%'
GROUP BY MONTH(from_unixtime(timestamp)), YEAR(from_unixtime(timestamp));
I think what you are looking for is conditional aggregation:
SELECT date_format(FROM_UNIXTIME(timestamp), '%M') AS month,
YEAR(FROM_UNIXTIME(timestamp)) as year,
sum(content_value LIKE '%created ofi%') as CreatedOFI,
sum(content_value LIKE '%ofi rejected%') as RejectedOFI
FROM gg_groom_content
WHERE content_value LIKE '%created ofi%' OR
content_value LIKE '%ofi rejected%'
GROUP BY MONTH(from_unixtime(timestamp)), YEAR(from_unixtime(timestamp))
ORDER BY MIN(timestamp);
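A small runnable sketch of conditional aggregation follows, using Python's built-in sqlite3 module as a stand-in for MySQL. SQLite has no FROM_UNIXTIME, so strftime with the 'unixepoch' modifier is swapped in to derive the year and month. The sample rows and the helper ts are invented for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def ts(year, month, day):
    """Hypothetical helper: Unix timestamp for midnight UTC on a given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gg_groom_content (timestamp INTEGER, content_value TEXT)")
conn.executemany(
    "INSERT INTO gg_groom_content VALUES (?, ?)",
    [
        (ts(2014, 4, 3), "OFI Rejected: sample d"),
        (ts(2014, 5, 2), "Created OFI: sample a"),
        (ts(2014, 5, 9), "OFI Rejected: sample b"),
        (ts(2014, 5, 20), "OFI Rejected: sample c"),
    ],
)

# Each LIKE evaluates to 1 or 0 per row, so SUM yields one count per status.
rows = conn.execute("""
    SELECT strftime('%Y-%m', timestamp, 'unixepoch') AS ym,
           SUM(content_value LIKE '%created ofi%')  AS CreatedOFI,
           SUM(content_value LIKE '%ofi rejected%') AS RejectedOFI
    FROM gg_groom_content
    GROUP BY ym
    ORDER BY ym
""").fetchall()

for row in rows:
    print(row)
```

Each month comes back as a single row with both counts side by side, which is the layout the question asks for.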

count data from multiple year and group by month

I want to make a chart. Here's my table:
mrp
+------------+
| date |
+------------+
| 2011-10-xx |
| 2011-12-xx |
| 2012-01-xx |
| 2012-05-xx |
| 2013-01-xx |
| 2013-02-xx |
+------------+
I want to count the data from the last 3 years, grouped into three-month periods. Here's what I'm trying to achieve:
+--------+--------+--------+--------+
| quarty | 2011 | 2012 | 2013 |
+--------+--------+--------+--------+
|jan-mar | 0 | 1 | 2 |
|apr-jun | 0 | 1 | 0 |
|jul-sept| 0 | 0 | 0 |
|oct-dec | 2 | 0 | 0 |
+--------+--------+--------+--------+
I tried this;
select case when month(date) between 1 and 3 then 'Jan-Mar'
when month(date) between 4 and 6 then 'Apr-Jun'
when month(date) between 7 and 9 then 'Jul-Sept'
else 'Oct-Dec' end 'quarty',
SUM(year(date) = 2011) AS `2011`,
SUM(year(date) = 2012) AS `2012`,
SUM(year(date) = 2013) AS `2013`
from `mrp` where year(date) >= 2011
group by 'quarty'
but somehow it only shows 'Oct-Dec' for 2011, 2012, and 2013. Is there any way to make this work?
Note: I already found another query that shows all the periods, but only for one year, and I can't sort it correctly. It shows apr-jun first, then jan-mar, jul-sept, and oct-dec. How can I sort it correctly?
You can take advantage of the QUARTER function to do the work for you.
SELECT CASE WHEN QUARTER(`date`) = 1 THEN 'Jan-Mar'
WHEN QUARTER(`date`) = 2 THEN 'Apr-Jun'
WHEN QUARTER(`date`) = 3 THEN 'Jul-Sep'
WHEN QUARTER(`date`) = 4 THEN 'Oct-Dec'
END AS `Quarty`,
SUM(year(`date`) = 2011) AS `2011`,
SUM(year(`date`) = 2012) AS `2012`,
SUM(year(`date`) = 2013) AS `2013`
FROM `mrp`
WHERE year(`date`) >= 2011
GROUP BY QUARTER(`date`)
ORDER BY QUARTER(`date`);
Live Sample
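The quarter-by-year pivot can also be sketched in plain Python, which makes the arithmetic explicit: the quarter number is (month - 1) // 3 + 1, the same value QUARTER() returns. The dates below are the question's sample rows with the elided day ('xx') assumed to be the 1st, and the function name quarter_pivot is hypothetical.

```python
from collections import Counter
from datetime import date

# Sample dates from the question; the day is elided there, so day 1 is an assumption.
dates = [
    date(2011, 10, 1), date(2011, 12, 1),
    date(2012, 1, 1), date(2012, 5, 1),
    date(2013, 1, 1), date(2013, 2, 1),
]

LABELS = {1: "Jan-Mar", 2: "Apr-Jun", 3: "Jul-Sep", 4: "Oct-Dec"}
YEARS = (2011, 2012, 2013)


def quarter_pivot(ds):
    """Return one row per quarter: (label, count_2011, count_2012, count_2013)."""
    counts = Counter(((d.month - 1) // 3 + 1, d.year) for d in ds)
    # Iterating over the quarter numbers keeps the rows in calendar order.
    return [
        (LABELS[q],) + tuple(counts[(q, y)] for y in YEARS)
        for q in sorted(LABELS)
    ]


pivot = quarter_pivot(dates)
for row in pivot:
    print(row)
```

Unlike a GROUP BY over the table, this sketch also emits quarters with no rows at all (Jul-Sep here), because it loops over the four labels rather than over the data.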
This covers only 2011, but you can add derived tables for 2012 and 2013 yourself in the same way:
select
case
when monthvalue=1 then 'Jan-Mar'
when monthvalue=2 then 'Apr-Jun'
when monthvalue=3 then 'Jul-Sep'
when monthvalue=4 then 'Oct-Dec'
end as period,
t2011.x as freq2011 from
(SELECT CEIL(MONTH(date)/3) as monthVALUE,count(*) as x
FROM mrp
where year(date)=2011
GROUP BY monthVALUE) t2011;