Count references to own ID in MySQL with Grouping - mysql

I'm running a query on posts in a database. If a post is a response to another post, it has a parent_id greater than zero; otherwise its parent_id is zero.
To generate a report on posts, I use the following SELECT and group by user.
SELECT
user,
SUM(IF(parent_id = 0, 1, 0)) as 'NewPosts',
SUM(IF(parent_id > 0, 1, 0)) as 'Responses',
COUNT(parent_id) as 'TotalPosts'
FROM posts
GROUP BY user
Now I need to add a column that shows self-responses by a user. Something like...
SUM(IF(parent_id IN id, 1, 0)) as 'SelfResponses'
Of course I know that is wrong, but I hope it gets the idea across.
Edit: Data would look like :
User Id parent_id
Henry 12 0
Henry 24 12
Henry 32 16
Joseph 16 0
So in this case the output would be:
User NewPosts Responses TotalPosts SelfResponses
Henry 1 2 3 1
Joseph 1 0 1 0

Assuming that a response's parent_id holds the id of the post it replies to, you can achieve this by joining the table to itself:
Query 1:
SELECT
a.user,
SUM(IF(a.parent_id = 0, 1, 0)) as 'NewPosts',
SUM(IF(a.parent_id > 0, 1,0)) as 'Responses',
COUNT(a.parent_id) as 'TotalPosts',
SUM(IF(a.user = b.user, 1, 0)) as 'SelfResponses'
FROM
Table1 a
LEFT JOIN
Table1 b
ON
a.parent_id = b.id
GROUP BY
a.user
Results:
| USER | NEWPOSTS | RESPONSES | TOTALPOSTS | SELFRESPONSES |
--------------------------------------------------------------
| Henry | 1 | 2 | 3 | 1 |
| Joseph | 1 | 0 | 1 | 0 |
Hope this helps
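A minimal sketch of the self-join above, run against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so IF() is replaced by summing comparisons directly, and the table is named posts as in the question; the names SAMPLE_POSTS, REPORT_SQL and self_response_report are only for illustration.

```python
import sqlite3

# Sample rows from the question: (user, id, parent_id).
SAMPLE_POSTS = [
    ("Henry", 12, 0),
    ("Henry", 24, 12),
    ("Henry", 32, 16),
    ("Joseph", 16, 0),
]

# Alias a is the post itself, alias b is its parent (all NULL for top-level posts).
# COALESCE turns the NULL comparison for parentless posts into 0.
REPORT_SQL = """
SELECT a.user,
       SUM(a.parent_id = 0) AS NewPosts,
       SUM(a.parent_id > 0) AS Responses,
       COUNT(*) AS TotalPosts,
       SUM(COALESCE(a.user = b.user, 0)) AS SelfResponses
FROM posts a
LEFT JOIN posts b ON a.parent_id = b.id
GROUP BY a.user
ORDER BY a.user
"""


def self_response_report(rows):
    """Load the rows into an in-memory table and return one tuple per user."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE posts (user TEXT, id INTEGER PRIMARY KEY, parent_id INTEGER)"
        )
        conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)
        return conn.execute(REPORT_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in self_response_report(SAMPLE_POSTS):
        print(row)
```

The LEFT JOIN matters here: an inner join would drop every post whose parent_id is 0, and the NewPosts and TotalPosts columns would come out too low.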

Related

Retrieve data from a complex table

When searching a date range (a start and an end value for date_reg),
the MySQL result should contain, for each main_table row, the latest return, received, and balance values of each product.
E.g.: between 10-01-2014 and 10-05-2014, the query should retrieve the values of each product within that range:
Client Id | Return | Received | Balance
| prod 1 prod 2 | prod 1 prod 2 | prod 1 prod 2
--------------------------------------------------------------
1 | 2 [3] 2 [7] | 5 5 | 8 5
2 | 1 [5] 0 [8] | 5 5 | 9 3
3 | 0 [6] 1 [10]| 5 5 | 7 6
[id], where id is the primary key of sub_table
I have tried this MySQL query:
SELECT p.product_name, ipd.id as ipd_id, i.id as i_id, ipd.*, i.*
FROM main_table i
LEFT JOIN sub_table ipd ON ipd.main_table_id=i.id AND ipd.product_id IN (1,2)
LEFT JOIN product p ON ipd.product_id=p.id
WHERE ipd.date_reg IN (SELECT MAX(ipd1.date_reg)
FROM sub_table ipd1
WHERE ipd1.main_table_id=i.id AND
date_reg BETWEEN '10-01-2014' AND '10-05-2014')
ORDER BY i.id ASC LIMIT 0, 20
It only returns the return, received, and balance values of a single product for each client.
The subquery in WHERE ipd.date_reg IN (SELECT MAX(...)) returns a single date, so with your data you only get the one entry for 10-04-2014. The query is working correctly.
Try using GROUP BY in the subquery.
Also, GROUP_CONCAT(expr) helps with many-to-many data: it concatenates the column values of a group into a single string.
I got the output. Thanks, everyone, for the help.
I used GROUP_CONCAT to concatenate the results into a single comma-separated string:
SELECT p.product_name, ipd.id as ipd_id, i.id as i_id, ipd.*, i.*,
GROUP_CONCAT(product_id SEPARATOR ',') as group_product_id,
GROUP_CONCAT(ipd.return SEPARATOR ',') as group_return,
GROUP_CONCAT(ipd.received SEPARATOR ',') as group_received,
GROUP_CONCAT(ipd.balance SEPARATOR ',') as group_balance
FROM main_table i
LEFT JOIN sub_table ipd ON ipd.main_table_id=i.id AND ipd.product_id IN (1,2)
LEFT JOIN product p ON ipd.product_id=p.id
WHERE ipd.date_reg IN (SELECT MAX(ipd1.date_reg)
FROM sub_table ipd1
WHERE ipd1.main_table_id=i.id AND
date_reg BETWEEN '10-01-2014' AND '10-05-2014'
GROUP BY ipd1.product_id)
GROUP BY i.id
ORDER BY i.id ASC LIMIT 0, 20
The Result
Client Id | group_product_id | group_return | group_received | group_balance
--------------------------------------------------------------------------
1 | 1, 2 | 2, 2 | 5,5 | 8,5
2 | 1, 2 | 1, 0 | 5,5 | 9,3
3 | 1, 2 | 0, 1 | 5,5 | 7,6
Then the strings can be exploded into an array.
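As a rough sketch of that last step, the comma-separated strings of one result row can be split and zipped back into one record per product. The function name explode_row and the dictionary keys are hypothetical. The sketch assumes all four strings list the products in the same order, which MySQL only guarantees if each GROUP_CONCAT call has its own ORDER BY, and that no value was NULL, since GROUP_CONCAT skips NULLs.

```python
def explode_row(product_ids, returns, received, balances, sep=","):
    """Split the GROUP_CONCAT strings of one row into one dict per product."""
    columns = [
        [part.strip() for part in text.split(sep)]
        for text in (product_ids, returns, received, balances)
    ]
    # All lists must have the same length, otherwise the values cannot be paired.
    if len({len(column) for column in columns}) != 1:
        raise ValueError("concatenated columns have different lengths")
    return [
        {
            "product_id": int(pid),
            "return": int(ret),
            "received": int(rec),
            "balance": int(bal),
        }
        for pid, ret, rec, bal in zip(*columns)
    ]


if __name__ == "__main__":
    # Values of client 1 from the result table above.
    for product in explode_row("1, 2", "2, 2", "5,5", "8,5"):
        print(product)
```

Long groups can also be cut off by MySQL's group_concat_max_len setting, so the length check is worth keeping.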

MySQL count per item

I'm currently trying to make a mysql query that will count the number of zeros and ones per item, in the following way:
Table:
ID | PollID | Value
------------------------------------
1 | 1 | 1
2 | 1 | 1
3 | 2 | 0
4 | 2 | 1
5 | 1 | 0
And the result I want is:
Poll | one | zero
----------------------------------
1 | 2 | 1
2 | 1 | 1
Thanks for the help!
This is the shortest possible answer in MySQL because it supports boolean arithmetic.
SELECT PollID,
SUM(value = 1) AS `One`,
SUM(value = 0) AS `Zero`
FROM tableName
GROUP BY PollID
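A small runnable sketch of the same idea, using Python's built-in sqlite3 module as a stand-in for MySQL (SQLite also evaluates a comparison to 1 or 0, so it can be summed directly). The table name poll_votes and the function name count_votes are placeholders for illustration.

```python
import sqlite3

# Rows from the question: (ID, PollID, Value).
SAMPLE_VOTES = [
    (1, 1, 1),
    (2, 1, 1),
    (3, 2, 0),
    (4, 2, 1),
    (5, 1, 0),
]


def count_votes(rows):
    """Return (poll_id, ones, zeros) per poll, ordered by poll_id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE poll_votes (id INTEGER, poll_id INTEGER, value INTEGER)"
        )
        conn.executemany("INSERT INTO poll_votes VALUES (?, ?, ?)", rows)
        # Each comparison yields 1 or 0 per row, so SUM counts the matching rows.
        return conn.execute(
            """
            SELECT poll_id,
                   SUM(value = 1) AS one,
                   SUM(value = 0) AS zero
            FROM poll_votes
            GROUP BY poll_id
            ORDER BY poll_id
            """
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(count_votes(SAMPLE_VOTES))
```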
select z.pollid,z.ones,s.zeros
from (select a.pollid,count(a.value) as ones from test a
where a.value=1
group by a.pollid) z
left join
(select b.pollid,count(b.value) as zeros from test b
where b.value=0 group by b.pollid) s
on z.pollid=s.pollid;
Try this:
select pollid,
sum(case when value = 1 then 1 else 0 end) as one,
sum(case when value = 0 then 1 else 0 end) as zero
from tableName
group by pollid

MYSQL Inner join with select statement

Right now I have the following query:
SELECT
rm.reward_name,
rm.rewardid,
rc.reward_code,
rc.status,
rc.rewardid,
rc.add_date,
rc.status
from rewards_codes as rc
INNER JOIN reward_mast as rm on rc.rewardid = rm.rewardid
where DATE(rc.add_date) between '2012-03-16' AND '2013-03-16';
I want to fetch the total number of codes, the number of available codes, and the number of used codes.
I use the status field in rewards_codes to differentiate the code status:
0 - Available to use
1 - Used code
So my final output should be like following:
-----------------------------------------------------------
Reward Name Total Codes Available code Used code
my_reward 100 40 60
extra_reward 100 90 10
-----------------------------------------------------------
[Update]
Here is some sample data from both table...
reward_mast
rewardid reward_name
1 my_reward
2 extra_reward
3 test_reward
rewards_codes
codeId rewardid reward_code add_date status
1 1 aka454 2012-11-21 0
2 2 ala499 2012-04-21 0
3 1 pao789 2012-08-21 0
4 3 zlk753 2012-01-21 0
5 2 qra954 2012-05-21 0
Try this:
SELECT
rm.rewardid,
rm.reward_name,
IFNULL(COUNT(rc.reward_code), 0) AS 'Total Codes',
IFNULL(SUM(rc.status = 0), 0) AS 'Available code',
IFNULL(SUM(rc.status = 1), 0) AS 'Used Codes'
FROM reward_mast as rm
LEFT JOIN rewards_codes as rc on rc.rewardid = rm.rewardid
WHERE DATE(rc.add_date) between '2012-03-16' AND '2013-03-16'
GROUP BY rm.reward_name,
rm.rewardid;
This gives you the count for each category individually: total codes, available codes, and used codes.
For the sample data, the result is:
| REWARDID | REWARD_NAME | TOTAL CODES | AVAILABLE CODE | USED CODES |
-----------------------------------------------------------------------
| 1 | my_reward | 2 | 2 | 0 |
| 2 | extra_reward | 2 | 2 | 0 |
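test_reward is missing from that result because the date filter sits in the WHERE clause, which discards the rows of rewards that have no code in the range. The sketch below, which uses Python's sqlite3 module as a stand-in for MySQL, runs the query both ways on the sample data; moving the filter into the ON clause is a suggested variation, not part of the answer above, and the function name rewards_report is hypothetical.

```python
import sqlite3

REWARDS = [(1, "my_reward"), (2, "extra_reward"), (3, "test_reward")]
CODES = [
    (1, 1, "aka454", "2012-11-21", 0),
    (2, 2, "ala499", "2012-04-21", 0),
    (3, 1, "pao789", "2012-08-21", 0),
    (4, 3, "zlk753", "2012-01-21", 0),
    (5, 2, "qra954", "2012-05-21", 0),
]

SELECT_PART = """
SELECT rm.rewardid, rm.reward_name,
       COUNT(rc.reward_code),
       IFNULL(SUM(rc.status = 0), 0),
       IFNULL(SUM(rc.status = 1), 0)
FROM reward_mast rm
LEFT JOIN rewards_codes rc ON rc.rewardid = rm.rewardid
"""
DATE_FILTER = "DATE(rc.add_date) BETWEEN ? AND ?"
TAIL = "GROUP BY rm.rewardid, rm.reward_name ORDER BY rm.rewardid"


def rewards_report(start, end, filter_in_on):
    """Count codes per reward; filter_in_on keeps rewards with no codes in range."""
    # In ON the filter only limits which codes are joined;
    # in WHERE it removes whole rows after the join.
    keyword = "AND" if filter_in_on else "WHERE"
    sql = f"{SELECT_PART} {keyword} {DATE_FILTER} {TAIL}"
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE reward_mast (rewardid INTEGER, reward_name TEXT)")
        conn.execute(
            "CREATE TABLE rewards_codes (codeid INTEGER, rewardid INTEGER, "
            "reward_code TEXT, add_date TEXT, status INTEGER)"
        )
        conn.executemany("INSERT INTO reward_mast VALUES (?, ?)", REWARDS)
        conn.executemany("INSERT INTO rewards_codes VALUES (?, ?, ?, ?, ?)", CODES)
        return conn.execute(sql, (start, end)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(rewards_report("2012-03-16", "2013-03-16", filter_in_on=False))
    print(rewards_report("2012-03-16", "2013-03-16", filter_in_on=True))
```

With the filter in ON, the IFNULL calls earn their place: the unmatched reward has only NULLs to sum, and IFNULL turns them into zeros.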

grouping resultset - mysql

I have the following sql which returns the total number of books grouped by status
select COUNT(BOOK_ID) AS book_num, BOOK_STATUS_FK from BOOKS group by BOOK_STATUS_FK;
+---------+------------------+
| book_num | BOOK_STATUS_FK |
+---------+------------------+
| 57 | 2 |
| 162 | 3 |
| 9736 | 4 |
| 104 | 5 |
| 29 | 22 |
| 1 | 23 |
| 5 | 25 |
| 14 | 54 |
+---------+------------------+
I would like to group the resultset into 2 rows only where one row represents the number of books with BOOK_STATUS_FK > 4 and the 2nd to represent the number of books with BOOK_STATUS_FK <= 4
Is there a way of doing that in sql?
Thanks for your suggestions.
The two-row solution Gordon Linoff suggests won't produce two rows when one of the counts is 0.
The following will give both counts in a single row:
select ifnull( sum( if( book_status_fk > 4, 1, 0 ) ), 0), ifnull( sum( if( book_status_fk <= 4, 1, 0 ) ), 0 )
from books
Edit: added ifnull's
This is an aggregation with a case statement:
select (case when book_status_fk > 4 then '>4' else '<=4' end) as grp, count(*)
from books
group by (case when book_status_fk > 4 then '>4' else '<=4' end)
If you always need two rows, even if count of a group is 0, you can use palindrom's solution or you can use this slightly modified version of Gordon Linoff's query:
select grp.g, count(BOOK_STATUS_FK)
from
(select '<=4' g union all select '>4') grp left join books
on grp.g = case when book_status_fk > 4 then '>4' else '<=4' end
group by grp.g
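To see that the modified query keeps both rows even when one group is empty, here is a small sketch that runs it with Python's sqlite3 module as a stand-in for MySQL. The function name count_by_status_group and the test inputs are made up for illustration.

```python
import sqlite3

# The two labels come from a derived table, so both survive the LEFT JOIN
# even if no book falls into one of them.
GROUPED_SQL = """
SELECT grp.g, COUNT(b.book_status_fk)
FROM (SELECT '<=4' AS g UNION ALL SELECT '>4') AS grp
LEFT JOIN books b
  ON grp.g = CASE WHEN b.book_status_fk > 4 THEN '>4' ELSE '<=4' END
GROUP BY grp.g
ORDER BY grp.g
"""


def count_by_status_group(statuses):
    """Return [('<=4', n), ('>4', m)] for a list of BOOK_STATUS_FK values."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE books (book_id INTEGER PRIMARY KEY, book_status_fk INTEGER)"
        )
        conn.executemany(
            "INSERT INTO books (book_status_fk) VALUES (?)",
            [(status,) for status in statuses],
        )
        return conn.execute(GROUPED_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(count_by_status_group([2, 3, 4, 4]))
    print(count_by_status_group([2, 5, 22]))
```

COUNT(b.book_status_fk) rather than COUNT(*) is what makes the empty group report 0: the unmatched label still produces one joined row, but its book columns are NULL and are not counted.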

How to include dates with zero messages into the resultset anyway?

I have the following table with messages:
+---------+---------+------------+----------+
| msg_id | user_id | m_date | m_time |
+---------+---------+------------+----------+
| 1 | 1 | 2011-01-22 | 06:23:11 |
| 2 | 1 | 2011-01-23 | 16:17:03 |
| 3 | 1 | 2011-01-23 | 17:05:45 |
| 4 | 2 | 2011-01-22 | 23:58:13 |
| 5 | 2 | 2011-01-23 | 23:59:32 |
| 6 | 2 | 2011-01-24 | 21:02:41 |
| 7 | 3 | 2011-01-22 | 13:45:00 |
| 8 | 3 | 2011-01-23 | 13:22:34 |
| 9 | 3 | 2011-01-23 | 18:22:34 |
| 10 | 3 | 2011-01-24 | 02:22:22 |
| 11 | 3 | 2011-01-24 | 13:12:00 |
+---------+---------+------------+----------+
What I want is for each day, to see how many messages each user has sent BEFORE and AFTER 16:00:
SELECT
user_id,
m_date,
SUM(m_time <= '16:00') AS before16,
SUM(m_time > '16:00') AS after16
FROM messages
GROUP BY user_id, m_date
ORDER BY user_id, m_date ASC
This produces:
user_id m_date before16 after16
-------------------------------------
1 2011-01-22 1 0
1 2011-01-23 0 2
2 2011-01-22 0 1
2 2011-01-23 0 1
2 2011-01-24 0 1
3 2011-01-22 1 0
3 2011-01-23 1 1
3 2011-01-24 2 0
Because user 1 has written no messages on 2011-01-24, this date is not in the resultset. However, this is undesirable. I have a second table in my database, called "date_range":
+---------+------------+
| date_id | d_date |
+---------+------------+
| 1 | 2011-01-21 |
| 1 | 2011-01-22 |
| 1 | 2011-01-23 |
| 1 | 2011-01-24 |
+---------+------------+
I want to check the "messages" against this table. For each user, all these dates have to be in the resultset. As you can see, none of the users have written messages on 2011-01-21, and as said, user 1 has no messages on 2011-01-24. The desired output of the query would be:
user_id d_date before16 after16
-------------------------------------
1 2011-01-21 0 0
1 2011-01-22 1 0
1 2011-01-23 0 2
1 2011-01-24 0 0
2 2011-01-21 0 0
2 2011-01-22 0 1
2 2011-01-23 0 1
2 2011-01-24 0 1
3 2011-01-21 0 0
3 2011-01-22 1 0
3 2011-01-23 1 1
3 2011-01-24 2 0
How can I link the two tables so that the query result also holds rows with zero values for before16 and after16?
Edit: yes, I have a "users" table:
+---------+------------+
| user_id | user_date |
+---------+------------+
| 1 | foo |
| 2 | bar |
| 3 | foobar |
+---------+------------+
Test bed:
create table messages (msg_id integer, user_id integer, _date date, _time time);
create table date_range (date_id integer, _date date);
insert into messages values
(1,1,'2011-01-22','06:23:11'),
(2,1,'2011-01-23','16:17:03'),
(3,1,'2011-01-23','17:05:05');
insert into date_range values
(1, '2011-01-21'),
(1, '2011-01-22'),
(1, '2011-01-23'),
(1, '2011-01-24');
Query:
SELECT p._date, p.user_id,
coalesce(m.before16, 0) b16, coalesce(m.after16, 0) a16
FROM
(SELECT DISTINCT user_id, dr._date FROM messages m, date_range dr) p
LEFT JOIN
(SELECT user_id, _date,
SUM(_time <= '16:00') AS before16,
SUM(_time > '16:00') AS after16
FROM messages
GROUP BY user_id, _date
ORDER BY user_id, _date ASC) m
ON p.user_id = m.user_id AND p._date = m._date;
EDIT:
Your initial query is left as is; I hope it doesn't require any explanation.
SELECT DISTINCT user_id, dr._date FROM messages m, date_range dr returns the Cartesian product (CROSS JOIN) of the two tables, which gives every date in the required range for each user in question. As I'm interested in each pair only once, I use the DISTINCT clause. Try this query with and without it.
Then I use a LEFT JOIN on the two sub-selects.
This join means: first, an INNER JOIN is performed, i.e. all rows with matching fields in the ON condition are returned. Then, for each row in the left-side relation that has no match on the right side, a row is returned with NULLs in the right-side columns (hence the name LEFT JOIN: the left relation is always present, and the right side may be NULL). This join does what you expect: it returns user_id + date combinations even if there were no messages on the given date for a given user. Note that I put the user_id + date sub-select first (on the left) and the messages query second (on the right).
coalesce() is used to replace NULL with zero.
I hope this clarifies how this query works.
Give this a shot:
select u.user_id, u._date,
coalesce(sum(_time <= '16:00'), 0) as before16,
coalesce(sum(_time > '16:00'), 0) as after16
from (
select m.user_id, d._date
from messages m
cross join date_range d
group by m.user_id, d._date
) u
left join messages m on u.user_id=m.user_id
and u._date=m._date
group by u.user_id, u._date
The inner query just builds the set of all desired user-date pairs. It would be more efficient to use a users table, but you didn't mention that you had one, so I won't assume. Otherwise, you just need the left join so that the rows without matching messages are not removed.
EDIT
--More detailed explanation: taking the query apart.
Start with the innermost query; the goal is to get a list of all desired dates for every user. Since there's a table of users and a table of dates it can look like this:
select distinct u.user_id, d.d_date
from users u
cross join date_range d
The key here is the cross join, taking every row in the users table and associating it with every row in the date_range table. The distinct keyword is really just a shorthand for a group by on all columns, and is here just in case there's duplicated data.
Note that there are several other methods of getting this same result set (like in my original query), but this is probably the simplest from both a logical and computational standpoint.
Really, the only other steps are to add the left join (associating all of the rows we got above to all available data, and not removing anything that doesn't have any data) and the group by and select components which are basically the same as you had before. So, putting everything together it looks like this:
select t.user_id, t.d_date,
coalesce(sum(m.m_time <= '16:00'), 0) as before16,
coalesce(sum(m.m_time > '16:00'), 0) as after16
from (
select distinct u.user_id, d.d_date
from users u
cross join date_range d
) t
left join messages m on t.user_id = m.user_id
and t.d_date = m.m_date
group by t.user_id, t.d_date
Based on some other comments and questions, note the explicit prefixes on every table and sub-query reference (which is straightforward now that no table is used more than once): u for the users table, d for the date_range table, t for the sub-query containing the dates to use for each user, and m for the messages table. This is probably where my first explanation fell a little short, since I used the messages table twice, both times with the same prefix. It works there because of the context of each use (one was inside a sub-query), but it probably isn't best practice.
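The cross join plus left join approach can be checked against the full sample data with a short script. This is a sketch using Python's sqlite3 module as a stand-in for MySQL, with a few assumptions: the users and date_range tables are reduced to the single column each that the query needs, the cutoff is written as '16:00:00' because SQLite compares times as text, and COALESCE is added because SUM over a user-date pair with no messages yields NULL rather than 0. The function name messages_per_day is hypothetical.

```python
import sqlite3

MESSAGES = [
    (1, 1, "2011-01-22", "06:23:11"), (2, 1, "2011-01-23", "16:17:03"),
    (3, 1, "2011-01-23", "17:05:45"), (4, 2, "2011-01-22", "23:58:13"),
    (5, 2, "2011-01-23", "23:59:32"), (6, 2, "2011-01-24", "21:02:41"),
    (7, 3, "2011-01-22", "13:45:00"), (8, 3, "2011-01-23", "13:22:34"),
    (9, 3, "2011-01-23", "18:22:34"), (10, 3, "2011-01-24", "02:22:22"),
    (11, 3, "2011-01-24", "13:12:00"),
]
DATES = ["2011-01-21", "2011-01-22", "2011-01-23", "2011-01-24"]
USER_IDS = [1, 2, 3]

# t holds every user-date pair; the LEFT JOIN attaches messages where they exist.
REPORT_SQL = """
SELECT t.user_id, t.d_date,
       COALESCE(SUM(m.m_time <= '16:00:00'), 0) AS before16,
       COALESCE(SUM(m.m_time > '16:00:00'), 0) AS after16
FROM (
    SELECT u.user_id, d.d_date
    FROM users u
    CROSS JOIN date_range d
) t
LEFT JOIN messages m ON t.user_id = m.user_id
                    AND t.d_date = m.m_date
GROUP BY t.user_id, t.d_date
ORDER BY t.user_id, t.d_date
"""


def messages_per_day(messages, dates, user_ids):
    """Return (user_id, date, before16, after16) for every user-date pair."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE messages "
            "(msg_id INTEGER, user_id INTEGER, m_date TEXT, m_time TEXT)"
        )
        conn.execute("CREATE TABLE date_range (d_date TEXT)")
        conn.execute("CREATE TABLE users (user_id INTEGER)")
        conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", messages)
        conn.executemany("INSERT INTO date_range VALUES (?)", [(d,) for d in dates])
        conn.executemany("INSERT INTO users VALUES (?)", [(u,) for u in user_ids])
        return conn.execute(REPORT_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in messages_per_day(MESSAGES, DATES, USER_IDS):
        print(row)
```

The result has one row per user and date, twelve in total for this data, matching the desired output in the question.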
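It is not neat, but if you have a users table, then maybe something like this. The zero rows built from users and date_range are combined with the real counts and summed, so each user and date appears only once:
SELECT user_id, _date, SUM(before16) AS before16, SUM(after16) AS after16
FROM (
SELECT
user_id,
_date,
SUM(_time <= '16:00') AS before16,
SUM(_time > '16:00') AS after16
FROM messages
GROUP BY user_id, _date
UNION ALL
SELECT
users.user_id,
date_range._date,
0 AS before16,
0 AS after16
FROM
users,
date_range
) AS counts
GROUP BY user_id, _date
ORDER BY user_id, _date ASC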
chezy525's solution works great. I ported it to PostgreSQL and removed or renamed some aliases:
select users_and_dates.user_id, users_and_dates._date,
SUM(case when _time <= '16:00' then 1 else 0 end) as before16,
SUM(case when _time > '16:00' then 1 else 0 end) as after16
from (
select messages.user_id, date_range._date
from messages
cross join date_range
group by messages.user_id, date_range._date
) users_and_dates
left join messages on users_and_dates.user_id=messages.user_id
and users_and_dates._date=messages._date
group by users_and_dates.user_id, users_and_dates._date;
I ran it on my machine and it worked perfectly.