select two tables mysql without join - mysql

There are two tables, recharge and purchase.
select * from recharge;
+-----+------+--------+---------------------+
| idx | user | amount | created |
+-----+------+--------+---------------------+
| 1 | 3 | 10 | 2016-01-09 20:16:18 |
| 2 | 3 | 5 | 2016-01-09 20:16:45 |
+-----+------+--------+---------------------+
select * from purchase;
+-----+------+----------+---------------------+
| idx | user | resource | created |
+-----+------+----------+---------------------+
| 1 | 3 | 2 | 2016-01-09 20:55:30 |
| 2 | 3 | 1 | 2016-01-09 20:55:30 |
+-----+------+----------+---------------------+
I want to compute each user's balance, which is SUM(amount) - COUNT(purchase.idx) (in this case, 13).
So I tried
SELECT (SUM(`amount`)-COUNT(purchase.idx)) AS balance
FROM `recharge`, `purchase`
WHERE purchase.user = 3 AND recharge.user = 3
but it returned an error.

If you want an accurate count, then aggregate before doing arithmetic. For your particular case:
select ((select sum(r.amount) from recharge r where r.user = 3) -
(select count(*) from purchase p where p.user = 3)
)
To do this for multiple users, move the subqueries to the from clause or use union all and aggregation. The second is safer if a user might only be in one table:
select user, coalesce(sum(suma), 0) - coalesce(sum(countp), 0)
from ((select user, sum(amount) as suma, null as countp
from recharge
group by user
) union all
(select user, null, count(*)
from purchase
group by user
)
) rp
group by user
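To see why aggregating first matters, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database (an assumption made only so the example is runnable; the question targets MySQL) and compares the comma join with the two approaches from this answer. The variable names are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recharge (idx INTEGER, user INTEGER, amount INTEGER, created TEXT);
    CREATE TABLE purchase (idx INTEGER, user INTEGER, resource INTEGER, created TEXT);
    INSERT INTO recharge VALUES (1, 3, 10, '2016-01-09 20:16:18'),
                                (2, 3, 5,  '2016-01-09 20:16:45');
    INSERT INTO purchase VALUES (1, 3, 2, '2016-01-09 20:55:30'),
                                (2, 3, 1, '2016-01-09 20:55:30');
""")

# The comma join pairs every recharge row with every purchase row (2 x 2 = 4 rows),
# so both the sum and the count are inflated.
cross_join_balance = conn.execute("""
    SELECT SUM(amount) - COUNT(purchase.idx)
    FROM recharge, purchase
    WHERE purchase.user = 3 AND recharge.user = 3
""").fetchone()[0]

# Aggregating each table separately before subtracting gives the intended balance.
subquery_balance = conn.execute("""
    SELECT (SELECT SUM(r.amount) FROM recharge r WHERE r.user = 3)
         - (SELECT COUNT(*) FROM purchase p WHERE p.user = 3)
""").fetchone()[0]

# The UNION ALL variant for all users at once (SQLite does not accept the
# parentheses around each SELECT that the MySQL version uses).
per_user = conn.execute("""
    SELECT user, COALESCE(SUM(suma), 0) - COALESCE(SUM(countp), 0) AS balance
    FROM (SELECT user, SUM(amount) AS suma, NULL AS countp
          FROM recharge GROUP BY user
          UNION ALL
          SELECT user, NULL, COUNT(*)
          FROM purchase GROUP BY user) AS rp
    GROUP BY user
""").fetchall()

print(cross_join_balance, subquery_balance, per_user)
```

With the sample data the joined version multiplies the rows before aggregating, while both aggregate-first versions yield the balance of 13 that the question expects.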

It is also possible to use a union, like this:
SELECT SUM(`amount`-aidx) AS balance
FROM(
SELECT SUM(`amount`) as amount, 0 as aidx
from `recharge` where recharge.user = 3
union
select 0 as amount, COUNT(purchase.idx) as aidx
from `purchase`
WHERE purchase.user = 3 )a

Related

First Unique Sql row

I have a MySql table of users order and it has columns such as:
user_id | timestamp | is_order_Parent | Status |
1 | 10-02-2020 | N | C |
2 | 11-02-2010 | Y | D |
3 | 11-02-2020 | N | C |
1 | 12-02-2010 | N | C |
1 | 15-02-2020 | N | C |
2 | 15-02-2010 | N | C |
I want to count the number of new customers per day, where a new customer is one who places a non-parent order whose status is C. Once a user has been counted on one day, they must not be counted again on other days.
An ideal resulted table will be:
Timestamp: Day | Distinct values of User ID
10-02-2020 | 1
11-02-2010 | 1
12-02-2010 | 0 <--- already counted user_id = 1 above, so no need to count it here
15-02-2010 | 1
table name is cscart_orders
If you are running MySQL 8.0, you can do this with window functions and aggregation:
select timestamp, sum(timestamp = timestamp0) new_users
from (
select
t.*,
min(case when is_order_parent = 'N' and status = 'C' then timestamp end) over(partition by user_id) timestamp0
from mytable t
) t
group by timestamp
The window min() computes the timestamp when each user became a "new user". Then, the outer query aggregates by date, and counts how many new users were found on that date.
A nice thing about this approach is that it does not require enumerating the dates separately.
You can use two levels of aggregation:
select first_timestamp, count(*)
from (select t.user_id, min(timestamp) as first_timestamp
from t
where is_order_parent = 'N' and status = 'C'
group by t.user_id
) t
group by first_timestamp;
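A minimal sketch of the two-level aggregation, run against in-memory SQLite rather than MySQL (an assumption for runnability). The table name `orders`, the column name `order_day`, and the ISO-formatted dates are hypothetical stand-ins for the question's data, chosen so that MIN() on text sorts chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (user_id INTEGER, order_day TEXT,
                         is_order_parent TEXT, status TEXT);
    INSERT INTO orders VALUES
        (1, '2020-02-10', 'N', 'C'),
        (2, '2020-02-11', 'Y', 'D'),
        (3, '2020-02-11', 'N', 'C'),
        (1, '2020-02-12', 'N', 'C'),
        (1, '2020-02-15', 'N', 'C'),
        (2, '2020-02-15', 'N', 'C');
""")

# Inner level: the first qualifying day per user.
# Outer level: how many users had their first qualifying order on each day.
new_users_per_day = conn.execute("""
    SELECT first_day, COUNT(*) AS new_users
    FROM (SELECT user_id, MIN(order_day) AS first_day
          FROM orders
          WHERE is_order_parent = 'N' AND status = 'C'
          GROUP BY user_id) AS firsts
    GROUP BY first_day
    ORDER BY first_day
""").fetchall()

print(new_users_per_day)
```

Note that this form only returns days on which at least one user was new; a day such as 2020-02-12 in the sample, where the only order comes from an already counted user, does not appear with a zero.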

Can't get appropriate values from query?

I want to list companyIds together with the most frequently occurring commentable type (0, 1, 2).
This is the subquery:
select a.companyId, a.commentable, count(1) _count
from article a
group by a.companyId, a.commentable
| companyId | commentable | _count |
|-----------|-------------|--------|
| 1 | 0 | 1 |
| 1 | 1 | 1 |
| 2 | 0 | 7759 |
| 2 | 1 | 7586 |
| 2 | 2 | 7856 |
| 3 | 0 | 7828 |
| 3 | 1 | 7866 |
| 3 | 2 | 7706 |
| 4 | 0 | 7851 |
| 4 | 1 | 7901 |
| 4 | 2 | 7738 |
| 5 | 0 | 7775 |
| 5 | 1 | 7884 |
| 5 | 2 | 7602 |
| 25 | 0 | 7888 |
| 25 | 1 | 7939 |
| 25 | 2 | 7784 |
In the example above, the highest count for companyId=4 is 7901, and the commentable type for that count is 1. With the query below I see 4-0-7901, but I expected 4-1-7901.
SELECT x.companyId, x.commentable, MAX(x._count) _count
FROM
( SELECT a.companyId, a.commentable, COUNT(1) _count
FROM article a
GROUP BY a.companyId, a.commentable
) AS X
GROUP BY x.companyId;
companyId commentable _count
1 0 1
2 0 7856
3 0 7866
4 0 7901
5 0 7884
25 0 7939
Expected result
companyId commentable _count
1 0 1
2 2 7856
3 1 7866
4 1 7901
5 1 7884
25 1 7939
I don't understand why the commentable column is always 0.
You need a big ugly join here. In the query below, treat the GROUP BY query on company and comment type as the base unit of work. This query appears as itself, aliased as t1. In alias t2, we wrap the same query and aggregate only by companyId, to find the max count for each company. We then join this back to t1 to keep only the comment type(s) having the max count for each company.
SELECT
t1.companyId,
t1.commentable,
t1.cnt
FROM
(
SELECT companyId, commentable, COUNT(*) cnt
FROM article
GROUP BY companyId, commentable
) t1
INNER JOIN
(
SELECT companyId, MAX(cnt) max_cnt
FROM
(
SELECT companyId, commentable, COUNT(*) cnt
FROM article
GROUP BY companyId, commentable
) t
GROUP BY companyId
) t2
ON t1.companyId = t2.companyId AND t1.cnt = t2.max_cnt;
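The join-back pattern can be tried on a small data set. The sketch below uses in-memory SQLite instead of MySQL (an assumption for runnability), a handful of made-up article rows, and a CTE named `counts` (hypothetical) so the grouped query is written only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (companyId INTEGER, commentable INTEGER)")
# Hypothetical rows: company 4 has type 1 most often, company 2 has type 2 most often.
conn.executemany(
    "INSERT INTO article VALUES (?, ?)",
    [(4, 0), (4, 0), (4, 1), (4, 1), (4, 1), (4, 2), (2, 0), (2, 2), (2, 2)],
)

top_types = conn.execute("""
    WITH counts AS (
        SELECT companyId, commentable, COUNT(*) AS cnt
        FROM article
        GROUP BY companyId, commentable
    )
    SELECT c.companyId, c.commentable, c.cnt
    FROM counts AS c
    JOIN (SELECT companyId, MAX(cnt) AS max_cnt
          FROM counts
          GROUP BY companyId) AS m
      ON m.companyId = c.companyId AND m.max_cnt = c.cnt
    ORDER BY c.companyId
""").fetchall()

print(top_types)
```

Because the join matches on the count itself, a company whose top two types tie would produce two rows.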
By the way, things get somewhat nicer in MySQL 8+, where we can take advantage of analytic functions:
WITH cte AS (
SELECT companyId, commentable, COUNT(*) cnt,
ROW_NUMBER() OVER (PARTITION BY companyId ORDER BY COUNT(*) DESC) rn
FROM article
GROUP BY companyId, commentable
)
SELECT companyId, commentable, cnt
FROM cte
WHERE rn = 1;
You can do this using a having clause:
SELECT a.companyId, a.commentable, COUNT(*) as _count
FROM article a
GROUP BY a.companyId, a.commentable
HAVING COUNT(*) = (SELECT COUNT(*)
FROM article a2
WHERE a2.companyId = a.companyId
GROUP BY a2.commentable
ORDER BY COUNT(*) DESC
LIMIT 1
);
In the event of ties, you will get multiple rows. If you want only one row per company, you can instead use commentable for the comparison in the HAVING:
SELECT a.companyId, a.commentable, COUNT(*) as _count
FROM article a
GROUP BY a.companyId, a.commentable
HAVING a.commentable = (SELECT a2.commentable
FROM article a2
WHERE a2.companyId = a.companyId
GROUP BY a2.commentable
ORDER BY COUNT(*) DESC
LIMIT 1
);
As others have mentioned, your problem is the mis-use of GROUP BY. The unaggregated columns in the SELECT need to match the GROUP BY keys -- and vice versa.
This happens because commentable is not one of the GROUP BY columns. In this case, with ONLY_FULL_GROUP_BY disabled, MySQL is free to choose any one value for this column.
From the MySQL documentation:
If ONLY_FULL_GROUP_BY is disabled, a MySQL extension to the standard SQL use of GROUP BY permits the select list, HAVING condition, or ORDER BY list to refer to nonaggregated columns even if the columns are not functionally dependent on GROUP BY columns. This causes MySQL to accept the preceding query. In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic, which is probably not what you want.

MySQL query for distinct rows on count

I have a query that returns the bestselling items across shops. It works fine at the moment, but now I want only one product from each shop, i.e. a distinct si.shop_id with only the single bestselling product per shop.
SELECT `si`.`id`, si.shop_id,
(SELECT COUNT(*)
FROM `transaction_item` AS `tis`
JOIN `transaction` as `t`
ON `t`.`id` = `tis`.`transaction_id`
WHERE `tis`.`shop_item_id` = `si`.`id`
AND `t`.`added_date` >= '2014-02-26 00:00:00')
AS `count`
FROM `shop_item` AS `si`
INNER JOIN `transaction_item` AS `ti`
ON ti.shop_item_id = si.id
GROUP BY `si`.`id`
ORDER BY `count` DESC LIMIT 7
and that gives me a result like:
+--------+---------+-------+
| id | shop_id | count |
+--------+---------+-------+
| 425030 | 38027 | 111 |
| 291974 | 5368 | 20 |
| 425033 | 38027 | 18 |
| 291975 | 5368 | 12 |
| 142776 | 5368 | 10 |
| 397016 | 38027 | 9 |
| 291881 | 5368 | 8 |
+--------+---------+-------+
any ideas?
EDIT
so I created a fiddle for it
http://sqlfiddle.com/#!2/cfc4c/1
Now the query returns the best-selling products. I want it to return only one product per shop, so the result for the fiddle should be
+----+---------+-------+
| ID | SHOP_ID | COUNT |
+----+---------+-------+
| 1 | 222 | 3 |
| 4 | 333 | 2 |
| 8 | 555 | 1 |
| 9 | 777 | 1 |
+----+---------+-------+
Possibly something like this:-
SELECT si.shop_id,
SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(':', si.id, sub1.item_count) ORDER BY sub1.item_count DESC), ',', 1) AS `count`
FROM shop_item AS si
INNER JOIN
(
SELECT tis.shop_item_id, COUNT(*) AS item_count
FROM transaction_item AS tis
JOIN `transaction` as t
ON t.id = tis.transaction_id
AND t.added_date >= '2014-02-26 00:00:00'
GROUP BY tis.shop_item_id
) sub1
ON sub1.shop_item_id = si.id
GROUP BY si.shop_id
ORDER BY `count` DESC LIMIT 7
The subquery gets the count of transactions for each shop item. The main query then concatenates the item id and the item count, group-concatenates all of those for a single shop (ordered by the count descending), and uses SUBSTRING_INDEX to grab the first one (i.e., everything before the first comma).
You will have to split up the count field to get the item id and count separately (the separator is a : ).
This is taking a few guesses about what you really want, and with no table declares or data it isn't tested.
EDIT - now tested with the SQL fiddle example:-
SELECT SUBSTRING_INDEX(`count`, ':', 1) AS ID,
shop_id,
SUBSTRING_INDEX(`count`, ':', -1) AS `count`
FROM
(
SELECT si.shop_id,
SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(':', si.id, sub1.item_count) ORDER BY sub1.item_count DESC), ',', 1) AS `count`
FROM shop_item AS si
INNER JOIN transaction_item AS ti
ON ti.shop_item_id = si.id
INNER JOIN
(
SELECT tis.shop_item_id, COUNT(*) AS item_count
FROM transaction_item AS tis
JOIN `transaction` as t
ON t.id = tis.transaction_id
AND t.added_date >= '2014-02-26 00:00:00'
GROUP BY tis.shop_item_id
) sub1
ON sub1.shop_item_id = si.id
GROUP BY si.shop_id
) sub2
ORDER BY `count` DESC LIMIT 7;
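As an alternative to the GROUP_CONCAT string trick (a different technique, not the one used in the answer above), the top item per shop can also be picked with a correlated subquery. This is only a sketch under assumptions: it runs on in-memory SQLite rather than MySQL, the rows are made up, the date filter on transaction is left out for brevity, and ties within a shop are broken by the lowest item id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shop_item (id INTEGER, shop_id INTEGER);
    CREATE TABLE transaction_item (shop_item_id INTEGER);
    INSERT INTO shop_item VALUES (1, 222), (2, 222), (4, 333), (5, 333), (8, 555);
    INSERT INTO transaction_item VALUES (1), (1), (1), (2), (4), (4), (5), (8);
""")

best_per_shop = conn.execute("""
    WITH counts AS (
        -- number of sold units per shop item
        SELECT si.id, si.shop_id, COUNT(*) AS cnt
        FROM shop_item AS si
        JOIN transaction_item AS ti ON ti.shop_item_id = si.id
        GROUP BY si.id, si.shop_id
    )
    SELECT c.id, c.shop_id, c.cnt
    FROM counts AS c
    WHERE c.id = (SELECT c2.id
                  FROM counts AS c2
                  WHERE c2.shop_id = c.shop_id
                  ORDER BY c2.cnt DESC, c2.id ASC
                  LIMIT 1)
    ORDER BY c.cnt DESC, c.id
    LIMIT 7
""").fetchall()

print(best_per_shop)
```

This keeps id and count as separate numeric columns, so the final ORDER BY sorts numerically and nothing has to be split back out of a string.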

SUM a pair of COUNTs from two tables based on a time variable

Been searching for an answer to this for the better part of an hour without much luck. I have two regional tables laid out with the same column names and I can put out a result list for either table based on the following query (swap Table2 for Table1):
SELECT Table1.YEAR, FORMAT(COUNT(Table1.id),0) AS Total
FROM Table1
WHERE Table1.variable='Y'
GROUP BY Table1.YEAR
Ideally I'd like to get a result that gives me a total sum of the counts by year, so instead of:
| REGION 1 | | REGION 2 |
| YEAR | Total | | YEAR | Total |
| 2010 | 5 | | 2010 | 1 |
| 2009 | 2 | | 2009 | 3 |
| | | | 2008 | 4 |
I'd have:
| MERGED |
| YEAR | Total |
| 2010 | 6 |
| 2009 | 5 |
| 2008 | 4 |
I've tried a variety of JOINs and other ideas but I think I'm caught up on the SUM and COUNT issue. Any help would be appreciated, thanks!
SELECT `YEAR`, FORMAT(SUM(`count`), 0) AS `Total`
FROM (
SELECT `Table1`.`YEAR`, COUNT(*) AS `count`
FROM `Table1`
WHERE `Table1`.`variable` = 'Y'
GROUP BY `Table1`.`YEAR`
UNION ALL
SELECT `Table2`.`YEAR`, COUNT(*) AS `count`
FROM `Table2`
WHERE `Table2`.`variable` = 'Y'
GROUP BY `Table2`.`YEAR`
) AS `union`
GROUP BY `YEAR`
You should use a UNION ALL:
SELECT
t.YEAR,
COUNT(*) as TOTAL
FROM (
SELECT *
FROM Table1
UNION ALL
SELECT *
FROM Table2
) t
WHERE t.variable='Y'
GROUP BY t.YEAR;
Select year, sum(counts) from (
SELECT Table1.YEAR, FORMAT(COUNT(Table1.id),0) AS Total
FROM Table1
WHERE Table1.variable='Y'
GROUP BY Table1.YEAR
UNION ALL
SELECT Table2.YEAR, FORMAT(COUNT(Table2.id),0) AS Total
FROM Table2
WHERE Table2.variable='Y'
GROUP BY Table2.YEAR ) GROUP BY year
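To improve upon Shehzad's answer (alias the per-table counts as counts, apply FORMAT only to the outer sum, and alias the derived table):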
SELECT YEAR, FORMAT(SUM(counts),0) AS total FROM (
SELECT Table1.YEAR, COUNT(Table1.id) AS counts
FROM Table1
WHERE Table1.variable='Y'
GROUP BY Table1.YEAR
UNION ALL
SELECT Table2.YEAR, COUNT(Table2.id) AS counts
FROM Table2
WHERE Table2.variable='Y'
GROUP BY Table2.YEAR ) AS newTable GROUP BY YEAR
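A quick way to check the count-then-sum pattern is to rebuild the question's two regions in an in-memory SQLite database (an assumption for runnability; SQLite has no MySQL-style FORMAT, so the totals are left as plain integers). The individual rows, including the extra 'N' rows, are made up to reproduce the per-year totals shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER PRIMARY KEY, YEAR INTEGER, variable TEXT);
    CREATE TABLE Table2 (id INTEGER PRIMARY KEY, YEAR INTEGER, variable TEXT);
""")

# Region 1: 2010 -> 5, 2009 -> 2; Region 2: 2010 -> 1, 2009 -> 3, 2008 -> 4.
region1 = [(2010, "Y")] * 5 + [(2009, "Y")] * 2 + [(2009, "N")]
region2 = [(2010, "Y")] + [(2009, "Y")] * 3 + [(2008, "Y")] * 4 + [(2008, "N")]
conn.executemany("INSERT INTO Table1 (YEAR, variable) VALUES (?, ?)", region1)
conn.executemany("INSERT INTO Table2 (YEAR, variable) VALUES (?, ?)", region2)

# Count per year in each table, stack the two result sets, then sum per year.
merged_totals = conn.execute("""
    SELECT YEAR, SUM(counts) AS total
    FROM (SELECT YEAR, COUNT(id) AS counts
          FROM Table1 WHERE variable = 'Y' GROUP BY YEAR
          UNION ALL
          SELECT YEAR, COUNT(id) AS counts
          FROM Table2 WHERE variable = 'Y' GROUP BY YEAR) AS merged
    GROUP BY YEAR
    ORDER BY YEAR DESC
""").fetchall()

print(merged_totals)
```

UNION ALL matters here: plain UNION would drop one of two identical (year, count) rows if both regions happened to have the same count for a year.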

How to include dates with zero messages into the resultset anyway?

I have the following table with messages:
+---------+---------+------------+----------+
| msg_id | user_id | m_date | m_time |
+---------+---------+------------+----------+
| 1 | 1 | 2011-01-22 | 06:23:11 |
| 2 | 1 | 2011-01-23 | 16:17:03 |
| 3 | 1 | 2011-01-23 | 17:05:45 |
| 4 | 2 | 2011-01-22 | 23:58:13 |
| 5 | 2 | 2011-01-23 | 23:59:32 |
| 6 | 2 | 2011-01-24 | 21:02:41 |
| 7 | 3 | 2011-01-22 | 13:45:00 |
| 8 | 3 | 2011-01-23 | 13:22:34 |
| 9 | 3 | 2011-01-23 | 18:22:34 |
| 10 | 3 | 2011-01-24 | 02:22:22 |
| 11 | 3 | 2011-01-24 | 13:12:00 |
+---------+---------+------------+----------+
What I want is for each day, to see how many messages each user has sent BEFORE and AFTER 16:00:
SELECT
user_id,
m_date,
SUM(m_time <= '16:00') AS before16,
SUM(m_time > '16:00') AS after16
FROM messages
GROUP BY user_id, m_date
ORDER BY user_id, m_date ASC
This produces:
user_id m_date before16 after16
-------------------------------------
1 2011-01-22 1 0
1 2011-01-23 0 2
2 2011-01-22 0 1
2 2011-01-23 0 1
2 2011-01-24 0 1
3 2011-01-22 1 0
3 2011-01-23 1 1
3 2011-01-24 2 0
Because user 1 has written no messages on 2011-01-24, this date is not in the resultset. However, this is undesirable. I have a second table in my database, called "date_range":
+---------+------------+
| date_id | d_date |
+---------+------------+
| 1 | 2011-01-21 |
| 1 | 2011-01-22 |
| 1 | 2011-01-23 |
| 1 | 2011-01-24 |
+---------+------------+
I want to check the "messages" against this table. For each user, all these dates have to be in the resultset. As you can see, none of the users have written messages on 2011-01-21, and as said, user 1 has no messages on 2011-01-24. The desired output of the query would be:
user_id d_date before16 after16
-------------------------------------
1 2011-01-21 0 0
1 2011-01-22 1 0
1 2011-01-23 0 2
1 2011-01-24 0 0
2 2011-01-21 0 0
2 2011-01-22 0 1
2 2011-01-23 0 1
2 2011-01-24 0 1
3 2011-01-21 0 0
3 2011-01-22 1 0
3 2011-01-23 1 1
3 2011-01-24 2 0
How can I link the two tables so that the query result also holds rows with zero values for before16 and after16?
Edit: yes, I have a "users" table:
+---------+------------+
| user_id | user_date |
+---------+------------+
| 1 | foo |
| 2 | bar |
| 3 | foobar |
+---------+------------+
Test bed:
create table messages (msg_id integer, user_id integer, _date date, _time time);
create table date_range (date_id integer, _date date);
insert into messages values
(1,1,'2011-01-22','06:23:11'),
(2,1,'2011-01-23','16:17:03'),
(3,1,'2011-01-23','17:05:05');
insert into date_range values
(1, '2011-01-21'),
(1, '2011-01-22'),
(1, '2011-01-23'),
(1, '2011-01-24');
Query:
SELECT p._date, p.user_id,
coalesce(m.before16, 0) b16, coalesce(m.after16, 0) a16
FROM
(SELECT DISTINCT user_id, dr._date FROM messages m, date_range dr) p
LEFT JOIN
(SELECT user_id, _date,
SUM(_time <= '16:00') AS before16,
SUM(_time > '16:00') AS after16
FROM messages
GROUP BY user_id, _date
ORDER BY user_id, _date ASC) m
ON p.user_id = m.user_id AND p._date = m._date;
EDIT:
Your initial query is left as is; I hope it doesn't require any explanation;
SELECT DISTINCT user_id, dr._date FROM messages m, date_range dr returns a Cartesian product (CROSS JOIN) of the two tables, which gives the full required date range for each user in question. As I'm interested in each pair only once, I use the DISTINCT clause. Try this query with and without it;
Then I use LEFT JOIN on two sub-selects.
This join means: first, INNER join is performed, i.e. all rows with matching fields in the ON condition are returned. Then, for each row in the left-side relation of the join that has no matches on the right side, return NULLs (thus the name, LEFT JOIN, i.e. left relation is always there and right is expected to have NULLs). This join will do what you expect — return user_id + date combinations even if there were no messages in the given date for a given user. Note that I use user_id + date sub-select first (on the left) and messages query second (on the right);
coalesce() is used to replace NULL with zero.
I hope this clarifies how this query works.
Give this a shot:
select u.user_id, u._date,
coalesce(sum(_time <= '16:00'), 0) as before16,
coalesce(sum(_time > '16:00'), 0) as after16
from (
select m.user_id, d._date
from messages m
cross join date_range d
group by m.user_id, d._date
) u
left join messages m on u.user_id=m.user_id
and u._date=m._date
group by u.user_id, u._date
The inner query just builds the set of all possible/desired user-date pairs. It would be more efficient to use a users table, but you didn't mention that you had one, so I won't assume. Otherwise, you just need the left join so that the non-joined records are not removed.
EDIT
--More detailed explanation: taking the query apart.
Start with the innermost query; the goal is to get a list of all desired dates for every user. Since there's a table of users and a table of dates it can look like this:
select distinct u.user_id, d.d_date
from users u
cross join date_range d
The key here is the cross join, taking every row in the users table and associating it with every row in the date_range table. The distinct keyword is really just a shorthand for a group by on all columns, and is here just in case there's duplicated data.
Note that there are several other methods of getting this same result set (like in my original query), but this is probably the simplest from both a logical and computational standpoint.
Really, the only other steps are to add the left join (associating all of the rows we got above to all available data, and not removing anything that doesn't have any data) and the group by and select components which are basically the same as you had before. So, putting everything together it looks like this:
select t.user_id, t.d_date,
coalesce(sum(m.m_time <= '16:00'), 0) as before16,
coalesce(sum(m.m_time > '16:00'), 0) as after16
from (
select distinct u.user_id, d.d_date
from users u
cross join date_range d
) t
left join messages m on t.user_id = m.user_id
and t.d_date = m.m_date
group by t.user_id, t.d_date
Based on some other comments/questions, note the explicit use of prefixes for all uses of all tables and sub-queries (which is pretty straight forward since we're not using any table more than once anymore): u for the users table, d for the date_range table, t for the sub-query containing the dates to use for each user, and m for the message table. This is probably where my first explanation fell a little short, since I used the message table twice, both times with the same prefix. It works there because of the context of both uses (one was in a sub-query), but it probably isn't the best practice.
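The cross join plus left join can be verified against the question's sample data. The sketch below uses in-memory SQLite instead of MySQL (an assumption for runnability), stores times as zero-padded text so string comparison orders them correctly, and wraps each SUM in COALESCE so that user-date pairs without any messages come back as 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER);
    CREATE TABLE date_range (d_date TEXT);
    CREATE TABLE messages (msg_id INTEGER, user_id INTEGER, m_date TEXT, m_time TEXT);
    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO date_range VALUES
        ('2011-01-21'), ('2011-01-22'), ('2011-01-23'), ('2011-01-24');
    INSERT INTO messages VALUES
        (1, 1, '2011-01-22', '06:23:11'), (2, 1, '2011-01-23', '16:17:03'),
        (3, 1, '2011-01-23', '17:05:45'), (4, 2, '2011-01-22', '23:58:13'),
        (5, 2, '2011-01-23', '23:59:32'), (6, 2, '2011-01-24', '21:02:41'),
        (7, 3, '2011-01-22', '13:45:00'), (8, 3, '2011-01-23', '13:22:34'),
        (9, 3, '2011-01-23', '18:22:34'), (10, 3, '2011-01-24', '02:22:22'),
        (11, 3, '2011-01-24', '13:12:00');
""")

rows = conn.execute("""
    SELECT t.user_id, t.d_date,
           COALESCE(SUM(m.m_time <= '16:00'), 0) AS before16,
           COALESCE(SUM(m.m_time >  '16:00'), 0) AS after16
    FROM (SELECT DISTINCT u.user_id, d.d_date
          FROM users AS u
          CROSS JOIN date_range AS d) AS t
    LEFT JOIN messages AS m
           ON t.user_id = m.user_id AND t.d_date = m.m_date
    GROUP BY t.user_id, t.d_date
    ORDER BY t.user_id, t.d_date
""").fetchall()

# Index the result by (user_id, date) for easy lookup.
by_pair = {(user_id, day): (b16, a16) for user_id, day, b16, a16 in rows}
print(len(rows), by_pair[(1, "2011-01-24")], by_pair[(3, "2011-01-24")])
```

Every user appears once for each of the four dates, including 2011-01-21 on which nobody wrote a message.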
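It is not neat, but if you have a users table, then maybe something like this. The zero rows built from users and date_range are summed with the real counts, so every user/date pair appears exactly once:
SELECT
user_id,
_date,
SUM(before16) AS before16,
SUM(after16) AS after16
FROM (
SELECT
user_id,
_date,
SUM(_time <= '16:00') AS before16,
SUM(_time > '16:00') AS after16
FROM messages
GROUP BY user_id, _date
UNION ALL
SELECT
users.user_id,
date_range._date,
0 AS before16,
0 AS after16
FROM
users,
date_range
) t
GROUP BY user_id, _date
ORDER BY user_id, _date ASC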
chezy525's solution works great. I ported it to PostgreSQL and removed/renamed some aliases:
select users_and_dates.user_id, users_and_dates._date,
SUM(case when _time <= '16:00' then 1 else 0 end) as before16,
SUM(case when _time > '16:00' then 1 else 0 end) as after16
from (
select messages.user_id, date_range._date
from messages
cross join date_range
group by messages.user_id, date_range._date
) users_and_dates
left join messages on users_and_dates.user_id=messages.user_id
and users_and_dates._date=messages._date
group by users_and_dates.user_id, users_and_dates._date;
I ran it on my machine and it worked perfectly.