I wanted to know if there's a way to join two or more result sets into one.
I have the following two queries.
First query:
SELECT
CONCAT(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)) as day_month_year,
db.country.country ,
count(concat(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on))) as count ,
COUNT(DISTINCT db.prod_id.email) AS MAIL
from db.prod_id
left join db.country on db.prod_id.branch_id = db.country.id
where db.prod_id.created_on > '2020-11-17' and (db.country.type = 1 or db.country.type = 2)
group by
concat(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)),
db.country.country
order by db.prod_id.created_on
The second query:
select
CONCAT(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)) as day_month_year,
db.country.country,
count(CONCAT(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on))) as count_BUY
from db.prod_id
left join db.prod_evaluations on db.prod_id.id = db.prod_evaluations.id
left join db.country on db.prod_id.branch_id = db.country.id
left join (Select prod_properties.prod_id, prod_properties.value From prod_properties Where prod_properties.property_id = 5) as db3 on db3.prod_id = db.prod_id.id
where db.prod_id.created_on > '2020-11-17'
and db3.value = 'online-buy' and db.prod_id.status_id <> 25
group by
concat(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)),
db.country.country
order by db.prod_id.created_on
The first query gives the following result:
+------------+---------+-------+------+
| day | Country | Count | Mail |
+------------+---------+-------+------+
| 17-11-2020 | IT | 200 | 100 |
| 17-11-2020 | US | 250 | 100 |
| 18-11-2020 | IT | 350 | 300 |
| 18-11-2020 | US | 200 | 100 |
+------------+---------+-------+------+
The second query gives:
+------------+---------+-----------+
| day | Country | Count_BUY |
+------------+---------+-----------+
| 17-11-2020 | IT | 50 |
| 17-11-2020 | US | 70 |
| 18-11-2020 | IT | 200 |
| 18-11-2020 | US | 50 |
+------------+---------+-----------+
Now I want to merge these two results into one:
+------------+---------+-------+------+-----------+
| day | Country | Count | Mail | Count_BUY |
+------------+---------+-------+------+-----------+
| 17-11-2020 | IT | 200 | 100 | 50 |
| 17-11-2020 | US | 250 | 100 | 70 |
| 18-11-2020 | IT | 350 | 300 | 200 |
| 18-11-2020 | US | 200 | 100 | 50 |
+------------+---------+-------+------+-----------+
How can I perform this query?
I'm using MySQL.
Thanks
The simple way: You can join queries.
select *
from ( <your first query here> ) first_query
join ( <your second query here> ) second_query using (day_month_year, country)
order by day_month_year, country;
This is an inner join. You can also outer join of course. MySQL doesn't support full outer joins, though. If you want that, you'll have to look up how to emulate a full outer join in MySQL.
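As a rough check of the join approach, here is a minimal sketch in Python using the standard sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained). The two result sets from the question are loaded into hypothetical tables named first_query and second_query, with the count column renamed to cnt, and then joined with USING.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables holding the two result sets shown in the question.
    CREATE TABLE first_query  (day_month_year TEXT, country TEXT, cnt INTEGER, mail INTEGER);
    CREATE TABLE second_query (day_month_year TEXT, country TEXT, count_buy INTEGER);
    INSERT INTO first_query VALUES
        ('17-11-2020', 'IT', 200, 100), ('17-11-2020', 'US', 250, 100),
        ('18-11-2020', 'IT', 350, 300), ('18-11-2020', 'US', 200, 100);
    INSERT INTO second_query VALUES
        ('17-11-2020', 'IT', 50),  ('17-11-2020', 'US', 70),
        ('18-11-2020', 'IT', 200), ('18-11-2020', 'US', 50);
""")

# Inner join on the two shared key columns; USING keeps one copy of each key.
merged = conn.execute("""
    SELECT *
    FROM first_query
    JOIN second_query USING (day_month_year, country)
    ORDER BY day_month_year, country
""").fetchall()

for row in merged:
    print(row)
```

In the real query the two table names would be replaced by the two subqueries, each with its own alias.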
The hard way ;-) Merge the queries.
If I am not mistaken, your two queries can be reduced to
select
date(created_on),
branch_id as country,
count(*) as count_products,
count(distinct p.email) as count_emails
from db.prod_id p
where created_on >= date '2020-11-17'
-- branch_id references country.id, the join key used in the original query
and branch_id in (select id from db.country where type in (1, 2))
group by date(created_on), branch_id
order by date(created_on), branch_id;
and
select
date(created_on),
branch_id as country,
count(*) as count_buy
from db.prod_id
where created_on >= date '2020-11-17'
and status_id <> 25
-- property 5 must have the value 'online-buy', as in the original query
and id in (select prod_id from prod_properties where property_id = 5 and value = 'online-buy')
group by date(created_on), branch_id
order by date(created_on), branch_id;
The two combined should be
select
date(created_on),
branch_id as country,
sum(branch_id in (select id from db.country where type in (1, 2))) as count_products,
count(distinct case when branch_id in (select id from db.country where type in (1, 2)) then p.email end) as count_emails,
sum(status_id <> 25 and id in (select prod_id from prod_properties where property_id = 5 and value = 'online-buy')) as count_buy
from db.prod_id p
where created_on >= date '2020-11-17'
group by date(created_on), branch_id
order by date(created_on), branch_id;
You see, the conditions the queries have in common remain in the where clause and the other conditions go inside the aggregation functions.
sum(boolean) is short for sum(case when boolean then 1 else 0 end), i.e. this counts the rows where the condition is met in MySQL.
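To illustrate the sum(boolean) shorthand, here is a small sketch in Python with sqlite3 (SQLite, like MySQL, evaluates a comparison to 0 or 1). The prod table and its rows are made up for the illustration and are not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: three rows on the 17th, two on the 18th.
    CREATE TABLE prod (created_on TEXT, branch_id INTEGER, status_id INTEGER);
    INSERT INTO prod VALUES
        ('2020-11-17', 1, 10), ('2020-11-17', 1, 25), ('2020-11-17', 1, 30),
        ('2020-11-18', 1, 25), ('2020-11-18', 1, 25);
""")

# The shared condition would stay in WHERE; the query-specific condition
# moves inside the aggregate. Both forms count rows with status_id <> 25.
rows = conn.execute("""
    SELECT created_on,
           COUNT(*)                                         AS all_rows,
           SUM(status_id <> 25)                             AS short_form,
           SUM(CASE WHEN status_id <> 25 THEN 1 ELSE 0 END) AS long_form
    FROM prod
    GROUP BY created_on
    ORDER BY created_on
""").fetchall()

for row in rows:
    print(row)
```

The short and long forms return the same number in every group, while COUNT(*) still counts all rows that pass the WHERE clause.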
I have a table records of store id, processing batch id and start time as follows:
|store_id | batch_id | process_start_time |
| A | 1 | 10 |
| B | 1 | 40 |
| C | 1 | 30 |
| A | 2 | 400 |
| B | 2 | 800 |
| C | 2 | 600 |
| A | 3 | 10 |
| B | 3 | 80 |
| C | 3 | 90 |
Here, rows need to be grouped by batch_id, and time_taken is the difference between the process_start_time of store C and that of store A.
So, the expected result would be:
batch_id | time_taken
1 | 20
2 | 200
3 | 80
I tried to do something like:
select batch_id, ((select process_start_time from records where store_id = 'C') - (select process_start_time from records where store_id = 'A')) as time_taken
from records group by batch_id;
But I couldn't figure out how to select specific rows within a particular group.
Thank you for looking into this!
Update: the process_start_time value is not necessarily the maximum for store C.
You seem to want conditional aggregation and arithmetic:
select batch_id,
(max(case when store_id = 'C' then process_start_time end) -
min(case when store_id = 'A' then process_start_time end)
) as diff
from records
group by batch_id;
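A quick way to check the conditional aggregation against the sample data is the following sketch, which uses Python's sqlite3 module in place of the asker's database (an assumption for the sake of a runnable example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (store_id TEXT, batch_id INTEGER, process_start_time INTEGER);
    INSERT INTO records VALUES
        ('A', 1, 10),  ('B', 1, 40),  ('C', 1, 30),
        ('A', 2, 400), ('B', 2, 800), ('C', 2, 600),
        ('A', 3, 10),  ('B', 3, 80),  ('C', 3, 90);
""")

# Each CASE yields NULL for the other stores, so MAX/MIN simply pick out
# the single C (or A) value inside every batch.
time_taken = conn.execute("""
    SELECT batch_id,
           MAX(CASE WHEN store_id = 'C' THEN process_start_time END) -
           MIN(CASE WHEN store_id = 'A' THEN process_start_time END) AS diff
    FROM records
    GROUP BY batch_id
    ORDER BY batch_id
""").fetchall()

print(time_taken)
```

Because each batch has exactly one row per store, it does not matter whether MAX or MIN is used; the aggregate is only there to collapse the group to one value.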
You can try a self join.
SELECT r1.batch_id,
r1.process_start_time - r2.process_start_time time_taken
FROM records r1
INNER JOIN records r2
ON r1.batch_id = r2.batch_id
WHERE r1.store_id = 'C'
AND r2.store_id = 'A';
Here's another answer. This is using two instances of the records table and we link them up with where clauses and exists as follows:
select a.batch_id,
c.process_start_time - a.process_start_time as time_taken
from records a,
records c
where a.store_id = 'A'
and c.store_id = 'C'
and exists (
select 1
from records x
where x.batch_id = a.batch_id
and x.batch_id = c.batch_id
);
SELECT DISTINCT
store_a.batch_id,
store_c.process_start_time - store_a.process_start_time AS 'time_taken'
FROM records store_a
INNER JOIN records store_c
ON store_a.batch_id = store_c.batch_id
AND store_c.store_id = 'C'
AND store_a.store_id = 'A'
I have two tables. One is a list of Orders, and one is a list of Events.
For each Order, I want to join the single last Event that happened (using clicked_at) before the created_at of the Order.
I have tried numerous ways to get this to work and tried several other answers on Stack Overflow but I am struggling to return the correct data.
The pseudo logic for the subquery in my mind is something like:
SELECT campaign, keyword, user_id, clicked_at
FROM `Events`
WHERE order.user_id = user_id AND clicked_at < order.created_at
ORDER BY clicked_at DESC
LIMIT 1
Please see the example data below:
# Orders
| order_id | user_id | created_at |
-----------------------------------
| 123 | abc | 2020-07-04 |
| 456 | abc | 2020-05-01 |
# Events
| campaign | keyword | user_id | clicked_at |
----------------------------------------------
| facebook | shoes | abc | 2020-07-03 |
| google | hair | abc | 2020-07-01 |
My desired result
# Orders with campaign attribution
| order_id | user_id | created_at | campaign | keyword |
---------------------------------------------------------
| 123 | abc | 2020-07-04 | facebook | shoes |
| 456 | abc | 2020-05-01 | null | null |
Thanks!
Alex
Below is for BigQuery Standard SQL
#standardSQL
SELECT a.*, campaign, keyword
FROM `project.dataset.orders` a
LEFT JOIN (
SELECT
ANY_VALUE(o).*,
ARRAY_AGG(STRUCT(campaign, keyword) ORDER BY clicked_at DESC LIMIT 1)[OFFSET(0)].*
FROM `project.dataset.orders` o
JOIN `project.dataset.events` e
ON o.user_id = e.user_id
AND clicked_at < created_at
GROUP BY FORMAT('%t', o)
)
USING(order_id)
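If applied to the sample data from your question, the result is:
| Row | order_id | user_id | created_at | campaign | keyword |
----------------------------------------------------------------
| 1 | 123 | abc | 2020-07-04 | facebook | shoes |
| 2 | 456 | abc | 2020-05-01 | null | null |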
with orders as (
select 123 as order_id, 'abc' as user_id, cast('2020-07-04' as date) as created_at union all
select 456, 'abc', '2020-05-01'
),
events as (
select 'facebook' as campaign, 'shoes' as keyword, 'abc' as user_id, cast('2020-07-03' as date) as clicked_at union all
select 'google', 'hair', 'abc', '2020-07-01'
),
logic as (
select
orders.order_id,
orders.user_id,
orders.created_at,
events.clicked_at,
events.campaign,
events.keyword,
row_number() over (partition by orders.order_id order by events.clicked_at desc) as rn
from orders
left join events
on orders.user_id = events.user_id and events.clicked_at < orders.created_at
)
select * except(rn)
from logic
where rn = 1
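The same left join plus row_number() idea can be tried outside BigQuery. The sketch below runs it in Python with sqlite3, assuming a SQLite build with window-function support (3.25 or newer) and storing the dates as ISO text so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, user_id TEXT, created_at TEXT);
    CREATE TABLE events (campaign TEXT, keyword TEXT, user_id TEXT, clicked_at TEXT);
    INSERT INTO orders VALUES (123, 'abc', '2020-07-04'), (456, 'abc', '2020-05-01');
    INSERT INTO events VALUES
        ('facebook', 'shoes', 'abc', '2020-07-03'),
        ('google',   'hair',  'abc', '2020-07-01');
""")

# Rank the earlier events of each order from newest to oldest and keep rank 1.
# The LEFT JOIN keeps orders without any earlier event (NULL campaign/keyword).
attributed = conn.execute("""
    WITH ranked AS (
        SELECT o.order_id, o.user_id, o.created_at, e.campaign, e.keyword,
               ROW_NUMBER() OVER (PARTITION BY o.order_id
                                  ORDER BY e.clicked_at DESC) AS rn
        FROM orders o
        LEFT JOIN events e
               ON e.user_id = o.user_id
              AND e.clicked_at < o.created_at
    )
    SELECT order_id, user_id, created_at, campaign, keyword
    FROM ranked
    WHERE rn = 1
    ORDER BY order_id
""").fetchall()

for row in attributed:
    print(row)
```

Order 456 has no earlier click, so it comes back with None in both attribution columns.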
Not sure on how to query this, but let's say I've got two tables as such
Table 1
| id | userid | points |
|:-----------|------------:|:------------:|
| 1 | 1 | 30
| 2 | 3 | 40
| 3 | 1 | 30
| 4 | 3 | 40
| 5 | 1 | 30
| 6 | 3 | 40
Table 2
| id | userid | productid |
|:-----------|------------:|:------------:|
| 1 | 1 | 4
| 2 | 3 | 4
| 3 | 1 | 3
| 4 | 3 | 3
| 5 | 1 | 3
| 6 | 3 | 3
I need to get all rows from table 1 where the points are above 30 and where table2 has a productid of 4.
At the moment I have a raw query like this:
SELECT userid, SUM(points) as points FROM table1 GROUP BY userid HAVING SUM(points) >= 30 ORDER BY SUM(points) DESC, userid
Through DB::select
How can I make sure that all of the results only have a productid of 4 in table2, connected via the userid? Is this where a join is applicable? I see left join and others, so I'm not too sure how to go about this. Any suggestions appreciated.
EDIT:
I just got this working:
SELECT table1.userid, SUM(points) as points FROM table1 LEFT JOIN table2 on table1.userid = table2.userid WHERE table2.productid = '4' GROUP BY table1.userid HAVING SUM(points) >= 30 ORDER BY SUM(points) DESC, table1.userid
It is giving me back the correct results, but I'm not 100% sure about join vs. left join. Any feedback on whether that is OK?
If you use an inner join, you get only the rows that have a match with productid = 4, and only those are summed:
SELECT table1.userid, SUM(points) as points
FROM table1
inner join table2 on table1.userid = table2.userid and productid=4
GROUP BY table1.userid
HAVING SUM(points) >= 30
ORDER BY SUM(points) DESC, table1.userid
Or, if you are looking for the users that have at least one row with productid = 4, you can use:
SELECT table1.userid, SUM(points) as points
FROM table1
inner join (
select distinct userid
from table2 where productid =4
) t on table1.userid = t.userid
GROUP BY table1.userid
HAVING SUM(points) >= 30
ORDER BY SUM(points) DESC, table1.userid
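Here is a small sketch of the distinct-subquery variant, run through Python's sqlite3 module as a stand-in for MySQL. It uses the sample rows from the question plus one made-up user (userid 7, marked as hypothetical in the code) who never bought product 4, to show that such a user is filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, userid INTEGER, points INTEGER);
    CREATE TABLE table2 (id INTEGER, userid INTEGER, productid INTEGER);
    INSERT INTO table1 VALUES
        (1, 1, 30), (2, 3, 40), (3, 1, 30), (4, 3, 40), (5, 1, 30), (6, 3, 40),
        (7, 7, 50);   -- hypothetical extra user, not in the question
    INSERT INTO table2 VALUES
        (1, 1, 4), (2, 3, 4), (3, 1, 3), (4, 3, 3), (5, 1, 3), (6, 3, 3),
        (7, 7, 3);    -- hypothetical: user 7 only has product 3
""")

# Joining against DISTINCT userids keeps each table1 row at most once,
# so the sums are not inflated when a user has several product-4 rows.
totals = conn.execute("""
    SELECT table1.userid, SUM(points) AS points
    FROM table1
    INNER JOIN (SELECT DISTINCT userid FROM table2 WHERE productid = 4) t
            ON table1.userid = t.userid
    GROUP BY table1.userid
    HAVING SUM(points) >= 30
    ORDER BY SUM(points) DESC, table1.userid
""").fetchall()

print(totals)
```

The join condition compares userid with userid; joining on table1.id instead would match unrelated rows.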
I have two tables that I am attempting to join in MySQL:
reviews:
| review_id | comment | reviewer_id | user_id |
-----------------------------------------------------------
| 1 | some text. | 501 | 100 |
| 2 | lorem ipsum | 606 | 100 |
| 3 | blah blah. | 798 | 120 |
| 4 | foo bar! | 798 | 133 |
-----------------------------------------------------------
review_status:
| review_id | status | timestamp |
----------------------------------------
| 1 | 10 | 1364507521 |
| 1 | 101 | 1364508057 |
| 2 | 100 | 1364509033 |
| 1 | 150 | 1364509149 |
| 2 | 120 | 1364509283 |
| 2 | 122 | 1364855948 |
| 3 | 120 | 1364509283 |
| 3 | 122 | 1364855948 |
| 1 | 110 | 1364855945 |
| 4 | 100 | 1364509283 |
| 4 | 115 | 1364855948 |
| 4 | 210 | 1364855945 |
----------------------------------------
What I WANT is a result that looks something like this:
result
| review_id | comment | reviewer_id | user_id | status | timestamp |
--------------------------------------------------------------------------
| 1 | some text. | 501 | 100 | 110 | 1364855945 |
| 2 | lorem ipsum | 606 | 100 | 122 | 1364855948 |
--------------------------------------------------------------------------
I'm after:
1) The newest entry from the review_status table
2) A certain range of status codes (100 - 199 in this case)
3) Multiple user_ids from the reviews table
This is currently my query, that I can't get to work for the life of me:
SELECT r.review_id, r.comment, r.reviewer_id, r.user_id
FROM reviews AS r
INNER JOIN
(SELECT s.status, max(s.timestamp)
FROM review_status AS s
WHERE s.status < 200
AND s.status > 99;
GROUP BY s.review_id) AS r_s
ON r.review_id = r_s.review_id
WHERE r.user_id IN (100,120);
Any help is greatly appreciated! Thanks.
You have a few issues with your current query:
1) The subquery does not return review_id, so you cannot use it in the join.
2) You have an extra semicolon in the subquery.
I might suggest rewriting the query to use the following:
SELECT r.review_id, r.comment, r.reviewer_id, r.user_id,
rs.status, rs.timestamp
FROM reviews AS r
INNER JOIN review_status rs
ON r.review_id = rs.review_id
INNER JOIN
(
SELECT s.review_id, max(s.timestamp) MaxDate
FROM review_status AS s
WHERE s.status < 200
AND s.status > 99
GROUP BY s.review_id
) AS r_s
ON rs.review_id = r_s.review_id
AND rs.timestamp = r_s.MaxDate
WHERE r.user_id IN (100,120)
and rs.status < 200
AND rs.status > 99
The main reason for writing the query this way is that your current query groups by review_id but returns status. MySQL has an extension to GROUP BY that allows the select list to contain columns that are neither named in the GROUP BY clause nor wrapped in an aggregate function, but this can produce unexpected results (see MySQL Extensions to GROUP BY).
From the MySQL Docs:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. ... You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.
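To see the "join back to the per-group maximum" pattern on the sample data, here is a sketch in Python with sqlite3 standing in for MySQL (an assumption made for runnability). The ORDER BY at the end is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reviews (review_id INTEGER, comment TEXT, reviewer_id INTEGER, user_id INTEGER);
    CREATE TABLE review_status (review_id INTEGER, status INTEGER, timestamp INTEGER);
    INSERT INTO reviews VALUES
        (1, 'some text.', 501, 100), (2, 'lorem ipsum', 606, 100),
        (3, 'blah blah.', 798, 120), (4, 'foo bar!', 798, 133);
    INSERT INTO review_status VALUES
        (1, 10, 1364507521),  (1, 101, 1364508057), (2, 100, 1364509033),
        (1, 150, 1364509149), (2, 120, 1364509283), (2, 122, 1364855948),
        (3, 120, 1364509283), (3, 122, 1364855948), (1, 110, 1364855945),
        (4, 100, 1364509283), (4, 115, 1364855948), (4, 210, 1364855945);
""")

# The subquery finds the newest in-range timestamp per review; joining it
# back to review_status recovers the status that belongs to that timestamp.
latest = conn.execute("""
    SELECT r.review_id, r.comment, r.reviewer_id, r.user_id, rs.status, rs.timestamp
    FROM reviews AS r
    INNER JOIN review_status AS rs ON r.review_id = rs.review_id
    INNER JOIN (
        SELECT s.review_id, MAX(s.timestamp) AS MaxDate
        FROM review_status AS s
        WHERE s.status > 99 AND s.status < 200
        GROUP BY s.review_id
    ) AS r_s ON rs.review_id = r_s.review_id AND rs.timestamp = r_s.MaxDate
    WHERE r.user_id IN (100, 120)
      AND rs.status > 99 AND rs.status < 200
    ORDER BY r.review_id
""").fetchall()

for row in latest:
    print(row)
```

With this data the query also returns review 3, since its user_id of 120 is in the requested list.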
Try this:
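SELECT r.*, r_s.*
FROM review_status r_s INNER JOIN reviews r
ON r.review_id = r_s.review_id
-- user_id lives in reviews, not in review_status
WHERE r.user_id IN (100, 120)
AND r_s.status BETWEEN 100 AND 199
-- newest status rows first; this does not reduce the output to one row per review
ORDER BY r_s.timestamp DESC;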
SELECT r.review_id, r.comment, r.reviewer_id, r.user_id, tt.status,tt.timestamp
FROM (
SELECT rs2.review_id,MAX(rs2.status) as status,rs2.timestamp
FROM (
SELECT rs.review_id, MAX(rs.timestamp) as mts
FROM reviews rr
JOIN review_status AS rs ON rs.review_id = rr.review_id
WHERE rs.status < 200 AND rs.status > 99
AND rr.user_id IN (100,120)
GROUP BY rs.review_id
) as t
JOIN review_status rs2 ON rs2.review_id = t.review_id AND rs2.timestamp = t.mts
WHERE rs2.status < 200 AND rs2.status > 99
GROUP BY rs2.review_id, rs2.timestamp #remove duplicate statuses with the same timestamp
) as tt
JOIN reviews as r ON r.review_id = tt.review_id
The user_id and status filters have to be in the innermost query to avoid selecting and joining the entire statuses table every time.
Here's my attempt with one JOIN and one correlated sub-query:
SELECT r.*, rs.*
FROM Reviews AS r
INNER JOIN Review_status AS rs ON r.review_id = rs.review_id
WHERE rs.status BETWEEN 100 AND 199 AND
r.user_id IN (100,120) AND
rs.timestamp = (SELECT MAX(timestamp) FROM Review_status
WHERE review_id = r.review_id
AND status BETWEEN 100 AND 199)
ORDER BY r.review_id;
Its SQL Fiddle: http://sqlfiddle.com/#!2/02f18/6
Help please, I have a table like this:
| ID | userId | amount | type |
-------------------------------------
| 1 | 10 | 10 | expense |
| 2 | 10 | 22 | income |
| 3 | 3 | 25 | expense |
| 4 | 3 | 40 | expense |
| 5 | 3 | 63 | income |
I'm looking for a way to use one query and retrieve the balance of each user.
The hard part is that the amounts have to be added for incomes and subtracted for expenses.
This would be the result table:
| userId | balance |
--------------------
| 10 | 12 |
| 3 | -2 |
You need to get the totals of income and expense for each user using subqueries, then join them so you can subtract the expense from the income:
SELECT a.UserID,
(b.totalIncome - a.totalExpense) `balance`
FROM
(
SELECT userID, SUM(amount) totalExpense
FROM myTable
WHERE type = 'expense'
GROUP BY userID
) a INNER JOIN
(
SELECT userID, SUM(amount) totalIncome
FROM myTable
WHERE type = 'income'
GROUP BY userID
) b on a.userID = b.userid
This is easiest to do with a single group by:
select userId,
sum(case when type = 'income' then amount else - amount end) as balance
from t
group by userId
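The single group by can be checked against the sample table with the following sketch in Python using sqlite3 (a stand-in for the asker's database). The table name t follows the answer above; the ORDER BY is added only so the output order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (ID INTEGER, userId INTEGER, amount INTEGER, type TEXT);
    INSERT INTO t VALUES
        (1, 10, 10, 'expense'), (2, 10, 22, 'income'),
        (3, 3, 25, 'expense'),  (4, 3, 40, 'expense'), (5, 3, 63, 'income');
""")

# Incomes count positively and every other type negatively, so one pass
# over the table yields the balance per user.
balances = conn.execute("""
    SELECT userId,
           SUM(CASE WHEN type = 'income' THEN amount ELSE -amount END) AS balance
    FROM t
    GROUP BY userId
    ORDER BY userId
""").fetchall()

print(balances)
```

Unlike the two-subquery inner join, this form also returns users who have only incomes or only expenses.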
You could have 2 sub-queries, each grouped by id: one sums the incomes, the other the expenses. Then you could join these together, so that each row had an id, the sum of the expenses and the sum of the income(s), from which you can easily compute the balance.