MYSQL - facing difficulty combining two columns from two different tables - mysql

How can I find the restaurant name and the total number of orders for each in Jan 2021? The issue I'm facing is that the restaurant names and the orders are on separate tables as you can see from the code below.
create table orders (id integer, country text, customer_id integer,
restaurant_id INTEGER, date date, order_value integer);
create table customers (id integer, name text, country text);
create table restaurants (id integer, name text, country text);
INSERT INTO orders (
id,
country,
customer_id,
restaurant_id,
date,
order_value)
VALUES
(1, 'Pakistan', 1, 1, '2021-01-01', 400),
(2, 'Pakistan', 2, 1, '2021-01-01', 500),
(3, 'Pakistan', 4, 2, '2021-01-01', 300),
(4, 'Pakistan', 4, 3, '2021-01-05', 200),
(5, 'Pakistan', 5, 4, '2021-01-01', 250),
(6, 'Pakistan', 4, 1, '2021-01-09', 266),
(7, 'Pakistan', 3, 2, '2021-01-07', 322),
(1, 'Holland', 1, 1, '2021-01-01', 378),
(8, 'Pakistan', 1, 3, '2021-06-01', 289),
(2, 'Holland', 1, 1, '2021-08-01', 480),
(9, 'Pakistan', 1, 1, '2021-03-01', 580),
(10, 'Pakistan', 3, 2, '2021-07-01', 360),
(3, 'Holland', 1, 1, '2021-09-01', 550),
(11, 'Pakistan', 4, 3, '2021-04-01', 991),
(12, 'Pakistan', 5, 1, '2021-04-01', 875),
(4, 'Holland', 1, 1, '2021-03-02', 250),
(13, 'Pakistan', 1, 1, '2021-08-01', 150),
(14, 'Pakistan', 1, 2, '2021-09-01', 290),
(5, 'Holland', 1, 1, '2021-07-01', 240),
(15, 'Pakistan', 1, 3, '2021-03-01', 780),
(16, 'Pakistan', 1, 4, '2021-06-01', 987),
(6, 'Holland', 1, 1, '2021-05-03', 457),
(17, 'Pakistan', 1, 4, '2021-05-04', 258);
INSERT INTO customers (
id,
name,
country)
VALUES
(1, 'Steven Smith', 'Pakistan'),
(2, 'Arthur Chen', 'Holland'),
(3, 'Michael Wren', 'Pakistan'),
(4, 'John Almagro', 'Pakistan'),
(5, 'Luke Pablo', 'Pakistan'),
(6, 'Monty Tron', 'Pakistan');
INSERT INTO restaurants (
id,
name,
country)
VALUES
(1, 'KFC', 'Pakistan'),
(2, "McDonald's", 'Holland'),
(3, 'Howdy', 'Pakistan'),
(4, 'Kitchen Cuisine', 'Pakistan'),
(5, 'JFC', 'Pakistan'),
(6,'Hardees','Pakistan');
I learned about JOIN functions but I'm not able to join the dots.

Joining two tables is done by telling the database which rows belong together. This is defined in the ON clause, which names the joining columns.
The WHERE clause is the same as in the last query: it removes all rows that do not have the right year and month.
The GROUP BY here has three columns. Because name and country always have the same value for a given restaurant_id, the extra columns do not change the groups. We could also have applied an aggregate function to those columns instead, which would have the same effect.
SELECT
r.name,
r.country,
COUNT(*) Total_orders
FROM
orders o JOIN restaurants r ON o.restaurant_id = r.id
WHERE YEAR(`date`)= 2021 AND MONTH(`date`)= 1
GROUP BY restaurant_id,r.name,r.country
name | country | Total_orders
:-------------- | :------- | -----------:
KFC | Pakistan | 4
McDonald's | Holland | 2
Howdy | Pakistan | 1
Kitchen Cuisine | Pakistan | 1
SELECT
MAX(r.name) name,
MAX(r.country) country,
COUNT(*) Total_orders
FROM
orders o JOIN restaurants r ON o.restaurant_id = r.id
WHERE YEAR(`date`)= 2021 AND MONTH(`date`)= 1
GROUP BY restaurant_id
name | country | Total_orders
:-------------- | :------- | -----------:
KFC | Pakistan | 4
McDonald's | Holland | 2
Howdy | Pakistan | 1
Kitchen Cuisine | Pakistan | 1

SELECT -- MySQL SELECT statement
MAX(r.name) `Hotel Name`, -- restaurant name from table restaurants (alias r)
COUNT(*) `Number Of Orders`, -- number of matching orders (table orders, alias o) in each group
SUM(o.order_value) `Total Order Value`, -- sum of order_value over the matching orders
MAX(r.country) `Country` -- restaurant country
FROM
orders o -- start from table orders, alias o
INNER JOIN restaurants r ON o.restaurant_id = r.id -- join table restaurants (alias r),
-- matching r.id (the restaurant's primary key)
-- with o.restaurant_id (the foreign key that refers to it)
WHERE -- filter conditions
YEAR(o.`date`) = 2021 -- year must be 2021
AND MONTH(o.`date`) = 1 -- and month must be January (1)
GROUP BY -- one group per restaurant id
o.restaurant_id;
| Hotel Name | Number Of Orders | Total Order Value | Country |
|-----------------|------------------|-------------------|----------|
| KFC | 4 | 1544 | Pakistan |
| McDonald's | 2 | 622 | Holland |
| Howdy | 1 | 200 | Pakistan |
| Kitchen Cuisine | 1 | 250 | Pakistan |
HINT -
To join two tables, we need a column in each that holds the same values or otherwise links the rows.
Here, table orders has restaurant_id, which acts as a foreign key to table restaurants (id). In other words, we can use that id to look up the restaurant details in table restaurants.
Hence, to join tables orders and restaurants, we use column id from table restaurants and column restaurant_id from table orders.
Since the orders table has multiple rows with the same restaurant_id, it is better to group them together into buckets.
When we GROUP BY a column, MySQL puts rows that have the same value in that column into the same bucket.
Any aggregate function such as SUM, AVG, COUNT, MAX, or MIN then treats each bucket as a logical table and operates on it.
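The join-then-bucket idea can be checked end to end with a small script. This is only a sketch under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, loads the eight January 2021 orders from the question plus two later ones, and swaps the YEAR()/MONTH() filter for a half-open date range. That swap is my substitution, not part of the answers above; a range on the bare column is often preferred because it can use an index on date.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE restaurants (id INTEGER, name TEXT, country TEXT);
CREATE TABLE orders (id INTEGER, country TEXT, customer_id INTEGER,
                     restaurant_id INTEGER, date TEXT, order_value INTEGER);
INSERT INTO restaurants VALUES
  (1, 'KFC', 'Pakistan'), (2, 'McDonald''s', 'Holland'),
  (3, 'Howdy', 'Pakistan'), (4, 'Kitchen Cuisine', 'Pakistan'),
  (5, 'JFC', 'Pakistan'), (6, 'Hardees', 'Pakistan');
-- The eight January 2021 orders from the question, then two later orders.
INSERT INTO orders VALUES
  (1, 'Pakistan', 1, 1, '2021-01-01', 400),
  (2, 'Pakistan', 2, 1, '2021-01-01', 500),
  (3, 'Pakistan', 4, 2, '2021-01-01', 300),
  (4, 'Pakistan', 4, 3, '2021-01-05', 200),
  (5, 'Pakistan', 5, 4, '2021-01-01', 250),
  (6, 'Pakistan', 4, 1, '2021-01-09', 266),
  (7, 'Pakistan', 3, 2, '2021-01-07', 322),
  (1, 'Holland',  1, 1, '2021-01-01', 378),
  (8, 'Pakistan', 1, 3, '2021-06-01', 289),
  (9, 'Pakistan', 1, 1, '2021-03-01', 580);
""")

# Join each order to its restaurant, keep January 2021, one bucket per restaurant.
rows = conn.execute("""
    SELECT r.name,
           COUNT(*)           AS total_orders,
           SUM(o.order_value) AS total_value
    FROM orders o
    JOIN restaurants r ON o.restaurant_id = r.id
    WHERE o.date >= '2021-01-01' AND o.date < '2021-02-01'
    GROUP BY r.id, r.name
    ORDER BY r.id
""").fetchall()

for name, total_orders, total_value in rows:
    print(name, total_orders, total_value)
```

Restaurants with no January orders (JFC, Hardees) do not appear, because an inner join only keeps restaurants that have at least one matching order.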

Related

Sum up query based on 2x join table

My sql looks like:
create table ad(
ad_id int,
ad_name varchar(10)
);
insert into ad(ad_id, ad_name) values
(1,'ad1'),
(2,'ad2'),
(3,'ad3');
create table ad_insight(
id int,
ad_id int,
date date,
clicks int
);
insert into ad_insight(id, ad_id, date, clicks) values
(1, 1, '2021-04-25', 1),
(2, 1, '2021-04-24', 4),
(3, 1, '2021-04-23', 2),
(4, 2, '2021-04-25', 6),
(5, 2, '2021-03-03', 7);
create table product(
product_id int,
product_name varchar(10)
);
insert into product(product_id, product_name) values
(1,'prod1'),
(2,'prod2'),
(3,'prod3'),
(4,'prod4'),
(5,'prod5');
create table product_insight(
id int,
product_id int,
sale int,
date date
);
insert into product_insight(id, product_id, sale, date) values
(1, 1, 12, '2021-04-25'),
(2, 1, 11, '2021-04-24'),
(3, 1, 13, '2021-04-23'),
(4, 1, 14, '2021-04-22'),
(5, 1, 17, '2021-04-21'),
(6, 1, 15, '2021-04-20'),
(7, 1, 13, '2021-04-19'),
(8, 2, 19, '2021-04-25');
create table ads_products(
ad_id int,
product_id int
);
insert into ads_products (ad_id, product_id) values
(1, 1),
(1, 2),
(2, 3),
(2, 4),
(1, 3);
A quick explanation of schema:
I have ads:
each ad has insights, which tell us when a certain ad was active.
each ad has products(many2many - ads_products table). Each product has product_insight which tells us how many sales that product generated on a certain day.
And now I want to get the following table which will sum up clicks from ad_insight table and sum up product_sale from product_insight in 2021-04-23 to 2021-04-25 inclusive.
+----------+--------+--------------+--------------+
| ad_name | clicks | product_sale | products |
+----------+--------+--------------+--------------+
| ad1 | 7 | 55 | prod1, prod2 |
| ad2 | 6 | 0 | prod3, prod4 |
| ad3 | 0 | 36 | prod1 |
+----------+--------+--------------+--------------+
What I have tried?
select ad_name, SUM(ad_insight.clicks) as clicks
from ad
left join ad_insight on ad.ad_id = ad_insight.ad_id
where ad_insight.date >= '2021-04-23' and ad_insight.date <= '2021-04-25'
group by ad.ad_id;
But I do not know how to also select product_sale and the product names separated by commas.

Sql sum up based on parent table

My sql looks like:
create table ad(
ad_id int,
ad_name varchar(10)
);
insert into ad(ad_id, ad_name) values
(1,'ad1'),
(2,'ad2'),
(3,'ad3');
create table ad_insight(
id int,
ad_id int,
date date,
clicks int
);
insert into ad_insight(id, ad_id, date, clicks) values
(1, 1, '2021-04-25', 1),
(2, 1, '2021-04-24', 4),
(3, 1, '2021-04-23', 2),
(4, 2, '2021-04-25', 6),
(5, 2, '2021-03-03', 7);
create table product(
product_id int,
ad_id int,
product_name varchar(10)
);
insert into product(product_id, ad_id, product_name) values
(1, 1, 'prod1'),
(2, 1, 'prod2'),
(3, 2, 'prod3'),
(4, 2, 'prod4'),
(1, 3, 'prod1');
create table product_insight(
id int,
product_id int,
sale int,
date date
);
insert into product_insight(id, product_id, sale, date) values
(1, 1, 12, '2021-04-25'),
(2, 1, 11, '2021-04-24'),
(3, 1, 13, '2021-04-23'),
(4, 1, 14, '2021-04-22'),
(5, 1, 17, '2021-04-21'),
(6, 1, 15, '2021-04-20'),
(7, 1, 13, '2021-04-19'),
(8, 2, 19, '2021-04-25');
A quick explanation of schema:
I have ads:
each ad has insights, which tell us when a certain ad was active.
each ad has products. Each product has product_insight which tells us how many sales that product generated on a certain day.
And now I want to get the following tables:
which will sum up clicks from ad_insight table and sum up product_sale from product_insight in 2021-04-23 to 2021-04-25 inclusive.
+----------+--------+--------------+--------------+
| ad_name | clicks | product_sale | products |
+----------+--------+--------------+--------------+
| ad1 | 7 | 55 | prod1, prod2 |
| ad2 | 6 | 0 | prod3, prod4 |
| ad3 | 0 | 36 | prod1 |
+----------+--------+--------------+--------------+
The summary row which will sum up everything in the above table:
+------------+--------------+--------------------+----------------------------+
| total_ads | total_clicks | total_product_sale | unique_all_products |
+------------+--------------+--------------------+----------------------------+
| 3 | 13| 91 | prod1, prod2, prod3, prod4 |
+------------+--------------+--------------------+----------------------------+
What I have tried?
# 1) table
select ad_name, SUM(ad_insight.clicks) as clicks
from ad
left join ad_insight on ad.ad_id = ad_insight.ad_id
where ad_insight.date >= '2021-04-23' and ad_insight.date <= '2021-04-25'
group by ad.ad_id;
# 2) table
select count(distinct ad_insight.ad_id) as total, SUM(ad_insight.clicks) as clicks
from ad_insight
left join ad on ad.ad_id = ad_insight.ad_id
where ad_insight.date >= '2021-04-23' and ad_insight.date <= '2021-04-25'
But I do not know how to also select product_sale and the product names separated by commas.
If I understand correctly, you want to aggregate along two different dimensions (clicks and sales) for each ad. Aggregate before joining:
select ad.ad_name, ai.clicks, p.sales, p.products
from ad left join
(select ai.ad_id, sum(ai.clicks) as clicks
from ad_insight ai
where ai.date >= '2021-04-23' and ai.date <= '2021-04-25'
group by ai.ad_id
) ai
on ad.ad_id = ai.ad_id left join
(select p.ad_id, sum(pi.sale) as sales,
group_concat(distinct p.product_name) as products
from product p join
product_insight pi
on pi.product_id = p.product_id
where pi.date >= '2021-04-23' and pi.date <= '2021-04-25'
group by p.ad_id
) p
on p.ad_id = ad.ad_id ;
The second query just aggregates this again.
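As a rough illustration of "aggregate before joining, then aggregate again", here is a sketch that runs the idea against the question's data. Assumptions on my part: SQLite (via Python's sqlite3) stands in for MySQL, COALESCE is added so that ads without clicks or sales show 0 as in the desired output, and the comma-separated product lists are left out to keep the summary step simple.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the data is copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ad (ad_id INT, ad_name TEXT);
INSERT INTO ad VALUES (1, 'ad1'), (2, 'ad2'), (3, 'ad3');
CREATE TABLE ad_insight (id INT, ad_id INT, date TEXT, clicks INT);
INSERT INTO ad_insight VALUES
  (1, 1, '2021-04-25', 1), (2, 1, '2021-04-24', 4), (3, 1, '2021-04-23', 2),
  (4, 2, '2021-04-25', 6), (5, 2, '2021-03-03', 7);
CREATE TABLE product (product_id INT, ad_id INT, product_name TEXT);
INSERT INTO product VALUES
  (1, 1, 'prod1'), (2, 1, 'prod2'), (3, 2, 'prod3'), (4, 2, 'prod4'),
  (1, 3, 'prod1');
CREATE TABLE product_insight (id INT, product_id INT, sale INT, date TEXT);
INSERT INTO product_insight VALUES
  (1, 1, 12, '2021-04-25'), (2, 1, 11, '2021-04-24'), (3, 1, 13, '2021-04-23'),
  (4, 1, 14, '2021-04-22'), (5, 1, 17, '2021-04-21'), (6, 1, 15, '2021-04-20'),
  (7, 1, 13, '2021-04-19'), (8, 2, 19, '2021-04-25');
""")

# Each dimension is summed in its own subquery first, so joining both
# to ad cannot multiply click rows by sale rows.
PER_AD = """
SELECT ad.ad_id, ad.ad_name,
       COALESCE(ai.clicks, 0) AS clicks,
       COALESCE(p.sales, 0)   AS product_sale
FROM ad
LEFT JOIN (SELECT ad_id, SUM(clicks) AS clicks
           FROM ad_insight
           WHERE date BETWEEN '2021-04-23' AND '2021-04-25'
           GROUP BY ad_id) ai ON ai.ad_id = ad.ad_id
LEFT JOIN (SELECT p.ad_id, SUM(pi.sale) AS sales
           FROM product p
           JOIN product_insight pi ON pi.product_id = p.product_id
           WHERE pi.date BETWEEN '2021-04-23' AND '2021-04-25'
           GROUP BY p.ad_id) p ON p.ad_id = ad.ad_id
"""

per_ad = conn.execute(
    f"SELECT ad_name, clicks, product_sale FROM ({PER_AD}) AS t ORDER BY ad_id"
).fetchall()

# The summary row aggregates the per-ad result once more.
summary = conn.execute(
    f"SELECT COUNT(*), SUM(clicks), SUM(product_sale) FROM ({PER_AD}) AS t"
).fetchone()

print(per_ad)
print(summary)
```

The numeric totals agree with the summary row the question asks for (3 ads, 13 clicks, 91 sales). The unique product list would need its own aggregation over product names and is not covered here.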

Select products and join categories hierarchical

I have two tables in my database:
create table category (id integer, name text, parent_id integer);
create table product (id integer, name text, category integer, description text);
insert into category
values
(1, 'Category A', null),
(2, 'Category B', null),
(3, 'Category C', null),
(4, 'Category D', null),
(5, 'Subcategory Of 1', 1),
(6, 'Subcategory Of 5', 5),
(7, 'Subcategory Of 5', 5),
(8, 'Subcategory of D', 4)
;
insert into product
values
(1, 'Product One', 5, 'Our first product'),
(2, 'Product Two', 6, 'Our second product'),
(3, 'Product Three', 8, 'The even better one');
How can I return like this:
product_id | product_name | root_category | category_path
-----------+--------------+---------------+-----------------------------
1 | Product One | 1 | /Category A/Subcategory Of 1
2 | Product Two | 1 | /Category A/Subcategory of 5/Subcategory of 6
I used "WITH RECURSIVE" on the category table but can't find a way to combine it with the product table in a single query.
I use example from here
What's the best way to do this?
Here you go, assuming you have MariaDB 10.2 or newer:
with recursive pt (root_id, id, path) as (
select id, id, concat('/', name) from category where parent_id is null
union all
select pt.root_id, c.id, concat(pt.path, '/', c.name)
from pt join category c on c.parent_id = pt.id
)
select p.id, p.name, pt.root_id, pt.path
from pt
join product p on pt.id = p.category;
Result:
id name root_id path
-- -------------- ------- ---------------------------------------------
1 Product One 1 /Category A/Subcategory Of 1
2 Product Two 1 /Category A/Subcategory Of 1/Subcategory Of 5
3 Product Three 4 /Category D/Subcategory of D

How to aggregate results in many to many Query

I have 3 tables in my database:
Student:
id
name
Student_Course:
student_id
course_id
Course:
id
grade
And I want to list all the students together with whether they have passed all of the courses they have chosen, assuming that grade <= 'C' is a pass.
I tried sql like:
SELECT s.*,
IF('C'>=ALL(SELECT c.grade FROM Course c WHERE c.id=sc.course_id),1,0) as isPass
FROM Student s LEFT JOIN Student_Course sc on sc.student_id=s.id
This SQL works, but if I now want a column 'isGood', which means that every grade is 'A', do I need to execute the subquery again? How can I get both 'isGood' and 'isPass' while executing the subquery only once?
I believe the grade would be better served in the junction table. Using that, I have created a scenario that might help you solve your question:
Scenario
create table student (id int, fullname varchar(50));
insert into student values (1, 'john'), (2, 'mary'), (3, 'matt'), (4, 'donald');
create table course (id int, coursename varchar(50));
insert into course values (1, 'math'), (2, 'science'), (3, 'business');
create table student_course (student_id int, course_id int, grade char(1));
insert into student_course values
(1, 1, 'C'), (1, 2, 'C'), (1, 3, 'C'),
(2, 1, 'A'), (2, 2, 'A'), (2, 3, 'A'),
(3, 1, 'A'), (3, 2, 'C'), (3, 3, 'C'),
(4, 1, 'A'), (4, 2, 'C'), (4, 3, 'F');
Query
select s.*, case when all_a_grades.student_id is not null then 'GOOD' else 'MEH' end as grades
from student s
left join (
-- find students who got A in all classes
select student_id, count(distinct ca.id) as aclasses, count(distinct sc.course_id) as allclasses
from student_course sc
left join (select id, 'A' as agrade from course) ca
on ca.id = sc.course_id and ca.agrade = sc.grade
group by student_id
having aclasses = allclasses
) all_a_grades on all_a_grades.student_id = s.id
where not exists (
-- let's make sure we filter OUT students who have failed
-- at least one course
select 1
from (
-- find students who have failed at least one course
select distinct student_id
from student_course
where grade not in ('A', 'B', 'C')
) t where t.student_id = s.id
)
Result
| id | fullname | grades |
| 1 | john | MEH |
| 2 | mary | GOOD |
| 3 | matt | MEH |
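A different way to get both flags in one pass is conditional aggregation: group by student and take the MIN of a 0/1 expression per flag. This is a sketch of that alternative technique, not the method used in the query above. It assumes the same scenario tables with grade stored in student_course, keeps the isPass and isGood names from the question, and uses SQLite through Python's sqlite3 as a stand-in for MySQL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; tables follow the scenario above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INT, fullname TEXT);
INSERT INTO student VALUES (1, 'john'), (2, 'mary'), (3, 'matt'), (4, 'donald');
CREATE TABLE student_course (student_id INT, course_id INT, grade TEXT);
INSERT INTO student_course VALUES
  (1, 1, 'C'), (1, 2, 'C'), (1, 3, 'C'),
  (2, 1, 'A'), (2, 2, 'A'), (2, 3, 'A'),
  (3, 1, 'A'), (3, 2, 'C'), (3, 3, 'C'),
  (4, 1, 'A'), (4, 2, 'C'), (4, 3, 'F');
""")

# MIN over a 0/1 expression is 1 only if every row in the group yields 1.
# A student with no courses gets 1 for both flags, since the single
# NULL-grade row produced by the LEFT JOIN falls into the ELSE branch.
rows = conn.execute("""
    SELECT s.id, s.fullname,
           MIN(CASE WHEN sc.grade > 'C'  THEN 0 ELSE 1 END) AS isPass,
           MIN(CASE WHEN sc.grade <> 'A' THEN 0 ELSE 1 END) AS isGood
    FROM student s
    LEFT JOIN student_course sc ON sc.student_id = s.id
    GROUP BY s.id, s.fullname
    ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
```

Both columns come from a single scan of student_course, so no subquery has to be repeated when another flag is added.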

SQL for filtering

Following the question "Collaborative filtering in MySQL?", I have created the following table:
CREATE TABLE `ub` (
`user_id` int(11) NOT NULL,
`book_id` varchar(10) NOT NULL,
`rate` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`book_id`),
UNIQUE KEY `book_id` (`book_id`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
insert into ub values (1, 'A', '8'), (1, 'B', '7'), (1, 'C', '10');
insert into ub values (2, 'A', '8'), (2, 'B', '7'), (2, 'C', '10'), (2,'D', '8'), (2,'X', '7');
insert into ub values (3, 'X', '10'), (3, 'Y', '8'), (3, 'C', '10'), (3,'Z', '10');
insert into ub values (4, 'W', '8'), (4, 'Q', '8'), (4, 'C', '10'), (4,'Z', '8');
I am then able to get the following table, and I understand how it works.
create temporary table ub_rank as
select similar.user_id,count(*) rank
from ub target
join ub similar on target.book_id= similar.book_id and target.user_id != similar.user_id and target.rate= similar.rate
where target.user_id = 1
group by similar.user_id;
select * from ub_rank;
+---------+------+
| user_id | rank |
+---------+------+
| 2 | 3 |
| 3 | 1 |
| 4 | 1 |
+---------+------+
However, I get confused by the following code.
select similar.rate, similar.book_id, sum(ub_rank.rank) total_rank
from ub_rank
join ub similar on ub_rank.user_id = similar.user_id
left join ub target on target.user_id = 1 and target.book_id = similar.book_id and target.Rate= similar.Rate
where target.book_id is null
group by similar.book_id
order by total_rank desc, rate desc;
+---------+------------+
| book_id | total_rank |
+---------+------------+
| X | 4 |
| D | 3 |
| Z | 2 |
| Y | 1 |
| Q | 1 |
| W | 1 |
+---------+------------+
(1, 'A', '8'), (1, 'B', '7'), (1, 'C', '10');
(2, 'A', '8'), (2, 'B', '7'), (2, 'C', '10'), (2,'D', '8'), (2,'X', '7');
What I want to do is this: suppose users 1 and 2 have similar behavior (both chose A, B, C with matching ratings). I would then recommend D to user 1, as it has a higher rate.
The code above does not seem to do so, since the first-ranked book is X. How can I change the code to achieve this goal?
Or is the existing method actually a better or more accurate way to recommend?
The existing query is ranking the results based on the total value of rank for each book, and then using rate as a tie-break for books which have the same total rank. (Also, rate will essentially be random since similar.rate is not aggregated, grouped on or functionally dependent on a grouping item in the query.)
As such, X will be ranked higher than D because it has been chosen by one user of rank 3 and one user of rank 1, giving a total rank of 4, whereas D has only been chosen by one user of rank 3.
You could change the query to include a rating element weighted by ranking - for example:
select similar.book_id,
sum(ub_rank.rank) total_rank,
sum(ub_rank.rank*similar.rate) wtd_rate
from ub_rank
join ub similar on ub_rank.user_id = similar.user_id
left join ub target on target.user_id = 1 and target.book_id = similar.book_id and target.Rate= similar.Rate
where target.book_id is null
group by similar.book_id
order by wtd_rate desc, total_rank desc
- although in this case this will still rank X higher, as it has a rating of 7 from a user of rank 3 plus a rating of 10 from a user of rank 1, giving a weighted rate of 31, compared with D's weighted rate of 24.
If you want D to rank higher than X, you need to decide what criteria you are going to use that would rank D higher than X.
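The arithmetic behind the weighted ordering can be reproduced with a short script. This is a sketch under assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the rank column is renamed user_rank and the aliases shortened to sim and tgt, and the second query is a purely hypothetical criterion of mine (only consider the most similar users, then order by their rate), shown as one example of a rule that would put D first.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data is copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ub (user_id INT, book_id TEXT, rate INT);
INSERT INTO ub VALUES
  (1, 'A', 8), (1, 'B', 7), (1, 'C', 10),
  (2, 'A', 8), (2, 'B', 7), (2, 'C', 10), (2, 'D', 8), (2, 'X', 7),
  (3, 'X', 10), (3, 'Y', 8), (3, 'C', 10), (3, 'Z', 10),
  (4, 'W', 8), (4, 'Q', 8), (4, 'C', 10), (4, 'Z', 8);
-- Similarity of every other user to user 1: number of identically rated books.
CREATE TABLE ub_rank AS
SELECT sim.user_id, COUNT(*) AS user_rank
FROM ub tgt
JOIN ub sim ON tgt.book_id = sim.book_id
           AND tgt.user_id != sim.user_id
           AND tgt.rate = sim.rate
WHERE tgt.user_id = 1
GROUP BY sim.user_id;
""")

# Rank-weighted rate per candidate book, as in the query above.
weighted = conn.execute("""
    SELECT sim.book_id,
           SUM(r.user_rank)            AS total_rank,
           SUM(r.user_rank * sim.rate) AS wtd_rate
    FROM ub_rank r
    JOIN ub sim ON r.user_id = sim.user_id
    LEFT JOIN ub tgt ON tgt.user_id = 1
                    AND tgt.book_id = sim.book_id
                    AND tgt.rate = sim.rate
    WHERE tgt.book_id IS NULL
    GROUP BY sim.book_id
    ORDER BY wtd_rate DESC, total_rank DESC, sim.book_id
""").fetchall()

# Hypothetical criterion: look only at the most similar user(s), order by rate.
top_user_books = conn.execute("""
    SELECT sim.book_id, sim.rate
    FROM ub_rank r
    JOIN ub sim ON r.user_id = sim.user_id
    LEFT JOIN ub tgt ON tgt.user_id = 1
                    AND tgt.book_id = sim.book_id
                    AND tgt.rate = sim.rate
    WHERE tgt.book_id IS NULL
      AND r.user_rank = (SELECT MAX(user_rank) FROM ub_rank)
    ORDER BY sim.rate DESC, sim.book_id
""").fetchall()

print(weighted)
print(top_user_books)
```

With the weighted sum, X still leads (31 against 24 for D). Under the hypothetical "most similar user only" rule, D comes first because user 2 rated it 8 and X only 7. Whether that rule gives better recommendations in general is a separate question this sketch does not answer.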