MySQL: Query for selecting unique patients

I'm stuck with a query not returning unique records.
I have the following tables:
clinics (id => PK)
patients (id => PK, clinic_id => FK)
patient_visits(id => PK, patient_id => FK, clinic_id => FK)
A patient is registered to a clinic. A patient can visit any clinic any number of times.
What I want is to return all unique patients who visited a given clinic.
I tried the following query, which does not return unique records for a clinic:
SELECT v.id
, v.patient_id
, v.clinic_id
, c.name clinic_name
, p.name
, p.mobile
, p.email
, p.gender
, p.created_at
, last_visit_date
, visit_count
FROM
( SELECT DISTINCT patient_id
, clinic_id
FROM patient_visits
) pat
JOIN patient_visits v
ON pat.patient_id = v.patient_id
JOIN clinics c
ON c.id = v.clinic_id
JOIN patients p
ON p.id = v.clinic_id
JOIN
( SELECT patient_id
, MAX(patient_visits.created_at) last_visit_date
, COUNT(patient_visits.created_at) visit_count
FROM patient_visits
GROUP
BY patient_id
) visits_aggregate
ON visits_aggregate.patient_id = p.id
WHERE v.clinic_id = ?
ORDER
BY visit_date
One problem I do understand: if I join with patient_visits, it picks up every visit row that matches a patient_id, clinic_id combination, so the duplicates come back.

You should refrain from JOINing to all the rows in patient_visits, as that causes the notorious combinatorial explosion that leads to duplicate rows. You need an aggregate.
But your example shows patient_visits.id. If you don't want one result row per visit, you cannot show that column; it has a different value for each visit.
You need an aggregate from the patient_visits table, like this:
SELECT patient_id, clinic_id,
MAX(created_at) last_visit_date,
COUNT(*) visit_count
FROM patient_visits
GROUP BY patient_id, clinic_id
That query returns one row per combination of patient and clinic, so you can JOIN it to your other tables without generating duplicate rows. Before you do that, run it separately to convince yourself it works correctly.
Then use it in your query like this:
select patients.id patient_id, clinics.id clinic_id,
clinics.name as clinic_name,
patients.name, patients.mobile, patients.email, patients.gender,
patients.created_at,
pv.last_visit_date, pv.visit_count
from patients
join ( SELECT patient_id, clinic_id,
MAX(created_at) last_visit_date,
COUNT(*) visit_count
FROM patient_visits
GROUP BY patient_id, clinic_id
) pv ON patients.id = pv.patient_id
join clinics ON pv.clinic_id = clinics.id
order by pv.last_visit_date
See how this works? You don't want all the visits, just an aggregate of them giving the date of the most recent one and the count.
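To check the aggregate-then-join idea on a small scale, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite stands in for MySQL, the sample rows are made up, and the WHERE pv.clinic_id = ? filter is carried over from the question rather than from the query above.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clinics (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE patients (id INTEGER PRIMARY KEY, clinic_id INTEGER, name TEXT);
CREATE TABLE patient_visits (
    id INTEGER PRIMARY KEY, patient_id INTEGER,
    clinic_id INTEGER, created_at TEXT);
INSERT INTO clinics VALUES (1, 'North'), (2, 'South');
INSERT INTO patients VALUES (1, 1, 'Ann'), (2, 1, 'Bob');
INSERT INTO patient_visits VALUES
    (1, 1, 1, '2020-01-01'), (2, 1, 1, '2020-02-01'),
    (3, 1, 2, '2020-03-01'), (4, 2, 1, '2020-01-15');
""")

# Aggregate the visits first (one row per patient and clinic), then join.
rows = conn.execute("""
SELECT p.id, p.name, c.name, pv.last_visit_date, pv.visit_count
FROM patients p
JOIN (SELECT patient_id, clinic_id,
             MAX(created_at) AS last_visit_date,
             COUNT(*) AS visit_count
      FROM patient_visits
      GROUP BY patient_id, clinic_id) pv ON pv.patient_id = p.id
JOIN clinics c ON c.id = pv.clinic_id
WHERE pv.clinic_id = ?          -- filter taken from the question
ORDER BY pv.last_visit_date
""", (1,)).fetchall()

for row in rows:
    print(row)
```

Each patient appears once for clinic 1, even though Ann has two visits there and a third visit at another clinic.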

I don't really understand your complex query. Your query should be straightforward:
select
pv.patient_id,
pv.clinic_id,
max(pv.created_at) as last_visit,
count(*) as visit_count
from
patient_visits pv,
patients p
-- You can join the other tables here
where
pv.patient_id = p.id
and pv.clinic_id = ?
group by
pv.patient_id,
pv.clinic_id
As I see it, your DISTINCT is only in the inner select. You then join patient_visits a second time, so every visit row is paired with every distinct (patient, clinic) row for the same patient. That multiplies the rows instead of removing duplicates.

Related

SQL How to select original(distinct) values from table without using distinct, group by and over keywords?

I'm currently studying and was given the task of writing a query that joins 4 tables: people, goods, orders and order details. The main table, Order_details, has two columns, Order_id and Good_id, which makes it possible to have many goods in one order (e.g. the order with id = 1 has 3 rows in the Order_details table, each with a different Good_id).
The problem is that I don't know any other method (besides GROUP BY, DISTINCT or OVER()) to get only one row per order from the Order_details table, as I would get by using, for example, the DISTINCT keyword. I'm getting repeated rows for each order (one per Order_id and Good_id pair), but I don't know how to get only one row per order.
Here's my query (basically I need to select the total price of all goods in each order, but I don't think that really matters for my problem) and the schema (in case it helps).
By the way, I'm working with MySQL.
SELECT
Order_details.Order_id,
Orders.Date, People.First_name,
People.Surname,
(
SELECT SUM(Goods.Price * Order_details.Quantity)
FROM Order_details, Goods
WHERE Order_details.Good_id = Goods.Good_id
AND Order_details.Order_id = Orders.Order_id
) AS Total_price
FROM Order_details, Goods, Orders, People
WHERE Order_details.Order_id = Orders.Order_id
AND Order_details.Good_id = Goods.Good_id
AND Order_details.Order_id = Orders.Order_id
AND Orders.Person_id = People.Person_id
ORDER BY Order_id ASC;
I have tried several methods but still can't figure it out. Maybe it is somehow possible with a subquery? But I'm not sure...
(I have tried a method with UNION, but that's not the key either.)
Remove the Goods and Order_details tables from the FROM clause, along with the corresponding conditions in the WHERE clause. You are not selecting anything from them anyway, except the SUM in the subselect. The Order_id can be selected from the Orders table. The join is just causing multiple rows per order.
Also, please don't join with commas. Use the JOIN ... ON syntax. This makes it easier to see whether the join conditions are reasonable.
SELECT
Orders.Order_id,
Orders.Date,
People.First_name,
People.Surname,
(
SELECT SUM(Goods.Price * Order_details.Quantity)
FROM Order_details
JOIN Goods ON Order_details.Good_id = Goods.Good_id
WHERE Order_details.Order_id = Orders.Order_id
) AS Total_price
FROM Orders
JOIN People ON Orders.Person_id = People.Person_id
ORDER BY Orders.Order_id ASC;
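To see the difference in row counts, the sketch below runs both shapes of the query against an in-memory SQLite database from Python. SQLite stands in for MySQL, the sample data is invented, and prices are integers only to keep the arithmetic exact.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (Person_id INTEGER PRIMARY KEY, First_name TEXT, Surname TEXT);
CREATE TABLE Orders (Order_id INTEGER PRIMARY KEY, Date TEXT, Person_id INTEGER);
CREATE TABLE Goods (Good_id INTEGER PRIMARY KEY, Price INTEGER);
CREATE TABLE Order_details (Order_id INTEGER, Good_id INTEGER, Quantity INTEGER);
INSERT INTO People VALUES (1, 'Ann', 'Lee');
INSERT INTO Orders VALUES (1, '2021-01-01', 1), (2, '2021-01-02', 1);
INSERT INTO Goods VALUES (1, 10), (2, 5);
INSERT INTO Order_details VALUES (1, 1, 2), (1, 2, 1), (2, 2, 3);
""")

# Joining Order_details in the outer query yields one row per order line.
dup_rows = conn.execute("""
SELECT Orders.Order_id
FROM Order_details, Goods, Orders, People
WHERE Order_details.Order_id = Orders.Order_id
  AND Order_details.Good_id = Goods.Good_id
  AND Orders.Person_id = People.Person_id
ORDER BY Orders.Order_id
""").fetchall()

# Keeping Order_details only inside the correlated subquery yields one row per order.
rows = conn.execute("""
SELECT Orders.Order_id, Orders.Date, People.First_name, People.Surname,
       (SELECT SUM(Goods.Price * Order_details.Quantity)
        FROM Order_details
        JOIN Goods ON Order_details.Good_id = Goods.Good_id
        WHERE Order_details.Order_id = Orders.Order_id) AS Total_price
FROM Orders
JOIN People ON Orders.Person_id = People.Person_id
ORDER BY Orders.Order_id
""").fetchall()

print(len(dup_rows), "rows with the extra joins")
print(rows)
```

Order 1 has two order lines, so it shows up twice in the first result and once in the second.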
You can use row_number() for this kind of thing (it needs MySQL 8.0 or later). It assigns a row number within each group you define, and then you can just pick the rows where the value is 1.
with t as (SELECT
Order_details.Order_id,
Orders.Date, People.First_name,
People.Surname,
row_number() over (
partition by Order_details.Order_id
order by Order_details.Good_id) rn,
(
SELECT SUM(Goods.Price * Order_details.Quantity)
FROM Order_details, Goods
WHERE Order_details.Good_id = Goods.Good_id
AND Order_details.Order_id = Orders.Order_id
) AS Total_price
FROM Order_details, Goods, Orders, People
WHERE Order_details.Order_id = Orders.Order_id
AND Order_details.Good_id = Goods.Good_id
AND Order_details.Order_id = Orders.Order_id
AND Orders.Person_id = People.Person_id
ORDER BY Order_id ASC)
select * from t where rn = 1

Trying to make a new table by pulling data from two tables and keep getting 'Error: Every derived table must have its own alias' on this query

I have an 'Orders' table and a 'Records' table.
Orders table has the following columns:
order_id order_date seller order_price
Records table has the following columns:
order_id record_created_at record_log
I'm trying to pull and compile the following list of data but I keep getting an error message:
order_week
seller
total_num_orders
under100_count --this is the number of orders that were < $100
over100_count --this is the number of order that >= $100
approved --this is the number of orders that were approved by the payment platform
Here's my query:
SELECT order_week, seller, total_num_orders, under100_count, over100_count, approved
FROM (
SELECT
EXTRACT(WEEK FROM order_created_at) AS order_week,
merchant_name AS seller,
COUNT(merchant_name) AS total_num_orders,
SUM(DISTINCT total_order_price < 100) AS under100_count,
SUM(DISTINCT total_order_price >= 100) AS over100_count
FROM orders o
GROUP BY order_week, seller)
INNER JOIN (
SELECT
COUNT(DISTINCT o.order_id) AS approved
FROM records r
WHERE record_log = 'order approved'
GROUP BY order_id)
ON l.order_id = o.order_id;
What am I doing wrong?
The subquery in the join needs an alias. It also needs to return the order_id column, so that it can be joined.
inner join ( select order_id, ... from records ... group by order_id) r --> here
on r.order_id = o.order_id
I would actually write your query as:
select
extract(week from o.order_created_at) as order_week,
o.merchant_name as seller,
count(*) as total_num_orders,
sum(o.total_order_price < 100) as under100_count,
sum(o.total_order_price >= 100) as over100_count,
sum(r.approved) approved
from orders o
inner join (
select order_id, count(*) approved
from records r
where record_log = 'order approved'
group by order_id
) r on r.order_id = o.order_id
group by order_week, seller;
Rationale:
you neither want nor need distinct in the aggregate functions here; it is inefficient, and it yields wrong results, because sum(distinct <condition>) adds up only the distinct values 0 and 1 and so can never exceed 1
count(*) is more efficient than count(<expression>), so use it unless you know why you are doing otherwise
I removed an unnecessary level of nesting
If there are orders without records, you might want a left join instead.
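The points above can be tried out with a small sketch that runs from Python against an in-memory SQLite database. SQLite stands in for MySQL, the rows are invented, the week column is left out to keep the example short, and the LEFT JOIN plus COALESCE variant is shown so that orders without any record still count.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, merchant_name TEXT,
                     total_order_price INTEGER);
CREATE TABLE records (order_id INTEGER, record_log TEXT);
INSERT INTO orders VALUES (1, 'acme', 50), (2, 'acme', 150),
                          (3, 'acme', 200), (4, 'zeta', 80);
INSERT INTO records VALUES (1, 'order approved'), (2, 'order approved'),
                           (2, 'order shipped'), (4, 'order created');
""")

# A comparison evaluates to 0 or 1, so SUM(condition) counts matching rows.
rows = conn.execute("""
SELECT o.merchant_name AS seller,
       COUNT(*) AS total_num_orders,
       SUM(o.total_order_price < 100) AS under100_count,
       SUM(o.total_order_price >= 100) AS over100_count,
       SUM(COALESCE(r.approved, 0)) AS approved
FROM orders o
LEFT JOIN (SELECT order_id, COUNT(*) AS approved
           FROM records
           WHERE record_log = 'order approved'
           GROUP BY order_id) r ON r.order_id = o.order_id
GROUP BY o.merchant_name
ORDER BY o.merchant_name
""").fetchall()

# With DISTINCT, only the distinct values 0 and 1 are summed.
distinct_over100 = conn.execute("""
SELECT SUM(DISTINCT total_order_price >= 100)
FROM orders WHERE merchant_name = 'acme'
""").fetchone()[0]

print(rows)
print(distinct_over100)
```

The seller acme has two orders of 100 or more, but the DISTINCT version reports 1.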

fetching values from database which are not from specific month

I am trying to fetch the hotel id, hotel name and hotel type of hotels which have not taken any orders in the month of 'MAY 19', but I am not getting the proper output. What is wrong in my query?
select hotel_details.hotel_id,hotel_name,hotel_type
from hotel_details inner join orders on hotel_details.hotel_id=orders.hotel_id
where Month(order_date) between 1 and 4 or Month(order_date) between 6 and 12
order by hotel_id;
You can use the following, using NOT EXISTS to check if there is any order for the hotel in May 2019:
SELECT hotel_id, hotel_name, hotel_type
FROM hotel_details
WHERE NOT EXISTS (
SELECT 1
FROM orders
WHERE hotel_id = hotel_details.hotel_id
AND MONTH(order_date) = 5
AND YEAR(order_date) = 2019
)
The sub-query inside EXISTS checks whether the hotel has any row in orders dated May 2019. Putting NOT in front of EXISTS keeps only the hotels that have no order in May 2019. The sub-query is correlated with the outer query through hotel_id = hotel_details.hotel_id.
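Here is a minimal sketch, run from Python against an in-memory SQLite database, that compares the inner-join attempt from the question with NOT EXISTS. SQLite stands in for MySQL and has no MONTH() or YEAR(), so the sketch uses strftime and a date range instead; the hotels and orders are made up.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hotel_details (hotel_id INTEGER PRIMARY KEY,
                            hotel_name TEXT, hotel_type TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                     hotel_id INTEGER, order_date TEXT);
INSERT INTO hotel_details VALUES
    (1, 'A', 'budget'), (2, 'B', 'budget'), (3, 'C', 'luxury');
INSERT INTO orders VALUES
    (1, 1, '2019-05-10'), (2, 1, '2019-04-02'),
    (3, 2, '2019-04-20'), (4, 2, '2018-05-05');
""")

# The question's idea: join, then keep orders from months other than May.
# Hotel 1 is returned because of its April order; hotel 3 is lost by the join.
wrong = conn.execute("""
SELECT DISTINCT h.hotel_id
FROM hotel_details h
JOIN orders o ON o.hotel_id = h.hotel_id
WHERE CAST(strftime('%m', o.order_date) AS INTEGER) <> 5
ORDER BY h.hotel_id
""").fetchall()

# NOT EXISTS asks the right question: is there no order in May 2019 at all?
rows = conn.execute("""
SELECT hotel_id, hotel_name, hotel_type
FROM hotel_details
WHERE NOT EXISTS (
    SELECT 1 FROM orders
    WHERE orders.hotel_id = hotel_details.hotel_id
      AND orders.order_date >= '2019-05-01'
      AND orders.order_date < '2019-06-01')
ORDER BY hotel_id
""").fetchall()

print(wrong)
print(rows)
```

Hotel 2 ordered in May 2018 but not in May 2019, and hotel 3 never ordered, so both are expected in the second result.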
Here's a standard, if somewhat old-fashioned approach...
(I've assumed a column name on the orders table, but you can change it to any non-nullable orders column, if it's wrong)
SELECT d.hotel_id
, d.hotel_name
, d.hotel_type
FROM hotel_details d
LEFT
JOIN orders o
ON d.hotel_id = o.hotel_id
AND o.order_date >= '2019-05-01'
AND o.order_date < '2019-06-01'
WHERE o.id IS NULL
ORDER
BY d.hotel_id;
For next time, see: Why should I provide an MCRE for what seems to me to be a very simple SQL query?
SELECT HOTEL_ID,HOTEL_NAME,HOTEL_TYPE FROM HOTEL_DETAILS
WHERE HOTEL_ID NOT IN
(SELECT HOTEL_ID FROM ORDERS
WHERE MONTH(ORDER_DATE) = 5)
ORDER BY HOTEL_ID ASC;
In the subquery above, we obtain the HOTEL_IDs that placed an order in the month of May, using the MONTH function. The outer query receives that list, and the NOT IN condition omits every HOTEL_ID present in it, leaving the hotels that have not ordered in May. Note that this checks May of any year; add YEAR(ORDER_DATE) = 2019 to the subquery to restrict it to May 2019.
SELECT DISTINCT h.hotel_id,
h.hotel_name,
h.hotel_type
FROM hotel_details h
WHERE h.hotel_id NOT IN (SELECT od.hotel_id
FROM orders od
WHERE h.hotel_id = od.hotel_id
AND Month(od.order_date) = 5)
ORDER BY h.hotel_id ASC;
Use Nested Queries:
SELECT hotel_id, hotel_name, hotel_type
FROM hotel_details
WHERE hotel_id NOT IN (
SELECT DISTINCT hotel_id
FROM orders
WHERE order_date BETWEEN '2019-05-01' AND '2019-05-31'
)
ORDER BY hotel_id;
Explanation :
In the Inner Query, we are selecting distinct hotel IDs from the order table with orders between May 1 and May 31.
Once we have list of hotel IDs, in the outer query we can display the required columns of the hotel table which have IDs not in the list.

how to get the latest row from the table for each vendor?

Query result
Procurement table
My query is not giving me what I want:
SELECT p.procid
, p.procdate
, p.vendor
, s.sup_name
, p.creditamount
, p.image
FROM procurement as p
, supplier as s
WHERE p.vendor = s.sid
GROUP
BY sid
ORDER
BY p.procid ASC
The query gives me the first entry in the table for each vendor, while I want the last entry for each vendor in the procurement table (the required ones are highlighted in the image). Any input will be appreciated, thanks in advance.
You can use a correlated sub-query
select t2.*,s.sup_name from
(
select t.* from procurement t
where t.procid in
(
select max(procid)
from procurement t1
where t1.vendor=t.vendor
)
) as t2 join supplier as s on t2.vendor = s.sid
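A small sketch of the same idea, run from Python against an in-memory SQLite database. SQLite stands in for MySQL, the rows are invented, and "latest" is assumed to mean the highest procid per vendor, as in the query above.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (sid INTEGER PRIMARY KEY, sup_name TEXT);
CREATE TABLE procurement (procid INTEGER PRIMARY KEY, procdate TEXT,
                          vendor INTEGER, creditamount INTEGER);
INSERT INTO supplier VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO procurement VALUES
    (1, '2020-01-01', 1, 100), (2, '2020-01-05', 2, 200),
    (3, '2020-02-01', 1, 150), (4, '2020-02-10', 2, 50);
""")

# Keep a row only if its procid is the largest one for its vendor.
rows = conn.execute("""
SELECT p.procid, p.procdate, s.sup_name, p.creditamount
FROM procurement p
JOIN supplier s ON s.sid = p.vendor
WHERE p.procid = (SELECT MAX(p1.procid)
                  FROM procurement p1
                  WHERE p1.vendor = p.vendor)
ORDER BY p.procid
""").fetchall()

print(rows)
```

If the latest row should instead be decided by procdate, the subquery would compare on that column, with a tie-breaker for equal dates.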

Select from one table with count from two other tables

I'm writing a query where I select everything from a table called employee and want to count the employee_id rows in two other tables, showing the counts in 2 separate columns.
The tables:
employee [id, etc.]
report [id, employee_id, etc.]
office_report[id, employee_id, etc.]
What i did so far is:
SELECT emp.*, COUNT(rep.id ) no_of_field_reports, COUNT(of_rep.id) no_of_office_reports
FROM employee emp
LEFT JOIN report rep
ON (emp.id = rep.employee_id)
LEFT JOIN office_report of_rep
ON (emp.id = of_rep.employee_id)
WHERE emp.user_id =7 AND emp.active = 1
GROUP BY emp.id, emp.name
ORDER BY emp.name ASC
The problem is that as soon as I have reports in BOTH report tables, the count gets messed up. Say I have 16 reports in the report table and 2 in the office_report table: the counts for no_of_field_reports and no_of_office_reports both become 32.
I'm obviously missing something, but as I'm not a SQL genius I can't figure out what.
Please explain what is causing the problem, so I can learn from my mistakes and get a better understanding of these types of queries, as this is not going to be the last time.
I guess the answer will be the same for MariaDB, MySQL, and SQL in general, so I added all those tags for the sake of attention.
One possible approach, if you're after distinct counts (though you may need to adjust it to use the PK field):
SELECT emp.*,
COUNT(distinct rep.id ) no_of_field_reports, -- may need to be on a unique key instead
COUNT(distinct of_rep.id) no_of_office_reports -- may need to be on a unique key instead
FROM employee emp
LEFT JOIN report rep
ON (emp.id = rep.employee_id)
LEFT JOIN office_report of_rep
ON (emp.id = of_rep.employee_id)
WHERE emp.user_id =7 AND emp.active = 1
GROUP BY emp.id, emp.name
ORDER BY emp.name ASC
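As for why the counts become 32: each of the 16 report rows is paired with each of the 2 office_report rows for the same employee, so the joined result has 16 x 2 = 32 rows, and a plain COUNT counts all of them. The sketch below reproduces this from Python with an in-memory SQLite database; SQLite stands in for MySQL/MariaDB and the data is made up to match the numbers in the question.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE report (id INTEGER PRIMARY KEY, employee_id INTEGER);
CREATE TABLE office_report (id INTEGER PRIMARY KEY, employee_id INTEGER);
INSERT INTO employee VALUES (1, 'Ann');
""")
conn.executemany("INSERT INTO report (employee_id) VALUES (?)", [(1,)] * 16)
conn.executemany("INSERT INTO office_report (employee_id) VALUES (?)", [(1,)] * 2)

joins = """
FROM employee emp
LEFT JOIN report rep ON emp.id = rep.employee_id
LEFT JOIN office_report of_rep ON emp.id = of_rep.employee_id
"""

# Rows produced by the two joins before any grouping: 16 x 2.
joined_rows = conn.execute("SELECT COUNT(*) " + joins).fetchone()[0]

# Plain COUNT counts every joined row in which the id is not NULL.
plain = conn.execute(
    "SELECT COUNT(rep.id), COUNT(of_rep.id) " + joins + " GROUP BY emp.id"
).fetchone()

# COUNT(DISTINCT ...) collapses the repeated ids again.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT rep.id), COUNT(DISTINCT of_rep.id) "
    + joins + " GROUP BY emp.id"
).fetchone()

print(joined_rows, plain, distinct)
```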
Another approach is to compute the counts before the joins. If you're not after a distinct count, this is likely the right approach, and it offers more flexibility.
SELECT emp.*, rep.cnt no_of_field_reports, of_rep.cnt no_of_office_reports
FROM employee emp
LEFT JOIN (SELECT count(ID) cnt, employee_ID
FROM report
GROUP BY employee_ID) rep
ON (emp.id = rep.employee_id)
LEFT JOIN (SELECT count(ID) cnt, employee_ID
FROM office_report
GROUP BY employee_ID) of_rep
ON (emp.id = of_rep.employee_id)
WHERE emp.user_id = 7 AND emp.active = 1
ORDER BY emp.name ASC
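A minimal sketch of the pre-aggregation approach, run from Python against an in-memory SQLite database. SQLite stands in for MySQL/MariaDB, the rows are invented, and the COALESCE calls are an addition for illustration: without them, an employee with no reports gets NULL rather than 0 from the LEFT JOIN.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE report (id INTEGER PRIMARY KEY, employee_id INTEGER);
CREATE TABLE office_report (id INTEGER PRIMARY KEY, employee_id INTEGER);
INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO report (employee_id) VALUES (1), (1), (1);
INSERT INTO office_report (employee_id) VALUES (1);
""")

# Each derived table has at most one row per employee, so the joins
# cannot multiply rows and no outer GROUP BY is needed.
rows = conn.execute("""
SELECT emp.name,
       COALESCE(rep.cnt, 0) AS no_of_field_reports,
       COALESCE(of_rep.cnt, 0) AS no_of_office_reports
FROM employee emp
LEFT JOIN (SELECT employee_id, COUNT(id) AS cnt
           FROM report
           GROUP BY employee_id) rep ON emp.id = rep.employee_id
LEFT JOIN (SELECT employee_id, COUNT(id) AS cnt
           FROM office_report
           GROUP BY employee_id) of_rep ON emp.id = of_rep.employee_id
ORDER BY emp.name
""").fetchall()

print(rows)
```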
Or use correlated subqueries (though these are not always supported, for example when creating a materialized view from this SQL):
SELECT emp.*,
(SELECT count(ID)
FROM report rep
WHERE emp.id = rep.employee_id) Report_Cnt,
(SELECT count(ID)
FROM office_report of_rep
WHERE emp.id = of_rep.employee_id) of_Rep_Cnt
FROM employee emp
WHERE emp.user_id = 7 AND emp.active = 1
ORDER BY emp.name ASC