I have the below MySQL table,
id customer Field_Name
1 C1 A
2 C1 B
3 C1 C
4 C1 D
5 C2 A
6 C2 D
7 C2 E
9 C3 B
10 C3 F
Customer "C1" has the most fields (4): A, B, C, D.
"C2" has 3 fields: A, D, E.
"C3" has 2 fields: B, F.
Since customer "C1" has the most fields, it should be taken first when comparing the customers.
"C2" has A and D, which "C1" already has, so E is the only field unique to "C2".
"C3" has B, which "C1" already has, so F is its only unique field.
Similarly, it goes on...
I need to select each distinct field once, giving priority to the customer with the larger number of fields.
Expected Result:
id customer Field_Name
1 C1 A
2 C1 B
3 C1 C
4 C1 D
7 C2 E
10 C3 F
If you are using MySQL 8+, then the problem is fairly tractable:
WITH cte1 AS (
SELECT *, COUNT(*) OVER (PARTITION BY customer) cnt
FROM customers
),
cte2 AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY Field_Name ORDER BY cnt DESC) rn
FROM cte1
)
SELECT id, customer, Field_Name
FROM cte2
WHERE rn = 1;
In earlier versions of MySQL, it should be possible to achieve the same logic, but in general simulating ROW_NUMBER can be a pain.
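One thing worth checking before relying on the ROW_NUMBER approach: when two customers have the same number of fields, ORDER BY cnt DESC alone does not say which of them wins a shared field. Here is a minimal sketch of the same query with a tiebreaker added. It runs through Python's sqlite3 module, which is only standing in for MySQL 8 (SQLite 3.25+ is assumed for window functions), and the extra customer sort key is my addition, not part of the original query.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for MySQL 8.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, customer TEXT, Field_Name TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "C1", "A"), (2, "C1", "B"), (3, "C1", "C"), (4, "C1", "D"),
     (5, "C2", "A"), (6, "C2", "D"), (7, "C2", "E"),
     (9, "C3", "B"), (10, "C3", "F")],
)

# Same two-step logic as the answer: count fields per customer, then rank
# the owners of each field. The ", customer" sort key is an added tiebreaker
# so that the winner is deterministic when two customers have equal counts.
query = """
WITH cte1 AS (
    SELECT *, COUNT(*) OVER (PARTITION BY customer) AS cnt
    FROM customers
),
cte2 AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Field_Name
                              ORDER BY cnt DESC, customer) AS rn
    FROM cte1
)
SELECT id, customer, Field_Name
FROM cte2
WHERE rn = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The sample data has no ties, so the tiebreaker changes nothing here; it only matters once two customers with equal counts share a field.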
Without window functions, for earlier versions of MySQL, you can use NOT EXISTS:
select * from tablename t
where not exists (
select 1 from tablename tt
where tt.customer <> t.customer and tt.field_name = t.field_name and
(select count(*) from tablename where customer = tt.customer) >
(select count(*) from tablename where customer = t.customer)
)
Results:
| id | customer | field_name |
| --- | -------- | ---------- |
| 1 | C1 | A |
| 2 | C1 | B |
| 3 | C1 | C |
| 4 | C1 | D |
| 7 | C2 | E |
| 10 | C3 | F |
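To try the NOT EXISTS query without a MySQL server, here is a rough sketch using Python's sqlite3 module as a stand-in (an assumption; the SQL itself is unchanged apart from an explicit column list and ORDER BY). The extra customer C4 is hypothetical data I added to show what happens on a tie: because the comparison is a strict ">", both tied customers keep the shared field.

```python
import sqlite3

# Assumption: SQLite stands in for a pre-8.0 MySQL server (no window functions used).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER PRIMARY KEY, customer TEXT, field_name TEXT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [(1, "C1", "A"), (2, "C1", "B"), (3, "C1", "C"), (4, "C1", "D"),
     (5, "C2", "A"), (6, "C2", "D"), (7, "C2", "E"),
     (9, "C3", "B"), (10, "C3", "F")],
)

# Keep a row unless another customer owns the same field and has more fields.
QUERY = """
SELECT t.id, t.customer, t.field_name
FROM tablename t
WHERE NOT EXISTS (
    SELECT 1 FROM tablename tt
    WHERE tt.customer <> t.customer
      AND tt.field_name = t.field_name
      AND (SELECT COUNT(*) FROM tablename WHERE customer = tt.customer) >
          (SELECT COUNT(*) FROM tablename WHERE customer = t.customer)
)
ORDER BY t.id
"""

def kept_ids(connection):
    """Return the ids of the rows that survive the NOT EXISTS filter."""
    return [row[0] for row in connection.execute(QUERY)]

sample_ids = kept_ids(conn)
print(sample_ids)

# Hypothetical customer C4 with two fields, tying with C3 on field F.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [(11, "C4", "F"), (12, "C4", "G")],
)
tied_ids = kept_ids(conn)
print(tied_ids)
```

If only one row per field is wanted on ties, an extra condition (for example comparing customer names when the counts are equal) would be needed.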
I have a table like
id | start | Value | Value2 | Value3
1 | 2019-01-01 22:15:02 | A | P | C
2 | 2019-01-01 22:35:23 | B | O | G
4 | 2019-01-02 22:35:36 | C | D | H
5 | 2019-01-02 22:37:15 | D | C | F
7 | 2019-01-03 17:26:36 | C | K | M
10 | 2019-01-03 12:05:15 | D | J | L
I have a lot of records for the same day, but different time.
I need to select the latest of each day from a DateTime field.
It should return the records of IDs:
id: 2 for Jan 1
id: 5 for Jan 2nd
id: 7 for January 3rd
Tried without success:
SELECT value, value2, value3
FROM myTable AS mt
INNER JOIN (
SELECT id, MAX(start)
FROM myTable
GROUP BY start
) AS b ON mt.id = b.id
I get no errors, but the data are mixed up. It shows the latest dateTime value, but the rest of the fields (Value, value2, value3) are wrong. They don't match with the latest row.
There are several possible solutions:
SELECT mt.<columns>
FROM myTable AS mt
INNER JOIN (
SELECT DATE(start) as start_date, MAX(start) AS start
FROM myTable
GROUP BY DATE(start)
) AS b ON mt.start = b.start;
I like to use an exclusion join. Look for another row with a greater start datetime on the same date. If no such row exists, then mt must have the greatest time for a given date.
SELECT mt.<columns>
FROM myTable AS mt
LEFT OUTER JOIN myTable AS mt2
ON DATE(mt.start) = DATE(mt2.start) AND mt.start < mt2.start
WHERE mt2.start IS NULL;
You can also use a window function if you're using MySQL 8.0:
SELECT * FROM (
SELECT mt.<columns>,
ROW_NUMBER() OVER (PARTITION BY DATE(start) ORDER BY start DESC) AS rownum
FROM myTable AS mt
) AS b
WHERE b.rownum = 1;
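For anyone who wants to check the exclusion join against the sample rows from the question, here is a small sketch. It uses Python's sqlite3 module in place of MySQL (an assumption; timestamps are stored as ISO-formatted text, which compares correctly as strings), and the column list is spelled out where the answer has a placeholder.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; start is stored as ISO-8601 text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (id INTEGER PRIMARY KEY, start TEXT, "
    "Value TEXT, Value2 TEXT, Value3 TEXT)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?, ?)",
    [(1, "2019-01-01 22:15:02", "A", "P", "C"),
     (2, "2019-01-01 22:35:23", "B", "O", "G"),
     (4, "2019-01-02 22:35:36", "C", "D", "H"),
     (5, "2019-01-02 22:37:15", "D", "C", "F"),
     (7, "2019-01-03 17:26:36", "C", "K", "M"),
     (10, "2019-01-03 12:05:15", "D", "J", "L")],
)

# A row is the latest of its day when no later row exists on the same date,
# i.e. when the self-join finds no partner and mt2.start comes back NULL.
query = """
SELECT mt.id, mt.start, mt.Value, mt.Value2, mt.Value3
FROM myTable AS mt
LEFT OUTER JOIN myTable AS mt2
  ON DATE(mt.start) = DATE(mt2.start) AND mt.start < mt2.start
WHERE mt2.start IS NULL
ORDER BY mt.start
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that if two rows share the exact same latest timestamp on a day, this form returns both of them, whereas the ROW_NUMBER version returns only one.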
There are a lot of questions dealing with max values but I can't find any that relate to this issue.
ID | Company | Result
----------------------
1 | 1 | A
2 | 1 | C
3 | 1 | B <--
4 | 2 | C
5 | 2 | B
6 | 2 | A <--
7 | 3 | C
8 | 3 | A
9 | 3 | B <--
I need to output the Companies whose last Result (based on ID) was "B".
To further complicate the issue, the $query will be used like this:
select * from table where Company in ($query)
Any ideas? Thanks!
On MySQL 8+, here is a query you may try using analytic functions:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Company ORDER BY ID DESC) rn
FROM yourTable
)
SELECT ID, Company, Result
FROM cte
WHERE rn = 1 AND Result = 'B';
On earlier versions of MySQL, we can try joining to a subquery which finds the most recent record for each company:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT Company, MAX(ID) AS MAX_ID
FROM yourTable
GROUP BY Company
) t2
ON t1.Company = t2.Company AND
t1.ID = t2.MAX_ID
WHERE
t1.Result = 'B';
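Since the question says the query will be plugged into Company IN ($query), the subquery has to return a single column. Here is a rough sketch of the join-based version trimmed down to just Company and then used inside IN. Python's sqlite3 module stands in for MySQL here (an assumption), and the variable names are only for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the SQL uses no window functions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (ID INTEGER PRIMARY KEY, Company INTEGER, Result TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(1, 1, "A"), (2, 1, "C"), (3, 1, "B"),
     (4, 2, "C"), (5, 2, "B"), (6, 2, "A"),
     (7, 3, "C"), (8, 3, "A"), (9, 3, "B")],
)

# Single-column variant of the join: companies whose highest-ID row is 'B'.
company_query = """
SELECT t1.Company
FROM yourTable t1
INNER JOIN (
    SELECT Company, MAX(ID) AS MAX_ID
    FROM yourTable
    GROUP BY Company
) t2 ON t1.Company = t2.Company AND t1.ID = t2.MAX_ID
WHERE t1.Result = 'B'
"""
companies = [row[0] for row in conn.execute(company_query + " ORDER BY t1.Company")]
print(companies)

# The same text dropped into the outer query from the question.
outer_query = f"SELECT * FROM yourTable WHERE Company IN ({company_query}) ORDER BY ID"
rows = conn.execute(outer_query).fetchall()
for row in rows:
    print(row)
```

Company 2 is excluded because its last row (ID 6) has Result A, even though an earlier row was B.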
I get a result set as follows
C1 C2 C3
10 2 T
10 3 E
10 6 S
I want my SELECT query to return records that look like
C1 C2 C3
10 2 T
10 3 E
10 4
10 5
10 6 S
where there is a blank line for the missing records. Couldn't figure out the same.
Original query: select C1, C2,C3 from Table
If your MySQL version is 8.0 or later, you can use a recursive CTE to build a calendar-style table of consecutive C2 values, then outer join it to your table.
Schema (MySQL v8.0)
CREATE TABLE T(
C1 int,
C2 int,
C3 varchar(5)
);
INSERT INTO T VALUES (10,2,'T');
INSERT INTO T VALUES (10,3,'E');
INSERT INTO T VALUES (10,6,'S');
Query #1
WITH RECURSIVE CTE AS (
SELECT C1,MIN(C2) minC2,MAX(C2) maxC2
FROM T
GROUP BY C1
UNION ALL
SELECT C1,minC2 +1,maxC2
FROM CTE
WHERE minC2+1 <= maxC2
)
SELECT t1.C1,t1.minC2,t2.C3
FROM CTE t1 LEFT JOIN T t2 on t1.C1 = t2.C1 AND t1.minC2 = t2.C2
ORDER BY t1.C1,t1.minC2;
| C1 | minC2 | C3 |
| --- | ----- | --- |
| 10 | 2 | T |
| 10 | 3 | E |
| 10 | 4 | |
| 10 | 5 | |
| 10 | 6 | S |
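Here is a sketch of the recursive CTE approach that can be run locally. Python's sqlite3 module stands in for MySQL 8.0 (an assumption; SQLite also supports WITH RECURSIVE), and the rows with C1 = 20 are hypothetical data I added. The join matches on both C1 and C2, which matters as soon as the table holds more than one C1 group, because otherwise a generated C2 value would pick up C3 values from other groups.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; both support WITH RECURSIVE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (C1 INTEGER, C2 INTEGER, C3 TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [(10, 2, "T"), (10, 3, "E"), (10, 6, "S"),
     (20, 3, "X"), (20, 5, "Y")],  # the C1 = 20 rows are hypothetical
)

# Seed one row per C1 with its lowest C2, then keep adding 1 until the
# highest C2 of that group is reached. The outer join fills in C3 where
# a real row exists and leaves NULL for the gaps.
query = """
WITH RECURSIVE CTE(C1, C2, maxC2) AS (
    SELECT C1, MIN(C2), MAX(C2)
    FROM T
    GROUP BY C1
    UNION ALL
    SELECT C1, C2 + 1, maxC2
    FROM CTE
    WHERE C2 + 1 <= maxC2
)
SELECT g.C1, g.C2, t.C3
FROM CTE g
LEFT JOIN T t ON t.C1 = g.C1 AND t.C2 = g.C2
ORDER BY g.C1, g.C2
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The missing rows come back with None (NULL) in C3; wrap the column in COALESCE(t.C3, '') if an empty string is preferred.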
You can create a table of sequential numbers in your database, and then use an outer join to fill in the missing row values for C2.
It will be very useful for other queries as well, and takes very little space.
CREATE TABLE Numbers (Number INTEGER PRIMARY KEY);
INSERT INTO Numbers (Number) VALUES (1),(2),(3),(4),(5),(6) ...
And then:
SELECT T.C1, N.Number AS C2, T.C3
FROM Numbers AS N LEFT OUTER JOIN T ON T.C2 = N.Number
WHERE N.Number BETWEEN (SELECT MIN(C2) FROM T) AND (SELECT MAX(C2) FROM T)
ORDER BY C2;
HTH
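One detail to watch with the Numbers table: if C1 is selected from T, it comes back NULL on the filled-in rows, while the desired output shows 10 on every row. Below is a sketch of a variant that takes C1 and the range bounds from a grouped subquery instead. It runs through Python's sqlite3 module as a stand-in for MySQL, and filling Numbers with 1 to 10 is my assumption for the demo.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; Numbers holds 1..10 for this demo only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (C1 INTEGER, C2 INTEGER, C3 TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [(10, 2, "T"), (10, 3, "E"), (10, 6, "S")],
)
conn.execute("CREATE TABLE Numbers (Number INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Numbers VALUES (?)", [(n,) for n in range(1, 11)])

# Per C1, find the lowest and highest C2, expand that range via Numbers,
# then outer join back to T so gaps keep their C1 and get NULL for C3.
query = """
SELECT b.C1, n.Number AS C2, t.C3
FROM (SELECT C1, MIN(C2) AS lo, MAX(C2) AS hi FROM T GROUP BY C1) AS b
JOIN Numbers AS n ON n.Number BETWEEN b.lo AND b.hi
LEFT OUTER JOIN T AS t ON t.C1 = b.C1 AND t.C2 = n.Number
ORDER BY b.C1, n.Number
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The Numbers table has to cover the largest C2 value that can occur, otherwise the upper part of the range is silently cut off.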
We have a group of patients in one table and we want to match each of them to a patient exactly like them in another table - but we want pairs of patients so we cannot match a patient to more than one other patient.
Left Outer Joins add every occurrence of a match - which matches patients to every other possible match - so we need some other approach.
We see lots of answers on SO about matching to the first row - but that leaves us with a single patient being matched to multiple other patients - not a pair like we need.
Is there any possible way to create pair matches without duplication between tables in Google BigQuery? (Even if it takes multiple steps.)
ADDENDUM: Here are example tables. It would be great to see a SQL example using this.
Here is what is needed.
Example Source Tables:
Table A
PatientID Race Gender
1 A F
2 B M
3 A F
Table B
PatientID Race Gender
4 A F
5 A F
6 B M
Results Table Desired:
Table C
A.PatientID B.PatientID_Match
1 4
2 6
3 5
CLARIFICATION: Patients in Table A must match patients from Table B. (They cannot match patients in their own table.)
select min (case tab when 'A' then patientID end) as A_patientID
,min (case tab when 'B' then patientID end) as B_patientID
from (select tab
,patientID
,rank() over (order by race,gender) r
,row_number() over (partition by tab,race,gender order by patientID) rn
from ( select 'A' as tab,A.* from A
union all select 'B' as tab,B.* from B
) t
) t
group by t.r
,t.rn
-- having count(*) = 2
;
+-------------+-------------+
| a_patientid | b_patientid |
+-------------+-------------+
| 3 | 5 |
+-------------+-------------+
| 2 | 6 |
+-------------+-------------+
| 1 | 4 |
+-------------+-------------+
The main idea -
Rows from both tables are divided to groups by their attributes (race,gender).
This is being done using the RANK function.
Within each group of attributes (race,gender) the rows are being ordered, per table, by their patientid .
+-----+-----------+------+--------+ +---+----+
| tab | patientid | race | gender | | r | rn |
+-----+-----------+------+--------+ +---+----+
+-----+-----------+------+--------+ +---+----+
| A | 1 | A | F | | 1 | 1 |
+-----+-----------+------+--------+ +---+----+
| B | 4 | A | F | | 1 | 1 |
+-----+-----------+------+--------+ +---+----+
+-----+-----------+------+--------+ +---+----+
| A | 3 | A | F | | 1 | 2 |
+-----+-----------+------+--------+ +---+----+
| B | 5 | A | F | | 1 | 2 |
+-----+-----------+------+--------+ +---+----+
+-----+-----------+------+--------+ +---+----+
| A | 2 | B | M | | 5 | 1 |
+-----+-----------+------+--------+ +---+----+
| B | 6 | B | M | | 5 | 1 |
+-----+-----------+------+--------+ +---+----+
In the final phase, the rows are being divided into groups (GROUP BY) by their RANK (r) and ROW_NUMBER (rn) values, which means each group has a row from each table (or only a single row if there is no matching row from the other table).
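The pairing idea can also be sketched outside SQL, which may make the two numbering steps easier to follow. This is plain Python, not BigQuery, and the function names are made up for illustration: rows are grouped by (race, gender), ordered by patient ID within each table, and then paired position by position.

```python
from collections import defaultdict
from itertools import zip_longest

# Sample rows from the question: (patient_id, race, gender).
table_a = [(1, "A", "F"), (2, "B", "M"), (3, "A", "F")]
table_b = [(4, "A", "F"), (5, "A", "F"), (6, "B", "M")]

def group_by_attributes(rows):
    """Map (race, gender) -> patient IDs sorted ascending (the 'rn' order)."""
    groups = defaultdict(list)
    for patient_id, race, gender in rows:
        groups[(race, gender)].append(patient_id)
    for ids in groups.values():
        ids.sort()
    return groups

def pair_patients(a_rows, b_rows):
    """Pair the n-th patient of each attribute group in A with the n-th in B.

    Patients without a partner are paired with None, like the SQL query
    when the HAVING count(*) = 2 line stays commented out.
    Assumes race and gender are non-null strings so the keys can be sorted.
    """
    a_groups = group_by_attributes(a_rows)
    b_groups = group_by_attributes(b_rows)
    pairs = []
    for key in sorted(set(a_groups) | set(b_groups)):
        pairs.extend(zip_longest(a_groups.get(key, []), b_groups.get(key, [])))
    return pairs

pairs = pair_patients(table_a, table_b)
print(pairs)
```

Each patient appears in at most one pair because a position within a group is used only once per table.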
In many databases, a lateral join would be the way to go. In Google, you can use row_number(). The query looks something like this:
select p.*, pp.patient_id as other_patient_id
from patients p cross join
(select p.*,
row_number() over (partition by col1, col2, col3 order by col1) as seqnum
from patients p
) pp
where pp.seqnum = 1;
The columns in the partition by are the columns used for similarity.
SELECT
a.PatientID AS PatientID,
b.PatientID AS PatientID_Match
FROM (
SELECT PatientID, Race, Gender,
ROW_NUMBER() OVER(PARTITION BY Race, Gender) AS Pos
FROM TableA
) AS a
JOIN (
SELECT PatientID, Race, Gender,
ROW_NUMBER() OVER(PARTITION BY Race, Gender) AS Pos
FROM TableB
) AS b
ON a.Race = b.Race AND a.Gender = b.Gender AND a.Pos = b.Pos
The query above will leave out those patients from TableA that either have no match in TableB or whose potential match in TableB was already used for another patient in TableA (as per your requirement: "we want pairs of patients so we cannot match a patient to more than one other patient").
To address Dudu's comments about NULL for attributes:
SELECT
a.PatientID AS PatientID,
b.PatientID AS PatientID_Match
FROM (
SELECT
PatientID, IFNULL(Race, 'null') AS Race, IFNULL(Gender, 'null') AS Gender,
ROW_NUMBER() OVER(PARTITION BY Race, Gender) AS Pos
FROM TableA
) AS a
JOIN (
SELECT
PatientID, IFNULL(Race, 'null') AS Race, IFNULL(Gender, 'null') AS Gender,
ROW_NUMBER() OVER(PARTITION BY Race, Gender) AS Pos
FROM TableB
) AS b
ON a.Race = b.Race AND a.Gender = b.Gender AND a.Pos = b.Pos
I have a table
id value
1 a
2 a
3 b
4 b
5 b
6 c
My id is primary.
There are 2 a, 3 b and 1 c in total. For each id, I want the total number of rows that share its value.
I want this format
id value_count
1 2
2 2
3 3
4 3
5 3
6 1
Try this query:
SELECT a.id, b.valueCnt
FROM tableA a
INNER JOIN (SELECT t.value, COUNT(t.value) valueCnt
FROM tableA t GROUP BY t.value) AS b ON a.value = b.value;
OUTPUT
| ID | VALUECNT |
|----|----------|
| 1 | 2 |
| 2 | 2 |
| 3 | 3 |
| 4 | 3 |
| 5 | 3 |
| 6 | 1 |
Try This
select id, value_count from tablename as a1
join (select count(*) as value_count, value from tablename group by value) as a2
on a1.value= a2.value
I suggest you use a subselect without any joins:
SELECT
a.id,(SELECT COUNT(*) FROM tableA WHERE value = a.value) as valueCnt
FROM tableA a
You need to use a subquery (tablename stands in for your table's name, since TABLE is a reserved word in MySQL).
SELECT t.id , x.value_count
FROM tablename t
INNER JOIN
(SELECT t1.value, count(t1.id) as value_count
FROM tablename t1
Group by t1.value
) x on x.value = t.value
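To see the grouped-subquery join produce the requested output, here is a short sketch against the sample rows. Python's sqlite3 module stands in for MySQL (an assumption), and tablename is a placeholder table name.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; tablename is a placeholder name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "b"), (5, "b"), (6, "c")],
)

# Count rows per value once, then attach that count to every id.
query = """
SELECT t.id, x.value_count
FROM tablename AS t
INNER JOIN (
    SELECT value, COUNT(*) AS value_count
    FROM tablename
    GROUP BY value
) AS x ON x.value = t.value
ORDER BY t.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The derived table is computed once per distinct value, which is usually cheaper than a correlated subselect evaluated for every row.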