MySQL - condition on the joined row from the right table

I have two tables:
mysql> select * from orders;
+------+---------------------+------------+---------+
| id | created_at | foreign_id | data |
+------+---------------------+------------+---------+
| 1 | 2010-10-10 10:10:10 | 3 | order 1 |
| 4 | 2010-10-10 00:00:00 | 6 | order 4 |
| 5 | 2010-10-10 00:00:00 | 7 | order 5 |
+------+---------------------+------------+---------+
mysql> select * from activities;
+------+---------------------+------------+------+
| id | created_at | foreign_id | verb |
+------+---------------------+------------+------+
| 1 | 2010-10-10 10:10:10 | 3 | get |
| 2 | 2010-10-10 10:10:15 | 3 | set |
| 3 | 2010-10-10 10:10:20 | 3 | put |
| 4 | 2010-10-10 00:00:00 | 6 | get |
| 5 | 2010-10-11 00:00:00 | 6 | set |
| 6 | 2010-10-12 00:00:00 | 6 | put |
+------+---------------------+------------+------+
Now I need to join activities with orders on the foreign_id column: select only one activity (if one exists) for every order, such that ABS(TIMESTAMPDIFF(SECOND, orders.created_at, activities.created_at)) is minimal. That is, the order and the activity were created at approximately the same time. The desired result:
+----------+---------+---------------------+-------------+------+---------------------+
| order_id | data | order_created_at | activity_id | verb | activity_created_at |
+----------+---------+---------------------+-------------+------+---------------------+
| 1 | order 1 | 2010-10-10 10:10:10 | 1 | get | 2010-10-10 10:10:10 |
| 4 | order 4 | 2010-10-10 00:00:00 | 4 | get | 2010-10-10 00:00:00 |
| 5 | order 5 | 2010-10-10 00:00:00 | NULL | NULL | NULL |
+----------+---------+---------------------+-------------+------+---------------------+
The following query produces a set of rows that includes the desired rows. If a GROUP BY clause is added, it is not possible to control which row from activities is joined.
SELECT o.id AS order_id
, o.data AS data
, o.created_at AS order_created_at
, a.id AS activity_id
, a.verb AS verb
, a.created_at AS activity_created_at
FROM orders AS o
LEFT JOIN activities AS a ON a.foreign_id = o.foreign_id;
Is it possible to write such a query? Ideally I'd like to avoid using GROUP BY because this section is part of a larger reporting query.

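Because both tables reference some mysterious foreign key, there's potential for errors with the query below (the minimum is taken per foreign_id, so it only works if each foreign_id belongs to a single order), but it may give you a principle which you can adapt for your purposes...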
DROP TABLE IF EXISTS orders;
CREATE TABLE orders
(id INT NOT NULL PRIMARY KEY
,created_at DATETIME NOT NULL
,foreign_id INT NOT NULL
,data VARCHAR(20) NOT NULL
);
INSERT INTO orders VALUES
(1 ,'2010-10-10 10:10:10',3 ,'order 1'),
(4 ,'2010-10-10 00:00:00',6 ,'order 4'),
(5 ,'2010-10-10 00:00:00',7 ,'order 5');
DROP TABLE IF EXISTS activities;
CREATE TABLE activities
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,created_at DATETIME NOT NULL
,foreign_id INT NOT NULL
,verb VARCHAR(20) NOT NULL
);
INSERT INTO activities VALUES
(1,'2010-10-10 10:10:10',3,'get'),
(2,'2010-10-10 10:10:15',3,'set'),
(3,'2010-10-10 10:10:20',3,'put'),
(4,'2010-10-10 00:00:00',6,'get'),
(5,'2010-10-11 00:00:00',6,'set'),
(6,'2010-10-12 00:00:00',6,'put');
SELECT o.id order_id
, o.data
, o.created_at order_created_at
, a.id activity_id
, a.verb
, a.created_at activity_created_at
FROM activities a
JOIN orders o
ON o.foreign_id = a.foreign_id
JOIN
( SELECT a.foreign_id
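-- TIMESTAMPDIFF returns whole seconds; TIMEDIFF yields a TIME value capped at 838:59:59
, MIN(ABS(TIMESTAMPDIFF(SECOND, a.created_at, o.created_at))) x
FROM activities a
JOIN orders o
ON o.foreign_id = a.foreign_id
GROUP
BY a.foreign_id
) m
ON m.foreign_id = a.foreign_id
AND m.x = ABS(TIMESTAMPDIFF(SECOND, a.created_at, o.created_at))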
UNION DISTINCT
SELECT o.id
, o.data
, o.created_at
, a.id
, a.verb
, a.created_at
FROM orders o
LEFT
JOIN activities a
ON a.foreign_id = o.foreign_id
WHERE a.foreign_id IS NULL;
+----------+---------+---------------------+-------------+------+---------------------+
| order_id | data | order_created_at | activity_id | verb | activity_created_at |
+----------+---------+---------------------+-------------+------+---------------------+
| 1 | order 1 | 2010-10-10 10:10:10 | 1 | get | 2010-10-10 10:10:10 |
| 4 | order 4 | 2010-10-10 00:00:00 | 4 | get | 2010-10-10 00:00:00 |
| 5 | order 5 | 2010-10-10 00:00:00 | NULL | NULL | NULL |
+----------+---------+---------------------+-------------+------+---------------------+
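If more than one order can share a foreign_id, a per-order lookup may be safer than a per-foreign_id minimum. The following is only a sketch against the sample tables above, not part of the original answer: a correlated subquery in the join condition picks, for each order, the id of the activity with the smallest time difference. The tie-break on a2.id is an assumption, since the question doesn't say which activity should win when two are equally close.

```sql
SELECT o.id         AS order_id
     , o.data
     , o.created_at AS order_created_at
     , a.id         AS activity_id
     , a.verb
     , a.created_at AS activity_created_at
FROM orders AS o
LEFT JOIN activities AS a
  ON a.id = (
       -- closest activity for this particular order; NULL if there is none
       SELECT a2.id
       FROM activities AS a2
       WHERE a2.foreign_id = o.foreign_id
       ORDER BY ABS(TIMESTAMPDIFF(SECOND, o.created_at, a2.created_at))
              , a2.id            -- assumed tie-break: lowest activity id wins
       LIMIT 1
     );
```

This avoids GROUP BY altogether, which may make it easier to embed in a larger reporting query. The subquery runs once per order, so an index on activities (foreign_id, created_at) would probably help on larger tables.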

Related

How to select all records for only the first 50 distinct values in a column

I am trying to create a classifier model for a dataset, but I have too many distinct values for my target variable. If I run something like this:
Create or replace model `model_name`
options (model_type="AUTOML_CLASSIFIER", input_label_cols=["ORIGIN_AIRPORT"]) as
select DAY_OF_WEEK, ARRIVAL_TIME, ARRIVAL_DELAY, ORIGIN_AIRPORT
from `table_name`
limit 1000
I end up getting
Error running query
Classification model currently only supports classification with up to 50 unique labels and the label column had 111 unique labels.
So how can I select, for example, all rows that have one of the first 50 values of ORIGIN_AIRPORT?
SELECT *
FROM `TABLE_NAME` AS T1
LEFT OUTER JOIN (
SELECT DISTINCT COLUMN_NAME
FROM `TABLE_NAME`
ORDER BY COLUMN_NAME
LIMIT 50
) AS T2
ON T1.COLUMN_NAME = T2.COLUMN_NAME
The inner query fetches 50 distinct values; the outer query then matches every row against those 50 values through the join condition T1.COLUMN_NAME = T2.COLUMN_NAME and returns all the records (T2.COLUMN_NAME is NULL for rows whose value is not among the 50). To keep only the rows that match one of the 50 values, use an inner join instead of the left outer join.
Given a table of values (origin_airport), with unique identifiers (id) and date, find the minimum date for each unique value (origin_airport) to decide which N origin_airport values are to be returned.
Return all rows which match the first 3 unique origin_airport values (densely ranked, by min(date) per origin_airport).
Updated: to use columns that more closely match the model, with origin_airport and a date column for ordering.
Full working test case
The test data:
CREATE TABLE airportlogs (
origin_airport int
, id int primary key auto_increment
, date date DEFAULT NULL
);
INSERT INTO airportlogs (origin_airport) VALUES
( 1 )
, ( 1 )
, ( 8 )
, ( 8 )
, ( 8 )
, ( 7 )
, ( 7 )
, ( 6 )
, ( 5 )
, ( 4 )
, ( 3 )
, ( 3 )
, ( 7 )
, ( 7 )
, ( 1 )
, ( 8 )
, ( 3 )
, ( 1 )
;
-- Create some dates to use for ordering.
-- Ordering can be as complicated as we need.
UPDATE airportlogs SET date = current_date + INTERVAL +id DAY;
-- Intermediate calculation to show the MIN(date) per origin_airport
WITH nvals (origin_airport, mdate) AS (
SELECT origin_airport, MIN(date) AS mdate FROM airportlogs GROUP BY origin_airport
)
SELECT *
FROM nvals
ORDER BY mdate
;
+----------------+------------+
| origin_airport | mdate |
+----------------+------------+
| 1 | 2021-08-05 |
| 8 | 2021-08-07 |
| 7 | 2021-08-10 |
| 6 | 2021-08-12 |
| 5 | 2021-08-13 |
| 4 | 2021-08-14 |
| 3 | 2021-08-15 |
+----------------+------------+
-- Calculation of ordered rank for the unique origin_airport values
-- by MIN(date) per origin_airport.
WITH nvals0 (origin_airport, date, mdate) AS (
SELECT origin_airport
, date
, MIN(date) OVER (PARTITION BY origin_airport) AS mdate
FROM airportlogs
)
, nvals (origin_airport, date, mdate, r) AS (
SELECT origin_airport
, date
, mdate
, DENSE_RANK() OVER (ORDER BY mdate) AS r
FROM nvals0
)
SELECT *
FROM nvals
ORDER BY r, date
;
Result:
+----------------+------------+------------+---+
| origin_airport | date | mdate | r |
+----------------+------------+------------+---+
| 1 | 2021-08-05 | 2021-08-05 | 1 |
| 1 | 2021-08-06 | 2021-08-05 | 1 |
| 1 | 2021-08-19 | 2021-08-05 | 1 |
| 1 | 2021-08-22 | 2021-08-05 | 1 |
| 8 | 2021-08-07 | 2021-08-07 | 2 |
| 8 | 2021-08-08 | 2021-08-07 | 2 |
| 8 | 2021-08-09 | 2021-08-07 | 2 |
| 8 | 2021-08-20 | 2021-08-07 | 2 |
| 7 | 2021-08-10 | 2021-08-10 | 3 |
| 7 | 2021-08-11 | 2021-08-10 | 3 |
| 7 | 2021-08-17 | 2021-08-10 | 3 |
| 7 | 2021-08-18 | 2021-08-10 | 3 |
| 6 | 2021-08-12 | 2021-08-12 | 4 |
| 5 | 2021-08-13 | 2021-08-13 | 5 |
| 4 | 2021-08-14 | 2021-08-14 | 6 |
| 3 | 2021-08-15 | 2021-08-15 | 7 |
| 3 | 2021-08-16 | 2021-08-15 | 7 |
| 3 | 2021-08-21 | 2021-08-15 | 7 |
+----------------+------------+------------+---+
The final solution:
WITH min_date (origin_airport, date, mdate) AS (
SELECT origin_airport
, date
, MIN(date) OVER (PARTITION BY origin_airport) AS mdate
FROM airportlogs
)
, ranks (origin_airport, date, mdate, r) AS (
SELECT origin_airport
, date
, mdate
, DENSE_RANK() OVER (ORDER BY mdate) AS r
FROM min_date
)
SELECT *
FROM ranks
WHERE r <= 3
ORDER BY r, date
;
The final result:
+----------------+------------+------------+---+
| origin_airport | date | mdate | r |
+----------------+------------+------------+---+
| 1 | 2021-08-05 | 2021-08-05 | 1 |
| 1 | 2021-08-06 | 2021-08-05 | 1 |
| 1 | 2021-08-19 | 2021-08-05 | 1 |
| 1 | 2021-08-22 | 2021-08-05 | 1 |
| 8 | 2021-08-07 | 2021-08-07 | 2 |
| 8 | 2021-08-08 | 2021-08-07 | 2 |
| 8 | 2021-08-09 | 2021-08-07 | 2 |
| 8 | 2021-08-20 | 2021-08-07 | 2 |
| 7 | 2021-08-10 | 2021-08-10 | 3 |
| 7 | 2021-08-11 | 2021-08-10 | 3 |
| 7 | 2021-08-17 | 2021-08-10 | 3 |
| 7 | 2021-08-18 | 2021-08-10 | 3 |
+----------------+------------+------------+---+
There are a number of other solutions.
The poster didn't say how the "first" values should be ordered, but with the window function behavior above, any ordering is easy to specify.
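One of those other solutions, sketched here against the same airportlogs test table, avoids window functions by joining to a derived table that selects the first N airports. The alias first_airports and the tie-break on origin_airport are assumptions made for the example.

```sql
SELECT a.origin_airport
     , a.id
     , a.date
FROM airportlogs AS a
JOIN (
       -- the 3 airports with the earliest MIN(date)
       SELECT origin_airport
       FROM airportlogs
       GROUP BY origin_airport
       ORDER BY MIN(date), origin_airport   -- assumed tie-break
       LIMIT 3
     ) AS first_airports
  ON first_airports.origin_airport = a.origin_airport
ORDER BY a.date;
```

On the test data this should return the rows for origin_airport 1, 8 and 7, like the ranked query. The two approaches differ on ties: DENSE_RANK() keeps every airport that shares a rank within the cutoff, whereas LIMIT 3 always stops at exactly three airports.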

Select top 2 scorers from each combination of 3 columns in MySQL

I have following tables and data:
player_scores
+----+-----------+---------------------+-------+
| id | player_id | created_at | score |
+----+-----------+---------------------+-------+
| 1 | 1 | 2020-01-01 01:00:00 | 20 |
| 2 | 1 | 2020-01-02 01:00:00 | 30 |
| 3 | 2 | 2020-01-01 01:00:00 | 20 |
| 4 | 3 | 2020-01-01 01:00:00 | 20 |
| 5 | 4 | 2020-05-01 01:00:00 | 40 |
| 6 | 5 | 2020-01-02 01:00:00 | 20 |
| 7 | 6 | 2020-01-01 01:00:00 | 20 |
| 8 | 7 | 2020-01-03 01:00:00 | 20 |
| 9 | 1 | 2020-03-01 01:00:00 | 20 |
+----+-----------+---------------------+-------+
players
+----+---------+-------------+----------+---------------------+---------+
| id | city_id | category_id | group_id | created_at | name |
+----+---------+-------------+----------+---------------------+---------+
| 1 | 1 | 1 | 1 | 2020-01-01 01:00:00 | Player1 |
| 2 | 1 | 2 | 1 | 2020-01-02 01:00:00 | Player2 |
| 3 | 2 | 2 | 1 | 2020-01-01 01:00:00 | Player3 |
| 4 | 2 | 1 | 1 | 2020-05-01 01:00:00 | Player4 |
| 5 | 3 | 1 | 1 | 2020-01-02 01:00:00 | Player5 |
| 6 | 4 | 2 | 1 | 2020-01-01 01:00:00 | Player6 |
| 7 | 3 | 1 | 1 | 2020-01-01 01:00:00 | Player7 |
| 8 | 4 | 2 | 1 | 2020-01-01 01:00:00 | Player8 |
+----+---------+-------------+----------+---------------------+---------+
cities
+----+------------+------------+
| id | country_id | name |
+----+------------+------------+
| 1 | 1 | London |
| 2 | 2 | Sydney |
| 3 | 2 | Melbourne |
| 4 | 3 | Toronto |
+----+------------+------------+
countries
+----+-----------+
| id | name |
+----+-----------+
| 1 | England |
| 2 | Australia |
| 3 | Canada |
+----+-----------+
categories
+----+------------+
| id | name |
+----+------------+
| 1 | Category 1 |
| 2 | Category 2 |
+----+------------+
groups
+----+---------+
| id | name |
+----+---------+
| 1 | Group 1 |
| 2 | Group 2 |
+----+---------+
SQL code to create tables and data:
CREATE TABLE players
(
id INT UNSIGNED auto_increment PRIMARY KEY,
city_id INT UNSIGNED NOT NULL,
category_id INT UNSIGNED NOT NULL,
group_id INT UNSIGNED NOT NULL,
created_at DATETIME NOT NULL,
name VARCHAR(255) NOT NULL
);
CREATE TABLE player_scores
(
id INT UNSIGNED auto_increment PRIMARY KEY,
player_id INT UNSIGNED NOT NULL,
created_at DATETIME NOT NULL,
score INT(10) NOT NULL
);
CREATE TABLE cities
(
id INT UNSIGNED auto_increment PRIMARY KEY,
country_id INT UNSIGNED NOT NULL,
name VARCHAR(255) NOT NULL
);
CREATE TABLE countries
(
id INT UNSIGNED auto_increment PRIMARY KEY,
name VARCHAR(255) NOT NULL
);
CREATE TABLE categories
(
id INT UNSIGNED auto_increment PRIMARY KEY,
name VARCHAR(255) NOT NULL
);
CREATE TABLE `groups`
(
id INT UNSIGNED auto_increment PRIMARY KEY,
name VARCHAR(255) NOT NULL
);
INSERT INTO players (id, city_id, category_id, group_id, created_at, name) VALUES (1, 1, 1, 1, '2020-01-01 01:00:00', 'Player1'),(2, 1, 2, 1, '2020-01-02 01:00:00', 'Player2'),(3, 2, 2, 1, '2020-01-01 01:00:00', 'Player3'),(4, 2, 1, 1, '2020-05-01 01:00:00', 'Player4'),(5, 3, 1, 1, '2020-01-02 01:00:00', 'Player5'),(6, 4, 2, 1, '2020-01-01 01:00:00', 'Player6'),(7, 3, 1, 1, '2020-01-01 01:00:00', 'Player7'),(8, 4, 2, 1, '2020-01-01 01:00:00', 'Player8');
INSERT INTO player_scores (id, player_id, created_at, score) VALUES (1, 1, '2020-01-01 01:00:00', 20), (2, 1, '2020-01-02 01:00:00', 30),(3, 2, '2020-01-01 01:00:00', 20),(4, 3, '2020-01-01 01:00:00', 20),(5, 4, '2020-05-01 01:00:00', 40),(6, 5, '2020-01-02 01:00:00', 20),(7, 6, '2020-01-01 01:00:00', 20),(8, 7, '2020-01-03 01:00:00', 20),(9, 1, '2020-03-01 01:00:00', 20);
INSERT INTO cities (id, country_id, name) VALUES (1,1,'London'), (2,2,'Sydney'), (3,2,'Melbourne'), (4,3,'Toronto');
INSERT INTO countries (id, name) VALUES (1,'England'),(2,'Australia'),(3,'Canada');
INSERT INTO categories (id, name) VALUES (1,'Category 1'),(2,'Category 2');
INSERT INTO `groups` (id, name) VALUES (1,'Group 1'),(2,'Group 2');
The relationship between 'players' and 'player_scores' is one-to-many. Some players may have no scores at all.
I have to return a single list of the top 2 scorers from each combination of country, category and group. If there are no scores at all for a combination, no scorers are selected for that combination. If there is only one scorer within a combination, only that scorer is selected. A player who does not have any scores yet is not selected.
If 2 or more players have the same score within a combination, the earliest created player (created_at field in the 'players' table) should be selected.
I use MySQL 5.7, so I cannot use window functions.
So, the result from the testing data above should be:
+-----------+--------------+---------------+------------+---------------------+---------------------+--------------------------+
| player.id | country.name | category.name | group.name | player.created_at | player_scores.score | player_scores.created_at |
+-----------+--------------+---------------+------------+---------------------+---------------------+--------------------------+
| 1 | England | Category 1 | Group 1 | 2020-01-01 01:00:00 | 20 | 2020-03-01 01:00:00 |
| 2 | England | Category 2 | Group 1 | 2020-01-02 01:00:00 | 20 | 2020-01-01 01:00:00 |
| 3 | Australia | Category 2 | Group 1 | 2020-01-01 01:00:00 | 20 | 2020-01-01 01:00:00 |
| 4 | Australia | Category 1 | Group 1 | 2020-05-01 01:00:00 | 40 | 2020-05-01 01:00:00 |
| 7 | Australia | Category 1 | Group 1 | 2020-01-01 01:00:00 | 20 | 2020-01-03 01:00:00 |
| 6 | Canada | Category 2 | Group 1 | 2020-01-01 01:00:00 | 20 | 2020-01-01 01:00:00 |
+-----------+--------------+---------------+------------+---------------------+---------------------+--------------------------+
So far, I have this query, but it is obviously far from a solution. I have searched for hints, but could not find any so far:
SELECT players.*, player_scores.*, cities.*, countries.*, categories.*, groups.*
FROM players
LEFT JOIN cities
ON players.city_id = cities.id
LEFT JOIN countries
ON cities.country_id = country.id
LEFT JOIN categories
ON players.category_id = categories.id
LEFT JOIN groups
ON players.group_id = groups.id
LEFT JOIN player_scores
ON player_scores.player_id = players.id
AND player_scores.id IN (
SELECT MAX(ps.id)
FROM player_scores AS ps
JOIN players AS p
ON p.id = ps.player_id
GROUP BY p.id
)
INNER JOIN (
SELECT DISTINCT countries.id, groups.id, categories.id
FROM players
LEFT JOIN cities
ON players.city_id = cities.id
LEFT JOIN countries
ON cities.country_id = country.id
LEFT JOIN groups
ON players.group_id = groups.id
LEFT JOIN categories
ON players.category_id = categories.id
INNER JOIN player_scores
ON player_scores.player_id = players.id
WHERE player_scores.id IN (
SELECT MAX(ps.id)
FROM player_scores AS ps
JOIN players AS p
ON p.id = ps.player_id
GROUP BY p.id
)
GROUP BY countries.id, categories.id, groups.id
HAVING MAX(player_scores.score) > 0
) players2
ON countries.id = players2.country_id
AND categories.id = players2.category_id
AND groups.id = players2.group_id;
Any help will be highly appreciated.
UPDATE: Provided testing data and result table.
To recap, am I right in thinking that this is the intermediate result, from which we have to select a subset of results based upon the stated criteria?
SELECT p.name
, s.score
, c.name city
, x.name country
, y.name category
, g.name player_group
, p.created_at
FROM players p
JOIN player_scores s
ON s.player_id = p.id
JOIN cities c
ON c.id = p.city_id
JOIN countries x
ON x.id = c.country_id
JOIN categories y
ON y.id = p.category_id
JOIN `groups` g
ON g.id = p.group_id;
+---------+-------+-----------+-----------+------------+--------------+---------------------+
| name | score | city | country | category | player_group | created_at |
+---------+-------+-----------+-----------+------------+--------------+---------------------+
| Player1 | 20 | London | England | Category 1 | Group 1 | 2020-01-01 01:00:00 |
| Player1 | 30 | London | England | Category 1 | Group 1 | 2020-01-01 01:00:00 |
| Player4 | 40 | Sydney | Australia | Category 1 | Group 1 | 2020-05-01 01:00:00 |
| Player5 | 20 | Melbourne | Australia | Category 1 | Group 1 | 2020-01-02 01:00:00 |
| Player7 | 20 | Melbourne | Australia | Category 1 | Group 1 | 2020-01-01 01:00:00 |
| Player1 | 20 | London | England | Category 1 | Group 1 | 2020-01-01 01:00:00 |
| Player2 | 20 | London | England | Category 2 | Group 1 | 2020-01-02 01:00:00 |
| Player3 | 20 | Sydney | Australia | Category 2 | Group 1 | 2020-01-01 01:00:00 |
| Player6 | 20 | Toronto | Canada | Category 2 | Group 1 | 2020-01-01 01:00:00 |
+---------+-------+-----------+-----------+------------+--------------+---------------------+
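Starting from that intermediate result, one way to get the top 2 per combination on MySQL 5.7 is to count, for every player, how many players in the same combination rank ahead of them and keep those with fewer than two. This is a sketch under two assumptions that the expected result suggests but the question does not state: a player's score is their most recent player_scores row (highest id), and the final tie-break after players.created_at is the lower player id.

```sql
SELECT p.id         AS player_id
     , co.name      AS country
     , ca.name      AS category
     , g.name       AS player_group
     , p.created_at AS player_created_at
     , s.score
     , s.created_at AS score_created_at
FROM players p
JOIN cities ci     ON ci.id = p.city_id
JOIN countries co  ON co.id = ci.country_id
JOIN categories ca ON ca.id = p.category_id
JOIN `groups` g    ON g.id  = p.group_id
JOIN player_scores s
  ON s.player_id = p.id
 AND s.id = (SELECT MAX(x.id) FROM player_scores x WHERE x.player_id = p.id)
WHERE (
        -- number of players in the same country/category/group ranked ahead of p
        SELECT COUNT(*)
        FROM players p2
        JOIN cities ci2 ON ci2.id = p2.city_id
        JOIN player_scores s2
          ON s2.player_id = p2.id
         AND s2.id = (SELECT MAX(y.id) FROM player_scores y WHERE y.player_id = p2.id)
        WHERE ci2.country_id = ci.country_id
          AND p2.category_id = p.category_id
          AND p2.group_id    = p.group_id
          AND (   s2.score > s.score
               OR (s2.score = s.score AND p2.created_at < p.created_at)
               OR (s2.score = s.score AND p2.created_at = p.created_at AND p2.id < p.id))
      ) < 2
ORDER BY co.name, ca.name, g.name, s.score DESC, p.created_at;
```

On the sample data this should keep players 1, 2, 3, 4, 6 and 7, dropping player 5 (third within Australia / Category 1 / Group 1) and player 8 (no scores). The correlated count runs once per player, so it may be slow on large tables.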

How to get last added field from MySQL database JOIN

My database has two tables
MariaDB [testnotes]> describe contactstbl;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id | int(6) | YES | | NULL | |
| name | varchar(30) | YES | | NULL | |
| phone | varchar(20) | YES | | NULL | |
| email | varchar(40) | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
MariaDB [testnotes]> describe notestbl;
+-----------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+----------+------+-----+---------+-------+
| id | int(6) | YES | | NULL | |
| notes | blob | YES | | NULL | |
| dateadded | datetime | YES | | NULL | |
+-----------+----------+------+-----+---------+-------+
I want a query that shows the most recent note in the notestbl table for each contact ID.
contactstbl has about 100 records, and I want to show them all, even those without notes.
MariaDB [testnotes]> select * from contactstbl;
+------+------+-------+--------+
| id | name | phone | email |
+------+------+-------+--------+
| 1 | fran | 12335 | gf#g.m |
| 2 | tony | 45355 | ck#g.m |
| 3 | samm | 46545 | fs#g.m |
+------+------+-------+--------+
MariaDB [testnotes]> select * from notestbl;
+------+------------------+---------------------+
| id | notes | dateadded |
+------+------------------+---------------------+
| 1 | 2 days ago notes | 2020-01-12 00:00:00 |
| 3 | 5 days ago notes | 2020-01-09 00:00:00 |
| 3 | 3 days ago notes | 2020-01-11 00:00:00 |
| 1 | 1 days ago notes | 2020-01-13 00:00:00 |
| 1 | 3 days ago notes | 2020-01-11 00:00:00 |
+------+------------------+---------------------+
5 rows in set (0.00 sec)
I have tried a couple different queries and just cannot seem to get it right.
SELECT c.id,c.name,c.email,n.id,n.dateadded,n.notes FROM contactstbl c left join notestbl n using(id) GROUP BY c.id ORDER BY n.dateadded ASC;
Which is very close.
+------+------+--------+------+---------------------+------------------+
| id | name | email | id | dateadded | notes |
+------+------+--------+------+---------------------+------------------+
| 2 | tony | ck#g.m | NULL | NULL | NULL |
| 3 | samm | fs#g.m | 3 | 2020-01-09 00:00:00 | 5 days ago notes |
| 1 | fran | gf#g.m | 1 | 2020-01-12 00:00:00 | 2 days ago notes |
+------+------+--------+------+---------------------+------------------+
What I want to see is:
+------+------+--------+------+---------------------+------------------+
| id | name | email | id | dateadded | notes |
+------+------+--------+------+---------------------+------------------+
| 2 | tony | ck#g.m | NULL | NULL | NULL |
| 3 | samm | fs#g.m | 3 | 2020-01-11 00:00:00 | 3 days ago notes |
| 1 | fran | gf#g.m | 1 | 2020-01-13 00:00:00 | 1 days ago notes |
+------+------+--------+------+---------------------+------------------+
Just use a subquery in the SELECT clause:
SELECT
c.id,
c.name,
c.email,
(SELECT n.id FROM notestbl n WHERE n.id=c.id ORDER BY n.dateadded DESC LIMIT 1) nid,
(SELECT n.dateadded FROM notestbl n WHERE n.id=c.id ORDER BY n.dateadded DESC LIMIT 1) ndateadded,
(SELECT n.notes FROM notestbl n WHERE n.id=c.id ORDER BY n.dateadded DESC LIMIT 1) nnotes
FROM
contactstbl c
GROUP BY c.id
ORDER BY ndateadded ASC;
Result:
MariaDB [test]> SELECT
-> c.id,
-> c.name,
-> c.email,
-> (SELECT n.id FROM notestbl n WHERE n.id=c.id ORDER BY n.dateadded DESC LIMIT 1) nid,
-> (SELECT n.dateadded FROM notestbl n WHERE n.id=c.id ORDER BY n.dateadded DESC LIMIT 1) ndateadded,
-> (SELECT n.notes FROM notestbl n WHERE n.id=c.id ORDER BY n.dateadded DESC LIMIT 1) nnotes
-> FROM
-> contactstbl c
-> GROUP BY c.id
-> ORDER BY ndateadded ASC;
+----+------+--------+------+---------------------+------------------+
| id | name | email | nid | ndateadded | nnotes |
+----+------+--------+------+---------------------+------------------+
| 2 | tony | ck#g.m | NULL | NULL | NULL |
| 3 | sam | fs#g. | 3 | 2020-01-11 00:00:00 | 3 days ago notes |
| 1 | fran | gf#g.m | 1 | 2020-01-13 00:00:00 | 1 days ago notes |
+----+------+--------+------+---------------------+------------------+
3 rows in set (0.07 sec)
SELECT C.ID,
C.NAME,
C.EMAIL,
N1.ID,
N1.DATEADDED,
N1.NOTES
FROM contactstbl C
LEFT JOIN notestbl N1 USING(ID)
LEFT JOIN notestbl N2 ON N1.ID = N2.ID
AND N1.DATEADDED < N2.DATEADDED
WHERE N2.ID IS NULL
ORDER BY N1.DATEADDED;
Also try some of the ideas from this question: how do I query sql for a latest record date for each user
First, I think that you should change the schema of your notestbl table, as it doesn't have its own id field but instead relies purely on the id of the contactstbl table. This is bad design and should be normalised to save you pain in the future :)
I'd recommend it is changed to something like this:
mysql> select * from notestbl;
+------+------------+------------------+---------------------+
| id | contact_id | notes | dateadded |
+------+------------+------------------+---------------------+
| 1 | 1 | 2 days ago notes | 2020-01-12 00:00:00 |
| 2 | 3 | 5 days ago notes | 2020-01-09 00:00:00 |
| 3 | 3 | 3 days ago notes | 2020-01-11 00:00:00 |
| 4 | 1 | 1 days ago notes | 2020-01-13 00:00:00 |
| 5 | 1 | 3 days ago notes | 2020-01-11 00:00:00 |
+------+------------+------------------+---------------------+
5 rows in set (0.00 sec)
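A definition for that revised table might look like the sketch below. The column types are taken from the DESCRIBE output in the question; the AUTO_INCREMENT key, the NOT NULL constraint and the index name idx_contact_date are assumptions added for illustration.

```sql
-- Hypothetical definition of the revised notestbl
CREATE TABLE notestbl (
    id         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- the note's own id
    contact_id INT NOT NULL,                             -- refers to contactstbl.id
    notes      BLOB,
    dateadded  DATETIME,
    KEY idx_contact_date (contact_id, dateadded)         -- helps "latest note per contact" lookups
);
```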
Then you can use this single line query to get the result you're after:
select c.id, c.name, c.email, n.id, n.dateadded, n.notes from contactstbl c left join (select t1.id, t1.contact_id, t1.dateadded, t1.notes from notestbl t1, (select contact_id, max(dateadded) as maxdate from notestbl group by contact_id) t2 where t1.contact_id=t2.contact_id and t1.dateadded=t2.maxdate) n on c.id=n.contact_id;
+------+------+--------+------+---------------------+------------------+
| id | name | email | id | dateadded | notes |
+------+------+--------+------+---------------------+------------------+
| 1 | fran | gf#g.m | 4 | 2020-01-13 00:00:00 | 1 days ago notes |
| 2 | tony | ck#g.m | NULL | NULL | NULL |
| 3 | samm | fs#g.m | 3 | 2020-01-11 00:00:00 | 3 days ago notes |
+------+------+--------+------+---------------------+------------------+
3 rows in set (0.00 sec)
A more visually pleasing representation of the query:
select c.id,
c.name,
c.email,
n.id,
n.dateadded,
n.notes
from contactstbl c
left join (select t1.id,
t1.contact_id,
t1.dateadded,
t1.notes
from notestbl t1,
(select contact_id, max(dateadded) as maxdate from notestbl group by contact_id) t2
where t1.contact_id=t2.contact_id
and t1.dateadded=t2.maxdate) n
on c.id=n.contact_id;
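On servers with window functions (MariaDB 10.2 or later, MySQL 8.0 or later), the same result can probably be had with ROW_NUMBER(). This sketch assumes the revised notestbl with a contact_id column shown above. The aliases note_id and rn are made up for the example.

```sql
SELECT c.id
     , c.name
     , c.email
     , n.id AS note_id
     , n.dateadded
     , n.notes
FROM contactstbl AS c
LEFT JOIN (
       -- number each contact's notes from newest to oldest
       SELECT id
            , contact_id
            , dateadded
            , notes
            , ROW_NUMBER() OVER (PARTITION BY contact_id
                                 ORDER BY dateadded DESC, id DESC) AS rn
       FROM notestbl
     ) AS n
  ON n.contact_id = c.id
 AND n.rn = 1;
```

One difference from the MAX(dateadded) join: if a contact has two notes with the same latest dateadded, the MAX version returns both rows, while ROW_NUMBER() returns exactly one (here the one with the higher id).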

Find an exact duplicate by join table field

I'm writing a query to find an exact duplicate of an order by its product IDs.
The conditions to find a duplicate are:
1) Order has the same product count.
2) All product IDs are the same.
Tried something like this, but it didn't work:
SELECT
order.*,
count(same_products.id),
count(all_products.id)
FROM orders
LEFT JOIN products AS all_products ON all_products.order_id = orders.id
LEFT JOIN products AS same_products
ON same_products.order_id = orders.id AND same_products.id IN (30868, 30862)
GROUP BY orders.id
HAVING count(same_products.id) = 4 AND count(all_products.id = 4)
If you want to count duplicated rows, you should avoid the all-columns (*) selector: if any of your columns holds incremental values, it prevents you from finding the duplicates.
SELECT
orders.id,
count(same_products.id),
count(all_products.id)
FROM orders
LEFT JOIN products AS all_products ON all_products.order_id = orders.id
LEFT JOIN products AS same_products
ON same_products.order_id = orders.id AND same_products.id IN (30868, 30862)
GROUP BY orders.id
HAVING count(same_products.id) >1 OR count(all_products.id )> 1
For duplicated rows you should check for count > 1 (for both counts).
Also be careful with count(all_products.id = 4): if you need to filter on this value, add it to the ON condition of the related table, e.g.:
SELECT
orders.id,
count(same_products.id),
count(all_products.id)
FROM orders
LEFT JOIN products AS all_products ON all_products.order_id = orders.id and all_products.id =4
LEFT JOIN products AS same_products
ON same_products.order_id = orders.id AND same_products.id IN (30868, 30862)
GROUP BY orders.id
HAVING count(same_products.id) >1
I am not clear what you really mean by duplicate; I assume from your description that it means 2 orders with the same products. This seems a little simplistic. For example, given
MariaDB [sandbox]> select * from orders;
+------+---------------------+-------------+
| id | order_created | customer_id |
+------+---------------------+-------------+
| 1 | 2016-01-01 00:00:00 | 1 |
| 2 | 2016-02-01 00:00:00 | 1 |
| 3 | 2016-03-01 00:00:00 | 1 |
| 4 | 2016-01-01 00:00:00 | 2 |
| 5 | 2016-02-01 00:00:00 | 2 |
| 6 | 2016-01-01 00:00:00 | 3 |
| 10 | 2016-12-01 00:00:00 | 4 |
+------+---------------------+-------------+
7 rows in set (0.00 sec)
MariaDB [sandbox]> select * from order_details;
+----+---------+-----------+------+
| id | orderid | productid | qty |
+----+---------+-----------+------+
| 1 | 1 | 1213 | 10 |
| 2 | 1 | 9999 | 10 |
| 3 | 2 | 8888 | 10 |
| 4 | 3 | 1213 | 10 |
| 5 | 4 | 2222 | 10 |
| 6 | 5 | 9999 | 30 |
| 7 | 5 | 1213 | 30 |
| 8 | 6 | 9999 | 30 |
| 9 | 6 | 1213 | 30 |
+----+---------+-----------+------+
9 rows in set (0.00 sec)
select orders1.*,orders2.*,t.*
from
(select * from
(
select orderid o1orderid,group_concat(productid order by productid) o1grp,group_concat(qty order by productid) qty1
from order_details
group by orderid
) o1
join
(select orderid o2orderid,group_concat(productid order by productid) o2grp, group_concat(qty order by productid) qty2
from order_details
group by orderid
) o2
on o2grp = o1grp and qty2 = qty1 and o2orderid > o1orderid
) t
join orders orders1 on t.o1orderid = orders1.id
join orders orders2 on t.o2orderid = orders2.id
returns
+------+---------------------+-------------+------+---------------------+-------------+-----------+-----------+-------+-----------+-----------+-------+
| id | order_created | customer_id | id | order_created | customer_id | o1orderid | o1grp | qty1 | o2orderid | o2grp | qty2 |
+------+---------------------+-------------+------+---------------------+-------------+-----------+-----------+-------+-----------+-----------+-------+
| 5 | 2016-02-01 00:00:00 | 2 | 6 | 2016-01-01 00:00:00 | 3 | 5 | 1213,9999 | 30,30 | 6 | 1213,9999 | 30,30 |
+------+---------------------+-------------+------+---------------------+-------------+-----------+-----------+-------+-----------+-----------+-------+
1 row in set (0.03 sec)
But the customer numbers are different, so you may not want to treat these two orders as duplicates.
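If only the two conditions from the question matter (same product count, same product IDs), the comparison can also be written without GROUP_CONCAT. This is a sketch against the order_details sample table above, assuming each (orderid, productid) pair appears at most once. Quantities and customers are deliberately ignored.

```sql
SELECT a.orderid AS order_a
     , b.orderid AS order_b
FROM order_details a
JOIN order_details b
  ON b.productid = a.productid
 AND b.orderid   > a.orderid          -- report each pair once
GROUP BY a.orderid, b.orderid
-- the number of shared products must equal the size of both orders
HAVING COUNT(*) = (SELECT COUNT(*) FROM order_details x WHERE x.orderid = a.orderid)
   AND COUNT(*) = (SELECT COUNT(*) FROM order_details y WHERE y.orderid = b.orderid);
```

On the sample rows this should pair orders 1, 5 and 6 with each other, since all three contain exactly products 1213 and 9999. It also sidesteps the length limit that GROUP_CONCAT is subject to (group_concat_max_len).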

Help in forming a MySQL query to find free (available) venues/resources for a given date range

I have tables & data like this:
venues : id, name
+----+---------+
| id | name |
+----+---------+
| 1 | venue 1 |
| 2 | venue 2 |
+----+---------+
event_dates : id, event_id, event_from_datetime, event_to_datetime, venue_id
+----+----------+---------------------+---------------------+----------+
| id | event_id | event_from_datetime | event_to_datetime | venue_id |
+----+----------+---------------------+---------------------+----------+
| 1 | 1 | 2009-12-05 00:00:00 | 2009-12-07 00:00:00 | 1 |
| 2 | 1 | 2009-12-09 00:00:00 | 2009-12-12 00:00:00 | 1 |
| 3 | 1 | 2009-12-15 00:00:00 | 2009-12-20 00:00:00 | 2 |
+----+----------+---------------------+---------------------+----------+
This is my requirement: I want venues that will be free on 2009-12-06 00:00:00
i.e.
I should get
|venue_id|
|2 |
Currently I'm having the following query,
select ven.id , evtdt.event_from_datetime, evtdt.event_to_datetime
from venues ven
left join event_dates evtdt
on (ven.id=evtdt.venue_id)
where evtdt.venue_id is null
or not ('2009-12-06 00:00:00' between evtdt.event_from_datetime
and evtdt.event_to_datetime);
+----+---------------------+---------------------+
| id | event_from_datetime | event_to_datetime |
+----+---------------------+---------------------+
| 1 | 2009-12-09 00:00:00 | 2009-12-12 00:00:00 |
| 2 | 2009-12-15 00:00:00 | 2009-12-20 00:00:00 |
| 3 | NULL | NULL |
| 5 | NULL | NULL |
+----+---------------------+---------------------+
As the results show, the booking for venue id 1 that covers 2009-12-06 00:00:00 is filtered out, but venue 1 still appears because of its other booking.
Please help me correct this query.
Thanks in advance.
SELECT *
FROM venues v
WHERE NOT EXISTS
(
SELECT NULL
FROM event_dates ed
WHERE ed.venue_id = v.id
AND '2009-12-06 00:00:00' BETWEEN ed.event_from_datetime AND ed.event_to_datetime
)
or not ('2009-12-06 00:00:00' between evtdt.event_from_datetime
and evtdt.event_to_datetime);
12/6/2009 is between 12/5/09 and 12/7/09... that's why venue_id 1 is being excluded... what is it you're trying to extract from the data exactly?
The join query you've constructed says: take the venues table and, for each row of it that has matching venue_id rows in event_dates, make a copy of the venues row for every match and append the matching event row. So if you just did:
select *
from venues ven
left join event_dates evtdt
on (ven.id=evtdt.venue_id);
It would yield:
+----+---------+------+----------+---------------------+---------------------+----------+
| id | name | id | event_id | event_from_datetime | event_to_datetime | venue_id |
+----+---------+------+----------+---------------------+---------------------+----------+
| 1 | venue 1 | 1 | 1 | 2009-12-05 00:00:00 | 2009-12-07 00:00:00 | 1 |
| 1 | venue 1 | 2 | 1 | 2009-12-09 00:00:00 | 2009-12-12 00:00:00 | 1 |
| 2 | venue 2 | 3 | 1 | 2009-12-15 00:00:00 | 2009-12-20 00:00:00 | 2 |
+----+---------+------+----------+---------------------+---------------------+----------+
If you then added your condition, which states the date of interest is not between the from and to date of the event, the query looks like:
select *
from venues ven
left join event_dates evtdt
on (ven.id=evtdt.venue_id)
where not ('2009-12-06' between evtdt.event_from_datetime and evtdt.event_to_datetime)
Which yields a result of:
+----+---------+------+----------+---------------------+---------------------+----------+
| id | name | id | event_id | event_from_datetime | event_to_datetime | venue_id |
+----+---------+------+----------+---------------------+---------------------+----------+
| 1 | venue 1 | 2 | 1 | 2009-12-09 00:00:00 | 2009-12-12 00:00:00 | 1 |
| 2 | venue 2 | 3 | 1 | 2009-12-15 00:00:00 | 2009-12-20 00:00:00 | 2 |
+----+---------+------+----------+---------------------+---------------------+----------+
These are my actual experimental results with your data in MySQL.
If you want to get the venue_ids that are free on the proposed date then you would write something like:
-- COALESCE: for a venue with no events, SUM() over the NULL-extended row is NULL, not 0
select ven.id, COALESCE(SUM('2009-12-06' between evtdt.event_from_datetime and evtdt.event_to_datetime), 0) as num_intersects
from venues ven left join event_dates evtdt on (ven.id=evtdt.venue_id)
group by ven.id
having num_intersects = 0;
which yields:
+----+----------------+
| id | num_intersects |
+----+----------------+
| 2 | 0 |
+----+----------------+
For a venue with no events in the event_dates table, SUM() over the NULL-extended row is NULL rather than 0, so the sum has to be wrapped in COALESCE(..., 0) for that venue to be returned as free.
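The title asks about a date range rather than a single instant. A sketch of that case follows, using the usual interval-overlap test: an event blocks the venue if it starts on or before the end of the requested range and ends on or after its start. The range 2009-12-06 to 2009-12-08 is made up for the example.

```sql
SELECT v.id
     , v.name
FROM venues AS v
WHERE NOT EXISTS (
        SELECT 1
        FROM event_dates AS ed
        WHERE ed.venue_id = v.id
          -- the event overlaps the requested range
          AND ed.event_from_datetime <= '2009-12-08 00:00:00'   -- requested range end
          AND ed.event_to_datetime   >= '2009-12-06 00:00:00'   -- requested range start
      );
```

With the sample rows this should return only venue 2, because venue 1 is booked from 2009-12-05 to 2009-12-07. Venues without any events pass the NOT EXISTS test and are returned as well.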
At a guess, if you remove not from
or not ('2009-12-06 00:00:00' between evtdt.event_from_datetime
and evtdt.event_to_datetime)
this will then return row 1 from event dates but not the other event date rows.
I say "at a guess" because your where clause is a bit hard to understand. Maybe you mean
select ven.id , evtdt.event_from_datetime, evtdt.event_to_datetime
from venues ven
left join event_dates evtdt
on (ven.id=evtdt.venue_id)
where '2009-12-06 00:00:00' between evtdt.event_from_datetime
and evtdt.event_to_datetime;