I have a question about joining two tables.
TWD_CSD_NEWS_DETAIL (200 million rows):
+-----------+------------+--------------+------------------+
| CSD_ID | CSD_ID_DRI | CSD_PARTY_ID | CSD_PARTY_AMOUNT |
+-----------+------------+--------------+------------------+
| 1 | 1 | 1183 | 27870 |
+-----------+------------+--------------+------------------+
| 2 | 1 | 1723 | 12 |
+-----------+------------+--------------+------------------+
| 3 | 1 | 1243 | 87474 |
+-----------+------------+--------------+------------------+
.
.
.
+-----------+------------+--------------+------------------+
| 18575622 | 8881 | 1183 | 27870 |
+-----------+------------+--------------+------------------+
The result of SHOW CREATE TABLE TWD_CSD_NEWS_DETAIL:
CREATE TABLE `TWD_CSD_NEWS_DETAIL` (
`CSD_ID` int(11) NOT NULL AUTO_INCREMENT,
`CSD_ID_CREATED_BY` int(11) DEFAULT NULL,
`CSD_DT_CREATED` datetime DEFAULT NULL,
`CSD_DT_UPD` datetime DEFAULT NULL,
`CSD_ID_DRI` int(11) DEFAULT NULL,
`CSD_ID_UPD_BY` int(11) DEFAULT NULL,
`CSD_PARTY_ID` int(11) DEFAULT NULL,
`CSD_AMOUNT` decimal(26,0) DEFAULT NULL,
`CSD_TIMESTAMP` datetime DEFAULT NULL,
PRIMARY KEY (`CSD_ID`)
) ENGINE=InnoDB AUTO_INCREMENT=184035984 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
TWD_DRI_NEWS_RESULT_HEADER (1 million rows):
+--------+---------------------+----------------+
| DRI_ID | DRI_DATE | DRI_SYM_SYMBOL |
+--------+---------------------+----------------+
| 1 | 2011-11-08 00:00:00 | 1 |
+--------+---------------------+----------------+
| 2 | 2011-11-08 00:00:00 | 2 |
+--------+---------------------+----------------+
| 3 | 2011-11-08 00:00:00 | 3 |
+--------+---------------------+----------------+
.
.
+--------+---------------------+----------------+
| 10001 | 2011-11-11 00:00:00 | 8881 |
+--------+---------------------+----------------+
The result of SHOW CREATE TABLE TWD_DRI_NEWS_RESULT_HEADER:
CREATE TABLE `TWD_DRI_NEWS_RESULT_HEADER` (
`DRI_ID` int(11) NOT NULL AUTO_INCREMENT,
`DRI_DATE` datetime DEFAULT NULL,
`DRI_SYM_SYMBOL` int(11) DEFAULT NULL,
`DRI_TIMESTAMP` datetime DEFAULT NULL,
PRIMARY KEY (`DRI_ID`)
) ENGINE=InnoDB AUTO_INCREMENT=1592193 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
I join them with the following SQL. It works, but the query takes a very long time to complete as I keep adding csd_id ranges to the WHERE clause:
SELECT
csd.CSD_ID, csd.CSD_ID_DRI, csd.CSD_PARTY_ID, csd.CSD_AMOUNT , dri.DRI_DATE, dri.DRI_SYM_TICKER
FROM
TWD_CSD_NEWS_DETAIL csd
LEFT JOIN
TWD_DRI_NEWS_RESULT_HEADER dri ON dri.DRI_ID = csd.CSD_ID_DRI
WHERE
(
(
( csd_id between 1 and 426029)
|| ( csd_id between 426030 and 851977)
|| ( csd_id between 851978 and 1277890)
..
...
...
)
AND dri.DRI_SYM_SYMBOL = 1
)
Do I need to create another view to hold the result, or is there a faster way to run this query? I tried the range between 1 and 200000000; the duration and fetch time were 0.197 seconds / 26 seconds.
Have you tried using a single range for your query?
SELECT
csd.CSD_ID, csd.CSD_ID_DRI, csd.CSD_PARTY_ID, csd.CSD_AMOUNT,
dri.DRI_DATE, dri.DRI_SYM_TICKER
FROM
TWD_CSD_NEWS_DETAIL csd
LEFT JOIN
TWD_DRI_NEWS_RESULT_HEADER dri ON dri.DRI_ID = csd.CSD_ID_DRI
WHERE
csd.CSD_ID BETWEEN 1 AND 1277890 AND dri.DRI_SYM_SYMBOL = 1;
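One side effect of the WHERE clause is worth illustrating: because it filters on dri.DRI_SYM_SYMBOL, rows with no matching header are discarded anyway, so the LEFT JOIN returns the same rows as an INNER JOIN. The sketch below shows this with Python's sqlite3 module as a stand-in for MySQL; the trimmed-down columns and the four sample rows are assumptions made up for the illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TWD_DRI_NEWS_RESULT_HEADER (
    DRI_ID INTEGER PRIMARY KEY,
    DRI_DATE TEXT,
    DRI_SYM_SYMBOL INTEGER
);
CREATE TABLE TWD_CSD_NEWS_DETAIL (
    CSD_ID INTEGER PRIMARY KEY,
    CSD_ID_DRI INTEGER,
    CSD_PARTY_ID INTEGER,
    CSD_AMOUNT INTEGER
);
-- Hypothetical sample rows, not the real data.
INSERT INTO TWD_DRI_NEWS_RESULT_HEADER VALUES
    (1, '2011-11-08 00:00:00', 1),
    (2, '2011-11-08 00:00:00', 2);
INSERT INTO TWD_CSD_NEWS_DETAIL VALUES
    (1, 1, 1183, 27870),
    (2, 1, 1723, 12),
    (3, 2, 1243, 87474),
    (4, NULL, 1183, 5);
""")

# Same query text, only the join type differs.
query = """
SELECT csd.CSD_ID, csd.CSD_AMOUNT, dri.DRI_DATE
FROM TWD_CSD_NEWS_DETAIL csd
{join} TWD_DRI_NEWS_RESULT_HEADER dri ON dri.DRI_ID = csd.CSD_ID_DRI
WHERE csd.CSD_ID BETWEEN 1 AND 4
  AND dri.DRI_SYM_SYMBOL = 1
ORDER BY csd.CSD_ID
"""
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()

# Row 4 has no header (dri.* is NULL), so the WHERE filter drops it either way.
print(left_rows == inner_rows)
```

If that holds for the real data as well, indexes on CSD_ID_DRI and DRI_SYM_SYMBOL might let the optimizer start from the smaller header table; that is a guess to verify with EXPLAIN, not a measured result.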
I'm looking for some help with a SQL/MySQL problem.
I have three source tables:
CREATE TABLE `customers` (
`cid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`customer_name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`cid`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8
CREATE TABLE `standards` (
`sid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`standard_name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`sid`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8
CREATE TABLE `partial_standard_compliance` (
`customer` bigint(20) unsigned NOT NULL,
`standard` bigint(20) unsigned NOT NULL,
`standard_compliance` bigint(20) unsigned DEFAULT NULL,
`created_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8
The idea is a customer gives themselves a rating using the standard_compliance column in the partial_standard_compliance table.
Customers can rate the same standard multiple times.
Result example:
+----------+----------+---------------------+---------------------+
| customer | standard | standard_compliance | created_time |
+----------+----------+---------------------+---------------------+
| 1 | 1 | 50 | 2023-01-28 16:19:34 |
| 1 | 1 | 60 | 2023-01-28 16:19:40 |
| 1 | 1 | 70 | 2023-01-28 16:19:48 |
| 2 | 10 | 30 | 2023-01-28 16:58:21 |
| 2 | 8 | 60 | 2023-01-28 16:58:32 |
| 2 | 9 | 60 | 2023-01-28 16:58:39 |
| 2 | 9 | 80 | 2023-01-28 16:58:43 |
+----------+----------+---------------------+---------------------+
I need to create a 4th table that has customer name, standard name and the most recent rating they have given themselves.
I have been trying with JOINS and CREATE AS SELECT, but haven't been able to solve it.
Any point in the right direction would be great. Thanks.
I have been trying with JOINS and CREATE AS SELECT
I need to create a 4th table that has customer name, standard name and
the most recent rating they have given themselves
It would be better to create a view instead.
create view fourth_table as
select customer_name ,
standard_name ,
standard_compliance,
created_time
from (select c.customer_name,
s.standard_name,
psc.standard_compliance,
psc.created_time,
row_number() over(partition by c.customer_name order by psc.created_time desc ) as rn
from customers c
inner join partial_standard_compliance psc on psc.customer=c.cid
inner join standards s on s.sid=psc.standard
) x
where rn=1;
https://dbfiddle.uk/ZiK-k8jN
MySQL View
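The view above partitions by customer_name alone, which keeps a single row per customer. If the intent is the latest rating per customer and per standard, partitioning by both keys may be closer to what is wanted. Here is a minimal sketch of that variant, run through Python's sqlite3 module (window functions need SQLite 3.25 or newer) as a stand-in for MySQL 8; the customer and standard names are hypothetical, while the ratings are the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (cid INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE standards (sid INTEGER PRIMARY KEY, standard_name TEXT);
CREATE TABLE partial_standard_compliance (
    customer INTEGER NOT NULL,
    standard INTEGER NOT NULL,
    standard_compliance INTEGER,
    created_time TEXT NOT NULL
);
-- Names are made up for the illustration.
INSERT INTO customers VALUES (1, 'cust1'), (2, 'cust2');
INSERT INTO standards VALUES (1, 'std1'), (8, 'std8'), (9, 'std9'), (10, 'std10');
-- Ratings copied from the question's sample table.
INSERT INTO partial_standard_compliance VALUES
    (1, 1, 50, '2023-01-28 16:19:34'),
    (1, 1, 60, '2023-01-28 16:19:40'),
    (1, 1, 70, '2023-01-28 16:19:48'),
    (2, 10, 30, '2023-01-28 16:58:21'),
    (2, 8, 60, '2023-01-28 16:58:32'),
    (2, 9, 60, '2023-01-28 16:58:39'),
    (2, 9, 80, '2023-01-28 16:58:43');
""")

# rn = 1 marks the newest rating inside each (customer, standard) pair.
latest = conn.execute("""
SELECT customer_name, standard_name, standard_compliance
FROM (
    SELECT c.cid, s.sid, c.customer_name, s.standard_name,
           psc.standard_compliance,
           ROW_NUMBER() OVER (
               PARTITION BY psc.customer, psc.standard
               ORDER BY psc.created_time DESC
           ) AS rn
    FROM customers c
    JOIN partial_standard_compliance psc ON psc.customer = c.cid
    JOIN standards s ON s.sid = psc.standard
) x
WHERE rn = 1
ORDER BY cid, sid
""").fetchall()

for row in latest:
    print(row)
```

With the sample data this keeps one row per customer/standard pair, for example the rating of 70 for customer 1 on standard 1 and 80 for customer 2 on standard 9.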
I have a table with nested children. I'm trying to fetch a list of parents sorted by the most recent child, when available, otherwise the parent's created date. My query seemed to work at first, but as I started importing more and more records (~13.6K at the moment), performance has become a problem.
Version: 10.5.5-MariaDB
Table structure (excluded fields for brevity):
CREATE TABLE `emails` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`created_at` timestamp NULL DEFAULT NULL,
`_lft` int(10) unsigned NOT NULL DEFAULT 0,
`_rgt` int(10) unsigned NOT NULL DEFAULT 0,
`parent_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `emails__lft__rgt_parent_id_index` (`_lft`,`_rgt`,`parent_id`) USING BTREE,
KEY `emails__lft__rgt_created_at_index` (`_lft`,`_rgt`,`created_at`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=13484 DEFAULT CHARSET=utf8
Here's the query I'm working with (~21s):
SELECT
`emails`.`id`,
(
SELECT MAX(`descendants`.`created_at`) AS `created_at`
FROM `emails` AS `descendants`
WHERE `descendants`.`_lft` >= `emails`.`_lft`
AND `descendants`.`_rgt` <= `emails`.`_rgt`
) `descendants_created_at`
FROM `emails`
WHERE `parent_id` IS NULL
ORDER BY `descendants_created_at` DESC
LIMIT 25 OFFSET 0;
The _lft and _rgt fields are provided by the lazychaser/laravel-nestedset package and are essentially giving me the descendants for each of the records returned in the main query. It includes the parent as well, so a created_at value is always returned.
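To make the nested-set condition concrete, here is a small sketch using Python's sqlite3 module as a stand-in for MariaDB. The four rows and their _lft/_rgt values are invented for the illustration: one thread with two replies and one stand-alone email.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emails (
    id INTEGER PRIMARY KEY,
    created_at TEXT,
    _lft INTEGER NOT NULL,
    _rgt INTEGER NOT NULL,
    parent_id INTEGER
);
-- Hypothetical data: email 1 (range 1..6) contains replies 2 and 3;
-- email 4 (range 7..8) has no replies.
INSERT INTO emails VALUES
    (1, '2021-01-01 10:00:00', 1, 6, NULL),
    (2, '2021-01-02 10:00:00', 2, 3, 1),
    (3, '2021-03-01 10:00:00', 4, 5, 1),
    (4, '2021-02-01 10:00:00', 7, 8, NULL);
""")

# A row is inside a parent's subtree when its [_lft, _rgt] range lies within
# the parent's range; the parent itself satisfies the condition too.
rows = conn.execute("""
SELECT e.id,
       (SELECT MAX(d.created_at)
        FROM emails AS d
        WHERE d._lft >= e._lft
          AND d._rgt <= e._rgt) AS descendants_created_at
FROM emails AS e
WHERE e.parent_id IS NULL
ORDER BY descendants_created_at DESC
LIMIT 25 OFFSET 0
""").fetchall()

for row in rows:
    print(row)
```

Email 1 sorts first because its newest reply (March) is more recent than email 4's own creation date (February), even though email 1 itself is older.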
Sample output:
| id | created_at | descendants_created_at |
|-------|---------------------|------------------------|
| 13483 | 2021-07-22 12:35:55 | 2021-07-22 12:35:55 |
| 8460 | 2021-04-29 12:56:57 | 2021-07-22 12:35:00 |
| 13481 | 2021-07-22 12:33:22 | 2021-07-22 12:33:22 |
| 3514 | 2021-01-16 09:43:42 | 2021-07-22 12:23:28 |
| 13479 | 2021-07-22 11:28:07 | 2021-07-22 11:28:07 |
| 13478 | 2021-07-22 11:27:09 | 2021-07-22 11:27:09 |
| 13407 | 2021-07-21 10:05:41 | 2021-07-22 10:21:14 |
| 13408 | 2021-07-21 10:05:41 | 2021-07-22 10:21:14 |
| 13389 | 2021-07-21 08:17:23 | 2021-07-22 10:21:14 |
| 13303 | 2021-07-19 14:25:38 | 2021-07-22 10:21:14 |
The problem seems to be once I'm doing the actual ordering here:
ORDER BY `descendants_created_at` DESC
My EXPLAIN looks like this:
UPDATE #1 - Using a LEFT JOIN & adding a parent_id key, this query is now ~10s, which is better, but still not great:
https://dbfiddle.uk/?rdbms=mariadb_10.5&fiddle=f5442fdfba119cc750c09a19024ccf7c
I have four tables:
CREATE TABLE `A` (
`AID` bigint(20) NOT NULL AUTO_INCREMENT,
`Name` varchar(150) DEFAULT NULL,
PRIMARY KEY (`AID`)
);
CREATE TABLE `B` (
`BID` bigint(20) NOT NULL AUTO_INCREMENT,
`DtStart` datetime DEFAULT NULL,
`DtStop` datetime DEFAULT NULL,
`AID` bigint(20) DEFAULT NULL,
`CID` bigint(20) DEFAULT NULL,
PRIMARY KEY (`BID`)
);
CREATE TABLE `C` (
`CID` bigint(20) NOT NULL AUTO_INCREMENT,
`FLAGS` smallint(6) DEFAULT NULL,
PRIMARY KEY (`CID`)
);
CREATE TABLE `D` (
`DID` bigint(20) NOT NULL AUTO_INCREMENT,
`CID` bigint(20) DEFAULT NULL,
`Name` varchar(150) DEFAULT NULL,
PRIMARY KEY (`DID`)
);
I'm feeding data like this,
INSERT INTO A (Name) VALUES ("First");
INSERT INTO C (FLAGS) VALUES (1);
INSERT INTO B (DtStart, DtStop, AID, CID) VALUES ("2016-09-07", "2017-09-07", 1, 1);
INSERT INTO D (CID, Name) VALUES (1, "Alan");
INSERT INTO C (FLAGS) VALUES (2);
INSERT INTO B (DtStart, DtStop, AID, CID) VALUES ("2016-09-15", "2017-09-23", 1, 2);
INSERT INTO D (CID, Name) VALUES (2, "John");
When I run the query:
SELECT
A.Name as Object, B.DtStart, B.DtStop, C.FLAGS, D.Name as User
FROM A
LEFT JOIN B ON B.AID=A.AID
LEFT JOIN C ON C.CID=B.CID
LEFT JOIN D ON D.CID=C.CID
WHERE "2017-09-01" <= B.DtStop AND "2017-10-01" > B.DtStart;
I get the result:
+--------+---------------------+---------------------+-------+------+
| Object | DtStart | DtStop | FLAGS | User |
+--------+---------------------+---------------------+-------+------+
| First | 2016-09-07 00:00:00 | 2017-09-07 00:00:00 | 1 | Alan |
| First | 2016-09-15 00:00:00 | 2017-09-23 00:00:00 | 2 | John |
+--------+---------------------+---------------------+-------+------+
2 rows in set (0.00 sec)
How can I return all the rows and also include the gaps within the given date range?
In my example, I'm looking for a report covering Sep 1, 2017 to Sep 30, 2017, so I want a result like this:
+--------+---------------------+---------------------+-------+------+
| Object | DtStart | DtStop | FLAGS | User |
+--------+---------------------+---------------------+-------+------+
| First | 2016-09-07 00:00:00 | 2017-09-07 00:00:00 | 1 | Alan |
| First | 2017-09-08 00:00:00 | 2017-09-14 00:00:00 | NULL | NULL |
| First | 2016-09-15 00:00:00 | 2017-09-23 00:00:00 | 2 | John |
| First | 2017-09-24 00:00:00 | 2017-09-30 00:00:00 | NULL | NULL |
+--------+---------------------+---------------------+-------+------+
4 rows in set (0.00 sec)
These are just simplified models of my original tables; my real query joins more than 10 tables.
I have two tables:
CREATE TABLE `server` (
`server_id` int(3) NOT NULL AUTO_INCREMENT,
`server_name` varchar(15),
`server_alias` varchar(50),
`server_status` tinyint(1) DEFAULT '0',
`server_join` tinyint(1) DEFAULT '1',
`server_number_member` int(5),
PRIMARY KEY (`server_id`)
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE `member` (
`member_id` int(11) NOT NULL AUTO_INCREMENT,
`member_server` int(3) DEFAULT NULL COMMENT 'Id server',
`member_name` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'Name of the member',
PRIMARY KEY (`member_id`)
) ENGINE=MyISAM AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
And I created a view to get the list of servers:
CREATE VIEW `server_client` AS
SELECT
`s`.`server_id` AS `server_id`,
`s`.`server_name` AS `server_name`,
`s`.`server_alias` AS `server_alias`,
IF (`s`.`server_join` = 1, (COUNT(`m`.`member_id`) / `s`.`server_number_member` * 100) DIV 1, 100) AS `server_full`
FROM (`server` `s`
LEFT JOIN `member` `m`
ON ((`m`.`member_server` = `s`.`server_id`)))
WHERE `s`.`server_status` = 1
Now the server table has 1 record:
-------------------------------------------------------------------------------------------
|server_id|server_name|server_alias |server_status|server_join|server_number_member|
|-----------------------------------------------------------------------------------------|
| 1 | SV 01 | http://example.com/ | 0 | 0 | 10 |
-------------------------------------------------------------------------------------------
In the member table:
------------------------------------------
| member_id | member_server | member_name|
|----------------------------------------|
| 1 | 1 | aaa |
|----------------------------------------|
| 2 | 1 | bbb |
|----------------------------------------|
| 3 | 1 | ccc |
------------------------------------------
Result in the server_client view:
--------------------------------------------------------
| server_id | server_name | server_alias | server_full |
|------------------------------------------------------|
| NULL | NULL | NULL | 100 |
--------------------------------------------------------
server_full is the percentage of a server's member slots that are already filled.
I want to remove the NULL record from the server_client view.
How can I do it?
Thanks.
Because you are using COUNT() you should also be aggregating over the servers with GROUP BY. The following query should be along the lines of what you want:
CREATE VIEW server_client AS
SELECT
s.server_id AS server_id,
s.server_name AS server_name,
s.server_alias AS server_alias,
IF (s.server_join = 1,
(COUNT(m.member_id) / s.server_number_member * 100) DIV 1,
100) AS server_full
FROM server s
LEFT JOIN member m
ON m.member_server = s.server_id
WHERE s.server_status = 1
GROUP BY
s.server_id,
s.server_name,
s.server_alias
The only issue you may have is with the sum conditional aggregation I have in my query. In any case, I expect that the results from the above will at least start looking correct.
By the way, I removed all the backticks because you don't need them and they are ugly.
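To see why the GROUP BY removes the all-NULL row, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL, with the tables cut down to the columns that matter (an assumption for brevity). An aggregate query without GROUP BY always returns exactly one row, even when the WHERE clause matches nothing; with GROUP BY, no matching rows means no groups and therefore no output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE server (
    server_id INTEGER PRIMARY KEY,
    server_name TEXT,
    server_status INTEGER
);
CREATE TABLE member (
    member_id INTEGER PRIMARY KEY,
    member_server INTEGER
);
-- One inactive server (server_status = 0) with three members, as in the question.
INSERT INTO server VALUES (1, 'SV 01', 0);
INSERT INTO member VALUES (1, 1), (2, 1), (3, 1);
""")

base = """
SELECT s.server_id, s.server_name, COUNT(m.member_id) AS members
FROM server s
LEFT JOIN member m ON m.member_server = s.server_id
WHERE s.server_status = 1
"""

# No GROUP BY: one row comes back although no server has status 1.
without_group_by = conn.execute(base).fetchall()

# GROUP BY: zero matching rows produce zero groups.
with_group_by = conn.execute(
    base + "GROUP BY s.server_id, s.server_name"
).fetchall()

print(without_group_by)
print(with_group_by)
```

The first query yields a single row of NULLs with a count of 0, which corresponds to the NULL record seen in the view; the second yields no rows at all.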
I'm using MySQL 5 and I currently have a query that gets me the info I need, but I feel like it could be improved in terms of performance.
Here's the query I built (roughly following this guide) :
SELECT d.*, dc.date_change, dc.cwd, h.name as hub
FROM livedata_dom AS d
LEFT JOIN ( SELECT dc1.*
FROM livedata_domcabling as dc1
LEFT JOIN livedata_domcabling AS dc2
ON dc1.dom_id = dc2.dom_id AND dc1.date_change < dc2.date_change
WHERE dc2.dom_id IS NULL
ORDER BY dc1.date_change desc) AS dc ON (d.id = dc.dom_id)
LEFT JOIN livedata_hub AS h ON (d.id = dc.dom_id AND dc.hub_id = h.id)
WHERE d.cluster = 'localhost'
GROUP BY d.id;
EDIT: Using ORDER BY + GROUP BY to avoid getting multiple dom entries in case 'domcabling' has an entry with null date_change and another one with a date for the same 'dom'.
I feel like I'm killing a mouse with a bazooka. This query takes more than 3 seconds with only about 5k entries in 'livedata_dom' and 'livedata_domcabling'. Also, EXPLAIN tells me that 2 filesorts are used:
+----+-------------+------------+--------+-----------------------------+-----------------------------+---------+-----------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+-----------------------------+-----------------------------+---------+-----------------+------+----------------------------------------------+
| 1 | PRIMARY | d | ALL | NULL | NULL | NULL | NULL | 3 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | |
| 1 | PRIMARY | h | eq_ref | PRIMARY | PRIMARY | 4 | dc.hub_id | 1 | |
| 2 | DERIVED | dc1 | ALL | NULL | NULL | NULL | NULL | 4 | Using filesort |
| 2 | DERIVED | dc2 | ref | livedata_domcabling_dc592d9 | livedata_domcabling_dc592d9 | 4 | live.dc1.dom_id | 2 | Using where; Not exists |
+----+-------------+------------+--------+-----------------------------+-----------------------------+---------+-----------------+------+----------------------------------------------+
How could I change this query to make it more efficient?
Using the dummy data (provided below), this is the expected result:
+-----+-------+---------+--------+----------+------------+-----------+---------------------+------+-----------+
| id | mb_id | prod_id | string | position | name | cluster | date_change | cwd | hub |
+-----+-------+---------+--------+----------+------------+-----------+---------------------+------+-----------+
| 249 | 47 | 47 | 47 | 47 | SuperDOM47 | localhost | NULL | NULL | NULL |
| 250 | 48 | 48 | 48 | 48 | SuperDOM48 | localhost | 2014-04-16 05:23:00 | 32A | megahub01 |
| 251 | 49 | 49 | 49 | 49 | SuperDOM49 | localhost | NULL | 22B | megahub01 |
+-----+-------+---------+--------+----------+------------+-----------+---------------------+------+-----------+
Basically I need 1 row for every 'dom' entry, with:
- the 'domcabling' record with the highest date_change
- null fields if no such record exists
- at most ONE entry per dom with a null date_change field (a null datetime is considered older than any other datetime)
- the name of the 'hub' when a 'domcabling' entry is found, null otherwise
CREATE TABLE + dummy INSERT for the 3 tables:
livedata_dom (about 5000 entries)
CREATE TABLE `livedata_dom` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`mb_id` varchar(12) NOT NULL,
`prod_id` varchar(8) NOT NULL,
`string` int(11) NOT NULL,
`position` int(11) NOT NULL,
`name` varchar(30) NOT NULL,
`cluster` varchar(9) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `mb_id` (`mb_id`),
UNIQUE KEY `prod_id` (`prod_id`),
UNIQUE KEY `name` (`name`),
UNIQUE KEY `livedata_domgood_string_7bff074107b0e5a0_uniq` (`string`,`position`,`cluster`)
) ENGINE=InnoDB AUTO_INCREMENT=5485 DEFAULT CHARSET=latin1;
INSERT INTO `livedata_dom` VALUES (251,'49','49',49,49,'SuperDOM49','localhost'),(250,'48','48',48,48,'SuperDOM48','localhost'),(249,'47','47',47,47,'SuperDOM47','localhost');
livedata_domcabling (about 10000 entries and growing slowly)
CREATE TABLE `livedata_domcabling` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`dom_id` int(11) NOT NULL,
`hub_id` int(11) NOT NULL,
`cwd` varchar(3) NOT NULL,
`date_change` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `livedata_domcabling_dc592d9` (`dom_id`),
KEY `livedata_domcabling_4366aa6e` (`hub_id`),
CONSTRAINT `dom_id_refs_id_73e89ce0c50bf0a6` FOREIGN KEY (`dom_id`) REFERENCES `livedata_dom` (`id`),
CONSTRAINT `hub_id_refs_id_179c89d8bfd74cdf` FOREIGN KEY (`hub_id`) REFERENCES `livedata_hub` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5397 DEFAULT CHARSET=latin1;
INSERT INTO `livedata_domcabling` VALUES (1,251,1,'22B',NULL),(2,250,1,'33A',NULL),(6,250,1,'32A','2014-04-16 05:23:00'),(5,250,1,'22B','2013-05-22 00:00:00');
livedata_hub (about 100 entries)
CREATE TABLE `livedata_hub` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(14) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=98 DEFAULT CHARSET=latin1;
INSERT INTO `livedata_hub` VALUES (1,'megahub01');
Try this rewrite (tested in SQL Fiddle):
SELECT
d.*, dc.date_change, dc.cwd, h.name as hub
FROM
livedata_dom AS d
LEFT JOIN
livedata_domcabling as dc
ON dc.id =
( SELECT id
FROM livedata_domcabling AS dcc
WHERE dcc.dom_id = d.id
ORDER BY date_change DESC
LIMIT 1
)
LEFT JOIN
livedata_hub AS h
ON dc.hub_id = h.id
WHERE
d.cluster = 'localhost' ;
An index on (dom_id, date_change) would help efficiency.
I'm not sure about the selectivity of d.cluster = 'localhost' (how many rows of the livedata_dom table match this condition?), but adding an index on (cluster) might help as well.
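The rewrite and the two suggested indexes can be tried end to end with the dummy data from the question. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the index names are hypothetical and the tables are reduced to the columns the query touches. Both engines sort NULL last under ORDER BY ... DESC, so a NULL date_change counts as the oldest entry, as the question requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE livedata_dom (id INTEGER PRIMARY KEY, name TEXT, cluster TEXT);
CREATE TABLE livedata_hub (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE livedata_domcabling (
    id INTEGER PRIMARY KEY,
    dom_id INTEGER NOT NULL,
    hub_id INTEGER NOT NULL,
    cwd TEXT NOT NULL,
    date_change TEXT
);
-- Suggested indexes; the names are made up for this sketch.
CREATE INDEX domcabling_dom_date ON livedata_domcabling (dom_id, date_change);
CREATE INDEX dom_cluster ON livedata_dom (cluster);

-- Dummy data from the question.
INSERT INTO livedata_dom VALUES
    (251, 'SuperDOM49', 'localhost'),
    (250, 'SuperDOM48', 'localhost'),
    (249, 'SuperDOM47', 'localhost');
INSERT INTO livedata_hub VALUES (1, 'megahub01');
INSERT INTO livedata_domcabling VALUES
    (1, 251, 1, '22B', NULL),
    (2, 250, 1, '33A', NULL),
    (6, 250, 1, '32A', '2014-04-16 05:23:00'),
    (5, 250, 1, '22B', '2013-05-22 00:00:00');
""")

# For each dom, the correlated subquery picks the id of its newest cabling row.
rows = conn.execute("""
SELECT d.id, dc.date_change, dc.cwd, h.name AS hub
FROM livedata_dom AS d
LEFT JOIN livedata_domcabling AS dc
    ON dc.id = (SELECT dcc.id
                FROM livedata_domcabling AS dcc
                WHERE dcc.dom_id = d.id
                ORDER BY dcc.date_change DESC
                LIMIT 1)
LEFT JOIN livedata_hub AS h ON dc.hub_id = h.id
WHERE d.cluster = 'localhost'
ORDER BY d.id
""").fetchall()

for row in rows:
    print(row)
```

On this data the output matches the expected result in the question: dom 249 has no cabling and gets NULLs, dom 250 gets the 2014 entry, and dom 251 gets its only entry, the one with a NULL date.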
set @rn := 0, @dom_id := 0;
select d.*, dc.date_change, dc.cwd, h.name as hub
from
livedata_dom d
left join (
select
hub_id, date_change, cwd, dom_id,
if(@dom_id = dom_id, @rn := @rn + 1, @rn := 1) as rn,
@dom_id := dom_id as dm_id
from
livedata_domcabling
order by dom_id, date_change desc
) dc on d.id = dc.dom_id
left join
livedata_hub h on h.id = dc.hub_id
where rn = 1 or rn is null
order by dom_id
The data you posted does not have the dom_id 249. And the #250 has one null date so it comes first. So your result does not reflect what I understand from your question.