tree branch from mysql - mysql

I have a MySQL table whose schema stores a tree structure.
CREATE TABLE `treedata` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`parent_id` int(11) unsigned NOT NULL DEFAULT '0',
`depth` tinyint(3) unsigned NOT NULL DEFAULT '0',
`name` varchar(128) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uniquecheck` (`parent_id`,`name`) USING BTREE,
KEY `depth` (`depth`) USING BTREE,
KEY `parent_id` (`parent_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=14 DEFAULT CHARSET=latin1
It contains the following data.
mysql> select * from treedata;
+----+-----------+-------+------+
| id | parent_id | depth | name |
+----+-----------+-------+------+
| 1 | 1 | 0 | root |
| 2 | 1 | 1 | b1 |
| 3 | 1 | 1 | b2 |
| 4 | 1 | 1 | b3 |
| 5 | 2 | 2 | b1_1 |
| 6 | 2 | 2 | b1_2 |
| 7 | 2 | 2 | b1_3 |
| 8 | 3 | 2 | b2_1 |
| 9 | 3 | 2 | b2_2 |
| 10 | 3 | 2 | b2_3 |
| 11 | 4 | 2 | b3_1 |
| 12 | 4 | 2 | b3_2 |
| 13 | 4 | 2 | b3_3 |
+----+-----------+-------+------+
13 rows in set (0.00 sec)
I need to select a branch and its children based on depth and name. For example, if depth is 1 and name is b1, the query should return
+----+-----------+-------+------+
| id | parent_id | depth | name |
+----+-----------+-------+------+
| 2 | 1 | 1 | b1 |
| 5 | 2 | 2 | b1_1 |
| 6 | 2 | 2 | b1_2 |
| 7 | 2 | 2 | b1_3 |
+----+-----------+-------+------+
I am new to databases. I tried a left join; it returns all the children but not the branch itself.
mysql> select td2.* from treedata as td1 left join treedata as td2 on td1.id=td2.parent_id where td1.name='b1';
+------+-----------+-------+------+
| id | parent_id | depth | name |
+------+-----------+-------+------+
| 5 | 2 | 2 | b1_1 |
| 6 | 2 | 2 | b1_2 |
| 7 | 2 | 2 | b1_3 |
+------+-----------+-------+------+
3 rows in set (0.00 sec)
Note: I can't change database schema.

You can use a LIKE clause to select all the data that has the b1 branch, like this:
select td2.* from treedata as td1 left join treedata as td2 on td1.id=td2.parent_id where td1.name LIKE '%b1%';

I think this may help you:
select * from (select * from table_name order by `depth`) products_sorted, (select @pv := 'your_node_id(string)') initialisation where (find_in_set(parent_id, @pv) or id=your_node_id) and length(@pv := concat(@pv, ',', id))
It will find all children of your starting node.
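For servers that support recursive common table expressions, the branch plus all of its descendants can be collected in one statement. The sketch below is only an illustration: it uses Python's standard sqlite3 module as a stand-in for MySQL and loads the sample rows from the question. MySQL 8.0 and later accept the same WITH RECURSIVE shape as far as I know; the question does not name a server version, so treat that as an assumption. UNION (rather than UNION ALL) is used so that the self-referencing root row cannot cause an endless loop.

```python
import sqlite3

# In-memory stand-in for the treedata table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE treedata ("
    "id INTEGER PRIMARY KEY, parent_id INTEGER NOT NULL, "
    "depth INTEGER NOT NULL, name TEXT)"
)
sample_rows = [
    (1, 1, 0, "root"),
    (2, 1, 1, "b1"), (3, 1, 1, "b2"), (4, 1, 1, "b3"),
    (5, 2, 2, "b1_1"), (6, 2, 2, "b1_2"), (7, 2, 2, "b1_3"),
    (8, 3, 2, "b2_1"), (9, 3, 2, "b2_2"), (10, 3, 2, "b2_3"),
    (11, 4, 2, "b3_1"), (12, 4, 2, "b3_2"), (13, 4, 2, "b3_3"),
]
conn.executemany("INSERT INTO treedata VALUES (?, ?, ?, ?)", sample_rows)

# Anchor: the branch row itself. Recursive step: every row whose parent
# is already in the result. UNION removes duplicates, which also stops
# the recursion at the root row (its parent_id equals its own id).
BRANCH_SQL = """
WITH RECURSIVE branch(id, parent_id, depth, name) AS (
    SELECT id, parent_id, depth, name
    FROM treedata
    WHERE depth = ? AND name = ?
  UNION
    SELECT t.id, t.parent_id, t.depth, t.name
    FROM treedata AS t
    JOIN branch AS b ON t.parent_id = b.id
)
SELECT id, parent_id, depth, name FROM branch ORDER BY id
"""

def select_branch(depth, name):
    """Return the branch row and all of its descendants."""
    return conn.execute(BRANCH_SQL, (depth, name)).fetchall()

branch_rows = select_branch(1, "b1")
for row in branch_rows:
    print(row)
```

Because the schema is untouched, this fits the "can't change database schema" constraint; only the server version decides whether it is usable.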

Related

How to avoid temporary table on group by with join?

I have two tables, for example Department and Members.
Department table description:
CREATE TABLE `Department` (
`code` int(10) DEFAULT NULL,
`name` char(100) DEFAULT NULL,
KEY `code_index` (`code`),
KEY `name_index` (`name`)
)
Department table values:
+------+-------------+
| code | name |
+------+-------------+
| 1 | Production |
| 2 | Development |
| 3 | Management |
+------+-------------+
Members table description:
CREATE TABLE `Members` (
`department_code` int(10) DEFAULT NULL,
`name` char(100) DEFAULT NULL,
KEY `department_code_index` (`department_code`),
KEY `name_index` (`name`)
)
Members table values:
+-----------------+----------------+
| department_code | name |
+-----------------+----------------+
| 1 | Ross Geller |
| 1 | Monica Geller |
| 1 | Phoebe Buffay |
| 1 | Rachel Green |
| 1 | Chandler Bing |
| 1 | Joey Tribianni |
| 2 | Janice |
| 2 | Gunther |
| 2 | Cathy |
| 2 | Emily |
| 2 | Fun Bobby |
| 2 | Heckles |
| 3 | Paolo |
| 3 | Mike Hannigan |
| 3 | Carol |
| 3 | Susan |
| 3 | Richard |
| 3 | Tag |
+-----------------+----------------+
I want to get the department code and name for a given set of users. Since I only need the departments, I used the query below.
mysql> select Department.code, Department.name, Members.department_code from Department left join Members on (Department.code=Members.department_code) where Members.name in ('Rachel Green', 'Gunther', 'Paolo') group by Department.code;
+------+-------------+-----------------+
| code | name | department_code |
+------+-------------+-----------------+
| 1 | Production | 1 |
| 2 | Development | 2 |
| 3 | Management | 3 |
+------+-------------+-----------------+
This works fine and the "explain" gives me below execution plan.
+----+-------------+------------+------------+------+----------------------------------+-----------------------+---------+----------------------+------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+----------------------------------+-----------------------+---------+----------------------+------+----------+---------------------------------+
| 1 | SIMPLE | Department | NULL | ALL | code_index | NULL | NULL | NULL | 3 | 100.00 | Using temporary; Using filesort |
| 1 | SIMPLE | Members | NULL | ref | department_code_index,name_index | department_code_index | 5 | test.Department.code | 1 | 16.67 | Using where |
+----+-------------+------------+------------+------+----------------------------------+-----------------------+---------+----------------------+------+----------+---------------------------------+
But the "group by" uses a temporary table, which may degrade performance if the Members table contains a lot of rows. I guess better indexing would help here, but I can't work out what it should be. Any help will be appreciated.
Thanks in advance!
You can avoid the group by over all the data using a subquery:
select d.code, d.name
from Department d
where exists (select 1
from Members m
where d.code = m.department_code and
m.name in ('Rachel Green', 'Gunther', 'Paolo')
);
With an index on members(department_code, name), this should be much faster.
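As a quick check of the EXISTS form, here is a small sketch that runs it with Python's standard sqlite3 module in place of MySQL, using a reduced copy of the sample data. The index name members_dept_name and the helper departments_for are made up for this illustration, and SQLite's planner is not MySQL's, so this only confirms the result set, not the execution plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (code INTEGER, name TEXT);
CREATE TABLE Members (department_code INTEGER, name TEXT);
-- Composite index suggested in the answer (index name is hypothetical).
CREATE INDEX members_dept_name ON Members (department_code, name);
INSERT INTO Department VALUES
    (1, 'Production'), (2, 'Development'), (3, 'Management');
INSERT INTO Members VALUES
    (1, 'Ross Geller'), (1, 'Rachel Green'),
    (2, 'Janice'), (2, 'Gunther'),
    (3, 'Paolo'), (3, 'Carol');
""")

def departments_for(names):
    """Departments that contain at least one of the given member names."""
    placeholders = ",".join("?" for _ in names)
    sql = f"""
        SELECT d.code, d.name
        FROM Department AS d
        WHERE EXISTS (SELECT 1
                      FROM Members AS m
                      WHERE m.department_code = d.code
                        AND m.name IN ({placeholders}))
        ORDER BY d.code
    """
    return conn.execute(sql, list(names)).fetchall()

print(departments_for(["Rachel Green", "Gunther", "Paolo"]))
```

Each department appears at most once without any GROUP BY, because EXISTS only asks whether a matching member row is present.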

Trouble about MySQL Performance Optimization

I ran a MySQL performance test, and the results surprised me.
First, I prepared several tables for the test: t_worker_attendance_300w (3 million rows), t_worker_attendance_1000w (10 million rows), t_worker_attendance_1y (100 million rows), and t_worker_attendance_4y (400 million rows).
Every table has the same columns and the same indexes; the larger tables, including the 400-million-row one, were built by copying the 3 million rows of data.
As I understand it, MySQL's performance should be severely affected by data volume, but the results have puzzled me for a whole week. I have tested almost every scenario I can think of, and the execution times are all the same!
This is a new MySQL 5.6.16 server. I tested every scenario I could think of, including INNER JOIN...
A) SHOW CREATE TABLE t_worker_attendance_4y
CREATE TABLE `t_worker_attendance_4y` (
`id` bigint(20) NOT NULL ,
`attendance_id` char(32) NOT NULL,
`worker_id` char(32) NOT NULL,
`subcontractor_id` char(32) NOT NULL ,
`project_id` char(32) NOT NULL ,
`sign_date` date NOT NULL ,
`sign_type` char(2) NOT NULL ,
`latitude` double DEFAULT NULL,
`longitude` double DEFAULT NULL ,
`sign_wages` decimal(16,2) DEFAULT NULL ,
`confirm_wages` decimal(16,2) DEFAULT NULL ,
`work_content` varchar(60) DEFAULT NULL ,
`team_leader_id` char(32) DEFAULT NULL,
`sign_state` char(2) NOT NULL ,
`confirm_date` date DEFAULT NULL ,
`sign_mode` char(2) DEFAULT NULL ,
`checkin_time` datetime DEFAULT NULL ,
`checkout_time` datetime DEFAULT NULL ,
`sign_hours` decimal(6,1) DEFAULT NULL ,
`overtime` decimal(6,1) DEFAULT NULL ,
`confirm_hours` decimal(6,1) DEFAULT NULL ,
`signimg` varchar(200) DEFAULT NULL ,
`signoutimg` varchar(200) DEFAULT NULL ,
`photocheck` char(2) DEFAULT NULL ,
`machine_type` varchar(2) DEFAULT '1' ,
`project_coordinate` text ,
`floor_num` varchar(200) DEFAULT NULL ,
`device_serial_no` varchar(32) DEFAULT NULL ,
KEY `checkin_time` (`checkin_time`),
KEY `worker_id` (`worker_id`),
KEY `project_id` (`project_id`),
KEY `subcontractor_id` (`subcontractor_id`),
KEY `sign_date` (`sign_date`),
KEY `project_id_2` (`project_id`,`sign_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
B) SHOW INDEX FROM t_worker_attendance_4y
+------------------------+------------+------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------------+------------+------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_worker_attendance_4y | 1 | checkin_time | 1 | checkin_time | A | 5017494 | NULL | NULL | YES | BTREE | | |
| t_worker_attendance_4y | 1 | worker_id | 1 | worker_id | A | 1686552 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | project_id | 1 | project_id | A | 102450 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | subcontractor_id | 1 | subcontractor_id | A | 380473 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | sign_date | 1 | sign_date | A | 512643 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | project_id_2 | 1 | project_id | A | 102059 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | project_id_2 | 2 | sign_date | A | 1776104 | NULL | NULL | | BTREE | | |
+------------------------+------------+------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
C) EXPLAIN SELECT SQL_NO_CACHE tw.project_id, tw.sign_date FROM t_worker_attendance_4y tw WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb' AND sign_date >= '07/01/2018' AND sign_date < '08/01/2018' ;
+----+-------------+-------+------+-----------------------------------+--------------+---------+-------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-----------------------------------+--------------+---------+-------+----------+--------------------------+
| 1 | SIMPLE | tw | ref | project_id,sign_date,project_id_2 | project_id_2 | 96 | const | 54134596 | Using where; Using index |
+----+-------------+-------+------+-----------------------------------+--------------+---------+-------+----------+--------------------------+
All of the queries went through the same composite index.
SELECT tw.project_id, tw.sign_date FROM t_worker_attendance_300w tw
WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb'
AND sign_date >= '07/01/2018'
AND sign_date < '08/01/2018' LIMIT 0,10000;
Execution time: 0.02 sec
SELECT tw.project_id, tw.sign_date FROM t_worker_attendance_1000w tw
WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb'
AND sign_date >= '07/01/2018'
AND sign_date < '08/01/2018' LIMIT 0,10000;
Execution time: 0.01 sec
SELECT tw.project_id, tw.sign_date FROM t_worker_attendance_1y tw
WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb'
AND sign_date >= '07/01/2018'
AND sign_date < '08/01/2018' LIMIT 0,10000;
Execution time: 0.02 sec
SELECT tw.project_id, tw.sign_date FROM t_worker_attendance_4y tw
WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb'
AND sign_date >= '07/01/2018'
AND sign_date < '08/01/2018' LIMIT 0,10000;
Execution time: 0.02 sec
......
My guess was that MySQL's query performance would decline dramatically as the data volume increases, but the timings are not much different. So I have no basis for optimizing my query, and I don't know when to introduce table partitioning or to split the data across databases and tables.
What I want to know is why an indexed query runs as fast on a large data volume as on a small one. I would be very grateful for any help.
Search performance stays the same on large data volumes because of the BTREE index, whose lookup cost is O(log(n)). Relatively speaking, that means the search algorithm has to complete:
6 operations on 3m of data
7 operations on 10m of data
8 operations on 100m of data
8 operations on 400m of data
As you can see, the number of operations is almost the same.
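To make the logarithmic growth concrete, here is a rough sketch that estimates how many B-tree levels are needed for a given row count. The fanout of 100 entries per page is a made-up round number for illustration, not a measured value for these tables; the real fanout depends on key size and page fill.

```python
def btree_levels(rows, fanout):
    """Smallest number of levels such that fanout ** levels >= rows.

    Uses integer arithmetic only, to avoid floating-point rounding
    surprises with math.log on exact powers.
    """
    if rows < 1 or fanout < 2:
        raise ValueError("rows must be >= 1 and fanout must be >= 2")
    levels, capacity = 1, fanout
    while capacity < rows:
        capacity *= fanout
        levels += 1
    return levels

ASSUMED_FANOUT = 100  # hypothetical entries per index page

for label, rows in [("3m", 3_000_000), ("10m", 10_000_000),
                    ("100m", 100_000_000), ("400m", 400_000_000)]:
    print(label, btree_levels(rows, ASSUMED_FANOUT))
```

Under that assumption, growing the table by more than a hundredfold adds at most one level, which is why a point lookup through the index costs about the same on every table.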
"My guess is that MySQL's query performance will decline dramatically with the increase of data volume"
This is true for full table scan cases.
I have a new answer. Someone told me: "Because your query is covered by the index, the query time is really just the time to search the index. MySQL indexes use a B+ tree structure, and the query time is basically the same for the same tree height. You can check whether the trees of the indexes on these tables have the same height."
So I ran the checks as suggested.
mysql> SELECT b.name, a.name, index_id, type, a.space, a.PAGE_NO
-> FROM information_schema.INNODB_SYS_INDEXES a,
-> information_schema.INNODB_SYS_TABLES b
-> WHERE a.table_id = b.table_id AND a.space <> 0;
+-------------------------------------------------+---------------------+----------+------+-------+---------+
| name | name | index_id | type | space | PAGE_NO |
+-------------------------------------------------+---------------------+----------+------+-------+---------+
| mysql/innodb_index_stats | PRIMARY | 18 | 3 | 2 | 3 |
| mysql/innodb_table_stats | PRIMARY | 17 | 3 | 1 | 3 |
| mysql/slave_master_info | PRIMARY | 20 | 3 | 4 | 3 |
| mysql/slave_relay_log_info | PRIMARY | 19 | 3 | 3 | 3 |
| mysql/slave_worker_info | PRIMARY | 21 | 3 | 5 | 3 |
| test_gomeet/t_worker_attendance_1y | GEN_CLUST_INDEX | 45 | 1 | 12 | 3 |
| test_gomeet/t_worker_attendance_1y | checkin_time | 46 | 0 | 12 | 16389 |
| test_gomeet/t_worker_attendance_1y | project_id | 50 | 0 | 12 | 32775 |
| test_gomeet/t_worker_attendance_1y | worker_id | 53 | 0 | 12 | 49161 |
| test_gomeet/t_worker_attendance_1y | subcontractor_id | 54 | 0 | 12 | 65547 |
| test_gomeet/t_worker_attendance_1y | sign_date | 66 | 0 | 12 | 81933 |
| test_gomeet/t_worker_attendance_1y | project_id_2 | 408 | 0 | 12 | 98319 |
| test_gomeet/t_worker_attendance_300w | GEN_CLUST_INDEX | 56 | 1 | 13 | 3 |
| test_gomeet/t_worker_attendance_300w | checkin_time | 58 | 0 | 13 | 16389 |
| test_gomeet/t_worker_attendance_300w | project_id | 59 | 0 | 13 | 16427 |
| test_gomeet/t_worker_attendance_300w | worker_id | 60 | 0 | 13 | 16428 |
| test_gomeet/t_worker_attendance_300w | subcontractor_id | 61 | 0 | 13 | 16429 |
| test_gomeet/t_worker_attendance_300w | sign_date | 67 | 0 | 13 | 65570 |
| test_gomeet/t_worker_attendance_300w | project_id_2 | 397 | 0 | 13 | 81929 |
| test_gomeet/t_worker_attendance_4y | GEN_CLUST_INDEX | 42 | 1 | 9 | 3 |
| test_gomeet/t_worker_attendance_4y | checkin_time | 47 | 0 | 9 | 16389 |
| test_gomeet/t_worker_attendance_4y | worker_id | 49 | 0 | 9 | 32775 |
| test_gomeet/t_worker_attendance_4y | project_id | 52 | 0 | 9 | 49161 |
| test_gomeet/t_worker_attendance_4y | subcontractor_id | 55 | 0 | 9 | 65547 |
| test_gomeet/t_worker_attendance_4y | sign_date | 69 | 0 | 9 | 81933 |
| test_gomeet/t_worker_attendance_4y | project_id_2 | 412 | 0 | 9 | 98319 |
+-------------------------------------------------+---------------------+----------+------+-------+---------+
mysql> SHOW GLOBAL STATUS LIKE 'Innodb_page_size';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| Innodb_page_size | 16384 |
+------------------+-------+
root@localhost:/usr/local/mysql/data/test_gomeet# hexdump -s 49216 -n 02 t_worker_attendance_300w.ibd
000c040 0200
000c042
root@localhost:/usr/local/mysql/data/test_gomeet# hexdump -s 49216 -n 02 t_worker_attendance_1y.ibd
000c040 0300
000c042
root@localhost:/usr/local/mysql/data/test_gomeet# hexdump -s 49216 -n 02 t_worker_attendance_4y.ibd
000c040 0300
000c042
The calculation gives about 3.34 for 100 million rows and 3.589 for 400 million rows, which is almost the same. Is this the reason?
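The hexdump offset used above can be reproduced with a little arithmetic. The sketch below assumes the usual InnoDB index page layout, in which a 38-byte file header is followed by the index page header and the 2-byte big-endian PAGE_LEVEL field sits 26 bytes into that header; those two constants are my understanding of the format rather than something stated in the question. It also assumes hexdump printed the two bytes as one 16-bit word in little-endian host order, so the displayed 0200 corresponds to the bytes 00 02.

```python
FIL_HEADER_SIZE = 38      # assumed size of the file-page header, in bytes
PAGE_LEVEL_OFFSET = 26    # assumed position of PAGE_LEVEL in the page header

def page_level_file_offset(page_no, page_size=16384):
    """Byte offset in the .ibd file of PAGE_LEVEL for the given page."""
    return page_no * page_size + FIL_HEADER_SIZE + PAGE_LEVEL_OFFSET

def parse_page_level(raw):
    """Decode the two raw PAGE_LEVEL bytes (big-endian)."""
    if len(raw) != 2:
        raise ValueError("PAGE_LEVEL is exactly two bytes")
    return int.from_bytes(raw, "big")

def tree_height(root_level):
    """Leaf pages are level 0, so height is the root level plus one."""
    return root_level + 1

# Page 3 is the root of GEN_CLUST_INDEX in the listing above.
print(page_level_file_offset(3))

# Root pages of project_id_2 taken from the PAGE_NO column above.
for table, root_page in [("300w", 81929), ("1y", 98319), ("4y", 98319)]:
    print(table, page_level_file_offset(root_page))

# Bytes 00 02, which a little-endian hexdump would display as 0200.
print(tree_height(parse_page_level(bytes([0x00, 0x02]))))
```

Note that offset 49216 points into page 3, the clustered index root, while the query in the question is served by project_id_2; reading the level of that index would need the offset of its own root page.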

MySQL: How to optimize this query

I wish to reduce the time needed to query data in a view.
My tables have the following structure:
Table Rings contains individual rings. Each ring has a unique combination of ID_RingType and Number, and also an ID, which is used as a foreign key elsewhere.
-- RINGS
CREATE TABLE `Rings` (
ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
ID_RingType CHAR(2) NOT NULL,
Number MEDIUMINT UNSIGNED NOT NULL,
ID_RingStatus TINYINT DEFAULT 1,
ID_User INT(11),
DateLastChange TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
FOREIGN KEY (ID_RingType) REFERENCES RingType(Code),
FOREIGN KEY (ID_RingStatus) REFERENCES RingStatus(ID),
FOREIGN KEY (ID_User) REFERENCES `848-cso`.`Users`(UID)
);
-- create index on triple ID_User, ID_RingType, Number
CREATE INDEX idx_rings ON `Rings` (ID_User, ID_RingType, Number);
CREATE INDEX idx_rings_overview ON `Rings` (ID_RingType, Number, ID_RingStatus);
CREATE INDEX idx_rings_numbers ON `Rings` (ID_RingStatus, ID_User, ID_RingType, Number);
Ring Status contains only 4 values and their meaning
-- RING STATUS
CREATE TABLE `RingStatus` (
ID TINYINT NOT NULL PRIMARY KEY,
Name VARCHAR(20) UNIQUE COLLATE utf8_czech_ci,
NameEng VARCHAR(20)
);
Ring Type is identified by a two-letter Code
-- RING TYPE
CREATE TABLE `RingType` (
Code CHAR(2) NOT NULL PRIMARY KEY,
Material VARCHAR(30) COLLATE utf8_czech_ci,
Radius DOUBLE UNSIGNED,
MaxVal MEDIUMINT UNSIGNED NOT NULL
);
Moreover, I use following function:
/*
Function returns tinyint(1) specifying, whether ring was assigned
*/
CREATE FUNCTION fn_isRingAssigned (idRingStatus TINYINT)
RETURNS TINYINT(1) DETERMINISTIC
RETURN IF(idRingStatus = 1,1,2);
The query which I try to optimize is stored in following VIEW:
/*
View finds contiguous ranges of rings grouped by type, radius and status
*/
ALTER VIEW vw_rings_overview AS SELECT
a.ID_RingType,
rt.Radius,
fn_isRingAssigned(a.ID_RingStatus) AS status,
rs.Name,
a.Number AS min,
MIN(b.Number) AS max
FROM
RingStatus AS rs, Rings AS a
JOIN RingType AS rt ON a.ID_RingType = rt.Code
JOIN Rings AS b
ON a.ID_RingType = b.ID_RingType
AND fn_isRingAssigned(a.ID_RingStatus) = fn_isRingAssigned(b.ID_RingStatus)
AND a.Number <= b.Number
WHERE NOT EXISTS
( SELECT 1
FROM Rings AS c
WHERE c.ID_RingType = a.ID_RingType
AND fn_isRingAssigned(c.ID_RingStatus) = fn_isRingAssigned(a.ID_RingStatus)
AND c.Number = a.Number - 1
)
AND NOT EXISTS
( SELECT 1
FROM Rings AS d
WHERE d.ID_RingType = b.ID_RingType
AND fn_isRingAssigned(d.ID_RingStatus) = fn_isRingAssigned(b.ID_RingStatus)
AND d.Number = b.Number + 1
)
AND fn_isRingAssigned(a.ID_RingStatus) = rs.ID
GROUP BY
a.ID_RingType,
fn_isRingAssigned(a.ID_RingStatus),
a.Number
ORDER BY
a.ID_RingType,
a.Number;
The data in Rings table look as follows
+----+-------------+--------+---------------+---------+---------------------+
| ID | ID_RingType | Number | ID_RingStatus | ID_User | DateLastChange |
+----+-------------+--------+---------------+---------+---------------------+
| 1 | A | 1 | 4 | 2 | 2015-12-02 19:02:50 |
| 2 | A | 2 | 4 | 2 | 2015-12-02 19:02:56 |
| 3 | A | 3 | 4 | 2 | 2015-12-02 19:22:29 |
| 4 | A | 4 | 4 | 2 | 2015-12-21 20:32:24 |
| 5 | A | 5 | 4 | 2 | 2015-12-21 20:52:08 |
| 6 | A | 6 | 4 | 2 | 2015-12-21 20:52:22 |
| 7 | A | 7 | 1 | 2 | 2015-12-02 19:00:23 |
| 8 | A | 8 | 1 | 2 | 2015-12-02 19:00:23 |
| 9 | A | 9 | 1 | 2 | 2015-12-02 19:00:23 |
| 10 | A | 10 | 1 | 2 | 2015-12-02 19:00:23 |
+----+-------------+--------+---------------+---------+---------------------+
And results of the query look like this:
mysql> select * from vw_rings_overview;
+-------------+--------+--------+----------------+-----+-------+
| ID_RingType | Radius | status | Name | min | max |
+-------------+--------+--------+----------------+-----+-------+
| A | 20 | 2 | Assigned | 1 | 6 |
| A | 20 | 1 | Not assigned | 7 | 10 |
+-------------+--------+--------+----------------+-----+-------+
The view finds contiguous ranges of rings that have the same ring type, status and radius.
Table Rings currently contains fewer than 30 000 rows, and the query takes approx. 2 seconds. The table is expected to grow to a few million rows, so I wish to optimize the design of the tables, indexes and view.
Here is result of EXPLAIN:
mysql> explain select * from vw_rings_overview;
+----+--------------------+------------+--------+--------------------+--------------------+---------+-----------------------------+-------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+--------+--------------------+--------------------+---------+-----------------------------+-------+-----------------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 19 | |
| 2 | DERIVED | a | index | idx_rings_overview | idx_rings_overview | 7 | NULL | 25173 | Using where; Using index; Using temporary; Using filesort |
| 2 | DERIVED | rt | eq_ref | PRIMARY | PRIMARY | 2 | 848-avi2.a.ID_RingType | 1 | |
| 2 | DERIVED | rs | eq_ref | PRIMARY | PRIMARY | 1 | func | 1 | Using where |
| 2 | DERIVED | b | ref | idx_rings_overview | idx_rings_overview | 2 | 848-avi2.rt.Code | 1573 | Using where; Using index |
| 4 | DEPENDENT SUBQUERY | d | ref | idx_rings_overview | idx_rings_overview | 5 | 848-avi2.b.ID_RingType,func | 1 | Using where; Using index |
| 3 | DEPENDENT SUBQUERY | c | ref | idx_rings_overview | idx_rings_overview | 5 | 848-avi2.a.ID_RingType,func | 1 | Using where; Using index |
+----+--------------------+------------+--------+--------------------+--------------------+---------+-----------------------------+-------+-----------------------------------------------------------+
Here are some sample data: http://sqlfiddle.com/#!9/b8b489/1
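To pin down what the view is expected to return, the contiguous-range logic can be sketched outside SQL. This is only a reference model in Python, not a replacement for the view: it sorts the rows once and walks them in a single pass, which is the behaviour the self-joins and NOT EXISTS subqueries are emulating. The helper names is_ring_assigned and contiguous_ranges are made up for this sketch; the first mirrors fn_isRingAssigned from the question.

```python
from itertools import groupby

def is_ring_assigned(id_ring_status):
    """Mirror of fn_isRingAssigned: status 1 maps to 1, anything else to 2."""
    return 1 if id_ring_status == 1 else 2

def contiguous_ranges(rings):
    """Collapse rings into (ring_type, status, min, max) ranges.

    rings is an iterable of (ring_type, number, id_ring_status) tuples.
    """
    rows = sorted((t, is_ring_assigned(s), n) for t, n, s in rings)
    ranges = []
    for (ring_type, status), group in groupby(rows, key=lambda r: (r[0], r[1])):
        start = prev = None
        for _, _, number in group:
            if start is None:
                start = prev = number          # first ring of the group
            elif number == prev + 1:
                prev = number                  # still contiguous
            else:
                ranges.append((ring_type, status, start, prev))
                start = prev = number          # gap found, open a new range
        ranges.append((ring_type, status, start, prev))
    # Same ordering as the view: by ring type, then by range start.
    return sorted(ranges, key=lambda r: (r[0], r[2]))

# Sample rows from the question: numbers 1-6 have status 4, 7-10 status 1.
sample = [("A", n, 4) for n in range(1, 7)] + [("A", n, 1) for n in range(7, 11)]
ranges = contiguous_ranges(sample)
print(ranges)
```

A model like this can also serve as a correctness check when comparing rewritten versions of the view against the original.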

mysql how to find the total number of child rows with respect to a parent

I have a table with a parent-child relationship like this:
Employee_ID | Employee_Manager_ID | Employee_Name
--------------------------------------------------------
1 | 1 | AAAA
2 | 1 | BBBB
3 | 2 | CCCC
4 | 3 | DDDD
5 | 3 | EEEEE
Is it possible to get the count of all the employees who come under a particular employee (not only the direct children, but all children of children) using a single query?
E.g. if the input is 1,
the output should be 4;
if the input is 2, the output should be 3.
Thanks in advance.
Suppose your table is:
mysql> SELECT * FROM Employee;
+-----+------+-------------+------+
| SSN | Name | Designation | MSSN |
+-----+------+-------------+------+
| 1 | A | OWNER | 1 |
| 10 | G | WORKER | 5 |
| 11 | D | WORKER | 5 |
| 12 | E | WORKER | 5 |
| 2 | B | BOSS | 1 |
| 3 | F | BOSS | 1 |
| 4 | C | BOSS | 2 |
| 5 | H | BOSS | 2 |
| 6 | L | WORKER | 2 |
| 7 | I | BOSS | 2 |
| 8 | K | WORKER | 3 |
| 9 | J | WORKER | 7 |
+-----+------+-------------+------+
12 rows in set (0.00 sec)
Query is:
SELECT SUPERVISOR.name AS SuperVisor,
GROUP_CONCAT(SUPERVISEE.name ORDER BY SUPERVISEE.name ) AS SuperVisee,
COUNT(*)
FROM Employee AS SUPERVISOR
INNER JOIN Employee SUPERVISEE ON SUPERVISOR.SSN = SUPERVISEE.MSSN
GROUP BY SuperVisor;
The query will produce result like:
+------------+------------+----------+
| SuperVisor | SuperVisee | COUNT(*) |
+------------+------------+----------+
| A | A,B,F | 3 |
| B | C,H,I,L | 4 |
| F | K | 1 |
| H | D,E,G | 3 |
| I | J | 1 |
+------------+------------+----------+
5 rows in set (0.00 sec)
[Answer]:
This works for one level (immediate supervisees). To find all supervisees at every possible level, you have to use a while loop (in a stored procedure).
"Although it is possible to retrieve employees at each level and then take their UNION, we cannot, in general, specify a query such as 'retrieve the supervisees of a employee at all levels' without utilizing a looping mechanism."
REFERENCE: read slide number 23 in this slide deck.
The book is "Fundamentals of Database Systems" (Fourth Edition); in the chapter "The Relational Algebra and Relational Calculus" there is a topic "Recursive Closure Operations".
Adding the query for table creation; it may be helpful to you:
mysql> CREATE TABLE IF NOT EXISTS `Employee` (
-> `SSN` varchar(64) NOT NULL,
-> `Name` varchar(64) DEFAULT NULL,
-> `Designation` varchar(128) NOT NULL,
-> `MSSN` varchar(64) NOT NULL,
-> PRIMARY KEY (`SSN`),
-> CONSTRAINT `FK_Manager_Employee` FOREIGN KEY (`MSSN`) REFERENCES Employee(SSN)
-> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.17 sec)
You can check Table like:
mysql> DESCRIBE Employee;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| SSN | varchar(64) | NO | PRI | NULL | |
| Name | varchar(64) | YES | | NULL | |
| Designation | varchar(128) | NO | | NULL | |
| MSSN | varchar(64) | NO | MUL | NULL | |
+-------------+--------------+------+-----+---------+-------+
4 rows in set (0.00 sec)
You may try this:
SELECT
t_one.Employee_ID,
t_one.Employee_Name,
COUNT(*) AS children
FROM
table_name AS t_one
INNER JOIN table_name AS t_two ON
t_two.Employee_Manager_ID=t_one.Employee_ID
GROUP BY
t_one.Employee_ID
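The queries above count one level only. On servers that support recursive common table expressions (MySQL 8.0 and later, which is an assumption here since the question names no version), the loop can be expressed inside a single statement. The sketch below runs the idea with Python's standard sqlite3 module in place of MySQL, using the five rows from the question; the :root placeholder is sqlite3's parameter syntax, not MySQL's. The self-referencing top row (employee 1 managed by 1) is excluded explicitly so that nobody is counted as their own subordinate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "Employee_ID INTEGER PRIMARY KEY, "
    "Employee_Manager_ID INTEGER NOT NULL, "
    "Employee_Name TEXT)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, 1, "AAAA"), (2, 1, "BBBB"), (3, 2, "CCCC"),
     (4, 3, "DDDD"), (5, 3, "EEEEE")],
)

# Anchor: direct reports of :root. Recursive step: reports of anyone
# already found. Rows that manage themselves are skipped in both parts.
COUNT_SQL = """
WITH RECURSIVE subordinates(id) AS (
    SELECT Employee_ID
    FROM Employee
    WHERE Employee_Manager_ID = :root AND Employee_ID <> :root
  UNION
    SELECT e.Employee_ID
    FROM Employee AS e
    JOIN subordinates AS s ON e.Employee_Manager_ID = s.id
    WHERE e.Employee_ID <> e.Employee_Manager_ID
)
SELECT COUNT(*) FROM subordinates
"""

def count_subordinates(root):
    """Number of employees at any level below the given employee."""
    return conn.execute(COUNT_SQL, {"root": root}).fetchone()[0]

for employee_id in (1, 2, 3, 4):
    print(employee_id, count_subordinates(employee_id))
```

On older servers without recursive CTEs, the stored-procedure loop described earlier remains the option.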

Join by part of string

I have following tables:
**visitors**
+---------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+----------------+
| visitors_id | int(11) | NO | PRI | NULL | auto_increment |
| visitors_path | varchar(255) | NO | | | |
+---------------------+--------------+------+-----+---------+----------------+
**fedora_info**
+----------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| pid | varchar(255) | NO | PRI | | |
| owner_uid | int(11) | YES | | NULL | |
+----------------+--------------+------+-----+---------+-------+
First I look for the visitors_path values that are related to specific pages with:
SELECT visitors_id, visitors_path
FROM visitors
WHERE visitors_path REGEXP '[[:<:]]fedora/repository/.*:[0-9]+$';
The above query returns the expected result.
The .*:[0-9]+ part of the above query corresponds to pid in the second table. Now I want to know the count of the results of the above query, grouped by owner_uid in the second table.
How can I JOIN these tables?
EDIT
sample data:
visitors
+-------------+---------------------------------+
| visitors_id | visitors_path |
+-------------+---------------------------------+
| 4574 | fedora/repository/islandora:123 |
| 4575 | fedora/repository/islandora:123 |
| 4580 | fedora/repository/islandora:321 |
| 4681 | fedora/repository/islandora:321 |
| 4682 | fedora/repository/islandora:321 |
| 4704 | fedora/repository/islandora:321 |
| 4706 | fedora/repository/islandora:456 |
| 4741 | fedora/repository/islandora:456 |
| 4743 | fedora/repository/islandora:789 |
| 4769 | fedora/repository/islandora:789 |
+-------------+---------------------------------+
fedora_info
+-----------------+-----------+
| pid | owner_uid |
+-----------------+-----------+
| islandora:123 | 1 |
| islandora:321 | 2 |
| islandora:456 | 3 |
| islandora:789 | 4 |
+-----------------+-----------+
Expected result:
+-----------------+-----------+
| count | owner_uid |
+-----------------+-----------+
| 2 | 1 |
| 4 | 2 |
| 3 | 3 |
| 2 | 4 |
| 0 | 5 |
+-----------------+-----------+
I suggest you normalize your database. When inserting rows into visitors, extract the pid in the front-end language and put it in a separate column (e.g. fi_pid). Then you can join on it easily.
The following query might work for you, but it will be a little CPU intensive.
SELECT
COUNT(a.visitors_id) as `count`,
f.owner_uid
FROM (SELECT visitors_id,
visitors_path,
SUBSTRING(visitors_path, ( LENGTH(visitors_path) -
LOCATE('/', REVERSE(visitors_path)) )
+ 2) AS
pid
FROM visitors
WHERE visitors_path REGEXP '[[:<:]]fedora/repository/.*:[0-9]+$') AS `a`
JOIN fedora_info AS f
ON ( a.pid = f.pid )
GROUP BY f.owner_uid
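The normalization suggested above, extracting the pid in application code before the row is stored, could look roughly like this. It is a sketch in Python under the assumption that the application can preprocess the path; extract_pid and PID_PATTERN are hypothetical names, and \b is used as an approximation of MySQL's [[:<:]] word-boundary marker. The sample rows are the ones from the question.

```python
import re
from collections import Counter

# Same shape as the REGEXP in the question, with the pid captured.
PID_PATTERN = re.compile(r"\bfedora/repository/(.*:[0-9]+)$")

def extract_pid(visitors_path):
    """Return the pid part of a repository path, or None if it does not match."""
    match = PID_PATTERN.search(visitors_path)
    return match.group(1) if match else None

# Sample data from the question.
fedora_info = {
    "islandora:123": 1,
    "islandora:321": 2,
    "islandora:456": 3,
    "islandora:789": 4,
}
paths = (
    ["fedora/repository/islandora:123"] * 2
    + ["fedora/repository/islandora:321"] * 4
    + ["fedora/repository/islandora:456"] * 2
    + ["fedora/repository/islandora:789"] * 2
)

# Count visits per owner, skipping paths with no matching pid.
visits_per_owner = Counter(
    fedora_info[pid]
    for pid in map(extract_pid, paths)
    if pid in fedora_info
)
print(dict(visits_per_owner))
```

Once the extracted value is stored in its own indexed column, the join becomes a plain equality join and the REGEXP no longer has to run over every row at query time.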
The following query returns the expected result, but it is very slow (query took 9.6700 sec):
SELECT COUNT(t2.pid), t1.owner_uid
FROM fedora_info t1
JOIN (SELECT TRIM(LEADING 'fedora/repository/' FROM visitors_path) as pid
FROM visitors
WHERE visitors_path REGEXP '[[:<:]]fedora/repository/.*:[0-9]+$') t2 ON t1.pid = t2.pid
GROUP BY t1.owner_uid