Last observation carried forward / ignore nulls in lag - mysql

How can I imitate the LOCF (last observation carried forward) behavior of lag(x) ignore nulls, available on e.g. Redshift, in Presto?
Take this sample data:
select * from (
values (7369, null), (7499, 300), (7521, 500),
(7566, null), (7654, 1400), (7698, null),
(7782, null), (7788, null), (7839, null),
(7844, 0), (7876, null), (7900, null),
(7902, null), (7934, null)
) ex(empno, comm)
-- empno comm
-- 7369
-- 7499 300
-- 7521 500
-- 7566
-- 7654 1400
-- 7698
-- 7782
-- 7788
-- 7839
-- 7844 0
-- 7876
-- 7900
-- 7902
-- 7934
Desired output is:
-- empno comm prev_comm
-- 7369
-- 7499 300
-- 7521 500 300
-- 7566 500
-- 7654 1400 500
-- 7698 1400
-- 7782 1400
-- 7788 1400
-- 7839 1400
-- 7844 0 1400
-- 7876 0
-- 7900 0
-- 7902 0
-- 7934 0
This can nearly be achieved with the following (adapted to Presto from here):
select empno, comm, max(comm) over (partition by grp) prev_comm
from (
select empno, comm, sum(cast(comm is not null as double)) over (order by empno) grp
from example_table
)
order by empno
-- empno comm prev_comm
-- 7369
-- 7499 300 300
-- 7521 500 500
-- 7566 500
-- 7654 1400 1400
-- 7698 1400
-- 7782 1400
-- 7788 1400
-- 7839 1400
-- 7844 0 0
-- 7876 0
-- 7900 0
-- 7902 0
-- 7934 0
(the difference being that, for rows where comm is not NULL, prev_comm shows the current value rather than the previous one)
Actually, in my case, the difference doesn't matter, since I want coalesce(comm, prev_comm). However, this answer still does not suffice, because on the full data set it caused a memory failure:
Query exceeded local memory limit of 20GB
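For reference, what I ultimately want is just the same query with the windowed max wrapped in coalesce; a sketch under the same setup as above (this does not address the memory problem):
-- sketch: same grp trick as above, wrapped in coalesce
select empno,
       coalesce(comm, max(comm) over (partition by grp)) as comm_filled
from (
  select empno, comm,
         sum(cast(comm is not null as double)) over (order by empno) grp
  from example_table
)
order by empno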
The following outstanding pull request to Presto would implement ignore nulls directly; is there no way to accomplish an equivalent result in the interim?
https://github.com/prestodb/presto/pull/6157

MySQL Query with recursive self join to obtain full structure of results

I'm trying to create a query which will obtain the full transaction history of accounts in our database. The relationship is fairly simple: 'loanaccount' joins to 'loantransaction', and I can see the full history of that account.
However, some accounts will be refinanced or rescheduled, and for the full loan transaction history, I also need to link to the prior accounts. When completing a refinance/reschedule, the system creates a 'Transfer' activity on both the old and the new account, and I have been exploring this as a way to link them.
My attempts have largely related to using the parenttransactionkey in the transaction table - this is a foreign key back to the transaction table which links the two 'Transfer' activities which have been created. For example:
select la.id, la2.id
from loanaccount la
join loantransaction lt on lt.PARENTACCOUNTKEY = la.ENCODEDKEY
join loantransaction lt2 on lt2.PARENTLOANTRANSACTIONKEY = lt.ENCODEDKEY
join loanaccount la2 on lt2.PARENTACCOUNTKEY = la2.ENCODEDKEY
where lt.type = 'TRANSFER'
In the above example, la.id would give me the new account ID and la2.id would give me the old account ID. There can be any number of previous accounts, and I'm not sure how to arrive at a solution which will continue to join to the nth level, as this only appears to work when there is 1 new account and 1 old account.
To give a thorough example of what I am trying to achieve:
loanaccount
encodedkey id accountstate
a1a1a1 a1 active
b2b2b2 b2 closed
c3c3c3 c3 active
d4d4d4 d4 closed
e5e5e5 e5 closed
CREATE TABLE `loanaccount` (
`encodedkey` VARCHAR(32) NOT NULL,
`id` VARCHAR(32) NULL,
`accountstate` VARCHAR(32),
PRIMARY KEY (`encodedkey`));
INSERT INTO `loanaccount` (`encodedkey`, `id`, `accountstate`) VALUES ('a1a1a1','a1','active'),('b2b2b2','b2','closed'),('c3c3c3','c3','active'),('d4d4d4','d4','closed'),('e5e5e5','e5','closed');
loantransaction
encodedkey parentaccountkey amount type entrydate parentloantransactionkey
tra1 a1a1a1 1000 Interest 2017-12-31 null
tra2 a1a1a1 5000 Repayment 2017-12-01 null
tra3 a1a1a1 50000 Transfer 2017-11-15 null
tra4 b2b2b2 50000 Transfer 2017-11-15 tra3
tra5 b2b2b2 900 Interest 2017-10-30 null
tra6 b2b2b2 5000 Repayment 2017-10-01 null
tra7 b2b2b2 60000 Transfer 2017-09-15 null
tra8 d4d4d4 60000 Transfer 2017-09-15 tra7
tra9 d4d4d4 800 Interest 2017-09-30 null
tra10 c3c3c3 7500 Repayment 2018-01-31 null
tra11 c3c3c3 750000 Transfer 2018-01-01 null
tra12 e5e5e5 750000 Transfer 2018-01-01 tra11
tra13 e5e5e5 10000 Interest 2017-12-01 null
CREATE TABLE `loantransaction` (
`encodedkey` VARCHAR(32) NOT NULL,
`parentaccountkey` VARCHAR(32) NOT NULL,
`amount` DECIMAL(18,2) NULL,
`type` VARCHAR(32) NULL,
`entrydate` DATE NOT NULL,
`parentloantransactionkey` VARCHAR(32) NULL,
PRIMARY KEY (`encodedkey`));
INSERT INTO loantransaction (encodedkey, parentaccountkey, amount, type, entrydate, parentloantransactionkey) VALUES
('tra1', 'a1a1a1', 1000, 'Interest', '2017-12-31', NULL),
('tra2', 'a1a1a1', 5000, 'Repayment', '2017-12-01', NULL),
('tra3', 'a1a1a1', 50000, 'Transfer', '2017-11-15', NULL),
('tra4', 'b2b2b2', 50000, 'Transfer', '2017-11-15', 'tra3'),
('tra5', 'b2b2b2', 900, 'Interest', '2017-10-30', NULL),
('tra6', 'b2b2b2', 5000, 'Repayment', '2017-10-01', NULL),
('tra7', 'b2b2b2', 60000, 'Transfer', '2017-09-15', NULL),
('tra8', 'd4d4d4', 60000, 'Transfer', '2017-09-15', 'tra7'),
('tra9', 'd4d4d4', 800, 'Interest', '2017-09-30', NULL),
('tra10', 'c3c3c3', 7500, 'Repayment', '2018-01-31', NULL),
('tra11', 'c3c3c3', 750000, 'Transfer', '2018-01-01', NULL),
('tra12', 'e5e5e5', 750000, 'Transfer', '2018-01-01', 'tra11'),
('tra13', 'e5e5e5', 10000, 'Interest', '2017-12-01', NULL);
So, I want to have a query which will give me the full transaction history of the active account. So for account ID a1 I would like to see the following:
loanaccount.id encodedkey parentaccountkey amount type entrydate parentloantransactionkey
a1 tra1 a1a1a1 1000 Interest 2017-12-31 null
a1 tra2 a1a1a1 5000 Repayment 2017-12-01 null
a1 tra3 a1a1a1 50000 Transfer 2017-11-15 null
a1 tra4 b2b2b2 50000 Transfer 2017-11-15 tra3
a1 tra5 b2b2b2 900 Interest 2017-10-30 null
a1 tra6 b2b2b2 5000 Repayment 2017-10-01 null
a1 tra7 b2b2b2 60000 Transfer 2017-09-15 null
a1 tra8 d4d4d4 60000 Transfer 2017-09-15 tra7
a1 tra9 d4d4d4 800 Interest 2017-09-30 null
and for account ID c3 I should see:
loanaccount.id encodedkey parentaccountkey amount type entrydate parentloantransactionkey
c3 tra10 c3c3c3 7500 Repayment 2018-01-31 null
c3 tra11 c3c3c3 750000 Transfer 2018-01-01 null
c3 tra12 e5e5e5 750000 Transfer 2018-01-01 tra11
c3 tra13 e5e5e5 10000 Interest 2017-12-01 null
My research shows I may need to create a procedure to complete this, as shown in the following link:
Mysql query with recursive self JOIN
Ideally I'd like to avoid the procedure route; is there any way I can create a temporary table to achieve this?
Apologies for the lengthy post, and thank you for any help you can provide.
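One possible procedure-free sketch, assuming MySQL 8.0+ (which supports recursive CTEs; account_chain is a made-up name), walks from each active account through the paired Transfer rows to every prior account and then pulls the combined history:
-- sketch: assumes MySQL 8.0+ recursive CTE support
WITH RECURSIVE account_chain (root_id, encodedkey) AS (
  -- anchor: every active account is the root of its own chain
  SELECT la.id, la.encodedkey
  FROM loanaccount la
  WHERE la.accountstate = 'active'
  UNION ALL
  -- step: follow a Transfer on the current account to the paired
  -- Transfer on the prior account via parentloantransactionkey
  SELECT ac.root_id, la2.encodedkey
  FROM account_chain ac
  JOIN loantransaction lt
    ON lt.parentaccountkey = ac.encodedkey AND lt.type = 'Transfer'
  JOIN loantransaction lt2
    ON lt2.parentloantransactionkey = lt.encodedkey
  JOIN loanaccount la2
    ON lt2.parentaccountkey = la2.encodedkey
)
SELECT ac.root_id AS id, t.*
FROM account_chain ac
JOIN loantransaction t ON t.parentaccountkey = ac.encodedkey
ORDER BY ac.root_id, t.entrydate DESC;
On versions before 8.0, the same walk needs a stored procedure or a numbers-table self-join, as in the linked question.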

Inner join 3 tables

I have 6 tables in my database, and now I would like to inner join car_space, transaction and sport_facilities. However, I have a problem.
When I run these two SQL commands separately, each one works and I get the result I want.
-- car_space INNER JOIN transaction
SELECT * FROM car_space INNER JOIN transaction ON car_space.carSpaceId = transaction.carSpaceId ORDER BY transactionId;
-- sport_facilities INNER JOIN transaction
SELECT * FROM sport_facilities INNER JOIN transaction ON sport_facilities.sportFacilitiesId = transaction.sportFacilitiesId ORDER BY transactionId;
And then, I combine them into one command.
-- Combine But Not Work
SELECT * FROM transaction
INNER JOIN car_space ON transaction.carSpaceId = car_space.carSpaceId
INNER JOIN sport_facilities ON transaction.sportFacilitiesId = sport_facilities.sportFacilitiesId
ORDER BY transactionId;
Although this can be run, no results or records are shown.
What I want is for the database to find which table (car_space / sport_facilities) a record comes from when I type a transactionId.
For example:
If I type WHERE transactionId = 1,
the database should find that this record is from the sport_facilities table rather than car_space.
Thank you. Here is some code for reference.
-- Create a database
CREATE DATABASE booking_system;
-- Use This database
USE booking_system;
-- Create smartcard table
CREATE TABLE card(
cardId CHAR(8) NOT NULL,
PRIMARY KEY (cardId)
);
-- Insert some records into the card table
INSERT INTO card VALUES
('4332A0D5'),
('637ED500'),
('B3895A02'),
('E32F3702')
;
-- Create user table
CREATE TABLE user(
userId INT(5) NOT NULL AUTO_INCREMENT,
cardNo CHAR(8) NOT NULL,
firstName VARCHAR(255) NOT NULL,
lastName VARCHAR(255) NOT NULL,
sex CHAR(1) NOT NULL,
dob DATE NOT NULL,
hkid CHAR(8) NOT NULL,
email VARCHAR(255) NOT NULL,
telNo INT(8) NOT NULL,
PRIMARY KEY (userId),
FOREIGN KEY (cardNo) REFERENCES card (cardId) ON DELETE CASCADE,
UNIQUE (hkid)
);
-- Alter user table
ALTER TABLE user AUTO_INCREMENT = 16001;
-- Insert some records into the user table
INSERT INTO user VALUES
('','4332A0D5','Andy','Ding','M','1962-04-20','K5216117','mkding#yahoo.com','98626229'),
('','637ED500','Emma','Dai','F','1972-06-15','D5060339','emmadai#yahoo.com.hk','62937453'),
('','B3895A02','Brinsley','Au','F','1984-02-24','P8172327','da224#live.hk','91961624'),
('','E32F3702','Eric','Fong','M','1990-04-15','Y1129323','ericfong0415#gmail.com','98428731')
;
-- Create car space price table
CREATE TABLE car_space_price(
spaceNo INT(2) NOT NULL AUTO_INCREMENT,
price INT(2) NOT NULL,
carSpaceDescription VARCHAR(16),
CHECK (carSpaceDescription IN ('motorcycles','small vehicles','medium vehicles','large vehicles')),
PRIMARY KEY (spaceNo)
);
-- Insert some records into the car space price table
INSERT INTO car_space_price VALUES
('','10','motorcycles'), -- 1
('','10','motorcycles'), -- 2
('','10','motorcycles'), -- 3
('','10','motorcycles'), -- 4
('','10','motorcycles'), -- 5
('','20','small vehicles'), -- 6
('','20','small vehicles'), -- 7
('','20','small vehicles'), -- 8
('','20','small vehicles'), -- 9
('','20','small vehicles'), -- 10
('','40','medium vehicles'), -- 11
('','40','medium vehicles'), -- 12
('','40','medium vehicles'), -- 13
('','80','large vehicles'), -- 14
('','80','large vehicles') -- 15
;
-- Create car space table
CREATE TABLE car_space(
carSpaceId INT(5) NOT NULL AUTO_INCREMENT,
spaceNo INT(2) NOT NULL,
cardNo VARCHAR(8) NOT NULL,
inTime DATETIME,
outTime DATETIME,
PRIMARY KEY (carSpaceId),
FOREIGN KEY (spaceNo) REFERENCES car_space_price (spaceNo) ON DELETE CASCADE,
FOREIGN KEY (cardNo) REFERENCES card (cardId) ON DELETE CASCADE
);
-- Insert some records into the car space table
INSERT INTO car_space VALUES
('','2','E32F3702','2015-02-23 14:24:18','2015-02-23 17:01:43'), -- 1 --16004
('','6','B3895A02','2016-02-24 11:56:43','2016-02-25 09:21:08'), -- 2 --16003
('','2','E32F3702','2016-02-24 16:42:34','2016-02-24 21:02:45'), -- 3 --16004
('','2','E32F3702','2016-02-25 14:25:32','2016-02-25 17:03:54'), -- 4 --16004
('','6','B3895A02','2016-02-25 17:12:11','2016-02-25 20:58:18'), -- 5 --16003
('','13','637ED500','2016-02-25 19:17:03','2016-02-27 18:05:28'), -- 6 --16002
('','6','B3895A02','2016-02-25 21:14:03','2016-02-25 23:53:28'), -- 7 --16003
('','6','B3895A02','2016-02-26 08:46:23','2016-02-26 17:21:08'), -- 8 --16003
('','2','E32F3702','2016-02-26 14:15:45','2016-02-26 21:01:15'), -- 9 --16004
('','6','B3895A02','2016-02-27 09:42:13','2016-02-27 15:48:45'), -- 10 --16003
('','2','E32F3702','2016-02-27 13:25:45','2016-02-27 15:15:45'), -- 11 --16004
('','6','B3895A02','2016-02-28 10:57:16','2016-02-28 14:41:25'), -- 12 --16003
('','2','E32F3702','2016-02-28 11:47:32','2016-02-28 13:43:15'), -- 13 --16004
('','13','637ED500','2016-02-28 13:04:43','2016-03-02 22:39:46'), -- 14 --16002
('','2','E32F3702','2016-02-28 14:42:34','2016-02-28 21:47:45'), -- 15 --16004
('','6','B3895A02','2016-02-29 08:50:42','2016-02-29 14:28:42'), -- 16 --16003
('','2','E32F3702','2016-02-29 12:12:35','2016-02-29 16:45:28'), -- 17 --16004
('','6','B3895A02','2016-03-01 11:26:43','2016-03-01 14:56:26'), -- 18 --16003
('','6','B3895A02','2016-03-03 13:45:26','2016-03-03 17:54:18') -- 19 --16003
;
-- Create sport facilities price table
CREATE TABLE sport_facilities_price(
sportNo INT(2) NOT NULL AUTO_INCREMENT,
sportType VARCHAR(10) NOT NULL,
price INT(2) NOT NULL,
sportDescription VARCHAR(20),
PRIMARY KEY (sportNo)
);
-- Insert some records into the sport facilities price table
INSERT INTO sport_facilities_price VALUES
('','snooker','15','Snooker Room 1'), -- 1
('','snooker','15','Snooker Room 2'), -- 2
('','snooker','15','Snooker Room 3'), -- 3
('','snooker','15','Snooker Room 4'), -- 4
('','table_tennis','15','Table Tennis Room 1'), -- 5
('','table_tennis','15','Table Tennis Room 2'), -- 6
('','table_tennis','15','Table Tennis Room 3'), -- 7
('','table_tennis','15','Table Tennis Room 4'), -- 8
('','tennis','30','Tennis Venue 1'), -- 9
('','tennis','30','Tennis Venue 2'), -- 10
('','badminton','30','Badminton Venue 1'), -- 11
('','badminton','30','Badminton Venue 2'), -- 12
('','basketball','60','Hall') -- 13
;
-- Create sport facilities table
CREATE TABLE sport_facilities(
sportFacilitiesId INT(5) NOT NULL AUTO_INCREMENT,
sportNo INT(2) NOT NULL,
cardNo VARCHAR(8) NOT NULL,
bookDate DATE NOT NULL,
startTime TIME NOT NULL,
endTime TIME NOT NULL,
PRIMARY KEY (sportFacilitiesId),
FOREIGN KEY (sportNo) REFERENCES sport_facilities_price (sportNo) ON DELETE CASCADE,
FOREIGN KEY (cardNo) REFERENCES card (cardId) ON DELETE CASCADE
);
-- Insert some records into the sport facilities table
INSERT INTO sport_facilities VALUES
('','1','E32F3702','2015-02-23','12:00:00','14:00:00'), -- 1 --16004
('','5','B3895A02','2016-02-23','14:00:00','15:00:00'), -- 2 --16003
('','8','637ED500','2016-02-23','17:00:00','21:00:00'), -- 3 --16002
('','2','E32F3702','2016-02-24','09:00:00','11:00:00'), -- 4 --16004
('','5','4332A0D5','2016-02-24','13:00:00','14:00:00'), -- 5 --16001
('','7','637ED500','2016-02-24','15:00:00','17:00:00'), -- 6 --16002
('','8','B3895A02','2016-02-24','16:00:00','18:00:00'), -- 7 --16003
('','10','4332A0D5','2016-02-25','09:00:00','10:00:00'), -- 8 --16001
('','12','B3895A02','2016-02-25','13:00:00','14:00:00'), -- 9 --16003
('','6','637ED500','2016-02-25','21:00:00','22:00:00'), -- 10 --16002
('','4','637ED500','2016-02-26','11:00:00','13:00:00'), -- 11 --16002
('','8','4332A0D5','2016-02-26','22:00:00','23:00:00'), -- 12 --16001
('','13','B3895A02','2016-02-27','09:00:00','14:00:00'), -- 13 --16003
('','4','637ED500','2016-02-28','12:00:00','14:00:00'), -- 14 --16002
('','3','B3895A02','2016-02-28','14:00:00','15:00:00'), -- 15 --16003
('','4','E32F3702','2016-02-28','17:00:00','19:00:00'), -- 16 --16004
('','5','B3895A02','2016-02-28','21:00:00','22:00:00'), -- 17 --16003
('','2','4332A0D5','2016-02-28','21:00:00','23:00:00'), -- 18 --16001
('','10','E32F3702','2016-02-28','19:00:00','20:00:00'), -- 19 --16004
('','11','B3895A02','2016-02-29','11:00:00','13:00:00'), -- 20 --16003
('','8','E32F3702','2016-02-29','12:00:00','14:00:00'), -- 21 --16004
('','4','4332A0D5','2016-02-29','15:00:00','18:00:00'), -- 22 --16001
('','6','E32F3702','2016-03-01','09:00:00','11:00:00'), -- 23 --16004
('','5','637ED500','2016-03-01','12:00:00','15:00:00'), -- 24 --16002
('','3','B3895A02','2016-03-02','09:00:00','11:00:00'), -- 25 --16003
('','7','4332A0D5','2016-03-02','12:00:00','13:00:00'), -- 26 --16001
('','4','637ED500','2016-03-02','15:00:00','17:00:00'), -- 27 --16002
('','1','E32F3702','2016-03-02','19:00:00','22:00:00'), -- 28 --16004
('','12','4332A0D5','2016-03-03','11:00:00','13:00:00'), -- 29 --16001
('','9','E32F3702','2016-03-03','15:00:00','16:00:00'), -- 30 --16004
('','10','B3895A02','2016-03-03','09:00:00','11:00:00'), -- 31 --16003
('','4','637ED500','2016-03-04','11:00:00','12:00:00'), -- 32 --16002
('','8','E32F3702','2016-03-04','14:00:00','16:00:00'), -- 33 --16004
('','6','B3895A02','2016-03-05','19:00:00','21:00:00'), -- 34 --16003
('','13','E32F3702','2016-03-05','11:00:00','12:00:00'), -- 35 --16004
('','8','637ED500','2016-03-05','14:00:00','15:00:00'), -- 36 --16002
('','4','4332A0D5','2016-03-05','16:00:00','18:00:00'), -- 37 --16001
('','5','E32F3702','2016-03-06','13:00:00','15:00:00'), -- 38 --16004
('','9','B3895A02','2016-03-06','17:00:00','18:00:00'), -- 39 --16003
('','11','4332A0D5','2016-03-07','20:00:00','21:00:00'), -- 40 --16001
('','5','B3895A02','2016-03-07','22:00:00','23:00:00') -- 41 --16003
;
-- Create transaction table
CREATE TABLE transaction(
transactionId INT(5) UNSIGNED ZEROFILL NOT NULL AUTO_INCREMENT,
userId INT(5) NOT NULL,
carSpaceId INT(5),
sportFacilitiesId INT(5),
transactionDate DATE NOT NULL,
PRIMARY KEY (transactionId),
FOREIGN KEY (userId) REFERENCES user (userId) ON DELETE CASCADE,
FOREIGN KEy (carSpaceId) REFERENCES car_space (carSpaceId) ON DELETE CASCADE,
FOREIGN KEY (sportFacilitiesId) REFERENCES sport_facilities (sportFacilitiesId) ON DELETE CASCADE
);
-- Insert some records into the transaction table
INSERT INTO transaction VALUES
('','16004',NULL,'1','2015-02-23'), -- 1 -- Sport Facilities
('','16003',NULL,'5','2015-02-23'), -- 2 -- Sport Facilities
('','16004','2',NULL,'2015-02-23'), -- 3 -- Car Space
('','16002',NULL,'8','2015-02-23'), -- 4 -- Sport Facilities
('','16004',NULL,'2','2016-02-24'), -- 5 -- Sport Facilities
('','16003','6',NULL,'2016-02-24'), -- 6 -- Car Space
('','16001',NULL,'5','2016-02-24'), -- 7 -- Sport Facilities
('','16002',NULL,'7','2016-02-24'), -- 8 -- Sport Facilities
('','16003',NULL,'8','2016-02-24'), -- 9 -- Sport Facilities
('','16004','2',NULL,'2016-02-24'), -- 10 -- Car Space
('','16001',NULL,'10','2016-02-25'), -- 11 -- Sport Facilities
('','16003',NULL,'12','2016-02-25'), -- 12 -- Sport Facilities
('','16004','2',NULL,'2016-02-25'), -- 13 -- Car Space
('','16003','6',NULL,'2016-02-25'), -- 14 -- Car Space
('','16002','13',NULL,'2016-02-25'), -- 15 -- Car Space
('','16002',NULL,'6','2016-02-25'), -- 16 -- Sport Facilities
('','16003','6',NULL,'2016-02-25'), -- 17 -- Car Space
('','16003','6',NULL,'2016-02-26'), -- 18 -- Car Space
('','16002',NULL,'4','2016-02-26'), -- 19 -- Sport Facilities
('','16004','2',NULL,'2016-02-26'), -- 20 -- Car Space
('','16001',NULL,'8','2016-02-26'), -- 21 -- Sport Facilities
('','16003',NULL,'13','2016-02-27'), -- 22 -- Sport Facilities
('','16003','6',NULL,'2016-02-27'), -- 23 -- Car Space
('','16004','2',NULL,'2016-02-27'), -- 24 -- Car Space
('','16003','6',NULL,'2016-02-28'), -- 25 -- Car Space
('','16004','2',NULL,'2016-02-28'), -- 26 -- Car Space
('','16002',NULL,'4','2016-02-28'), -- 27 -- Sport Facilities
('','16002','13',NULL,'2016-02-28'), -- 28 -- Car Space
('','16003',NULL,'3','2016-02-28'), -- 29 -- Sport Facilities
('','16004','2',NULL,'2016-02-28'), -- 30 -- Car Space
('','16004',NULL,'4','2016-02-28'), -- 31 -- Sport Facilities
('','16003',NULL,'5','2016-02-28'), -- 32 -- Sport Facilities
('','16001',NULL,'2','2016-02-28'), -- 33 -- Sport Facilities
('','16004',NULL,'10','2016-02-28'), -- 34 -- Sport Facilities
('','16003','6',NULL,'2016-02-29'), -- 35 -- Car Space
('','16003',NULL,'11','2016-02-29'), -- 36 -- Sport Facilities
('','16004',NULL,'8','2016-02-29'), -- 37 -- Sport Facilities
('','16004','2',NULL,'2016-02-29'), -- 38 -- Car Space
('','16001',NULL,'4','2016-02-29'), -- 39 -- Sport Facilities
('','16004',NULL,'6','2016-03-01'), -- 40 -- Sport Facilities
('','16003','6',NULL,'2016-03-01'), -- 41 -- Car Space
('','16002',NULL,'5','2016-03-01'), -- 42 -- Sport Facilities
('','16003',NULL,'3','2016-03-02'), -- 43 -- Sport Facilities
('','16001',NULL,'7','2016-03-02'), -- 44 -- Sport Facilities
('','16002',NULL,'4','2016-03-02'), -- 45 -- Sport Facilities
('','16004',NULL,'1','2016-03-02'), -- 46 -- Sport Facilities
('','16001',NULL,'12','2016-03-03'), -- 47 -- Sport Facilities
('','16003','6',NULL,'2016-03-03'), -- 48 -- Car Space
('','16004',NULL,'9','2016-03-03'), -- 49 -- Sport Facilities
('','16003',NULL,'10','2016-03-03'), -- 50 -- Sport Facilities
('','16002',NULL,'4','2016-03-04'), -- 51 -- Sport Facilities
('','16004',NULL,'8','2016-03-04'), -- 52 -- Sport Facilities
('','16003',NULL,'6','2016-03-05'), -- 53 -- Sport Facilities
('','16004',NULL,'13','2016-03-05'), -- 54 -- Sport Facilities
('','16002',NULL,'8','2016-03-05'), -- 55 -- Sport Facilities
('','16001',NULL,'4','2016-03-05'), -- 56 -- Sport Facilities
('','16004',NULL,'5','2016-03-06'), -- 57 -- Sport Facilities
('','16003',NULL,'9','2016-03-06'), -- 58 -- Sport Facilities
('','16001',NULL,'11','2016-03-07'), -- 59 -- Sport Facilities
('','16003',NULL,'5','2016-03-07') -- 60 -- Sport Facilities
;
How do you wish to combine the rows?
Looks like all your transactions that reference a car space have a NULL sports facility reference and vice versa.
Queries are processed row by row: when you INNER JOIN transaction to just car_space, you get all the transaction records that have car space references, along with their car space record. All the other transactions are filtered out.
As none of these filtered transaction + car space rows have sports facility references (all NULL), when you then add the INNER JOIN to sports facilities there are no matching rows, and again the non-matching rows are filtered out. This leaves you with an empty result set.
To get any results back from the double INNER JOIN query, the transaction row would have to reference (or link) a car space AND a sports facility.
If you want to keep all the transaction rows with either their car space or sports facility and a NULLed out record for whichever is not referenced, you could change the INNER JOINs to LEFT JOINs (just replace the words INNER with LEFT in your final query).
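That is (the final query with INNER replaced by LEFT):
SELECT * FROM transaction
LEFT JOIN car_space ON transaction.carSpaceId = car_space.carSpaceId
LEFT JOIN sport_facilities ON transaction.sportFacilitiesId = sport_facilities.sportFacilitiesId
ORDER BY transactionId;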
In this case, I believe you would want to use the UNION operator. You don't have transaction IDs that match in both tables, which is why you are returning 0 rows. A LEFT JOIN may also work for you (note that MySQL has no FULL OUTER JOIN). Both sides of a UNION must return the same columns, and ORDER BY can only appear once, after the final SELECT:
SELECT transaction.* FROM car_space
INNER JOIN transaction ON car_space.carSpaceId = transaction.carSpaceId
UNION
SELECT transaction.* FROM sport_facilities
INNER JOIN transaction ON sport_facilities.sportFacilitiesId = transaction.sportFacilitiesId
ORDER BY transactionId;
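If you want the result to name the source table explicitly, a small variation of the above (a sketch; the source label column is made up for illustration):
-- sketch: 'source' is an illustrative label column
SELECT t.transactionId, 'car_space' AS source
FROM transaction t
INNER JOIN car_space c ON t.carSpaceId = c.carSpaceId
UNION ALL
SELECT t.transactionId, 'sport_facilities'
FROM transaction t
INNER JOIN sport_facilities s ON t.sportFacilitiesId = s.sportFacilitiesId
ORDER BY transactionId;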

How to count repeating data in mysql

id | category
001 | 1000
001 | 300
002 | 500
003 | 200;300;100
004 | 100;300
005 | 200;3000
The result should be
Category | Total
1000 | 1
300 | 3
500 | 1
200 | 2
100 | 2
How can I arrive at that result? I saw something saying I need to use find_in_set, but it's kind of complicated for me.
Any help on this will be greatly appreciated!
PS: I know the solution for this is to normalize, but I guess that's a lot of work and I don't have access to change the database structure. So if there's a way to make a query work, that would be great! :)
Thank you!
OK, my fault on the previous answer!
Below is a way to split a string by a delimiter in MySQL without using a stored procedure.
To use the method you will first need to have another table that has numbers from 1 up to however many choices each row can store. This table will be used in a join, so that the first choice will be joined to the row with number 1, the second choice to row 2, etc. So you would need a table like this:
id
1
2
3
4
5
...
Let's say your main table is called maintable with a category column, and your other table is called othertable with an id column (though you could use any table that had sequential numbers or id numbers).
This is what I used to create the tables for this example:
CREATE TABLE maintable (id INT, category VARCHAR(255));
INSERT INTO maintable VALUES (1, '1000'), (2, '300'), (3, '500'), (4, '200;300;100'), (4, '100;300'), (4, '200;3000');
CREATE TABLE othertable (id INT);
INSERT INTO othertable VALUES (1), (2), (3), (4), (5), (6), (7), (8);
This is the MySQL code:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(maintable.category,';',othertable.id),';',-1) AS category,
       COUNT(*) AS numtimes
FROM maintable INNER JOIN othertable ON
(LENGTH(maintable.category)>0 AND SUBSTRING_INDEX(SUBSTRING_INDEX(maintable.category,';',othertable.id),';',-1)
 <> SUBSTRING_INDEX(SUBSTRING_INDEX(maintable.category,';',othertable.id-1),';',-1))
GROUP BY 1 ORDER BY 1;
-- Note: GROUP BY 1 groups on the split-out token; a bare GROUP BY category would
-- resolve to the unsplit maintable.category column and produce wrong counts.
and I got this result:
category numtimes
100 2
1000 1
200 2
300 3
3000 1
500 1

Query taking too much time to execute

My query is taking around 2,800 seconds to produce its output.
We have indexes as well, but no luck.
My target is to get the output within 2 to 3 seconds.
If possible, please rewrite the query.
Query:
select ttl.id, ttl.url, ttl.canonical_url_id
from t_target_url ttl
where ttl.own_domain_id=476 and ttl.type != 10
order by ttl.week_entrances desc
limit 550000;
Explain Plan:
+----+-------------+-------+------+--------------------------------+---------------------------+---------+-------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+--------------------------------+---------------------------+---------+-------+----------+-----------------------------+
| 1 | SIMPLE | ttl | ref | own_domain_id_type_status,type | own_domain_id_type_status | 5 | const | 57871959 | Using where; Using filesort |
+----+-------------+-------+------+--------------------------------+---------------------------+---------+-------+----------+-----------------------------+
1 row in set (0.80 sec)
mysql> show create table t_target_url\G
*************************** 1. row ***************************
Table: t_target_url
Create Table: CREATE TABLE `t_target_url` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`own_domain_id` int(11) DEFAULT NULL,
`url` varchar(2000) NOT NULL,
`create_date` datetime DEFAULT NULL,
`friendly_name` varchar(255) DEFAULT NULL,
`section_name_id` int(11) DEFAULT NULL,
`type` int(11) DEFAULT NULL,
`status` int(11) DEFAULT NULL,
`week_entrances` int(11) DEFAULT NULL COMMENT 'last 7 days entrances',
`week_bounces` int(11) DEFAULT NULL COMMENT 'last 7 days bounce',
`canonical_url_id` int(11) DEFAULT NULL COMMENT 'the primary URL ID, NOT allow canonical of canonical',
KEY `id` (`id`),
KEY `urlindex` (`url`(255)),
KEY `own_domain_id_type_status` (`own_domain_id`,`type`,`status`),
KEY `canonical_url_id` (`canonical_url_id`),
KEY `type` (`type`,`status`)
) ENGINE=InnoDB AUTO_INCREMENT=227984392 DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (`type`)
(PARTITION p0 VALUES LESS THAN (0) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (1) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (2) ENGINE = InnoDB,
PARTITION pEOW VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)
Your query itself looks fine; however, the ORDER BY clause over a possible half-million records is probably your killer. I would add an index to help optimize that portion via
( own_domain_id, week_entrances, type )
This way, you first hit your critical key "own_domain_id" and then get everything already in order. The type filter is != 10, which matches any other type; putting type in the second index position would appear to cause more problems.
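For example (a sketch; the index name is arbitrary):
-- index name is arbitrary
ALTER TABLE t_target_url
  ADD INDEX own_domain_week_type (own_domain_id, week_entrances, type);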
Comment Feedback.
For simplicity, your critical key per the where clause is "ttl.own_domain_id=476". You only care about data for domain ID 476. Now, let's assume you have 15 "types" spanning different week entrances, such as
own_domain_id type week_entrances
476 1 1000
476 1 1700
476 1 850
476 2 15000
476 2 4250
476 2 12000
476 7 2500
476 7 5300
476 10 1250
476 10 4100
476 12 8000
476 12 3150
476 15 5750
476 15 27000
This obviously is not at the scale of your half-million rows, but it shows the idea with sample data.
With type != 10, the engine STILL has to blow through all the records for id=476 and exclude only those with type = 10. It then has to sort all that data by week entrances, which takes more time. With week_entrances in the second index position and type in the third, the result set can be returned already in the proper order; when a row with type = 10 is encountered, it is simply skipped. Here is the revised index data for the sample above.
own_domain_id week_entrances type
476 850 1
476 1000 1
476 1250 10
476 1700 1
476 2500 7
476 3150 12
476 4100 10
476 4250 2
476 5300 7
476 5750 15
476 8000 12
476 12000 2
476 15000 2
476 27000 15
So, as you can see, the data is already pre-sorted per the index, and applying DESCENDING order is no problem for the engine, just pulls the records in reverse order and skips the 10's as they are found.
Does that help?
Additional comment feedback per Salman.
Think of this another way: a store chain with 10 branch locations, each with its own sales. The transaction receipts are stored in boxes (literally). Think of how you would want to go through the boxes if you were looking for all transactions on a given date.
Box 1 = Store #1 only, and transactions sorted by date
Box 2 = Store #2 only, and transactions sorted by date
Box ...
Box 10 = Store #10 only, sorted by date.
You have to go through 10 boxes, pulling out all for a given date... Or in the original question, every transaction EXCEPT for one date, and you want them in order by dollar amount of transaction, regardless of date... What a mess that could be.
If instead you had the boxes pre-sorted into amount groups, regardless of store
Box 1 = Sales from $1 - $1000 (all properly sorted by amount)
Box 2 = Sales from $1001 - $2000 (properly sorted)
Box ...
Box 10... same...
You STILL have to go through all the boxes, but the transactions are already in dollar order; as you look through them, you just throw out the ones for the excluded date.
Indexes help pre-organize how the engine can best go through them for your criteria.

PL/SQL rownum updates

I am working on a database with a couple of tables:
districts table
  PK district_id
student_data table
  PK study_id
  FK district_id
ga_data table
  PK study_id
  district_id
The ga_data table is data that I am adding in. Both the student_data table and ga_data have 1.3 million records. The study_id's are 1 to 1 between the two tables, but the ga_data.district_id's are NULL and need to be updated. I am having trouble with the following PL/SQL:
update ga_data
set district_id = (select district_id from student_data
where student_data.study_id = ga_data.study_id)
where ga_data.district_id is null and rownum < 100;
I need to do it incrementally, so that's why I need rownum. But am I using it correctly? After running the query a bunch of times, it has only updated about 8,000 records of the 1.3 million (it should be about 1.1 million updates, since some of the district_ids are null in student_data). Thanks!
ROWNUM just chops off the query after the first n rows. You have some rows in STUDENT_DATA which have a NULL for DISTRICT_ID. So after a number of runs your query is liable to get stuck in a rut, returning the same 100 QA_DATA records, all of which match one of those pesky STUDENT_DATA rows.
So you need some mechanism for ensuring that you are working your way progressively through the QA_DATA table. A flag column would be one solution. Partitioning the query so it hits a different set of STUDENT_IDs is another.
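For example, a sketch of the flag-column idea (it assumes you are allowed to add a column to GA_DATA; the processed column name is made up):
-- sketch: 'processed' is a made-up flag column
alter table ga_data add processed char(1) default 'N';

update ga_data
set    district_id = (select district_id
                      from   student_data
                      where  student_data.study_id = ga_data.study_id)
     , processed   = 'Y'
where  processed = 'N'
and    rownum < 100;
Because each run flags the rows it has touched, subsequent runs move on to fresh rows even when the looked-up DISTRICT_ID is NULL.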
It's not clear why you have to do this in batches of 100, but perhaps the easiest way of doing this would be to use BULK PROCESSING (at least in Oracle: this PL/SQL syntax won't work in MySQL).
Here is some test data:
select district_id, count(*)
from student_data
group by district_id;
DISTRICT_ID COUNT(*)
----------- ----------
7369 192
7499 190
7521 192
7566 190
7654 192
7698 191
7782 191
7788 191
7839 191
7844 192
7876 191
7900 192
7902 191
7934 192
8060 190
8061 193
8083 190
8084 193
8085 190
8100 193
8101 190
183
22 rows selected.
select district_id, count(*)
from qa_data
group by district_id;
DISTRICT_ID COUNT(*)
----------- ----------
4200
This anonymous block uses the Bulk processing LIMIT clause to batch the result set into chunks of 100 rows.
declare
  cursor c_qa is
    select qa.student_id
         , s.district_id
    from   qa_data qa
    join   student_data s
      on   (s.student_id = qa.student_id);

  -- type the collection on the cursor row, so it matches the selected columns
  type qa_nt is table of c_qa%rowtype;
  qa_recs qa_nt;
begin
  open c_qa;

  loop
    fetch c_qa bulk collect into qa_recs limit 100;
    exit when qa_recs.count() = 0;

    for i in qa_recs.first()..qa_recs.last()
    loop
      update qa_data qt
      set    qt.district_id = qa_recs(i).district_id
      where  qt.student_id  = qa_recs(i).student_id;
    end loop;

  end loop;

  close c_qa;
end;
/

PL/SQL procedure successfully completed.
Note that this construct allows us to do additional processing on the selected rows before issuing the update. This is handy if we need to apply complicated fixes programmatically.
As you can see, the data in QA_DATA now matches that in STUDENT_DATA
select district_id, count(*)
from qa_data
group by district_id;
DISTRICT_ID COUNT(*)
----------- ----------
7369 192
7499 190
7521 192
7566 190
7654 192
7698 191
7782 191
7788 191
7839 191
7844 192
7876 191
7900 192
7902 191
7934 192
8060 190
8061 193
8083 190
8084 193
8085 190
8100 193
8101 190
183
22 rows selected.
It is kind of an odd requirement to only update 100 rows at a time. Why is that?
Anyway, since district_id in student_data can be null, you might be updating the same 100 rows over and over again.
If you extend your query to make sure a non-null district_id exists, you might end up where you want to be:
update ga_data
set district_id = (
select district_id
from student_data
where student_data.study_id = ga_data.study_id
)
where ga_data.district_id is null
and exists (
select 1
from student_data
where student_data.study_id = ga_data.study_id
and district_id is not null
)
and rownum < 100;
If this is a one-time conversion you should consider a completely different approach. Recreate the table as the join of your two tables. I promise you will laugh out loud when you realise how fast it is compared to all kinds of funny 100-rows-at-a-time updates.
create table new_table as
select study_id
,s.district_id
,g.the_remaining_columns_in_ga_data
from student_data s
join ga_data g using(study_id);
-- create indexes, constraints, etc.
drop table ga_data;
alter table new_table rename to ga_data;
Or if it isn't a one time conversion or you can't re-create/drop tables or you just feel like spending a few extra hours on data loading:
merge
into ga_data g
using student_data s
on (g.study_id = s.study_id)
when matched then
update
set g.district_id = s.district_id;
The last statement can also be rewritten as an updatable view, but I personally never use them.
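For reference, a sketch of that updatable-view form (it only works when study_id is unique in student_data, so that the join view is key-preserved):
-- sketch: requires study_id unique in student_data (key-preserved view)
update (select g.district_id as g_district_id
             , s.district_id as s_district_id
        from   ga_data g
        join   student_data s on (g.study_id = s.study_id))
set    g_district_id = s_district_id;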
Dropping or disabling the indexes and constraints on ga_data.district_id before running the merge, and recreating them afterwards, will improve performance.