How can I write a loop that fills a database table with fake data, so that I end up with 500,000 records? My table has the fields shown below. The ranges are: customer_id 1-1000, staff_id 1-5, car_id 1-10,000, qty 1-3, and date_ordered and date_returned both between 1975 and 2017. date_returned should be 2-3 days after date_ordered.
Any help on this would be much appreciated!
CREATE TABLE car_transaction
(
transaction_id INTEGER NOT NULL,
customer_id INTEGER,
staff_id INTEGER,
car_ID INTEGER,
QTY INTEGER,
date_ordered,
date_returned,
PRIMARY KEY (transaction_id));
This is entirely possible in pure MySQL SQL.
This is the table I used:
CREATE TABLE car_transaction
(
transaction_id INTEGER NOT NULL AUTO_INCREMENT, # included AUTO_INCREMENT HERE
customer_id INTEGER,
staff_id INTEGER,
car_ID INTEGER,
QTY INTEGER,
date_ordered DATE, # made DATE type
date_returned DATE, # made DATE type
PRIMARY KEY (transaction_id)
);
for customer_id we have 1-1000, staff_id we have 1-5 staff, car_id is
between 1-10,000, qty is 1-3
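These fields have clear range requirements, so you can use MySQL's RAND() function in a formula to generate them. Be aware that because of the ROUND, the two endpoints MIN and MAX each come up about half as often as the values in between; FLOOR((RAND() * (MAX - MIN + 1)) + MIN) gives a uniform spread if that matters. The formula used here is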
SELECT ROUND((RAND() * (MAX - MIN)) + MIN)
So for example for customer id the formula is
SELECT ROUND((RAND() * (1000 - 1)) + 1)
first try result
ROUND((RAND() * (1000 - 1)) + 1)
----------------------------------
648
second try result
ROUND((RAND() * (1000 - 1)) + 1)
----------------------------------
486
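To see how the ROUND formula distributes values, here is a small Python sketch (not part of the MySQL solution). The function names are made up for illustration, and floor(x + 0.5) is used as a stand-in for MySQL's ROUND on positive numbers. It compares the ROUND formula with a FLOOR variant over the qty range 1-3.

```python
import math
import random
from collections import Counter


def round_formula(lo, hi, rng):
    # Mirrors ROUND((RAND() * (MAX - MIN)) + MIN).
    # floor(x + 0.5) rounds half up, like MySQL ROUND for positive values.
    return math.floor(rng.random() * (hi - lo) + lo + 0.5)


def floor_formula(lo, hi, rng):
    # Mirrors FLOOR((RAND() * (MAX - MIN + 1)) + MIN).
    return math.floor(rng.random() * (hi - lo + 1)) + lo


def distribution(formula, lo, hi, n=100_000, seed=1):
    # Share of draws that landed on each value in lo..hi.
    rng = random.Random(seed)
    counts = Counter(formula(lo, hi, rng) for _ in range(n))
    return {value: counts[value] / n for value in range(lo, hi + 1)}


print("ROUND:", distribution(round_formula, 1, 3))
print("FLOOR:", distribution(floor_formula, 1, 3))
```

With the ROUND version, 1 and 3 should each show up in roughly a quarter of the draws and 2 in roughly half; the FLOOR version should give each value about a third.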
date_ordered is from 1975 to 2017, date_returned is from 1975 to 2017,
For the dates the difference between date_ordered and date_returned
should be between 2-3 days.
The date formula is a bit more complex.
But it still uses the ROUND((RAND() * (MAX - MIN)) + MIN) formula
SELECT DATE(FROM_UNIXTIME(ROUND((RAND() * (UNIX_TIMESTAMP('2017-12-31') - UNIX_TIMESTAMP('1975-01-01'))) + UNIX_TIMESTAMP('1975-01-01'))))
first try result
DATE(FROM_UNIXTIME(ROUND((RAND() * (UNIX_TIMESTAMP('2017-12-31') - UNIX_TIMESTAMP('1975-01-01'))) + UNIX_TIMESTAMP('1975-01-01'))))
-------------------------------------------------------------------------------------------------------------------------------------
2005-08-04
second try result
DATE(FROM_UNIXTIME(ROUND((RAND() * (UNIX_TIMESTAMP('2017-12-31') - UNIX_TIMESTAMP('1975-01-01'))) + UNIX_TIMESTAMP('1975-01-01'))))
-------------------------------------------------------------------------------------------------------------------------------------
1998-07-22
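The same idea outside the database, as a rough Python sketch: pick a random moment between two timestamps, cut it down to a date, then add 2 or 3 days for the return date. The function name random_order_dates is hypothetical. As in the SQL above, the upper bound is midnight at the start of 2017-12-31, so that last day is almost never drawn, and a return date can spill into early January 2018.

```python
import random
from datetime import datetime, timedelta

RANGE_START = datetime(1975, 1, 1)
RANGE_END = datetime(2017, 12, 31)


def random_order_dates(rng):
    # Random moment in [RANGE_START, RANGE_END), truncated to a date,
    # like DATE(FROM_UNIXTIME(RAND() * (end - start) + start)).
    span_seconds = (RANGE_END - RANGE_START).total_seconds()
    moment = RANGE_START + timedelta(seconds=rng.random() * span_seconds)
    ordered = moment.date()
    # The return date is 2 or 3 days after the order date.
    returned = ordered + timedelta(days=rng.randint(2, 3))
    return ordered, returned


rng = random.Random()
for _ in range(3):
    print(random_order_dates(rng))
```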
Now we will generate one record of data, combining all the previous steps.
Query
SELECT
record.customer_id
, record.staff_id
, record.car_id
, record.qty
, record.date_ordered
, record.date_ordered + INTERVAL record.random_day DAY AS date_returned
FROM (
SELECT
(SELECT ROUND((RAND() * (1000 - 1)) + 1)) AS customer_id
, (SELECT ROUND((RAND() * (5 - 1)) + 1)) AS staff_id
, (SELECT ROUND((RAND() * (10000 - 1)) + 1)) AS car_id
, (SELECT ROUND((RAND() * (3 - 1)) + 1)) AS qty
, (DATE(FROM_UNIXTIME(FLOOR((RAND() * (UNIX_TIMESTAMP('2017-12-31') - UNIX_TIMESTAMP('1975-01-01'))) + UNIX_TIMESTAMP('1975-01-01')))) ) AS date_ordered
, (SELECT ROUND((RAND() * (3 - 2)) + 2)) AS random_day
FROM
DUAL
)
record
first try result
customer_id staff_id car_id qty date_ordered date_returned
----------- -------- ------ ------ ------------ ---------------
633 2 5553 3 2011-11-21 2011-11-24
second try result
customer_id staff_id car_id qty date_ordered date_returned
----------- -------- ------ ------ ------------ ---------------
300 4 2380 2 2010-08-21 2010-08-23
Procedure
DELIMITER $$
CREATE
PROCEDURE generate_random_data_car_transaction(IN numberOfRows INT)
BEGIN
DECLARE counter INT;
SET counter = 1;
WHILE (counter <= numberOfRows) DO
INSERT INTO
car_transaction
(
customer_id
, staff_id
, car_id
, qty
, date_ordered
, date_returned
)
SELECT
record.customer_id
, record.staff_id
, record.car_id
, record.qty
, record.date_ordered
, record.date_ordered + INTERVAL record.random_day DAY AS date_returned
FROM (
SELECT
(SELECT ROUND((RAND() * (1000 - 1)) + 1)) AS customer_id
, (SELECT ROUND((RAND() * (5 - 1)) + 1)) AS staff_id
, (SELECT ROUND((RAND() * (10000 - 1)) + 1)) AS car_id
, (SELECT ROUND((RAND() * (3 - 1)) + 1)) AS qty
, (DATE(FROM_UNIXTIME(FLOOR((RAND() * (UNIX_TIMESTAMP('2017-12-31') - UNIX_TIMESTAMP('1975-01-01'))) + UNIX_TIMESTAMP('1975-01-01')))) ) AS date_ordered
, (SELECT ROUND((RAND() * (3 - 2)) + 2)) AS random_day
FROM
DUAL
)
record;
SET counter = counter + 1;
END WHILE;
END$$
DELIMITER ;
CALL Procedure
CALL generate_random_data_car_transaction(500000);
Query
SELECT * FROM car_transaction
Result
transaction_id customer_id staff_id car_ID QTY date_ordered date_returned
-------------- ----------- -------- ------ ------ ------------ ---------------
1 757 2 2621 2 1982-03-10 1982-03-13
2 818 1 368 3 1989-06-06 1989-06-08
3 47 2 8538 2 2009-09-30 2009-10-02
4 670 2 4597 2 2005-03-20 2005-03-22
5 216 2 7651 3 2000-10-08 2000-10-10
6 502 2 1364 2 1978-03-28 1978-03-30
7 204 2 1910 2 2009-03-17 2009-03-20
8 398 2 3934 1 2013-07-02 2013-07-04
9 474 1 9286 2 1991-08-06 1991-08-09
10 976 1 724 2 2000-05-09 2000-05-12
...
...
...
499990 20 5 6595 2 1990-05-01 1990-05-03
499991 839 1 7315 2 1989-12-05 1989-12-07
499992 14 3 1274 2 1987-11-12 1987-11-14
499993 539 2 5422 1 1994-06-24 1994-06-26
499994 728 5 7441 3 2000-05-12 2000-05-15
499995 512 3 4039 2 1978-02-03 1978-02-06
499996 732 5 2599 2 1990-01-11 1990-01-14
499997 304 5 6098 2 2011-11-25 2011-11-27
499998 818 2 8196 2 1984-01-14 1984-01-16
499999 617 5 8160 2 2016-03-15 2016-03-18
500000 864 3 7837 2 1980-01-13 1980-01-15
If this were my project I'd look around the intertoobz for a software package or web app capable of generating random test data.
I'd get that to give me a CSV file full of data for all columns except the first.
I'd change the table definition to make the first column autoincrement.
Then I'd use
LOAD DATA INFILE 'filename' INTO TABLE car_transaction
COLUMNS TERMINATED BY ','
LINES TERMINATED BY '\r\n' /* or maybe just '\n' */
(customer_id, staff_id, car_ID, QTY, date_ordered, date_returned)
to slurp the data from the file (called filename) into the table.
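If no ready-made generator is at hand, a short script can produce the CSV. This is only a sketch using the Python standard library; the names fake_rows and write_csv are made up, and it assumes order dates are drawn so that the return date still falls inside 2017.

```python
import csv
import io
import random
from datetime import date, timedelta

FIRST_DAY = date(1975, 1, 1)
LAST_DAY = date(2017, 12, 31)


def fake_rows(count, seed=None):
    # Yields (customer_id, staff_id, car_ID, QTY, date_ordered, date_returned).
    rng = random.Random(seed)
    # Assumption: leave 3 days of room so date_returned stays in range.
    latest_offset = (LAST_DAY - FIRST_DAY).days - 3
    for _ in range(count):
        ordered = FIRST_DAY + timedelta(days=rng.randint(0, latest_offset))
        returned = ordered + timedelta(days=rng.randint(2, 3))
        yield (
            rng.randint(1, 1000),
            rng.randint(1, 5),
            rng.randint(1, 10000),
            rng.randint(1, 3),
            ordered.isoformat(),
            returned.isoformat(),
        )


def write_csv(stream, count, seed=None):
    # One comma-separated line per row, terminated by '\n'.
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerows(fake_rows(count, seed))


# Small demonstration written to memory instead of a file.
buffer = io.StringIO()
write_csv(buffer, 5, seed=42)
print(buffer.getvalue())
```

For the real load you would open a file with newline="" and call write_csv(f, 500000), then use LINES TERMINATED BY '\n' in the LOAD DATA statement.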
Related
I have a table like below
session stepId starttime
------ ----------- -----
1 1 10:00
1 1 10:10
1 2 10:40
1 3 10:50
1 4 11:00
What I am aiming to calculate is the average time between consecutive step IDs. If a stepId appears more than once, as in the first two rows, the most recent row is used.
For example, for the above data, the result should be ((10:40 - 10:10) + (10:50 - 10:40) + (11:00 - 10:50)) / 3.
I am using MySQL.
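SELECT
-- average the gaps in seconds, then convert back to a time value
SEC_TO_TIME(AVG(M.diff_seconds)) AS avg_step_time
FROM
(SELECT
-- TIMEDIFF instead of "b - a": subtracting TIME values with "-"
-- treats them as plain numbers (11:00 - 10:50 would give 50 minutes)
TIME_TO_SEC(TIMEDIFF(b.starttime, a.starttime)) AS diff_seconds
FROM
(SELECT
stepId, MAX(starttime) AS starttime
FROM
test.test
GROUP BY stepId) a
-- join to the same "latest row per step" set so duplicates are ignored on both sides
JOIN
(SELECT
stepId, MAX(starttime) AS starttime
FROM
test.test
GROUP BY stepId) b ON a.stepId = b.stepId - 1) AS M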
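To check the expected number by hand, here is the same calculation as a small Python sketch. The function name average_step_gap is made up, and it assumes the rows belong to a single session with times given as 'HH:MM' strings.

```python
from datetime import datetime, timedelta


def average_step_gap(rows):
    # rows: iterable of (step_id, 'HH:MM') pairs for one session.
    latest = {}
    for step_id, start in rows:
        moment = datetime.strptime(start, "%H:%M")
        # Keep only the most recent start time per step.
        if step_id not in latest or moment > latest[step_id]:
            latest[step_id] = moment
    ordered = [latest[step] for step in sorted(latest)]
    # Gap between each step and the next one.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return None
    return sum(gaps, timedelta()) / len(gaps)


rows = [(1, "10:00"), (1, "10:10"), (2, "10:40"), (3, "10:50"), (4, "11:00")]
print(average_step_gap(rows))  # 0:16:40
```

That is (30 + 10 + 10) / 3 minutes, i.e. 16 minutes 40 seconds, which matches the formula in the question.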
I have to design an optimized MySQL query for generating a report based on 2 tables.
I have 2 tables, one for services and another for payments. I accept user-specified criteria for services, and based on those I have to report the services and their corresponding payments. The services should appear in order of service date, each followed by its payments in order of payment date. Along with that, I also have to report any advance payments paid on account (in database terms, payments not linked to any particular service).
Currently I run one query that selects services and unlinked payments using a UNION of the 2 tables based on the given criteria. Then I run a separate query for the payments of each service in a loop.
Is there any way I can get all these transactions with a single query, in the desired order?
Here are the relevant columns of 2 tables.
service table
id (PK)
account_no
date
service_amount
tran_type
payment table
id
account_no
date
pmt_amount
service_id (FK to service table nulls acceptable)
tran_type
Here are the queries I am trying
Query 1
select account_no, id, date, service_amount, tran_type
from service where <user specified criteria like date range>
UNION
select account_no, id, date, pmt_amount, tran_type
from payment where service_id is null and
<user specified criteria like date range>
order by date
Query2
This query is run for each individual service in the result of the above query (rows where tran_type is service).
select account_no, id, date, pmt_amount, tran_type
from payment where service_id= <specific id>
order by date
Service table Data
ID Item_Typ Date Amt Acct#
1 SVC 11/12/2015 10 1
2 SVC 11/20/2015 20 1
3 SVC 12/13/2015 40 1
4 SVC 4/1/2016 30 1
Payment table Data
ID Svc_ID Item_Typ Date Amt Acct#
1 1 PMT 11/15/2015 5 1
2 1 PMT 11/15/2015 5 1
3 2 PMT 11/25/2015 40 1
4 3 PMT 12/28/2015 35 1
5 2 PMT 12/30/2015 -15 1
7 NULL PMT 1/1/2016 12 2
8 NULL PMT 3/1/2016 35 3
Query 1 Result
ID Item_Typ Date Amt Acct#
1 SVC 11/12/2015 10 1
2 SVC 11/20/2015 20 1
3 SVC 12/13/2015 40 1
4 SVC 4/1/2016 30 1
7 PMT 1/1/2016 12 2
8 PMT 3/1/2016 35 3
Final result after fetching payments for all query result related services
tranTyp Date Amt Acct#
SVC 11/12/2015 10 1
PMT 11/15/2015 5 1
PMT 11/15/2015 5 1
SVC 11/20/2015 20 1
PMT 11/25/2015 40 1
PMT 12/30/2015 -15 1
SVC 12/13/2015 40 1
PMT 12/28/2015 35 1
drop table if exists service;
create table service (ID int, Item_Typ varchar(3), `Date` date, Amt int, Acct int);
insert into service values
(1, 'SVC', '2015-11-12', 10 , 1),
(2, 'SVC', '2015-11-20', 20 , 1),
(3, 'SVC', '2015-12-13', 40 , 1),
(4, 'SVC', '2016-01-04', 30 , 1),
(5, 'SVC', '2015-10-04', 50 , 1);
drop table if exists payment;
create table payment(ID INT, Svc_ID INT, Item_Typ VARCHAR(3), `Date` DATE, Amt INT, Acct INT);
INSERT INTO payment values
(1, 1 , 'PMT', '2015-11-15', 5 , 1),
(2, 1 , 'PMT', '2015-11-15', 5 , 1),
(3, 2 , 'PMT', '2015-11-25', 40 , 1),
(4, 3 , 'PMT', '2015-12-28', 35 , 1),
(5, 2 , 'PMT', '2015-12-30', -15 ,1),
(7, NULL , 'PMT', '2016-01-01', 12 , 2),
(8, NULL , 'PMT', '2016-03-01', 35 , 3);
MariaDB [sandbox]> select * from
-> (
-> select 1 as typ,id,Item_typ,`date`, `date` as svc_date,amt,acct from service
-> union all
-> select 2,p.svc_id,p.Item_typ,p.`date`,
-> case when s.id is null then now()
-> else s.`date`
-> end as svc_date,
-> p.amt, p.acct from payment p
-> left join service s on p.svc_id = s.id
-> ) s
->
-> order by s.svc_date,s.acct,s.typ,s.id
-> ;
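The ordering idea is: give every row the date of the service it belongs to as its first sort key, put the service itself before its payments, and push unlinked payments to the end. Here is a rough Python sketch of that idea using the sample data from the question. The function name ordered_report is made up, and unlike the query above it also sorts each service's payments by payment date.

```python
from datetime import date

services = [  # (id, tran_type, date, amount, account)
    (1, "SVC", date(2015, 11, 12), 10, 1),
    (2, "SVC", date(2015, 11, 20), 20, 1),
    (3, "SVC", date(2015, 12, 13), 40, 1),
    (4, "SVC", date(2016, 4, 1), 30, 1),
]
payments = [  # (id, service_id, tran_type, date, amount, account)
    (1, 1, "PMT", date(2015, 11, 15), 5, 1),
    (2, 1, "PMT", date(2015, 11, 15), 5, 1),
    (3, 2, "PMT", date(2015, 11, 25), 40, 1),
    (4, 3, "PMT", date(2015, 12, 28), 35, 1),
    (5, 2, "PMT", date(2015, 12, 30), -15, 1),
    (7, None, "PMT", date(2016, 1, 1), 12, 2),
    (8, None, "PMT", date(2016, 3, 1), 35, 3),
]


def ordered_report(services, payments):
    service_date = {sid: day for sid, _, day, _, _ in services}
    keyed = []
    for sid, typ, day, amt, acct in services:
        # Key: service date, service id, 0 = service row first, own date.
        keyed.append(((day, sid, 0, day), (typ, day, amt, acct)))
    for _, sid, typ, day, amt, acct in payments:
        if sid is None:
            # Unlinked (advance) payments sort after everything else.
            key = (date.max, 0, 1, day)
        else:
            key = (service_date[sid], sid, 1, day)
        keyed.append((key, (typ, day, amt, acct)))
    return [row for _, row in sorted(keyed, key=lambda pair: pair[0])]


for row in ordered_report(services, payments):
    print(row)
```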
I have two different tables: Tank and FillingStations.
One tank can be attached to many filling stations.
Suppose:
SrNo TankID TankName TANK_Balance FillingStation_ID FS_NAme BALANCE
1 1 Tank1 5000 A11 FSA11 1545
2 1 Tank1 5000 A12 FSA12 1000
3 1 Tank1 5000 A13 FSA13 800
And I want to get a report like this:
SrNo TankID TankName TANK_Balance A11 BAL1 A12 BAL2 A13 BAL3 TOTAL
1 1 Tank1 5000 FSA11 1545 FSA12 1000 FSA13 800 3345
Try it like this:
DECLARE @Table TABLE (
SrNo INT
,TankID INT
,TankName VARCHAR(10)
,TANK_Balance INT
,FillingStation_ID VARCHAR(10)
,FS_NAme VARCHAR(10)
,BALANCE INT
)
insert into @Table values
(1,1,'Tank1',5000,'A11','FSA11',1545)
,(2,1,'Tank1',5000,'A12','FSA12',1000)
,(3,1,'Tank1',5000,'A13','FSA13',800)
SELECT min(SrNo) as SrNo, TankID
,TankName
,TANK_Balance
,max([A11]) AS [A11]
,max([BAL1]) AS [BAL1]
,max([A12]) AS [A12]
,max([BAL2]) AS [BAL2]
,max([A13]) AS [A13]
,max([BAL3]) AS [BAL3]
,max([BAL1])+max([BAL2])+max([BAL3]) as Total
FROM (
SELECT SrNo,TankID
,TankName
,TANK_Balance
,FillingStation_ID
,replace(FillingStation_ID, 'A1', 'BAL') AS bFillingStation_ID
,FS_NAme
,BALANCE
FROM @Table
) s
pivot(Max(FS_NAme) FOR FillingStation_Id IN (
[A11]
,[A12]
,[A13]
)) t
pivot(Max(BALANCE) FOR bFillingStation_Id IN (
[BAL1]
,[BAL2]
,[BAL3]
)) t1
GROUP BY TankID
,TankName
,TANK_Balance
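If the pivot is easier to do in application code, the reshaping is just one pass over the rows of a tank. A minimal Python sketch follows. The function name pivot_tank is made up, and it assumes all rows passed in belong to the same tank and that the BAL columns are numbered in row order.

```python
def pivot_tank(rows):
    # rows: list of dicts for a single tank, in display order.
    first = rows[0]
    report = {
        "SrNo": min(r["SrNo"] for r in rows),
        "TankID": first["TankID"],
        "TankName": first["TankName"],
        "TANK_Balance": first["TANK_Balance"],
    }
    for position, row in enumerate(rows, start=1):
        # One column named after the station, one BALn column next to it.
        report[row["FillingStation_ID"]] = row["FS_NAme"]
        report[f"BAL{position}"] = row["BALANCE"]
    report["TOTAL"] = sum(r["BALANCE"] for r in rows)
    return report


rows = [
    {"SrNo": 1, "TankID": 1, "TankName": "Tank1", "TANK_Balance": 5000,
     "FillingStation_ID": "A11", "FS_NAme": "FSA11", "BALANCE": 1545},
    {"SrNo": 2, "TankID": 1, "TankName": "Tank1", "TANK_Balance": 5000,
     "FillingStation_ID": "A12", "FS_NAme": "FSA12", "BALANCE": 1000},
    {"SrNo": 3, "TankID": 1, "TankName": "Tank1", "TANK_Balance": 5000,
     "FillingStation_ID": "A13", "FS_NAme": "FSA13", "BALANCE": 800},
]
print(pivot_tank(rows))
```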
I have been stumped on this for quite a while. Request#, SlotId, Segment, and Version together make up the primary key. What I want from my stored proc is to retrieve all rows for a given Request # and Segment, but for each slot I want only the most recent effective date on or before today's date, and from that the highest version #. I appreciate your time.
Values in database
Request# SlotId Segment Version Effective Date ContentId
A123 1 A 1 2012-01-01 1
A123 2 A 1 2012-01-01 2
A123 2 A 2 2012-02-01 34
A123 2 A 3 2012-02-01 24
A123 2 A 4 2015-01-01 6 //beyond todays date. dont want
Values I want to return from my stored proc when i pass in A123 for Request # and A for Segment.
A123 1 A 1 2012-01-01 1
A123 2 A 3 2012-02-01 24
The query could be written like this:
; WITH cte AS
( SELECT Request, SlotId, Segment, Version, [Effective Date], ContentId,
ROW_NUMBER() OVER ( PARTITION BY Request, Segment, SlotId
ORDER BY [Effective Date] DESC, Version DESC ) AS Rn
FROM
tableX
WHERE
Request = @Req AND Segment = @Seg --- the 2 parameters
AND [Effective Date] < DATEADD(day, 1, CAST(GETDATE() AS date)) --- on or before today
)
SELECT Request, SlotId, Segment, Version, [Effective Date], ContentId
FROM cte
WHERE Rn = 1 ;
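The selection rule (latest effective date on or before today, then highest version, per slot) can be checked against the sample data with a few lines of Python. This is only an illustration of the rule; the function name current_rows is made up and today is passed in explicitly so the result is repeatable.

```python
from datetime import date


def current_rows(rows, request, segment, today):
    # rows: (request, slot_id, segment, version, effective_date, content_id)
    best = {}
    for row in rows:
        req, slot, seg, version, effective, _content = row
        if req != request or seg != segment or effective > today:
            continue  # wrong request/segment, or not yet effective
        # Latest effective date wins; version breaks ties.
        rank = (effective, version)
        if slot not in best or rank > best[slot][0]:
            best[slot] = (rank, row)
    return [best[slot][1] for slot in sorted(best)]


rows = [
    ("A123", 1, "A", 1, date(2012, 1, 1), 1),
    ("A123", 2, "A", 1, date(2012, 1, 1), 2),
    ("A123", 2, "A", 2, date(2012, 2, 1), 34),
    ("A123", 2, "A", 3, date(2012, 2, 1), 24),
    ("A123", 2, "A", 4, date(2015, 1, 1), 6),
]
for row in current_rows(rows, "A123", "A", today=date(2014, 6, 1)):
    print(row)
```

With a "today" of 2014-06-01 this keeps version 1 for slot 1 and version 3 for slot 2, the two rows the question asks for.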
Consider this:
;
WITH A as
(
SELECT DISTINCT
Request
, Segment
, SlotId
FROM Table1
)
SELECT A.Request
, A.SlotId
, A.Segment
, B.EffectiveDate
, B.Version
, B.ContentID
FROM A
CROSS APPLY (
SELECT Top 1
Request
, SlotId
, Segment
, EffectiveDate
, Version
, ContentId
FROM Table1 t1
WHERE t1.Request = A.Request
AND t1.SlotId = A.SlotId
AND T1.Segment = A.Segment
AND T1.EffectiveDate <= GetDate()
ORDER BY
T1.EffectiveDate DESC
, T1.Version DESC
) as B
I have a table of available date blocks (7 days in my case) which may or may not be consecutive:
start_date end_date booked id room_id
2012-07-14 2012-07-21 0 1 6
2012-07-21 2012-07-28 0 2 6
2012-07-28 2012-08-04 1 3 6
2012-08-04 2012-08-11 0 4 6
What I'd like to do is be able to get a result set that gives me one row per X weeks of consecutive unbooked dates, within a date range.
So, for 2 week blocks starting on the 14th of July and using the above table data, I would expect the following:
start_date end_date booked
2012-07-14 2012-07-28 0
The second block of 2 weeks would not be returned as one of the component weeks is booked.
Here are a few ideas I've tried already:
SELECT
MIN(start_date) AS start_date_min,
MAX(end_date) AS end_date_max,
CAST(GROUP_CONCAT(id) AS CHAR) AS ids,
SUM(booked) AS booked
FROM
available_dates
WHERE
(start_date>=20120714 AND end_date<=DATE_ADD(20120714, INTERVAL 14 DAY))
GROUP BY
room_id
HAVING
end_date_max=DATE_ADD(20120714, INTERVAL 14 DAY)
This gets me part of the way, but it doesn't give me the consecutive results, which is the important part. It also only returns a single result (probably because of the HAVING clause) when I widen the test data.
Can anyone point me in the right direction?
If you have a calendar or a numbers table:
CREATE TABLE num
( i INT NOT NULL
, PRIMARY KEY (i)
) ;
INSERT INTO num
(i)
VALUES
(0), (1), (2), ..., (1000) ;
You could use something like this:
SELECT
avail.room_id,
MIN(avail.start_date) AS start_date_min,
MAX(avail.end_date) AS end_date_max,
CAST(GROUP_CONCAT(avail.id) AS CHAR) AS ids,
SUM(avail.booked) AS booked
FROM
available_dates AS avail
CROSS JOIN
( SELECT DATE('2012-07-14') AS start_date_check
, 52 AS max_week_check
) AS param
JOIN
num
ON avail.start_date = param.start_date_check + INTERVAL num.i WEEK
AND num.i < param.max_week_check
WHERE
avail.booked = 0
GROUP BY
avail.room_id,
( num.i DIV 2 )
HAVING
COUNT(*) = 2
You could also have this:
WHERE
1 = 1 -- no WHERE condition
GROUP BY
avail.room_id,
( num.i DIV 2 )
HAVING
COUNT(*) = 2
AND SUM(avail.booked) = 0 -- optionally this, instead of filtering on booked in WHERE
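The grouping trick is that week number i (counted from the chosen start date) belongs to block i DIV 2, and a block only counts when it has both of its weeks and none of them is booked. A small Python sketch of the same logic, run against the sample rows from the question, follows. The function name free_blocks and its parameters are made up for illustration.

```python
from datetime import date


def free_blocks(weeks, first_start, weeks_per_block=2):
    # weeks: list of (start_date, end_date, booked, id, room_id).
    groups = {}
    for start, end, booked, week_id, room_id in weeks:
        offset = (start - first_start).days
        if offset < 0 or offset % 7 != 0:
            continue  # not on a week boundary counted from first_start
        block_no = (offset // 7) // weeks_per_block  # same as num.i DIV 2
        groups.setdefault((room_id, block_no), []).append(
            (start, end, booked, week_id)
        )
    result = []
    for (room_id, _block_no), members in sorted(groups.items()):
        complete = len(members) == weeks_per_block
        if complete and all(m[2] == 0 for m in members):
            result.append((
                room_id,
                min(m[0] for m in members),
                max(m[1] for m in members),
                [m[3] for m in sorted(members)],
            ))
    return result


weeks = [
    (date(2012, 7, 14), date(2012, 7, 21), 0, 1, 6),
    (date(2012, 7, 21), date(2012, 7, 28), 0, 2, 6),
    (date(2012, 7, 28), date(2012, 8, 4), 1, 3, 6),
    (date(2012, 8, 4), date(2012, 8, 11), 0, 4, 6),
]
print(free_blocks(weeks, date(2012, 7, 14)))
```

Only the block from 2012-07-14 to 2012-07-28 comes back, because the second block contains a booked week.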