MYSQL: Exclude duplicate scan logs within same day

I'm trying to select rows excluding duplicates in a day.
Criteria for duplicate is: SAME USER AND SAME PRODUCT_UPC AND SAME DATE(SCANNED_ON)
So, from the below table, if SCAN_ID = 100 is selected, exclude SCAN_ID = 101 since they belong to same user_id AND same product_upc AND have same DATE(scanned_on).
Here's the table structure:
SCAN_ID USER_ID PRODUCT_UPC SCANNED_ON
100 1 0767914767 2020-08-01 03:49:11
101 1 0767914767 2020-08-01 03:58:28
102 2 0064432050 2020-08-02 04:01:31
103 3 0804169977 2020-08-10 04:08:48
104 4 0875523846 2020-08-10 05:21:32
105 4 0007850492 2020-08-12 07:10:05
The query I've come up with so far is:
SET @last_user='', @last_upc='', @last_date='';
SELECT *,
@last_user as last_user , @last_user:=user_id as this_user,
@last_upc as last_upc , @last_upc:=product_upc as this_upc,
@last_date as last_date , @last_date:=DATE(scanned_on) as this_date
FROM scansv2
HAVING this_user != last_user OR this_upc != last_upc OR this_date != last_date

In MySQL 8 you can use ROW_NUMBER() for this:
CREATE TABLE scansv2 (
`SCAN_ID` INTEGER,
`USER_ID` INTEGER,
`PRODUCT_UPC` INTEGER,
`SCANNED_ON` DATETIME
);
INSERT INTO scansv2
(`SCAN_ID`, `USER_ID`, `PRODUCT_UPC`, `SCANNED_ON`)
VALUES
('100', '1', '0767914767', '2020-08-01 03:49:11'),
('101', '1', '0767914767', '2020-08-01 03:58:28'),
('102', '2', '0064432050', '2020-08-02 04:01:31'),
('103', '3', '0804169977', '2020-08-10 04:08:48'),
('104', '4', '0875523846', '2020-08-10 05:21:32'),
('105', '4', '0007850492', '2020-08-12 07:10:05');
WITH rownum AS (SELECT `SCAN_ID`, `USER_ID`, `PRODUCT_UPC`, `SCANNED_ON`,ROW_NUMBER() OVER (
PARTITION BY `USER_ID`, `PRODUCT_UPC`, DATE(`SCANNED_ON`)
ORDER BY `SCANNED_ON` DESC) row_num FROM scansv2)
SELECT `SCAN_ID`, `USER_ID`, `PRODUCT_UPC`, `SCANNED_ON` FROM rownum WHERE row_num = 1 ORDER BY `SCAN_ID`;
SCAN_ID | USER_ID | PRODUCT_UPC | SCANNED_ON
------: | ------: | ----------: | :------------------
101 | 1 | 767914767 | 2020-08-01 03:58:28
102 | 2 | 64432050 | 2020-08-02 04:01:31
103 | 3 | 804169977 | 2020-08-10 04:08:48
104 | 4 | 875523846 | 2020-08-10 05:21:32
105 | 4 | 7850492 | 2020-08-12 07:10:05
In MySQL 5.x you need user-defined variables for the same purpose:
SELECT `SCAN_ID`, `USER_ID`, `PRODUCT_UPC`, `SCANNED_ON`
FROM
(SELECT `SCAN_ID`, `USER_ID`, `SCANNED_ON`,
IF (@product = `PRODUCT_UPC`,@row_num := @row_num + 1,@row_num := 1) row_num
, @product := `PRODUCT_UPC` PRODUCT_UPC
FROM (SELECT * FROM scansv2 ORDER BY `PRODUCT_UPC`, `SCANNED_ON`) c,(SELECT @row_num := 0,@product := 0) a ) b
WHERE row_num = 1 ORDER BY `SCAN_ID`
SCAN_ID | USER_ID | PRODUCT_UPC | SCANNED_ON
------: | ------: | ----------: | :------------------
100 | 1 | 767914767 | 2020-08-01 03:49:11
102 | 2 | 64432050 | 2020-08-02 04:01:31
103 | 3 | 804169977 | 2020-08-10 04:08:48
104 | 4 | 875523846 | 2020-08-10 05:21:32
105 | 4 | 7850492 | 2020-08-12 07:10:05

In most databases (including MySQL pre-8.0), filtering with a correlated subquery is a supported and efficient option:
select s.*
from scansv2 s
where s.scanned_on = (
select min(s1.scanned_on)
from scansv2 s1
where
s1.user_id = s.user_id
and s1.product_upc = s.product_upc
and s1.scanned_on >= date(s.scanned_on)
and s1.scanned_on < date(s.scanned_on) + interval 1 day
)
This gives you the first row per user_id, product_upc and day, and filters out the other ones, if any.
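To sanity-check what "first row per user_id, product_upc and day" means on the sample data, here is a minimal Python sketch of the same rule. It is not the SQL itself; the function name `first_scan_per_day` and the tuple layout are assumptions made for illustration.

```python
from datetime import datetime

# Sample rows from the question: (scan_id, user_id, product_upc, scanned_on)
scans = [
    (100, 1, "0767914767", "2020-08-01 03:49:11"),
    (101, 1, "0767914767", "2020-08-01 03:58:28"),
    (102, 2, "0064432050", "2020-08-02 04:01:31"),
    (103, 3, "0804169977", "2020-08-10 04:08:48"),
    (104, 4, "0875523846", "2020-08-10 05:21:32"),
    (105, 4, "0007850492", "2020-08-12 07:10:05"),
]


def first_scan_per_day(rows):
    """Keep the earliest scan for each (user_id, product_upc, calendar day)."""
    earliest = {}  # key -> (timestamp, scan_id) of the earliest scan seen so far
    for scan_id, user_id, upc, scanned_on in rows:
        ts = datetime.strptime(scanned_on, "%Y-%m-%d %H:%M:%S")
        key = (user_id, upc, ts.date())
        if key not in earliest or ts < earliest[key][0]:
            earliest[key] = (ts, scan_id)
    return sorted(scan_id for _, scan_id in earliest.values())


kept = first_scan_per_day(scans)
print(kept)
```

On the sample data only SCAN_ID 101 is dropped, because it shares user, UPC and date with SCAN_ID 100 and was scanned later.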

Related

MySQL select all rows from last N groups

I have a dataset like this, where there can be multiple transactions per trade
| tx_id | trade_id |
--------------------
| 100 | 11 |
| 99 | 11 |
| 98 | 11 |
| 97 | 10 |
| 96 | 10 |
| 95 | 9 |
| 94 | 9 |
| 93 | 8 |
...
I want to select all of the transactions from the last N trades. For instance if I wanted to select all rows from the last 2 trades, I would get:
| tx_id | trade_id |
--------------------
| 100 | 11 |
| 99 | 11 |
| 98 | 11 |
| 97 | 10 |
| 96 | 10 |
I cannot guarantee that the trade_id will always have an interval of 1.
How can I accomplish this in mysql?
This will also work with MySQL 5.
By changing the LIMIT, you can choose how many trades you want to receive:
CREATE TABLE tab1 (
`tx_id` INTEGER,
`trade_id` INTEGER
);
INSERT INTO tab1
(`tx_id`, `trade_id`)
VALUES
('100', '11'),
('99', '11'),
('98', '11'),
('97', '10'),
('96', '10'),
('95', '9'),
('94', '9'),
('93', '8');
SELECT t1.* FROM tab1 t1 JOIN (SELECT DISTINCT `trade_id` FROM tab1 ORDER BY `trade_id` DESC LIMIT 2) t2
ON t1.`trade_id` = t2.`trade_id`
tx_id | trade_id
----: | -------:
100 | 11
99 | 11
98 | 11
97 | 10
96 | 10
You use DENSE_RANK on trade_id descending, then filter on your required X for "last X":
CREATE TABLE t (tx_id int, trade_id int);
INSERT INTO t (tx_id, trade_id) VALUES
(100,11),
(99,11),
(98,11),
(97,10),
(96,10),
(95,9),
(94,9),
(93,8);
SET @ngroups=2;
WITH dat
AS
(
SELECT tx_id, trade_id, DENSE_RANK() OVER (ORDER BY trade_id DESC) AS trade_id_rank
FROM t
)
SELECT tx_id, trade_id
FROM dat
WHERE trade_id_rank <= @ngroups;
If we assume the "last trades" are the ones with the highest trade_id numbers, then you can use DENSE_RANK().
For example:
select *
from (
select *,
dense_rank() over(order by trade_id desc) as dr
from t
) x
where dr <= 2
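To see why a dense rank copes with gaps in trade_id, here is a small Python sketch of the same idea. The helper name `last_n_trades` is hypothetical: it ranks the distinct trade_id values in descending order without gaps and keeps rows whose rank is at most N.

```python
# Sample rows from the question: (tx_id, trade_id)
rows = [
    (100, 11), (99, 11), (98, 11),
    (97, 10), (96, 10),
    (95, 9), (94, 9),
    (93, 8),
]


def last_n_trades(rows, n):
    """Return all rows that belong to the n highest distinct trade_id values."""
    distinct_ids = sorted({trade_id for _, trade_id in rows}, reverse=True)
    # Dense rank: 1 for the highest trade_id, 2 for the next distinct one, ...
    rank = {trade_id: i + 1 for i, trade_id in enumerate(distinct_ids)}
    return [(tx_id, trade_id) for tx_id, trade_id in rows if rank[trade_id] <= n]


print(last_n_trades(rows, 2))
```

Because the rank is built from distinct values, it does not matter whether consecutive trade_id values differ by 1 or by more.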
This can be done with a CTE (MySQL 8+):
WITH trades AS (
SELECT trade_id tid
FROM myTable
GROUP BY trade_id
ORDER BY trade_id DESC
LIMIT 2
)
SELECT * FROM
trades
JOIN myTable ON trade_id = tid
ORDER BY tx_id;

I would like to update a field in my table in a MySQL db - the problem statement is in the body

The following data shows both my problem and the desired output. Please look below for more details.
+------------+-----+--------------+-------------------+
| Date | ID | Payment Done | The Problem Field |
+------------+-----+--------------+-------------------+
| 2020-02-15 | 111 | 1 | 0 |
| 2020-03-15 | 111 | 0 | 0 |
| 2020-04-15 | 111 | 0 | -1 |
| 2020-03-15 | 222 | 0 | 0 |
| 2020-03-31 | 222 | 0 | -1 |
| 2020-04-14 | 222 | 1 | 0 |
| 2020-02-29 | 333 | 0 | 0 |
| 2020-03-15 | 333 | 0 | -1 |
| 2020-03-31 | 333 | 1 | 0 |
| 2020-04-14 | 333 | 0 | 0 |
+------------+-----+--------------+-------------------+
The picture contains data to explain the problem I'm stuck with. For a given ID, if the payment is not done in a date D1 and payment is also not done in the previous date D2 (D2
So, wherever the problem field is -1, it means that during that date and also during the previous date(the difference between dates is not same always), the payment was not done by that specific ID.
I tried to do this using a query and spent more than half a day on it. Then I used a Python script to do this - but my ego is not satisfied yet. I would like to do this using a query - if at all possible.
MY IDEA:
My approach to solving was to write a query that says the following:
Take a date D1 for an ID.
Get the Max(date) for the same ID when the date is less than D1 - meaning I'll get the immediate lesser record's date.
Then check if "payment done" is 0 in both the rows and if yes, then update the D1 row's problem field to -1.
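The three steps above can be sketched outside SQL as follows. This is only an illustration of the rule, assuming the sample data from the table (with 2020-03-31 as the third date for ID 333); the function name `flag_missed_payments` is made up for this sketch.

```python
from itertools import groupby

# Sample rows: (date, id, payment_done). ISO date strings sort chronologically.
payments = [
    ("2020-02-15", 111, 1), ("2020-03-15", 111, 0), ("2020-04-15", 111, 0),
    ("2020-03-15", 222, 0), ("2020-03-31", 222, 0), ("2020-04-14", 222, 1),
    ("2020-02-29", 333, 0), ("2020-03-15", 333, 0), ("2020-03-31", 333, 1),
    ("2020-04-14", 333, 0),
]


def flag_missed_payments(rows):
    """Map (date, id) to -1 when that row and the previous row for the same id are both unpaid."""
    flags = {}
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))  # by id, then date
    for _, group in groupby(ordered, key=lambda r: r[1]):
        prev_paid = None  # no previous row yet for this id
        for date, id_, paid in group:
            flags[(date, id_)] = -1 if (paid == 0 and prev_paid == 0) else 0
            prev_paid = paid
    return flags


flags = flag_missed_payments(payments)
print(sorted(key for key, flag in flags.items() if flag == -1))
```

The first row of each ID never gets -1, since there is no earlier date to compare against.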
By the way, I'm not much experienced with MySQL and please forgive me for being a naive learner. The MySQL Server version I'm using is 5.6.41
Here is the query I tried to write to express the idea above. It has an error somewhere: the innermost query is unable to access the outer table's alias.
update my_table a set the_problem_field = -1 where 0 = (select payment_done from (select payment_done from my_table where id = a.id and date =(select max(date) from my_table where id = a.id and date<a.date))T) and a.payment_done = 0;
The same Neatly Formatted:
UPDATE my_table a
SET the_problem_field = -1
WHERE 0 = (SELECT payment_done
FROM (SELECT payment_done
FROM my_table
WHERE id = a.id
AND date = (SELECT Max(date)
FROM my_table
WHERE id = a.id
AND date < a.date))T)
AND a.payment_done = 0;
I would like someone to help me with this....please! Any help is very much appreciated. Thanks for your time.
There is no need for a stored procedure.
The crucial part is the inner join, which produces the wanted result.
You can achieve that by "remembering" the last ID and Payment Done in user-defined variables.
Column names should not have spaces in them, or else you always have to remember to enclose them in backticks.
UPDATE Paymnettable pt
INNER JOIN
(SELECT
`Date`,
IF(@id = ID, IF(`Payment Done` = 1, 0, IF(@Payment_Done = 0, - 1, 0)), 0) `The Problem Field`,
@id:=ID ID,
@Payment_Done:=`Payment Done` 'Payment Done'
FROM
(SELECT
*
FROM
Paymnettable
ORDER BY `ID` , `Date`) t1
, (SELECT @Payment_Done:=0) t2
, (SELECT @ID:=0) t3) pt1
ON pt.`Date` = pt1.`Date`
AND pt.ID = pt1.ID
SET
pt.`The Problem Field` = pt1.`The Problem Field`
Schema (MySQL v5.6)
CREATE TABLE Paymnettable (
`Date` DATE,
`ID` INTEGER,
`Payment Done` INTEGER,
`The Problem Field` INTEGER
);
INSERT INTO Paymnettable
(`Date`, `ID`, `Payment Done`, `The Problem Field`)
VALUES
('2020-02-15', '111', '1', '0'),
('2020-03-15', '111', '0', '0'),
('2020-04-15', '111', '0', '0'),
('2020-03-15', '222', '0', '0'),
('2020-03-31', '222', '0', '0'),
('2020-04-14', '222', '1', '0'),
('2020-02-29', '333', '0', '0'),
('2020-03-15', '333', '0', '0'),
('2020-03-31', '333', '1', '0'),
('2020-04-14', '333', '0', '0');
UPDATE Paymnettable pt
INNER JOIN
(SELECT
`Date`,
IF(@id = ID, IF(`Payment Done` = 1, 0, IF(@Payment_Done = 0, - 1, 0)), 0) `The Problem Field`,
@id:=ID ID,
@Payment_Done:=`Payment Done` 'Payment Done'
FROM
(SELECT
*
FROM
Paymnettable
ORDER BY `ID` , `Date`) t1, (SELECT @Payment_Done:=0) t2, (SELECT @ID:=0) t3) pt1 ON pt.`Date` = pt1.`Date`
AND pt.ID = pt1.ID
SET
pt.`The Problem Field` = pt1.`The Problem Field`
Query #1
SELECT
*
FROM
Paymnettable;
| Date | ID | Payment Done | The Problem Field |
| ---------- | --- | ------------ | ----------------- |
| 2020-02-15 | 111 | 1 | 0 |
| 2020-03-15 | 111 | 0 | 0 |
| 2020-04-15 | 111 | 0 | -1 |
| 2020-03-15 | 222 | 0 | 0 |
| 2020-03-31 | 222 | 0 | -1 |
| 2020-04-14 | 222 | 1 | 0 |
| 2020-02-29 | 333 | 0 | 0 |
| 2020-03-15 | 333 | 0 | -1 |
| 2020-03-31 | 333 | 1 | 0 |
| 2020-04-14 | 333 | 0 | 0 |

SQL get range of dates in between given dates grouped by another column in a table

In this table -
----------------------------------------------
ID | user | type | timestamp
----------------------------------------------
1 | 1 | 1 | 2019-02-08 15:00:00
2 | 1 | 3 | 2019-02-15 15:00:00
3 | 1 | 2 | 2019-03-06 15:00:00
4 | 2 | 3 | 2019-02-01 15:00:00
5 | 2 | 1 | 2019-02-06 15:00:00
6 | 3 | 1 | 2019-01-10 15:00:00
7 | 3 | 4 | 2019-02-08 15:00:00
8 | 3 | 3 | 2019-02-24 15:00:00
9 | 3 | 2 | 2019-03-04 15:00:00
10 | 3 | 3 | 2019-03-05 15:00:00
I need to find the number of days every user has been in a particular type in the given range of days.
Eg: For the given range 2019-02-01 to 2019-03-04, the output should be
--------------------------------
user | type | No. of days
--------------------------------
1 | 1 | 7
1 | 3 | 17
2 | 3 | 6
3 | 1 | 29
2 | 4 | 16
2 | 3 | 8
The user can switch between types on any day, and I need to capture all those switches and the number of days the user has been in each type. I currently solve this by fetching all the values and filtering manually in JS. Is there any way to do this with a SQL query? I use MySQL 5.7.23.
EDIT:
The above output is incorrect but really appreciate everyone overlooking that and helping me with the right query. Here is the correct output for this question -
--------------------------------
user | type | No. of days
--------------------------------
1 | 1 | 7
1 | 3 | 19
2 | 3 | 5
3 | 1 | 29
3 | 2 | 1
3 | 3 | 8
3 | 4 | 16
Use lead() and then datediff() and sum() and a lot of date comparisons:
select user, type,
sum(datediff( least(next_ts, '2019-03-04'), greatest(timestamp, '2019-02-01')))
from (select t.*,
lead(timestamp, 1, '2019-03-04') over (partition by user order by timestamp) as next_ts
from t
) t
where next_ts >= '2019-02-01' and
timestamp <= '2019-03-04'
group by user, type;
EDIT:
In older versions, you can use:
select user, type,
sum(datediff( least(next_ts, '2019-03-04'), greatest(timestamp, '2019-02-01')))
from (select t.*,
(select coalesce(min(timestamp), '2019-03-04')
from t t2
where t2.user = t.user and t2.timestamp > t.timestamp
) as next_ts
from t
) t
where next_ts >= '2019-02-01' and
timestamp <= '2019-03-04'
group by user, type;
Here is one way to do it in MySQL 5.7 without user variables:
select
t.user,
t.type,
sum(datediff(
greatest(tlead.timestamp, '2019-02-01'),
least(t.timestamp, '2019-03-04'))
) no_of_days
from mytable t
inner join mytable tlead
on tlead.user = t.user
and tlead.timestamp > t.timestamp
and not exists (
select 1
from mytable t1
where
t1.user = t.user
and t1.timestamp > t.timestamp
and t1.timestamp < tlead.timestamp
)
where tlead.timestamp >= '2019-02-01' and t.timestamp <= '2019-03-04'
group by t.user, t.type
order by t.user, t.type
This basically emulates lead() with a self-join and a not exists condition: table alias tlead is the next record for the same user. The rest is filtering, aggregating, and computing date differences within the target date range.
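As a rough illustration of the "next record for the same user" pairing, here is a Python sketch on the question's sample data. It deliberately leaves out the date-range filtering and only shows the pairing and the per-type summation; the function name `days_per_type` is an assumption for this example.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (user, type, timestamp)
events = [
    (1, 1, "2019-02-08 15:00:00"), (1, 3, "2019-02-15 15:00:00"),
    (1, 2, "2019-03-06 15:00:00"), (2, 3, "2019-02-01 15:00:00"),
    (2, 1, "2019-02-06 15:00:00"), (3, 1, "2019-01-10 15:00:00"),
    (3, 4, "2019-02-08 15:00:00"), (3, 3, "2019-02-24 15:00:00"),
    (3, 2, "2019-03-04 15:00:00"), (3, 3, "2019-03-05 15:00:00"),
]


def days_per_type(rows):
    """Sum, per (user, type), the days between each row and the user's next row."""
    by_user = defaultdict(list)
    for user, type_, ts in rows:
        by_user[user].append((datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), type_))
    totals = defaultdict(int)
    for user, items in by_user.items():
        items.sort()  # chronological order within the user
        # Pair each row with its successor; the last row has no successor and is skipped.
        for (ts, type_), (next_ts, _) in zip(items, items[1:]):
            totals[(user, type_)] += (next_ts.date() - ts.date()).days
    return dict(totals)


for key, days in sorted(days_per_type(events).items()):
    print(key, days)
```

Like the inner join in the query, the pairing drops each user's final row, because there is no later record to measure against.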
Demo on DB Fiddle - results are not exactly the same as yours, but I suspect they are actually correct:
user | type | no_of_days
---: | ---: | ---------:
1 | 1 | 7
1 | 3 | 19
2 | 3 | 5
3 | 1 | 29
3 | 2 | 1
3 | 3 | 8
3 | 4 | 16
You don't get exactly what you wanted, but it's accurate:
SELECT
`user`
,`type`
,dategone `No. of days`
FROM
(SELECT
`type`,
IF(@id = `user`,DATEDIFF(`timestamp` , @days), -1) dategone
,@id := `user` `user`
,@days := `timestamp`
FROM
(SELECT
`D`, `user`, `type`, `timestamp`
From table1
ORDER BY `user` ASC, `timestamp` ASC) a
, (SELECT @days :=0) b, (SELECT @id :=0) c) d
WHERE dategone > -1;
CREATE TABLE table1 (
`D` INTEGER,
`user` INTEGER,
`type` INTEGER,
`timestamp` VARCHAR(19)
);
INSERT INTO table1
(`D`, `user`, `type`, `timestamp`)
VALUES
('1', '1', '1', '2019-02-08 15:00:00'),
('2', '1', '3', '2019-02-15 15:00:00'),
('3', '1', '2', '2019-03-06 15:00:00'),
('4', '2', '3', '2019-02-01 15:00:00'),
('5', '2', '1', '2019-02-06 15:00:00'),
('6', '3', '1', '2019-01-10 15:00:00'),
('7', '3', '4', '2019-02-08 15:00:00'),
('8', '3', '3', '2019-02-24 15:00:00'),
('9', '3', '2', '2019-03-04 15:00:00'),
('10', '3', '3', '2019-03-05 15:00:00');
SELECT
`user`
,`type`
,dategone `No. of days`
FROM
(SELECT
`type`,
IF(@id = `user`,DATEDIFF(`timestamp` , @days), -1) dategone
,@id := `user` `user`
,@days := `timestamp`
FROM
(SELECT
`D`, `user`, `type`, `timestamp`
From table1
ORDER BY `user` ASC, `timestamp` ASC) a, (SELECT @days :=0) b, (SELECT @id :=0) c) d
WHERE dategone > -1;
user | type | No. of days
---: | ---: | ----------:
1 | 3 | 7
1 | 2 | 19
2 | 1 | 5
3 | 4 | 29
3 | 3 | 16
3 | 2 | 8
3 | 3 | 1
This should give you what you want:
select id, user, type, time_stamp, (
select datediff(min(time_stamp), t1.time_stamp)
from table1 as t2
where t2.user = t1.user
and t2.time_stamp > t1.time_stamp
) as days
from table1 as t1
where 0 < (select count(*) from table1 as t3 where t3.user = t1.user
and t3.time_stamp > t1.time_stamp )
order by id;
Working in a fiddle here: http://sqlfiddle.com/#!9/347ab5/26
If you also want the "final" row for each user use this variation:
select id, user, type, time_stamp, (
select datediff(coalesce(min(time_stamp),current_timestamp()) , t1.time_stamp)
from table1 as t2
where t2.user = t1.user
and t2.time_stamp > t1.time_stamp
) as days
from table1 as t1
order by id;

MySQL count changes

I would like to count number of changes in column Value grouped by Id using MySQL.
Source Table:
create table sequence
(
`Id` int,
`Date` date,
`Value` int not null,
PRIMARY KEY (`Id`,`Date`)
);
insert into sequence
( `Id`,`Date`, `Value` )
values
(1, '2016-01-01' , 0 ),
(1, '2016-01-02' , 10 ),
(1, '2016-01-03' , 0 ),
(1, '2016-01-05' , 0 ),
(1, '2016-01-06' , 10 ),
(1, '2016-01-07' , 15 ),
(2, '2016-01-08' , 15 );
Visualization:
+------------+-------+-------+
| Date | ID | Value |
+------------+-------+-------+
| 2016-01-01 | 1 | 0 |
| 2016-01-02 | 1 | 10 | (change)
| 2016-01-03 | 1 | 0 | (change)
| 2016-01-05 | 1 | 0 |
| 2016-01-06 | 1 | 10 | (change)
| 2016-01-07 | 1 | 15 | (change)
| 2016-01-08 | 2 | 15 |
+------------+-------+-------+
Expected output:
+-------+-------+
| ID | Value |
+-------+-------+
| 1 | 4 |
| 2 | 0 |
+-------+-------+
I would like to ask if there is a way how to do this in SQL.
This is not a very efficient or elegant solution,
but it shows what you can achieve with user-defined variables in MySQL :-)
http://sqlfiddle.com/#!9/1db14/6
SELECT t1.id, MAX(t1.changes)
FROM (SELECT t.*,
IF (@i IS NULL,@i:=0,IF(@lastId <> id,@i:=0,IF (@lastV <> `value`, @i:=@i+1, @i:=@i))) as changes,
@lastV := `value`,
@lastId := `id`
FROM (
SELECT *
FROM sequence
ORDER BY id, date) t
) t1
GROUP BY t1.id
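The counting logic in the query above (reset the counter on a new Id, increment it whenever the value differs from the previous row) can be sketched in Python like this. It is only an illustration on the question's sample data; `count_changes` is a hypothetical helper name.

```python
# Sample rows from the question: (id, date, value)
sequence = [
    (1, "2016-01-01", 0),
    (1, "2016-01-02", 10),
    (1, "2016-01-03", 0),
    (1, "2016-01-05", 0),
    (1, "2016-01-06", 10),
    (1, "2016-01-07", 15),
    (2, "2016-01-08", 15),
]


def count_changes(rows):
    """Count, per id, how often the value differs from the previous row (by date)."""
    changes = {}
    last_value = {}
    for id_, _, value in sorted(rows, key=lambda r: (r[0], r[1])):
        if id_ not in changes:
            changes[id_] = 0  # first row of an id is never a change
        elif value != last_value[id_]:
            changes[id_] += 1
        last_value[id_] = value
    return changes


print(count_changes(sequence))
```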

MySQL: Join distinct rows of two tables in a certain order?

I have a list of inventory units and sale transactions that I want to, (1) join by unit SKU, and (2) associate ONE transaction to ONE inventory unit in first-in-first-out order by date. I'm having trouble with the second part.
The best I can come up with is:
SELECT `units`.`unit_date`, `units`.`unit_id`, `trans`.`tran_date`, `trans`.`tran_id`, `units`.`unit_sku` FROM `units`
INNER JOIN `trans`
ON `trans`.`unit_sku` = `units`.`unit_sku`
GROUP BY `trans`.`tran_id`, `trans`.`unit_sku`
ORDER BY `units`.`unit_date` asc, `trans`.`tran_date` asc
;
units table:
unit_date | unit_id | unit_sku
2015-06-01 | 1 | U1KLM
2015-06-02 | 2 | U1KLM
2015-06-03 | 3 | U2QRS
2015-06-04 | 4 | U2QRS
2015-06-05 | 5 | U1KLM
trans table:
tran_date | tran_id | unit_sku
2015-06-11 | A | U2QRS
2015-06-12 | B | U1KLM
2015-06-13 | C | U1KLM
2015-06-14 | D | U2QRS
2015-06-15 | E | U1KLM
The desired result is one tran_id to be joined to one unit_id of the unit_sku by earliest-to-latest order of unit_date:
unit_date | unit_id | tran_date | tran_id | unit_sku
2015-06-01 | 1 | 2015-06-12 | B | U1KLM
2015-06-02 | 2 | 2015-06-13 | C | U1KLM
2015-06-03 | 3 | 2015-06-11 | A | U2QRS
2015-06-04 | 4 | 2015-06-14 | D | U2QRS
2015-06-05 | 5 | 2015-06-15 | E | U1KLM
The query (undesired) result joins tran_id only to the unit_id of the earliest occurrence of unit_sku:
unit_date | unit_id | tran_date | tran_id | unit_sku
2015-06-01 | 1 | 2015-06-12 | B | U1KLM
2015-06-01 | 1 | 2015-06-13 | C | U1KLM
2015-06-01 | 1 | 2015-06-15 | E | U1KLM
2015-06-03 | 3 | 2015-06-11 | A | U2QRS
2015-06-03 | 3 | 2015-06-14 | D | U2QRS
Any ideas on how to get the desired result? In this setup, only unit_date and tran_date are sortable; the rest are randomly generated.
Repro script:
DROP TEMPORARY TABLE IF EXISTS `units`;
DROP TEMPORARY TABLE IF EXISTS `trans`;
CREATE TEMPORARY TABLE `units` (`unit_date` date, `unit_id` char(1) , `unit_sku` char(5), PRIMARY KEY(`unit_id`));
CREATE TEMPORARY TABLE `trans` (`tran_date` date, `tran_id` char(1) , `unit_sku` char(5), PRIMARY KEY(`tran_id`));
INSERT INTO `units` (`unit_date`, `unit_id`, `unit_sku`) VALUES
('2015-06-01', '1', 'U1KLM')
, ('2015-06-02', '2', 'U1KLM')
, ('2015-06-03', '3', 'U2QRS')
, ('2015-06-04', '4', 'U2QRS')
, ('2015-06-05', '5', 'U1KLM')
;
INSERT INTO `trans` (`tran_date`, `tran_id`, `unit_sku`) VALUES
('2015-06-11', 'A', 'U2QRS')
, ('2015-06-12', 'B', 'U1KLM')
, ('2015-06-13', 'C', 'U1KLM')
, ('2015-06-14', 'D', 'U2QRS')
, ('2015-06-15', 'E', 'U1KLM')
;
SELECT `units`.`unit_date`, `units`.`unit_id`, `trans`.`tran_date`, `trans`.`tran_id`, `units`.`unit_sku` FROM `units`
INNER JOIN `trans`
ON `trans`.`unit_sku` = `units`.`unit_sku`
GROUP BY `trans`.`tran_id`, `trans`.`unit_sku`
ORDER BY `units`.`unit_date` asc, `trans`.`tran_date` asc
;
I believe this is what you're looking for (this assumes a one-to-one relationship between units and transactions):
SET @UNITRN := 0;
SET @TRANSRN := 0;
SELECT A.`unit_date`, A.`unit_id`, B.`tran_date`, B.`tran_id`, A.`unit_sku` FROM (SELECT @UNITRN := @UNITRN + 1 AS ROWNUM, UNIT_DATE, UNIT_ID, UNIT_SKU FROM UNITS ORDER BY UNIT_SKU, UNIT_DATE ASC) A
JOIN (SELECT @TRANSRN := @TRANSRN + 1 AS ROWNUM, TRAN_DATE, TRAN_ID, UNIT_SKU FROM TRANS ORDER BY UNIT_SKU, TRAN_DATE ASC) B
ON A.UNIT_SKU = B.UNIT_SKU
AND A.ROWNUM = B.ROWNUM
ORDER BY A.UNIT_DATE ASC
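For comparison, here is a Python sketch of the first-in-first-out pairing on the question's sample data. Note one difference from the query above: the sketch ranks rows within each SKU rather than using one running row number per table, so it does not depend on both tables having the same number of rows per SKU. The function name `fifo_match` is made up for this example.

```python
from collections import defaultdict

# Sample rows from the question
units = [  # (unit_date, unit_id, unit_sku)
    ("2015-06-01", "1", "U1KLM"), ("2015-06-02", "2", "U1KLM"),
    ("2015-06-03", "3", "U2QRS"), ("2015-06-04", "4", "U2QRS"),
    ("2015-06-05", "5", "U1KLM"),
]
trans = [  # (tran_date, tran_id, unit_sku)
    ("2015-06-11", "A", "U2QRS"), ("2015-06-12", "B", "U1KLM"),
    ("2015-06-13", "C", "U1KLM"), ("2015-06-14", "D", "U2QRS"),
    ("2015-06-15", "E", "U1KLM"),
]


def fifo_match(units, trans):
    """Pair the k-th oldest unit of each SKU with the k-th oldest transaction of that SKU."""
    units_by_sku = defaultdict(list)
    trans_by_sku = defaultdict(list)
    for unit_date, unit_id, sku in units:
        units_by_sku[sku].append((unit_date, unit_id))
    for tran_date, tran_id, sku in trans:
        trans_by_sku[sku].append((tran_date, tran_id))
    matched = []
    for sku, sku_units in units_by_sku.items():
        # zip stops at the shorter list, so unmatched units or transactions are dropped.
        pairs = zip(sorted(sku_units), sorted(trans_by_sku[sku]))
        for (unit_date, unit_id), (tran_date, tran_id) in pairs:
            matched.append((unit_date, unit_id, tran_date, tran_id, sku))
    return sorted(matched)  # ordered by unit_date


for row in fifo_match(units, trans):
    print(row)
```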