Unable to INSERT ON DUPLICATE KEY UPDATE from another query - mysql

I am working on the following table:
CREATE TABLE `cons` (
`Id` char(20) NOT NULL,
`Client_ID` char(12) NOT NULL,
`voice_cons` decimal(11,8) DEFAULT '0.00000000',
`data_cons` int(11) DEFAULT '0',
`day` date DEFAULT NULL,
PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I need to get some data from another table, cdr, which contains a row per event. This means every call or data connection has its own row.
+-----------+--------------+----------------+-------+
| Client_ID | Data_Up_Link | Data_Down_Link | Price |
+-----------+--------------+----------------+-------+
| 1 | 23 | 56 | 0 |
| 1 | 12 | 3 | 0 |
| 1 | 0 | 0 | 5 |
+-----------+--------------+----------------+-------+
I need to compute the total voice and data consumption for each Client_ID in my new cons table, but just keeping a single record for each Client_ID and day. To keep the question simple, I will consider just one day.
+-----------+-----------+------------+
| Client_ID | data_cons | voice_cons |
+-----------+-----------+------------+
| 1 | 94 | 5 |
+-----------+-----------+------------+
I have unsuccessfully tried the following, among many other variations (aliases, ...):
insert into cons_day (Id, Client_ID, voice_cons, MSISDN, day)
select
concat(Client_ID,date_format(date,'%Y%m%d')),
Client_ID,
sum(Price) as voice_cons,
date as day
from cdr
where Type_Cdr='VOICE'
group by Client_ID;
insert into cons_day (Id, Client_ID, data_cons, MSISDN, day)
select
concat(Client_ID,date_format(date,'%Y%m%d')),
Client_ID,
sum(Data_Down_Link+Data_Up_Link) as data_cons,
Calling_Number as MSISDN,
date as day
from cdr
where Type_Cdr='DATA'
group by Client_ID
on duplicate key update data_cons=data_cons;
But I keep getting the values unchanged or receiving SQL errors. I would really appreciate a piece of advice.
Thank you very much in advance.

First of all, the Id column in the cons table seems redundant. You already have the Client_ID and day columns, so just make them the PRIMARY KEY.
That being said, the proposed table schema might look like this:
CREATE TABLE `cons`
(
`Client_ID` char(12) NOT NULL,
`voice_cons` decimal(11,8) DEFAULT '0.00000000',
`data_cons` int(11) DEFAULT '0',
`day` date NOT NULL,
PRIMARY KEY (`Client_ID`, `day`)
);
Now you can use conditional aggregation to get voice_cons and data_cons in one go:
SELECT Client_ID,
SUM(CASE WHEN Type_CDR = 'VOICE' THEN price ELSE 0 END) voice_cons,
SUM(CASE WHEN Type_CDR = 'DATA' THEN Data_Up_Link + Data_Down_Link ELSE 0 END) data_cons,
DATE(date) day
FROM cdr
GROUP BY Client_ID, DATE(date)
Note: you have to GROUP BY both Client_ID and DATE(date). The ELSE 0 keeps a sum at 0 instead of NULL when a client has no rows of that type on a given day.
Now the INSERT statement should look like this:
INSERT INTO cons (Client_ID, voice_cons, data_cons, day)
SELECT Client_ID,
SUM(CASE WHEN Type_CDR = 'VOICE' THEN price ELSE 0 END) voice_cons,
SUM(CASE WHEN Type_CDR = 'DATA' THEN Data_Up_Link + Data_Down_Link ELSE 0 END) data_cons,
DATE(date) day
FROM cdr
GROUP BY Client_ID, DATE(date)
ON DUPLICATE KEY UPDATE voice_cons = VALUES(voice_cons),
data_cons = VALUES(data_cons);
Note: since you now get both voice_cons and data_cons in a single statement, you might not need the ON DUPLICATE KEY clause at all, provided you don't process data for the same dates multiple times.
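To see the conditional aggregation and the upsert working together, here is a minimal runnable sketch. It uses Python's built-in sqlite3 module rather than MySQL, so the upsert is spelled ON CONFLICT ... DO UPDATE with excluded.col (SQLite 3.24+) where MySQL uses ON DUPLICATE KEY UPDATE with VALUES(col). The cdr column types and the sample date are assumptions, since the question does not show the cdr schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed cdr layout; the question only shows some of its columns.
CREATE TABLE cdr (
    Client_ID TEXT, Type_Cdr TEXT, Data_Up_Link INTEGER,
    Data_Down_Link INTEGER, Price REAL, date TEXT
);
CREATE TABLE cons (
    Client_ID TEXT NOT NULL, voice_cons REAL DEFAULT 0,
    data_cons INTEGER DEFAULT 0, day TEXT NOT NULL,
    PRIMARY KEY (Client_ID, day)
);
-- Sample rows from the question; the date is hypothetical.
INSERT INTO cdr VALUES
    ('1', 'DATA',  23, 56, 0, '2014-02-01 10:00:00'),
    ('1', 'DATA',  12,  3, 0, '2014-02-01 11:00:00'),
    ('1', 'VOICE',  0,  0, 5, '2014-02-01 12:00:00');
""")

# SQLite counterpart of INSERT ... SELECT ... ON DUPLICATE KEY UPDATE.
# "WHERE true" avoids a parsing ambiguity SQLite has with INSERT ... SELECT upserts.
UPSERT = """
INSERT INTO cons (Client_ID, voice_cons, data_cons, day)
SELECT Client_ID,
       SUM(CASE WHEN Type_Cdr = 'VOICE' THEN Price ELSE 0 END),
       SUM(CASE WHEN Type_Cdr = 'DATA' THEN Data_Up_Link + Data_Down_Link ELSE 0 END),
       DATE(date)
FROM cdr
WHERE true
GROUP BY Client_ID, DATE(date)
ON CONFLICT (Client_ID, day) DO UPDATE SET
    voice_cons = excluded.voice_cons,
    data_cons  = excluded.data_cons
"""

conn.execute(UPSERT)
conn.execute(UPSERT)  # running it again updates the row instead of failing

rows = conn.execute(
    "SELECT Client_ID, data_cons, voice_cons, day FROM cons"
).fetchall()
print(rows)
```

Running the statement twice leaves a single row per (Client_ID, day), which is the behaviour the composite primary key is meant to give.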

Related

Query to find an entry between dates

I have a table containing several records associated to the same entities. Two of the fields are dates - start and end dates of a specific period.
Example:
| ID | Name | Start | End |
| :--- | :--- | :--- | :--- |
| 3 | Fred | 2022/01/01 | 2100/12/31 |
| 2 | John | 2018/01/01 | 2021/12/31 |
| 1 | Mark | 2014/03/22 | 2017/12/31 |
The dates and names vary, but the only rule is that there are NO OVERLAPS - it's a succession of people in charge of a unique role, so there is only one record which is valid for any date.
I have a query returning me a date (let's call it $ThatDay) and what I am trying to do is to find a way to find which name it was at that specific date. For example, if the date was July 4th, 2019, the result of the query I am after would be "John"
I have run out of ideas on how to structure a query to help me find it. Thank you in advance for any help!
You can use a SELECT with BETWEEN in the WHERE clause.
MySQL's date format is yyyy-mm-dd; if you stick to it, you will never have problems.
CREATE TABLE datetab (
`ID` INTEGER,
`Name` VARCHAR(4),
`Start` DATETIME,
`End` DATETIME
);
INSERT INTO datetab
(`ID`, `Name`, `Start`, `End`)
VALUES
('3', 'Fred', '2022/01/01', '2100/12/31'),
('2', 'John', '2018/01/01', '2021/12/31'),
('1', 'Mark', '2014/03/22', '2017/12/31');
SELECT `Name` FROM datetab WHERE '2019-07-04' BETWEEN `Start` AND `End`
| Name |
| :--- |
| John |
If you have a (sub)query that returns a date, you can join it, for example:
SELECT `Name`
FROM datetab CROSS JOIN (SELECT '2019-07-04' as mydate FROM dual) t1
WHERE mydate BETWEEN `Start` AND `End`
| Name |
| :--- |
| John |
Also, when the query returns only one row and one column, you can use the subquery like this:
SELECT `Name`
FROM datetab
WHERE (SELECT '2019-07-04' as mydate FROM dual) BETWEEN `Start` AND `End`
| Name |
| :--- |
| John |
Select where the result of your find-date query is between start and end:
select * from mytable
where (<my find date query>)
between start and end
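As a quick check of the BETWEEN lookup with the date passed in as a parameter, here is a small sketch using Python's sqlite3 module in place of MySQL. The helper name name_on is hypothetical, and the dates are stored as ISO yyyy-mm-dd text so that plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- "End" is a keyword in SQLite, so both date columns are quoted.
CREATE TABLE datetab (ID INTEGER, Name TEXT, "Start" TEXT, "End" TEXT);
INSERT INTO datetab (ID, Name, "Start", "End") VALUES
    (3, 'Fred', '2022-01-01', '2100-12-31'),
    (2, 'John', '2018-01-01', '2021-12-31'),
    (1, 'Mark', '2014-03-22', '2017-12-31');
""")

def name_on(that_day):
    """Return the name whose period contains that_day, or None if no period does."""
    row = conn.execute(
        'SELECT Name FROM datetab WHERE ? BETWEEN "Start" AND "End"',
        (that_day,),
    ).fetchone()
    return row[0] if row else None

print(name_on("2019-07-04"))
print(name_on("2017-12-31"))  # BETWEEN is inclusive on both ends
print(name_on("2014-01-01"))  # before the first period
```

Because the periods never overlap, at most one row can match, so fetchone() is enough.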

Difference between 2 records timestamp sql

I have this table:
CREATE TABLE result (
id bigint(20) NOT NULL AUTO_INCREMENT,
tag int(11) NOT NULL,
timestamp timestamp NULL DEFAULT NULL,
value double NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY nasudnBBEby333412dsa (timestamp, tag)
) ENGINE=InnoDB AUTO_INCREMENT=115 DEFAULT CHARSET=utf8mb4;
I would like to calculate the difference between two consecutive days that have the same column tag. For example, in timestamp:
| 1 | 1 | 2017-06-18 00:00:00 | 7.3 |
| 2 | 1 | 2017-06-17 00:00:00 | 7.4 |
I want the result to be: -0.1
Which query should I write?
You can try this
1) Use join to select value of next consecutive day.
2) then calculate difference
SELECT r1.id, r1.tag, r1.value AS CURRENT_VALUE, r2.value AS NEXT_VALUE,
(r1.value - r2.value) AS DIFF, r1.timestamp
FROM `result` r1
LEFT JOIN `result` r2 ON r2.tag = r1.tag AND r2.`timestamp` = r1.`timestamp` + INTERVAL 1 DAY
WHERE r2.value IS NOT NULL
First, if you want to store date values, you can use date, so there is no time component.
Second, you can do this with join:
select r.*, (r.value - rprev.value) as diff
from result r left join
result rprev
on r.tag = rprev.tag and
r.timestamp = rprev.timestamp + interval 1 day;
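The self-join on the previous day can be tried out with the sketch below, which uses Python's sqlite3 module instead of MySQL. SQLite has no INTERVAL syntax, so the one-day shift is written as datetime(ts, '+1 day'). The row with tag 2 is a hypothetical addition to show that tags are compared separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE result (
    id INTEGER PRIMARY KEY,
    tag INTEGER NOT NULL,
    timestamp TEXT,
    value REAL NOT NULL,
    UNIQUE (timestamp, tag)
);
INSERT INTO result (id, tag, timestamp, value) VALUES
    (1, 1, '2017-06-18 00:00:00', 7.3),
    (2, 1, '2017-06-17 00:00:00', 7.4),
    (3, 2, '2017-06-18 00:00:00', 1.0);  -- hypothetical extra tag
""")

# Pair each row with the row of the same tag exactly one day earlier.
rows = conn.execute("""
SELECT r.id, r.tag, r.timestamp, r.value - rprev.value AS diff
FROM result r
LEFT JOIN result rprev
       ON rprev.tag = r.tag
      AND r.timestamp = datetime(rprev.timestamp, '+1 day')
ORDER BY r.tag, r.timestamp
""").fetchall()

for row in rows:
    print(row)
```

Rows without a previous day for the same tag come back with a NULL (None) diff because of the LEFT JOIN; switch to an inner JOIN if those rows should be dropped.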

MySQL/memSQL not using index on BETWEEN join condition

We have two tables:
A dates table that contains one date per day for the last 10 and next 10 years.
A states table that has the following columns: start_date, end_date, state.
The query we run looks like this:
SELECT dates.date, COUNT(*)
FROM dates
JOIN states
ON dates.date BETWEEN states.start_date AND states.end_date
WHERE dates.date BETWEEN '2017-01-01' AND '2017-01-31'
GROUP BY dates.date
ORDER BY dates.date;
According to the query plan, memSQL isn't using an index on the JOIN condition and this makes the query slow. Is there a way we can use an index on the JOIN condition?
We tried memSQL skiplist indexes on dates.date, states.start_date, states.end_date, (states.start_date, states.end_date)
Tables & EXPLAIN:
CREATE TABLE `dates` (
`date` date DEFAULT NULL,
KEY `date_index` (`date`)
)
CREATE TABLE `states` (
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
`state` varchar(256) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
KEY `start_date` (`start_date`),
KEY `end_date` (`end_date`),
KEY `start_date_end_date` (`start_date`,`end_date`)
)
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| GatherMerge [remote_0.date] partitions:all est_rows:96 alias:remote_0 |
| Project [r2.date, CAST(COALESCE($0,0) AS SIGNED) AS `COUNT(*)`] est_rows:96 |
| Sort [r2.date] |
| HashGroupBy [SUM(r2.`COUNT(*)`) AS $0] groups:[r2.date] |
| TableScan r2 storage:list stream:no |
| Repartition [r1.date, `COUNT(*)`] AS r2 shard_key:[date] est_rows:96 est_select_cost:26764032 |
| HashGroupBy [COUNT(*) AS `COUNT(*)`] groups:[r1.date] |
| Filter [r1.date <= states.end_date] |
| NestedLoopJoin |
| |---IndexRangeScan drstates_test.states, KEY start_date (start_date) scan:[start_date <= r1.date] est_table_rows:123904 est_filtered:123904 |
| TableScan r1 storage:list stream:no |
| Broadcast [dates.date] AS r1 distribution:tree est_rows:96 |
| IndexRangeScan drstates_test.dates, KEY date_index (date) scan:[date >= '2017-01-01' AND date <= '2017-01-31'] est_table_rows:18628 est_filtered:96 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
ON dates.date BETWEEN states.start_date
AND states.end_date
is essentially un-optimizable. The only practical way to perform this test is to tediously test every row.
If you are using MySQL and don't need the dates table, consider starting with
SELECT *
FROM states
WHERE start_date >= '2017-01-01'
AND end_date < '2017-01-01' + INTERVAL 1 MONTH
Note that this works for any combination of DATE and DATETIME datatypes.
Since I am unclear on the ultimate goal, I am unclear on what to do next.

Validating presence of value(s) in a (sub)table and return a "boolean" result

I want to create a query in MySQL on an order table that verifies whether the order has a booking_id; if it does not, a booking_id should be present on all related rows in the invoice table.
I want the value returned to be a boolean in a single field.
Taking the example given:
Case of id #1: I expect an immediate true, because a booking_id is available on the order.
Case of id #2: I expect a "delayed" false from the invoice table, as not all related invoices have a booking_id. It should only return true if invoice id #3 actually has a booking_id, meaning all invoices have a booking_id when the order does not.
I've tried several ways but still failed and don't even know what the best way to tackle this is.
Thanks for your input in advance!
Table order:
+----+------------+
| id | booking_id |
+----+------------+
| 1 | 123 |
| 2 | NULL |
+----+------------+
Table invoice:
+----+----------+------------+
| id | order_id | booking_id |
+----+----------+------------+
| 1 | 1 | 123 |
| 2 | 2 | 124 |
| 3 | 2 | NULL |
+----+----------+------------+
Schema
CREATE TABLE IF NOT EXISTS `invoice` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`order_id` int(11) NOT NULL,
`booking_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE IF NOT EXISTS `order` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`booking_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
If I understand you correctly, this is the base query for your request:
SELECT
O.id
, SUM(CASE WHEN I.booking_id IS NOT NULL THEN 1 ELSE 0 END) AS booked_count
, COUNT(1) AS total_count
, CASE WHEN SUM(CASE WHEN I.booking_id IS NOT NULL THEN 1 ELSE 0 END) = COUNT(1) THEN 1 ELSE 0 END AS has_all_bookings
FROM
`order` O
LEFT JOIN invoice I
ON O.id = I.order_id
GROUP BY
O.id
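If you also want to detect orders that have no records in the invoice table at all, add COUNT(I.id) = 0 as an additional condition to the last CASE expression. COUNT(1) cannot be used for this, because the LEFT JOIN always produces at least one row per order.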
Fiddle Demo
I have not understood how the logic works out when the order is booked but some of the invoices are not. I'll presume either is good for a true value (OR logic). I'd avoid COUNT and GROUP BY and go for a subselect, which works fine in MySQL (I'm using an old 5.1.73-1 version).
This query gives you both values in distinct columns:
SELECT o.*
, (booking_id IS NOT NULL) AS order_booked
, (NOT EXISTS (SELECT id FROM `invoice` WHERE order_id=o.id AND booking_id IS NULL)) AS invoices_all_booked
FROM `order` o
Of course you can combine the values:
SELECT o.*
, (booking_id IS NOT NULL OR NOT EXISTS (SELECT id FROM `invoice` WHERE order_id=o.id AND booking_id IS NULL)) AS booked
FROM `order` o
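The combined NOT EXISTS check can be verified against the sample data with the sketch below. It runs on Python's sqlite3 module rather than MySQL, so the reserved table name is quoted as "order" instead of with backticks. The booking_id value 125 used in the update is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "order" (id INTEGER PRIMARY KEY, booking_id INTEGER);
CREATE TABLE invoice (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL,
    booking_id INTEGER
);
INSERT INTO "order" VALUES (1, 123), (2, NULL);
INSERT INTO invoice VALUES (1, 1, 123), (2, 2, 124), (3, 2, NULL);
""")

# booked = the order has a booking_id, or none of its invoices lacks one.
BOOKED = """
SELECT o.id,
       (o.booking_id IS NOT NULL
        OR NOT EXISTS (SELECT 1 FROM invoice i
                       WHERE i.order_id = o.id
                         AND i.booking_id IS NULL)) AS booked
FROM "order" o
ORDER BY o.id
"""

before = conn.execute(BOOKED).fetchall()

# Give invoice #3 a booking_id (hypothetical value) and check again.
conn.execute("UPDATE invoice SET booking_id = 125 WHERE id = 3")
after = conn.execute(BOOKED).fetchall()

print(before)
print(after)
```

Note that an order with no booking_id and no invoices at all also counts as booked under this logic, since there is no invoice lacking a booking_id; add an EXISTS check if that case should be false.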
Here you go, create a view that does it
create view booked_view as
select `order`.id as order_id
,
case when booking_id > 0 then true
when not exists (SELECT id FROM invoice WHERE order_id=`order`.id AND invoice.booking_id IS NULL) then true
else false
end as booked
from `order` ;
Then just join your view to the order table and you will have your boolean column 'booked'
select o.id, booked from `order` o
join booked_view on (o.id = booked_view.order_id)

How do I get a left join with a group by clause to return all the rows?

I am trying to write a query to determine how much of my inventory is committed at a given time, i.e. current, next month, etc.
A simplified example:
I have an inventory table of items. I have an offer table that specifies the customer, when the offer starts, and when the offer expires. I have a third table that associates the two.
create table inventory
(id int not null auto_increment , name varchar(32) not null, primary key(id));
create table offer
(id int not null auto_increment , customer_name varchar(32) not null, starts_at datetime not null, expires_at datetime, primary key (id));
create table items
(id int not null auto_increment, inventory_id int not null, offer_id int not null, primary key (id),
CONSTRAINT fk_item__offer FOREIGN KEY (offer_id) REFERENCES offer(id),
CONSTRAINT fk_item__inventory FOREIGN KEY (inventory_id) REFERENCES inventory(id));
create some inventory
insert into inventory(name)
values ('item 1'), ('item 2'),('item 3');
create two offers for this month
insert into offer(customer_name, starts_at)
values ('customer 1', DATE_FORMAT(NOW(), '%Y-%m-01')), ('customer 2', DATE_FORMAT(NOW(), '%Y-%m-01'));
and one for next month
insert into offer(customer_name, starts_at)
values ('customer 3', DATE_FORMAT(DATE_ADD(CURDATE(), INTERVAL 1 MONTH), '%Y-%m-01'));
Now add some items to each offer
insert into items(inventory_id, offer_id)
values (1,1), (2,1), (2,2), (3,3);
What I want is a query that will show me all the inventory and the count of the committed inventory for this month. Inventory would be considered committed if the starts_at is less than or equal to now, and the offer has not expired (expires_at is null or expires_at is in the future)
The results I would expect would look like this:
+----+--------+---------------------+
| id | name | committed_inventory |
+----+--------+---------------------+
| 1 | item 1 | 1 |
| 2 | item 2 | 2 |
| 3 | item 3 | 0 |
+----+--------+---------------------+
3 rows in set (0.00 sec)
The query that I felt should work is:
SELECT inventory.id
, inventory.name
, count(items.id) as committed_inventory
FROM inventory
LEFT JOIN items
ON items.inventory_id = inventory.id
LEFT JOIN offer
ON offer.id = items.offer_id
WHERE (offer.starts_at IS NULL OR offer.starts_at <= NOW())
AND (offer.expires_at IS NULL OR offer.expires_at > NOW())
GROUP BY inventory.id, inventory.name;
However, the results from this query do not include the third item. What I get is this:
+----+--------+---------------------+
| id | name | committed_inventory |
+----+--------+---------------------+
| 1 | item 1 | 1 |
| 2 | item 2 | 2 |
+----+--------+---------------------+
2 rows in set (0.00 sec)
I cannot figure out how to get the third inventory item to show. Since inventory is the driving table in the outer joins, I thought that it should always show.
The problem is the where clause. Try this:
SELECT inventory.id
, inventory.name
, count(offer.id) as committed_inventory
FROM inventory
LEFT JOIN items
ON items.inventory_id = inventory.id
LEFT JOIN offer
ON offer.id = items.offer_id
AND offer.starts_at <= NOW()
AND (offer.expires_at IS NULL OR offer.expires_at > NOW())
GROUP BY inventory.id, inventory.name;
The problem is that you get a matching offer, but it isn't currently valid. So the WHERE clause fails: the offer dates are not NULL (there is a match), and the date comparison fails because the offer is not current. Putting the date conditions in the ON clause keeps the inventory row and simply leaves the offer unmatched.
For item 3, the starts_at from the offer table is set to March 01, 2014, which is greater than NOW(), so the (offer.starts_at IS NULL OR offer.starts_at <= NOW()) condition filters out the item 3 record.
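The difference between filtering in WHERE and filtering in ON can be reproduced with the sketch below. It uses Python's sqlite3 module instead of MySQL, and replaces NOW() with a fixed :now parameter so the result is repeatable. The fixed dates (offers starting 2014-02-01 and 2014-03-01, "now" at 2014-02-15) are assumptions chosen to match the March 01, 2014 start mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE offer (
    id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL,
    starts_at TEXT NOT NULL, expires_at TEXT
);
CREATE TABLE items (
    id INTEGER PRIMARY KEY,
    inventory_id INTEGER NOT NULL, offer_id INTEGER NOT NULL
);
INSERT INTO inventory (name) VALUES ('item 1'), ('item 2'), ('item 3');
INSERT INTO offer (customer_name, starts_at) VALUES
    ('customer 1', '2014-02-01 00:00:00'),
    ('customer 2', '2014-02-01 00:00:00'),
    ('customer 3', '2014-03-01 00:00:00');
INSERT INTO items (inventory_id, offer_id) VALUES (1, 1), (2, 1), (2, 2), (3, 3);
""")

NOW = {"now": "2014-02-15 00:00:00"}  # fixed stand-in for NOW()

# Date filter in WHERE: rows whose offer is not current are removed after the join.
filter_in_where = conn.execute("""
SELECT inventory.id, inventory.name, count(items.id) AS committed_inventory
FROM inventory
LEFT JOIN items ON items.inventory_id = inventory.id
LEFT JOIN offer ON offer.id = items.offer_id
WHERE (offer.starts_at IS NULL OR offer.starts_at <= :now)
  AND (offer.expires_at IS NULL OR offer.expires_at > :now)
GROUP BY inventory.id, inventory.name
ORDER BY inventory.id
""", NOW).fetchall()

# Date filter in ON: a non-current offer just fails to match, the inventory row stays.
filter_in_on = conn.execute("""
SELECT inventory.id, inventory.name, count(offer.id) AS committed_inventory
FROM inventory
LEFT JOIN items ON items.inventory_id = inventory.id
LEFT JOIN offer ON offer.id = items.offer_id
               AND offer.starts_at <= :now
               AND (offer.expires_at IS NULL OR offer.expires_at > :now)
GROUP BY inventory.id, inventory.name
ORDER BY inventory.id
""", NOW).fetchall()

print(filter_in_where)
print(filter_in_on)
```

Counting offer.id rather than items.id matters in the second query: the items row for item 3 still joins, and only the offer side is NULL.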