Improve performance of WareHouse query on Exact Online - exact-online

We have 20 warehouses and 3.000 articles. Therefore there are 60.000 rows in the ItemWarehouses table of Exact Online. However, retrieval takes 1200 ms per 60 rows, so total query on this data volume for a warehouse analysis takes 3-4 hours.
I've tried to restrict the amount of data retrieved using the following filter, because we are only interested in items with some non-zero stock information:
select t.*
from exactonlinerest..itemwarehouses t
where ( currentstock != 0 or projectedstock != 0 or plannedstockin != 0 or plannedstockout != 0 or safetystock != 0 or reorderpoint != 0)
But it still downloads all 60.000 combinations and filters them on the PC. The result at the end is approximately 700 valid combinations of warehouse and item stock information.
Is there a way to retrieve the data in a more performant way?

Invantive SQL does not forward OR-constructs to the server-side. But in this case you might want to change the OR into a UNION (without ALL):
select t.*
from exactonlinerest..itemwarehouses t
where currentstock != 0
union
select t.*
from exactonlinerest..itemwarehouses t
where projectedstock != 0
union
select t.*
from exactonlinerest..itemwarehouses t
where plannedstockin != 0
union
select t.*
from exactonlinerest..itemwarehouses t
where plannedstockout != 0
union
select t.*
from exactonlinerest..itemwarehouses t
where safetystock != 0
union
select t.*
from exactonlinerest..itemwarehouses t
where reorderpoint != 0
These filters are forwarded to Exact Online and should run very fast given your data distribution. The UNION ensures that you only get the unique rows back.
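As a quick sanity check that the rewrite returns the same rows, here is a minimal sketch in Python. It assumes an in-memory SQLite table as a stand-in for ItemWarehouses (the four rows are made up, and SQLite says nothing about what Exact Online or Invantive SQL do server-side); it only compares the OR form against the UNION form.

```python
import sqlite3

# Hypothetical miniature of ItemWarehouses; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table itemwarehouses ("
    " id integer primary key, currentstock real, projectedstock real,"
    " plannedstockin real, plannedstockout real, safetystock real, reorderpoint real)"
)
conn.executemany(
    "insert into itemwarehouses values (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 0, 0, 0, 0, 0, 0),   # all zero: should be filtered out
        (2, 5, 5, 0, 0, 0, 0),   # matches two branches: must appear only once
        (3, 0, 0, 0, 2, 0, 0),   # only plannedstockout is non-zero
        (4, 0, 0, 0, 0, 0, 10),  # only reorderpoint is non-zero
    ],
)

COLUMNS = ["currentstock", "projectedstock", "plannedstockin",
           "plannedstockout", "safetystock", "reorderpoint"]

# Original form: a single query with OR-ed predicates.
or_sql = "select id from itemwarehouses where " + " or ".join(
    f"{c} != 0" for c in COLUMNS)

# Rewritten form: one single-predicate query per column, combined with UNION.
union_sql = " union ".join(
    f"select id from itemwarehouses where {c} != 0" for c in COLUMNS)

or_ids = sorted(r[0] for r in conn.execute(or_sql))
union_ids = sorted(r[0] for r in conn.execute(union_sql))
print(or_ids, union_ids)
```

Row 2 satisfies two of the branches, which is why plain UNION (not UNION ALL) matters here.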

Related

improve sql query with 2 EXISTS sub queries

I have this query (mysql):
SELECT `budget_items`.*
FROM `budget_items`
WHERE (budget_category_id = 4
AND ((is_custom_for_family = 0)
OR (is_custom_for_family = 1
AND custom_item_family_id = 999))
AND ((EXISTS
(SELECT 1
FROM balance_histories
WHERE balance_histories.budget_item_id = budget_items.id
AND balance_histories.family_id = 999
AND payment_date >= '2021-02-01'
AND payment_date <= '2021-02-28' ))
OR (EXISTS
(SELECT 1
FROM budget_lines
WHERE family_id = 999
AND budget_id = 188311
AND budget_item_id = budget_items.id
AND amount > 0))))
It runs multiple times on app start. It takes more than 10 seconds (all of them).
I have indexes on:
balance_histories table: budget_item_id, family_id (tried also payment_date)
budget_lines table: family_id, budget_id, budget_item_id
How can I improve the speed? Query or maybe mysql (8) configuration.
I would start this query in reverse of what you have. Assuming you COULD have years of data, but your EXISTS query is looking more specifically at a date-range, or specific budget lines, start there, it will probably be much smaller. Once you have DISTINCT IDs, then go back to the budget items by qualified ID PLUS the additional criteria.
To help optimize the queries, I would have indexes on
table               index
balance_histories   ( family_id, payment_date, budget_item_id )
budget_lines        ( family_id, budget_id, amount )
budget_items        ( id, budget_category_id, is_custom_for_family, custom_item_family_id )
select
bi.*
from
-- pre-query a list of DISTINCT IDs from the balance history
-- and budget lines that qualify. THEN join to the rest.
( select distinct
bh.budget_item_id id
from
balance_histories bh
where
bh.family_id = 999
AND bh.payment_date >= '2021-02-01'
AND bh.payment_date <= '2021-02-28'
UNION
select
bl.budget_item_id
FROM
budget_lines bl
WHERE
bl.family_id = 999
AND bl.budget_id = 188311
AND bl.amount > 0 ) PQ
JOIN budget_items bi
on PQ.id = bi.id
AND bi.budget_category_id = 4
AND ( bi.is_custom_for_family = 0
OR
( bi.is_custom_for_family = 1
AND bi.custom_item_family_id = 999 )
)
Feedback
As with many SQL queries, there are typically multiple ways to get a solution. Sometimes using EXISTS works well, sometimes not as much. You need to consider the cardinality of your data, and that is what I was shooting for. Look at what you were asking for first: get budget items that are all category 4 and where custom for family is 1 or 0 (which is all of them), but if custom for a family, only those for 999. You were correct on your balance of AND/OR. However, this goes through EVERY RECORD, and if you have millions of rows, that is what you are scanning through. Only after scanning every row are you doing a secondary query (for each record that qualified) against the histories for the specific date range OR the budget lines for the family/budget.
My guess is that the number of records returned from your two EXISTS queries is going to be very small. So starting by getting a DISTINCT list of just those IDs that are part of that union gives a very small subset. Once a single "ID" is found, it becomes a direct match to the budget items table, and the final filtering on category ID / family / custom item is applied there.
Having indexes that better match the context of your query's WHERE clause will optimize pulling the data. I have answered several other questions with similar resolutions and clarified the indexes and the reasons for them there.
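To check that the reversed query returns the same items as the original EXISTS version, here is a minimal sketch in Python. It assumes SQLite as a stand-in for MySQL and a handful of invented rows; it only demonstrates that the two forms agree on the result, not how fast either one is on the real tables.

```python
import sqlite3

# SQLite stands in for MySQL; all rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table budget_items (id integer primary key, budget_category_id integer,
                           is_custom_for_family integer, custom_item_family_id integer);
create table balance_histories (budget_item_id integer, family_id integer, payment_date text);
create table budget_lines (family_id integer, budget_id integer,
                           budget_item_id integer, amount real);

insert into budget_items values
  (1, 4, 0, null),  -- qualifies through a February history row
  (2, 4, 1, 999),   -- qualifies through a budget line
  (3, 4, 1, 123),   -- custom item of another family: excluded
  (4, 4, 0, null),  -- no matching history or line: excluded
  (5, 7, 0, null),  -- wrong category: excluded
  (6, 4, 0, null);  -- matches both sources: must appear once
insert into balance_histories values
  (1, 999, '2021-02-10'), (3, 999, '2021-02-11'), (5, 999, '2021-02-12'),
  (6, 999, '2021-02-13'), (6, 999, '2021-02-14'),
  (4, 999, '2021-03-01'), (4, 555, '2021-02-05');
insert into budget_lines values
  (999, 188311, 2, 50), (999, 188311, 6, 20),
  (999, 188311, 4, 0), (999, 1, 4, 30);
""")

ITEM_FILTER = """bi.budget_category_id = 4
  and (bi.is_custom_for_family = 0
       or (bi.is_custom_for_family = 1 and bi.custom_item_family_id = 999))"""
HISTORY_FILTER = ("bh.family_id = 999 and bh.payment_date >= '2021-02-01' "
                  "and bh.payment_date <= '2021-02-28'")
LINE_FILTER = "bl.family_id = 999 and bl.budget_id = 188311 and bl.amount > 0"

# Form from the question: filter budget_items, probing each row with EXISTS.
exists_sql = f"""
select bi.id from budget_items bi
where {ITEM_FILTER}
  and (exists (select 1 from balance_histories bh
               where bh.budget_item_id = bi.id and {HISTORY_FILTER})
       or exists (select 1 from budget_lines bl
                  where bl.budget_item_id = bi.id and {LINE_FILTER}))
order by bi.id"""

# Form from the answer: collect the distinct qualifying IDs first, then join.
prequery_sql = f"""
select bi.id
from (select distinct bh.budget_item_id as id
      from balance_histories bh where {HISTORY_FILTER}
      union
      select bl.budget_item_id
      from budget_lines bl where {LINE_FILTER}) pq
join budget_items bi on pq.id = bi.id and {ITEM_FILTER}
order by bi.id"""

exists_ids = [r[0] for r in conn.execute(exists_sql)]
prequery_ids = [r[0] for r in conn.execute(prequery_sql)]
print(exists_ids, prequery_ids)
```

Item 6 has two history rows and a budget line, so it also shows that the UNION in the pre-query keeps the join from producing duplicates.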

SQL nested query alternative

SQL newbie here. I have a table where I have OrderID and State of the order.
OrderID, State, TimeStamp
1 0 20210502151515
1 1 20210502161616
1 2 20210502171717
2 0 20210502151617
2 1 20210502161718
2 3 20210502171819
3 0 20210502121617
3 4 20210502121718
4 0 20210502131617
5 0 20210502141718
6 0 20210502151515
6 2 20210502171717
7 0 20210502151515
7 1 20210502171717
Where 0 = OPEN, 1=Partially Completed, 2=Fully Completed, 3=Cancelled, 4=Rejected
I want to run a query where it would return orders that are OPEN (state=0) or Partially Completed (state=1). If the order is Fully completed, Cancelled or Rejected, I want to exclude those orders.
If I simply select orders with state 0 or 1, the result also includes orders that have since been fully completed, cancelled or rejected. I need a query that excludes any order that has ever had a state other than 0 or 1.
I have this query which works but I am wondering if there is a better way to do it.
SELECT *
FROM myTable
WHERE OrderID NOT IN (select OrderId from myTable where state not in (0, 1))
Thank you!
If you just want orders, you can use aggregation:
select orderid
from mytable
group by orderid
having max(state) <= 1;
If you want the details of the rows, you can use join, in or exists along with this query.
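A minimal sketch of the join variant, in Python with an in-memory SQLite table loaded with the sample rows from the question. It assumes the condition `max(state) <= 1`, so that orders which only ever had state 0 are kept as well, and compares the result with the NOT IN query from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (orderid integer, state integer, timestamp integer)")
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    [
        (1, 0, 20210502151515), (1, 1, 20210502161616), (1, 2, 20210502171717),
        (2, 0, 20210502151617), (2, 1, 20210502161718), (2, 3, 20210502171819),
        (3, 0, 20210502121617), (3, 4, 20210502121718),
        (4, 0, 20210502131617),
        (5, 0, 20210502141718),
        (6, 0, 20210502151515), (6, 2, 20210502171717),
        (7, 0, 20210502151515), (7, 1, 20210502171717),
    ],
)

# Orders whose highest state is still 0 (open) or 1 (partially completed).
open_orders_sql = "select orderid from mytable group by orderid having max(state) <= 1"

# Join the aggregate back to the table to get the detail rows.
details_sql = f"""
select t.orderid, t.state, t.timestamp
from mytable t
join ({open_orders_sql}) o on o.orderid = t.orderid
order by t.orderid, t.timestamp"""

# The query from the question, for comparison.
not_in_sql = """
select orderid, state, timestamp
from mytable
where orderid not in (select orderid from mytable where state not in (0, 1))
order by orderid, timestamp"""

open_ids = sorted(r[0] for r in conn.execute(open_orders_sql))
details = conn.execute(details_sql).fetchall()
not_in_rows = conn.execute(not_in_sql).fetchall()
print(open_ids)
print(details)
```

On the sample data both forms return the rows of orders 4, 5 and 7.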
There is a better way, but not with sql. Maybe you want to create another table to store the current state of the order. It is much easier to get what you want.
With old-fashioned SQL you could easily solve this with a correlated sub-query:
Select * from Mytable a
Where a.Timestamp=(Select max(Timestamp) from Mytable b
Where a.OrderId=b.OrderID)
and state<2
This selects only the most recent record by order (max(Timestamp)) and further only keeps it if that most recent record is 0 or 1.
Might something like this work, or would it end up being too brutal as the recordset grows?
select Mytable.orderid, Mytable.State, Mytable.TimeStamp
from Mytable
inner join
(
select orderid, max(Timestamp) newesttimestamp
from Mytable
group by orderid
) newestorderdetails
on Mytable.orderid = newestorderdetails.orderid and Mytable.Timestamp = newestorderdetails.newesttimestamp
where Mytable.state IN (0, 1)
order by Mytable.orderid, Mytable.state

Perform calculations on two rows from UNION query

I have the following query
SELECT count
FROM (
SELECT count(id) as count FROM listens
WHERE user='None'
UNION
SELECT count(id) as count FROM listens
WHERE user!='None'
) as details
which returns
count
36793
112755
I would like to perform the division on the two values (e.g. 36793 / 112755) so that the output from my query is
count
0.3263092546
You don't need union at all! Here is a much simpler way of writing the query:
SELECT sum(user = 'None') / sum(user <> 'None')
FROM listens;
MySQL treats a boolean expression as a number in a numeric context, with 0 for false and 1 for true. The above counts the number of values that match the conditions.
If you want to be verbose or to be compatible with other dialects of SQL, you can do:
SELECT (sum(case when user = 'None' then 1 else 0 end) /
sum(case when user <> 'None' then 1 end)
) as ratio
FROM listens;
I don't see a particular advantage to the verbosity if you are using MySQL, but the logic is equivalent.
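One portability caveat is worth a small sketch. It assumes SQLite (via Python) purely as an example of another dialect, with invented rows in `listens`. There, dividing two integer sums performs integer division, so the ratio is forced to floating point by multiplying with 1.0; MySQL's `/` does not need this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table listens (id integer primary key, user text)")
# 3 anonymous listens and 12 listens by a named user (made-up data).
conn.executemany("insert into listens (user) values (?)",
                 [("None",)] * 3 + [("alice",)] * 12)

# Boolean form: SQLite, like MySQL, evaluates a comparison to 0 or 1.
bool_sql = "select 1.0 * sum(user = 'None') / sum(user <> 'None') from listens"

# Verbose CASE form that works in most dialects.
case_sql = """
select 1.0 * sum(case when user = 'None' then 1 else 0 end)
           / sum(case when user <> 'None' then 1 else 0 end) as ratio
from listens"""

# Without the 1.0 factor SQLite divides integer by integer.
int_sql = "select sum(user = 'None') / sum(user <> 'None') from listens"

ratio_bool = conn.execute(bool_sql).fetchone()[0]
ratio_case = conn.execute(case_sql).fetchone()[0]
ratio_int = conn.execute(int_sql).fetchone()[0]
print(ratio_bool, ratio_case, ratio_int)
```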

Mysql: Get records from last date

I want to get all records which are not "older" than 20 days. If there are no records within 20 days, I want all records from the most recent day. I'm doing this:
SELECT COUNT(DISTINCT t.id) FROM t
WHERE
(DATEDIFF(NOW(), t.created) <= 20
OR
(date(t.created) >= (SELECT max(date(created)) FROM t)));
This works so far, but it is awfully slow. created is a datetime, so it might be due to the conversion to a date... Any ideas how to speed this up?
SELECT COUNT(*) FROM (
SELECT * FROM t WHERE datediff(now(),created) between 0 and 20
UNION
SELECT * FROM (SELECT * FROM t WHERE created<now() ORDER BY created DESC LIMIT 1) last1
) last20d
I used the between clause just in case there might be dates in the future in the table. These will be excluded. Also you can simplify the select, if you just need the count() to
SELECT COUNT(*) FROM (
SELECT id FROM t WHERE datediff(now(),created) between 0 and 20
UNION
SELECT id FROM (SELECT id FROM t WHERE created<now() ORDER BY created DESC LIMIT 1) last1
) last20d
otherwise, in the first select version you can leave out the outer select if you want all the data of the chosen records. The UNION will make sure that duplicates will be excluded (in other cases I always use UNION ALL since it is faster).

Union All Query takes too long

This question has been asked multiple times, I am sure, but every case is different.
I have MySQL setup on a strong computer with 2GB RAM, it does not do too much so the computer is sufficient.
The following query has been built as a view :
create view view_orders as
select distinct
tbl_orders_order.order_date AS sort_col,
tbl_orders_order.order_id AS order_id,
_utf8'website' AS src,tbl_order_users.company AS company,
tbl_order_users.phone AS phone,
tbl_order_users.full_name AS full_name,
time_format(tbl_orders_order.order_date,_utf8'%H:%i') AS c_time,
date_format(tbl_orders_order.order_date,_utf8'%d/%m/%Y') AS c_date,
tbl_orders_order.comments AS comments,
tbl_orders_order.tmp_cname AS tmp_cname,
tbl_orders_order.tmp_pname AS tmp_pname,
count(tbl_order_docfiles.docfile_id) AS number_of_files,
(case tbl_orders_order.status when 1 then _utf8'completed' when 2 then _utf8'hc' when 0 then _utf8'not-completed' when 3 then _utf8'hc-canceled' end) AS status,
tbl_orders_order.employee_name AS employee_name,
tbl_orders_order.status_date AS status_date,
tbl_orders_order.cancel_reason AS cancel_reason
from
tbl_orders_order left join tbl_order_users on tbl_orders_order.user_id = tbl_order_users.user_id
left join
tbl_order_docfiles on tbl_order_docfiles.order_id = tbl_orders_order.order_id
group by
tbl_orders_order.order_id
union all
select distinct tbl_h.h_date AS sort_col,
(case tbl_h.sub_oid when 0 then tbl_h.order_number else concat(tbl_h.order_number,_utf8'-',tbl_h.sub_oid) end) AS order_id,
(case tbl_h.type when 1 then _utf8'פקס' when 2 then _utf8'email' end) AS src,_utf8'' AS company,
_utf8'' AS phone,_utf8'' AS full_name,time_format(tbl_h.h_date,_utf8'%H:%i') AS c_time,
date_format(tbl_h.h_date,_utf8'%d/%m/%Y') AS c_date,_utf8'' AS comments,tbl_h.client_name AS tmp_cname,
tbl_h.project_name AS tmp_pname,
tbl_h.quantity AS number_of_files,
_utf8'completed' AS status,
tbl_h.computer_name AS employee_name,
_utf8'' AS status_date,
_utf8'' AS cancel_reason
from tbl_h;
The query used UNION; then I read an article about UNION ALL, and it now uses that.
The query alone takes about 3 seconds to execute (UNION took 4.5-5.5 seconds).
Each part, run separately, finishes in seconds.
The application sorts and selects on this view, which makes the processing time even longer: about 6 seconds when the query is cached, about 12 seconds or more if the data has changed.
I see no other way to combine these two results, as both need to be displayed to the user sorted together, and I guess I am doing something wrong.
Of course, both tables use primary keys.
UPDATE!!!!
It didn't help, I got the utf8/case/date_format out of the union query, and removed distincts, now query takes 4 seconds (even longer).
query without case/date/utf8 (only union) was shortened to 2.3 seconds (0.3 seconds improvement).
create view view_orders as
select *,
(CASE src
WHEN 1 THEN
_utf8'fax'
WHEN 2 THEN
_utf8'mail'
WHEN 3 THEN
_utf8'website'
END) AS src,
time_format(order_date,'%H:%i') AS c_time,
date_format(order_date,'%d/%m/%Y') AS c_date,
(CASE status
WHEN 1 THEN
_utf8'completed'
WHEN 2 THEN
_utf8'hc handling'
WHEN 0 THEN
_utf8'not completed'
WHEN 3 THEN
_utf8'canceled'
END) AS status
FROM
(
select
o.order_date AS sort_col,
o.order_id,
3 AS src,
u.company,
u.phone,
u.full_name,
o.order_date,
o.comments,
o.tmp_cname,
o.tmp_pname,
count(doc.docfile_id) AS number_of_files,
o.status,
o.employee_name,
o.status_date,
o.cancel_reason
from
tbl_orders_order o
LEFT JOIN
tbl_order_users u ON u.user_id = o.user_id
LEFT JOIN
tbl_order_docfiles doc ON doc.order_id = o.order_id
GROUP BY
o.order_id
union all
select
h.h_date AS sort_col,
(case h.sub_oid when 0 then h.order_number else concat(h.order_number,'-',h.sub_oid) end) AS order_id,
h.type as src,
'' AS company,
'' AS phone,
'' AS full_name,
h.h_date,
'' AS comments,
h.client_name AS tmp_cname,
h.project_name AS tmp_pname,
h.quantity AS number_of_files,
1 AS status,
h.computer_name AS employee_name,
'' AS status_date,
'' AS cancel_reason
from tbl_h h
)
Think about your use of the UNION and DISTINCT keywords. Can your query really result in duplicate rows? If yes, the optimal query for removing duplicates would probably be of this form:
SELECT ... -- No "DISTINCT" here
UNION
SELECT ... -- No "DISTINCT" here
There is probably no need for DISTINCT in the two subqueries. If duplicates are impossible anyway, try using this form instead. This will be the fastest execution of your query (without further optimising the subqueries):
SELECT ... -- No "DISTINCT" here
UNION ALL
SELECT ... -- No "DISTINCT" here
Rationale: Both UNION and DISTINCT apply a "UNIQUE SORT" operation on your intermediate result sets. Depending on how much data your subqueries return, this can be very expensive. That's one reason why omitting DISTINCT and replacing UNION by UNION ALL is much faster.
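The behavioural difference between the two forms can be seen in a minimal sketch in Python with SQLite. The two tables and their rows are hypothetical, and the sketch shows only which rows come back, not the cost of the duplicate-removal step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table web_orders (order_id text, client text);
create table fax_orders (order_id text, client text);
insert into web_orders values ('100', 'acme'), ('101', 'globex');
insert into fax_orders values ('101', 'globex'), ('102', 'initech');
""")

# UNION removes the row that occurs in both tables.
union_rows = conn.execute("""
select order_id, client from web_orders
union
select order_id, client from fax_orders
""").fetchall()

# UNION ALL simply concatenates the two result sets.
union_all_rows = conn.execute("""
select order_id, client from web_orders
union all
select order_id, client from fax_orders
""").fetchall()

print(len(union_rows), len(union_all_rows))
```

If the two sources can never produce the same row, as with the website orders and the `tbl_h` rows in the question, both forms return the same data and UNION ALL skips the extra work.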
UPDATE Another idea, if you do have to remove duplicates: Remove duplicates first in an inner query, and format dates and codes only afterwards in an outer query. That will accelerate the "UNIQUE SORT" operation because comparing 32/64-bit integers is less expensive than comparing varchars:
SELECT a, b, date_format(c), case d when 1 then 'completed' else '...' end
FROM (
SELECT a, b, c, d ... -- No date format here
UNION
SELECT a, b, c, d ... -- No date format here
) AS t -- MySQL requires an alias for a derived table
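A minimal sketch of this "deduplicate raw values first, format afterwards" shape, in Python with SQLite. The tables `raw_a` and `raw_b` and their rows are hypothetical, and SQLite's `strftime` stands in for MySQL's `date_format`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table raw_a (id integer, created text, status integer);
create table raw_b (id integer, created text, status integer);
insert into raw_a values (1, '2021-02-03 10:15:00', 1), (2, '2021-02-04 11:00:00', 0);
insert into raw_b values (2, '2021-02-04 11:00:00', 0), (3, '2021-02-05 09:30:00', 1);
""")

sql = """
select id,
       strftime('%d/%m/%Y', created) as c_date,  -- formatting happens once, outside
       case status when 1 then 'completed' else 'not-completed' end as status
from (
    select id, created, status from raw_a  -- raw values only in here
    union
    select id, created, status from raw_b
) as raw
order by id"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The duplicate row with id 2 is removed while it still consists of an integer, a raw date string and an integer, and only the three surviving rows are formatted.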
It may be related to the UNION triggering a character set conversion. For example cancel_reason in the one query is defined as utf8, but in the other it is not specified.
Check whether there is a very high CPU spike when you run this query; this would indicate conversion.
Personally I would have done a union of the raw data first, and then applied the case and conversion statements. But I am not sure that that would make a difference in the performance.
Can you try this one:
SELECT
o.order_date AS sort_col,
o.order_id AS order_id,
_utf8'website' AS src,
u.company AS company,
u.phone AS phone,
u.full_name AS full_name,
time_format(o.order_date,_utf8'%H:%i') AS c_time,
date_format(o.order_date,_utf8'%d/%m/%Y') AS c_date,
o.comments AS comments,
o.tmp_cname AS tmp_cname,
o.tmp_pname AS tmp_pname,
COALESCE(d.number_of_files, 0) AS number_of_files,
( CASE o.status WHEN 1 THEN _utf8'completed'
WHEN 2 THEN _utf8'hc'
WHEN 0 THEN _utf8'not-completed'
WHEN 3 THEN _utf8'hc-canceled'
END ) AS status,
o.employee_name AS employee_name,
o.status_date AS status_date,
o.cancel_reason AS cancel_reason
FROM
tbl_orders_order AS o
LEFT JOIN
tbl_order_users AS u
ON o.user_id = u.user_id
LEFT JOIN
( SELECT order_id
, COUNT(*) AS number_of_files
FROM tbl_order_docfiles
GROUP BY order_id
) AS d
ON d.order_id = o.order_id
UNION ALL
SELECT
tbl_h.h_date AS sort_col,
...
FROM tbl_h
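The main change in the first half of this query is that the file count is computed in its own derived table and joined in, instead of grouping the whole joined result by order_id. A minimal sketch of that pattern in Python with SQLite, using simplified hypothetical tables (`orders`, `users`, `docfiles`) and invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table orders (order_id integer primary key, user_id integer);
create table users (user_id integer primary key, full_name text);
create table docfiles (docfile_id integer primary key, order_id integer);
insert into orders values (1, 10), (2, 11), (3, null);
insert into users values (10, 'Ann'), (11, 'Bob');
insert into docfiles values (1, 1), (2, 1), (3, 2);
""")

sql = """
select o.order_id,
       u.full_name,
       coalesce(d.number_of_files, 0) as number_of_files
from orders o
left join users u on u.user_id = o.user_id
left join (
    -- one row per order that has files, so the join cannot multiply rows
    select order_id, count(*) as number_of_files
    from docfiles
    group by order_id
) d on d.order_id = o.order_id
order by o.order_id"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Order 3 has no user and no files; the LEFT JOINs keep it, and COALESCE turns the missing count into 0.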