Conditional calculation when using group by - mysql

I have three tables: order, orderitem and shipment. I need the count of rows from each of them and the MRP of the items, grouped by order date.
Here is my query:
SELECT
DATE_FORMAT(`order`.orderdate, "%Y-%m-%d") AS i_orderdt,
COUNT(DISTINCT `order`.orderid) AS i_orderscount,
COUNT(DISTINCT orderitem.orderitemid) AS i_orderitemcount,
COUNT(DISTINCT shipment.shipmentid) AS i_shipmentcount,
ROUND(SUM(IFNULL(orderitem.unitprice,0) * orderitem.quantity),0) AS i_mrp
FROM `order`
LEFT JOIN shipment
ON shipment.orderid = `order`.orderid
JOIN orderitem
ON orderitem.orderid = `order`.orderid
WHERE `order`.orderdate >= "2014-01-01 00:00:00" AND `order`.orderdate <= "2014-03-31 23:59:59"
GROUP BY DATE_FORMAT(`order`.orderdate, "%Y-%m-%d")
But the MRP comes out wrong, because order to shipment is a one-to-many relationship, so every order item row is repeated once per shipment. How can I write the query so that the calculation happens only once for each distinct orderitemid?

One solution is to move the shipment count into a subquery:
SELECT
DATE_FORMAT(`order`.orderdate, "%Y-%m-%d") AS i_orderdt,
COUNT(DISTINCT `order`.orderid) AS i_orderscount,
COUNT(DISTINCT orderitem.orderitemid) AS i_orderitemcount,
COALESCE(
(
SELECT COUNT(DISTINCT shipmentid)
FROM shipment
WHERE shipment.orderid = `order`.orderid
), 0) AS i_shipmentcount,
ROUND(SUM(IFNULL(orderitem.unitprice,0) * orderitem.quantity),0) AS i_mrp
FROM `order`
JOIN orderitem ON orderitem.orderid = `order`.orderid
WHERE `order`.orderdate >= "2014-01-01 00:00:00"
AND `order`.orderdate <= "2014-03-31 23:59:59"
GROUP BY DATE_FORMAT(`order`.orderdate, "%Y-%m-%d");
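One caveat with the correlated subquery: it refers to order.orderid, which is not part of the GROUP BY, so on a server with ONLY_FULL_GROUP_BY enabled it may be rejected, and otherwise it counts shipments for just one order of each day. A possible alternative is to aggregate orderitem and shipment per order first and only then join. The sketch below is not the asker's schema: it uses Python's sqlite3 so it can be run anywhere, renames the table to orders to avoid the reserved word, uses SQLite's date() where MySQL would use DATE_FORMAT, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders    (orderid INTEGER PRIMARY KEY, orderdate TEXT);
CREATE TABLE orderitem (orderitemid INTEGER PRIMARY KEY, orderid INTEGER,
                        unitprice REAL, quantity INTEGER);
CREATE TABLE shipment  (shipmentid INTEGER PRIMARY KEY, orderid INTEGER);

-- Invented sample: one order with two items and two shipments.
INSERT INTO orders VALUES (1, '2014-01-05 10:00:00');
INSERT INTO orderitem VALUES (10, 1, 100.0, 1), (11, 1, 50.0, 2);
INSERT INTO shipment VALUES (100, 1), (101, 1);
""")

# Joining both child tables directly multiplies the rows (2 items x 2 shipments),
# so every item is summed once per shipment.
naive = conn.execute("""
SELECT date(o.orderdate),
       SUM(IFNULL(oi.unitprice, 0) * oi.quantity)
FROM orders o
LEFT JOIN shipment s ON s.orderid = o.orderid
JOIN orderitem oi ON oi.orderid = o.orderid
GROUP BY date(o.orderdate)
""").fetchall()

# Pre-aggregating each child table to one row per order avoids the fan-out.
fixed = conn.execute("""
SELECT date(o.orderdate)                AS i_orderdt,
       COUNT(*)                         AS i_orderscount,
       SUM(oi.item_count)               AS i_orderitemcount,
       SUM(IFNULL(s.shipment_count, 0)) AS i_shipmentcount,
       SUM(oi.mrp)                      AS i_mrp
FROM orders o
JOIN (SELECT orderid,
             COUNT(*) AS item_count,
             SUM(IFNULL(unitprice, 0) * quantity) AS mrp
      FROM orderitem
      GROUP BY orderid) oi ON oi.orderid = o.orderid
LEFT JOIN (SELECT orderid, COUNT(*) AS shipment_count
           FROM shipment
           GROUP BY orderid) s ON s.orderid = o.orderid
GROUP BY date(o.orderdate)
""").fetchall()

print(naive)  # the MRP is doubled here
print(fixed)  # one order, two items, two shipments, MRP counted once
```

Because each derived table returns at most one row per orderid, the outer join can no longer duplicate item rows, whatever the number of shipments.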

A SUM() function with arithmetic across 4 tables

This is sample data in the table pengiriman_supply.
And this is for data_barang.
This is for data_supplier and the table masuk.
With 3 tables the sum is not a problem, but I run into trouble when I use 4 tables and a subtraction of the form (sum(table1.a) - ifnull(table2.b, 0)). Here is the result with just the sum.
And this is the result with the subtraction.
The code is like this:
SELECT DISTINCT
row_number() over(
order by pengiriman_supply.po_nomor desc) as no,
pengiriman_supply.po_nomor as PO,
data_supplier.nama_supplier,
data_barang.nama_barang,
((sum( pengiriman_supply.jumlah ))- (sum( COALESCE ( masuk.terima, 0 )) over ( PARTITION BY masuk.refrence ))) as total
FROM
pengiriman_supply
LEFT JOIN masuk ON pengiriman_supply.po_nomor = masuk.refrence
INNER JOIN data_supplier ON data_supplier.id_supplier = pengiriman_supply.idsupplier
INNER JOIN data_barang ON data_barang.idbarang = pengiriman_supply.idbarang
WHERE
pengiriman_supply.tanggal between date_sub(curdate(), interval 60 day) and curdate()
GROUP BY
pengiriman_supply.po_nomor,masuk.po_nomor,data_supplier.nama_supplier
ORDER BY
GROUP_CONCAT(DISTINCT pengiriman_supply.po_nomor) DESC
This is the SQL statement I was able to come up with, but its GROUP BY is not just pengiriman_supply.po_nomor. Can I group by pengiriman_supply.po_nomor alone?
Can the number 31194 be put into one group?
It seems you need to put ifnull(masuk.terima,0) inside sum():
SELECT
pengiriman_supply.po_nomor AS po,
data_supplier.nama_supplier,
data_barang.nama_barang,
Sum((pengiriman_supply.jumlah)-ifnull(masuk.terima,0)) as total
FROM
pengiriman_supply
INNER JOIN data_barang ON pengiriman_supply.idbarang = data_barang.idbarang
INNER JOIN data_supplier ON pengiriman_supply.idsupplier = data_supplier.id_supplier
LEFT JOIN masuk ON masuk.refrence = pengiriman_supply.po_nomor
GROUP BY
pengiriman_supply.po_nomor
ORDER BY
po DESC
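To see why the IFNULL has to sit inside the SUM, here is a small sketch with Python's sqlite3. Only the table and column names come from the question; the rows are invented, with PO 31195 deliberately having no matching row in masuk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pengiriman_supply (po_nomor INTEGER, jumlah INTEGER);
CREATE TABLE masuk (refrence INTEGER, terima INTEGER);

-- Invented sample: PO 31194 was partly received, PO 31195 not at all.
INSERT INTO pengiriman_supply VALUES (31194, 10), (31195, 7);
INSERT INTO masuk VALUES (31194, 4);
""")

rows = conn.execute("""
SELECT p.po_nomor,
       SUM(p.jumlah - m.terima)            AS without_ifnull,
       SUM(p.jumlah - IFNULL(m.terima, 0)) AS with_ifnull
FROM pengiriman_supply p
LEFT JOIN masuk m ON m.refrence = p.po_nomor
GROUP BY p.po_nomor
ORDER BY p.po_nomor DESC
""").fetchall()

for po, without_ifnull, with_ifnull in rows:
    print(po, without_ifnull, with_ifnull)
```

For the PO without a masuk row, the LEFT JOIN yields NULL for terima, jumlah - NULL is NULL, and the unguarded total is lost; with IFNULL the full jumlah remains. Note that if masuk can hold several rows for one PO, jumlah is repeated for each of them, so summing masuk per refrence in a derived table first may be the safer variant.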

MySQL join on substring is slow

I have a query where I join on a substring, and it is really slow to complete. Is there a more efficient way to write this?
SELECT *, SUM(s.pris*s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk AS s
JOIN butikk_ordre AS bo ON ordreId=bo.ecs_ordre_id AND butikkNr=bo.site_id
JOIN ecs_supplier AS l ON SUBSTRING( s.artikkelId, 1,2 )=l.lev_id
WHERE s.salgsDato>='2016-6-01' AND s.salgsDato<='2016-09-30'
GROUP BY l.lev_id ORDER BY total DESC
First, I would check indexes. For this query:
SELECT *, SUM(s.pris*s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk s JOIN
butikk_ordre bo
ON s.ordreId = bo.ecs_ordre_id AND
s.butikkNr = bo.site_id JOIN
ecs_supplier l
ON SUBSTRING(s.artikkelId, 1, 2 ) = l.lev_id
WHERE s.salgsDato >= '2016-06-01' AND s.salgsDato <= '2016-09-30'
GROUP BY l.lev_id
ORDER BY total DESC ;
You want indexes on ecs_statistikk(salgsDato, ordreId, butikkNr, artikkelId), butikk_ordre(ecs_ordre_id, site_id), and ecs_supplier(lev_id).
Next, I would question whether you need the last JOIN at all. Does this do what you want?
SELECT LEFT(s.artikkelId, 2) as lev_id, s.*, bo.*,
SUM(s.pris*s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk s JOIN
butikk_ordre bo
ON s.ordreId = bo.ecs_ordre_id AND
s.butikkNr = bo.site_id
WHERE s.salgsDato >= '2016-06-01' AND s.salgsDato <= '2016-09-30'
GROUP BY LEFT(s.artikkelId, 2)
ORDER BY total DESC ;
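Whether the last JOIN can go depends on every two-character prefix having a row in ecs_supplier; if some prefixes have no supplier, the join also acts as a filter and the two queries differ. A quick way to check the equivalence on a small sample, sketched with Python's sqlite3 (substr stands in for MySQL's SUBSTRING/LEFT, the butikk_ordre join is left out for brevity, and the rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ecs_statistikk (artikkelId TEXT, pris REAL, antall INTEGER);
CREATE TABLE ecs_supplier   (lev_id TEXT PRIMARY KEY);

-- Invented sample: every article prefix has a matching supplier.
INSERT INTO ecs_supplier VALUES ('10'), ('20');
INSERT INTO ecs_statistikk VALUES
    ('100001', 5.0, 2), ('100002', 3.0, 1), ('200001', 4.0, 3);
""")

# Original shape: join to the supplier table on the computed prefix.
with_join = conn.execute("""
SELECT l.lev_id, SUM(s.pris * s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk s
JOIN ecs_supplier l ON substr(s.artikkelId, 1, 2) = l.lev_id
GROUP BY l.lev_id
ORDER BY total DESC
""").fetchall()

# Suggested shape: group by the prefix directly, no supplier join.
without_join = conn.execute("""
SELECT substr(s.artikkelId, 1, 2) AS lev_id,
       SUM(s.pris * s.antall) AS total, SUM(s.antall) AS antall
FROM ecs_statistikk s
GROUP BY substr(s.artikkelId, 1, 2)
ORDER BY total DESC
""").fetchall()

print(with_join == without_join)
```

If the supplier columns are still needed in the output, joining ecs_supplier to the already grouped result touches one row per prefix instead of one row per sale.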

MySQL: Grouped by hour, need to show all hours, null where no data

Here's the query:
SELECT h.idhour, h.`hour`, outnumber, count(*) as `count`, sum(talktime) as `duration`
FROM (
SELECT
`cdrs`.`dcustomer` AS `dcustomer`,
(CASE
WHEN (`cdrs`.`cnumber` like "02%") THEN '02'
WHEN (`cdrs`.`cnumber` like "05%") THEN '05'
END) AS `outnumber`,
FROM_UNIXTIME(`cdrs`.`start`) AS `start`,
(`cdrs`.`end` - `cdrs`.`start`) AS `duration`,
`cdrs`.`talktime` AS `talktime`
FROM `cdrs`
WHERE `cdrs`.`start` >= #_START and `cdrs`.`start` < #_END
AND `cdrs`.`dtype` = _LATIN1'external'
GROUP BY callid
) cdr
JOIN customers c ON c.id = cdr.dcustomer
LEFT JOIN hub.hours h ON HOUR(cdr.`start`) = h.idhour
WHERE (c.parent = _ID or cdr.dcustomer = _ID or c.parent IN
(SELECT id FROM customers WHERE parent = _ID))
GROUP BY h.idhour, cdr.outnumber
ORDER BY h.idhour;
The query above skips the hours where there is no data, but I need to show all hours (00:00 to 23:00), with NULL or 0 values where nothing was recorded. How can I do this?
SELECT h.idhour
, h.hour
,IFNULL(outnumber,'') AS outnumber
,IFNULL(cdr2.duration,0) AS duration
,IFNULL(output_count,0) AS output_count
FROM hub.hours h
LEFT JOIN (
SELECT HOUR(start) AS start,outnumber, SUM(talktime) as duration ,COUNT(1) AS output_count
FROM
(
SELECT cdrs.dcustomer AS dcustomer
, (CASE WHEN (cdrs.cnumber like "02%") THEN '02' WHEN (cdrs.cnumber like "05%") THEN '05' END) AS outnumber
, FROM_UNIXTIME(cdrs.start) AS start
, (cdrs.end - cdrs.start) AS duration
, cdrs.talktime AS talktime
FROM cdrs cdrs
INNER JOIN customers c ON c.id = cdrs.dcustomer
WHERE cdrs.start >= #_START and cdrs.start < #_END AND cdrs.dtype = _LATIN1'external'
AND
(c.parent = _ID or cdrs.dcustomer = _ID or c.parent IN (SELECT id FROM customers WHERE parent = _ID))
GROUP BY callid
) cdr
GROUP BY HOUR(start),outnumber
) cdr2
ON cdr2.start = h.idhour
ORDER BY h.idhour
You need a table with all the hours, nothing else.
Then use a LEFT JOIN with the hours table on the "left" and your current query on the "right":
SELECT h.hr, b.*
FROM hours h
LEFT JOIN ( ... ) b ON b.hr = h.hr
WHERE h.hr BETWEEN ... AND ...
ORDER BY h.hr;
Any missing hours will show NULLs in the b.* columns.
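The same pattern as a runnable sketch with Python's sqlite3. The hours table mirrors hub.hours from the question; the calls table, its columns and its rows are invented stand-ins for the aggregated cdrs subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Calendar table with one row per hour of the day (0..23).
conn.execute("CREATE TABLE hours (idhour INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO hours VALUES (?)", [(h,) for h in range(24)])

# Invented call data: only hours 9 and 14 have any calls.
conn.execute("CREATE TABLE calls (start_hour INTEGER, talktime INTEGER)")
conn.executemany("INSERT INTO calls VALUES (?, ?)",
                 [(9, 60), (9, 30), (14, 120)])

# The hours table drives the query; the aggregate is joined onto it,
# and IFNULL turns the missing hours into zeros.
rows = conn.execute("""
SELECT h.idhour,
       IFNULL(c.call_count, 0) AS call_count,
       IFNULL(c.duration, 0)   AS duration
FROM hours h
LEFT JOIN (SELECT start_hour,
                  COUNT(*)      AS call_count,
                  SUM(talktime) AS duration
           FROM calls
           GROUP BY start_hour) c ON c.start_hour = h.idhour
ORDER BY h.idhour
""").fetchall()

for row in rows:
    print(row)
```

The important detail is that all filtering of the call data happens inside the derived table; a condition on c in the outer WHERE would turn the LEFT JOIN back into an inner join and drop the empty hours again.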

query sql error on pentaho data integration (subquery)

This is my SQL query:
SELECT
p.Product_Name, d.year4
COUNT (fact_order.sk_product)
FROM
(SELECT * FROM fact_order limit 0,5000) fo , product p , dim_date d
WHERE fo.sk_product = p.sk_product and fo.sk_order_date = d.date_key and fo.sk_product = ${product_name}
GROUP BY fo.sk_product, d.year4
LIMIT 0,2000
I want to show the product counts per year.
Try:
SELECT p.Product_Name, d.year4, COUNT(fo.sk_product)
FROM (SELECT * FROM fact_order limit 0, 5000) fo, product p, dim_date d
WHERE fo.sk_product = p.sk_product
and fo.sk_order_date = d.date_key
and fo.sk_product = ${product_name}
GROUP BY fo.sk_product, d.year4,p.Product_Name LIMIT 0, 2000
Your query is missing a comma after d.year4, and the COUNT should use the alias fo, since fact_order is only visible inside the subquery:
SELECT
p.Product_Name, d.year4, COUNT(fo.sk_product)
FROM
(SELECT * FROM fact_order limit 0,5000) fo , product p , dim_date d
WHERE fo.sk_product = p.sk_product and fo.sk_order_date = d.date_key and fo.sk_product = ${product_name}
GROUP BY fo.sk_product, d.year4
LIMIT 0,2000
I think the error is in the WHERE clause, where you have fo.sk_product = ${product_name}.
If I'm not mistaken, you should compare sk_product (I guess it is an integer) with another
sk_product, not with a product_name (which is a string).
Since the fact table has sk_product, I would use fo.sk_product = p.sk_product.
Further, your SELECT has p.Product_Name, but it isn't in the GROUP BY clause. If you want the product names in the rows, replace fo.sk_product with p.product_name in the GROUP BY; if you want the surrogate keys instead of the product names, replace p.Product_Name with p.sk_product in the SELECT clause.
Keep in mind that to get the products by product_name, this column must be UNIQUE.
The query would be something like this (selecting sk_product, with sk_product as the parameter):
SELECT
p.sk_product, d.year4, COUNT(fo.sk_product)
FROM
(SELECT * FROM fact_order limit 0,5000) fo, product p , dim_date d
WHERE fo.sk_product = p.sk_product and fo.sk_order_date = d.date_key and fo.sk_product = ${sk_product_parameter}
GROUP BY p.sk_product, d.year4
LIMIT 0,2000
Or like this (getting by product_name if product_name is UNIQUE for each sk_product):
SELECT
p.Product_Name, d.year4,
COUNT(fo.sk_product)
FROM
(SELECT * FROM fact_order limit 0,5000) fo , product p , dim_date d
WHERE fo.sk_product = p.sk_product and fo.sk_order_date = d.date_key and p.product_name = '${product_name}'
GROUP BY p.Product_Name, d.year4
LIMIT 0,2000
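For reference, a self-contained sketch of the corrected query shape, run through Python's sqlite3 rather than Pentaho. The table and column names follow the question; the rows and the helper orders_per_year are invented, and a bound ? parameter stands in for the ${...} variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product    (sk_product INTEGER PRIMARY KEY, Product_Name TEXT UNIQUE);
CREATE TABLE dim_date   (date_key INTEGER PRIMARY KEY, year4 INTEGER);
CREATE TABLE fact_order (sk_product INTEGER, sk_order_date INTEGER);

-- Invented sample rows.
INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_date VALUES (20130101, 2013), (20140101, 2014);
INSERT INTO fact_order VALUES
    (1, 20130101), (1, 20140101), (1, 20140101), (2, 20140101);
""")

def orders_per_year(sk_product):
    """Hypothetical helper: order count per year for one product key."""
    return conn.execute("""
        SELECT p.Product_Name, d.year4, COUNT(fo.sk_product) AS order_count
        FROM fact_order fo
        JOIN product  p ON p.sk_product = fo.sk_product
        JOIN dim_date d ON d.date_key   = fo.sk_order_date
        WHERE fo.sk_product = ?
        GROUP BY p.Product_Name, d.year4
        ORDER BY d.year4
    """, (sk_product,)).fetchall()

print(orders_per_year(1))
print(orders_per_year(2))
```

Every non-aggregated column in the SELECT also appears in the GROUP BY, and the key is compared with a key rather than with a name.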

Help calculating average per day

The daily_average column is always returning zero. The default timestamp values are for the past week. Any thoughts on what I'm doing wrong here in getting the average order value per day?
SELECT
SUM(price+shipping_price) AS total_sales,
COUNT(id) AS total_orders,
AVG(price+shipping_price) AS order_total_average,
(SELECT
SUM(quantity)
FROM `order_product`
INNER JOIN `order` ON (
`order`.id = order_product.order_id AND
`order`.created >= '.$startTimestamp.' AND
`order`.created <= '.$endTimestamp.' AND
`order`.type_id = '.$type->getId().' AND
`order`.fraud = 0
)
) as total_units,
SUM(price+shipping_price)/DATEDIFF('.$endTimestamp.', '.$startTimestamp.') as daily_average
FROM `order`
WHERE created >= '.$startTimestamp.' AND
created <= '.$endTimestamp.' AND
fraud = 0 AND
type_id = '.$type->getId().'
You're using aggregate functions (SUM, COUNT, AVG) without a GROUP BY clause. I think your SQL is more complicated than it needs to be (there is no need for the inner select).
Here's a SQL command that should work (hard to test without test data ;)):
SELECT
COUNT(id) total_orders,
SUM(finalprice) total_sales,
AVG(finalprice) order_average,
SUM(units) total_units,
SUM(finalprice)/DATEDIFF('.$endTimestamp.', '.$startTimestamp.') daily_average
FROM (
SELECT
o.id id,
o.price+o.shipping_price finalprice,
SUM(p.quantity) units
FROM `order` o INNER JOIN order_product p ON p.order_id=o.id
WHERE o.created>='.$startTimestamp.'
AND o.created<='.$endTimestamp.'
AND o.fraud=0
AND o.type_id='.$type->getId().'
GROUP BY p.order_id
) t;
Does casting one of the elements in the division work for you?
SELECT
SUM(price+shipping_price) AS total_sales,
COUNT(id) AS total_orders,
AVG(price+shipping_price) AS order_total_average,
(SELECT
SUM(quantity)
FROM `order_product`
INNER JOIN `order` ON (
`order`.id = order_product.order_id AND
`order`.created >= '.$startTimestamp.' AND
`order`.created <= '.$endTimestamp.' AND
`order`.type_id = '.$type->getId().' AND
`order`.fraud = 0
)
) as total_units,
CAST(SUM(price+shipping_price) AS DECIMAL(12,2))/DATEDIFF('.$endTimestamp.', '.$startTimestamp.') as daily_average
FROM `order`
WHERE created >= '.$startTimestamp.' AND
created <= '.$endTimestamp.' AND
fraud = 0 AND
type_id = '.$type->getId().'
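It may also be worth checking what the divisor evaluates to on its own. Below is a small sketch of the total-divided-by-days calculation, using Python's sqlite3 with invented rows, a table renamed to orders, and the assumption that the range bounds are datetime strings. SQLite's julianday() difference stands in for MySQL's DATEDIFF; if your bounds are Unix epoch seconds instead, the day count would be (end - start) / 86400.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, price REAL, shipping_price REAL,
                     created TEXT, fraud INTEGER);

-- Invented sample: three valid orders and one flagged as fraud.
INSERT INTO orders VALUES
    (1,  90.0, 10.0, '2024-01-01 09:00:00', 0),
    (2,  45.0,  5.0, '2024-01-03 12:00:00', 0),
    (3, 500.0,  0.0, '2024-01-04 08:00:00', 1),
    (4, 190.0, 10.0, '2024-01-07 18:00:00', 0);
""")

# Assumed one-week reporting window, passed as bound parameters.
params = {"start": "2024-01-01 00:00:00", "end": "2024-01-08 00:00:00"}

row = conn.execute("""
SELECT SUM(price + shipping_price) AS total_sales,
       COUNT(id)                   AS total_orders,
       julianday(:end) - julianday(:start) AS days,
       SUM(price + shipping_price)
           / (julianday(:end) - julianday(:start)) AS daily_average
FROM orders
WHERE created >= :start AND created < :end AND fraud = 0
""", params).fetchone()

total_sales, total_orders, days, daily_average = row
print(total_sales, total_orders, days, daily_average)
```

Selecting the day count as its own column makes it obvious when the divisor is NULL, zero, or far larger than expected, which are the usual reasons for a daily average that looks like zero.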