Issue, Receipt and Balance from MySql table - mysql

I have two tables, issue and receipt where I am issuing and receiving quantities :
IssueTable:
Order  Type  Qty
OD12   A     48
OD19   A     33
OD12   B     14
ReceiptTable:
Order  Type  Qty
OD12   A     20
OD19   A     15
OD12   B     11
The desired result that I want:
Balance:
Order  Type  Qty
OD12   A     28
OD19   A     18
OD12   B     03
IssueTable contains details of Orders which have been issued; a single order can have multiple "Type" of products. Similarly, ReceiptTable contains details of Orders which have been completed and received. I want a Balance table which subtracts the receipt qty from the issue qty for each Order and Type.

SELECT `Order`,
`Type`,
COALESCE(IssueTable.Qty, 0) - COALESCE(ReceiptTable.Qty, 0) Qty
FROM ( SELECT `Order`, `Type` FROM IssueTable
UNION
SELECT `Order`, `Type` FROM ReceiptTable ) TotalTable
LEFT JOIN IssueTable USING (`Order`, `Type`)
LEFT JOIN ReceiptTable USING (`Order`, `Type`);
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=cafd416abcbf7ab31f54bf6efbd6566f
The query assumes that (Order, Type) is unique within each table. If it is not, use aggregating subqueries instead of the tables themselves.
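When (Order, Type) can repeat inside a table, the aggregating-subquery variant mentioned above could look roughly like the sketch below. It is only an illustration: it runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL, joins with explicit ON conditions rather than USING, and the function name balance_by_order_type is made up for this example.

```python
import sqlite3

def balance_by_order_type(issues, receipts):
    """Return {(order, type): issued - received}, summing duplicate rows first.

    issues / receipts are lists of (order, type, qty) tuples.
    SQLite is used here only as a runnable stand-in for MySQL.
    """
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE IssueTable ("Order" TEXT, "Type" TEXT, Qty INTEGER)')
    con.execute('CREATE TABLE ReceiptTable ("Order" TEXT, "Type" TEXT, Qty INTEGER)')
    con.executemany("INSERT INTO IssueTable VALUES (?, ?, ?)", issues)
    con.executemany("INSERT INTO ReceiptTable VALUES (?, ?, ?)", receipts)
    sql = '''
        SELECT k."Order", k."Type",
               COALESCE(i.Qty, 0) - COALESCE(r.Qty, 0) AS Qty
        FROM (SELECT "Order", "Type" FROM IssueTable
              UNION
              SELECT "Order", "Type" FROM ReceiptTable) AS k
        -- each side is collapsed to one row per (Order, Type) before joining
        LEFT JOIN (SELECT "Order", "Type", SUM(Qty) AS Qty
                   FROM IssueTable GROUP BY "Order", "Type") AS i
               ON i."Order" = k."Order" AND i."Type" = k."Type"
        LEFT JOIN (SELECT "Order", "Type", SUM(Qty) AS Qty
                   FROM ReceiptTable GROUP BY "Order", "Type") AS r
               ON r."Order" = k."Order" AND r."Type" = k."Type"
    '''
    rows = con.execute(sql).fetchall()
    con.close()
    return {(order, type_): qty for order, type_, qty in rows}
```

Summing before the join matters: joining the raw tables when both sides have duplicates would multiply rows and inflate the totals.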

You may try using a join approach:
SELECT it.`Order`, it.Type, it.Qty - rt.Qty AS Qty
FROM IssueTable it
INNER JOIN ReceiptTable rt
ON rt.`Order` = it.`Order` AND rt.Type = it.Type;
This answer assumes that every order would have a matching receipt. If not, the approach might have to change slightly based on your expectations. As a side note, ORDER is a reserved keyword in MySQL, and you should avoid naming your columns and tables using it.
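If some issued orders have no receipt yet, one possible adjustment is a LEFT JOIN with COALESCE, so unreceived orders keep their full issued quantity. The sketch below compares both joins using Python's sqlite3 module as a stand-in for MySQL; the function name and the keep_unreceived flag are hypothetical.

```python
import sqlite3

def issue_balances(issues, receipts, keep_unreceived=True):
    """Return [(order, type, balance)] sorted by order and type.

    keep_unreceived=True uses LEFT JOIN, so issues without a receipt survive;
    keep_unreceived=False uses INNER JOIN, as in the answer above.
    """
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE IssueTable ("Order" TEXT, "Type" TEXT, Qty INTEGER)')
    con.execute('CREATE TABLE ReceiptTable ("Order" TEXT, "Type" TEXT, Qty INTEGER)')
    con.executemany("INSERT INTO IssueTable VALUES (?, ?, ?)", issues)
    con.executemany("INSERT INTO ReceiptTable VALUES (?, ?, ?)", receipts)
    join = "LEFT JOIN" if keep_unreceived else "INNER JOIN"
    sql = f'''
        SELECT it."Order", it."Type", it.Qty - COALESCE(rt.Qty, 0) AS Qty
        FROM IssueTable AS it
        {join} ReceiptTable AS rt
          ON rt."Order" = it."Order" AND rt."Type" = it."Type"
        ORDER BY it."Order", it."Type"
    '''
    rows = con.execute(sql).fetchall()
    con.close()
    return rows
```

With one issue that has no receipt, the INNER JOIN version silently drops that order, while the LEFT JOIN version reports it with the full issued quantity as its balance.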

Related

how to count number of lines with jointure in Talend on Oracle

I have 3 tables:
supplier(id_supp, name, adress, ...)
Customer(id_cust, name, adress, ...)
Order(id_order, ref_cust, ref_supp, date_order...)
I want to make a job that counts the number of orders by Supplier, for last_week, last_two_weeks with Talend
select
supp.name,
(
select
count(*)
from
order
where
date_order between sysdate-7 and sysdate
and ref_supp=id_supp
) as week_1,
(
select
count(*)
from
order
where
date_order between sysdate-14 and sysdate-7
and ref_supp=id_supp
) as week_2
from supplier supp
The reason I'm doing this is that my query took too much time.
You need a join between supplier and order to get supplier names. I show an inner join, but if you need ALL suppliers (even those with no orders in the order table) you may change it to a left outer join.
Other than that, you should only have to read the order table once and get all the info you need. Your query does more than one pass (read EXPLAIN PLAN for your query), which may be why it is taking too long.
NOTE: sysdate has a time-of-day component (and perhaps the date_order value does too); the way you wrote the query may or may not do exactly what you want it to do. You may have to surround sysdate by trunc().
select s.name,
count(case when o.date_order between sysdate - 7 and sysdate then 1 end)
as week_1,
count(case when o.date_order between sysdate - 14 and sysdate - 7 then 1 end)
as week_2
from supplier s inner join order o
on s.id_supp = o.ref_supp
group by s.name
;
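The single-pass idea can be tried out with the sketch below, which uses Python's sqlite3 module as a stand-in for Oracle. Several things here are assumptions made for the illustration: the table is named orders to avoid the reserved word, dates are ISO strings, the reference date is passed in instead of sysdate, a LEFT JOIN keeps suppliers with no orders, and the windows are half-open so an order exactly seven days old is counted in only one week (BETWEEN includes both endpoints and would count it twice).

```python
import sqlite3

def weekly_order_counts(suppliers, orders, today):
    """Count orders per supplier for the last week and the week before.

    suppliers: [(id_supp, name)], orders: [(ref_supp, 'YYYY-MM-DD')],
    today: 'YYYY-MM-DD' (stands in for trunc(sysdate)).
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE supplier (id_supp INTEGER, name TEXT)")
    con.execute("CREATE TABLE orders (ref_supp INTEGER, date_order TEXT)")
    con.executemany("INSERT INTO supplier VALUES (?, ?)", suppliers)
    con.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    sql = """
        SELECT s.name,
               -- COUNT ignores the NULL produced when the CASE does not match
               COUNT(CASE WHEN o.date_order >  date(:today, '-7 days')
                           AND o.date_order <= :today THEN 1 END) AS week_1,
               COUNT(CASE WHEN o.date_order >  date(:today, '-14 days')
                           AND o.date_order <= date(:today, '-7 days') THEN 1 END) AS week_2
        FROM supplier AS s
        LEFT JOIN orders AS o ON o.ref_supp = s.id_supp
        GROUP BY s.name
        ORDER BY s.name
    """
    rows = con.execute(sql, {"today": today}).fetchall()
    con.close()
    return rows
```

The orders table is read once and both counters are filled in the same pass, which is the point of the conditional aggregation.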

MySQL right outer join query

I have a query regarding a query in MySQL.
I have 2 tables one containing SalesRep details like name, email, etc. I have another table with the sales data which has reportDate, customers served and link to the salesrep via a foreign key. One thing to note is that the reportDate is always a friday.
So the requirement is this: I need to find sales data for a 13 week period for a given list of sales reps - with 0 as customers served if on a particular friday there is no data. The query result is consumed by a Java application which relies on the 13 rows of data per sales rep.
I have created a table with all the Friday dates populated and wrote an outer join like below:
select * from (
select name, customersServed, reportDate
from Sales_Data salesData
join `SALES_REPRESENTATIVE` salesRep on salesRep.`employeeId` = salesData.`employeeId`
where employeeId = 1
) as result
right outer join fridays on fridays.datefield = reportDate
where fridays.datefield between '2014-10-01' and '2014-12-31'
order by datefield
Now my doubts:
Is there any way I can get the name populated for all 13 rows in the above query?
If there are 2 sales reps, I'd like to use a IN clause and expect 26 rows in total - 13 rows per sales person (even if there is no record for that person, I'd still like to see 13 rows of nulls), and 39 for 3 sales reps
Can these be done in MySql and if so, can anyone point me in the right direction?
You must first select one row per sales rep and Friday (without customersServed), and then outer join the sales data to pick up customersServed.
Something like this:
select records.name, records.datefield, IFNULL(salesData.customersServed,0)
from (
select employeeId, name, datefield
from `SALES_REPRESENTATIVE`, fridays
where fridays.datefield between '2014-10-01' and '2014-12-31'
and employeeId in (...)
) as records
left outer join `Sales_Data` salesData on (salesData.employeeId = records.employeeId and salesData.reportDate = records.datefield)
order by records.name, records.datefield
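The same shape (every rep paired with every Friday, then sales data attached by outer join) can be checked on a small data set. This is a sketch using Python's sqlite3 module as a stand-in for MySQL, with three Fridays instead of thirteen; the function name rep_friday_grid and the lowercase table names are assumptions for the example.

```python
import sqlite3

def rep_friday_grid(reps, fridays, sales, employee_ids):
    """Return one (name, friday, customersServed) row per rep and Friday.

    reps: [(employeeId, name)], fridays: ['YYYY-MM-DD'],
    sales: [(employeeId, reportDate, customersServed)].
    Missing sales rows show up as 0 instead of disappearing.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales_representative (employeeId INTEGER, name TEXT)")
    con.execute("CREATE TABLE fridays (datefield TEXT)")
    con.execute("CREATE TABLE sales_data "
                "(employeeId INTEGER, reportDate TEXT, customersServed INTEGER)")
    con.executemany("INSERT INTO sales_representative VALUES (?, ?)", reps)
    con.executemany("INSERT INTO fridays VALUES (?)", [(d,) for d in fridays])
    con.executemany("INSERT INTO sales_data VALUES (?, ?, ?)", sales)
    placeholders = ", ".join("?" for _ in employee_ids)
    sql = f"""
        SELECT r.name, f.datefield, COALESCE(d.customersServed, 0)
        FROM sales_representative AS r
        CROSS JOIN fridays AS f          -- one row per rep per Friday
        LEFT JOIN sales_data AS d
               ON d.employeeId = r.employeeId
              AND d.reportDate = f.datefield
        WHERE r.employeeId IN ({placeholders})
        ORDER BY r.name, f.datefield
    """
    rows = con.execute(sql, list(employee_ids)).fetchall()
    con.close()
    return rows
```

With two reps and three Fridays the result always has six rows, whether or not any sales were recorded, which is the fixed row count the Java consumer relies on.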
You'll have to do two levels of nesting. In your nested query, change to an outer join for salesrep so you have at least 1 record for each rep; then join with fridays without any condition so you have at least 13 records for each rep; then do a final right outer join with the condition (fridays.datefield = innerfriday.datefield and (reportDate is null or reportDate=innerfriday.datefield))
This is very inefficient; try to do it in code unless the data is very small.

How do I get the sum of a column across multiple keys?

I have data that looks like this:
id int (11) primary key auto_increment
key int (2)
type int (2)
data int (4)
timestamp datetime
There are 5 different keys - 1,2,3,4,5 and three types - 1,2,3
Data is put in continuously against a key and of a particular type.
What I need to extract is a sum of the data for a particular type (say, type 1) across all 5 keys (1,2,3,4,5), so it is a sum of exactly 5 records. I only want to sum the latest (max(timestamp)) values of data for each key (there are 5 of them), but they may all have different timestamps.
Something like this....
SELECT sum(data) FROM table WHERE type='1' AND timestamp=(SELECT max(timestamp FROM table WHERE type='1' GROUP BY key)
Or something like that. That isn't even close of course. I am completely lost on this one. it feels like I need to group by key but the syntax eludes me. Any suggestions are appreciated.
EDIT: additional info:
if: 'data' is temperature. 'key' is day of the week. 'type' is morning, noon or night
So the data might look like
morning mon 70 (timestamp)
noon tue 78 (timestamp)
morning wed 72 (timestamp)
night tue 74 (timestamp)
morning thu 76 (timestamp)
noon wed 77 (timestamp)
night fri 78 (timestamp)
noon tue 79 (timestamp)
If these are in timestamp order (desc) and I want the sum of most recent noon temps for all five days, the result would be: 155 in this case since the last noon was also tuesday and it was earlier and thus, not included. Make sense? I want sum of 'data' for any key, specific type, latest timestamp only. In this example, I would be summing at most 7 pieces of data.
If the timestamp column is guaranteed to be unique for each (key,type) (that is, there's a UNIQUE constraint on (key,type,timestamp)), then this query will return the specified resultset. (This isn't the only approach, but it is a familiar pattern.)
SELECT SUM(t.data) AS latest_total
FROM mytable t
JOIN ( SELECT h.type
, h.key
, MAX(h.timestamp) AS max_ts
FROM mytable h
WHERE h.type='1'
GROUP
BY h.type
, h.key
) m
ON m.type = t.type
AND m.key = t.key
AND m.max_ts = t.timestamp
The inline view assigned an alias of m returns the "latest" timestamp for type=1 for all 5 key values (if at least one row exists)
That is joined to the original table, to retrieve the row that has that "latest" timestamp.
A suitable index with leading columns of type,key,timestamp will likely improve performance.
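The pattern can be checked against the temperature example from the question. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the column names day_key, reading_type and ts are renamed stand-ins for key, type and timestamp, and the integer timestamps are invented so that a larger value means a later reading.

```python
import sqlite3

def sum_latest(readings, reading_type):
    """Sum the most recent `data` value per key for one type.

    readings: [(day_key, reading_type, data, ts)]; returns None if no rows match.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE readings "
                "(day_key TEXT, reading_type TEXT, data INTEGER, ts INTEGER)")
    con.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", readings)
    sql = """
        SELECT SUM(t.data)
        FROM readings AS t
        JOIN (SELECT day_key, reading_type, MAX(ts) AS max_ts
              FROM readings
              WHERE reading_type = ?
              GROUP BY day_key, reading_type) AS m   -- latest ts per key
          ON m.day_key = t.day_key
         AND m.reading_type = t.reading_type
         AND m.max_ts = t.ts
    """
    (total,) = con.execute(sql, (reading_type,)).fetchone()
    con.close()
    return total
```

For the sample rows in the question, the noon total is 78 + 77, because the older Tuesday noon reading of 79 loses to the newer one.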
That's based on my understanding of the specification; I may not be totally clear on it. What this query does is get the latest timestamp for the type=1 rows. If there happen to be two (or more) rows with the same latest timestamp value for a given key and type, this query will retrieve both (or all) of those rows, and include them in the sum.
We could add a GROUP BY t.type on that query, and that wouldn't change the result, since we are guaranteed that the t.type will be equal to the constant 1 (specified in the predicate in the WHERE clause of the inline view query.)
But we would need to add the GROUP BY if we wanted to get totals for all three type in the same query:
SELECT t.type
     , SUM(t.data) AS latest_total
  FROM mytable t
  JOIN ( SELECT h.type
              , h.key
              , MAX(h.timestamp) AS max_ts
           FROM mytable h
          WHERE h.type IN ('1','2','3')
          GROUP
             BY h.type
              , h.key
       ) m
    ON m.type = t.type
   AND m.key = t.key
   AND m.max_ts = t.timestamp
 GROUP
    BY t.type
NOTE:
Using reserved words as identifiers (e.g. TIMESTAMP and KEY) isn't illegal, but those identifiers (usually) need to be enclosed in backticks. Changing the names of these columns so that they aren't reserved words is best practice.
SELECT SUM(data)
FROM ( SELECT CONCAT(MAX(timestamp), '_', type) AS customId
FROM table
WHERE type = '1'
GROUP BY key ) a
JOIN table b ON a.customId = CONCAT(b.timestamp, '_', type)
GROUP BY type;
This would probably do the trick...
SQL-Fiddle
I would for simplicity and maintainability use a temp-table and fill it with several statements. The solution with "union-subselect" looks a bit long for me.
So
drop temporary table if exists tmp_data;
create temporary table tmp_data (type int, value int);
insert into tmp_data select 1, value from data_table where type=1 order by timestamp desc limit 5;
insert into tmp_data select 2, value from data_table where type=2 order by timestamp desc limit 5;
insert into tmp_data select 3, value from data_table where type=3 order by timestamp desc limit 5;
select type, sum(value) as total from tmp_data group by type;
EDIT:
The subselect-solution would be similar, and since there are only 3 types not too bad
select type, sum(value) as total from
((select 1 as type, value from data_table where type=1 order by timestamp desc limit 5)
union all
(select 2 as type, value from data_table where type=2 order by timestamp desc limit 5)
union all
(select 3 as type, value from data_table where type=3 order by timestamp desc limit 5)) as subtab group by type;
Hope that helps.

mysql moving average of N rows

I have a simple MySQL table like below, used to compute MPG for a car.
+-------------+-------+---------+
| DATE | MILES | GALLONS |
+-------------+-------+---------+
| JAN 25 1993 | 20.0 | 3.00 |
| FEB 07 1993 | 55.2 | 7.22 |
| MAR 11 1993 | 44.1 | 6.28 |
+-------------+-------+---------+
I can easily compute the Miles Per Gallon (MPG) for the car using a select statement, but because the MPG varies widely from fillup to fillup (i.e. you don't fill the exact same amount of gas each time), I would like to compute a 'MOVING AVERAGE' as well. So for any row the MPG is MILES/GALLONS for that row, and the MOVINGMPG is the SUM(MILES)/SUM(GALLONS) for the last N rows. If fewer than N rows exist by that point, just SUM(MILES)/SUM(GALLONS) up to that point.
Is there a single SELECT statement that will fetch the rows with MPG and MOVINGMPG by substituting N into the select statement?
Yes, it's possible to return the specified resultset with a single SQL statement.
Unfortunately, MySQL versions before 8.0 do not support analytic (window) functions, which would make for a fairly simple statement. Even without that syntax, it is possible to emulate some analytic functions using MySQL user variables.
One of the ways to achieve the specified result set (with a single SQL statement) is to use a JOIN operation, using a unique ascending integer value (rownum, derived by and assigned within the query) to each row.
For example:
SELECT q.rownum AS rownum
     , q.date AS latest_date
     , q.miles/q.gallons AS latest_mpg
     , COUNT(1) AS cnt_rows
     , MIN(r.date) AS earliest_date
     , SUM(r.miles) AS rtot_miles
     , SUM(r.gallons) AS rtot_gallons
     , SUM(r.miles)/SUM(r.gallons) AS rtot_mpg
  FROM ( SELECT @s_rownum := @s_rownum + 1 AS rownum
              , s.date
              , s.miles
              , s.gallons
           FROM mytable s
           JOIN (SELECT @s_rownum := 0) c
          ORDER BY s.date
       ) q
  JOIN ( SELECT @t_rownum := @t_rownum + 1 AS rownum
              , t.date
              , t.miles
              , t.gallons
           FROM mytable t
           JOIN (SELECT @t_rownum := 0) d
          ORDER BY t.date
       ) r
    ON r.rownum <= q.rownum
   AND r.rownum > q.rownum - 2
 GROUP BY q.rownum
Your desired value of "n" to specify how many rows to include in each rollup row is specified in the predicate just before the GROUP BY clause. In this example, up to "2" rows in each running total row.
If you specify a value of 1, you will get (basically) the original table returned.
To eliminate any "incomplete" running total rows (consisting of fewer than "n" rows), that value of "n" would need to be specified again, adding:
HAVING COUNT(1) >= 2
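As a cross-check of what the query is expected to return, the moving ratio defined in the question (SUM(MILES)/SUM(GALLONS) over the last n rows, or over all rows so far when fewer exist) can be computed directly in a few lines of Python. This is a sketch only, not part of the SQL solution; the function name moving_mpg is made up.

```python
from collections import deque

def moving_mpg(fillups, n):
    """Return [(date, mpg, moving_mpg)] for fill-ups given in date order.

    fillups: iterable of (date, miles, gallons).
    moving_mpg uses the current row and up to n-1 preceding rows.
    """
    window = deque(maxlen=n)  # automatically discards rows older than n
    result = []
    for date, miles, gallons in fillups:
        window.append((miles, gallons))
        total_miles = sum(m for m, _ in window)
        total_gallons = sum(g for _, g in window)
        result.append((date, miles / gallons, total_miles / total_gallons))
    return result
```

With n = 1 the moving value equals the per-row MPG, matching the remark above that a value of 1 basically returns the original table.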
sqlfiddle demo: http://sqlfiddle.com/#!2/52420/2
Followup:
Q: I'm trying to understand your SQL statement. Does your solution do a select of twenty rows for each row in the db? In other words, if I have 1000 rows will your statement perform 20000 selects? (I'm worried about performance)...
A: You are right to be concerned with performance.
To answer your question, no, this does not perform 20,000 selects for 1,000 rows.
The performance hit comes from the two (essentially identical) inline views (aliased as q and r). What MySQL does with these (basically) is create temporary MyISAM tables (MySQL calls them "derived tables"), which are basically copies of mytable, with an extra column, each row assigned a unique integer value from 1 to the number of rows.
Once the two "derived" tables are created and populated, MySQL runs the outer query, using those two "derived" tables as a row source. Each row from q, is matched with up to n rows from r, to calculate the "running total" miles and gallons.
For better performance, you could use a column already in the table, rather than having the query assign unique integer values. For example, if the date column is unique, then you could calculate "running total" over a certain period of days.
SELECT q.date AS latest_date
, SUM(q.miles)/SUM(q.gallons) AS latest_mpg
, COUNT(1) AS cnt_rows
, MIN(r.date) AS earliest_date
, SUM(r.miles) AS rtot_miles
, SUM(r.gallons) AS rtot_gallons
, SUM(r.miles)/SUM(r.gallons) AS rtot_mpg
FROM mytable q
JOIN mytable r
ON r.date <= q.date
AND r.date > q.date + INTERVAL -30 DAY
GROUP BY q.date
(For performance, you would want an appropriate index defined with date as a leading column in the index.)
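The date-window variant can be tried on the three sample rows with the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL, so the interval arithmetic is written with SQLite's date() modifier instead of INTERVAL, and dates are stored as ISO strings; both are assumptions of the example, as is the function name.

```python
import sqlite3

def mpg_over_days(fillups, days):
    """Return [(date, cnt_rows, rtot_mpg)] where each row aggregates the
    fill-ups from the preceding `days` days up to and including that date.

    fillups: [('YYYY-MM-DD', miles, gallons)] with unique dates.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (date TEXT, miles REAL, gallons REAL)")
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", fillups)
    sql = """
        SELECT q.date,
               COUNT(*) AS cnt_rows,
               SUM(r.miles) / SUM(r.gallons) AS rtot_mpg
        FROM mytable AS q
        JOIN mytable AS r
          ON r.date <= q.date
         AND r.date >  date(q.date, ?)   -- e.g. '-30 days'
        GROUP BY q.date
        ORDER BY q.date
    """
    rows = con.execute(sql, (f"-{int(days)} days",)).fetchall()
    con.close()
    return rows
```

Note that this window is defined by elapsed days, not by a number of rows, so the count of fill-ups per window varies with how often the car was refuelled.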
For the first query, any predicates included (in the inline view definition queries) to reduce the number of rows returned (for example, return only date values in the past year) would reduce the number of rows to be processed, and would also likely improve performance.
Again, to your question about running 20,000 selects for 1,000 rows... a nested loops operation is another way to get the same result set. For a large number of rows, this can exhibit slower performance. (On the other hand, this approach can be fairly efficient when only a few rows are being returned.) For example:
SELECT q.date AS latest_date
, q.miles/q.gallons AS latest_mpg
, ( SELECT SUM(r.miles)/SUM(r.gallons)
FROM mytable r
WHERE r.date <= q.date
AND r.date >= q.date + INTERVAL -90 DAY
) AS rtot_mpg
FROM mytable q
ORDER BY q.date
Something like this should work:
SELECT Date, Miles, Gallons, Miles/Gallons as MilesPerGallon,
@Miles:=@Miles+Miles overallMiles,
@Gallons:=@Gallons+Gallons overallGallons,
@RunningTotal:=@Miles/@Gallons runningTotal
FROM YourTable
JOIN (SELECT @Miles:= 0) t
JOIN (SELECT @Gallons:= 0) s
SQL Fiddle Demo
Which produces the following:
DATE MILES GALLONS MILESPERGALLON RUNNINGTOTAL
January, 25 1993 20 3 6.666667 6.666666666667
February, 07 1993 55.2 7.22 7.645429 7.358121330724
March, 11 1993 44.1 6.28 7.022293 7.230303030303
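The RUNNINGTOTAL column above is a cumulative ratio over all rows so far, not a window of N rows. A short Python sketch, assuming the rows are processed in date order (the query itself has no ORDER BY to guarantee that), reproduces the figures; the function name running_mpg is made up.

```python
from itertools import accumulate

def running_mpg(fillups):
    """Return cumulative miles / cumulative gallons after each fill-up.

    fillups: list of (date, miles, gallons) in date order.
    """
    cumulative_miles = accumulate(miles for _, miles, _ in fillups)
    cumulative_gallons = accumulate(gallons for _, _, gallons in fillups)
    return [m / g for m, g in zip(cumulative_miles, cumulative_gallons)]
```

For the three sample rows this gives 20/3, 75.2/10.22 and 119.3/16.5, in line with the RUNNINGTOTAL values shown.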
--EDIT--
In response to the comment, you can add another Row Number to limit your results to the last N rows:
SELECT *
FROM (
SELECT Date, Miles, Gallons, Miles/Gallons as MilesPerGallon,
@Miles:=@Miles+Miles overallmiles,
@Gallons:=@Gallons+Gallons overallGallons,
@RunningTotal:=@Miles/@Gallons runningTotal,
@RowNumber:=@RowNumber+1 rowNumber
FROM (SELECT * FROM YourTable ORDER BY Date DESC) u
JOIN (SELECT @Miles:= 0) t
JOIN (SELECT @Gallons:= 0) s
JOIN (SELECT @RowNumber:= 0) r
) t
WHERE rowNumber <= 3
Just change your ORDER BY clause accordingly. And here is the updated fiddle.

Access select latest entry of unique identifier

I have an Access table with multiple date entries for each unique identifier
Year ID TotalSpent
2003-2004 001 1000
2002-2003 001 900
2001-2002 001 100
2009-2010 002 8000
2008-2009 002 4000
2000-2001 003 100
1999-2000 003 0
I want to keep the latest (top) entry for each unique ID to produce
Year ID TotalSpent
2003-2004 001 1000
2009-2010 002 8000
2000-2001 003 100
I have looked at the top() function but cannot get it to produce more than 1 result (as opposed to 1 result for each unique ID). Any help would be appreciated.
Remou makes a valid point that a unique ID would be beneficial as it would allow to refer to the top row in the future but this could be a constraint outside of your control.
The data source is a bit awkward, with the hyphenated years preventing a simple grouping query. The second issue is that you cannot simply group by the max of the TotalSpent field, as it may not belong to the latest entry (a large refund, for instance, may affect a year's total).
My solution involves finding the latest Year for each ID (Query A) and then reforms the year-tag to join onto table B. I didn't want to perform a join on a calculated field so I have wrapped it in another subquery (Query B). This is then joined onto the original table/query to extract the key rows and values.
SELECT YourTable.[YourYearField],
YourTable.ID,
YourTable.TotalSpent
FROM (SELECT A.ID,
[StartYear] & "-" & [EndYear] AS Grouping
FROM (SELECT YourTable.ID,
Max(Val(Right$([YourYearField], 4))) AS EndYear,
Max(Val(Right$([YourYearField], 4)) - 1) AS StartYear
FROM YourTable
GROUP BY YourTable.ID) AS A
GROUP BY A.ID,
[StartYear] & "-" & [EndYear]) AS B
INNER JOIN YourTable
ON ( B.Grouping = YourTable.[YourYearField] )
AND ( B.ID = YourTable.ID )
GROUP BY YourTable.[YourYearField],
YourTable.ID,
YourTable.TotalSpent;
You can get the Year and ID values you want with this query:
SELECT ID, Max([Year]) AS MaxOfYear
FROM YourTable
GROUP BY ID;
Then to get the corresponding TotalSpent values, use that SQL for a subquery which you join to YourTable.
SELECT y.Year, y.ID, y.TotalSpent
FROM
YourTable AS y
INNER JOIN
(
SELECT ID, Max([Year]) AS MaxOfYear
FROM YourTable
GROUP BY ID
) AS sub
ON
(y.Year = sub.MaxOfYear)
AND (y.ID = sub.ID);
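The Max([Year]) approach compares the year labels as text, which works here because every label has the same fixed-width "YYYY-YYYY" form. The sketch below checks it against the sample data using Python's sqlite3 module as a stand-in for Access; the function name latest_per_id is made up, and IDs are stored as text to keep the leading zeros.

```python
import sqlite3

def latest_per_id(rows):
    """Return the row with the greatest Year label for each ID.

    rows: [(year_label, id, total_spent)], year_label like '2003-2004'.
    """
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE YourTable ("Year" TEXT, ID TEXT, TotalSpent INTEGER)')
    con.executemany("INSERT INTO YourTable VALUES (?, ?, ?)", rows)
    sql = '''
        SELECT y."Year", y.ID, y.TotalSpent
        FROM YourTable AS y
        INNER JOIN (SELECT ID, MAX("Year") AS MaxOfYear   -- text comparison
                    FROM YourTable
                    GROUP BY ID) AS sub
                ON y."Year" = sub.MaxOfYear
               AND y.ID = sub.ID
        ORDER BY y.ID
    '''
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

If the labels ever stop being fixed-width, the text comparison would no longer order them by time, and parsing out the end year as in the first answer would be the safer route.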