Two rows of data combined into one - mysql

I am a beginner in SQL but have been given the opportunity to create my own database.
I have a table Invoice whose rows share the same reference number but have different status codes.
Table Invoice:
Reference Status Code Status Date
10198053 300 08/07/2013
10198053 500 08/09/2013
I would like the output to show:
Table Invoice:
Reference Status Code Status Date Status Date 2
10198053 300 08/07/2013 08/09/2013
Code:
select reference r, status Code s, status Date,
case
when r=r and s=s (???)
from Table Invoice

Assuming that you need to see a date for 300 and a date for 500, you could aggregate two columns based on each of those Status_Code:
SELECT
Reference,
MIN(CASE WHEN Status_Code = '300' THEN Status_Date ELSE NULL END) AS Status_Date_300,
MIN(CASE WHEN Status_Code = '500' THEN Status_Date ELSE NULL END) AS Status_Date_500
FROM Invoice
GROUP BY Reference;
As is indicated in the comments, this could change depending on the requirements of exactly what you are trying to find - however, this should get you started.
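To see the conditional aggregation above run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The in-memory table, the column types, and the reading of 08/07/2013 as month/day/year (stored as an ISO string) are assumptions made for the demo; only the two sample rows come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Invoice (Reference TEXT, Status_Code TEXT, Status_Date TEXT)"
)
# The two sample rows from the question, with dates stored as ISO strings.
conn.executemany(
    "INSERT INTO Invoice VALUES (?, ?, ?)",
    [("10198053", "300", "2013-08-07"), ("10198053", "500", "2013-08-09")],
)

# One output row per Reference; each CASE keeps the date for one status code
# and yields NULL otherwise, and MIN ignores the NULLs.
pivot_rows = conn.execute(
    """
    SELECT Reference,
           MIN(CASE WHEN Status_Code = '300' THEN Status_Date END) AS Status_Date_300,
           MIN(CASE WHEN Status_Code = '500' THEN Status_Date END) AS Status_Date_500
    FROM Invoice
    GROUP BY Reference
    """
).fetchall()
print(pivot_rows)
```

Each additional status code would need its own CASE column, so this pattern suits a small, known set of codes.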

EDITED
Try this:
SELECT Invoice.reference, MIN(Invoice.statusCode) AS statusCode, MIN(Invoice.statusDate) AS statusDate, MAX(Invoice.statusDate) AS statusDate2
FROM Invoice
GROUP BY Invoice.reference;
Group by your reference number to get one row per reference in the result. Then use the MIN and MAX functions to get the earliest and latest statusDate for each reference.
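A small runnable sketch of the MIN/MAX variant, again with Python's sqlite3 standing in for MySQL. The table definition and the ISO date strings are assumptions for the demo, not part of the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Invoice (reference TEXT, statusCode INTEGER, statusDate TEXT)"
)
# Sample rows from the question; dates stored as ISO strings so they sort correctly.
conn.executemany(
    "INSERT INTO Invoice VALUES (?, ?, ?)",
    [("10198053", 300, "2013-08-07"), ("10198053", 500, "2013-08-09")],
)

# One row per reference: lowest status code, earliest date, latest date.
minmax_rows = conn.execute(
    """
    SELECT reference,
           MIN(statusCode) AS statusCode,
           MIN(statusDate) AS statusDate,
           MAX(statusDate) AS statusDate2
    FROM Invoice
    GROUP BY reference
    """
).fetchall()
print(minmax_rows)
```

MIN and MAX only give chronological results when the dates are stored in a sortable form (a DATE column or ISO-formatted strings). With more than two rows per reference, only the first and last dates survive.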


How do I SELECT a MySQL Table value that has not been updated on a given date?

I have a MySQL database named mydb in which I store daily share prices for
423 companies in a table named data. Table data has the following columns:
`epic`, `date`, `open`, `high`, `low`, `close`, `volume`
with epic and date together forming the composite primary key.
I update the data table each day using a csv file which would normally have 423 rows
of data all having the same date. However, on some days prices may not be available
for all 423 companies and data for a particular epic and date pair will
not be updated. In order to determine the missing pair I have resorted
to comparing a full list of epics against the incomplete list of epics using
two simple SELECT queries with different dates and then using a file comparator, thus
revealing the missing epic(s). This is not a very satisfactory solution and so far
I have not been able to construct a query that would identify any epics that
have not been updated for any particular day.
SELECT `epic`, `date` FROM `data`
WHERE `date` IN ('2019-05-07', '2019-05-08')
ORDER BY `epic`, `date`;
Produces pairs of values:
`epic` `date`
"3IN" "2019-05-07"
"3IN" "2019-05-08"
"888" "2019-05-07"
"888" "2019-05-08"
"AA." "2019-05-07"
"AAL" "2019-05-07"
"AAL" "2019-05-08"
In this case AA. has not been updated on 2019-05-08. The problem with this output is that it is not easy to spot an epic that lacks its pair.
Any help with this problem would be greatly appreciated.
You could COUNT the rows per epic, with a GROUP BY epic restricted to the two dates, and then select from that result the epics whose UpdateCount is less than 2. Forgive me if the syntax on the column names is not correct; I work in SQL Server, but the logic of the query should still work for you.
SELECT x.epic
FROM
(
SELECT COUNT(*) AS UpdateCount, epic
FROM data
WHERE date IN ('2019-05-07', '2019-05-08')
GROUP BY epic
) AS x
WHERE x.UpdateCount < 2
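The counting approach can be tried on the sample pairs from the question. This is a sketch using Python's sqlite3 as a stand-in for MySQL; the table definition is an assumption based on the column names given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (epic TEXT, date TEXT, PRIMARY KEY (epic, date))")
# The (epic, date) pairs listed in the question; AA. has no row for 2019-05-08.
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [
        ("3IN", "2019-05-07"), ("3IN", "2019-05-08"),
        ("888", "2019-05-07"), ("888", "2019-05-08"),
        ("AA.", "2019-05-07"),
        ("AAL", "2019-05-07"), ("AAL", "2019-05-08"),
    ],
)

# Count rows per epic over the two dates and keep the epics with fewer than two.
missing = conn.execute(
    """
    SELECT x.epic
    FROM (
        SELECT COUNT(*) AS UpdateCount, epic
        FROM data
        WHERE date IN ('2019-05-07', '2019-05-08')
        GROUP BY epic
    ) AS x
    WHERE x.UpdateCount < 2
    """
).fetchall()
print(missing)
```

Note that this flags an epic missing on either of the two dates, so it does not by itself say which date is the missing one.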
Assuming you only want to check the last date uploaded, the following will return every item not updated on 2019-05-08:
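SELECT last_updated.epic, last_updated.`date`
FROM (
SELECT `epic`, MAX(`date`) AS `date` FROM `data`
GROUP BY `epic`
) AS last_updated
-- backticks, not single quotes: 'date' and 'epic' would be string literals
WHERE last_updated.`date` <> '2019-05-08'
ORDER BY last_updated.`epic`
;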
Or, for any upload date, the following compares against the entire table, so you don't rely on '2019-05-07' having a row for every epic. That is, any epic that has ever been in the table will be listed if it was not updated on the given date:
SELECT d.epic, max(d.date)
FROM data as d
WHERE d.epic NOT IN (
SELECT d2.epic
FROM data as d2
WHERE d2.date = '2019-05-08'
)
GROUP BY d.epic
ORDER BY d.epic
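The NOT IN query can be wrapped in a small helper that takes the date to check as a parameter. This is a sketch with Python's sqlite3 standing in for MySQL; the helper name epics_not_updated_on and the table definition are hypothetical, and the rows are the sample pairs from the question.

```python
import sqlite3


def epics_not_updated_on(conn, day):
    """Return (epic, last_date) for every known epic with no row on `day`."""
    return conn.execute(
        """
        SELECT d.epic, MAX(d.date)
        FROM data AS d
        WHERE d.epic NOT IN (
            SELECT d2.epic FROM data AS d2 WHERE d2.date = ?
        )
        GROUP BY d.epic
        ORDER BY d.epic
        """,
        (day,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (epic TEXT, date TEXT, PRIMARY KEY (epic, date))")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [
        ("3IN", "2019-05-07"), ("3IN", "2019-05-08"),
        ("888", "2019-05-07"), ("888", "2019-05-08"),
        ("AA.", "2019-05-07"),
        ("AAL", "2019-05-07"), ("AAL", "2019-05-08"),
    ],
)

stale = epics_not_updated_on(conn, "2019-05-08")
print(stale)
```

Because the outer query scans the whole table, the second column also tells you when each missing epic was last seen.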

SQL Pivot Table with Subtraction

I am using SQL and I have three columns of user_id, user_action, and timestamp that apply timestamps to five different types of user actions numbered 1 through 5.
I have several thousand user_ids and actions over a period of several years.
First, I need to create a pivot table that has the timestamps from
only two of the user_actions, grouped by user_id, and then create a
brand new column that subtracts the time difference - call this
column time_difference.
The code that was provided by Caius Jard in the comments works for this part.
Now, I need to add another column for the week number. TIMESTAMP is in DATETIME2 format, so I need to incorporate
DATEPART(week, timestamp) AS week into this code and use it to create a two-week moving average based on the week number and time_difference.
This isn't a complete answer to the entire question. Here's a stub for step 1:
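SELECT
user_id,
-- pick the timestamp (not the action number) for each action of interest
MAX(CASE WHEN user_action = 2 THEN timestamp END) AS action2,
MAX(CASE WHEN user_action = 5 THEN timestamp END) AS action5,
DATEDIFF(
day, -- SQL Server form: DATEDIFF(datepart, start, end); MySQL takes only the two dates
MAX(CASE WHEN user_action = 2 THEN timestamp END),
MAX(CASE WHEN user_action = 5 THEN timestamp END)
) AS days_between
FROM t
WHERE user_action IN (2, 5)
GROUP BY user_id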
Treat the difference column as provisional: I wasn't entirely sure what data type timestamp is and whether direct date math is possible, or whether it is a string that needs converting. If you can add this detail to your question (or leave a comment), it will help.
I wasn't easily able to make sense of your step 2; please enhance your question by providing expected results.
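For illustration, step 1 can be sketched against an in-memory SQLite table from Python. SQLite has no DATEDIFF, so the difference is computed with julianday instead; that substitution, the table name t, and all of the sample rows are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (user_id INTEGER, user_action INTEGER, timestamp TEXT)")
# Hypothetical sample data: two users, a few actions each.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 2, "2020-01-01 00:00:00"),
        (1, 3, "2020-01-02 00:00:00"),  # ignored: not action 2 or 5
        (1, 5, "2020-01-04 00:00:00"),
        (2, 2, "2020-02-01 00:00:00"),
        (2, 5, "2020-02-02 12:00:00"),
    ],
)

# Pivot the two actions into columns, then subtract their Julian day numbers
# to get the time difference in (fractional) days.
pivot = conn.execute(
    """
    SELECT user_id,
           MAX(CASE WHEN user_action = 2 THEN timestamp END) AS action2,
           MAX(CASE WHEN user_action = 5 THEN timestamp END) AS action5,
           ROUND(
               julianday(MAX(CASE WHEN user_action = 5 THEN timestamp END))
             - julianday(MAX(CASE WHEN user_action = 2 THEN timestamp END)),
               2
           ) AS time_difference
    FROM t
    WHERE user_action IN (2, 5)
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()
for row in pivot:
    print(row)
```

If a user performs the same action several times, MAX keeps only the latest timestamp; MIN would keep the earliest instead.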

Get count and sum for different purchase types on same table/column

I have a "transaction" table that has the following columns
ID TIMESTAMP USER ID DESCRIPTION AMOUNT REF_ID TYPE
The description column contains the payment platform used for example "STRIPE-ch_1745". We currently have 4 platforms all described in the reference as in the example above. What I want is to get the payment platform, the total amount processed by the platform and the count of transactions. Like this
Platform Amount Count
Stripe 100,000 78
iOS 78,000 50
My current code only gives me these values for one platform, and I've been unable to structure it to give the desired result. I assumed I needed nested SELECT statements, so I wrote the code in that manner:
SELECT txn_count, sum
FROM
(SELECT count(*) AS txn_count, sum(`transaction`.`amount`) AS `sum`
FROM `transaction`
WHERE (`transaction`.`type` = 'credit'
AND (`transaction`.`description` like 'stripe%')
AND str_to_date(concat(date_format(`transaction`.`timestamp`, '%Y-%m'), '-01'), '%Y-%m-%d') = str_to_date(concat(date_format(now(), '%Y-%m'), '-01'), '%Y-%m-%d'))) t1
What this gives me right now is
Txn Count Sum
311 501,000
Would appreciate some help on how to get the expected table
Try this (edited to remove the reference part, assuming the platform is always separated from the reference by '-'):
SELECT
LEFT(t.description,LOCATE('-',t.description) - 1) as 'Platform',
SUM(t.amount) as 'Amount',
COUNT(*) as 'Count'
FROM transaction t
GROUP BY Platform
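The grouping idea can be tried out with Python's sqlite3 as a stand-in for MySQL. SQLite lacks LEFT and LOCATE, so this sketch uses substr and instr for the same prefix extraction; the UPPER normalisation, the credit filter carried over from the question, and all sample rows and reference ids are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "transaction" is a keyword in SQLite, so the table name is double-quoted.
conn.execute('CREATE TABLE "transaction" (description TEXT, amount INTEGER, type TEXT)')
# Hypothetical rows; the text before the first '-' is the platform.
conn.executemany(
    'INSERT INTO "transaction" VALUES (?, ?, ?)',
    [
        ("STRIPE-ch_1745", 100, "credit"),
        ("stripe-ch_1746", 50, "credit"),
        ("IOS-tx_1", 30, "credit"),
        ("STRIPE-ch_1747", 999, "debit"),  # filtered out below
    ],
)

# Group by the prefix before the first '-', upper-cased so that
# 'stripe' and 'STRIPE' land in the same group.
per_platform = conn.execute(
    """
    SELECT UPPER(substr(description, 1, instr(description, '-') - 1)) AS platform,
           SUM(amount) AS total_amount,
           COUNT(*)    AS txn_count
    FROM "transaction"
    WHERE type = 'credit'
    GROUP BY platform
    ORDER BY platform
    """
).fetchall()
print(per_platform)
```

Descriptions without a '-' would all fall into a single empty-string group, so it may be worth filtering or handling them separately.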

Search record daywise

I am stuck on a query. I have a search box which provides a start datetime and an end datetime, e.g. 2015-09-18 13:00:00 to 2015-09-21 17:00:00.
I tried the query
select l.date, l.center_name, l.center_office, l.offline, l.online, l.category,
(select service_name from services_master where service_id = w.services) as servicename,
(select vendor_name from vendor_master where vendor_id = w.vendor) as vendorname
from lease_reports l,wan_entries w
where l.center_id = '7' and
w.id = l.center_id and
l.date between $startdatetime and $endatetime
I am not getting the result I want: the result includes every time between these two datetimes. I want day-wise rows restricted to the exact time interval supplied by the user.
The exact syntax will depend on what database you are using (see stackoverflow.com/questions/1658340/sql-query-to-group-by-day),
but the logic/pseudocode you are looking for is as follows.
...
AND date(l.date) BETWEEN date($startdatetime) AND date($endatetime)
AND time(l.date) BETWEEN time($startdatetime) AND time($endatetime)
GROUP BY date(l.date)
However, it might be clearer to the end user if you queried for date and time separately.
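As a concrete illustration of splitting the filter into a date part and a time part, here is a sketch using Python's sqlite3, whose date() and time() functions play the role of the database-specific functions mentioned above. The single-column lease_reports table and its rows are simplified, hypothetical stand-ins for the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lease_reports (date TEXT)")
# Hypothetical readings around the example search window.
conn.executemany(
    "INSERT INTO lease_reports VALUES (?)",
    [
        ("2015-09-18 12:00:00",),  # right day, before 13:00
        ("2015-09-18 14:00:00",),  # kept
        ("2015-09-19 09:00:00",),  # right day, before 13:00
        ("2015-09-19 16:30:00",),  # kept
        ("2015-09-21 15:00:00",),  # kept
        ("2015-09-21 18:00:00",),  # right day, after 17:00
        ("2015-09-22 14:00:00",),  # outside the date range
    ],
)

params = {"start": "2015-09-18 13:00:00", "end": "2015-09-21 17:00:00"}

# Filter on the calendar date and on the time of day separately,
# then report one row per day.
per_day = conn.execute(
    """
    SELECT date(l.date) AS day, COUNT(*) AS readings
    FROM lease_reports AS l
    WHERE date(l.date) BETWEEN date(:start) AND date(:end)
      AND time(l.date) BETWEEN time(:start) AND time(:end)
    GROUP BY date(l.date)
    ORDER BY day
    """,
    params,
).fetchall()
print(per_day)
```

Binding the user's input as parameters, as done here, also avoids interpolating $startdatetime and $endatetime directly into the SQL string.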

MySQL Query to perform calculation and display data based on 2 different date criteria

Good morning,
I am trying to combine two queries into one so that the result array can be populated into a single table. Data is pulled from a single table, and math calculations must take place for one of the columns. Here is what I have currently:
SELECT
laboratory,
SUM(total_produced_week) AS total_produced_sum,
SUM(total_produced_over14) AS total_over14_sum,
100*(SUM(total_produced_over14)/sum(total_produced_week)) as divided_sum,
max(case when metrics_date =maxdate then total_backlog else null end) as total_backlog,
max(case when metrics_date =maxdate then days_workable else null end) as days_workable,
max(case when metrics_date =maxdate then workable_backlog else null end) as workable_backlog,
max(case when metrics_date =maxdate then deferred_over_30_days else null end) as deferred_over_30_days
FROM
test,
(
select max(metrics_date) as maxdate
from metrics
) as x
WHERE
YEAR(metrics_date) = YEAR(CURDATE())
AND MONTH(metrics_date) = MONTH(CURDATE())
GROUP BY
laboratory
ORDER BY 1 ASC
Here's the breakdown:
For each laboratory site, I need:
1) Perform a MONTH TO DATE (current month only) sum, division and multiply by 100 for each site to obtain percentage.
2) Display other columns (total_backlog, days_workable, workable_backlog, deferred_over_30_days) for the most recent update date (metrics_date) only.
The above query performs #1 just fine - I get a total_produced_sum, total_over14_sum and divided_sum column with correct math.
The other columns mentioned in #2, however, return NULL. Data is available in the table for the most recently updated date, so the columns should be reporting that data. It seems like I have a problem with the CASE, but I'm not very familiar with the function so it could be incorrect.
I am running MySQL 5.0.45
Thanks in advance for any suggestions!
Chris
P.S. Here are the two original queries that work correctly. These need to be combined so that the full resultset can be output to a table, organized by laboratory.
Query 1:
SELECT SUM(total_produced_week) AS total_produced_sum,
SUM(total_produced_over14) AS total_over14_sum
FROM test
WHERE laboratory = 'Site1'
AND YEAR(metrics_date) = YEAR(CURDATE()) AND MONTH(metrics_date) = MONTH(CURDATE())
Query 2:
SELECT laboratory, total_backlog, days_workable, workable_backlog, deferred_over_30_days,
items_over_10_days, open_ncs, total_produced_week, total_produced_over14
FROM metrics
WHERE metrics_date = (select MAX(metrics_date) FROM metrics)
ORDER BY laboratory ASC
Operator Error.
I created a copy of the original table (named "metrics") to a table named "test". I then modified the metrics_date in the new "test" table to include data from January 2011 (for the month-to-date). While the first part of the query that performs the math was using the "test" table (and working properly), the second half that pulls the most-recently-updated data was using the original "metrics" table, which did not have any rows with a metrics_date this month.
When I changed the query to use "test" for both parts of the query, everything works as expected. And now I feel really dumb.
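The effect described above can be reproduced in miniature. This sketch uses Python's sqlite3 as a stand-in for MySQL, with strftime and a fixed "today" replacing YEAR/MONTH/CURDATE so the result is repeatable; the helper name month_to_date, the reduced column set, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
schema = (
    "(laboratory TEXT, metrics_date TEXT, total_produced_week INTEGER, "
    "total_produced_over14 INTEGER, total_backlog INTEGER)"
)
conn.execute("CREATE TABLE metrics " + schema)  # original table, no rows this month
conn.execute("CREATE TABLE test " + schema)     # copy with January 2011 data
old_rows = [("Site1", "2010-12-27", 100, 50, 90), ("Site2", "2010-12-27", 80, 20, 75)]
new_rows = [
    ("Site1", "2011-01-03", 10, 2, 50), ("Site1", "2011-01-10", 30, 6, 40),
    ("Site2", "2011-01-03", 20, 5, 70), ("Site2", "2011-01-10", 20, 5, 65),
]
conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?, ?)", old_rows)
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?, ?)", old_rows + new_rows)


def month_to_date(conn, latest_from, today="2011-01-15"):
    """Month-to-date sums from `test`, plus the backlog on the latest date
    found in table `latest_from` (a trusted table name, not user input)."""
    return conn.execute(
        f"""
        SELECT laboratory,
               SUM(total_produced_week),
               SUM(total_produced_over14),
               100.0 * SUM(total_produced_over14) / SUM(total_produced_week),
               MAX(CASE WHEN metrics_date = x.maxdate THEN total_backlog END)
        FROM test, (SELECT MAX(metrics_date) AS maxdate FROM {latest_from}) AS x
        WHERE strftime('%Y-%m', metrics_date) = strftime('%Y-%m', :today)
        GROUP BY laboratory
        ORDER BY laboratory
        """,
        {"today": today},
    ).fetchall()


mixed = month_to_date(conn, "metrics")  # max date comes from the wrong table
fixed = month_to_date(conn, "test")     # both parts read the same table
print(mixed)
print(fixed)
```

With the mismatched table, the latest date never matches any row that survives the month filter, so every CASE yields NULL; the sums are unaffected, which is why only the second group of columns looked wrong.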
Thanks anyway, guys!