MySQL Procedure to transform key/value pair to columns - mysql

We are trying to build a de-normalised version of our database for use in Tableau. One of the challenges we have is dealing with our key/value pairs in one of our tables. See the following simplified version of some data.
Asset Table
Asset ID | Site ID | Make   | Model
1        | 1       | Toyota | Corolla
2        | 2       | Honda  | Civic
3        | 2       | Suzuki | Swift
Asset Property Types Table
Site ID | Asset Property Name | Asset Property Type
1       | Odometer            | Numeric
1       | Registration        | Text
1       | Expiry Date         | Date
2       | Odometer            | Numeric
2       | Registration        | Text
2       | Expiry Date         | Date
2       | Colour              | Text
Asset Properties Table
Asset ID | Key          | Text Value | Numeric Value | Date Value
1        | Odometer     |            | 1234          |
1        | Registration | ABC123     |               |
1        | Expiry Date  |            |               | 2018-02-08
2        | Odometer     |            | 1255          |
2        | Registration | ABC124     |               |
2        | Colour       | Red        |               |
2        | Expiry Date  |            |               | 2018-01-08
3        | Registration | ABC125     |               |
3        | Odometer     |            | 1266          |
3        | Colour       | Blue       |               |
3        | Expiry Date  |            |               | 2018-03-25
Some points to note on this. This is a simplified version of the data. Not all Assets will have the same key/value pairs. Sites will have individual data types that they want to store which may be different to other sites.
Ultimately here is what we want to try and achieve:
Site 1 Asset Table
Asset ID | Make   | Model   | Odometer | Registration | Expiry Date
1        | Toyota | Corolla | 1234     | ABC123       | 2018-02-08
Site 2 Asset Table
Asset ID | Make   | Model | Odometer | Registration | Expiry Date | Colour
2        | Honda  | Civic | 1255     | ABC124       | 2018-01-08  | Red
3        | Suzuki | Swift | 1266     | ABC125       | 2018-03-25  | Blue
My general approach to this was the following:
1. Create the site-specific asset tables with just the Make & Model and populate them.
2. Use a loop to ALTER the table, adding a column for each of the asset properties that exist.
3. Populate the site-specific asset table with the asset property data in the appropriate columns.
I was hoping to do all of this with a MySQL Procedure so that I can schedule it to run automatically every hour and replace the various Site level tables. A CASE statement will not work as this needs to be dynamic.
I really appreciate any advice/help on how to achieve this. Whilst I'm okay with SQL, procedures are way out of my depth.

I guess the difficult bit is building dynamic case statements. So given your data (with some changes to remove white spaces and possible key word clashes)
drop table if exists asset;
create table asset(Asset_ID int, Site_ID int, Make varchar(20), Model varchar(20));
insert into asset values
(1 , 1 , 'Toyota' , 'Corolla'),
(2 , 2 , 'Honda' , 'Civic'),
(3 , 2 , 'Suzuki' , 'Swift');
drop table if exists asset_property_types;
create table asset_property_types (Site_ID int, Asset_Property_Name varchar(100), Asset_Property_Type varchar(100));
insert into asset_property_types values
(1 , 'Odometer' , 'Numeric_value'),
(1 , 'Registration' , 'Text_value'),
(1 , 'Expiry_Date' , 'Date_value'),
(2 , 'Odometer' , 'Numeric_value'),
(2 , 'Registration' , 'Text_value'),
(2 , 'Expiry_Date' , 'Date_value'),
(2 , 'Colour' , 'Text_value');
drop table if exists asset_properties;
create table asset_properties(Asset_ID int , asset_property_name varchar(30),Text_Value varchar(100), Numeric_Value int , Date_Value date);
insert into asset_properties values
(1 , 'Odometer' , null , 1234 ,null),
(1 , 'Registration' , 'ABC123' ,null ,null),
(1 , 'Expiry_Date' , null , null, '2018-02-08'),
(2 , 'Odometer' , null , 1255 ,null),
(2 , 'Registration' , 'ABC124' ,null ,null),
(2 , 'Colour' , 'Red' ,null ,null),
(2 , 'Expiry_Date' , null , null ,'2018-01-08'),
(3 , 'Registration' , 'ABC125' ,null ,null),
(3 , 'Odometer' , null , 1266,null),
(3 , 'Colour' , 'Blue' ,null ,null),
(3 , 'Expiry_Date' , null , null ,'2018-03-25');
This code
set @sql = (select concat('select a.asset_id ,a.site_id,a.make,a.model,',
group_concat(
concat('max(case when asset_property_name = ' ,char(39),asset_property_name,char(39), ' then ' ,asset_property_type ,' else null end) as ', asset_property_name)
)
,' from asset a join asset_properties ap on ap.asset_id = a.asset_id group by a.site_id,a.asset_id,a.make,a.model;'
)
from
(select distinct asset_property_name,asset_property_type from asset_property_types) a
)
;
builds this SQL statement
select a.asset_id ,a.site_id,a.make,a.model,
max(case when asset_property_name = 'Odometer' then Numeric_value else null end) as Odometer,
max(case when asset_property_name = 'Registration' then Text_value else null end) as Registration,
max(case when asset_property_name = 'Expiry_Date' then Date_value else null end) as Expiry_Date,
max(case when asset_property_name = 'Colour' then Text_value else null end) as Colour
from asset a join asset_properties ap on ap.asset_id = a.asset_id
group by a.site_id,a.asset_id,a.make,a.model;
Make and model are included in the group by so that the statement also runs with ONLY_FULL_GROUP_BY enabled. If you have many properties, raise group_concat_max_len first, since GROUP_CONCAT truncates its result at 1024 bytes by default.
You can then submit the statement to MySQL like so
prepare sqlstmt from @sql;
execute sqlstmt;
deallocate prepare sqlstmt;
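The statement above pivots every site at once. For the per-site tables asked about in the question, the same string-building idea can be restricted to one site's property types. What follows is only a minimal sketch: it uses Python's standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the helper name build_site_pivot_sql is hypothetical. It assumes the property names come from your own trusted metadata table, because they are spliced directly into the SQL text.

```python
import sqlite3

def build_site_pivot_sql(conn, site_id):
    """Build a pivot query with one MAX(CASE ...) column per property of a site."""
    props = conn.execute(
        "SELECT asset_property_name, asset_property_type "
        "FROM asset_property_types WHERE site_id = ? ORDER BY rowid",
        (site_id,),
    ).fetchall()
    if not props:
        raise ValueError("site has no property types")
    # The names are spliced into the SQL text, so they must come from a
    # trusted metadata table (assumed here), never from user input.
    columns = ", ".join(
        "MAX(CASE WHEN ap.asset_property_name = '{0}' THEN ap.{1} END) AS \"{0}\""
        .format(name, value_column)
        for name, value_column in props
    )
    return (
        "SELECT a.asset_id, a.make, a.model, " + columns +
        " FROM asset a JOIN asset_properties ap ON ap.asset_id = a.asset_id"
        " WHERE a.site_id = " + str(int(site_id)) +
        " GROUP BY a.asset_id, a.make, a.model ORDER BY a.asset_id"
    )

# In-memory copy of the sample data from the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset (asset_id INT, site_id INT, make TEXT, model TEXT);
INSERT INTO asset VALUES
 (1, 1, 'Toyota', 'Corolla'), (2, 2, 'Honda', 'Civic'), (3, 2, 'Suzuki', 'Swift');
CREATE TABLE asset_property_types
 (site_id INT, asset_property_name TEXT, asset_property_type TEXT);
INSERT INTO asset_property_types VALUES
 (1, 'Odometer', 'numeric_value'), (1, 'Registration', 'text_value'),
 (1, 'Expiry_Date', 'date_value'),
 (2, 'Odometer', 'numeric_value'), (2, 'Registration', 'text_value'),
 (2, 'Expiry_Date', 'date_value'), (2, 'Colour', 'text_value');
CREATE TABLE asset_properties
 (asset_id INT, asset_property_name TEXT, text_value TEXT,
  numeric_value INT, date_value TEXT);
INSERT INTO asset_properties VALUES
 (1, 'Odometer', NULL, 1234, NULL), (1, 'Registration', 'ABC123', NULL, NULL),
 (1, 'Expiry_Date', NULL, NULL, '2018-02-08'),
 (2, 'Odometer', NULL, 1255, NULL), (2, 'Registration', 'ABC124', NULL, NULL),
 (2, 'Colour', 'Red', NULL, NULL), (2, 'Expiry_Date', NULL, NULL, '2018-01-08'),
 (3, 'Registration', 'ABC125', NULL, NULL), (3, 'Odometer', NULL, 1266, NULL),
 (3, 'Colour', 'Blue', NULL, NULL), (3, 'Expiry_Date', NULL, NULL, '2018-03-25');
""")

site1_rows = conn.execute(build_site_pivot_sql(conn, 1)).fetchall()
site2_rows = conn.execute(build_site_pivot_sql(conn, 2)).fetchall()
print(site1_rows)
print(site2_rows)
```

In MySQL the equivalent would live in a stored procedure that loops over the site ids and prepares one statement per site. The sketch only shows how the column list changes from site to site.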

Your problem is a variant of the pivot table problem; searching for that term will turn up further ways of solving it. One way to solve it is:
A) Fetch the column names (key is a reserved word in MySQL, hence the backticks)
Select distinct `key` from assetPropertiesTable;
B) Build a select query string using dynamic SQL that, for each key k from the previous query, has a column using a subselect like
(select textvalue from assetPropertiesTable t where t.id = outerselect.id and t.`key` = 'k') as column_k
C) Execute the full query and either return it from the stored procedure or do a create table xyz as /*select query here*/.
No need to create and fill the table in two steps.
There are more elegant and better-performing ways of doing that, for example joining the assetPropertiesTable once and aggregating over the condition key = k for each column you want to add (a sum works for counts and numbers; max(case when ...) also works for text and dates).
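Steps A to C can be sketched end to end. This is a minimal illustration only, not a MySQL stored procedure: it runs against an in-memory SQLite database through Python's sqlite3 module, and the names asset_properties, prop_key, text_value and asset_wide are hypothetical placeholders for your own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset_properties (asset_id INT, prop_key TEXT, text_value TEXT);
INSERT INTO asset_properties VALUES
 (2, 'Registration', 'ABC124'), (2, 'Colour', 'Red'),
 (3, 'Registration', 'ABC125'), (3, 'Colour', 'Blue');
""")

def quote_ident(name):
    """Quote a value for use as an SQL identifier."""
    return '"' + name.replace('"', '""') + '"'

def quote_literal(value):
    """Quote a value for use as an SQL string literal."""
    return "'" + value.replace("'", "''") + "'"

# A) Fetch the distinct keys that will become columns.
keys = [row[0] for row in conn.execute(
    "SELECT DISTINCT prop_key FROM asset_properties ORDER BY prop_key")]

# B) Build one correlated subselect per key.
subselects = ", ".join(
    "(SELECT t.text_value FROM asset_properties t "
    "WHERE t.asset_id = o.asset_id AND t.prop_key = {lit}) AS {ident}".format(
        lit=quote_literal(key), ident=quote_ident(key))
    for key in keys
)
select_sql = (
    "SELECT o.asset_id AS asset_id, " + subselects +
    " FROM (SELECT DISTINCT asset_id FROM asset_properties) o"
    " ORDER BY o.asset_id"
)

# C) Create and fill the wide table in a single statement.
conn.execute("DROP TABLE IF EXISTS asset_wide")
conn.execute("CREATE TABLE asset_wide AS " + select_sql)

cursor = conn.execute("SELECT * FROM asset_wide ORDER BY asset_id")
wide_columns = [d[0] for d in cursor.description]
wide_rows = cursor.fetchall()
print(wide_columns)
print(wide_rows)
```

Dropping and recreating the table on each run matches the hourly refresh described in the question. Readers of the table will briefly see it missing unless the swap is done with a rename.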

Related

SSIS Package -Count based on multiple columns

I need to create an SSIS package that gives me the count of workdoneby (contractor/company).
Input table from SQL Server db:
I need to count the number of orders by contractor and company for a particular day + station + worktype + accountno.
My output should look like this.
Can someone help me create a package to get the desired output?
Since the data is in a table, you can ask the database engine to do the calculation logic.
Setup
I created a temporary table and populated it with the supplied data.
CREATE TABLE
#Source
(
[Date] date
, Station char(3)
, worktype char(2)
, Accountno varchar(10)
, workdoneby varchar(10)
)
INSERT INTO
#Source
(
Date
, Station
, worktype
, Accountno
, workdoneby
)
VALUES
('2018-06-24', 'RMS', 'RH', 'I.145.001', 'Company')
, ('2018-06-24', 'RMS', 'PH', 'I.145.001', 'Contractor')
, ('2018-06-24', 'RMS', 'PH', 'I.145.002', 'Company')
, ('2018-06-24', 'RMS', 'PH', 'I.145.002', 'Contractor');
Query time
Now let's query! I find it is helpful to break these problems down into smaller pieces. The first thing I want to do is break out the workdoneby column into two columns with a 1 or 0
SELECT
S.Date
, S.Station
, S.worktype
, S.Accountno
, CASE S.workdoneby
WHEN 'Contractor' THEN 1
ELSE 0
END AS contractorCount
, CASE S.workdoneby
WHEN 'Company' THEN 1
ELSE 0
END AS companyCount
FROM
#Source AS S
Running that lets me look at the results and see that I still have 4 rows and that the correct entity is counted on each.
The next step is to collapse/summarize/roll-up the values. You indicate we should group by date/station/worktype/accountno so that's exactly what we're going to do.
I find it easier to debug if I take that first query and make it a derived table, so the basic form now becomes SELECT * FROM (ORIGINAL QUERY HERE) AS D, thus:
SELECT
D.Date
, D.Station
, D.worktype
, D.Accountno
, D.contractorCount
, D.companyCount
FROM
(
SELECT
S.Date
, S.Station
, S.worktype
, S.Accountno
, CASE S.workdoneby
WHEN 'Contractor' THEN 1
ELSE 0
END AS contractorCount
, CASE S.workdoneby
WHEN 'Company' THEN 1
ELSE 0
END AS companyCount
FROM
#Source AS S
) D
Now that you can see it's giving the same original results, we're going to use the SUM function on the contractorCount and companyCount columns and GROUP BY date/station/worktype/accountno
SELECT
D.Date
, D.Station
, D.worktype
, D.Accountno
, SUM(D.contractorCount) AS contractor
, SUM(D.companyCount) AS company
FROM
(
SELECT
S.Date
, S.Station
, S.worktype
, S.Accountno
, CASE S.workdoneby
WHEN 'Contractor' THEN 1
ELSE 0
END AS contractorCount
, CASE S.workdoneby
WHEN 'Company' THEN 1
ELSE 0
END AS companyCount
FROM
#Source AS S
) D
GROUP BY
D.Date
, D.Station
, D.worktype
, D.Accountno;
SSIS
Now that we have data looking as expected, within SSIS you need to do something with it. Your question doesn't specify what, but most likely you'll either use a Data Flow Task to push this aggregated data to a different destination (another server, Excel, etc.) or, if the data is going into a table on the same server, use an Execute SQL Task.

Stuffing multiple rows to see if columns are populated?

I'm trying to figure out a way in MySQL to stuff multiple records for the same user into a single row to see which columns are populated. For example:
Username | Height | Weight | Age
Bob123   | 6ft    |        |
Bob123   |        | 100lbs |
Bob123   |        | 120lbs | 25yrs
Let's say I have these three records in a table. I want to be able to combine them into a single row that just indicates if each column was populated in any of the records. My hopeful result record would look something like this for each user:
Username Height Weight Age
Bob123 True True True
Is there a way to do this in MySQL or do I need to look at doing this programmatically?
A generic sql method would be like this:
select username
, case when maxheight is not null then 'true' else 'false' end hasheight
, etc
from
(select username
, max(height) maxheight
, etc
from yourtables
where whatever
group by username) temp
CREATE TABLE person (Username VARCHAR(55), Height VARCHAR(55), Weight VARCHAR(55) , Age VARCHAR(55)
);
INSERT INTO person VALUES
('Bob123' , '6ft' , NULL , NULL),
('Bob123' , NULL , '100lbs' , NULL),
('Bob123' , NULL , '120lbs' , '25yrs');
SELECT username,
CASE WHEN MAX(height) IS NULL THEN ' ' ELSE 'True' END AS Height,
CASE WHEN MAX(weight) IS NULL THEN ' ' ELSE 'True' END AS Weight,
CASE WHEN MAX(age) IS NULL THEN ' ' ELSE 'True' END AS Age
FROM person
GROUP BY username;
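The flag logic is easy to get wrong around NULL, so here is a small runnable check. It is a sketch that uses Python's sqlite3 module in place of MySQL, and the second user, Amy456, is made-up data added to show a 'False' result. The aggregate MAX() ignores NULLs, so it is NULL only when no row for that user has a value. The test has to be written as a searched CASE with IS NOT NULL, because a simple CASE col WHEN NULL compares with = and therefore never matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (username TEXT, height TEXT, weight TEXT, age TEXT);
INSERT INTO person VALUES
 ('Bob123', '6ft', NULL, NULL),
 ('Bob123', NULL, '100lbs', NULL),
 ('Bob123', NULL, '120lbs', '25yrs'),
 ('Amy456', '5ft', NULL, NULL);  -- hypothetical extra user
""")

# One row per user; each flag is 'True' if any of the user's rows has a value.
flags = conn.execute("""
SELECT username,
       CASE WHEN MAX(height) IS NOT NULL THEN 'True' ELSE 'False' END AS has_height,
       CASE WHEN MAX(weight) IS NOT NULL THEN 'True' ELSE 'False' END AS has_weight,
       CASE WHEN MAX(age)    IS NOT NULL THEN 'True' ELSE 'False' END AS has_age
FROM person
GROUP BY username
ORDER BY username
""").fetchall()
print(flags)

# A simple CASE compares with "=", and NULL = NULL is not true,
# so the WHEN NULL branch is never taken.
simple_case = conn.execute(
    "SELECT CASE NULL WHEN NULL THEN 'null' ELSE 'not null' END").fetchone()[0]
print(simple_case)
```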

Number of logins per user per day in a date range

I saw many posts around that area, but couldn't find the exact one.
I have a table that registers all user logins to my app. It contains two columns, userID and timeOfLogin, and looks like this:
userID, timeOfLogin
1 , 14-01-10 00:07:38
2 , 14-01-10 01:28:45
3 , 14-01-10 01:28:45
1 , 14-01-09 02:04:08
1 , 14-01-09 06:14:54
etc....
I want to have a table that counts the number of unique user logins per day since a specific "day1" that I define in the query (day2 is the following day and so on). The table should look something like:
userID, numOFLogins day1, numOfLogins Day2, numOfLogins Day3, ...., numOfLogins DayN
1 , 10 , 12 , 0 , ...., 12
2 , 3 , 6 , 7 , ...., 15
132 , 0 , 5 , 9 , ...., 14
You can do this with conditional aggregation:
select userId,
sum(date(timeOfLogin) = date(@day1)) as NumLogins_0,
sum(date(timeOfLogin) = date(@day1) + interval 1 day) as NumLogins_1,
sum(date(timeOfLogin) = date(@day1) + interval 2 day) as NumLogins_2,
sum(date(timeOfLogin) = date(@day1) + interval 3 day) as NumLogins_3,
sum(date(timeOfLogin) = date(@day1) + interval 4 day) as NumLogins_4
from your_table t
group by userId;
In MySQL, date(timeOfLogin) = date(@day1) evaluates to 0 when the expression is false and 1 when it is true, so summing it counts the matching rows. The offsets use interval arithmetic because adding a plain number to a date produces an invalid date at month ends.
I'm just using @day1 to represent your variable for the date, whatever that is.
This will work for a fixed number of columns (such as the 5 days shown above). If you want a variable number of columns, then you cannot do this with a simple SQL statement. You will need to construct the SQL in a string, use prepare, and execute it.
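Constructing the SQL in a string for a variable number of days could look roughly like this. It is a sketch under a few assumptions: it uses Python's sqlite3 module rather than MySQL's prepare/execute, the table name logins and the helper build_daily_login_sql are hypothetical, and the timestamps are stored in ISO format (YYYY-MM-DD HH:MM:SS), which SQLite's date() function requires.

```python
import sqlite3

def build_daily_login_sql(num_days):
    """Return SQL with one conditional-sum column per day, starting at :day1."""
    if num_days < 1:
        raise ValueError("num_days must be at least 1")
    # Only the integer loop counter is spliced into the SQL text;
    # the start date itself is passed as a bound parameter.
    columns = ",\n       ".join(
        "SUM(date(timeOfLogin) = date(:day1, '+{0} days')) AS NumLogins_{0}".format(i)
        for i in range(num_days)
    )
    return (
        "SELECT userID,\n       " + columns +
        "\nFROM logins\nGROUP BY userID\nORDER BY userID"
    )

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logins (userID INT, timeOfLogin TEXT);
INSERT INTO logins VALUES
 (1, '2014-01-10 00:07:38'),
 (2, '2014-01-10 01:28:45'),
 (3, '2014-01-10 01:28:45'),
 (1, '2014-01-09 02:04:08'),
 (1, '2014-01-09 06:14:54');
""")

sql = build_daily_login_sql(2)
print(sql)
daily_counts = conn.execute(sql, {"day1": "2014-01-09"}).fetchall()
print(daily_counts)
```

In MySQL the generated column expression would instead use date(@day1) + interval N day, and the finished string would go through prepare and execute.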

Calculating date in SQL

I am having a problem calculating the number of years since a faculty member was hired (current time - faculty hire date).
I am trying to use this, but I keep getting an error that DATEDIFF is an invalid identifier. Please help.
SELECT FAC_FN , DATEDIFF ('CURDATE()' , 'FAC_HIRE DATE')
FROM FACULTY;
--CREATING TABLE FACULTY
CREATE TABLE FACULTY
(
FAC_ID NUMBER (4) CONSTRAINT FAC_ID_PK PRIMARY KEY,
FAC_FN VARCHAR2 (15),
FAC_LN VARCHAR2 (15),
FAC_DEPT VARCHAR2 (10),
FAC_RANK VARCHAR2 (10),
FAC_HIRE_DATE DATE,
FAC_SALARY NUMBER (7),
FAC_SUPERVISOR NUMBER (4)
);
--INSERTING RECORDS INTO FACULTY TABLE
INSERT INTO FACULTY VALUES ( 9001 , 'Leonard' , 'Vince' , 'IS' , 'ASST' , TO_DATE('12-APR-1997','DD-MON-YYYY') , 67000 , 9003);
INSERT INTO FACULTY VALUES ( 9002 , 'Victor' , 'Strong' , 'CSCI' , 'ASSO' , TO_DATE('8-AUG-1999','DD-MM-YYYY') , 70000 , 9003);
INSERT INTO FACULTY VALUES ( 9003 , 'Nicki' , 'Colan' , 'IS' , 'PROF' , TO_DATE('20-AUG-1981','DD-MM-YYYY') , 75000, 9010);
INSERT INTO FACULTY VALUES ( 9004 , 'Fred' , 'Wells' , 'ACCT' , 'ASST' , TO_DATE('28-AUG-1996','DD-MM-YYYY'), 60000, 9010);
INSERT INTO FACULTY VALUES ( 9010 , 'Chris' , 'Macon' , 'ACCT' , 'ASST' , TO_DATE('4-AUG-1980','DD-MM-YYYY') , 75000 , '');
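Remove the quotes (they cause MySQL to parse the function arguments as string literals) and use the actual column name, FAC_HIRE_DATE:
SELECT FAC_FN, DATEDIFF(CURDATE(), FAC_HIRE_DATE) FROM FACULTY;
Note also that DATEDIFF() returns the difference in days; to obtain the difference in years, use TIMESTAMPDIFF(YEAR, FAC_HIRE_DATE, CURDATE()) instead.
The VARCHAR2, NUMBER and TO_DATE in your script, together with the "invalid identifier" error, suggest you are actually on Oracle rather than MySQL. Oracle has neither DATEDIFF() nor CURDATE(); there, something like TRUNC(MONTHS_BETWEEN(SYSDATE, FAC_HIRE_DATE) / 12) gives the number of whole years.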

How to write a query for updating two tables at a time?

Hi, I have two tables in my database, Requests and Balance Tracker, which have no relation to each other, but I want to select data from both tables and bind it to a grid.
Requests
EmpID | EmpRqsts | EmpDescription | ApproverID | ApprovedAmount | RequestPriority
1     | asdfsb   | sadbfsbdf      | 1          |                |
2     | asbfd    | sjkfbsd        | 1          |                |
Balance Tracker
EmpId | BalanceAmnt | LastUpdated | lastApprovedAmount
1     | 5000        | sdfbk       |
2     | 3000        | sjbfsh      |
Now I want to update both tables at once, based on the EmpID. Whenever an amount is approved, it should be written to the [ApprovedAmount] column of the Requests table, along with the priority.
When [ApprovedAmount] is updated, [BalanceAmnt] in Balance Tracker should also be updated by adding the approved amount, [lastApprovedAmount] should be set to that amount, and [LastUpdated] should be set to the current date and time.
Can anyone help me with the query, please?
@Anil, here is an example of SQL Server 2008 code that should help you accomplish your goal:
DECLARE @Requests TABLE
(
EmpId int
, EmpRqsts nvarchar(50)
, EmpDescription nvarchar(250)
, ApproverID int
, ApprovedAmount money
, RequestPriority int
)
DECLARE @BalanceTracker TABLE
(
EmpId int
, BalanceAmnt money
, LastUpdated datetime
, lastApprovedAmount money
)
-- Insert data for testing
INSERT INTO @Requests VALUES (1, 'Something here', 'Some descriptio here', 1, 100, 1)
INSERT INTO @Requests VALUES (2, 'Something here 2 ', 'Some descriptio here 3', 1, 215, 2)
INSERT INTO @BalanceTracker VALUES (1, 5000, GETDATE() - 3, 310)
INSERT INTO @BalanceTracker VALUES (2, 3000, GETDATE() - 1, 98)
-- Declare local variables
DECLARE
@NewAmount money
, @NewPriority int
, @SelectedEmpId int
-- Assign values for the example
SELECT @NewAmount = 1000
, @SelectedEmpId = 1
, @NewPriority = 5
-- Get the table values before the updates
SELECT *
FROM @Requests
SELECT *
FROM @BalanceTracker
BEGIN TRY
-- Update the record with the new ApprovedAmount and RequestPriority
UPDATE @Requests
SET ApprovedAmount = @NewAmount
, RequestPriority = @NewPriority
WHERE EmpId = @SelectedEmpId
-- If no error was found, update the BalanceTracker table
IF (@@ERROR = 0)
BEGIN TRY
UPDATE @BalanceTracker
SET BalanceAmnt = (BalanceAmnt + @NewAmount)
, LastUpdated = GETDATE()
, lastApprovedAmount = @NewAmount
WHERE EmpId = @SelectedEmpId
END TRY
BEGIN CATCH
PRINT N'Error found updating @BalanceTracker table: ' + ISNULL(LTRIM(STR(ERROR_NUMBER())) , N'Unknown Error' )
+ N', Message: ' + ISNULL ( ERROR_MESSAGE() , N'No Message' )
END CATCH
END TRY
BEGIN CATCH
PRINT N'Error found updating @Requests table: ' + ISNULL(LTRIM(STR(ERROR_NUMBER())) , N'Unknown Error' )
+ N', Message: ' + ISNULL ( ERROR_MESSAGE() , N'No Message' )
END CATCH
-- Get the table values after the updates
SELECT *
FROM @Requests
SELECT *
FROM @BalanceTracker
Note 1: @Table denotes a table variable, as handled by SQL Server 2008. If you're using a previous version, you should be able to create a temporary table (#Table) instead.
Note 2: the data types may vary depending on the SQL Server version you're using.
You could do this type of thing with a trigger. This way whenever you do the first update, it will automatically do the other update you specify.
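To make the trigger idea concrete, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module, not on SQL Server, and the table, column and trigger names (requests, balance_tracker, trg_requests_approved) are simplified, hypothetical versions of the ones in the question. A SQL Server trigger would be written differently, as a set-based statement joining the inserted pseudo-table, but the effect is the same: updating the approved amount automatically adjusts the balance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE requests
 (emp_id INTEGER, approved_amount NUMERIC, request_priority INTEGER);
CREATE TABLE balance_tracker
 (emp_id INTEGER, balance_amnt NUMERIC, last_updated TEXT,
  last_approved_amount NUMERIC);
INSERT INTO requests VALUES (1, NULL, NULL), (2, NULL, NULL);
INSERT INTO balance_tracker VALUES (1, 5000, NULL, NULL), (2, 3000, NULL, NULL);

-- Fires once per updated request row whose approved amount was set.
CREATE TRIGGER trg_requests_approved
AFTER UPDATE OF approved_amount ON requests
FOR EACH ROW
WHEN NEW.approved_amount IS NOT NULL
BEGIN
    UPDATE balance_tracker
    SET balance_amnt = balance_amnt + NEW.approved_amount,
        last_updated = datetime('now'),
        last_approved_amount = NEW.approved_amount
    WHERE emp_id = NEW.emp_id;
END;
""")

# Only the requests table is updated explicitly; the trigger does the rest.
conn.execute(
    "UPDATE requests SET approved_amount = ?, request_priority = ? WHERE emp_id = ?",
    (1000, 5, 1),
)
balances = conn.execute(
    "SELECT emp_id, balance_amnt, last_approved_amount "
    "FROM balance_tracker ORDER BY emp_id"
).fetchall()
print(balances)
```

One design consequence: with a trigger, both changes happen inside the same statement, so they succeed or fail together without the nested TRY/CATCH handling.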