Unable to get an SQL Pivot to work - sql-server-2008

I have a temporary table that contains a part category and the associated part cost:
#part_costs
part_cat | part_cost
tire | 0
fuel | 24
wheel | 34
The number of rows and the values within #part_costs are dynamic. I am trying to create a pivot so I will have (doesn't matter the order of the columns):
tire | fuel | wheel
0 | 24 | 34
I created an NVARCHAR(2000) variable that holds the part category names:
"[fuel],[tires],[wheel]"
So far for my pivot I have
SELECT [fuel],[tires],[wheel] FROM (SELECT part_cat, part_cost FROM #part_costs ) p PIVOT ( [part_cost] FOR part_category IN ( [fuel],[tires],[wheel] )) AS pvt
Yet I can't get this to work. I can run my stored procedure, that this is located in, yet when I execute my stored procedure I get the following error: Incorrect syntax near the keyword 'FOR'.
Previously I had select 'part_cost' as costs, #part_names from (select part_cat, part_cost from #part_costs) as p PIVOT (part_cost for part_cat in (#part_names)) as PivotTable though with this I wasn't even able to run my stored procedure as I got Incorrect syntax near the keyword 'for'.

You have to use an aggregate function (e.g. AVG, MAX) before the FOR. Also make sure the names match your data exactly: the FOR column must be part_cat (not part_category), and the IN list must use the actual column values, i.e. tire, not tires.
SELECT [fuel],[tire],[wheel]
FROM (
SELECT part_cat, part_cost
FROM #part_costs
) p
PIVOT
(
Avg([part_cost]) FOR part_cat IN ( [fuel],[tire],[wheel] )
) AS pvt
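Since the question says the categories are dynamic, the column list cannot stay hard-coded. As a minimal sketch of the same idea (one aggregate per output column), the Python snippet below runs against SQLite. SQLite has no PIVOT operator, so this swaps in conditional aggregation (MAX(CASE ...)) rather than the T-SQL syntax above. The table name part_costs (without the #) and the helper pivot_part_costs are assumptions made for illustration only.

```python
import sqlite3


def pivot_part_costs(conn):
    """Return one row as {category: cost}, built with conditional aggregation.

    Assumption: a table part_costs(part_cat, part_cost) exists on conn.
    Each output column is MAX(CASE WHEN ... END), which plays the role of
    the aggregate that T-SQL requires before FOR in a PIVOT.
    """
    cats = [row[0] for row in conn.execute(
        "SELECT DISTINCT part_cat FROM part_costs ORDER BY part_cat")]
    if not cats:
        return {}
    # Category values are bound as parameters; aliases are quoted identifiers.
    columns = ", ".join(
        'MAX(CASE WHEN part_cat = ? THEN part_cost END) AS "{}"'.format(
            cat.replace('"', '""'))
        for cat in cats)
    cur = conn.execute("SELECT {} FROM part_costs".format(columns), cats)
    names = [col[0] for col in cur.description]
    return dict(zip(names, cur.fetchone()))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE part_costs (part_cat TEXT, part_cost INTEGER)")
    conn.executemany("INSERT INTO part_costs VALUES (?, ?)",
                     [("tire", 0), ("fuel", 24), ("wheel", 34)])
    print(pivot_part_costs(conn))
```

In SQL Server itself the equivalent would be to build the PIVOT statement as a string from the category names and execute it dynamically; that variant is not shown here.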

Related

Recursively running a MySQL function

I have a function in MySQL that needs to be run about 50 times (not a set value) in a query. The inputs are currently stored in an array such as
[1,2,3,4,5,6,7,8,9,10]
When I execute the MySQL query individually it works fine; see below.
column_name denotes the column it's getting the data for, in this case, it's a DOUBLE in the database
The second value in the MOD() function is the input I'm supplying MySQL from the aforementioned array
SELECT id, MOD(column_name, 4) AS mod_output
FROM table
HAVING mod_output > 10
To achieve the output I require* the following code works
SELECT id, MOD(column_name, 4) AS mod_output1, MOD(column_name, 5) AS mod_output2, MOD(column_name, 6) AS mod_output3
FROM table
HAVING mod_output1 > 10 AND mod_output2 > 10 AND mod_output3 > 10
However, this is obviously extremely dirty, and with over 50 inputs instead of 3 it becomes highly inefficient.
Apart from running over 50 individual queries, is there a better way to achieve the same sort of output (see below)?
In essence I need to supply MySQL with a list of values and have it run MOD() over all of them on a specified column.
The only data I need returned is the ids of the rows where the MOD() output for the supplied input (the second argument of MOD()) is less than 10.
Please note, MOD() has been used as an example function, however, the final function required *should* be a drop in replacement
example table layout
id | column_name
1 | 0.234977
2 | 0.957739
3 | 2.499387
4 | 48.395777
5 | 9.943782
6 | -39.234894
7 | 23.49859
.....
(The title may be worded wrong, I'm not quite sure how else you'd explain what I'm trying to do here)
Use a join and derived table or temporary table:
SELECT n.n, t.id, MOD(t.column_name, n.n) AS mod_output
FROM table t CROSS JOIN
(SELECT 4 as n UNION ALL SELECT 5 UNION ALL SELECT 6 . . .
) n
WHERE MOD(t.column_name, n.n) > 10;
If you want the results as columns, you can use conditional aggregation afterwards.
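The cross join against a list of inputs can be tried out with a small sketch. The Python snippet below uses SQLite rather than MySQL, builds the derived table of divisors from a Python list, and binds every value as a parameter. The table readings(id, val), the helper name rows_over_threshold, and the use of integer values with the % operator (in place of MOD() on a DOUBLE) are assumptions for illustration.

```python
import sqlite3


def rows_over_threshold(conn, divisors, threshold):
    """Apply the remainder operator for every divisor in a single query.

    Assumption: a table readings(id, val) with integer values stands in
    for the question's table; SQLite's % operator works on integers.
    """
    if not divisors:
        return []
    # One (?) row per divisor becomes the derived table of inputs.
    values = ", ".join("(?)" for _ in divisors)
    sql = (
        "WITH inputs(n) AS (VALUES {}) "
        "SELECT i.n, r.id, r.val % i.n AS mod_output "
        "FROM readings AS r CROSS JOIN inputs AS i "
        "WHERE r.val % i.n > ? "
        "ORDER BY i.n, r.id"
    ).format(values)
    return conn.execute(sql, list(divisors) + [threshold]).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (id INTEGER, val INTEGER)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)",
                     [(1, 17), (2, 29), (3, 5)])
    print(rows_over_threshold(conn, [12, 15], 10))
```

Each result row carries the divisor that produced it, so a later step can group by id or by divisor as needed.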

Mysql: I want to compare the results of two queries and return the results

In MS Access I have the following query and I want to duplicate it in MySQL
SELECT New_Date_Sev54.*
FROM New_Date_Sev54 LEFT JOIN Old_Date_Sev54 ON New_Date_Sev54.[Expr1] = Old_Date_Sev54.[Expr1]
WHERE (((Old_Date_Sev54.Expr1) Is Null));
New_date query :
SELECT perimeter.*, perimeter.IP, perimeter.QID, perimeter.Severity, [IP] & [QID] AS Expr1
FROM perimeter
WHERE (((perimeter.QID)<>38628 And (perimeter.QID)<>38675) AND ((perimeter.Severity)=5) AND ((perimeter.Date)=22118)) OR (((perimeter.Severity)=4));
Old Date Query:
SELECT perimeter.*, perimeter.IP, perimeter.QID, perimeter.Severity, [IP] & [QID] AS Expr1
FROM perimeter
WHERE (((perimeter.QID)<>38628 And (perimeter.QID)<>38675) AND ((perimeter.Severity)=5) AND ((perimeter.Date)=21918)) OR (((perimeter.Severity)=4));
In the ACCESS query, I basically take all the results with the new date and compare them against the results of the old date (week prior) and return anything that did not exist the week prior.
The database is used to quickly identify new vulnerabilities that exist in the perimeter. And is shaped like this
Date | IP| VulnID | VulnName | Severity | Threat | Resolution
What I have been trying in MySQL is using the "NOT IN" comparison of two select statements. However, it is not working.
I want to know all the new vulnerabilities that have a severity of 4 or 5 and that do not have the Vuln id of 32628
Thanks
Put each query into temp tables:
CREATE TEMPORARY TABLE newVulns AS ([new date query])
CREATE TEMPORARY TABLE oldVulns AS ([old date query])
where [new date query] and [old date query] are your select statements.
Then
SELECT * FROM newVulns n
LEFT JOIN oldVulns o
ON n.VulnID = o.VulnID
WHERE o.VulnID IS NULL
AND n.VulnID != 32628
AND n.Severity IN (4, 5)
I believe that should do it.
Temp table creation info found in the manual and a neat visual representation of joins can be found here. I find myself looking at those all the time.
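The anti-join pattern (LEFT JOIN, then keep only rows where the right side IS NULL) can be sketched end to end. The Python snippet below uses SQLite and a self-join on one table instead of two temporary tables, which is a variation on the approach above, not the same statement. The column names scan_date, ip, qid and severity and the helper new_vulns are assumptions; the join key (ip plus qid) mirrors the Expr1 expression in the Access queries.

```python
import sqlite3


def new_vulns(conn, new_date, old_date):
    """Rows present on new_date that have no (ip, qid) match on old_date.

    Assumption: a table perimeter(scan_date, ip, qid, severity).
    """
    sql = """
        SELECT n.ip, n.qid, n.severity
        FROM perimeter AS n
        LEFT JOIN perimeter AS o
               ON o.ip = n.ip AND o.qid = n.qid AND o.scan_date = ?
        WHERE n.scan_date = ?
          AND n.severity IN (4, 5)
          AND n.qid <> 32628
          AND o.ip IS NULL          -- no match last week: the row is new
        ORDER BY n.ip, n.qid
    """
    return conn.execute(sql, (old_date, new_date)).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE perimeter "
                 "(scan_date INTEGER, ip TEXT, qid INTEGER, severity INTEGER)")
    conn.executemany("INSERT INTO perimeter VALUES (?, ?, ?, ?)", [
        (21918, "10.0.0.1", 100, 5),
        (22118, "10.0.0.1", 100, 5),
        (22118, "10.0.0.2", 200, 4),
    ])
    print(new_vulns(conn, 22118, 21918))
```

Unlike the Access queries, this sketch does not filter the old rows by severity, so a finding that existed last week at any severity counts as already known.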

ColdFusion Query too slow

I have queries inside a cfloop that make the process very slow. Is there a way to make this faster?
<cfquery name="GetCheckRegister" datasource="myDB">
SELECT * FROM CheckRegister, ExpenseType
Where PropertyID=10
and ExpenseType.ExpenseTypeID=CheckRegister.ExpenseTypeID
</cfquery>
<CFOUTPUT query=GetCheckRegister>
<cfquery name="GetVendorName" datasource="myDB"> SELECT * FROM Vendors WHERE VendorID=#VendorID#</cfquery>
<!--- I use the vendor name here --->
<cfset local.CreditDate = "" />
<cfquery name="getTenantTransactionDateFrom" dataSource="myDB">
Select TenantTransactionDate as fromDate From TenantTransactions
Where CheckRegisterID = #CheckRegisterID#
Order By TenantTransactionDate Limit 1
</cfquery>
<cfquery name="getTenantTransactionDateTo" dataSource="myDB">
Select TenantTransactionDate as ToDate From TenantTransactions
Where CheckRegisterID = #CheckRegisterID#
Order By TenantTransactionDate desc Limit 1
</cfquery>
<cfif getTenantTransactionDateFrom.fromDate neq "" AND getTenantTransactionDateTo.ToDate neq "">
<cfif getTenantTransactionDateFrom.fromDate eq getTenantTransactionDateTo.ToDate>
<cfset local.CreditDate = DateFormat(getTenantTransactionDateFrom.fromDate, 'mm/dd/yyyy') />
<cfelse>
<cfset local.CreditDate = DateFormat(getTenantTransactionDateFrom.fromDate, 'mm/dd/yyyy') & " - " & DateFormat(getTenantTransactionDateTo.ToDate, 'mm/dd/yyyy') />
</cfif>
</cfif>
<!--- I use the local.CreditDate here --->
<!--- Here goes a table with the data --->
</CFOUTPUT>
cfoutput works like a loop.
As others have said, you should get rid of the loop and use joins. Looking at your inner loop, the code retrieves the earliest and latest date for each CheckRegisterID. Instead of using LIMIT, use aggregate functions like MIN and MAX and GROUP BY CheckRegisterID. Then wrap that result in a derived query so you can join the results back to CheckRegister ON id.
Some of the columns in the original query aren't scoped, so I took a few guesses. There's room for improvement, but something like this is enough to get you started.
-- select only needed columns
SELECT cr.CheckRegisterID, ... other columns
FROM CheckRegister cr
INNER JOIN ExpenseType ex ON ex.ExpenseTypeID=cr.ExpenseTypeID
INNER JOIN Vendors v ON v.VendorID = cr.VendorID
LEFT JOIN
(
SELECT CheckRegisterID
, MIN(TenantTransactionDate) AS MinDate
, MAX(TenantTransactionDate) AS MaxDate
FROM TenantTransactions
GROUP BY CheckRegisterID
) tt ON tt.CheckRegisterID = cr.CheckRegisterID
WHERE cr.PropertyID = 10
I'd highly recommend reading up on JOINs as they're critical to any web application, IMO.
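To see the derived-table join and the credit-date formatting working together, here is a minimal sketch in Python against SQLite. It is not ColdFusion and not the author's code: the VendorName column, the trimmed-down table definitions, and the helper names credit_date and check_register_rows are assumptions. The credit_date function mirrors the cfif logic from the question (empty, single date, or a "from - to" range).

```python
import sqlite3
from datetime import date


def credit_date(min_date, max_date):
    """Format ISO date strings as mm/dd/yyyy, or as a range if they differ."""
    if not min_date or not max_date:
        return ""

    def fmt(iso):
        return date.fromisoformat(iso).strftime("%m/%d/%Y")

    if min_date == max_date:
        return fmt(min_date)
    return "{} - {}".format(fmt(min_date), fmt(max_date))


def check_register_rows(conn, property_id):
    """One query replaces the per-row vendor and date lookups."""
    sql = """
        SELECT cr.CheckRegisterID, v.VendorName, tt.MinDate, tt.MaxDate
        FROM CheckRegister cr
        INNER JOIN Vendors v ON v.VendorID = cr.VendorID
        LEFT JOIN (
            SELECT CheckRegisterID,
                   MIN(TenantTransactionDate) AS MinDate,
                   MAX(TenantTransactionDate) AS MaxDate
            FROM TenantTransactions
            GROUP BY CheckRegisterID
        ) tt ON tt.CheckRegisterID = cr.CheckRegisterID
        WHERE cr.PropertyID = ?
        ORDER BY cr.CheckRegisterID
    """
    return [(rid, vendor, credit_date(lo, hi))
            for rid, vendor, lo, hi in conn.execute(sql, (property_id,))]
```

The LEFT JOIN keeps check registers that have no tenant transactions; for those rows both dates are NULL and the formatted credit date is an empty string.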
You should get all of your data in one query, then work with that data to output what you want. Multiple connections to a database will almost always be more resource-intensive than getting the data in one trip and working with it. To get your results:
SQL Fiddle
Initial Schema Setup:
CREATE TABLE CheckRegister ( checkRegisterID int, PropertyID int, VendorID int, ExpenseTypeID int ) ;
CREATE TABLE ExpenseType ( ExpenseTypeID int ) ;
CREATE TABLE Vendors ( VendorID int ) ;
CREATE TABLE TenantTransactions ( checkRegisterID int, TenantTransactionDate date, note varchar(20) );
INSERT INTO CheckRegister ( checkRegisterID, PropertyID, VendorID, ExpenseTypeID )
VALUES (1,10,1,1),(1,10,1,1),(1,10,2,1),(1,10,1,2),(1,5,1,1),(2,10,1,1),(2,5,1,1)
;
INSERT INTO ExpenseType ( ExpenseTypeID ) VALUES (1), (2) ;
INSERT INTO Vendors ( VendorID ) VALUES (1), (2) ;
INSERT INTO TenantTransactions ( checkRegisterID, TenantTransactionDate, note )
VALUES
(1,'2018-01-01','start')
, (1,'2018-01-02','another')
, (1,'2018-01-03','another')
, (1,'2018-01-04','stop')
, (2,'2017-01-01','start')
, (2,'2017-01-02','another')
, (2,'2017-01-03','another')
, (2,'2017-01-04','stop')
;
Main Query:
SELECT cr.*
, min(tt.TenantTransactionDate) AS startDate
, max(tt.TenantTransactionDate) AS endDate
FROM CheckRegister cr
INNER JOIN ExpenseType et ON cr.ExpenseTypeID = et.ExpenseTypeID
INNER JOIN Vendors v ON cr.vendorID = v.VendorID
LEFT OUTER JOIN TenantTransactions tt ON cr.checkRegisterID = tt.CheckRegisterID
WHERE cr.PropertyID = 10
GROUP BY cr.CheckRegisterID, cr.PropertyID, cr.VendorID, cr.ExpenseTypeID
Results:
| checkRegisterID | PropertyID | VendorID | ExpenseTypeID | startDate | endDate |
|-----------------|------------|----------|---------------|------------|------------|
| 1 | 10 | 1 | 1 | 2018-01-01 | 2018-01-04 |
| 1 | 10 | 1 | 2 | 2018-01-01 | 2018-01-04 |
| 1 | 10 | 2 | 1 | 2018-01-01 | 2018-01-04 |
| 2 | 10 | 1 | 1 | 2017-01-01 | 2017-01-04 |
I only added 2 check registers, but CheckRegisterID 1 has 2 vendors and 2 Expense Types for Vendor 1. This will look like repeated data in your query. If your data isn't set up that way, you won't have to worry about it in the final query.
Use proper JOIN syntax to get the related data you need. Then you can aggregate that data to get the fromDate and toDate. If your data is more complex, you may need to look at Window Functions. https://dev.mysql.com/doc/refman/8.0/en/window-functions.html
I don't know what your final output looks like, but the above query gives you all of the query data in one pass. And each row of that data should give you what you need to output, so now you've only got one query to loop over.
It has been a long time since I did any ColdFusion development but a common rule of thumb would be to not call queries within a loop. Depending on what you are doing, loops can be considered an RBAR (row by agonizing row) operation.
You are essentially defining one query and looping over each record. For each record, you are doing three additional queries aka three additional database network calls per record. The way I see it, you have a couple options:
Rewrite your first query to already include the data you need within each record check.
Leave your first query the way it is and create functionality that provides more information when the user interacts with the record and do it asynchronously. Something like a "Show Credit Date" link which goes out and gets the data on demand.
Combine the queries in your loop to be one query instead of the two getTenantTransaction... and see if performance improves. This reduces the RBAR database calls from three to two.
You always want to avoid having queries in a loop. Whenever you query the database, you incur a round trip (from server to database and back), which is slow by nature.
A general approach is to bulk data by querying all required information with as few statements as possible. Joining everything in a single statement would be ideal, but this obviously depends on your table schemas. If you cannot solve it using SQL only, you can transform your queries like this:
GetCheckRegister...
(no loop)
<cfquery name="GetVendorName" datasource="rent">
SELECT * FROM Vendors WHERE VendorID IN (#valueList(GetCheckRegister.VendorID)#)
</cfquery>
<cfquery name="getTenantTransactionDateFrom" dataSource="rent">
Select TenantTransactionDate as fromDate From TenantTransactions
Where CheckRegisterID IN (#valueList(GetCheckRegister.CheckRegisterID)#)
</cfquery>
etc.
valueList(query.column) returns a comma delimited list of the specified column values. This list is then used with MySQL's IN (list) selector to retrieve all records that belong to all the listed values.
Now you would only have a single query for each statement in your loop (4 queries total, instead of 1 plus 3 times the number of records in GetCheckRegister). But all records are clumped together, so you need to match them accordingly. To do this we can utilize ColdFusion's Query of Queries (QoQ), which allows you to query already retrieved data. Since the retrieved data is in memory, accessing it is quick.
GetCheckRegister, GetVendorName, getTenantTransactionDateFrom, getTenantTransactionDateTo etc.
<CFOUTPUT query="GetCheckRegister">
<!--- query of queries --->
<cfquery name="GetVendorNameSingle" dbType="query">
SELECT * FROM [GetVendorName] WHERE VendorID = #GetCheckRegister.VendorID#
</cfquery>
etc.
</CFOUTPUT>
You basically moved the real queries out of the loop and instead query the result of the real queries in your loop using QoQ.
Regardless of this, make sure your real queries are fast by profiling them in MySQL. Use indices!
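The "fetch once with IN (...), then look up in memory" idea can be sketched outside ColdFusion as well. The Python snippet below uses SQLite and a plain dictionary in place of Query of Queries; the VendorName column and the helper vendor_names_by_id are assumptions for illustration. The ids are bound as parameters instead of being interpolated into the SQL text.

```python
import sqlite3


def vendor_names_by_id(conn, vendor_ids):
    """Fetch all needed vendors in one round trip and index them by id.

    Assumption: a table Vendors(VendorID, VendorName).
    """
    ids = sorted(set(vendor_ids))
    if not ids:
        return {}
    placeholders = ", ".join("?" for _ in ids)
    sql = ("SELECT VendorID, VendorName FROM Vendors "
           "WHERE VendorID IN ({})".format(placeholders))
    # Each row is a (VendorID, VendorName) pair, so dict() can consume it.
    return dict(conn.execute(sql, ids))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Vendors (VendorID INTEGER, VendorName TEXT)")
    conn.executemany("INSERT INTO Vendors VALUES (?, ?)",
                     [(1, "Acme"), (2, "Bolt"), (3, "Core")])
    names = vendor_names_by_id(conn, [2, 1, 2])
    for vendor_id in [2, 1, 2]:          # stands in for the output loop
        print(vendor_id, names[vendor_id])
```

Inside the output loop each lookup is then a dictionary access with no database call.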
Keeping the main query and the loop can still be made faster by:
Selecting only the specific fields you need instead of SELECT *, to avoid fetching unused columns:
SELECT VendorID, CheckRegisterId, ... FROM CheckRegister, ExpenseType ...
Using fewer subqueries in the loop by joining their tables into the main query. For example, adding the Vendors table to the main query (if it is possible to join this table):
SELECT VendorID, CheckRegisterId, VendorName ... FROM CheckRegister, ExpenseType, Vendors ...
Finally, you can estimate the time of the process and detect the performance problem:
ROWS = Number of rows of the result in the main query
TIME_V = Time (ms) to get the result of GetVendorName using a valid VendorId
TIME_TD1 = Time (ms) to get the result of getTenantTransactionDateFrom using a valid CheckRegisterID
TIME_TD2 = Time (ms) to get the result of getTenantTransactionDateTo using a valid CheckRegisterID
Then, you can calculate the resulting time using TOTAL = ROWS * (TIME_V+ TIME_TD1 + TIME_TD2).
For example, if ROWS = 10000, TIME_V = 30, TIME_TD1 = 15, TIME_TD2 = 15: TOTAL = 10000 * (30 + 15 + 15) = 10000 * 60 = 600000 ms = 600 sec = 10 min
So, for 10000 rows, each millisecond spent per loop iteration adds 10 seconds to the process.
When the main query returns many rows, you need to minimize the query time of each element in the loop. Each millisecond affects the performance of the loop, so make sure the right indexes exist for each field filtered by the queries in the loop.
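The estimate above is simple enough to script when comparing scenarios. The following Python sketch just encodes TOTAL = ROWS * (sum of per-row query times); the function name estimate_loop_time_ms is hypothetical.

```python
def estimate_loop_time_ms(rows, *per_row_query_ms):
    """Total time in ms: number of rows times the summed per-row query times."""
    return rows * sum(per_row_query_ms)


if __name__ == "__main__":
    total_ms = estimate_loop_time_ms(10000, 30, 15, 15)
    print(total_ms, "ms =", total_ms / 1000, "sec =", total_ms / 60000, "min")
    # Cost of a single extra millisecond per iteration, in seconds.
    print(estimate_loop_time_ms(10000, 1) / 1000, "sec per extra ms")
```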

select one row multiple time when using IN()

I have this query :
select
name
from
provinces
WHERE
province_id IN(1,3,2,1)
ORDER BY FIELD(province_id, 1,3,2,1)
The number of values in IN() is dynamic.
How can I get all rows, including duplicates (1 in this example), in the given ORDER BY order?
The result should look like this:
name1
name3
name2
name1
plus I shouldn't use UNION ALL :
select * from provinces WHERE province_id=1
UNION ALL
select * from provinces WHERE province_id=3
UNION ALL
select * from provinces WHERE province_id=2
UNION ALL
select * from provinces WHERE province_id=1
You need a helper table here. On SQL Server that can be something like:
SELECT name
FROM (Values (1),(3),(2),(1)) As list (id) --< List of values to join to as a table
INNER JOIN provinces ON province_id = list.id
Update: In MySQL Split Comma Separated String Into Temp Table can be used to split string parameter into a helper table.
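The helper-table idea can be tried with a small Python sketch against SQLite, which accepts a VALUES list inside a common table expression. An extra pos column (an addition not present in the query above) records each value's position so the output follows the order of the input list. The CTE name wanted and the helper names_in_list_order are assumptions for illustration.

```python
import sqlite3


def names_in_list_order(conn, ids):
    """Return one name per list entry, duplicates included, in list order.

    Assumption: a table provinces(province_id, name).
    """
    if not ids:
        return []
    # One (position, id) row per list entry forms the helper table.
    values = ", ".join("(?, ?)" for _ in ids)
    params = [x for pos, pid in enumerate(ids) for x in (pos, pid)]
    sql = (
        "WITH wanted(pos, id) AS (VALUES {}) "
        "SELECT p.name FROM wanted "
        "INNER JOIN provinces AS p ON p.province_id = wanted.id "
        "ORDER BY wanted.pos"
    ).format(values)
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE provinces (province_id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO provinces VALUES (?, ?)",
                     [(1, "name1"), (2, "name2"), (3, "name3")])
    print(names_in_list_order(conn, [1, 3, 2, 1]))
```

Because the join is driven by the helper rows, an id that appears twice in the list produces two output rows.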
To get the same row more than once you need to join in another table. I suggest creating, only once(!), a helper table. This table will just contain a series of natural numbers (1, 2, 3, 4, ... etc). Such a table can be useful for many other purposes.
Here is the script to create it:
create table seq (num int);
insert into seq values (1),(2),(3),(4),(5),(6),(7),(8);
insert into seq select num+8 from seq;
insert into seq select num+16 from seq;
insert into seq select num+32 from seq;
insert into seq select num+64 from seq;
/* continue doubling the number of records until you feel you have enough */
For the task at hand it is not necessary to add many records, as you only need to make sure you never have more repetitions in your in condition than in the above seq table. I guess 128 will be good enough, but feel free to double the number of records a few times more.
Once you have the above, you can write queries like this:
select province_id,
name,
@pos := instr(@in2 := insert(@in2, @pos+1, 1, '#'),
concat(',',province_id,',')) ord
from (select @in := '0,1,2,3,1,0', @in2 := @in, @pos := 10000) init
inner join provinces
on find_in_set(province_id, @in)
inner join seq
on num <= length(replace(@in, concat(',',province_id,','),
concat(',+',province_id,',')))-length(@in)
order by ord asc
Output for the sample data and sample in list:
| province_id | name | ord |
|-------------|--------|-----|
| 1 | name 1 | 2 |
| 2 | name 2 | 4 |
| 3 | name 3 | 6 |
| 1 | name 1 | 8 |
SQL Fiddle
How it works
You need to put the list of values in the assignment to the variable @in. For it to work, every valid id must be wrapped between commas, so that is why there is a dummy zero at the start and the end.
By joining in the seq table the result set can grow. The number of records joined in from seq for a particular provinces record is equal to the number of occurrences of the corresponding province_id in the list @in.
There is no out-of-the-box function to count the number of such occurrences, so the expression at the right of num <= may look a bit complex. But it just adds a character for every match in @in and checks how much the length grows by that action. That growth is the number of occurrences.
In the select clause the position of the province_id in the @in list is returned and used to order the result set, so it corresponds to the order in the @in list. In fact, the position is taken with reference to @in2, which is a copy of @in, but is allowed to change:
While this @pos is being calculated, the number at the previous found @pos in @in2 is destroyed with a # character, so the same province_id cannot be found again at the same position.
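The counting trick (replace each match with a string one character longer, then measure how much the length grew) is easier to follow in isolation. Here is a minimal Python sketch of just that step; the function name count_occurrences is hypothetical and the code only imitates the length(replace(...)) - length(...) expression, it does not run the MySQL query.

```python
def count_occurrences(in_list, province_id):
    """Count how often ',id,' occurs by measuring growth after a replace."""
    needle = ",{},".format(province_id)
    grown = in_list.replace(needle, ",+{},".format(province_id))
    # Every match added exactly one '+' character.
    return len(grown) - len(in_list)


if __name__ == "__main__":
    sample = "0,1,2,3,1,0"   # dummy zeros wrap every real id in commas
    for pid in (1, 2, 3, 7):
        print(pid, count_occurrences(sample, pid))
```

In this Python version, two adjacent copies of the same id (for example 1,1) share a comma, so the non-overlapping replace counts only one of them; whether the SQL expression behaves the same way was not checked here.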
It's unclear exactly what you want, but here's why it's not working the way you expect. The IN keyword is shorthand for a statement like ....Where province_id = 1 OR province_id = 2 OR province_id = 3 OR province_id = 1. Since province_id = 1 is evaluated as true at the beginning of that statement, it doesn't matter that it is included again later; it is already true. Repeating a value in the list therefore has no bearing on whether the result returns a duplicate.

Search text within Varchar(max) column of Sql server

I want to write a T-SQL query that extracts values from within a column of a SQL Server table.
Example,
CREATE TABLE Transactions (Details varchar(max));
The Details column stores strings like the following:
ID=124|NAME=JohnDoe|DATE=020620121025|ISPRIMARY=True|
TRANSACTION_AMOUNT=124.36|DISCOUNT_AMOUNT=10.00|STATE=GA|
ADDR1=test|ADDR2=test22|OTHER=OtherDetailsHere
ID=6257|NAME=michael|DATE=050320111255|ISPRIMARY=False|
TRANSACTION_AMOUNT=4235.00|DISCOUNT_AMOUNT=33.25|STATE=VA|
ADDR1=test11|ADDR2=test5|OTHER=SomeOtherDetailsHere
The objective is to write a query that produces the output below:
Name | Transaction Amount | Discount
-------------------------------------------
JohnDoe | 124.36 | 10.00
michael | 4235.00 | 33.25
Any help would be highly appreciated.
Thanks,
Joe
Why are you storing your data pipe delimited in a single column -- these fields should be added as columns to the table.
However, if that isn't an option, you'll need to use string manipulation. Here's one option using a couple Common Table Expressions, along with SUBSTRING and CHARINDEX:
WITH CTE1 AS (
SELECT
SUBSTRING(Details,
CHARINDEX('|NAME=', DETAILS) + LEN('|NAME='),
LEN(Details)) NAME,
SUBSTRING(Details,
CHARINDEX('|TRANSACTION_AMOUNT=', DETAILS) + LEN('|TRANSACTION_AMOUNT='),
LEN(Details)) TRANSACTION_AMOUNT,
SUBSTRING(Details,
CHARINDEX('|DISCOUNT_AMOUNT=', DETAILS) + LEN('|DISCOUNT_AMOUNT='),
LEN(Details)) DISCOUNT_AMOUNT
FROM Transactions
), CTE2 AS (
SELECT
SUBSTRING(NAME,1,CHARINDEX('|',NAME)-1) NAME,
SUBSTRING(TRANSACTION_AMOUNT,1,CHARINDEX('|',TRANSACTION_AMOUNT)-1) TRANSACTION_AMOUNT,
SUBSTRING(DISCOUNT_AMOUNT,1,CHARINDEX('|',DISCOUNT_AMOUNT)-1) DISCOUNT_AMOUNT
FROM CTE1
)
SELECT *
FROM CTE2
SQL Fiddle Demo
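If the strings can be processed outside the database, the same extraction is short in application code. The Python sketch below is an alternative to the T-SQL above, not a translation of it: it splits a Details value on | and =, then picks the three requested fields. The function names parse_details and summarize are hypothetical.

```python
def parse_details(details):
    """Split 'KEY=value|KEY=value|...' into a dictionary."""
    fields = {}
    for item in details.split("|"):
        key, sep, value = item.partition("=")
        if sep:  # skip empty or malformed pieces that contain no '='
            fields[key.strip()] = value.strip()
    return fields


def summarize(details):
    """Return (name, transaction amount, discount) as strings, or None."""
    fields = parse_details(details)
    return (fields.get("NAME"),
            fields.get("TRANSACTION_AMOUNT"),
            fields.get("DISCOUNT_AMOUNT"))


if __name__ == "__main__":
    row = ("ID=124|NAME=JohnDoe|DATE=020620121025|ISPRIMARY=True|"
           "TRANSACTION_AMOUNT=124.36|DISCOUNT_AMOUNT=10.00|STATE=GA|"
           "ADDR1=test|ADDR2=test22|OTHER=OtherDetailsHere")
    print(summarize(row))
```

Unlike the CHARINDEX approach, this does not depend on the order of the fields or on a trailing | after each value.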