I am trying to unpivot some data in SSRS but am struggling to get the format I want. My current table and data are shown below:
CREATE TABLE [dbo].sampledata(
[DDMMMYY] [date] NULL,
[DayN] [nvarchar](4000) NULL,
[CArticle] [int] NULL,
[TU] [int] NULL,
[Pieces] [int] NULL,
[ActualSpace] [int] NULL,
[InternalCore] [int] NULL,
[QuarantinedStock] [int] NULL,
[AvailableSpace] [int] NULL
)
insert into sampledata (DDMMMYY, DayN, CArticle, TU, Pieces, ActualSpace, InternalCore, QuarantinedStock, AvailableSpace)
VALUES
('2019-09-13','Fri','848','20403','1249144','59790','17652','433','0'),
('2019-09-16','Mon','878','21328','1253811','63133','18908','429','0'),
('2019-09-17','Tue','892','21106','1253607','61770','18780','402','0'),
('2019-09-18','Wed','910','20948','1250381','61543','18485','955','0'),
('2019-09-19','Thu','863','20351','1243131','60235','18845','627','0'),
('2019-09-20','Fri','847','19923','1242594','59565','19460','1385','0'),
('2019-09-23','Mon','863','20862','1254736','62773','18362','1418','0'),
('2019-09-24','Tue','860','20592','1259028','62864','19972','1422','0'),
('2019-09-25','Wed','871','20646','1273306','63079','20498','1430','0'),
('2019-09-26','Thu','875','20424','1264449','61508','20040','1430','0'),
('2019-09-27','Fri','884','20581','1277418','62128','20287','1430','0'),
('2019-09-30','Mon','873','21684','1341305','66764','22666','1266','0');
I will never be returning more than 31 days worth of data.
What I want to do is unpivot the data but keep the grouping by date. My data should look like this when finished:
I want the dates to run across the top of my data and the headings (TU, Pieces, etc.) to run down the left side. Initially I used UNPIVOT, but as there are multiple days, the data did not stretch across to the right by day as I wanted.
I have tried using an SSRS matrix but still struggle to get the desired output.
Any help would be much appreciated.
Here's how I would UNPIVOT your data:
SELECT DDMMMYY, DayN, TypeCount, DataType
FROM #sampledata
UNPIVOT (TypeCount FOR DataType IN (CArticle
,TU
,Pieces
,ActualSpace
,InternalCore
,QuarantinedStock
,AvailableSpace )
) u
Personally, I would just use a DATE field and let SSRS figure out what day it is (Mon, Tue...) rather than have a DayN field.
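As a small illustration of that last point, the day abbreviation can be derived from the date at query time instead of being stored. This is only a sketch, assuming SQL Server and an English session language (DATENAME returns localized names); the inline VALUES rows stand in for the real table:

```sql
-- Derive the three-letter day name from the date column itself.
-- The two sample dates are taken from the question's data.
SELECT d.DDMMMYY,
       LEFT(DATENAME(weekday, d.DDMMMYY), 3) AS DayN  -- e.g. 'Fri', 'Mon' under an English language setting
FROM (VALUES (CAST('2019-09-13' AS date)),
             (CAST('2019-09-16' AS date))) AS d(DDMMMYY);
```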
I have a database of phone call data from our phone system that I am trying to create a report on. These phone calls match up to a table of internal and external numbers. The report needs to try to match the phone call to an external number in our database first and if there is no match try to match it to an internal number.
I have created a sample data set and db-fiddle, and removed some data to hopefully explain it better:
CREATE TABLE `cdr` (
`callnumber` int(11) NOT NULL,
`origLegCallIdentifier` int(11) NOT NULL,
`dateTimeOrigination` datetime NOT NULL,
`callType` varchar(50) NOT NULL,
`chargeable` varchar(10) NOT NULL,
`callCharge` decimal(10,2) NOT NULL,
`origNodeId` int(11) NOT NULL,
`destLegIdentifier` int(11) NOT NULL,
`destNodeId` int(11) NOT NULL,
`callingPartyNumber` varchar(50) NOT NULL,
`callingPartyNumberPartition` varchar(50) NOT NULL,
`callingPartyNumberState` varchar(10) NOT NULL,
`callingPartyNumberSite` varchar(30) NOT NULL,
`originalCalledPartyNumber` varchar(50) NOT NULL,
`originalCalledPartyNumberPartition` varchar(50) NOT NULL,
`finalCalledPartyNumber` varchar(50) NOT NULL,
`finalCalledPartyNumberPartition` varchar(50) NOT NULL,
`lastRedirectDn` varchar(50) NOT NULL,
`lastRedirectDnPartition` varchar(50) NOT NULL,
`dateTimeConnect` datetime DEFAULT NULL,
`dateTimeDisconnect` datetime NOT NULL,
`duration` int(11) NOT NULL,
`origDeviceName` varchar(129) NOT NULL,
`destDeviceName` varchar(129) NOT NULL,
`origIpv4v6Addr` varchar(64) NOT NULL,
`destIpv4v6Addr` varchar(64) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `cdr` (`callnumber`, `origLegCallIdentifier`, `dateTimeOrigination`, `callType`, `chargeable`, `callCharge`, `origNodeId`, `destLegIdentifier`, `destNodeId`, `callingPartyNumber`, `callingPartyNumberPartition`, `callingPartyNumberState`, `callingPartyNumberSite`, `originalCalledPartyNumber`, `originalCalledPartyNumberPartition`, `finalCalledPartyNumber`, `finalCalledPartyNumberPartition`, `lastRedirectDn`, `lastRedirectDnPartition`, `dateTimeConnect`, `dateTimeDisconnect`, `duration`, `origDeviceName`, `destDeviceName`, `origIpv4v6Addr`, `destIpv4v6Addr`) VALUES
(52004, 69637277, '2020-08-31 03:05:05', 'outbound-national', 'yes', '0.00', 4, 69637278, 4, '6220', 'PT_INTERNAL', 'NSW', 'Site A', '0412345678', 'PT_NATIONAL_TIME_RESTRICT', '0412345678', 'PT_NATIONAL_TIME_RESTRICT', '0412345678', 'PT_NATIONAL_TIME_RESTRICT', NULL, '2020-08-31 03:05:08', 0, 'SEP00XXXXX', 'XXXXX', '1.1.1.1', '1.1.1.1');
CREATE TABLE `numbers` (
`numberid` int(11) NOT NULL,
`number` varchar(30) NOT NULL,
`memberid` int(11) NOT NULL,
`type` enum('internal','external') NOT NULL,
`description` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `numbers` (`numberid`, `number`, `memberid`, `type`, `description`) VALUES
(1555, '0412345678', 436, 'internal', ''),
(1556, '6220', 437, 'external', '');
https://www.db-fiddle.com/f/ofH6sENoce8tGVsoxMejwZ/1
The above example shows how it ends up with a duplicate for a single record because it matches 6220 as the callingPartyNumber and 0412345678 as the finalCalledPartyNumber in each respective select.
This is an example of what I want to see (union has been removed):
https://www.db-fiddle.com/f/bVSWESvnKJKvuNefLqH4aU/0
I want a single record for when it either matches a finalCalledPartyNumber first or then a callingPartyNumber. Records that don't match anything will not be shown.
Updated select using Caius's example
SELECT
DATE(CONVERT_TZ(cdr.dateTimeOrigination,'+00:00',@@global.time_zone)) as 'Date',
TIME(CONVERT_TZ(cdr.dateTimeOrigination,'+00:00',@@global.time_zone)) as 'Time',
cdr.callType,
cdr.callingPartyNumberState,
cdr.callingPartyNumber,
COALESCE(finalcalledparty.memberid, callingparty.memberid, originalcalledparty.memberid, 'No Match') as MemberID,
cdr.originalCalledPartyNumber,
cdr.finalCalledPartyNumber,
CONCAT(MOD(HOUR(SEC_TO_TIME(cdr.duration)), 24), ':', LPAD(MINUTE(SEC_TO_TIME(cdr.duration)),2,0), ':', LPAD(second(SEC_TO_TIME(cdr.duration)),2,0)) as 'duration',
cdr.callCharge
FROM `cdr`
LEFT JOIN numbers finalcalledparty ON finalcalledparty.number = cdr.finalCalledPartyNumber
LEFT JOIN numbers callingparty ON callingparty.number = cdr.callingPartyNumber
LEFT JOIN numbers originalcalledparty ON originalcalledparty.number = cdr.OriginalCalledPartyNumber
WHERE (cdr.callType LIKE '%outbound%' OR cdr.callType LIKE '%transfer%' OR cdr.callType LIKE '%forward%')
ORDER BY Date DESC, Time DESC
Select with members table join
SELECT
DATE(CONVERT_TZ(cdr.dateTimeOrigination,'+00:00',@@global.time_zone)) as 'Date',
TIME(CONVERT_TZ(cdr.dateTimeOrigination,'+00:00',@@global.time_zone)) as 'Time',
cdr.callType,
'Calling' as ChargeType,
cdr.callingPartyNumberState,
cdr.callingPartyNumber,
COALESCE(finalcalledmember.name, callingmember.name, 'No Match') as MemberName,
cdr.finalCalledPartyNumber,
CONCAT(MOD(HOUR(SEC_TO_TIME(cdr.duration)), 24), ':', LPAD(MINUTE(SEC_TO_TIME(cdr.duration)),2,0), ':', LPAD(second(SEC_TO_TIME(cdr.duration)),2,0)) as 'duration',
cdr.callCharge
FROM `cdr`
LEFT JOIN numbers callingparty ON callingparty.number = cdr.callingPartyNumber
LEFT JOIN numbers finalcalledparty ON finalcalledparty.number = cdr.finalCalledPartyNumber
LEFT JOIN members callingmember ON callingmember.memberid = callingparty.memberid
LEFT JOIN members finalcalledmember ON finalcalledmember.memberid = finalcalledparty.memberid
WHERE (callType LIKE '%outbound%' OR callType LIKE '%transfer%' OR callType LIKE '%forward%') AND DATE(CONVERT_TZ(cdr.dateTimeOrigination,'+00:00',@@global.time_zone)) = '2020-09-01'
ORDER BY Date DESC, Time DESC
The report needs to try to match the phone call to an external number in our database first and if there is no match try to match it to an internal number.
You can use a pair of left joins for this. Here's a simpler dataset:
Person, Number
John, e1
James, i2
Jenny, x3
ExternalNumber, Message
e1, Hello
InternalNumber, Message
i2, Goodbye
SELECT p.Person, COALESCE(e.Message, i.Message, 'No Match')
FROM
Person p
LEFT JOIN Externals e ON p.Number = e.ExternalNumber
LEFT JOIN Internal i ON p.Number = i.InternalNumber
Results:
John, Hello
James, Goodbye
Jenny, No Match
A few things you need to appreciate about SQL in general:
A UNION makes a dataset grow taller (more rows)
A JOIN makes a dataset grow wider (more columns)
It is easy to compare things on the same row, more difficult to compare things on different rows
There isn't exactly a concept of "doing something now" and "doing something later" - i.e. your "try to match it to external first and if that doesn't work try match it to internal" isn't a good way to think about the problem, mentally. The SQL way would be to "match it to external and match it to internal, then preferentially pick the external match, then the internal match, then maybe no match"
COALESCE takes a list of arguments and, working left to right, returns the first one that isn't null. Coupled with LEFT JOIN putting nulls when the match fails, it means we can use it to prefer external matches over internal
Because it's easier to compare things on the same row, we just try to match the data against the external and internal numbers tables as a direct operation. We use LEFT JOIN so that if the match doesn't work out, at least it doesn't cause the row to disappear.
So you join both numbers tables in and the matches either work out for external (and you will pick external), work out for internal but not external (and you will pick internal), work out for both int and ext (and you will pick ext over int), or don't work out (and you might have a message to say No Match)
It should be pointed out that the COALESCE approach only really works well if the data won't naturally contain nulls. If the data looked like this:
Person, Number
John, e1
James, i2
Jenny, x3
ExternalNumber, Message
e1, NULL
InternalNumber, Message
i2, Goodbye
Then this will be the result:
John, No Match
James, Goodbye
Jenny, No Match
Even though the join to the external table succeeded for John, the NULL in its Message column makes COALESCE skip straight past it to the next non-null argument. Here that is 'No Match' (and it would be the internal Message if an internal row also happened to match), which might not be correct. We can solve this by using CASE WHEN instead, to test for a column that definitely won't be null when a record matches:
CASE
WHEN e.ExternalNumber IS NOT NULL THEN e.Message
WHEN i.InternalNumber IS NOT NULL THEN i.Message
ELSE 'No Match'
END
Because we test the column that is the key for the join the only way we can get a null there is when the join fails to find a match.
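To try this out, here is a self-contained sketch of the simplified example with the NULL message in place. The table definitions and column types are assumptions made for illustration; only the table and column names come from the example above:

```sql
-- Hypothetical table definitions for the simplified data set
CREATE TABLE Person    (Person VARCHAR(20), Number VARCHAR(10));
CREATE TABLE Externals (ExternalNumber VARCHAR(10), Message VARCHAR(20));
CREATE TABLE Internal  (InternalNumber VARCHAR(10), Message VARCHAR(20));

INSERT INTO Person    VALUES ('John', 'e1'), ('James', 'i2'), ('Jenny', 'x3');
INSERT INTO Externals VALUES ('e1', NULL);       -- matched row whose Message is NULL
INSERT INTO Internal  VALUES ('i2', 'Goodbye');

SELECT p.Person,
       CASE
         WHEN e.ExternalNumber IS NOT NULL THEN e.Message  -- external match wins, even if Message is NULL
         WHEN i.InternalNumber IS NOT NULL THEN i.Message  -- otherwise fall back to the internal match
         ELSE 'No Match'
       END AS Message
FROM Person p
LEFT JOIN Externals e ON p.Number = e.ExternalNumber
LEFT JOIN Internal  i ON p.Number = i.InternalNumber;
```

With this data John comes back with a NULL Message (his external match is honoured), James with Goodbye, and Jenny with No Match.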
I'm far from a MySQL expert, and I'm struggling with a relatively complicated query.
I have two tables:
A Data table with columns as follows:
`Location` bigint(20) unsigned NOT NULL,
`Source` bigint(20) unsigned NOT NULL,
`Param` bigint(20) unsigned NOT NULL,
`Type` bigint(20) unsigned NOT NULL,
`InitTime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`ValidTime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`Value` double DEFAULT NULL
A Location Group table with columns as follows:
`Group` bigint(20) unsigned NOT NULL,
`Location` bigint(20) unsigned NOT NULL,
The data table stores data of interest, where each 'value' is valid for a particular 'validtime'. However, the data in the table comes from a calculation which is run periodically. The initialisation time at which the calculation is run is stored in the 'inittime' field. A given calculation with particular inittime may result in, say 10 values being output with valid times (A - J). A more recent calculation, with a more recent inittime, may result in another 10 values being output with valid times (B - K). There is therefore an overlap in available values. I always want a result set of Values and ValidTimes for the most recent inittime (i.e. max(inittime)).
I can determine the most recent inittime using the following query:
SELECT MAX(InitTime)
FROM Data
WHERE
Location = 100060 AND
Source = 10 AND
Param = 1 AND
Type = 1;
This takes 0.072 secs to execute.
However, using this as a sub-query to retrieve data from the Data table results in an execution time of 45 seconds (it's a pretty huge table, but not super ridiculous).
Sub-Query:
SELECT Location, ValidTime, Value
FROM Data data
WHERE Source = 10
AND Location IN (SELECT Location FROM LocationGroup WHERE `Group` = 3)
AND InitTime = (SELECT max(data2.InitTime) FROM Data data2 WHERE data.Location = data2.Location AND data.Source = data2.Source AND data.Param = data2.Param AND data.Type = data2.Type)
ORDER BY Location, ValidTime ASC;
(Snipped ValidTime qualifiers for brevity)
I know there's likely some optimisation that would help here, but I'm not sure where to start. Instead, I created a stored procedure to effectively perform the MAX(InitTime) query, but because the MAX(InitTime) is determined by a combo of Location, Source, Param and Type, I need to pass in all the locations that comprise a particular group. I implemented a cursors-based stored procedure for this before realising there must be an easier way.
Putting aside the question of optimisation via indices, how could I efficiently perform a query on the data table using the most recent InitTime for a given location group, source, param and type?
Thanks in advance!
MySQL can do a poor job optimizing IN with a subquery (sometimes). Also, indexes might be able to help. So, I would write the query as:
SELECT d.Location, d.ValidTime, d.Value
FROM Data d
WHERE d.Source = 10 AND
EXISTS (SELECT 1 FROM LocationGroup lg WHERE d.Location = lg.Location and lg.`Group` = 3) AND
d.InitTime = (SELECT max(d2.InitTime)
FROM Data d2
WHERE d.Location = d2.Location AND
d.Source = d2.Source AND
d.Param = d2.Param AND
d.Type = d2.Type
)
ORDER BY d.Location, d.ValidTime ASC;
For this query, you want indexes on data(Location, Source, Param, Type, InitTime) and LocationGroup(Location, Group), and data(Source, Location, ValidTime).
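As a rough sketch, those three indexes could be created as follows. The index names are made up for illustration, and the table is assumed to be called LocationGroup as in the query above; `Group` is a reserved word in MySQL, so it is quoted with backticks:

```sql
-- Supports the correlated MAX(InitTime) lookup per Location/Source/Param/Type
CREATE INDEX idx_data_latest_init
    ON Data (`Location`, `Source`, `Param`, `Type`, `InitTime`);

-- Supports the EXISTS check against the location group
CREATE INDEX idx_locationgroup_loc_grp
    ON LocationGroup (`Location`, `Group`);

-- Supports the outer filter on Source and the ORDER BY Location, ValidTime
CREATE INDEX idx_data_source_loc_valid
    ON Data (`Source`, `Location`, `ValidTime`);
```

Running EXPLAIN on the query before and after adding them should show whether the optimizer actually picks them up.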
I'm writing a report in SSRS which requires a search parameter to filter the report. The parameter is set up to allow nulls by default, which should let the report run as normal.
The report runs, however nothing gets returned until I add something into the search parameter.
Is it possible to use an IIF expression to say that if the parameter is null, run the report as normal?
Here is the query I'm using to generate the dataset within SSRS.
CREATE TABLE #StartDateTable(
stSecurityType varchar(10) NOT NULL,
stSecuritySymbol varchar(50) NOT NULL,
stPrice float NOT NULL,
stSecurityID int NOT NULL,
stPriceDate date NOT NULL
)
INSERT INTO #StartDateTable (stSecurityType, stSecuritySymbol, stPrice, stSecurityID, stPriceDate )
SELECT DISTINCT
Instruments.SecurityType, Instruments.SecuritySymbol,
InstrumentPrice.Price, InstrumentPrice.SecurityID, InstrumentPrice.PriceDate
FROM
InstrumentPrice
JOIN
Instruments ON Instruments.ID = InstrumentPrice.SecurityID
WHERE
InstrumentPrice.PriceDate = @StartDate;
CREATE TABLE #EndDateTable
(
etSecurityType varchar(10) NOT NULL,
etSecuritySymbol varchar(50) NOT NULL,
etPrice float NOT NULL,
etSecurityID int NOT NULL,
etPriceDate date NOT NULL
)
INSERT INTO #EndDateTable (etSecurityType, etSecuritySymbol, etPrice, etSecurityID, etPriceDate)
SELECT DISTINCT
Instruments.SecurityType, Instruments.SecuritySymbol,
InstrumentPrice.Price, InstrumentPrice.SecurityID,
InstrumentPrice.PriceDate
FROM
InstrumentPrice
JOIN
Instruments ON Instruments.ID = InstrumentPrice.SecurityID
WHERE
InstrumentPrice.PriceDate = @EndDate;
SELECT *
FROM #StartDateTable
LEFT JOIN #EndDateTable ON #EndDateTable.etSecurityID = #StartDateTable.stSecurityID
I set up the search parameter as a filter on the dataset within SSRS with a LIKE operator, as I want it to act as a wildcard search.
There are two ways to do this: either filter in the dataset query with a WHERE clause, or use a group filter as described below.
In the group properties, add a filter with the Between operator and use an expression for each bound.
For example: =IIF(IsNothing(Parameters!Param1.Value) = true, 0, Parameters!Param1.Value) and =IIF(IsNothing(Parameters!Param1.Value) = true, 999999, Parameters!Param1.Value)
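For the first option (filtering in the dataset query), a common pattern is to let a NULL parameter switch the predicate off. The following is only a sketch: the @Search parameter name is hypothetical, and the inline VALUES rows stand in for the stSecuritySymbol column of the real result set.

```sql
DECLARE @Search varchar(50) = NULL;  -- hypothetical parameter; also try = 'AB'

SELECT s.stSecuritySymbol
FROM (VALUES ('ABC'), ('ABD'), ('XYZ')) AS s(stSecuritySymbol)
WHERE @Search IS NULL                               -- no search term: return every row
   OR s.stSecuritySymbol LIKE '%' + @Search + '%';  -- otherwise do a wildcard match
```

In the SSRS dataset the DECLARE line would be dropped and @Search mapped to the report parameter, so a null parameter returns all rows and any other value narrows the result.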
I wrote a fairly simple stored procedure that accepts some data, selects a value, inserts a couple records and then returns that value. But execution is too long in our production environment where I might eventually want it to run a few hundred thousand times in a day and it's adversely affecting other processes even when we're only running it say 30000 times.
I started by looking at the queries and adding an index on the date field that's used in where clauses. Then I ran SQL Server Profiler -- feeding the results into Tuning Advisor and implementing the indexing suggestions that it came up with. In the past, I've seen that tool call for really ugly indexes but this time it just wanted a single addition which made sense, so I added it. Each of those steps helped. But still too slow.
It was easy to figure out that the first query was the holdup, not the two inserts, which are nearly instantaneous. So here's what I had at that time, including subquery run-times:
--all combined, this typically takes in the range of 1200-1500 ms but occasionally spikes up to ~2200 ms
select coalesce(
(
--when the following is run, this takes ~690 ms
select MIN(maxes.imbsn)
from (
--when the following is run without the higher limiting scopes, this takes ~3600 ms
select imbsn, MAX(assignmentDate) maxAD
from imbsnAssignments
group by imbsn
) maxes
where datediff(d, maxes.maxAD, GETDATE()) > 90
)
,
(
--this is rarely executed but takes ~0 ms when it is
select max(imbsn)+1 from imbsnAssignments
)
)
Based on those times, it seemed like the coalesce was mucking things up (this is a thing that I imagine I could verify with the execution plan if I had ever figured out how to read it, but I haven't -- plans remain largely opaque to my poor brain). To get rid of the coalesce, I changed up the query to:
--this runs ~480-700 ms
select MIN(maxes.imbsn)
from (
select imbsn, MAX(assignmentDate) maxAD
from imbsnAssignments
group by imbsn
union
select max(imbsn)+1, getDate()
from imbsnAssignments
) maxes
where datediff(d, maxes.maxAD, GETDATE()) > 90
which is a big improvement. But it's still pretty slow.
I verified that Profiler-Tuning Advisor still didn't want me to make any changes before coming here to ask you all about it. Which I think leaves me with two approaches: 1) maintain the basic algorithm but squeeze greater efficiency out of it, or 2) switch to some smarter way of obtaining the same basic effect through methods that I don't know of but that will be obvious to one of you big-brains who sniff out that I'm engaged in some kind of anti-pattern here.
Thanks in advance for your time and attention!
I'm not precisely sure what the expected format for this additional information is, but I'll give it a try. The table is:
CREATE TABLE [dbo].[imbsnAssignments](
[id] [int] IDENTITY(1,1) NOT NULL,
[imbsn] [int] NOT NULL,
[assignmentDate] [date] NOT NULL,
[jobCode] [varchar](10) NOT NULL,
[name] [varchar](45) NOT NULL,
[a1] [varchar](45) NOT NULL,
[a2] [varchar](45) NOT NULL,
[a3] [varchar](45) NOT NULL,
[a4] [varchar](45) NOT NULL,
[city] [varchar](40) NOT NULL,
[state] [char](10) NOT NULL,
[zip] [varchar](10) NOT NULL,
[batchIdent] [varchar](256) NOT NULL,
CONSTRAINT [PK_imbsnAssignments] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
and I suspect this isn't the 'right' way to show the indexes but:
TableName IndexName IndexType
imbsnAssignments PK_imbsnAssignments CLUSTERED
imbsnAssignments IX_imbsnAssignments_assignmentDate NONCLUSTERED
imbsnAssignments _dta_index_imbsnAssignments_36_149575571__K2_3 NONCLUSTERED
Execution plan:
|--Stream Aggregate(DEFINE:([Expr1013]=MIN([partialagg1018])))
|--Concatenation
|--Stream Aggregate(DEFINE:([partialagg1018]=MIN([IMB].[dbo].[imbsnAssignments].[imbsn])))
| |--Filter(WHERE:([Expr1003]<dateadd(day,(-90),getdate())))
| |--Stream Aggregate(GROUP BY:([IMB].[dbo].[imbsnAssignments].[imbsn]) DEFINE:([Expr1003]=MAX([IMB].[dbo].[imbsnAssignments].[assignmentDate])))
| |--Index Scan(OBJECT:([IMB].[dbo].[imbsnAssignments].[_dta_index_imbsnAssignments_36_149575571__K2_3]), ORDERED FORWARD)
|--Stream Aggregate(DEFINE:([partialagg1018]=MIN([Expr1009])))
|--Compute Scalar(DEFINE:([Expr1009]=[Expr1008]+(1)))
|--Stream Aggregate(DEFINE:([Expr1008]=MAX([IMB].[dbo].[imbsnAssignments].[imbsn])))
|--Top(TOP EXPRESSION:((1)))
|--Index Scan(OBJECT:([IMB].[dbo].[imbsnAssignments].[_dta_index_imbsnAssignments_36_149575571__K2_3]), ORDERED BACKWARD)
First, I would suggest moving the where clause from your second example closer to the actual table, and checking the execution plan to see what it's doing. Rather than running DATEDIFF against every single maxAD value, calculate the date 90 days back once and compare against that. Also, let's change the union to a union all so it doesn't have to check for uniqueness between the datasets.
select MIN(maxes.imbsn)
from (
select imbsn, MAX(assignmentDate) maxAD
from imbsnAssignments
group by imbsn
having max(assignmentdate) < dateadd(d, -90, GETDATE())
union all
select max(imbsn)+1, getDate()
from imbsnAssignments
) maxes
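A variant of the same idea computes the cutoff once into a variable. This is a sketch under assumptions: the table variable and its sample rows are invented so the snippet runs on its own, and only the two relevant columns are included.

```sql
-- Stand-in for dbo.imbsnAssignments with made-up rows
DECLARE @imbsnAssignments TABLE (imbsn int NOT NULL, assignmentDate date NOT NULL);
INSERT INTO @imbsnAssignments (imbsn, assignmentDate) VALUES
    (1, DATEADD(day, -200, GETDATE())),
    (1, DATEADD(day,  -10, GETDATE())),  -- imbsn 1 was reused recently
    (2, DATEADD(day, -120, GETDATE())),  -- imbsn 2 was last used over 90 days ago
    (3, DATEADD(day,   -5, GETDATE()));

-- Compute the 90-day cutoff once instead of per row
DECLARE @cutoff date = DATEADD(day, -90, CAST(GETDATE() AS date));

SELECT MIN(c.imbsn) AS nextImbsn
FROM (
    SELECT imbsn
    FROM @imbsnAssignments
    GROUP BY imbsn
    HAVING MAX(assignmentDate) < @cutoff   -- only numbers idle for more than 90 days
    UNION ALL
    SELECT MAX(imbsn) + 1                  -- fallback: next unused number
    FROM @imbsnAssignments
) AS c;
```

With these sample rows the query returns 2, the lowest number whose latest assignment is more than 90 days old; if no number qualified, the fallback row would supply 4.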
I've got a user-defined function to split lists of integers into a table of values. I'm using this to parse input to select a set of records for a given set of types or statuses.
This works:
select * from RequestStatus
where RequestStatusUID in (select [value] from dbo.SplitIDs('1,2,3', ','))
This does not:
select * from Requests
where RequestStatusUID in (select [value] from dbo.SplitIDs('1,2,3', ','))
The Requests query returns the error "Conversion failed when converting the varchar value '1,2,3' to data type int." RequestStatusUID is an int column in both tables. Both explain plans look the same to me. The function works perfectly the same way in unrelated queries. So far as I can tell, it's only the Requests table that has the problem.
CREATE TABLE [dbo].[Requests] (
[RequestUID] int IDENTITY(1,1) NOT NULL,
[UserUID] int NOT NULL,
[LocationUID] int NOT NULL,
[DateOpened] date NULL,
[DateClosed] date NULL,
[RequestStatusUID] int NOT NULL,
[DiscussionUID] int NULL,
[RequestTypeUID] int NOT NULL,
[RequestNo] varchar(16) NOT NULL,
[LastUpdateUID] int NOT NULL,
[LastUpdated] date NOT NULL,
CONSTRAINT [PK_Requests] PRIMARY KEY NONCLUSTERED([RequestUID])
)
It does work if I use a different function that returns varchars and I convert the RequestStatusUID column to a varchar as well:
select * from Requests
where cast(RequestStatusUID as varchar(4)) in (select [value] from dbo.Split('1,2,3', ','))
For reference, the SplitIDs function I'm using (a modified version of Arnold Fribble's solution). The Split function is the same without the cast as int at the end:
ALTER FUNCTION [dbo].[SplitIDs] ( @str VARCHAR(MAX), @delim char(1)=',' )
RETURNS TABLE
AS
RETURN
(
with cte as (
select 0 a, 1 b
union all
select b, cast(charindex(@delim, @str, b) + 1 as int)
from cte
where b > a
)
select cast(substring(@str,a,
case when b > 1 then b-a-1 else len(@str) - a + 1 end) as int) [value]
from cte where a >0
)
I can use the convert-to-strings solution but I'd really like to know why this is failing in the first place.
I think you will find that this syntax performs a lot better:
SELECT r.* FROM dbo.Requests AS r
INNER JOIN dbo.SplitIDs('1,2,3', ',') AS s
ON r.RequestStatusUID = s.value;
The predicate still has a bunch of implicit converts, due to your function choice, but the join eliminates an expensive table spool. You may see even slightly better performance if you use a proper column list, limited to the actual columns you need, instead of using SELECT *. You should change this even if you do need all of the columns.
[Execution plan image: your IN () query, with an expensive table spool]
[Execution plan image: my JOIN version, where the cost is transferred to the scan you're doing anyway]
[Image: runtime metrics, based on a small number of rows of course]
The conversion errors seem to be stemming from the function. So I substituted my own (below). Even after adding the foreign key we didn't initially know about, I was unable to reproduce the error. I am not sure exactly what the problem is with the original function, but all those implicit converts it creates seem to cause an issue to the optimizer at some point. So I suggest this one instead:
CREATE FUNCTION dbo.SplitInts
(
@List VARCHAR(MAX),
@Delimiter VARCHAR(255) = ','
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT [value] = y.i.value('(./text())[1]', 'int')
FROM
(
SELECT x = CONVERT(XML, '<i>'
+ REPLACE(@List, @Delimiter, '</i><i>')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
So, seems to me like you want to get rid of the function.
Also, here is one way to use a join and still make the param optional:
DECLARE @param VARCHAR(MAX) = NULL;-- also try = '1,2,3';
SELECT r.*
FROM dbo.Requests AS r
LEFT OUTER JOIN dbo.SplitInts(@param, default) AS s
ON r.RequestStatusUID = s.value
WHERE (r.RequestStatusUID = s.value OR @param IS NULL);