Search text within Varchar(max) column of Sql server - sql-server-2008

I wanted to write a t-sql query which finds values within a column of a sql server table.
Example,
CREATE TABLE Transactions (Details varchar(max));
The Details column stores strings like the following:
ID=124|NAME=JohnDoe|DATE=020620121025|ISPRIMARY=True|
TRANSACTION_AMOUNT=124.36|DISCOUNT_AMOUNT=10.00|STATE=GA|
ADDR1=test|ADDR2=test22|OTHER=OtherDetailsHere
ID=6257|NAME=michael|DATE=050320111255|ISPRIMARY=False|
TRANSACTION_AMOUNT=4235.00|DISCOUNT_AMOUNT=33.25|STATE=VA|
ADDR1=test11|ADDR2=test5|OTHER=SomeOtherDetailsHere
The objective is to write a query that produces the output below:
Name | Transaction Amount | Discount
-------------------------------------------
JohnDoe | 124.36 | 10.00
michael | 4235.00 | 33.25
Any help would be highly appreciated.
Thanks,
Joe

Why are you storing pipe-delimited data in a single column? These fields should be separate columns in the table.
However, if that isn't an option, you'll need to use string manipulation. Here's one option using a couple of Common Table Expressions, along with SUBSTRING and CHARINDEX:
WITH CTE1 AS (
    -- Step 1: cut each string so that it starts right after the wanted key
    SELECT
        SUBSTRING(Details,
                  CHARINDEX('|NAME=', Details) + LEN('|NAME='),
                  LEN(Details)) AS NAME,
        SUBSTRING(Details,
                  CHARINDEX('|TRANSACTION_AMOUNT=', Details) + LEN('|TRANSACTION_AMOUNT='),
                  LEN(Details)) AS TRANSACTION_AMOUNT,
        SUBSTRING(Details,
                  CHARINDEX('|DISCOUNT_AMOUNT=', Details) + LEN('|DISCOUNT_AMOUNT='),
                  LEN(Details)) AS DISCOUNT_AMOUNT
    FROM Transactions
), CTE2 AS (
    -- Step 2: keep only the text before the next pipe
    SELECT
        SUBSTRING(NAME, 1, CHARINDEX('|', NAME) - 1) AS NAME,
        SUBSTRING(TRANSACTION_AMOUNT, 1, CHARINDEX('|', TRANSACTION_AMOUNT) - 1) AS TRANSACTION_AMOUNT,
        SUBSTRING(DISCOUNT_AMOUNT, 1, CHARINDEX('|', DISCOUNT_AMOUNT) - 1) AS DISCOUNT_AMOUNT
    FROM CTE1
)
SELECT *
FROM CTE2;
SQL Fiddle Demo
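The two-step logic of the query (jump past the '|KEY=' marker, then cut at the next pipe) may be easier to follow outside SQL. The following is only a rough Python sketch of that same idea, not a replacement for the T-SQL; the function name extract_field is hypothetical.

```python
def extract_field(details, key):
    """Return the value stored after '|KEY=' in a pipe-delimited string.

    Mirrors the SUBSTRING/CHARINDEX approach: locate the marker,
    skip past it, then stop at the next pipe (or the end of the string).
    """
    marker = "|" + key + "="
    start = details.find(marker)
    if start == -1:
        return None  # marker not present (the first field has no leading pipe)
    start += len(marker)
    end = details.find("|", start)
    if end == -1:
        end = len(details)  # last field: no trailing pipe
    return details[start:end]


sample = (
    "ID=124|NAME=JohnDoe|DATE=020620121025|ISPRIMARY=True|"
    "TRANSACTION_AMOUNT=124.36|DISCOUNT_AMOUNT=10.00|STATE=GA|"
    "ADDR1=test|ADDR2=test22|OTHER=OtherDetailsHere"
)

for key in ("NAME", "TRANSACTION_AMOUNT", "DISCOUNT_AMOUNT"):
    print(key, extract_field(sample, key))
```

As in the query, a key is only found when a pipe precedes it, so the very first field (ID in the sample) would need separate handling.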

Related

How to sum daily resetting data using MySQL

I am attempting to plot data cumulatively from a MySQL table which logs a value, resetting to 0 every day. After selecting the values using select * from table where DateTime BETWEEN DateA AND DateB, the data looks like this: current data. I would like the output to look like this: preferred data, ignoring the daily resets.
As I am a novice in SQL I was unable to find a solution to this. I did, however, obtain the correct output in Matlab using a for loop:
output = data;
for k=1:(size(data, 1)-1)
    % check if next value is smaller than current
    if data(k+1)<data(k)
        % add current value to all subsequent values
        output = output + (1:size(data, 1)>k)'.*data(k);
    end
end
I would like the final product to connect to a web page, so I am curious whether it would be possible to obtain a similar result using only SQL. While I have tried using SUM(), I have only been able to sum all values, but I need to add the last value of each day to all subsequent values.
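Using a CTE (available in MySQL 8.0 and later) and a self-join that compares dates, you can sum the values for each date and then build a running total of those daily sums.
Suppose table1 is defined as follows.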
create table table1 (col_date date, col_value int);
insert into table1 values
('2020-07-15',1000),
('2020-07-15',2000),
('2020-07-16',1000),
('2020-07-16',3000),
('2020-07-16',4000),
('2020-07-17',1000),
('2020-07-18',2000),
('2020-07-19',1000),
('2020-07-19',1000),
('2020-07-19',2000),
('2020-07-19',3000),
('2020-07-20',4000),
('2020-07-20',5000),
('2020-07-21',6000)
;
In this case, the query looks like this:
with cte1 as (
select col_date, sum(col_value) as col_sum from table1
where col_date between '2020-07-16' and '2020-07-20'
group by col_date
)
select a.col_date, max(a.col_sum), sum(b.col_sum)
from cte1 a inner join cte1 b on a.col_date >= b.col_date
group by a.col_date;
The output is below:
col_date |max(a.col_sum) |sum(b.col_sum)
2020-07-16 |8000 | 8000
2020-07-17 |1000 | 9000
2020-07-18 |2000 |11000
2020-07-19 |7000 |18000
2020-07-20 |9000 |27000
The max(a.col_sum) column is included only for reference; it shows each day's own total next to the running total.
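To make the self-join easier to check by hand, here is a small Python sketch of the same computation: sum per day, then accumulate the daily sums in date order. It is only an illustration under the assumption that dates are ISO-formatted strings; the function name running_daily_totals is hypothetical.

```python
from collections import defaultdict
from itertools import accumulate

# Same sample rows as table1 above: (col_date, col_value)
rows = [
    ("2020-07-15", 1000), ("2020-07-15", 2000),
    ("2020-07-16", 1000), ("2020-07-16", 3000), ("2020-07-16", 4000),
    ("2020-07-17", 1000),
    ("2020-07-18", 2000),
    ("2020-07-19", 1000), ("2020-07-19", 1000),
    ("2020-07-19", 2000), ("2020-07-19", 3000),
    ("2020-07-20", 4000), ("2020-07-20", 5000),
    ("2020-07-21", 6000),
]


def running_daily_totals(rows, start, end):
    """Return (date, daily_sum, running_total) for dates in [start, end]."""
    daily = defaultdict(int)
    for day, value in rows:
        if start <= day <= end:  # ISO dates compare correctly as strings
            daily[day] += value
    days = sorted(daily)
    sums = [daily[d] for d in days]
    return list(zip(days, sums, accumulate(sums)))


for line in running_daily_totals(rows, "2020-07-16", "2020-07-20"):
    print(line)
```

The three values per line correspond to col_date, max(a.col_sum) and sum(b.col_sum) in the query output.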

Oracle SQL when querying a range of data

I have a table that for an ID, will have data in several bucket fields. I want a function to pull out a sum of buckets, but the function parameters will include the start and end bucket field.
So, if I had a table like this:
ID Bucket0 Bucket30 Bucket60 Bucket90 Bucket120
10 5.00 12.00 10.00 0.0 8.00
If I send in the ID and the parameters Bucket0, Bucket0, it would return only the value in the Bucket0 field: 5.00
If I send in the ID and the parameters Bucket30, Bucket120, it would return the sum of the buckets from 30 to 120, or (12+10+0+8) 30.00.
Is there a nicer way to write this other than a huge ugly
if parameter1=bucket0 and parameter2=bucket0
then select bucket0
else if parameter1=bucket0 and parameter2=bucket1
then select bucket0 + bucket1
else if parameter1=bucket0 and parameter2=bucket2
then select bucket0 + bucket1 + bucket2
and so on?
The table already exists, so I don't have a lot of control over that. I can make my parameters for the function however I want. I can safely say that if a set of buckets are wanted, none in the middle will be skipped, so specifying start and end buckets would work. I could have a single comma delimited string of all buckets wanted.
It would have been better if your table had been normalised, like this:
id | bucket | value
---+-----------+------
10 | bucket000 | 5
10 | bucket030 | 12
10 | bucket060 | 10
10 | bucket090 | 0
10 | bucket120 | 8
Also, the buckets should have names that are easy to compare in ranges, so that bucket030 sorts between bucket000 and bucket120 in normal alphabetical order. That is not the case if you leave out the padded zeroes: as plain strings, bucket30 sorts after bucket120.
If the above normalisation is not possible, then use an unpivot clause to turn your current table into the structure depicted above:
select id, sum(value)
from (
select *
from mytable
unpivot (value for bucket_id in (bucket0 as 'bucket000',
bucket30 as 'bucket030',
bucket60 as 'bucket060',
bucket90 as 'bucket090',
bucket120 as 'bucket120'))
) normalised
where bucket_id between 'bucket000' and 'bucket060'
group by id
When you do this with parameter variables, make sure those parameters have the padded zeroes as well.
You could for instance ensure that as follows for parameter1:
if parameter1 like 'bucket%' then
parameter1 := 'bucket' || lpad(+substr(parameter1, 7), 3, '0');
end if;
...etc.
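The padding and range comparison can be illustrated with a short Python sketch. It assumes a row is available as a dictionary of column names to values; pad_bucket and sum_buckets are hypothetical names, and the Oracle query above remains the actual suggestion.

```python
def pad_bucket(name):
    """Normalise a bucket name: 'Bucket30' -> 'bucket030'."""
    prefix = "bucket"
    name = name.lower()
    if name.startswith(prefix):
        return prefix + name[len(prefix):].zfill(3)
    return name


def sum_buckets(row, first, last):
    """Sum all bucket columns whose padded name lies in [first, last]."""
    lo, hi = pad_bucket(first), pad_bucket(last)
    return sum(
        value
        for column, value in row.items()
        if column.lower().startswith("bucket") and lo <= pad_bucket(column) <= hi
    )


row = {"id": 10, "bucket0": 5.0, "bucket30": 12.0,
       "bucket60": 10.0, "bucket90": 0.0, "bucket120": 8.0}

print(sum_buckets(row, "Bucket0", "Bucket0"))
print(sum_buckets(row, "Bucket30", "Bucket120"))
```

Without the padding step, the string comparison would place bucket30 after bucket120 and the range filter would pick the wrong columns.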

select one row multiple time when using IN()

I have this query :
select
name
from
provinces
WHERE
province_id IN(1,3,2,1)
ORDER BY FIELD(province_id, 1,3,2,1)
The number of values in IN() is dynamic.
How can I get all rows, including duplicates (in this example, 1), in the given ORDER BY order?
The result should look like this:
name1
name3
name2
name1
Also, I shouldn't use UNION ALL like this:
select * from provinces WHERE province_id=1
UNION ALL
select * from provinces WHERE province_id=3
UNION ALL
select * from provinces WHERE province_id=2
UNION ALL
select * from provinces WHERE province_id=1
You need a helper table here. On SQL Server that can be something like:
SELECT name
FROM (Values (1),(3),(2),(1)) As list (id) --< List of values to join to as a table
INNER JOIN provinces ON province_id = list.id
Update: In MySQL, the approach described in "Split Comma Separated String Into Temp Table" can be used to split a string parameter into a helper table.
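The reason a helper table works is that a join produces one output row per row of the list, whereas IN only tests membership. A minimal Python sketch of that difference, with made-up sample data and a hypothetical function name names_for:

```python
# Hypothetical stand-in for the provinces table: province_id -> name
provinces = {1: "name1", 2: "name2", 3: "name3"}


def names_for(ids, table):
    """Inner join of the id list with the table.

    One output row per list entry, in list order; ids that are not
    in the table are dropped, as an inner join would drop them.
    """
    return [table[i] for i in ids if i in table]


print(names_for([1, 3, 2, 1], provinces))

# Membership test only (what IN does): each table row appears at most once
print([name for pid, name in provinces.items() if pid in (1, 3, 2, 1)])
```

Note that the sketch keeps the list order because it iterates over the list; a SQL query would still need an explicit ORDER BY to guarantee the order of its result.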
To get the same row more than once you need to join in another table. I suggest to create, only once(!), a helper table. This table will just contain a series of natural numbers (1, 2, 3, 4, ... etc). Such a table can be useful for many other purposes.
Here is the script to create it:
create table seq (num int);
insert into seq values (1),(2),(3),(4),(5),(6),(7),(8);
insert into seq select num+8 from seq;
insert into seq select num+16 from seq;
insert into seq select num+32 from seq;
insert into seq select num+64 from seq;
/* continue doubling the number of records until you feel you have enough */
For the task at hand it is not necessary to add many records: you only need to make sure that the same value is never repeated in your IN list more times than there are rows in the seq table. I guess 128 will be good enough, but feel free to double the number of records a few more times.
Once you have the above, you can write queries like this:
select province_id,
       name,
       @pos := instr(@in2 := insert(@in2, @pos+1, 1, '#'),
                     concat(',',province_id,',')) ord
from (select @in := '0,1,2,3,1,0', @in2 := @in, @pos := 10000) init
inner join provinces
        on find_in_set(province_id, @in)
inner join seq
        on num <= length(replace(@in, concat(',',province_id,','),
                                 concat(',+',province_id,',')))-length(@in)
order by ord asc
Output for the sample data and sample in list:
| province_id | name | ord |
|-------------|--------|-----|
| 1 | name 1 | 2 |
| 2 | name 2 | 4 |
| 3 | name 3 | 6 |
| 1 | name 1 | 8 |
SQL Fiddle
How it works
You need to put the list of values in the assignment to the variable @in. For it to work, every valid id must be wrapped between commas, which is why there is a dummy zero at the start and the end.
By joining in the seq table the result set can grow. The number of records joined in from seq for a particular provinces record is equal to the number of occurrences of the corresponding province_id in the list @in.
There is no out-of-the-box function to count the number of such occurrences, so the expression at the right of num <= may look a bit complex. But it just adds a character for every match in @in and checks how much the length grows by that action. That growth is the number of occurrences.
In the select clause the position of the province_id in the @in list is returned and used to order the result set, so it corresponds to the order in the @in list. In fact, the position is taken with reference to @in2, which is a copy of @in, but is allowed to change:
While this @pos is being calculated, the number at the previously found @pos in @in2 is destroyed with a # character, so the same province_id cannot be found again at the same position.
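The counting trick (insert one extra character per match and measure how much the string grows) can be tried in isolation. Below is a minimal Python sketch of that idea only; count_occurrences is a hypothetical name and the sketch says nothing about how MySQL evaluates the full query.

```python
def count_occurrences(in_list, province_id):
    """Count how often ',<id>,' occurs by measuring the length growth."""
    needle = f",{province_id},"
    # Each match gains exactly one '+' character
    grown = in_list.replace(needle, f",+{province_id},")
    return len(grown) - len(in_list)


in_list = "0,1,2,3,1,0"
for pid in (1, 2, 3, 5):
    print(pid, count_occurrences(in_list, pid))
```

In this sketch, adjacent repeats such as 1,1 share a comma and are counted only once, because str.replace does not match overlapping occurrences.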
It's unclear exactly what you want, but here's why it's not working the way you expect. The IN keyword is shorthand for a statement like ... WHERE province_id = 1 OR province_id = 2 OR province_id = 3 OR province_id = 1. Since province_id = 1 is already evaluated as true at the beginning of that statement, it doesn't matter that it is included again later. Listing a value twice has no bearing on whether the result returns a duplicate.

Counting comma separated values in TSQL

SCHEMA / DATA for TABLE :
SubscriberId NewsletterIdCsv
------------ ---------------
11 52,52,,52
We have this denormalized data, where I need to count the number of comma separated values, for which I am doing this :
SELECT SUM(len(newsletteridcsv) - len(replace(rtrim(ltrim(newsletteridcsv)), ',','')) +1) as SubscribersSubscribedtoNewsletterCount
FROM TABLE
WHERE subscriberid = 11
Result :
SubscribersSubscribedtoNewsletterCount
--------------------------------------
4
The problem is that some of our data has blanks or spaces between the comma-separated values. For the row above, the expected result is 3 (one of the values is blank), but the query returns 4. How can I exclude the blank values in my query?
EDIT :
DATA :
SubscriberId NewsletterIdCsv
------------ ---------------
11 52,52,,52
12 22,23
I need a cumulative SUM across all rows rather than a count per row, so for the data above I need a single final count, i.e. 5 in this case, excluding the blank value.
Here's one solution, although there may be a more efficient way:
SELECT A.[SubscriberId],
SUM(CASE WHEN Split.a.value('.', 'VARCHAR(100)') = '' THEN 0 ELSE 1 END) cnt
FROM
(
SELECT [SubscriberId],
CAST ('<M>' + REPLACE(NewsletterIdCsv, ',', '</M><M>') + '</M>' AS XML) AS String
FROM YourTable
) AS A
CROSS APPLY String.nodes ('/M') AS Split(a)
GROUP BY A.[SubscriberId]
And the SQL Fiddle.
Basically, it converts your NewsletterIdCsv field to XML and then uses CROSS APPLY to split the data into one row per value. Finally, a CASE expression maps blank values to 0 and all others to 1, and SUM adds them up per subscriber. Alternatively, you could probably build a UDF to do something similar.
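The split-then-count logic can be checked quickly in Python. This is only a sketch of the counting rule (split on commas, ignore entries that are empty or whitespace), with the sample data from the question and a hypothetical function name count_non_blank:

```python
def count_non_blank(csv):
    """Count comma-separated entries that are not empty or whitespace."""
    return sum(1 for item in csv.split(",") if item.strip())


# Sample data from the question: SubscriberId -> NewsletterIdCsv
data = {11: "52,52,,52", 12: "22,23"}

per_subscriber = {sid: count_non_blank(csv) for sid, csv in data.items()}
total = sum(per_subscriber.values())

print(per_subscriber)
print(total)
```

The per-subscriber counts correspond to the GROUP BY result of the query, and the total is the single cumulative figure asked for in the edit to the question.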

Unable to get an SQL Pivot to work

I have a temporary table that contains a part category and the associated part cost:
#part_costs
part_cat | part_cost
tire | 0
fuel | 24
wheel | 34
The number of rows and the values within #part_costs are dynamic. I am trying to create a pivot so I will have (doesn't matter the order of the columns):
tire | fuel | wheel
0 | 24 | 34
I created a table variable that holds the part category variable NVARCHAR(2000) that holds the names:
"[fuel],[tires],[wheel]"
So far for my pivot I have
SELECT [fuel],[tires],[wheel] FROM (SELECT part_cat, part_cost FROM #part_costs ) p PIVOT ( [part_cost] FOR part_category IN ( [fuel],[tires],[wheel] )) AS pvt
Yet I can't get this to work. I can run my stored procedure, that this is located in, yet when I execute my stored procedure I get the following error: Incorrect syntax near the keyword 'FOR'.
Previously I had select 'part_cost' as costs, @part_names from (select part_cat, part_cost from #part_costs) as p PIVOT (part_cost for part_cat in (@part_names)) as PivotTable, though with this I wasn't even able to run my stored procedure, as I got Incorrect syntax near the keyword 'for'.
You have to use an aggregate function (e.g. AVG, MAX) before the FOR. Also make sure to list the column value names correctly, i.e. tires is not the same as tire, and to pivot on the column that actually exists (part_cat, not part_category).
SELECT [fuel],[tire],[wheel]
FROM (
SELECT part_cat, part_cost
FROM #part_costs
) p
PIVOT
(
Avg([part_cost]) FOR part_cat IN ( [fuel],[tire],[wheel] )
) AS pvt
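Why PIVOT insists on an aggregate becomes clearer when the operation is written out by hand: several rows may share a category, so each output column needs a rule for combining them. A minimal Python sketch, assuming the rows are (part_cat, part_cost) pairs; the function name pivot is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pivot(rows, aggregate=mean):
    """Turn (category, cost) rows into one record with a column per category.

    All costs of a category are collected first, then reduced with the
    aggregate function, which plays the role of AVG/MAX before FOR.
    """
    grouped = defaultdict(list)
    for category, cost in rows:
        grouped[category].append(cost)
    return {category: aggregate(costs) for category, costs in grouped.items()}


part_costs = [("tire", 0), ("fuel", 24), ("wheel", 34)]
print(pivot(part_costs))

# With repeated categories the choice of aggregate matters
print(pivot([("fuel", 20), ("fuel", 30)], aggregate=max))
```

With exactly one row per category, as in the sample table, AVG, MAX and SUM all return the same value, so any of them can be used in the query.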