Remove [" from a string in SSIS using a derived column - ssis

I am trying to remove [" from the beginning of a string and "] from the end by using the REPLACE function in a derived column, but it gives an error.
I used the formula below to remove the [" at the beginning of the string, but it does not work:
REPLACE(columnanme,"["","")
Can someone help me with this?
Note: The data is in a table and the data type is NTEXT.
Regards,
Khatija

I believe you just need to escape the " character with a backslash, like this: \"
REPLACE(columnanme,"[\"","")
Otherwise the expression treats the " in the middle as the closing quote, and the statement is invalid.

I am trying to remove [" from beginning of the string and "] end of the string
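Supposing that brackets and quotes reliably wrap the data, the simplest approach is to use SUBSTRING. This is easier to do in SQL:
-- LEN does not accept NTEXT, so cast to NVARCHAR(MAX) first.
-- [[] matches a literal [, which LIKE would otherwise read as the start of a character set.
UPDATE myTable SET columnname = SUBSTRING(CAST(columnname AS NVARCHAR(MAX)), 3, LEN(CAST(columnname AS NVARCHAR(MAX))) - 4)
WHERE columnname LIKE '[[]"%"]'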
If you want to do this in SSIS, you'll need to use a Script Component transformation to avoid data loss when converting the value to a string. Select the column you want to work with and set its usage type to ReadWrite.
In the script, I have added a method, GetNewString, which converts the blob to a string and strips the unwanted characters. You can also use Replace or Regex.Replace if that makes more sense.
In the Input0_ProcessInputRow method, we convert the column's data, reset the blob, and then add the new value:
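public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Read the blob as text, clear it, then write the cleaned text back
    var input = GetNewString(Row.columname);
    Row.columname.ResetBlobData();
    Row.columname.AddBlobData(System.Text.Encoding.Unicode.GetBytes(input));
}
public string GetNewString(Microsoft.SqlServer.Dts.Pipeline.BlobColumn blobColumn)
{
    if (blobColumn.IsNull)
        return string.Empty;
    // NTEXT is stored as UTF-16, which .NET calls Encoding.Unicode
    var blobData = blobColumn.GetBlobData(0, (int)blobColumn.Length);
    var stringData = System.Text.Encoding.Unicode.GetString(blobData);
    // Strip the leading [" and trailing "] only when both are present,
    // so unwrapped values are left alone and Substring cannot throw on short input
    if (stringData.Length >= 4 && stringData.StartsWith("[\"") && stringData.EndsWith("\"]"))
        stringData = stringData.Substring(2, stringData.Length - 4);
    return stringData;
}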
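For comparison, the same strip-the-wrapper logic can be sketched outside SSIS. This is a minimal Python illustration, not part of the package: the function name strip_wrapper and the sample value are made up for the example, and it assumes the NTEXT bytes are UTF-16LE (what Encoding.Unicode means in .NET).

```python
def strip_wrapper(value, prefix='["', suffix='"]'):
    """Remove prefix and suffix only when both are present; otherwise return the value unchanged."""
    if value is None:
        # Mirrors the IsNull branch above: a missing value becomes an empty string
        return ""
    long_enough = len(value) >= len(prefix) + len(suffix)
    if long_enough and value.startswith(prefix) and value.endswith(suffix):
        return value[len(prefix):len(value) - len(suffix)]
    return value


# Hypothetical sample: NTEXT content as UTF-16LE bytes, decoded before stripping
raw = '["hello world"]'.encode("utf-16-le")
print(strip_wrapper(raw.decode("utf-16-le")))  # hello world
```

Checking for both markers before slicing is what keeps a short or unwrapped value from being truncated.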

Related

Use Split, Join, and another function in SSRS

I have a field in SQL Server that contains a comma-separated list. Here are 2 examples:
select 'ex1,ex2,ex3' as str union all
select 'ax1,ax2'
In my report, I have to transform all of these values (5 in this case) using a function. In this question I will use Trim, but in actuality we are using another custom made function with the same scope.
I know how I can split every value from the string and recombine them:
=Join(Split(Fields!str.Value,","),", ")
This works great. However, I need to execute a function before I recombine the values. I thought that this would work:
=Join( Trim(Split(Fields!VRN.Value,",")) ,", ")
However, this just gives me an error:
Value of type '1-dimensional array of String' cannot be converted to 'String'. (rsCompilerErrorInExpression)
I can't personally change the function that we use.
How do I use an extra function when dealing with both a split and a join?
You can use custom code to include all the logic (Split->Custom Code->Join).
Make adjustments inside the loop to call your custom function instead of Trim:
Public Function fixString(ByVal s As String) As String
    ' Split on commas, transform each element, then rejoin
    Dim mystring() As String
    mystring = s.Split(","c)
    For index As Integer = 0 To mystring.Length - 1
        ' Replace Trim with your custom function here
        mystring(index) = Trim(mystring(index))
    Next
    ' Rejoin with ", " to match the separator used in the question
    Return Join(mystring, ", ")
End Function
To call the custom code, use the following expression:
=Code.fixString(Fields!VRN.Value)
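The split, transform, join pattern is easier to see in isolation. Here is a minimal Python sketch of the same idea, not SSRS code: fix_string and its parameters are illustrative names, and str.strip stands in for the custom function the same way Trim does in the question.

```python
def fix_string(s, transform=str.strip, sep=",", joiner=", "):
    """Split s on sep, apply transform to every piece, then rejoin with joiner."""
    parts = s.split(sep)
    # The transform runs per element, which is what Trim(Split(...)) could not do
    return joiner.join(transform(p) for p in parts)


print(fix_string("ex1, ex2 ,ex3"))  # ex1, ex2, ex3
```

The error in the question comes from passing the whole array to a function that expects one string; applying the function element by element avoids that.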

Parsing a String That's Kind of JSON

I have a set of strings that are JSON-ish but not JSON-compliant at all. They are also kind of CSV, but the values themselves sometimes contain commas.
The strings look like this:
ATTRIBUTE: Value of this attribute, ATTRIBUTE2: Another value, but this one has a comma in it, ATTRIBUTE3:, another value...
The only two patterns I can see that would mostly work are that the attribute names are in caps and followed by a : and space. After the first attribute, the pattern is , name-in-caps : space.
The data is stored in Redshift, so I was going to see if I can use regex to resolve this, but my regex knowledge is limited. Where would I start?
If not, I'll resort to Python hacking.
What you're describing would be something like:
^([A-Z\d]+?): (.*?), ([A-Z\d]+?): (.*?), ([A-Z\d]+?): (.*)$
This answer assumes that your third attribute value doesn't really start with a comma, and that your attribute names can contain numbers.
If we take this apart:
[A-Z\d] Capital letters and digits
+?: As few of them as needed, up to the first :
 (.*?), A space, then as few characters as needed, up to a comma and a space
^ and $ The beginning and the end of the string, respectively
The rest is a repetition of that pattern.
The ( ) only mark your capture groups; in this case, they don't directly affect the match.
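Since the asker mentions falling back to Python, here is a hedged sketch that generalizes the pattern above to any number of attributes by using a lookahead instead of repeating the group three times. The names PAIR and parse_attributes are made up for this example, and it assumes a value never contains a comma and a space followed by capitals or digits and a colon.

```python
import re

# A name is capitals/digits followed by a colon. The value is matched lazily and
# stops just before the next ", NAME:" or at the end of the string.
PAIR = re.compile(r"([A-Z\d]+):\s?(.*?)(?=, [A-Z\d]+:|$)", re.DOTALL)


def parse_attributes(text):
    """Return a dict mapping each attribute name to its raw value."""
    return {m.group(1): m.group(2) for m in PAIR.finditer(text)}


sample = ("ATTRIBUTE: Value of this attribute, "
          "ATTRIBUTE2: Another value, but this one has a comma in it, "
          "ATTRIBUTE3:, another value")
for name, value in parse_attributes(sample).items():
    print(name, "->", repr(value))
```

Unlike the fixed three-group pattern, this version keeps a value that starts with a comma (as in ATTRIBUTE3) because the optional space after the colon is allowed to be absent.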
Often regex is not the right tool to use when it seems like it is.
Read this thoughtful post for details: https://softwareengineering.stackexchange.com/questions/223634/what-is-meant-by-now-you-have-two-problems
When a simpler scheme will do, use it! Here is one scheme that would successfully parse the structure as long as colons only occur between attributes and values, and not in them:
Code
static void Main(string[] args)
{
    string data = "ATTRIBUTE: Value of this attribute,ATTRIBUTE2: Another value, but this one has a comma in it,ATTRIBUTE3:, another value,value1,ATTRIBUTE4:end of file";
    Console.WriteLine();
    Console.WriteLine("As an String");
    Console.WriteLine();
    Console.WriteLine(data);
    string[] arr = data.Split(new[] { ":" }, StringSplitOptions.None);
    Dictionary<string, string> attributeNameToValue = new Dictionary<string, string>();
    Console.WriteLine();
    Console.WriteLine("As an Array Split on ':'");
    Console.WriteLine();
    Console.WriteLine("{\"" + String.Join("\",\"", arr) + "\"}");
    string currentAttribute = null;
    string currentValue = null;
    for (int i = 0; i < arr.Length; i++)
    {
        if (i == 0)
        {
            // The first element only has the first attribute name
            currentAttribute = arr[i].Trim();
        }
        else if (i == arr.Length - 1)
        {
            // The last element only has the final value
            attributeNameToValue[currentAttribute] = arr[i].Trim();
        }
        else
        {
            // Everything before the last comma is the current value;
            // everything after it is the next attribute name
            int indexOfLastComma = arr[i].LastIndexOf(",");
            currentValue = arr[i].Substring(0, indexOfLastComma).Trim();
            string nextAttribute = arr[i].Substring(indexOfLastComma + 1).Trim();
            attributeNameToValue[currentAttribute] = currentValue;
            currentAttribute = nextAttribute;
        }
    }
    Console.WriteLine();
    Console.WriteLine("As a Dictionary");
    Console.WriteLine();
    foreach (string key in attributeNameToValue.Keys)
    {
        Console.WriteLine(key + " : " + attributeNameToValue[key]);
    }
}
Output:
As an String
ATTRIBUTE: Value of this attribute,ATTRIBUTE2: Another value, but this one has a comma in it,ATTRIBUTE3:, another value,value1,ATTRIBUTE4:end of file
As an Array Split on ':'
{"ATTRIBUTE"," Value of this attribute,ATTRIBUTE2"," Another value, but this one has a comma in it,ATTRIBUTE3",", another value,value1,ATTRIBUTE4","end of file"}
As a Dictionary
ATTRIBUTE : Value of this attribute
ATTRIBUTE2 : Another value, but this one has a comma in it
ATTRIBUTE3 : , another value,value1
ATTRIBUTE4 : end of file

SSIS Convert Blank or other values to Zeros

After applying the unpivot procedure, I have an Amount column that has blanks and other characters (like "-"). I would like to convert those non-numeric values to zero. I used a replace, but it only converts one value at a time.
Also, I tried to use the following script
/**
Public Overrides Sub Input()_ProcessInputRows(ByVal Row As Input()Buffer)
If Row.ColumnName_IsNull = False Or Row.ColumnName = "" Then
Dim pattern As String = String.Empty
Dim r As Regex = Nothing
pattern = "[^0-9]"
r = New Regex(pattern, RegexOptions.Compiled)
Row.ColumnName = Regex.Replace(Row.ColumnName, pattern, "")
End If
End Sub
**/
but I'm getting an error. I don't know much about scripts, so maybe I placed it in the wrong place. The bottom line is that I need to convert those non-numeric values.
Thank you in advance for your help.
I generally look at regular expressions as a great way to introduce another problem into an existing one.
What I did to simulate your problem was to write a SELECT statement that returned 5 rows: 2 with valid numbers, and the rest with an empty string, a string of spaces, and a hyphen.
I then wired it up to a Script Component and set the column as read/write.
The script I used is as follows. I verified there was a value there and if so, I attempted to convert the value to an integer. If that failed, then I assigned it zero. VB is not my strong suit so if this could have been done more elegantly, please edit my script.
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    ' Ensure we have data to work with
    If Not Row.ColumnName_IsNull Then
        ' Test whether it's a number or not
        ' TryCast doesn't work with value types so I'm going the lazy route
        Try
            ' Cast to an integer and then back to string because
            ' my vb is weak
            Row.ColumnName = CStr(CType(Row.ColumnName, Integer))
        Catch ex As Exception
            ' The column is a string, so assign "0" rather than the number 0
            Row.ColumnName = "0"
        End Try
    End If
End Sub
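The try-to-convert, fall-back-to-zero logic can be checked quickly outside SSIS. Below is a minimal Python sketch run against the five simulated rows described above. The name amount_or_zero is illustrative, and unlike the VB script it also maps a missing value to "0", which is an assumption about what the asker wants.

```python
def amount_or_zero(value):
    """Return value as a whole-number string, or "0" when it is missing or not an integer."""
    if value is None:
        return "0"  # assumption: treat NULL like a blank
    try:
        return str(int(value.strip()))
    except ValueError:
        # Blank strings, spaces, "-" and other non-numeric text all land here
        return "0"


rows = ["12", "", "   ", "-", "7"]
print([amount_or_zero(r) for r in rows])  # ['12', '0', '0', '0', '7']
```

Note that a decimal such as "1.5" also becomes "0" here, so this only fits if the Amount column holds whole numbers.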

String.replace() function to parse XML string so that it can be displayed in HTML

I have an XML string which needs to be displayed within HTML. I understand the first thing to do here is to convert all '<' and '>' into '& lt;' and '& gt;' (ignore the space after the & sign). This is what I am doing to replace '<':
regExp = new RegExp("/</g");
xmlString = xmlString.replace(regExp, '& lt;');
xmlString does not change.
Also, trace(regExp.test("<")); prints false.
What is wrong here?
replace returns a new string; it doesn't modify the old one. So if you want to overwrite the old string, you have to assign the result back:
xmlString = xmlString.replace(regExp, '&lt;');
Or, if you don't want to overwrite the old one, store the result in a new variable:
var newString = xmlString.replace(regExp, '&lt;');
The issue is the way you create your RegExp object.
Because you're using the RegExp constructor, don't include the / characters:
regExp = new RegExp("<", "g");
or use the / literal syntax as a shortcut:
regExp = /</g;
See this page for more details: http://livedocs.adobe.com/flash/9.0/ActionScriptLangRefV3/RegExp.html
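As a language-neutral illustration of the escaping the question is after, here is a short Python sketch (not ActionScript). It uses the standard-library html.escape and a hand-written equivalent; the sample string and the name escape_manually are made up for the example. The point worth carrying over is that & has to be replaced before < and >, otherwise the entities just produced get escaped a second time.

```python
import html

xml_string = '<note id="1">Tom & Jerry</note>'

# quote=False escapes only &, < and >, leaving quotes as they are
escaped = html.escape(xml_string, quote=False)
print(escaped)
# &lt;note id="1"&gt;Tom &amp; Jerry&lt;/note&gt;


def escape_manually(text):
    """Global replace of each character; & goes first so later entities stay intact."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


print(escape_manually(xml_string) == escaped)  # True
```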

Removing non-alphanumeric characters in an Access Field

I need to remove hyphens from a string in a large number of Access fields. What's the best way to go about doing this?
Currently, the entries follow this general format:
2010-54-1
2010-56-1
etc.
I'm trying to run append queries off of this field, but I'm always getting validation errors that cause the query to fail. I think the cause of this failure is the hyphens in the entries, which is why I need to remove them.
I've googled, and I see that there are a number of formatting guides using VBScript, but I'm not sure how I can integrate VB into Access. It's new to me :)
Thanks in advance,
Jacques
EDIT:
So, I've run a test case with some values that are simply text. They don't work either, so the issue isn't the hyphens.
I'm not sure that the hyphens are actually the problem without seeing sample data and the query, but if all you need to do is get rid of them, the Replace function should be sufficient (you can use it in the query).
example: http://www.techonthenet.com/access/functions/string/replace.php
If you need to do some more advanced string manipulation than this (or multiple calls to replace) you might want to create a VBA function you can call from your query, like this:
http://www.pcreview.co.uk/forums/thread-2596934.php
To do this, you'd just need to add a module to your Access project and add the function there to be able to use it in your query.
I have a function I use when removing everything except Alphanumeric characters. Simply create a query and use the function in the query on whatever field you are trying to modify. Runs much faster than find and replace.
Public Function AlphaNumeric(inputStr As String) As String
    Dim ascVal As Integer, newStr As String, counter As Integer, trimStr As String
    ' Send errors to the message handler
    On Error GoTo Err_Stuff
    ' If nothing is there, quit
    If inputStr = "" Then Exit Function
    ' Trim leading and trailing spaces
    trimStr = Trim(inputStr)
    ' Initialize the string to return
    newStr = ""
    ' Iterate over the length of the string
    For counter = 1 To Len(trimStr)
        ' Find the ASCII value of the current character
        ascVal = Asc(Mid$(trimStr, counter, 1))
        ' Keep digits (48-57), uppercase (65-90), and lowercase (97-122)
        Select Case ascVal
            Case 48 To 57, 65 To 90, 97 To 122
                ' Add the character to the new string
                newStr = newStr & Chr(ascVal)
        End Select
    Next counter
    ' Return the completed string
    AlphaNumeric = newStr
    Exit Function
Err_Stuff:
    ' Handler for errors
    MsgBox Err.Number & " " & Err.Description
End Function
Just noticed the link to the code, looks similar to mine. Guess this is just another option.
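For readers who want to sanity-check the filtering rule outside Access, here is a minimal Python sketch of the same keep-only-letters-and-digits idea. The names ALLOWED and alphanumeric are illustrative, and the allowed set is limited to ASCII to mirror the 48-57, 65-90, and 97-122 ranges used above.

```python
import string

# ASCII letters and digits only, matching the character-code ranges in the VBA function
ALLOWED = set(string.ascii_letters + string.digits)


def alphanumeric(text):
    """Return text with every character outside ALLOWED removed."""
    return "".join(ch for ch in text if ch in ALLOWED)


print(alphanumeric("2010-54-1"))  # 2010541
```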