Extract domain name in MySQL

I have the following rows:
http://domain1.com/moshe
https://domain2.com/
https://domain3.com
https://domain4.com?gembom
I need these results
domain1.com
domain2.com
domain3.com
domain4.com
How exactly can I do it in MySQL?
In JavaScript, I can simply use a regex:
string.match(/https?:\/\/([^/?]+)/)[1]
But I found that MySQL doesn't have a regex extraction function. Is there an alternative way to achieve the same results?

Here is an example of how to extract the domain name from a URL. Note that it is written in SQL Server (T-SQL) syntax; in MySQL, LOCATE and CHAR_LENGTH take the place of CHARINDEX and LEN, and a user variable is assigned with SET rather than declared with DECLARE.
DECLARE @WebUrl VARCHAR(35);
SET @WebUrl = 'http://stackoverflow.com/questions/ask?title=trim'
SELECT @WebUrl AS 'WebsiteURL',
LEFT(SUBSTRING(@WebUrl,
(CASE WHEN CHARINDEX('//',@WebUrl)=0
THEN 5
ELSE CHARINDEX('//',@WebUrl) + 2
END), 35),
(CASE
WHEN CHARINDEX('/', SUBSTRING(@WebUrl, CHARINDEX('//', @WebUrl) + 2, 35))=0
THEN LEN(@WebUrl)
else CHARINDEX('/', SUBSTRING(@WebUrl, CHARINDEX('//', @WebUrl) + 2, 35))- 1
END)
) AS 'Domain';
You can adapt it to suit your needs.
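For MySQL itself, a minimal sketch using nested SUBSTRING_INDEX calls might look like the following. The inline sample rows and the url and domain column names are assumptions for illustration, and the sketch assumes that '://' appears only once, in the scheme.

```sql
SELECT url,
       SUBSTRING_INDEX(
         SUBSTRING_INDEX(
           SUBSTRING_INDEX(url, '://', -1),  -- keep what follows the scheme
           '/', 1),                          -- cut at the first slash, if any
         '?', 1) AS domain                   -- cut at a query string, if any
FROM (
  SELECT 'http://domain1.com/moshe' AS url UNION ALL
  SELECT 'https://domain2.com/' UNION ALL
  SELECT 'https://domain3.com' UNION ALL
  SELECT 'https://domain4.com?gembom'
) AS t;
-- domain column: domain1.com, domain2.com, domain3.com, domain4.com
```

SUBSTRING_INDEX returns the whole string when the delimiter is missing, so URLs without a path or query string pass through unchanged. On MySQL 8.0 and later, REGEXP_SUBSTR is another option.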

Related

What is the equivalent of MySQL LOCATE to Bigquery?

So I have this query with LOCATE function:
SELECT TRIM(CASE WHEN store_name like "%|%" THEN LEFT(store_name, LOCATE('|', store_name) - 1) ELSE
CASE WHEN store_name like "%,%" THEN LEFT(store_name, LOCATE(',', store_name) - 1) ELSE
CASE WHEN store_name like "% - %" THEN LEFT(store_name, LOCATE(' - ', store_name) - 1) ELSE
store_name
END
END
END)
Everything works, but I now need to move from MySQL to BigQuery. When I pasted this query into the BigQuery editor, I got an error: Function not found: LOCATE at [3:76]
BigQuery has several functions similar to LOCATE, such as REGEXP_EXTRACT [1], INSTR [2], and SUBSTR [3].
[1]https://cloud.google.com/bigquery/docs/reference/legacy-sql#regularexpressionfunctions
[2]https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#instr
[3]https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#substr
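Consider the approach below (using REGEXP_EXTRACT_ALL). SAFE_OFFSET with IFNULL falls back to the full store_name when no delimiter is found, where OFFSET(0) would raise an out-of-bounds error:
select ifnull(regexp_extract_all(store_name, r'(.*?)(?:,| - |\|)')[safe_offset(0)], store_name)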
if applied to dummy data - output is as below
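A more literal translation of the original query is also possible with STRPOS, which is the closest BigQuery counterpart to LOCATE; note that the argument order is reversed (string first, then substring). The sketch below is BigQuery standard SQL, and the sample rows and the short_name alias are made up for illustration.

```sql
WITH stores AS (
  SELECT 'Acme | Downtown' AS store_name UNION ALL
  SELECT 'Bravo, North' UNION ALL
  SELECT 'Cargo - East' UNION ALL
  SELECT 'Delta'
)
SELECT
  store_name,
  TRIM(CASE
    -- STRPOS returns 0 when the substring is absent, like LOCATE in MySQL
    WHEN STRPOS(store_name, '|') > 0 THEN SUBSTR(store_name, 1, STRPOS(store_name, '|') - 1)
    WHEN STRPOS(store_name, ',') > 0 THEN SUBSTR(store_name, 1, STRPOS(store_name, ',') - 1)
    WHEN STRPOS(store_name, ' - ') > 0 THEN SUBSTR(store_name, 1, STRPOS(store_name, ' - ') - 1)
    ELSE store_name
  END) AS short_name
FROM stores;
-- short_name: Acme, Bravo, Cargo, Delta
```

Unlike the regular expression, this keeps the original priority order of the delimiters: '|' is checked first, then ',', then ' - '.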

User-defined function sorting column problem

I found a user-defined function on the internet that locates the nth occurrence of a string, and I am using it to sort the name column in my database. I am using MySQL 5.5, not the latest version. Here is my sample database: https://dbfiddle.uk/?rdbms=mysql_5.5&fiddle=bcb32a6b47d0d5b061fd401d0888bdc3
My problem is that I want to sort the name column by its numeric prefix, but the SQL query below doesn't work.
select t.id,t.name
from
(
select t.*, cast((case when col1_col2_ref > 0
then
substring_index(modified_name,'-',1)
else
modified_name
end
) as unsigned) col1
, cast((case when col1_col2_ref > 0
and col3_ref > 0
then
substr(modified_name,(col1_col2_ref + 1),(col3_ref - (col1_col2_ref + 1)))
when col1_col2_ref > 0
then
substr(modified_name,(col1_col2_ref + 1))
end) as unsigned) col2
, cast((case when col3_ref > 0
and col4_ref > 0
then
substr(modified_name,(col3_ref + 1),(col4_ref - (col3_ref + 1)))
when col3_ref > 0
then
substr(modified_name,(col3_ref + 1))
end) as unsigned) col3
, cast((case when col4_ref > 0
then
substr(modified_name,(col4_ref + 1))
end) as unsigned) col4
from
(
select t.*,substring_index(name,' ',1) modified_name
,locate('-',name,1) col1_col2_ref
,locate('/',name,1) col3_ref
,locate('/',name,locate('/',name,1)+1) col4_ref
from filing_code_management t
) t
) t
order by col1,col2,col3,col4
It gives me the result below, which is not sorted properly.
Output 1
Actually, I want output like the samples below:
Output 2
Output 3
Before I added new data, the sort worked: https://dbfiddle.uk/?rdbms=mysql_5.5&fiddle=6b12a4d42359cb30f27a5bfb9d0c8210. After I inserted new data, it stopped working. For example, it fails on new rows such as (R)100-6-2-2 Mesyuarat Majlis Kerajaan Negeri (MMKN) JKK, where () appears in front, or 100-1-1 Penggubalan/Penyediaan/Pindaan Undang-Undang/Peraturan, where / appears between words.
I hope someone can guide me to solve this problem. Thanks.
You should be able to adapt the following code to your needs (tested at your DB Fiddle!). I've used the file_name column instead of the name column to slightly simplify building the sort fields, as it seems the file name is always repeated in the first part of the name field anyway.
This would be quite a bit simpler using regular expression support, but I note that the version of MySQL you are using doesn't have this feature (I think it arrives in MySQL 8.0, if I'm not mistaken).
SELECT id,
num_hyphens,
CAST(SUBSTRING_INDEX(CONCAT(file_name_adj,'-'), '-', 1) AS UNSIGNED) AS sort1,
CAST(CASE WHEN num_hyphens = 0
THEN '0'
ELSE SUBSTRING_INDEX(SUBSTRING_INDEX(file_name_adj,'-', 2), '-',-1)
END AS UNSIGNED) AS sort2,
CAST(CASE WHEN num_hyphens <= 1
THEN '0'
ELSE SUBSTRING_INDEX(SUBSTRING_INDEX(file_name_adj,'-', 3), '-',-1)
END AS UNSIGNED) AS sort3,
CAST(CASE WHEN num_hyphens <= 2
THEN '0'
ELSE SUBSTRING_INDEX(file_name_adj, '-', -1)
END AS UNSIGNED) AS sort4,
file_name,
name
FROM (
SELECT id, name, MID(file_name, instr(file_name, ')') + 1) AS file_name_adj, file_name,
LENGTH(file_name) - LENGTH(REPLACE(file_name, '-', '')) AS num_hyphens
FROM filing_code_management
) t1
ORDER BY sort1, sort2, sort3, sort4
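To see how the building blocks of the query above behave, here is a small MySQL sketch on literal values. The sample string comes from the question; running the expressions on literals rather than on the table is only for illustration.

```sql
SELECT
  -- Strip a leading "(R)" style prefix: keep everything after the first ')'
  MID('(R)100-6-2-2', INSTR('(R)100-6-2-2', ')') + 1) AS file_name_adj,  -- '100-6-2-2'
  -- Inner call keeps the first two parts ('100-6'); outer call keeps the last of those
  CAST(SUBSTRING_INDEX(SUBSTRING_INDEX('100-6-2-2', '-', 2), '-', -1) AS UNSIGNED) AS sort2,  -- 6
  -- Count hyphens by comparing the length before and after removing them
  LENGTH('100-6-2-2') - LENGTH(REPLACE('100-6-2-2', '-', '')) AS num_hyphens;  -- 3
```

Casting each part to UNSIGNED makes ORDER BY compare numbers, so 10 sorts after 9 instead of before it, as it would in a string comparison.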

Using CASE in WHERE Statement when parameter has multiple values

I have a problem which I think relates to having a multiple value parameter.
In my TblActivity there are two fields TblActivity.ActivityServActId and TblActivity.ActivityContractId which I want to include in my WHERE statement.
Filtering by these is optional. If the user selects 'Yes' for the parameter @YESNOActivity, then I want to filter the query looking for rows where TblActivity.ActivityServActId matches one of the options in the parameter @ServiceActivity.
The same goes for @YESNOContract, TblActivity.ActivityContractId and @Contract respectively.
I managed to get to this:
WHERE
(CASE WHEN @YESNOActivity = 'Yes' THEN TblActivity.ActivityServActId ELSE 0 END)
IN (CASE WHEN @YESNOActivity = 'Yes' THEN @ServiceActivity ELSE 0 END)
AND (CASE WHEN @YESNOContract = 'Yes' THEN TblActivity.ActivityContractId ELSE 0 END)
IN (CASE WHEN @YESNOContract = 'Yes' THEN @Contract ELSE 0 END)
However, although this code works fine if there is only one value selected in the parameter @ServiceActivity or @Contract, as soon as I have more than one value in these parameters, I get the error:
Incorrect syntax near ','.
Query execution failed for dataset 'Activity'. (rsErrorExecutingCommand)
An error has occurred during report processing. (rsProcessingAborted)
Can anyone see what I'm doing wrong? I could understand it if I had an = instead of IN in the WHERE statement but can't figure this one out.
Using SQL Server 2008 and SSRS 2008-r2
If your @ServiceActivity is something like 1,2,3
you can do something like this:
WHERE ',1,2,3,' LIKE '%,1,%'
So you format your variables like this (cast ID to a string if it is numeric):
WHERE ',' + @ServiceActivity + ',' LIKE '%,' + CAST(ID AS VARCHAR(20)) + ',%'
SQL FIDDLE DEMO
SELECT *
FROM
(SELECT '1,2,3,4' as X UNION ALL
SELECT '2,3,4,5' as X UNION ALL
SELECT '3,4,5,6' as X UNION ALL
SELECT '1,3,4,5' as X
) as T
WHERE ',' + X + ',' LIKE '%,1,%'
For Your Case
(CASE WHEN @YESNOActivity = 'Yes'
THEN ',' + @ServiceActivity + ','
ELSE '' -- '' LIKE '%' is true, so every row passes when the filter is off
END)
LIKE
(CASE WHEN @YESNOActivity = 'Yes'
THEN '%,' + CAST(TblActivity.ActivityServActId AS VARCHAR(20)) + ',%'
ELSE '%'
END)
In SQL, the IN clause does not support parameters the way you are using them. The general syntax is
IN (1, 2, 3, 4)
you have
IN (@Param)
where something like
@Param = '1, 2, 3, 4'
Internally, SQL will turn this into
IN ('1, 2, 3, 4')
Note the quotes... you are now matching against a string!
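A small T-SQL sketch of the difference, using a made-up @Param value without spaces. The second query uses the comma-wrapping LIKE workaround instead of IN, and the in_result and like_result aliases are only for illustration.

```sql
DECLARE @Param VARCHAR(20);
SET @Param = '1,2,3';

-- IN sees a single string, '1,2,3', so '2' is not equal to it
SELECT CASE WHEN '2' IN (@Param) THEN 'match' ELSE 'no match' END AS in_result;
-- no match

-- Wrapping the list in commas lets LIKE find '2' as a whole list item
SELECT CASE WHEN ',' + @Param + ',' LIKE '%,2,%' THEN 'match' ELSE 'no match' END AS like_result;
-- match
```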
There are a number of ways to address this. Search SO for "sql in clause parameter", pick one that works for you, and upvote it.
(Added)
Parameterize an SQL IN clause seems pretty definitive on the subject. While long ago I upvoted the third reply (the one with table-value parameters), any of the high-vote answers could do the trick. The ideal answer depends on the overall problem you are working with. (I am not familiar with SSRS, and can't give more specific advice.)
So after a lot of messing around I put together a simple workaround for this by dropping my use of CASE altogether - but I have a suspicion that this is not a terribly efficient way of doing things.
WHERE
(@YESNOActivity = 'No' OR (@YESNOActivity = 'Yes' AND
TblActivity.ActivityServActId IN (@ServiceActivity)))
AND
(@YESNOContract = 'No' OR (@YESNOContract = 'Yes' AND
TblActivity.ActivityContractId IN (@Contract)))

SSIS Substring Extract based on qualifier

I've looked through a few different posts trying to find a solution for this. I have a column that contains descriptions in the following format:
String<Numeric>
However, the column isn't limited to a single instance of that format. It could be something like
UNI<01> JPG<84>
JPG<84> UNI<01>
JPG<84>
UNI<01>
And other variations without any controlled pattern.
What I need to do is extract the number between <> into a separate column in another table, based on the string before the <>. So UNI would route the number that follows it to a certain table.column, while JPG would route it to another table, and so on. I have seen functions that extract the number, but none that qualify it, pulling the number only when it is prefaced by a given qualifier string.
Based on the scope limitation mentioned in the question's comments that only one type of token (Foo, Bar, Blat, etc.) needs to be found at a time: you could use an expression in a Derived Column to find the token of interest and then extract the value between the arrows.
For example:
(FINDSTRING([InputColumn], @[User::SearchToken] + "<", 1) == 0) ?
NULL(DT_WSTR, 1) :
SUBSTRING([InputColumn],
FINDSTRING([InputColumn], @[User::SearchToken] + "<", 1)
+ LEN(@[User::SearchToken]) + 1,
FINDSTRING(
SUBSTRING([InputColumn],
FINDSTRING([InputColumn], @[User::SearchToken] + "<", 1)
+ LEN(@[User::SearchToken]) + 1,
LEN([InputColumn])
), ">", 1) - 1
)
First, the expression checks whether the token specified in @[User::SearchToken] is used in the current row. If it is, SUBSTRING is used to output the value between the arrows. If not, NULL is returned.
The assumption is made that no token's name will end with text matching the name of another token. Searching for token Bar will match Bar<123> and FooBar<123>. Accommodating Bar and FooBar as distinct tokens is possible but the requisite expression will be much more complex.
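For readers who want to try the same position arithmetic outside SSIS, here is a rough T-SQL equivalent of the expression. The variable names and the sample value are assumptions for illustration, and the sketch assumes every token is well formed, with a closing '>' after it.

```sql
DECLARE @Input VARCHAR(100), @Token VARCHAR(10), @Start INT;
SET @Input = 'UNI<01> JPG<84>';
SET @Token = 'JPG';
SET @Start = CHARINDEX(@Token + '<', @Input);  -- 0 when the token is absent

SELECT CASE
         WHEN @Start = 0 THEN NULL
         ELSE SUBSTRING(
                @Input,
                @Start + LEN(@Token) + 1,                                    -- first character after '<'
                CHARINDEX('>', @Input, @Start) - (@Start + LEN(@Token) + 1)  -- number of characters before '>'
              )
       END AS TokenValue;
-- '84'
```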
You could use an asynchronous Script Component that outputs a row with type and value columns for each type<value> token contained in the input string. Pass the output of this component through a Conditional Split to direct each type to the correct destination (e.g. table).
Pro: This approach gives you the option of using one data flow to process all tag types simultaneously vs. requiring one data flow per tag type.
Con: A Script Component is involved, which it sounds like you'd prefer to avoid.
Sample Script Component Code
// Requires: using System.Text.RegularExpressions;
private readonly string pattern = @"(?<type>\w+)<(?<value>\d+)>";
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
foreach (Match match in Regex.Matches(Row.Data, pattern, RegexOptions.ExplicitCapture))
{
Output0Buffer.AddRow();
Output0Buffer.Type = match.Groups["type"].Value;
Output0Buffer.Value = match.Groups["value"].Value;
}
}
Note: The Script Component will need an output created with two columns (perhaps named Type and Value), and the output's SynchronousInputID property must be set to None.
I ended up writing a CTE for a view to handle the data manipulation and then handled the joins and other data pieces in the SSIS package.
;WITH RCTE (Status_Code, lft, rgt, idx)
AS ( SELECT a.Status_code
,LEFT(a.Description, CASE WHEN CHARINDEX(' ', a.Description)=0 THEN LEN(a.Description) ELSE CHARINDEX(' ', a.Description)-1 END)
,SUBSTRING(a.Description, CASE WHEN CHARINDEX(' ', a.Description)=0 THEN LEN(a.Description) ELSE CHARINDEX(' ', a.Description)-1 END + 1, DATALENGTH(a.Description))
,0
FROM [disp] a WHERE NOT( Description IS NULL OR Description ='')
UNION ALL
SELECT r.Status_Code
,CASE WHEN CHARINDEX(' ', r.rgt) = 0 THEN r.rgt ELSE LEFT(r.rgt, CHARINDEX(' ', r.rgt) - 1) END
,CASE WHEN CHARINDEX(' ', r.rgt) > 0 THEN SUBSTRING(r.rgt, CHARINDEX(' ', r.rgt) + 1, DATALENGTH(r.rgt)) ELSE '' END
,idx + 1
FROM RCTE r
WHERE DATALENGTH(r.rgt) > 0
)
SELECT Status_Code
-- ,lft,rgt -- Uncomment to see whats going on
,SUBSTRING(lft,0, CHARINDEX('<',lft)) AS [Description]
,CASE WHEN ISNUMERIC(SUBSTRING(lft, CHARINDEX('<',lft)+1, LEN(lft)-CHARINDEX('<',lft)-1)) >0
THEN CAST (SUBSTRING(lft, CHARINDEX('<',lft)+1, LEN(lft)-CHARINDEX('<',lft)-1) AS INT) ELSE NULL END as Value
FROM RCTE
where lft <> ''

Need help formatting CONCAT() for MySQL query

I have a table where I am attempting to take 3 database table values and reformat them in a single value. Here is the SQL statement that I have at the moment:
SELECT
CASE WHEN cb_cardtype = 'Discover Credit Card'
THEN 'DS'
END +
';' + RIGHT(cardnumbers,4) + ';' + LPAD(MONTH(planexpdate), 2, '0') +
'/' + LPAD(YEAR(planexpdate), 2, '0') AS account_billing_key
FROM my_table
So what I wanted to get as an output here would be:
DS;4242;07/14
The problem is that I am using the + to attempt this, which actually adds the values together. Rather, I understand that I need to use CONCAT() to merge the values. I am unclear about how I can pull the individual values and then concatenate them as desired.
If your query is otherwise correct, all you need to do is to wrap all the strings you want to concatenate - comma separated - inside a call to CONCAT;
SELECT
CONCAT(
CASE WHEN cb_cardtype = 'Discover Credit Card' THEN 'DS' END,
';',
RIGHT(cardnumbers,4),
';',
LPAD(MONTH(planexpdate), 2, '0'),
'/',
-- RIGHT keeps the last two digits of the year; LPAD(..., 2, '0') would truncate 2014 to '20'
RIGHT(YEAR(planexpdate), 2)
) AS account_billing_key
FROM my_table
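Two details are worth checking with this approach, sketched here in MySQL on literal values (the date and card digits are made up). DATE_FORMAT can build the MM/YY part in a single call, and CONCAT returns NULL if any argument is NULL, which happens when the CASE has no matching branch.

```sql
-- MM/YY in one call: %m is the two-digit month, %y the two-digit year
SELECT CONCAT('DS', ';', '4242', ';', DATE_FORMAT('2014-07-31', '%m/%y')) AS account_billing_key;
-- DS;4242;07/14

-- A CASE without ELSE yields NULL for other card types, and CONCAT then returns NULL
SELECT CONCAT(CASE WHEN 'Visa' = 'Discover Credit Card' THEN 'DS' END, ';', '4242') AS result;
-- NULL
```

Adding an ELSE branch to the CASE, or wrapping it in COALESCE, avoids the NULL result for card types that are not mapped.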