I have a field in my database which is encoded. After using from_base64 on the field it looks like this:
<string>//<string>//<string>/2017//06//21//<string>//file.txt
There may be any number of strings at the beginning of the path; however, the date (YYYY//MM//DD) is always followed by exactly two fields (a string, then the file name with its extension).
I want to group by this YYYY//MM//DD pattern and get a count of all paths for each date.
So basically I want to do this:
select '<YYYY//MM//DD portion of decoded_path>', count(*) from table group by '<YYYY//MM//DD portion of decoded_path>' order by '<YYYY//MM//DD portion of decoded_path>';
Summary
MySQL's SUBSTRING_INDEX is useful here: it looks for the specified delimiter and, when the count is negative, counts occurrences backwards from the end of the string.
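To make the counting rule concrete, here is a rough Python model of SUBSTRING_INDEX. It is only a sketch for checking the logic outside the database, not MySQL's implementation, and the sample path is made up.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Rough model of MySQL's SUBSTRING_INDEX(s, delim, count)."""
    if count == 0:
        return ""
    if count > 0:
        # Everything to the left of the count-th delimiter, scanning from the left.
        return delim.join(s.split(delim)[:count])
    # Everything to the right of the |count|-th delimiter, scanning from the right.
    return delim.join(s.rsplit(delim)[count:])


# Hypothetical decoded path with two leading segments.
path = "root//sub//2017//06//21//name//file.txt"

tail = substring_index(path, "//", -5)      # the last five fields
date_part = substring_index(tail, "//", 3)  # the first three of those
print(tail)
print(date_part)
```

When the string contains fewer delimiters than the count asks for, the model returns the whole string, which is also what MySQL does.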
Demo
Rextester demo: http://rextester.com/TCJ65469
SQL
SELECT datepart,
       COUNT(*) AS occurrences
FROM
  (SELECT CONCAT(
            LEFT(SUBSTRING_INDEX(txt, '//', -5), INSTR(SUBSTRING_INDEX(txt, '//', -5), '//') - 1),
            '/',
            LEFT(SUBSTRING_INDEX(txt, '//', -4), INSTR(SUBSTRING_INDEX(txt, '//', -4), '//') - 1),
            '/',
            LEFT(SUBSTRING_INDEX(txt, '//', -3), INSTR(SUBSTRING_INDEX(txt, '//', -3), '//') - 1))
          AS datepart
   FROM tbl) subq
GROUP BY datepart
ORDER BY datepart;
Assumptions
I have assumed for now that the single slash before the year in the example given in the question is a typo and should be a double slash. (If it turns out this isn't the case, I'll update my answer.)
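The effect of that assumption can be checked with a small Python sketch that mirrors the field positions used in the query (fifth, fourth and third field from the end). The sample paths are invented for illustration.

```python
from collections import Counter


def date_part(path: str) -> str:
    # Fields 5, 4 and 3 counted from the end, joined with single slashes,
    # mirroring the three SUBSTRING_INDEX(txt, '//', -n) calls in the query.
    fields = path.split("//")
    return "/".join(fields[-5:-2])


# Invented sample data: any number of leading segments, then date, name, file.
paths = [
    "a//b//2017//06//21//x//file.txt",
    "a//2017//06//21//y//other.txt",
    "a//b//c//2017//06//22//z//file.txt",
]
counts = Counter(date_part(p) for p in paths)
for day in sorted(counts):
    print(day, counts[day])

# With a single slash before the year, the year field keeps the preceding segment.
single_slash = date_part("a//b//c/2017//06//21//x//file.txt")
print(single_slash)
```

If the single slash is real rather than a typo, the year field would need one more split on '/' before grouping.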
A little crazy, but it works for the sample string. Note that it relies on the single slash before the year and on the field after the date starting with "<"; replace the string literal with your decoded column:
select REPLACE(SUBSTRING_INDEX(SUBSTRING_INDEX(REPLACE('<string>//<string>//<string>/2017//06//21//<string>//file.txt',"//","-"),"/",-1),"-<",1),"-","/") as datepart,
       count(*)
from `chaissilist`
group by datepart
order by datepart;
Related
Given data like:
URL
some_url.com
some_url.com
some_url.co.uk
some_other_url.com
some_other_url.co.uk
some_other_url.co.uk
some_other_url.org
Is there a way to construct a query that will result in:
some_url 3
some_other_url 4
Currently I'm either using a standard GROUP BY url, or I query the aggregations one by one using LIKE.
Is there a way to do this in one query? (I'm using MySQL currently, but will be moving this data over to PostgreSQL.)
Would it be better practice to add a column to reflect this grouping (at insert time)? (This feels redundant, but would perform best, I guess.)
EDIT:
The data can contain www and non-www as well as http and https. I'll also have to do a similar thing on other columns that contain (free) text values.
This is ANSI SQL compliant and should probably work with both MySQL and PostgreSQL:
select url, count(*)
from
(
select substring(url from 1 for position('.' in url) -1) as url
from tablename
) dt
group by url
It uses position() to find the first . character, substring() to take everything before it, and finally GROUP BY on the result.
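As a quick check of the idea against the sample data from the question, the same "text before the first dot" grouping can be sketched in Python. This is an illustration only; in practice the query does this work in the database.

```python
from collections import Counter

# Sample data copied from the question.
urls = [
    "some_url.com",
    "some_url.com",
    "some_url.co.uk",
    "some_other_url.com",
    "some_other_url.co.uk",
    "some_other_url.co.uk",
    "some_other_url.org",
]

# partition(".") splits at the first dot; element 0 is the text before it.
counts = Counter(url.partition(".")[0] for url in urls)
for name, n in counts.items():
    print(name, n)
```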
Use SUBSTRING_INDEX in MySQL, which returns the part of a string before a specified number of occurrences of the delimiter:
select count(*) as cnt, SUBSTRING_INDEX(c,'.',1) as val from cte
group by SUBSTRING_INDEX(c,'.',1)
Since the values can contain http, https and www, and possibly a query string too, you will have to clean them before grouping. I took the reference from here and modified it to match your requirement.
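-- Select only the grouped expression and the count; a bare url column
-- would be rejected under ONLY_FULL_GROUP_BY.
SELECT SUBSTRING_INDEX(
         SUBSTRING_INDEX(
           SUBSTRING_INDEX(
             SUBSTRING_INDEX(
               SUBSTRING_INDEX(
                 SUBSTRING_INDEX(url, '/', 3),
               '://', -1),
             '/', 1),
           '?', 1),
         'www.', -1),
       '.', 1) AS domain,
       COUNT(1) AS occurrences
FROM tblname
GROUP BY domain;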
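The nested calls are easier to follow one step at a time. The Python sketch below mirrors each SUBSTRING_INDEX layer from the innermost outwards; the function name domain_key and the sample URLs are invented for illustration.

```python
def domain_key(url: str) -> str:
    s = "/".join(url.split("/")[:3])  # SUBSTRING_INDEX(url, '/', 3): scheme and host
    s = s.rsplit("://", 1)[-1]        # (..., '://', -1): drop the scheme
    s = s.split("/", 1)[0]            # (..., '/', 1): drop any path
    s = s.split("?", 1)[0]            # (..., '?', 1): drop any query string
    s = s.rsplit("www.", 1)[-1]       # (..., 'www.', -1): drop a leading www.
    return s.split(".", 1)[0]         # (..., '.', 1): keep the text before the first dot


# Invented inputs covering the variants mentioned in the question's edit.
samples = [
    "https://www.some_url.com/page?x=1",
    "http://some_url.co.uk",
    "www.some_other_url.org/a/b/c",
    "some_other_url.com?ref=1",
]
keys = [domain_key(u) for u in samples]
print(keys)
```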
This works in PostgreSQL:
select split_part(url,'.',1) g,count(*)
from url_table
group by g
order by g;
Best regards,
Bjarni
I have a database that contains a column "Code" where the records have the format "xx-xxx" or "xx-xx". For the latter format I want to add a zero after the "-" to make it "xx-0xx". Is there any way to count the characters after a certain pattern in MySQL?
Hmmm. If those are your only two possibilities, you can use case:
select (case when length(code) = 5
then replace(code, '-', '-0')
else code
end) as new_code
If you want to be more general, deconstruct the string and build it back again:
select concat_ws('-', substring_index(code, '-', 1),
lpad(substring_index(code, '-', -1), 3, '0')
)
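The deconstruct-and-rebuild approach can be tried out with a short Python sketch that follows the same three steps. The function name pad_code is made up for this example.

```python
def pad_code(code: str) -> str:
    prefix = code.split("-", 1)[0]    # substring_index(code, '-', 1)
    suffix = code.rsplit("-", 1)[-1]  # substring_index(code, '-', -1)
    # lpad(suffix, 3, '0'): left-pad with zeros to width 3. MySQL's LPAD also
    # truncates a longer suffix to 3 characters, hence the final slice.
    padded = suffix.rjust(3, "0")[:3]
    return "-".join([prefix, padded])  # concat_ws('-', prefix, padded)


for code in ["ab-12", "ab-123", "ab-1"]:
    print(code, "->", pad_code(code))
```

Unlike the CASE version, this also handles suffixes shorter than two characters.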
Yes, you can use CHAR_LENGTH(str) like this:
-- Position 4 is the first character after the "xx-" prefix.
SELECT code, CHAR_LENGTH(SUBSTR(code, 4))
FROM yourTable
I have a column in my database with values like this:
course_repeatfkfjkjfjkfer_10_topics_0_presentation_link
course_repeatfkfjfkfkfklfflkflkfs_1_presentation_link
course_repeatfkfjfkfkfklfflkflkfs_2_presentation_link
coursek_epeatfkfjfkfkfklfflkflkfs_10_presentation_link
course_hdhdhhdhdjhdrepeatfkfjfkfkfklfflkflkfs_21_presentation_link
and so on.
I need to pick out the number before _presentation_link (0, 1, 2, 10, 21 above), and I need to do this in MySQL as well.
I used SUBSTR in MySQL, but that is not working. Any idea?
Thanks
One option would be to use a combination of SUBSTRING_INDEX() and REPLACE():
SELECT SUBSTRING_INDEX(REPLACE(col, '_presentation_link', ''), '_', -1)
FROM yourTable
Taking course_repeatfkfjkjfjkfer_10_topics_0_presentation_link as an example, after the replacement, this would become:
course_repeatfkfjkjfjkfer_10_topics_0
The call to SUBSTRING_INDEX() then grabs everything appearing after the final underscore, which is the number you want to capture.
You can use substring_index twice like this:
select substring_index(substring_index(col, '_', -3), '_', 1)
from t
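Both approaches above can be checked against the sample values from the question with a short Python sketch. The two function names are invented for this example.

```python
# Sample values copied from the question.
values = [
    "course_repeatfkfjkjfjkfer_10_topics_0_presentation_link",
    "course_repeatfkfjfkfkfklfflkflkfs_1_presentation_link",
    "course_repeatfkfjfkfkfklfflkflkfs_2_presentation_link",
    "coursek_epeatfkfjfkfkfklfflkflkfs_10_presentation_link",
    "course_hdhdhhdhdjhdrepeatfkfjfkfkfklfflkflkfs_21_presentation_link",
]


def via_replace(col: str) -> str:
    # REPLACE(col, '_presentation_link', '') then SUBSTRING_INDEX(..., '_', -1)
    return col.replace("_presentation_link", "").rsplit("_", 1)[-1]


def via_two_calls(col: str) -> str:
    # SUBSTRING_INDEX(SUBSTRING_INDEX(col, '_', -3), '_', 1):
    # the third underscore-separated field from the end
    return col.split("_")[-3]


numbers = [via_replace(v) for v in values]
print(numbers)
print(numbers == [via_two_calls(v) for v in values])
```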
I am querying a table and selecting one column whose values match the regular expression (\/.+\/\?).
Content of the resulted column is like:
/Anything here/?
Example output:
\abc\cdf\?....
\ab\?....
\abc\cdf\?....
\sb\?....
where '....' can be anything
The result I want is the unique values before \?, so that rows with duplicate matched content are shown only once (above, \abc\cdf\?.... appears twice instead of once):
\abc\cdf\?....
\ab\?....
\sb\?....
OR
\abc\cdf\?
\ab\?
\sb\?
I have searched a lot but couldn't find anything. There is regexp_substr in Oracle, but that is not working in SQL.
Please if someone could help me with the sql query that would be awesome.
If you want everything before the last \, then you can use substring_index() and some string manipulation. The number of backslashes is the difference in length with and without them, and yourTable stands in for your table name:
select substring_index(col, '\\',
                       length(col) - length(replace(col, '\\', ''))
                      ) as firstpart,
       count(*)
from yourTable t
group by substring_index(col, '\\',
                         length(col) - length(replace(col, '\\', ''))
                        );
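The backslash-counting trick can be sketched in Python to see what it returns for rows shaped like the question's example. The trailing text after \? is invented, and the sketch assumes that text contains no further backslashes.

```python
from collections import Counter


def before_last_backslash(col: str) -> str:
    n = col.count("\\")                    # length(col) - length(replace(col, '\\', ''))
    return "\\".join(col.split("\\")[:n])  # substring_index(col, '\\', n)


# Rows shaped like the question's example; the text after \? is made up.
rows = [r"\abc\cdf\?one", r"\ab\?two", r"\abc\cdf\?three", r"\sb\?four"]

counts = Counter(before_last_backslash(r) for r in rows)
for part, n in counts.items():
    print(part, n)
```

Each distinct prefix appears once in the grouped output, with the count showing how many rows shared it.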
I have a field with text like "/site/index?sid=18&sub=321333&tid=site.net&ukey=1234543254".
How can I group it by part of the string (e.g. the 'sid' URL parameter)?
The parameters may be in a different order (sid at the end of the line, etc.).
Take a look at the MySQL string functions:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html
Especially this looks helpful:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_substring-index
UPDATE
This is exactly what you asked for:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX("/site/index?sid=18&sub=321333&tid=site.net&ukey=1234543254", 'sid=', -1), '&', 1) AS this_will_be_grouped
Then use this_will_be_grouped in the GROUP BY clause of your query.
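A small Python sketch of the same two-step extraction shows that the parameter order does not matter. The function name sid_of and the second and third URLs are invented for illustration.

```python
from collections import Counter


def sid_of(url: str) -> str:
    # SUBSTRING_INDEX(url, 'sid=', -1): the text after the last 'sid='
    # SUBSTRING_INDEX(..., '&', 1): cut at the next '&', if there is one
    return url.rsplit("sid=", 1)[-1].split("&", 1)[0]


urls = [
    "/site/index?sid=18&sub=321333&tid=site.net&ukey=1234543254",
    "/site/index?sub=5&tid=site.net&sid=18",  # sid at the end
    "/site/index?sub=7&sid=20&tid=site.net",  # sid in the middle
]
counts = Counter(sid_of(u) for u in urls)
print(counts)

# Caveat: any parameter whose name ends in "sid" matches as well.
print(sid_of("/x?psid=7&a=1"))
```

If such look-alike parameter names can occur in the data, matching on '?sid=' and '&sid=' separately would be safer.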