Selecting multiple substrings from a field in MySQL

I have a longtext field in MySQL. I'm looking for every instance of 'media' in it, with roughly 10 characters of context on either side. There are usually multiple instances in a single row's field, so I need to see the context. How can I write a query to do this? I can't even think of where to start.
So what I'm looking at is this:
SELECT field_data_body FROM table WHERE field_data_body LIKE '%media%';
+----------------------------------+
| field_data_body |
+----------------------------------+
| ... ode__media_or ... e immediat |
+----------------------------------+
The field is actually a long string, and I just parsed the actual test value to show the substrings that would match the WHERE clause.
What I actually want to see is all instances of the string media, which in the example above is two, but in other fields could be more. SUBSTR only shows the first instance of media.

Create a function of your own with CREATE FUNCTION. Inside the function you can use the WHILE statement and general string functions such as LOCATE and SUBSTRING.
Here is an example to get you started:
DELIMITER $$
CREATE FUNCTION substring_list(
haystack TEXT,
needle VARCHAR(100)
)
RETURNS TEXT
DETERMINISTIC
BEGIN
DECLARE needle_len INT DEFAULT CHAR_LENGTH(needle);
DECLARE output_str TEXT DEFAULT '';
DECLARE needle_pos INT DEFAULT LOCATE(needle, haystack);
WHILE needle_pos > 0 DO
SET output_str = CONCAT(output_str, SUBSTRING(haystack, GREATEST(needle_pos - 10, 1), LEAST(needle_pos - 1, 10) + needle_len + 10), '\n');
SET needle_pos = LOCATE(needle, haystack, needle_pos + needle_len);
END WHILE;
RETURN output_str;
END$$
DELIMITER ;
Here are some tests. For each match, the term ("media") and up to 10 characters on either side are returned, all concatenated in a single string:
SELECT substring_list('1234567890media12345678immediate34567890media1234567890', 'media');
+---------------------------+
| 1234567890media12345678im |
| 12345678immediate34567890 |
| te34567890media1234567890 |
+---------------------------+
SELECT substring_list('0media12345678immediate34567890media1', 'media');
+---------------------------+
| 0media12345678im |
| 12345678immediate34567890 |
| te34567890media1 |
+---------------------------+
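To sanity-check the window arithmetic outside MySQL, the same loop can be sketched in Python. This is only an illustration: context_windows and its width parameter are names made up here, and unlike LOCATE on non-binary strings, Python's str.find is case-sensitive.

```python
def context_windows(haystack: str, needle: str, width: int = 10) -> list[str]:
    """Return each occurrence of needle with up to `width` characters on either side."""
    if not needle:
        return []  # an empty needle would match everywhere and never advance
    windows = []
    pos = haystack.find(needle)  # 0-based, -1 when absent (LOCATE is 1-based, 0 when absent)
    while pos != -1:
        start = max(pos - width, 0)      # clamp at the start of the string
        end = pos + len(needle) + width  # slicing clamps at the end of the string
        windows.append(haystack[start:end])
        pos = haystack.find(needle, pos + len(needle))  # resume after this match
    return windows


if __name__ == "__main__":
    for window in context_windows("0media12345678immediate34567890media1", "media"):
        print(window)
```

Like the SQL function, the search resumes after the end of each match, so overlapping occurrences of the needle are not reported twice.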

In MySQL you can create a user-defined function (UDF) for this, similar to a word-count function. This UDF may help:
mysql count word in sql syntax

Here is a solution using PHP that will return each row and each result plus the surrounding characters in a multidimensional array.
$value = "media";
$surroundingChars = 5;
$strlen = strlen($value);
$stmt = $pdo->prepare("SELECT field_data_body FROM table WHERE field_data_body LIKE ?");
$stmt->execute([ '%'.$value.'%' ]);
$return = [];
$result = 0;
while (($body = $stmt->fetchColumn()) !== false) {
    $start = 0;
    while (($pos = stripos($body, $value, $start)) !== false) {
        // Clamp the window start at 0; a negative offset would count from the end of the string
        $from = max(0, $pos - $surroundingChars);
        $return[$result][] = substr($body, $from, ($pos - $from) + $strlen + $surroundingChars);
        // Adjust next start
        $start = $pos + $strlen;
    }
    $result++;
}
You could always change the $return[$result][] line, but to echo all rows in the format you wanted, you could do this:
foreach($return as $row) {
echo implode('..', $row);
}
As you stated in the comments, you'd rather a query, but if you change your mind, here is a solution matching your PHP requirements.

Related

MySQL user-defined function returns incorrect value when used in a SELECT statement

I met a problem when calling a user-defined function in MySQL. The computation is very simple but can't grasp where it went wrong and why it went wrong. Here's the thing.
So I created this function:
DELIMITER //
CREATE FUNCTION fn_computeLoanAmortization (_empId INT, _typeId INT)
RETURNS DECIMAL(17, 2)
BEGIN
SET @loanDeduction = 0.00;
SELECT TotalAmount, PeriodicDeduction, TotalInstallments, DeductionFlag
INTO @totalAmount, @periodicDeduction, @totalInstallments, @deductionFlag
FROM loans_table
WHERE TypeId = _typeId AND EmpId = _empId;
IF (@deductionFlag = 1) THEN
SET @remaining = @totalAmount - @totalInstallments;
IF(@remaining < @periodicDeduction) THEN
SET @loanDeduction = @remaining;
ELSE
SET @loanDeduction = @periodicDeduction;
END IF;
END IF;
RETURN @loanDeduction;
END;//
DELIMITER ;
If I call it like this, it works fine:
SELECT fn_computeLoanAmortization(3, 4)
But if I call it inside a SELECT statement, the result becomes erroneous:
SELECT Id, fn_computeLoanAmortization(Id, 4) AS Amort FROM emp_table
There's only one entry in loans_table, so the statement above should return a value in the Amort column for just one row. Instead, many other rows show the same Amort value as the row with the matching entry, which should not be the case.
Has anyone run into this kind of problem? Or have I done something wrong on my end? Kindly enlighten me.
Thank you very much.
EDIT:
By erroneous, I meant it like this:
loans_table has one record
EmpId = 1
TypeId = 2
PeriodicDeduction = 100
TotalAmount = 1000
TotalInstallments = 200
DeductionFlag = 1
emp_table has several rows
EmpId = 1
Name = Paolo
EmpId = 2
Name = Nikko
...
EmpId = 5
Name = Ariel
when I query the following statements, I get the correct value:
SELECT fn_computeLoanAmortization(1, 2)
SELECT Id, fn_computeLoanAmortization(Id, 2) AS Amort FROM emp_table WHERE EmpId = 1
But when I query this statement, I get incorrect values:
SELECT Id, fn_computeLoanAmortization(Id, 2) AS Amort FROM emp_table
Resultset would be:
EmpId | Amort
--------------------
1 | 100
2 | 100 (this should be 0, but the query returns 100)
3 | 100 (same error here)
...
5 | 100 (same error here up to the last record)
Inside your function, the variables you use to retrieve the values from loans_table are not variables local to the function but session (user-defined) variables. When the SELECT inside the function does not find any row, those variables keep the values from the previous execution of the function.
Use real local variables instead: use variable names without the @ prefix and declare them with DECLARE at the beginning of the function. See this answer for more details.
I suspect the problem is that the variables in the INTO are not re-set when there is no matching row.
Just set them before the INTO:
BEGIN
SET @loanDeduction = 0.00;
SET @totalAmount = 0;
SET @periodicDeduction = 0;
SET @totalInstallments = 0;
SET @deductionFlag = 0;
SELECT TotalAmount, PeriodicDeduction, TotalInstallments, DeductionFlag
. . .
You might just want to set them to NULL.
Or, switch your logic to use local variables:
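-- local variables must be declared at the top of the BEGIN block;
-- the types here are assumed, so match them to the loans_table columns
DECLARE v_loanDeduction DECIMAL(17, 2) DEFAULT 0.00;
DECLARE v_totalAmount DECIMAL(17, 2) DEFAULT 0;
DECLARE v_periodicDeduction DECIMAL(17, 2) DEFAULT 0;
DECLARE v_totalInstallments DECIMAL(17, 2) DEFAULT 0;
DECLARE v_deductionFlag INT DEFAULT 0;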
And so on.
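The effect both answers describe can be reproduced in miniature. The sketch below is a Python simulation, not MySQL: the LOANS dict stands in for loans_table using the values from the question, the session dict stands in for the session's user-defined variables, and all function names are hypothetical.

```python
# Hypothetical stand-in for loans_table: one loan for EmpId 1, TypeId 2.
LOANS = {(1, 2): {"total": 1000, "periodic": 100, "installments": 200, "flag": 1}}

# Stand-in for the session: user-defined (@) variables outlive a single function call.
session = {}


def amort_session_vars(emp_id: int, type_id: int) -> int:
    session["deduction"] = 0
    row = LOANS.get((emp_id, type_id))
    if row is not None:
        session.update(row)  # SELECT ... INTO assigns only when a row is found
    if session.get("flag") == 1:  # a stale flag from an earlier call still counts
        remaining = session["total"] - session["installments"]
        session["deduction"] = min(remaining, session["periodic"])
    return session["deduction"]


def amort_local_vars(emp_id: int, type_id: int) -> int:
    # Locals start fresh on every call, like DECLAREd variables in a stored function.
    empty = {"total": 0, "periodic": 0, "installments": 0, "flag": 0}
    row = LOANS.get((emp_id, type_id), empty)
    if row["flag"] != 1:
        return 0
    return min(row["total"] - row["installments"], row["periodic"])


if __name__ == "__main__":
    print([amort_session_vars(emp, 2) for emp in range(1, 6)])
    print([amort_local_vars(emp, 2) for emp in range(1, 6)])
```

Once the row for employee 1 has been read, the session-variable version keeps reusing its values for every later employee, which matches the result set in the question; the local-variable version returns 0 for employees without a loan.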

Retrieve mysql data without any order applied

I have a table named actions with primary key action_id. I am retrieving rows from this table by passing my own ordered list of action_ids, for example:
$actionIds = array(5,9,10,21,3,18,4);
$actionsTb = Engine_Api::_()->getDbtable('actions','activity');
$postSelect = $actionsTb->select()
->where('action_id IN(?)', $actionIds)
->where('type = ?', 'status')
;
The issue is that the result comes back in ascending order (3,4,5,9,10,18,21), but I want the rows in the same order as the action ids I passed; I don't want any other ordering applied to the result.
Please help. A plain SQL query is fine too.
Since you're using PHP, why not use a loop to build the WHERE clause dynamically? Here is the code:
$where = "action_id IN(";
for($x=0; $x<count($actionIds); $x++)
{
    // append each id, followed by a comma or the closing parenthesis
    if($x==count($actionIds)-1)
    {
        // last id in the array: add the closing parenthesis of the IN list
        $where.=$actionIds[$x].")";
    }
    else
    {
        $where.=$actionIds[$x].",";
    }
}
To test the output, echo it first:
echo $where; // the result should be "action_id IN(5,9,10,21,3,18,4)"
Here is the complete code:
$actionIds = array(5,9,10,21,3,18,4);
$where = "action_id IN(";
for($x=0; $x<count($actionIds); $x++)
{
    // append each id, followed by a comma or the closing parenthesis
    if($x==count($actionIds)-1)
    {
        // last id in the array: add the closing parenthesis of the IN list
        $where.=$actionIds[$x].")";
    }
    else
    {
        $where.=$actionIds[$x].",";
    }
}
$actionsTb = Engine_Api::_()->getDbtable('actions','activity');
$postSelect = $actionsTb->select()
->where($where)
->where('type = ?', 'status')
;
I don't know Zend coding, but the following approach may help you.
->where( 'FIND_IN_SET( action_id, ? )', $actionIds )
I am not sure whether Zend's where() converts the array $actionIds into a comma-separated list of values, i.e. '5,9,10,21,3,18,4'. If it does, part of the resulting query would look like:
where find_in_set( action_id, '5,9,10,21,3,18,4' )
Example:
mysql> select find_in_set( 18, '5,9,10,21,3,18,4' );
+---------------------------------------+
| find_in_set( 18, '5,9,10,21,3,18,4' ) |
+---------------------------------------+
| 6 |
+---------------------------------------+
1 row in set (0.00 sec)
Reference:
Read MySQL documentation on FIND_IN_SET
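Note that IN (...) and FIND_IN_SET(...) in a WHERE clause only filter rows; without an ORDER BY, MySQL makes no promise about the order in which they come back. One commonly used option is to order by the position of action_id in the list, for example with MySQL's FIELD() function. The Python sketch below uses two hypothetical helpers: one builds such a clause from the id list, the other reorders already-fetched rows on the client side instead.

```python
def order_by_field_clause(column: str, ids: list[int]) -> str:
    """Build an ORDER BY FIELD(...) clause that sorts rows in the order of ids."""
    # int() rejects anything that is not a plain integer before it reaches the SQL text
    id_list = ",".join(str(int(i)) for i in ids)
    return f"ORDER BY FIELD({column}, {id_list})"


def reorder_rows(rows: list[dict], ids: list[int], key: str = "action_id") -> list[dict]:
    """Client-side alternative: sort fetched rows by their position in ids."""
    rank = {wanted: position for position, wanted in enumerate(ids)}
    return sorted(rows, key=lambda row: rank[row[key]])


if __name__ == "__main__":
    action_ids = [5, 9, 10, 21, 3, 18, 4]
    print(order_by_field_clause("action_id", action_ids))
    fetched = [{"action_id": i} for i in sorted(action_ids)]
    print([row["action_id"] for row in reorder_rows(fetched, action_ids)])
```

The generated clause would be appended to the SELECT that already filters with IN (...); how to attach it through the Zend select object is not shown here.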

Select column names with their maximum data length MySQL

I have a table in MySQL with many columns, and I want to see the maximum length of the values in each. I know that some data was truncated on insert, and I want to increase the varchar length, but I don't know for which columns. (The explanation is a little messy, but the SQL will probably make sense.)
I tried:
select COLUMN_NAME, CHARACTER_MAXIMUM_LENGTH, DATA_TYPE, (SELECT LENGTH(COLUMN_NAME) as maxlen FROM my_database.my_table ORDER BY maxlen DESC LIMIT 1)
from information_schema.columns
where table_schema = 'my_database' AND
table_name = 'my_table' AND DATA_TYPE = 'varchar'
It works, but it returns the length of the column name, not of the data inside the column (e.g. for the column called id, I get 2).
If I use a JOIN (with an ON TRUE condition), I get an error that COLUMN_NAME is undefined.
Stored procedures do not allow returning data, and functions do not allow dynamic SQL inside them.
How can I tell MySQL (in my query) to treat COLUMN_NAME not as a string but as a column name? If this is not possible in a SELECT, how can I get the columns along with the maximum length of the data inside them?
Desired result looks like:
column_1 | 25 | varchar | 20
column_2 | 25 | varchar | 7
I am interested only in varchar columns, as adjusting int columns does not make sense (and is not needed). The columns have different lengths (varchar(20), varchar(25), etc.).
Update 1: This also cannot be done via a loop (statements cannot be executed inside a cursor).
I use something like this code to generate my view automatically from the table schema. You can modify it according to your needs.
$sql = "show tables from DBName where Tables_in_DBName = 'yourtablename' ";
$result = executeQuery($sql, $conn);
$num = $result->num_rows;
if ($num) {
    $sql = "show columns from yourtablename where Extra != 'auto_increment'";
    $result = executeQuery($sql, $conn);
    $num2 = $result->num_rows;
    while ($r = $result->fetch_assoc()) {
        if ($r['Key'] == 'MUL' && ( preg_match("/^int/", $r['Type']) || preg_match("/^smallint/", $r['Type']) || preg_match("/^tinyint/", $r['Type']) || preg_match("/^bigint/", $r['Type']))) {
        } else if ($r['Field'] == 'status') {
        }
    }
}
Where $r['Field'] is the field name and $r['Type'] provides its type. To determine the declared maximum length of a varchar column, use
// offset 8 skips the "varchar(" prefix, so "varchar(20)" yields "20"
$maxlength = substr(str_replace(")", "", $r['Type']), 8);
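Because a column name cannot be supplied as data inside a single static query, one workaround is to generate the statement: read the varchar column names from information_schema.columns first, then build one SELECT that measures every column. Below is a sketch of the generation step in Python; max_length_query is a hypothetical helper, and the column list is assumed to have been fetched already.

```python
def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def max_length_query(table: str, columns: list[str]) -> str:
    """Build one SELECT returning the longest stored value length per column."""
    if not columns:
        raise ValueError("at least one column is required")
    parts = [
        f"MAX(CHAR_LENGTH({quote_identifier(col)})) AS {quote_identifier(col)}"
        for col in columns
    ]
    return f"SELECT {', '.join(parts)} FROM {quote_identifier(table)}"


if __name__ == "__main__":
    # In practice the names would come from information_schema.columns
    # filtered on table_schema, table_name and DATA_TYPE = 'varchar'.
    print(max_length_query("my_table", ["column_1", "column_2"]))
```

The result is a single row with one value per column, which can then be compared against CHARACTER_MAXIMUM_LENGTH for the same columns.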

SQL query to remove certain text from each field in a specific column?

I recently recoded one of my sites, and the database structure is a little bit different.
I'm trying to convert the following:
*----*----------------------------*
| id | file_name |
*----*----------------------------*
| 1 | 1288044935741310953434.jpg |
*----*----------------------------*
| 2 | 1288044935741310352357.rar |
*----*----------------------------*
Into the following:
*----*----------------------------*
| id | file_name |
*----*----------------------------*
| 1 | 1288044935741310953434 |
*----*----------------------------*
| 2 | 1288044935741310352357 |
*----*----------------------------*
I know that I could do a foreach loop in PHP, explode the file extension off the end, and update each row that way, but that seems like far too many queries for the task.
Is there an SQL query I could run that would remove the file extension from each field in the file_name column?
You can use the REPLACE() function in native MySQL to do a simple string replacement.
UPDATE tbl SET file_name = REPLACE(file_name, '.jpg', '');
UPDATE tbl SET file_name = REPLACE(file_name, '.rar', '');
This should work, assuming every file name ends in a dot plus a three-character extension:
UPDATE MyTable
SET file_name = SUBSTRING(file_name,1, CHAR_LENGTH(file_name)-4)
This will strip off the final extension, if any, from file_name each time it is run. It is agnostic with respect to extension (so you can have ".foo" some day) and won't harm extensionless records.
UPDATE tbl
SET file_name = TRIM(TRAILING CONCAT('.', SUBSTRING_INDEX(file_name, '.', -1)) FROM file_name);
You can use the SUBSTRING_INDEX function:
SUBSTRING_INDEX(str,delim,count)
Here str is the string, delim is the delimiter (to the left or right of which you want the substring), and count specifies which occurrence of the delimiter to use (in case the delimiter occurs more than once in the string).
Example:
UPDATE table SET file_name = SUBSTRING_INDEX(file_name , '.' , 1);
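The four approaches above agree on the sample data but differ on other file names. The Python sketch below mimics each SQL expression (the function names are made up for this illustration) so the differences can be seen on names such as archive.tar.gz or a name with no extension.

```python
def replace_known(name: str) -> str:
    # REPLACE(file_name, '.jpg', '') followed by REPLACE(file_name, '.rar', '')
    return name.replace(".jpg", "").replace(".rar", "")


def drop_last_four(name: str) -> str:
    # SUBSTRING(file_name, 1, CHAR_LENGTH(file_name) - 4)
    return name[: max(len(name) - 4, 0)]


def trim_last_extension(name: str) -> str:
    # TRIM(TRAILING CONCAT('.', SUBSTRING_INDEX(file_name, '.', -1)) FROM file_name)
    suffix = "." + name.rsplit(".", 1)[-1]
    while name.endswith(suffix):  # TRIM removes repeated trailing occurrences
        name = name[: -len(suffix)]
    return name


def before_first_dot(name: str) -> str:
    # SUBSTRING_INDEX(file_name, '.', 1)
    return name.split(".", 1)[0]


if __name__ == "__main__":
    for sample in ["1288044935741310953434.jpg", "archive.tar.gz", "noext"]:
        print(sample, replace_known(sample), drop_last_four(sample),
              trim_last_extension(sample), before_first_dot(sample))
```

Only the REPLACE and TRIM variants leave an extensionless name untouched, and only the TRIM variant removes exactly the final extension whatever its length.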

How do I select noncontiguous characters from a string of text in MySQL?

I have a table with millions of rows and a single column of text that is exactly 11,159 characters long. It looks like this:
1202012101...(to 11,159 characters)
1202020120...
0121210212...
...
(to millions of rows)
I realize that I can use
SELECT SUBSTR(column,2,4) FROM table;
...if I wanted to pull out characters 2, 3, 4, and 5:
1202012101...
1202020120...
0121210212...
^^^^
But I need to extract noncontiguous characters, e.g. characters 1,5,7:
1202012101...
1202020120...
0121210212...
^ ^ ^
I realize this can be done with a query like:
SELECT CONCAT(SUBSTR(colm,1,1),SUBSTR(colm,5,1),SUBSTR(colm,7,1)) FROM table;
But this query gets very unwieldy to build for thousands of characters that I need to select. So for the first part of the question - how do I build a query that does something like this:
SELECT CHARACTERS(string,1,5,7) FROM table;
Furthermore, the indices of the characters I want to select are from a different table that looks something like this:
char_index keep_or_discard
1 keep
2 discard
3 discard
4 discard
5 keep
7 discard
8 keep
9 discard
10 discard
So for the second part of the question, how could I build a query to select specific characters from the first table based on whether keep_or_discard="keep" for that character's index in the second table?
This function does what you want:
CREATE DEFINER = `root`@`localhost` FUNCTION `test`.`getsubset`(selection mediumtext, longstring mediumtext)
RETURNS varchar(200)
LANGUAGE SQL
NOT DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
COMMENT 'This function returns a subset of characters.'
BEGIN
SET @res:='';
SET @selection:=selection;
WHILE @selection<>'' DO
set @pos:=CONVERT(@selection, signed);
set @res := concat_ws('',@res,SUBSTRING(longstring,@pos,1));
IF LOCATE(',',@selection)=0 THEN
SET @selection:='';
END IF;
set @selection:=SUBSTRING(@selection,LOCATE(',',@selection)+1);
END WHILE;
RETURN @res;
END
Note: the CONVERT('1,2,3,4',signed) will yield 1, but it will give a warning.
I have it defined to be available in the database test.
The function takes two parameters; a string(!) with a list of positions, and a long string from where you want the characters taken.
An example of using this:
mysql> select * from keepdiscard;
+---------+------------+
| charind | keepordisc |
+---------+------------+
| 1 | keep |
| 2 | discard |
| 3 | keep |
| 4 | discard |
| 5 | keep |
| 6 | keep |
+---------+------------+
6 rows in set (0.00 sec)
mysql> select * from test;
+-------------------+
| longstring |
+-------------------+
| abcdefghijklmnopq |
| 123456789 |
+-------------------+
2 rows in set (0.00 sec)
mysql> select getsubset(group_concat(charind ORDER BY charind),longstring) as result from keepdiscard, test where keepordisc='keep' group by longstring;
+--------+
| result |
+--------+
| 1356 |
| acef |
+--------+
2 rows in set, 6 warnings (0.00 sec)
The warnings stem from the fast conversion to integer that is done in the function. (See comment above)
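For comparison, the selection logic itself is small. Here is a minimal Python sketch of what getsubset computes, with hypothetical helper names and positions treated as 1-based, as in SUBSTRING:

```python
def kept_positions(rows: list[tuple[int, str]]) -> list[int]:
    """Turn (char_index, keep_or_discard) rows into a sorted list of positions to keep."""
    return sorted(index for index, flag in rows if flag == "keep")


def pick_characters(text: str, positions: list[int]) -> str:
    """Concatenate the characters at the given 1-based positions, skipping any past the end."""
    return "".join(text[p - 1] for p in positions if 1 <= p <= len(text))


if __name__ == "__main__":
    keepdiscard = [(1, "keep"), (2, "discard"), (3, "keep"),
                   (4, "discard"), (5, "keep"), (6, "keep")]
    positions = kept_positions(keepdiscard)
    for longstring in ["abcdefghijklmnopq", "123456789"]:
        print(pick_characters(longstring, positions))
```

Positions beyond the end of the string contribute nothing, which mirrors SUBSTRING returning an empty string in that case.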
How about dynamic SQL? (You will need to build the select part of the query.)
CREATE PROCEDURE example_procedure()
BEGIN
--
-- build the concat values here
--
SET @ids := '';
SET @S = 'SELECT @ids := built_concat_of_values FROM table';
PREPARE n_StrSQL FROM @S;
EXECUTE n_StrSQL;
DEALLOCATE PREPARE n_StrSQL;
END
You can write a PHP script to do this for you:
<?php
// connect to MySQL and select the database (mysqli replaces the removed mysql_* functions)
$conn = mysqli_connect('localhost', 'mysql_user', 'mysql_password', 'mydb');
if (!$conn) {
    echo 'Unable to connect to DB: ' . mysqli_connect_error();
    exit;
}
// get the keep numbers you're going to use
// and turn each number into a string so, for example, instead of 5 you get 'SUBSTR(colm,5,1)'
$result = mysqli_query($conn, "SELECT number FROM number_table WHERE keep_or_discard='keep'");
$numbers = array();
while ($row = mysqli_fetch_assoc($result)) {
    $numbers[] = 'SUBSTR(colm,' . (int) $row['number'] . ',1)';
}
// implode the array so you get one long string with all the substrings
// e.g. 'SUBSTR(colm,1,1),SUBSTR(colm,5,1),SUBSTR(colm,12,1)'
$numbers = implode(",", $numbers);
// pull the characters you need and save them to an array
$result = mysqli_query($conn, "SELECT " . $numbers . " FROM table");
$concat = array();
while ($row = mysqli_fetch_assoc($result)) {
    $concat[] = $row;
}
And there you have an array with the selected characters for each row.
I'm sorry if you can't/don't want to use PHP for this, I just don't really know how to do this without PHP, Perl, Python or some other similar language. Hopefully this solution will help somehow...
The source of your difficulty is that your schema does not represent the true relationships between the data elements. If you wanted to achieve this with "pure" SQL, you would need a schema more like:
table
ID Index Char
1 0 1
1 1 2
1 2 0
charsToKeep
ID Index Keep
1 0 false
1 1 true
1 2 true
Then, you could perform a query like:
SELECT t.`Char` FROM `table` t JOIN charsToKeep c ON t.ID = c.ID AND t.`Index` = c.`Index` WHERE c.Keep = true
However, you probably have good reasons for structuring your data the way you have (my schema requires much more storage space per character, and the processing time is probably also much longer than with what I am about to suggest).
Since SQL does not have the tools to understand the schema you have embedded into your table, you will need to add them with a user-defined function. Kevin's example of dynamic SQL may also work, but in my experience this is not as fast as a user-defined function.
I have done this in MS SQL many times, but never in MySQL. You basically need a function, written in C or C++, that takes a comma-delimited list of the indexes you want to extract and the string from which you want to extract them. The function then returns a comma-delimited list of the extracted values. See these links for a good starting point:
http://dev.mysql.com/doc/refman/5.1/en/adding-functions.html
http://dev.mysql.com/doc/refman/5.1/en/adding-udf.html
To build the concatenated list of indexes you want to extract from the char_index table, try the group_concat function:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_group-concat
Hope this helps!