I have a library of questions and answers and am building an API in NodeJS that searches for answers based on the question passed as input. My goal is to:
1. Split the question by spaces
2. Tokenize it and remove stopwords
3. Query the database for records where the question contains one or more words from the tokenized array
4. Ideally, sort in descending order of the total number of matches in the question. For example, if question A contains 'module' and 'solution' and question B contains only 'solution', then question A should be shown before question B
I've been able to achieve 1 to 3, using the below code:
let question = req.query.question;
let arrQuestions = question.split(" ");
let tokenizedQuestion = stopwords.removeStopwords(arrQuestions);
let whereClause = tokenizedQuestion.join("%' OR answer LIKE '%");
whereClause = " answer LIKE '%" + whereClause + "%' ";
let query = "SELECT * FROM tbl_libraries WHERE " + whereClause;
I'm not able to figure out how to achieve 4. Can somebody provide pointers?
Thanks!
Are you sure that you do not want to use MySQL fulltext search for this?
If the answer is some flavour of 'No', you can continue reading...
In one of my projects I implemented something like this.
Query-wise it looks like this (simplified version):
SELECT
name
FROM
table
WHERE
name REGEXP 'term1|term2|term3' -- you can use your OR + LIKE way
ORDER BY
SP_TermsWeight(name, 'term1 term2 term3') DESC
All the magic is in my SP_TermsWeight function, which returns a "weight" (a number). I supply a list of cleaned and normalized terms to the function.
The function:
CREATE FUNCTION `SP_TermsWeight`(
`sValue` TEXT,
`sTerms` VARCHAR(127)
)
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE i INT DEFAULT 1;
DECLARE p INT DEFAULT 1;
DECLARE w INT DEFAULT 0;
DECLARE l INT;
DECLARE c CHAR(1);
DECLARE s VARCHAR(63);
DECLARE delimiters VARCHAR(15) DEFAULT ' ,';
SET sTerms = TRIM(sTerms);
SET l = LENGTH(sTerms);
IF (l > 0) THEN
-- checking whether the value matches the terms exactly
IF (sTerms = sValue) THEN
SET w = 50000;
ELSE
-- treating "the terms" as one single term: if it matches in full, the weight will be high
IF (l <= 63) THEN
SET w = w + SP_TermWeight(sValue, sTerms, 5000, 1000, 100);
END IF;
-- not processing term by term if it has already matched in full
IF (w = 0) THEN
-- processing term by term using space or comma as delimiter
WHILE i <= l DO
BEGIN
SET c = SUBSTRING(sTerms, i, 1);
IF (LOCATE(c, delimiters) > 0) THEN
SET s = SUBSTRING(sTerms, p, i - p);
SET w = w + SP_TermWeight(sValue, s, 50, 10, 0);
SET p = i + 1;
END IF;
SET i = i + 1;
END;
END WHILE;
IF (p > 1 AND p < i) THEN
SET s = SUBSTRING(sTerms, p, i - 1);
SET w = w + SP_TermWeight(sValue, s, 50, 10, 0);
END IF;
END IF;
END IF;
END IF;
RETURN w;
END
Technically speaking, it separates the terms using the delimiters and checks whether the value contains each term.
It is a bit hard to explain everything it does (I've added a few comments in the code for you).
Feel free to ask questions if you do not understand some bits.
In your case it can be simplified dramatically as you do not need to differentiate begin/end/middle matches.
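If begin/end/middle weighting is not needed, one simplified option is to let MySQL count the matching terms directly: a comparison such as question LIKE ? evaluates to 1 or 0 in MySQL, so the sum of the comparisons is the number of matched terms. The following is a minimal Node.js sketch under stated assumptions: the table tbl_libraries comes from the question, the column name question is assumed, buildSearchQuery is a hypothetical helper name, and the ? placeholders assume a driver that supports parameter binding.

```javascript
// Hypothetical helper: builds a parameterized query that returns rows matching
// at least one term, ordered by how many terms matched.
function buildSearchQuery(terms) {
  // Drop empty strings left over from repeated spaces.
  const cleaned = terms.map((t) => t.trim()).filter((t) => t.length > 0);
  if (cleaned.length === 0) {
    return null; // nothing to search for
  }
  // In MySQL a comparison evaluates to 1 or 0, so the sum is the match count.
  const score = cleaned.map(() => '(question LIKE ?)').join(' + ');
  const where = cleaned.map(() => 'question LIKE ?').join(' OR ');
  const sql =
    'SELECT *, (' + score + ') AS matches ' +
    'FROM tbl_libraries WHERE ' + where + ' ORDER BY matches DESC';
  const patterns = cleaned.map((t) => '%' + t + '%');
  // One set of patterns for the score expression, one for the WHERE clause.
  return { sql: sql, params: patterns.concat(patterns) };
}

// Usage: the tokenized question from the original code would be passed in.
const built = buildSearchQuery(['module', 'solution']);
console.log(built.sql);
console.log(built.params);
```

Note that % and _ inside a term still act as LIKE wildcards in this sketch; escaping them is left out for brevity.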
Another helper function that is used internally:
CREATE FUNCTION `SP_TermWeight`(
`sValue` TEXT,
`sTerm` VARCHAR(63),
`iWeightBegin` INT,
`iWeightEnd` INT,
`iWeightMiddle` INT
)
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE r INT DEFAULT 0;
SET sTerm = TRIM(sTerm);
IF (LENGTH(sTerm) > 1) THEN
IF (iWeightBegin != 0 AND sValue REGEXP CONCAT('[[:<:]]', sTerm)) THEN
SET r = r + iWeightBegin;
END IF;
IF (iWeightEnd != 0 AND sValue REGEXP CONCAT(sTerm, '[[:>:]]')) THEN
SET r = r + iWeightEnd;
END IF;
IF (r = 0 AND iWeightMiddle != 0 AND sValue REGEXP sTerm) THEN
SET r = r + iWeightMiddle;
END IF;
END IF;
RETURN r;
END
This function is used to assign different weights depending on whether the term matches at the beginning of a word, at the end of a word, or in the middle. That is important in my case; in your case a simple LIKE might be enough.
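For readers who would rather rank results in application code, the same begin/end/middle idea can be sketched in JavaScript. This is an illustration under assumptions, not a faithful port: it uses \b as the word boundary, escapes the term so it is matched literally, and matches case-insensitively, which may differ from what REGEXP does in a given MySQL setup. The names termWeight and escapeRegExp are hypothetical.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Rough JavaScript counterpart of SP_TermWeight (illustrative only).
function termWeight(value, term, weightBegin, weightEnd, weightMiddle) {
  const t = term.trim();
  if (t.length <= 1) {
    return 0; // single characters are ignored, as in the SQL version
  }
  const pattern = escapeRegExp(t);
  let weight = 0;
  // Term starts at a word boundary.
  if (weightBegin !== 0 && new RegExp('\\b' + pattern, 'i').test(value)) {
    weight += weightBegin;
  }
  // Term ends at a word boundary.
  if (weightEnd !== 0 && new RegExp(pattern + '\\b', 'i').test(value)) {
    weight += weightEnd;
  }
  // Term appears somewhere else, counted only if nothing matched so far.
  if (weight === 0 && weightMiddle !== 0 && new RegExp(pattern, 'i').test(value)) {
    weight += weightMiddle;
  }
  return weight;
}

console.log(termWeight('module solution', 'solution', 50, 10, 0));
```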
I ended up using Full Text Search. Following is the stored procedure I created to enable searching:
DELIMITER $$
DROP PROCEDURE IF EXISTS SP_Search $$
CREATE PROCEDURE `SP_Search`(IN QuestionToSearch TEXT, IN TagsToSearch TEXT, IN CollectionsToSearch TEXT, IN ReturnRecordsFromIndex INT, IN TotalRecordsToReturn INT)
BEGIN
SET @MainQuery = CONCAT("SELECT *, MATCH(question, answer_content) AGAINST (", CONCAT("'", QuestionToSearch, "'"), " IN NATURAL LANGUAGE MODE) AS score ");
SET @MainQuery = CONCAT(@MainQuery, " FROM tbl_libraries ");
SET @MainQuery = CONCAT(@MainQuery, " WHERE MATCH(question, answer_content) AGAINST (", CONCAT("'", QuestionToSearch, "'"), " IN NATURAL LANGUAGE MODE) ");
IF F_IsNullOrEmpty(TagsToSearch) AND NOT F_IsNullOrEmpty(CollectionsToSearch) THEN
SET @MainQuery = CONCAT(@MainQuery, " AND collections LIKE '%", CollectionsToSearch, "%' ");
ELSEIF F_IsNullOrEmpty(CollectionsToSearch) AND NOT F_IsNullOrEmpty(TagsToSearch) THEN
SET @MainQuery = CONCAT(@MainQuery, " AND tags LIKE '", TagsToSearch, "' ");
ELSEIF NOT F_IsNullOrEmpty(TagsToSearch) AND NOT F_IsNullOrEmpty(CollectionsToSearch) THEN
SET @MainQuery = CONCAT(@MainQuery, " AND tags LIKE '", TagsToSearch, "' AND collections LIKE '", CollectionsToSearch, "' ");
END IF;
SET @MainQuery = CONCAT(@MainQuery, " ORDER BY score DESC ");
SET @MainQuery = CONCAT(@MainQuery, " LIMIT ", ReturnRecordsFromIndex, ", ", TotalRecordsToReturn);
PREPARE SqlQuery FROM @MainQuery;
EXECUTE SqlQuery;
DEALLOCATE PREPARE SqlQuery;
END $$
DELIMITER ;
This uses a custom function I created, F_IsNullOrEmpty, which is shown below for completeness:
CREATE FUNCTION F_IsNullOrEmpty(ValueToCheck VARCHAR(256)) RETURNS BOOL
DETERMINISTIC
BEGIN
IF((ValueToCheck IS NULL) OR (LENGTH(ValueToCheck) = 0) OR (ValueToCheck = 'null')) THEN
Return True;
ELSE
Return False;
END IF;
END;
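The last two arguments of SP_Search are the zero-based offset and the row count that end up in the LIMIT clause. The following is a small sketch of how a Node.js caller might derive them from a page number. The names buildSearchCall, page, and pageSize are hypothetical, and the ? placeholders assume a driver that supports parameter binding.

```javascript
// Hypothetical helper: turns a 1-based page number into the arguments
// expected by SP_Search (offset and number of rows to return).
function buildSearchCall(question, tags, collections, page, pageSize) {
  // Fall back to the first page for missing, zero, or negative page numbers.
  const safePage = Math.max(1, Math.floor(page) || 1);
  const offset = (safePage - 1) * pageSize;
  return {
    sql: 'CALL SP_Search(?, ?, ?, ?, ?)',
    // Empty filters are sent as NULL, which F_IsNullOrEmpty treats as empty.
    params: [question, tags || null, collections || null, offset, pageSize],
  };
}

console.log(buildSearchCall('module solution', '', null, 3, 20).params);
```

Binding the arguments of CALL does not change the fact that the procedure concatenates them into dynamic SQL, so the inputs still need to be validated before they reach it.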
So, I have more or less this structure of columns in my table:
Name     Age  Operator
-------  ---  --------
Jhon     35   >
Michael  30   =
Jess     27   <
Based on that, I want to make a query like this:
SELECT * FROM mytable WHERE Name = 'Jhon' AND Age > 40
Obviously this will return no results, and that's fine, but my problem is that I want to use Jhon's "Operator" value (> in this case) to build that condition.
Is it possible?
Thank you!
You can simply do it like this:
SELECT
*
FROM Table1
WHERE Name = 'Jhon' AND CASE
WHEN Operator = '>' THEN Age > 10
WHEN Operator = '<' THEN Age < 10
WHEN Operator = '=' THEN Age = 10
END
see it working live in an sqlfiddle
You could also use MySQL's PREPARE and EXECUTE statements to build dynamic SQL.
SET @name = 'Jhon';
SET @operator = NULL;
SET @age = 10;
SELECT
Operator
INTO
@operator
FROM
Table1
WHERE
Name = @name;
SET @SQL = CONCAT(
"SELECT"
, " * "
, " FROM "
, " Table1 "
, " WHERE "
, " name = '", @name, "' AND age ", @operator, ' ' , @age
);
SELECT @SQL; # not needed but you can see the generated SQL code which will be executed
PREPARE s FROM @SQL;
EXECUTE s;
see demo https://www.db-fiddle.com/f/3Z59Lxaoy1ZXC4kdNCtpsr/1
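Because the dynamic SQL variant concatenates the operator into the statement text, it is worth validating the operator first. The following is a minimal sketch, assuming the statement is assembled in Node.js application code rather than inside MySQL. The names buildAgeQuery and ALLOWED_OPERATORS are hypothetical, and the ? placeholders assume a driver that supports parameter binding.

```javascript
// Only operators from this list may be spliced into the SQL text.
const ALLOWED_OPERATORS = ['>', '<', '='];

// Hypothetical helper: the operator is checked against the whitelist,
// while name and age are passed as bound parameters.
function buildAgeQuery(name, operator, age) {
  if (!ALLOWED_OPERATORS.includes(operator)) {
    throw new Error('Unsupported operator: ' + operator);
  }
  return {
    sql: 'SELECT * FROM Table1 WHERE Name = ? AND Age ' + operator + ' ?',
    params: [name, age],
  };
}

console.log(buildAgeQuery('Jhon', '>', 10).sql);
```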
I have a report that displays a name (FirstName MiddleName LastName). There is a space between each field, which works well when the name has a MiddleName. However, when MiddleName is NULL, an extra space is rendered between FirstName and LastName. What is the best way to solve this problem?
I tried something like this:
=Fields!FirstName.Value & "" & iif (isNothing(Fields!MiddleName.Value), "", Fields!MiddleName.Value) & "" & Fields!LastName.Value
With your current expression there should be no spaces. If there are any spaces currently, they must exist in your database. This can be confirmed by using TRIM on FirstName, MiddleName, and LastName.
To format your string properly, you need to put a space between the quotation marks (" ") and ensure that the space following the middle name is inside the iif statement:
=TRIM(Fields!FirstName.Value) & " " &
iif (isNothing(TRIM(Fields!MiddleName.Value)),
"",
TRIM(Fields!MiddleName.Value) & " ")
& TRIM(Fields!LastName.Value)
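The rule behind that expression is "trim each part, drop the empty ones, and join the rest with a single space". The following is a minimal JavaScript sketch of the same rule, for illustration only. The report itself still needs the expression above, and fullName is a hypothetical name.

```javascript
// Joins the non-empty name parts with exactly one space between them.
function fullName(first, middle, last) {
  return [first, middle, last]
    // Treat null/undefined as empty and strip surrounding whitespace.
    .map((part) => (part == null ? '' : String(part).trim()))
    // Drop parts that are empty after trimming (for example a NULL middle name).
    .filter((part) => part.length > 0)
    .join(' ');
}

console.log(fullName('Ann', null, 'Lee'));
console.log(fullName(' Ann ', 'B.', 'Lee'));
```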
First, your expression above doesn't return a name with spaces.
I think you mean something like this, with spaces:
=Fields!FirstName.Value & " " & iif (isNothing(Fields!MiddleName.Value), " ", Fields!MiddleName.Value) & " " & Fields!LastName.Value
Aside from Jonnus's answer, you could also use Len to check the length.
If you are only worried about MiddleName, you could use this expression:
=Fields!FirstName.Value & " " & IIF(Len(Fields!MiddleName.Value) < 1, "", Fields!MiddleName.Value & " ") & Fields!LastName.Value
But if I were you, I would handle this in a SQL function rather than in an expression, which gets messy. The function can also be reused in other reports instead of repeating the expression.
Here is an example of a function that returns a full name, including the middle name, in your format.
IF EXISTS (
SELECT *
FROM dbo.sysobjects
WHERE ID = OBJECT_ID(N'[dbo].[GetClientFullNameWithMiddleName]') AND
xtype in (N'FN', N'IF', N'TF'))
DROP FUNCTION [dbo].[GetClientFullNameWithMiddleName]
GO
CREATE FUNCTION [dbo].[GetClientFullNameWithMiddleName](@ClientID UNIQUEIDENTIFIER)
RETURNS NVARCHAR(MAX)
AS
BEGIN
DECLARE @FULLNAME NVARCHAR(MAX)
DECLARE @LASTNAME NVARCHAR(MAX)
DECLARE @FIRSTNAME NVARCHAR(MAX)
DECLARE @MIDDLENAME NVARCHAR(MAX)
SET @LASTNAME = ISNULL((SELECT Lastname FROM Client WHERE ClientID = @ClientID),'')
SET @FIRSTNAME = ISNULL((SELECT Firstname FROM Client WHERE ClientID = @ClientID),'')
SET @MIDDLENAME = ISNULL((SELECT Middlename FROM Client WHERE ClientID = @ClientID),'')
IF(@ClientID <> '00000000-0000-0000-0000-000000000000')
BEGIN
IF(@FIRSTNAME <> '')
BEGIN
SET @FULLNAME = CAST((@FIRSTNAME) AS NVARCHAR(MAX))
IF(@MIDDLENAME <> '' AND @FULLNAME <> '')
BEGIN
SET @FULLNAME = CAST((@FULLNAME + ' ' + @MIDDLENAME) AS NVARCHAR(MAX))
END
IF(@LASTNAME <> '' AND @FULLNAME <> '')
BEGIN
SET @FULLNAME = CAST((@FULLNAME + ' ' + @LASTNAME) AS NVARCHAR(MAX))
END
END
ELSE
BEGIN
SET @FULLNAME = CAST((@FIRSTNAME)AS NVARCHAR(MAX))
END
END
ELSE
BEGIN
SET @FULLNAME = ''
END
RETURN @FULLNAME
END
GO
Then you could use it like this
SELECT
[dbo].[GetClientFullNameWithMiddleName](ClientID)
FROM Client
My request payload from the client javascript/angular
active: 1
appId: "asdf"
description: "asdf"
from: "06/16/2015"
name: "gdsfg"
to: "06/18/2015"
Node.js code is
var query = "SET @start = '" + request.body.from + "'; \
SET @end = '" + request.body.to + "'; \
SET @event_id = " + rows.insertId + "; \
CALL day(@start, @end, @event_id);";
Error return is
{ [Error: ER_TRUNCATED_WRONG_VALUE: Incorrect date value: '06/16/2015' for column 'start' at row 2]
code: 'ER_TRUNCATED_WRONG_VALUE',
errno: 1292,
sqlState: '22007',
index: 3 }
Stored procedure
(in essence, it takes the from and to dates and creates one row per day in that range):
CREATE DEFINER=`root`@`localhost` PROCEDURE `day`(start DATE, end DATE, event_id INT)
BEGIN
WHILE start <= end DO
INSERT INTO day(date, event_id) VALUES(start, event_id);
SET start = start + 1;
END WHILE;
END
Question:
I'm not sure what's causing the error. Can anyone please help?
Edit - query output
SET @start = '06/16/2015'; SET @end = '06/18/2015'; SET @event_id = 3; CALL day(@start, @end, @event_id);
I am guessing that date is stored as a date. But start is not, possibly it is just an integer that looks like a date.
If so, this will fix your problem:
WHILE start <= end DO
INSERT INTO day(date, event_id) VALUES(start, event_id);
SET start = date_add(start, interval 1 day);
END WHILE;
EDIT:
That is not the problem. The problem is in the calling code. So try:
var query = "SET @start = str_to_date('" + request.body.from + "', '%m/%d/%Y'); \
SET @end = str_to_date('" + request.body.to + "', '%m/%d/%Y'); \
SET @event_id = " + rows.insertId + "; \
CALL day(@start, @end, @event_id);";
You should use the 'yyyy-mm-dd' format instead of 'mm/dd/yyyy' in your query, as your procedure parameters are of DATE type.
Your query should look like this:
SET @start = '2015-06-16'; SET @end = '2015-06-18'; SET @event_id = 3; CALL day(@start, @end, @event_id);
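The conversion can also be done on the Node.js side before the query is built. The following is a minimal sketch, assuming the client always sends MM/DD/YYYY as in the payload above; toMysqlDate is a hypothetical helper name.

```javascript
// Converts 'MM/DD/YYYY' (as sent by the client) to 'YYYY-MM-DD' for MySQL.
function toMysqlDate(usDate) {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(usDate);
  if (!match) {
    // Rejecting anything else also keeps unexpected text out of the query.
    throw new Error('Expected MM/DD/YYYY, got: ' + usDate);
  }
  const month = match[1].padStart(2, '0');
  const day = match[2].padStart(2, '0');
  return match[3] + '-' + month + '-' + day;
}

console.log(toMysqlDate('06/16/2015'));
```

The sketch only checks the shape of the string, not whether the date exists (for example, 02/31/2015 would pass through).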
I am trying to add the group_concat function to hsqldb so that I can properly test a query as a unit/integration test. The query works fine in mysql, so I need it to work in hsqldb (hopefully).
// GROUP_CONCAT
jdbcTemplate.update("DROP FUNCTION GROUP_CONCAT IF EXISTS;");
jdbcTemplate.update(
"create aggregate function group_concat(in val varchar(100), in flag boolean, inout buffer varchar(1000), inout counter int) " +
" returns varchar(1000) " +
" contains sql " +
"begin atomic " +
" if flag then" +
" return buffer;" +
" else" +
" if val is null then return null; end if;" +
" if buffer is null then set buffer = ''; end if;" +
" if counter is null then set counter = 0; end if;" +
" if counter > 0 then set buffer = buffer || ','; end if;" +
" set buffer = buffer || val;" +
" set counter = counter + 1;" +
" return null;" +
" end if;" +
"end;"
);
Adding this aggregation function solves most of the problem. It will correctly behave like mysql's group_concat. However, what it won't do is let me use the distinct keyword like this:
group_concat(distinct column)
Is there any way to factor in the distinct keyword? Or do I rewrite the query to avoid the distinct keyword altogether?
HSQLDB has built-in GROUP_CONCAT and accepts DISTINCT.
http://hsqldb.org/doc/2.0/guide/dataaccess-chapt.html#dac_aggregate_funcs
At the moment you cannot add DISTINCT to a user-defined aggregate function, but this looks like an interesting feature to allow in the future.
When I attempt to cast my FLOATs to CHARs in this procedure, I get NULL values in the database. Location is a geospatial field. What am I doing wrong?
CREATE DEFINER=`me`@`%` PROCEDURE `UpdateLocationByObjectId`(IN objectId INT,
IN latitude FLOAT,
IN longitude FLOAT)
BEGIN
UPDATE Positions P
JOIN Objects O ON P.Id = O.PositionId
SET P.Location = GeomFromText('Point(' + CAST(latitude AS CHAR(10)) + ' ' + CAST(longitude AS CHAR(10)) +')')
WHERE O.ObjectId = objectId;
END
If I use this as a test, it works fine.
CREATE DEFINER=`me`@`%` PROCEDURE `UpdateLocationByObjectId`(IN objectId INT,
IN latitude FLOAT,
IN longitude FLOAT)
BEGIN
UPDATE Positions P
JOIN Objects O ON P.Id = O.PositionId
SET P.Location = GeomFromText('Point(10 10)')
WHERE O.ObjectId = objectId;
END
Change this line
SET P.Location = GeomFromText('Point(' + CAST(latitude AS CHAR(10)) + ' '
+ CAST(longitude AS CHAR(10)) +')')
To
SET P.Location = GeomFromText(concat('Point(' , CAST(latitude AS CHAR(10)) , ' '
, CAST(longitude AS CHAR(10)) ,')'))
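In MySQL, the + operator performs numeric addition, so your text values are added as numbers: ('10' + '10') = 20.
The fragments that cannot be read as numbers ('Point(', ' ' and ')') are treated as 0, so the whole expression evaluates to the number 20 rather than to the text 'Point(10 10)'. GeomFromText cannot parse that, which is why you end up with NULL.
To concatenate strings in MySQL, use the CONCAT function.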
In fact this code will work just as well:
SET P.Location = GeomFromText(concat('Point(', latitude, ' ', longitude,')'))
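For comparison, this is the string that CONCAT is expected to produce, built in JavaScript, where + does concatenate strings. The following is a minimal sketch, assuming the text might instead be assembled by a Node.js caller; pointWkt is a hypothetical name, and the coordinate order simply mirrors the procedure above.

```javascript
// Builds the WKT text for a point, in the same order the procedure uses.
function pointWkt(latitude, longitude) {
  // Reject NaN, Infinity, and non-numeric input before building the text.
  if (!Number.isFinite(latitude) || !Number.isFinite(longitude)) {
    throw new Error('Coordinates must be finite numbers');
  }
  return 'Point(' + latitude + ' ' + longitude + ')';
}

console.log(pointWkt(10, 10));
```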