sphinx search field weights - mysql

I am getting a syntax error from the query in my .conf file.
Everything worked until I added OPTION field_weights. What am I doing wrong in defining my field weights?
Here is the query for my Sphinx index:
source tx3nh_users : src {
sql_query_range = SELECT MIN(id), MAX(id) FROM tx3nh_users
sql_query = SELECT u.id, p.fullname, p.email, s.staff_title, s.bio FROM tx3nh_users AS u LEFT JOIN tx3nh_user_attributes AS p ON u.id=p.internalKey LEFT JOIN oxv5v_su_staff AS s ON u.id=s.user_id WHERE u.id>=$start AND u.id<=$end OPTION field_weights=(p.fullname=3, s.staff_title=2, s.bio=1)
}

sql_query is a SQL query that indexer runs against your actual database, so it needs to be a valid MySQL query. It is interpreted and executed by MySQL to return your actual data, which indexer then turns into a Sphinx index.
OPTION field_weights, on the other hand, is SphinxQL. You add it to the SphinxQL query when you make an actual query against the index. Fields are referenced by their plain names in the index, without the table aliases used in sql_query:
sphinxQL> SELECT id FROM tx3nh_users WHERE MATCH('keyword1')
OPTION field_weights=(fullname=3, staff_title=2, bio=1)
Because it's a query-time parameter, the weights aren't written to the index, so you can choose the weights on a per-query basis rather than using the same weights for all queries.
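Since the weights are chosen at query time, the application can assemble them per request. Here is a minimal Python sketch, assuming the full-text fields are named fullname, staff_title and bio (the column names returned by sql_query). build_weighted_query is a hypothetical helper, not part of Sphinx; it only builds the SphinxQL text and leaves the search term as a %s placeholder for a client library that does its own parameter binding.

```python
def build_weighted_query(index, weights):
    """Return SphinxQL text with a per-query OPTION field_weights clause.

    Hypothetical helper: `index` is assumed to be a trusted identifier.
    Field names are checked to be plain identifiers (no table aliases),
    and the search term is left as a %s placeholder.
    """
    for field, weight in weights.items():
        if not field.isidentifier() or not isinstance(weight, int):
            raise ValueError(f"bad field weight: {field!r}={weight!r}")
    clause = ", ".join(f"{field}={weight}" for field, weight in weights.items())
    return f"SELECT id FROM {index} WHERE MATCH(%s) OPTION field_weights=({clause})"


query = build_weighted_query(
    "tx3nh_users", {"fullname": 3, "staff_title": 2, "bio": 1}
)
print(query)
```

A different dict can be passed for each search, which is the practical benefit of keeping the weights out of the .conf file.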

Related

MariaDB performance issue with "Where IN" clause

I have an issue with my SQL code. We developed an application that runs on MySQL, and it runs fine there. I decided to give MariaDB a try and installed it on a dev machine. On one particular query I have a performance issue I do not understand. The query is the following:
SELECT SAMPLES.*, UNIX_TIMESTAMP(SAMPLES.SAMPLE_DATE) as TIMESTAMP,RAWS.VALUE, DATAKEYS.RAW_ID, DATAKEYS.DATA_KEY_VALUE, DATAKEYS.DATA_KEY_ID, KEYDEF.KEY_NAME, KEYDEF.LDD_ID
FROM
PDS.TABLE_SAMPLES SAMPLES
RIGHT OUTER JOIN PDS.TABLE_RAW_VALUES RAWS ON SAMPLES.SAMPLE_ID = RAWS.SAMPLE_ID
RIGHT OUTER JOIN PDS.TABLE_SAMPLE_DATA_KEYS DATAKEYS ON(DATAKEYS.RAW_ID = RAWS.RAW_ID AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID) OR
(DATAKEYS.RAW_ID = 0 AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID)
RIGHT OUTER JOIN PDS.TABLE_DATA_KEY_DEFINITION KEYDEF ON(DATAKEYS.DATA_KEY_ID = KEYDEF.DATA_KEY_ID)
WHERE
SAMPLES.SAMPLE_ID IN(1991331,1991637,1991941,2046105,2046411,2046717,2047023,2047635,2047941,2048247)
AND (SAMPLES.PARAMETER_ID = 9)
GROUP BY DATAKEYS.DATA_KEY_ID, RAWS.RAW_ID, DATAKEYS.DATA_KEY_ID
ORDER BY SAMPLES.SAMPLE_ID, DATAKEYS.RAW_ID;
As long as I have only ONE value in the "WHERE IN" condition, the query takes ~10ms to execute, about the same as on MySQL 5.6.
As soon as I add another value, the query time rises to several minutes. In MySQL it rises very slowly: the query shown above takes ~150ms on MySQL and about 140 seconds on the new MariaDB installation, using exactly the same datasets.
I'm no SQL expert, can you give me some clues how to optimize the query to run as expected?
The right outer joins are being converted to inner joins by the where clause. So, just use the proper join type (I'm not sure if this affects the optimization of the query, but it could):
SELECT SAMPLES.*, UNIX_TIMESTAMP(SAMPLES.SAMPLE_DATE) as TIMESTAMP,RAWS.VALUE, DATAKEYS.RAW_ID, DATAKEYS.DATA_KEY_VALUE, DATAKEYS.DATA_KEY_ID, KEYDEF.KEY_NAME, KEYDEF.LDD_ID
FROM PDS.TABLE_SAMPLES SAMPLES JOIN
PDS.TABLE_RAW_VALUES RAWS
ON SAMPLES.SAMPLE_ID = RAWS.SAMPLE_ID JOIN
PDS.TABLE_SAMPLE_DATA_KEYS DATAKEYS
ON (DATAKEYS.RAW_ID = RAWS.RAW_ID AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID) OR
(DATAKEYS.RAW_ID = 0 AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID) JOIN
PDS.TABLE_DATA_KEY_DEFINITION KEYDEF
ON DATAKEYS.DATA_KEY_ID = KEYDEF.DATA_KEY_ID
WHERE SAMPLES.SAMPLE_ID IN (1991331, 1991637, 1991941, 2046105, 2046411, 2046717, 2047023, 2047635, 2047941, 2048247) AND
(SAMPLES.PARAMETER_ID = 9)
GROUP BY DATAKEYS.DATA_KEY_ID, RAWS.RAW_ID, DATAKEYS.DATA_KEY_ID
ORDER BY SAMPLES.SAMPLE_ID, DATAKEYS.RAW_ID;
Next, the best index for this query, regardless of the number of values in the IN list, is the composite index PDS.TABLE_SAMPLES(PARAMETER_ID, SAMPLE_ID). This handles the WHERE clause.
Because your query runs quickly under some circumstances, I assume the other tables have the appropriate indexes for the joins.
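The effect of the composite index can be checked on a toy table. The sketch below uses Python's sqlite3 as a stand-in, so the planner is SQLite's, not MariaDB's; the index name idx_param_sample and the sample rows are made up for illustration. On MariaDB the equivalent check would be running EXPLAIN on the real query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for PDS.TABLE_SAMPLES with only the filtered columns.
conn.execute("CREATE TABLE TABLE_SAMPLES (SAMPLE_ID INTEGER, PARAMETER_ID INTEGER)")
conn.executemany(
    "INSERT INTO TABLE_SAMPLES VALUES (?, ?)",
    [(i, i % 10) for i in range(1000)],
)
# Equality column first, then the IN column.
conn.execute(
    "CREATE INDEX idx_param_sample ON TABLE_SAMPLES (PARAMETER_ID, SAMPLE_ID)"
)

sql = (
    "SELECT SAMPLE_ID FROM TABLE_SAMPLES "
    "WHERE SAMPLE_ID IN (9, 19, 29, 30) AND PARAMETER_ID = 9 "
    "ORDER BY SAMPLE_ID"
)
# The last column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))
rows = [r[0] for r in conn.execute(sql)]
print(plan)
print(rows)
```

The plan text should name idx_param_sample, showing that both the equality on PARAMETER_ID and the IN list on SAMPLE_ID are resolved through the one index.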
Instead of the IN operator, try using EXISTS with a subquery rather than listing the sample_ids.

Error Converting MySQL Query to SQL Server

I am trying to convert the query below to SQL Server; it works fine on MySQL. The problem seems to be the GROUP BY area. Even when I use just one GROUP BY field I get the same error. I am using the query in InformaticaCloud.
ERROR
"the FROM Config_21Cent WHERE resp_ind = 'Insurance' GROUP BY
resp_Ind;;] is empty in JDBC connection:
[jdbc:informatica:sqlserver://cbo-aps-inrpt03:1433;DatabaseName=SalesForce]."
SELECT sum(Cast(Resp_Ins_Open_dol AS decimal(10,2))) as baltotal,
carrier_code,
carrier_name,
carrier_grouping,
collector_name,
dataset_loaded,
docnum,
envoy_payer_id,
loc,
market,
master_payor_grouping,
plan_class,
plan_name,
resp_ins,
resp_ind,
resp_payor_grouping,
Resp_Plan_Type,
rspphone,
state
FROM Config_21Cent
WHERE resp_ind = 'Insurance'
GROUP BY
(resp_ins + resp_payor_grouping +
carrier_code + state + Collector_Name);
Your entire query isn't going to work. The group by statement contains a single expression, the summation of a bunch of fields. The select statement contains zillions of columns without aggregates. Perhaps you intend for something like this:
select resp_ins, resp_payor_grouping, carrier_code, state, Collector_Name,
sum(Cast(Resp_Ins_Open_dol AS decimal(10,2))) as baltotal
from Config_21Cent
WHERE resp_ind = 'Insurance'
GROUP BY resp_ins, resp_payor_grouping, carrier_code, state, Collector_Name;
This will work in both databases.
Every non-aggregated column in the SELECT list must also appear in the GROUP BY clause (the selected columns can be all of the grouped columns or only some of them). There is no such restriction on aggregates in the SELECT list: there can be any number of them, including aggregates on columns that are not in the GROUP BY clause.
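The rule above can be tried out on a few rows. This is a sketch using Python's sqlite3 as a stand-in for either server; the table is cut down to four columns, the rows are invented, and CAST(... AS REAL) replaces the original decimal(10,2) cast because SQLite has no fixed-point type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Config_21Cent "
    "(resp_ind TEXT, resp_ins TEXT, state TEXT, Resp_Ins_Open_dol TEXT)"
)
conn.executemany(
    "INSERT INTO Config_21Cent VALUES (?, ?, ?, ?)",
    [
        ("Insurance", "A", "TX", "10.50"),
        ("Insurance", "A", "TX", "4.50"),
        ("Insurance", "B", "FL", "7.25"),
        ("Self", "A", "TX", "99.00"),
    ],
)
# Every non-aggregated column in SELECT is listed in GROUP BY.
totals = conn.execute(
    """
    SELECT resp_ins, state, SUM(CAST(Resp_Ins_Open_dol AS REAL)) AS baltotal
    FROM Config_21Cent
    WHERE resp_ind = 'Insurance'
    GROUP BY resp_ins, state
    ORDER BY resp_ins, state
    """
).fetchall()
print(totals)
```

Note that SQLite, like MySQL in its permissive mode, would not reject a bare non-grouped column, so this only demonstrates the portable form, not the SQL Server error.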

Optimizing MySQL query with nested select statements?

I've got read-only access to a MySQL database, and I need to loop through the following query about 9000 times, each time with a different $content_path_id. I'm calling this from within a PERL script that's pulling the '$content_path_id's from a file.
SELECT an.uuid FROM alf_node an WHERE an.id IN
(SELECT anp.node_id FROM alf_node_properties anp WHERE anp.long_value IN
(SELECT acd.id FROM alf_content_data acd WHERE acd.content_url_id = $content_path_id));
Written this way, it's taking forever to do each query (approximately 1 minute each). I'd really rather not wait 9000+ minutes for this to complete if I don't have to. Is there some way to speed up this query? Maybe via a join? My current SQL skills are embarrassingly rusty...
This is an equivalent query using joins. It depends what indexes are defined on the tables how this will perform.
If your Perl interface has the notion of prepared statements, you may be able to save some time by preparing once and executing with 9000 different binds.
You could also possibly save time by building one query with a big acd.content_url_id In ($content_path_id1, $content_path_id2, ...) clause
Select
an.uuid
From
alf_node an
Inner Join
alf_node_properties anp
On an.id = anp.node_id
Inner Join
alf_content_data acd
On anp.long_value = acd.id
Where
acd.content_url_id = $content_path_id
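The single-query idea (one IN list with bound parameters instead of 9000 separate statements) might look like the following. The original script is Perl; this is a Python sketch with sqlite3 standing in for MySQL, and uuids_for_content_ids, chunk_size and the sample rows are all hypothetical.

```python
import sqlite3


def uuids_for_content_ids(conn, content_path_ids, chunk_size=500):
    """Return (content_url_id, uuid) pairs, querying the ids in chunks."""
    ids = list(content_path_ids)
    results = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # One bound placeholder per id keeps the values out of the SQL text.
        placeholders = ", ".join("?" for _ in chunk)
        sql = f"""
            SELECT acd.content_url_id, an.uuid
            FROM alf_node an
            JOIN alf_node_properties anp ON an.id = anp.node_id
            JOIN alf_content_data acd ON anp.long_value = acd.id
            WHERE acd.content_url_id IN ({placeholders})
            ORDER BY acd.content_url_id, an.uuid
        """
        results.extend(conn.execute(sql, chunk).fetchall())
    return results


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE alf_node (id INTEGER, uuid TEXT);
    CREATE TABLE alf_node_properties (node_id INTEGER, long_value INTEGER);
    CREATE TABLE alf_content_data (id INTEGER, content_url_id INTEGER);
    INSERT INTO alf_node VALUES (1, 'u-1'), (2, 'u-2'), (3, 'u-3');
    INSERT INTO alf_node_properties VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO alf_content_data VALUES (10, 100), (20, 200), (30, 300);
""")
found = uuids_for_content_ids(conn, [100, 300])
print(found)
```

Selecting content_url_id alongside uuid lets the caller map each result back to the id it came from. Chunking keeps each statement under whatever parameter limit the driver imposes.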
Try this extension to Laurence's solution which replaces the long list of OR's with an additional JOIN:
Select
an.uuid
From alf_node an
Join alf_node_properties anp
On an.id = anp.node_id
Join alf_content_data acd
On anp.long_value = acd.id
Join (
select 'id1' as content_path_id union all
select 'id2' as content_path_id union all
/* you get the idea */
select 'idN' as content_path_id
) criteria
On acd.content_url_id = criteria.content_path_id
I have used SQL Server syntax above but you should be able to translate it readily.

mysql view super slow

This is a query against the Unified Medical Language System (UMLS) to find words related to a normalized word. The query itself runs in 165 ms, but if I select from a VIEW built on the same query it takes 70 seconds. I'm new to MySQL. Please help me.
Query:
SELECT a.nwd as Normalized_Word,
b.str as String,
c.def as Defination,
d.sty as Semantic_type
FROM mrxnw_eng a, mrconso b, mrdef c, mrsty d
WHERE a.nwd = 'cold'
AND b.sab = 'Msh'
AND a.cui = b.cui
AND a.cui = c.cui
AND a.cui = d.cui
AND a.lui = b.lui
AND b.sui = a.sui
group by a.cui
View definition:
create view nString_Sementic as
SELECT a.nwd as Normalized_Word,
b.str as String,
c.def as Defination,
d.sty as Semantic_type
FROM mrxnw_eng a, mrconso b, mrdef c, mrsty d
WHERE b.sab = 'Msh'
AND a.cui = b.cui
AND a.cui = c.cui
AND a.cui = d.cui
AND a.lui = b.lui
AND b.sui = a.sui
group by a.cui
Selection from view:
select * from nString_Sementic
where Normalized_Word = 'phobia'
You may be able to get better performance by specifying the VIEW ALGORITHM as MERGE. With MERGE MySQL will combine the view with your outside SELECT's WHERE statement, and then come up with an optimized execution plan.
To do this however you would have to remove the GROUP BY statement from your VIEW. As it is, a temporary table is being created of the entire view first, before being filtered by your WHERE statement.
If the MERGE algorithm cannot be used, a temporary table must be used
instead. MERGE cannot be used if the view contains any of the
following constructs:
Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
DISTINCT
GROUP BY
HAVING
LIMIT
UNION or UNION ALL
Subquery in the select list
Refers only to literal values (in this case, there is no underlying
table)
Here is the link with more info. http://dev.mysql.com/doc/refman/8.0/en/view-algorithms.html
If you can change your view to not include the GROUP BY statement, to specify the view's algorithm the syntax is:
CREATE ALGORITHM = MERGE VIEW...
Edit: This answer was originally based on MySQL 5.0. I've updated the links to point to the current documentation, but I have not otherwise confirmed whether the answer is correct for versions >5.0.
Assuming that mrxnw_eng.nwd is functionally dependent on mrxnw_eng.cui, try changing the group by clause of the view to include a.nwd - like so:
group by a.cui, a.nwd

MySQL COUNT() causing empty array() return

MySQL Server Version: Server version: 4.1.14
MySQL client version: 3.23.49
Tables under discussion: ads_list and ads_cate.
Table Relationship: ads_cate has many ads_list.
Keyed by: ads_cate.id = ads_list.Category.
I am not sure what is going on here, but I am trying to use COUNT() in a simple aggregate query, and I get blank output.
Here is a simple example, this returns expected results:
$queryCats = "SELECT id, cateName FROM ads_cate ORDER BY cateName";
But if I modify it to add the COUNT() and the other query data I get no array return w/ print_r() (no results)?
$queryCats = "SELECT ads_cate.cateName, ads_list.COUNT(ads_cate.id),
FROM ads_cate INNER JOIN ads_list
ON ads_cate.id = ads_list.category
GROUP BY cateName ORDER BY cateName";
Ultimately, I am trying to get a count of ad_list items in each category.
Is there a MySQL version conflict on what I am trying to do here?
NOTE: I spent some time breaking this down item by item, and the COUNT() seems to cause the array() to disappear. The JOIN seemed to do the same thing... It does not help that I am developing this on a Yahoo server with no access to the PHP or MySQL error settings.
I think your COUNT syntax is wrong. It should be:
COUNT(ads_cate.id)
or
COUNT(ads_list.id)
depending on what you are counting.
COUNT is an aggregate, so it always returns a value for each group.
Here you are trying to count the non-null ads_list.id values, but the syntax is wrong. As Myke says, COUNT(ads_cate.id) or COUNT(ads_list.id) is the better approach.
Because you have an inner join on ads_cate.id = ads_list.category, COUNT(ads_cate.id) or COUNT(ads_list.id) is not necessary; just use COUNT(*).
If you don't want nulls, add a HAVING clause.
Only matching categories:
SELECT ads_cate.cateName, COUNT(*)
FROM ads_cate INNER JOIN ads_list
ON ads_cate.id = ads_list.category
GROUP BY cateName
having not count(*) is null
ORDER BY cateName
All categories (count a column from ads_list so that categories with no ads give 0):
SELECT ads_cate.cateName, COUNT(ads_list.category)
FROM ads_cate LEFT JOIN ads_list
ON ads_cate.id = ads_list.category
GROUP BY cateName
ORDER BY cateName
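The difference between the "only match" and "all" variants, and between COUNT(*) and COUNT(column) under a LEFT JOIN, can be seen on three categories. This is a sketch using Python's sqlite3 as a stand-in for MySQL; the category names and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads_cate (id INTEGER PRIMARY KEY, cateName TEXT);
    CREATE TABLE ads_list (id INTEGER PRIMARY KEY, category INTEGER);
    INSERT INTO ads_cate VALUES (1, 'Autos'), (2, 'Books'), (3, 'Pets');
    INSERT INTO ads_list VALUES (1, 1), (2, 1), (3, 2);
""")


def counts(join_type, count_expr):
    """Run the per-category count with the given join and COUNT expression."""
    sql = f"""
        SELECT ads_cate.cateName, {count_expr}
        FROM ads_cate {join_type} JOIN ads_list
          ON ads_cate.id = ads_list.category
        GROUP BY ads_cate.cateName
        ORDER BY ads_cate.cateName
    """
    return conn.execute(sql).fetchall()


inner_star = counts("INNER", "COUNT(*)")                 # categories with ads only
left_star = counts("LEFT", "COUNT(*)")                   # counts the NULL-padded row too
left_col = counts("LEFT", "COUNT(ads_list.category)")    # ignores NULLs
print(inner_star, left_star, left_col)
```

With a LEFT JOIN, COUNT(*) counts the single NULL-padded row for an empty category, while COUNT(ads_list.category) skips NULLs and so reports 0 for it.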
Did you try:
$queryCats = "SELECT ads_cate.cateName, COUNT(ads_cate.id)
FROM ads_cate
JOIN ads_list ON ads_cate.id = ads_list.category
GROUP BY ads_cate.cateName";
I am guessing that you need the category to be in the list, in that case the query here should work. Try it without the ORDER BY first.
You were probably getting errors. Check your server logs.
Also, see what happens when you try this:
SELECT COUNT(*), category
FROM ads_list
GROUP BY category
Your array is empty or disappears because your query has errors:
there should be no comma before the FROM
the "ads_list." prefix before COUNT is incorrect
Please try running that query directly in MySQL and you'll see the errors. Or try echoing the output using mysql_error().
Now, some other points related to your query:
there is no need to do ORDER BY because GROUP BY by default sorts on the grouped column (this holds for MySQL versions before 8.0, which removed the implicit sort)
you are doing a count on the wrong column that will always give you 1
Perhaps you are trying to retrieve the count of ads_list per ads_cate? This might be your query then:
SELECT `ads_cate`.`cateName`, COUNT(`ads_list`.`category`) `cnt_ads_list`
FROM `ads_cate`
INNER JOIN `ads_list` ON `ads_cate`.`id` = `ads_list`.`category`
GROUP BY `cateName`;
Hope it helps?