Can anyone tell me how we can achieve the functionality of a SQL join in Sphinx search?
I want to index a few columns from table1 and a few from table2.
Tables are in MySQL.
1° As in Barryhunter's answer:
sql_query = SELECT t1.id,t1...., t2.... FROM table1 AS t1 INNER JOIN table2 AS t2 ON ....
2° If the relation is one-to-many:
sql_query = SELECT t1.id,t1...., group_concat(t2.foo) AS t2_foo FROM table1 AS t1 INNER JOIN table2 AS t2 ON .... GROUP BY t1.id
group_concat has a length limit (group_concat_max_len in MySQL), but Sphinx offers a way around it with sql_joined_field:
sql_query = SELECT t1.id,t1.... FROM table1 AS t1;
sql_joined_field = t2_foo from query;\
SELECT t2.rel_t1_id , t2.foo\
FROM table2 AS t2\
ORDER by t2.rel_t1_id ASC
My English is poor, so the documentation is probably clearer:
http://sphinxsearch.com/docs/current/conf-sql-joined-field.html
sql_joined_field = t2_foo adds one more "searchable" field called t2_foo. This field receives the t2.foo content (like the group_concat, but separated by spaces).
The first column must be the id that matches t1.id in your sql_query.
ORDER by theID ASC is mandatory.
In the same way, you can use an MVA (multi-valued attribute) to store multiple values in an attribute:
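To picture what sql_joined_field does with the rows it gets back, here is a minimal Python sketch. It is not Sphinx's actual implementation; the function name and the sample rows are made up for illustration. It folds (id, text) rows that arrive sorted by id into one space-separated field per document, which also suggests why the ORDER BY matters: a single pass like this only works when rows for the same id are adjacent.

```python
from itertools import groupby
from operator import itemgetter


def fold_joined_field(rows):
    """Fold (doc_id, text) rows, assumed sorted by doc_id, into {doc_id: "a b c"}."""
    folded = {}
    for doc_id, group in groupby(rows, key=itemgetter(0)):
        # groupby only merges *adjacent* rows with the same doc_id
        folded[doc_id] = " ".join(text for _, text in group)
    return folded


# Hypothetical result of:
#   SELECT t2.rel_t1_id, t2.foo FROM table2 AS t2 ORDER BY t2.rel_t1_id ASC
rows = [(1, "red"), (1, "green"), (2, "blue")]
t2_foo = fold_joined_field(rows)

# The same rows without the ORDER BY: the earlier text for id 1 is overwritten.
unsorted_result = fold_joined_field([(1, "red"), (2, "blue"), (1, "green")])
```

With sorted input, document 1 gets "red green"; with the unsorted input this sketch silently keeps only "green" for document 1.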
http://sphinxsearch.com/docs/current/conf-sql-attr-multi.html
sql_attr_multi = uint tag from query; \
SELECT id, tag FROM tags
You can just use a join in the sql_query. It's just a standard MySQL query that indexer runs, indexing the output. The MySQL server just needs to be able to run it.
sql_query = SELECT id,t.name,o.test FROM table1 t INNER JOIN other o USING (id)
I have two tables.
table1 has a row id='12345'
table2 has a row where the id = that of table1 with added chars, e.g.,
table2.id = '12345-678qt'
table2 may have more than one id starting with '12345-' with different ending chars.
Yes, there is always a dash after table1's id that could be used in the query.
I need to get some data from both tables, say
SELECT table1.id, table1.field9,
table2.id, table2.fieldZ
FROM table1 and table2
WHERE (
table1.id=table2.id's characters before the dash
OR
table1.field1 = 'abcde'
)
AND table2.dataB='something'
ORDER BY table1.datefield DESC LIMIT 3;
Thank you.
The MySQL translation of:
"table1 and table2 WHERE" is table1 INNER JOIN table2 ON
"table2.id's characters before the dash" is SUBSTRING_INDEX(table2.id, '-', 1)
The query should look like this:
SELECT table1.id, table1.field9,
table2.id, table2.fieldZ
FROM table1
INNER JOIN table2
ON (table1.id = SUBSTRING_INDEX(table2.id, '-', 1) OR table1.field1 = 'abcde')
AND table2.dataB = 'something'
ORDER BY table1.datefield DESC
LIMIT 3;
This doesn't guarantee that your query will work as written; my fixes only remove the current errors inside your query. For more troubleshooting with a SQL Fiddle link, please provide the tables and I'll be happy to help.
Check the official documentation about the JOIN operation and the SUBSTRING_INDEX function at the corresponding links.
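For readers without a MySQL server at hand, here is a rough Python sketch that runs the corrected query against an in-memory SQLite database. SQLite has no SUBSTRING_INDEX, so the sketch registers a small emulation that handles positive counts only; the column types and the sample rows are assumptions made up for the demonstration.

```python
import sqlite3


def substring_index(s, delim, count):
    """Emulate MySQL SUBSTRING_INDEX for positive counts only (sketch assumption)."""
    if s is None:
        return None
    return delim.join(s.split(delim)[:count])


conn = sqlite3.connect(":memory:")
conn.create_function("SUBSTRING_INDEX", 3, substring_index)
conn.executescript("""
CREATE TABLE table1 (id TEXT, field1 TEXT, field9 TEXT, datefield TEXT);
CREATE TABLE table2 (id TEXT, fieldZ TEXT, dataB TEXT);
INSERT INTO table1 VALUES ('12345', 'x',     'f9a', '2020-01-02'),
                          ('99999', 'abcde', 'f9b', '2020-01-01');
INSERT INTO table2 VALUES ('12345-678qt', 'z1', 'something'),
                          ('12345-abc',   'z2', 'other'),
                          ('77777-1',     'z3', 'something');
""")

rows = conn.execute("""
SELECT table1.id, table1.field9, table2.id, table2.fieldZ
FROM table1
INNER JOIN table2
   ON (table1.id = SUBSTRING_INDEX(table2.id, '-', 1) OR table1.field1 = 'abcde')
  AND table2.dataB = 'something'
ORDER BY table1.datefield DESC
LIMIT 3
""").fetchall()
```

Note what the OR does with this sample data: the table1 row with field1 = 'abcde' is paired with every table2 row whose dataB is 'something', whether or not the ids match. That may or may not be what the question intends.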
I have two tables:
mytable1
UserId (int) (primary_key)
Save (blob)
mytable2
UserId (int) (primary_key)
Save (blob)
I make the following mysql command:
UPDATE mytable1 tb1, mytable2 tb2 SET tb1.Save='', tb2 .Save='' WHERE tb1.UserId=25 AND dbSv1.UserId=25
When both tables have a user with UserId = 25, then this works and Save is set to ''. However, if one table does not have a user with UserId = 25, but the other one does, then Save is not set to '' in the one that does. This is not the behaviour I want.
OR is not the thing to use, as other Saves will be set to '' which do not have an UserId of 25. So what do I need?
Your query is using the old-school comma syntax for a join operation. (There are some problems in the SQL: dbSv1 is used as a qualifier, but it doesn't appear as a table name or table alias. We're going to assume that was supposed to be tb2.)
Your query is equivalent to:
UPDATE mytable1 tb1
JOIN mytable2 tb2
SET tb1.save=''
, tb2.save=''
WHERE tb1.userid=25
AND tb2.userid=25
If a matching row is not found in either tb1 or tb2, then the JOIN operation will produce an empty set. This is expected behavior.
Consider the result set returned from this query:
SELECT tb1.userid
, tb2.userid
FROM mytable1 tb1
JOIN mytable2 tb2
WHERE tb1.userid=25
AND tb2.userid=25
When there are no rows in tb2 that satisfy the predicates, the query won't return any rows.
You could use an "outer" join to make returning rows from one of the tables optional. For example, to update mytable1 even when no matching rows exist in mytable2...
UPDATE mytable1 tb1
LEFT
JOIN mytable2 tb2
ON tb2.userid=25
SET tb1.save=''
, tb2.save=''
WHERE tb1.userid=25
If there are no rows in mytable1 that have userid=25, then this won't update any rows.
MySQL doesn't support FULL OUTER JOIN. But you can try something like this, using an inline view to return a row, and then performing outer joins to both mytable1 and mytable2...
UPDATE ( SELECT 25 + 0 AS userid ) i
LEFT
JOIN mytable1 tb1
ON tb1.userid = i.userid
LEFT
JOIN mytable2 tb2
ON tb2.userid = i.userid
SET tb1.save = ''
, tb2.save = ''
SQLFiddle demonstration: http://sqlfiddle.com/#!9/6f8598/1
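A different approach, not the one shown above, is to skip the join entirely and issue two independent single-table UPDATE statements inside one transaction. The sketch below assumes a Python client and uses the standard-library sqlite3 module as a stand-in for a MySQL driver; the helper name clear_saves and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable1 (UserId INTEGER PRIMARY KEY, Save BLOB);
CREATE TABLE mytable2 (UserId INTEGER PRIMARY KEY, Save BLOB);
INSERT INTO mytable1 VALUES (25, 'data-a'), (26, 'data-b');
INSERT INTO mytable2 VALUES (26, 'data-c');   -- no row for user 25 here
""")


def clear_saves(conn, user_id):
    """Blank Save for user_id in each table that has such a row."""
    with conn:  # one transaction: both updates commit, or both roll back
        # Table names are a fixed tuple in the code, never user input.
        for table in ("mytable1", "mytable2"):
            conn.execute(f"UPDATE {table} SET Save = '' WHERE UserId = ?", (user_id,))


clear_saves(conn, 25)
state1 = conn.execute("SELECT UserId, Save FROM mytable1 ORDER BY UserId").fetchall()
state2 = conn.execute("SELECT UserId, Save FROM mytable2 ORDER BY UserId").fetchall()
```

Each UPDATE affects only the rows that exist in its own table, so a missing row in one table does not prevent the update in the other, and rows for other users are left alone.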
FOLLOWUP
A "join" is a common SQL operation. You shouldn't have any trouble finding information about what it is and what it does.
The "+ 0" isn't strictly necessary. It's a convenient shorthand in MySQL to CAST to numeric. As a test, see what MySQL returns for this:
SELECT '25' + 0
, '25xyz' + 0
, 'abc' + 0
The purpose of the inline view was to return a single row. We could have written the query to hardcode the user_id two times, and ignore what's returned from the inline view ....
SELECT t1.user_id AS t1_user_id
, t2.user_id AS t2_user_id
FROM ( SELECT 'foo' AS dontcare ) i
LEFT
JOIN mytable1 t1
ON t1.user_id = 25
LEFT
JOIN mytable2 t2
ON t2.user_id = 25
My preference is to make it more clear that our intent is for both of the values to be the same. We could write the query so that one of them is 23 and the other is 27; that's syntactically valid. When we convert this to a prepared statement with bind placeholders...
SELECT t1.user_id AS t1_user_id
, t2.user_id AS t2_user_id
FROM ( SELECT 'foo' AS dontcare ) i
LEFT
JOIN mytable1 t1
ON t1.user_id = ?
LEFT
JOIN mytable2 t2
ON t2.user_id = ?
We kind of "lose" the idea that those two values are the same. To get that hardcoded value specified only one time, I have the inline view return the value we want to "match" in the ON clause of the outer joins.
SELECT t1.user_id AS t1_user_id
, t2.user_id AS t2_user_id
FROM ( SELECT ? AS user_id ) i
LEFT
JOIN mytable1 t1
ON t1.user_id = i.user_id
LEFT
JOIN mytable2 t2
ON t2.user_id = i.user_id
Now my intent is more clear... I'm looking for "one" user_id value. Adding the "+ 0" indicates that whatever value gets passed in (e.g. '25', 'foo', or whatever), my statement is going to interpret that as a numeric value.
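As a rough illustration of the single-placeholder pattern, here is a Python sketch that binds the value once and lets the inline view feed both ON clauses. It runs against SQLite through the standard-library sqlite3 module rather than MySQL (SQLite also coerces '25' + 0 to the number 25); the table contents and the lookup helper are assumptions made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable1 (user_id INTEGER PRIMARY KEY);
CREATE TABLE mytable2 (user_id INTEGER PRIMARY KEY);
INSERT INTO mytable1 VALUES (25);
INSERT INTO mytable2 VALUES (26);
""")

# One bind placeholder; the inline view "i" hands the value to both joins.
SQL = """
SELECT t1.user_id AS t1_user_id
     , t2.user_id AS t2_user_id
  FROM ( SELECT ? + 0 AS user_id ) i
  LEFT JOIN mytable1 t1 ON t1.user_id = i.user_id
  LEFT JOIN mytable2 t2 ON t2.user_id = i.user_id
"""


def lookup(conn, user_id):
    """Return one row: the matching user_id from each table, or None."""
    return conn.execute(SQL, (user_id,)).fetchone()


found = lookup(conn, "25")   # string input, made numeric by "+ 0"
missing = lookup(conn, 99)   # no match in either table
```

The inline view always yields exactly one row, so the statement returns a row even when neither table has a match; the unmatched side simply comes back as NULL (None in Python).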
inline view
I used the term "inline view". That's just a SELECT query used in a context where we usually have a table.
e.g. if I have a table named mine, I can write a query...
SELECT m.id, m.name FROM mine m
test it and see that it returns rows, yada, yada.
I can also do this: wrap that query in parens and reference it in place of a table in another statement, like this...
SELECT t.*
FROM ( SELECT m.id, m.name FROM mine m ) t
MySQL requires that we assign an alias to that, like we can do if it were a table. We call that an inline view because it's similar to the pattern we use for a stored view. Let's look at a demonstration of doing that.
(This is just a demonstration of the pattern; there's some reasons we wouldn't want to do this.)
CREATE VIEW myview
AS
SELECT m.id, m.name FROM mine m
;
Then we can do this:
SELECT t.* FROM myview t
With the inline view we're following the same pattern, but we're bypassing a separate create view statement. (That's a DDL statement that causes an implicit commit, and creating a database object.) Bypassing that, we're effectively creating a view that exists only in the context of the statement, and doing that "inline", within the statement.
SELECT t.* FROM ( SELECT m.id, m.name FROM mine m ) t
The MySQL documentation refers to the inline view as a "derived table". If we (accidentally) forget the alias, the error we get back says something like "Every derived table must have its own alias". The more general term, used for databases other than MySQL, is "inline view".
Due to its geographic capabilities, I'm migrating my database from MySQL to PostgreSQL/PostGIS, and SQL that used to be trivial is now becoming painfully slow to get working.
In this case I use a nested query to obtain the results in two columns, an ID in the first and a count in the second, and insert those results into table1.
EDIT: This is the original MySQL working code that I need to be working in PostgreSQL:
UPDATE table1 INNER JOIN (
SELECT id, COUNT(*) AS cnt
FROM table2
GROUP BY id
) AS c ON c.id = table1.id
SET table1.cnt = c.cnt
The result is having all rows with the same counting result, that being the 1st counting result of the nested select.
In MySQL this would be solved easily.
How would this work in PostgreSQL?
Thank you!
UPDATE table1 dst
SET cnt = src.cnt
FROM (SELECT id, COUNT (*) AS cnt
FROM table2
GROUP BY id) as src
WHERE src.id = dst.id
;
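The UPDATE ... FROM form above is PostgreSQL syntax. For comparison, a correlated subquery is a different, more portable way to write the same per-row count. The sketch below is only an illustration of that alternative, run against an in-memory SQLite database through Python's sqlite3 module; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, cnt INTEGER);
CREATE TABLE table2 (id INTEGER);
INSERT INTO table1 (id) VALUES (1), (2), (3);
INSERT INTO table2 VALUES (1), (1), (2);
""")

with conn:
    # The subquery is evaluated once per table1 row, so each row gets its own
    # count. The EXISTS filter leaves rows without a match untouched, which
    # mirrors what the join-based update does.
    conn.execute("""
        UPDATE table1
           SET cnt = (SELECT COUNT(*) FROM table2 WHERE table2.id = table1.id)
         WHERE EXISTS (SELECT 1 FROM table2 WHERE table2.id = table1.id)
    """)

counts = conn.execute("SELECT id, cnt FROM table1 ORDER BY id").fetchall()
```

Here ids 1 and 2 receive their own counts, and id 3, which has no rows in table2, keeps its NULL.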
How can we reference the column value from one table as the reference for a join (see example below)?
SELECT t1.*, t2.*, t3.* FROM term_relationships as t1
INNER JOIN modules as t2 ON t2.module_id = t1.object_id
INNER JOIN t2.nextOfKinTable as t3 ON t3.module_id = t2.module_id;
I thought about using the information_schema but it is too much writing to accomplish something that might be easier by just maintaining a reference to the table you want joined to the current join result, only that I don't know how to make them join this way. Please help :(
Edit:
Essentially what we want is to join table1, table2, and table3 only that the name for table3 is a value stored in table2.
The common column in this case is module_id (object_id)
And the unknown table is t2.nextOfKinTable
Try this out
declare @tabName varchar(1000)
set @tabName = (select top 1 ProductName as tabName from products)
declare @query varchar(8000)
set @query = ' select
p.ProductName as '''+@tabName+'''
from Products p'
--print(@query)
exec (@query)
There's no standard way to do this in SQL. Consider if your t2 table contained 1000 rows, and each row had a distinct nextOfKinTable value. That would explode the query into a 1002-table join. Not pretty. I'm not even aware of any proprietary syntax that would support it in any products I know of.
If the number of distinct column values is small, you can use LEFT JOINs, but each joined table will receive a different alias (example using 3 tables):
SELECT
t1.*, --TODO: List columns
t2.*, --TODO: List columns
COALESCE(t3.ColumnA,t4.ColumnA,t5.ColumnA) as ColumnA,
COALESCE(t3.ColumnB,t4.ColumnB,t5.ColumnB) as ColumnB
FROM
term_relationships as t1
INNER JOIN
modules as t2
ON
t2.module_id = t1.object_id
LEFT JOIN
Table3 as t3
ON
t3.module_id = t2.module_id AND
t2.nextOfKinTable = 'Table3'
LEFT JOIN
Table4 as t4
ON
t4.module_id = t2.module_id AND
t2.nextOfKinTable = 'Table4'
LEFT JOIN
Table5 as t5
ON
t5.module_id = t2.module_id AND
t2.nextOfKinTable = 'Table5'
You might also want to consider whether these separate tables ought actually to be a single table, with additional column(s) to distinguish the rows. This problem is sometimes referred to as attribute splitting. (See Joe Celko's example of Table splitting in an article from 2009)
Imagine I have table1 which has a column named 'table_name'. I use table1.table_name to store the name of another table in the database. The referenceable tables would all have a field 'target_id'.
Is it possible to use table_name in a JOIN statement?
For example:
SELECT t1.*, t2.* FROM table1 AS t1
JOIN table1.table_name AS t2 ON t1.table1_id = t2.target_id
The obvious solution is to use the script (C++ in my case) to get the table name first, and construct a SQL query from it. The question is: can we bypass the script and do this directly in SQL (MySQL)?
Edit: What is dynamic SQL?
The only option you have is to use two SQL statements:
1. Select the table name you need.
2. Use this table name to dynamically build the second query to get the data you need.
What you want isn't possible to do with SQL directly (and it sounds like you've designed your database wrong in some way, but that's hard to say without knowing what the goal is).
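The two-step approach might look roughly like this. It is a sketch in Python with the standard-library sqlite3 module standing in for a MySQL client (the question uses C++); the helper fetch_joined, the ALLOWED_TABLES allow-list, the payload column, and the sample rows are all hypothetical. Because a table name cannot be passed as a bind parameter, the sketch checks it against a fixed allow-list before putting it into the query text.

```python
import sqlite3

ALLOWED_TABLES = {"table2", "table3"}  # hypothetical allow-list of joinable tables

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (table1_id INTEGER PRIMARY KEY, table_name TEXT);
CREATE TABLE table2 (target_id INTEGER, payload TEXT);
CREATE TABLE table3 (target_id INTEGER, payload TEXT);
INSERT INTO table1 VALUES (1, 'table2'), (2, 'table3'), (3, 'sqlite_master');
INSERT INTO table2 VALUES (1, 'from table2');
INSERT INTO table3 VALUES (2, 'from table3');
""")


def fetch_joined(conn, table1_id):
    # Step 1: look up the name of the table to join.
    row = conn.execute(
        "SELECT table_name FROM table1 WHERE table1_id = ?", (table1_id,)
    ).fetchone()
    if row is None:
        return []
    table_name = row[0]
    # Identifiers cannot be bound as parameters, so validate before interpolating.
    if table_name not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table_name!r}")
    # Step 2: build and run the second query.
    sql = (
        "SELECT t1.table1_id, t2.* FROM table1 AS t1 "
        f"JOIN {table_name} AS t2 ON t1.table1_id = t2.target_id "
        "WHERE t1.table1_id = ?"
    )
    return conn.execute(sql, (table1_id,)).fetchall()


from_table2 = fetch_joined(conn, 1)
from_table3 = fetch_joined(conn, 2)
```

The allow-list is the important design choice here: a stored table name is data, and interpolating it unchecked would let whoever controls that column decide which table gets queried.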
I know I'm late to the party, but I wanted to offer a different solution. I see this sort of thing a lot in audit tables. The column table_name would refer to "what table was changed" and table1_id would refer to the ID of the row that changed in that table. In this case, the audit table is pointing back to many different tables that don't normally get joined.
Here goes:
SELECT t1.*, t2.*, t3.*, t4.*, t5.*
FROM table1 AS t1
left JOIN table2 AS t2
ON t1.table1_id = t2.target_id
and t1.table_name = 'table2'
left JOIN table3 AS t3
ON t1.table1_id = t3.target_id
and t1.table_name = 'table3'
left JOIN table4 AS t4
ON t1.table1_id = t4.target_id
and t1.table_name = 'table4'
left JOIN table5 AS t5
ON t1.table1_id = t5.target_id
and t1.table_name = 'table5'
Of course, the main drawback is that each table that can be possibly referenced needs to be explicitly included in the SQL command.
You can get more elegant output using this as your select list:
SELECT
t1.*,
coalesce(t2.fieldA, t3.fieldA, t4.fieldA, t5.fieldA) as fieldA,
coalesce(t2.fieldB, t3.fieldB, t4.fieldB, t5.fieldB) as fieldB
etc
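Since every referenced table has to be spelled out, one option is to generate the statement from a list of table names. The sketch below is only an illustration in Python, run against SQLite via the sqlite3 module; build_query, the sample tables, and their rows are made up, and the table list is assumed to be a fixed list in the code, not user input.

```python
import sqlite3


def build_query(tables, fields):
    """Build the LEFT JOIN / coalesce query for a fixed list of table names.

    Assumes at least two tables, since coalesce needs two or more arguments.
    """
    aliases = [f"t{i}" for i in range(2, 2 + len(tables))]
    select = ["t1.*"] + [
        "coalesce({}) AS {}".format(", ".join(f"{a}.{field}" for a in aliases), field)
        for field in fields
    ]
    joins = [
        f"LEFT JOIN {t} AS {a} ON t1.table1_id = {a}.target_id "
        f"AND t1.table_name = '{t}'"
        for t, a in zip(tables, aliases)
    ]
    return "SELECT {} FROM table1 AS t1 {} ORDER BY t1.table1_id".format(
        ", ".join(select), " ".join(joins)
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (table1_id INTEGER PRIMARY KEY, table_name TEXT);
CREATE TABLE table2 (target_id INTEGER, fieldA TEXT, fieldB TEXT);
CREATE TABLE table3 (target_id INTEGER, fieldA TEXT, fieldB TEXT);
INSERT INTO table1 VALUES (1, 'table2'), (2, 'table3');
INSERT INTO table2 VALUES (1, 'a2', 'b2');
-- table3 also has target_id 1; the table_name condition keeps it out of row 1
INSERT INTO table3 VALUES (2, 'a3', 'b3'), (1, 'wrong', 'wrong');
""")

sql = build_query(["table2", "table3"], ["fieldA", "fieldB"])
rows = conn.execute(sql).fetchall()
```

Each table1 row picks up fieldA and fieldB from the one table named in its table_name column; the extra table_name condition in each ON clause is what stops a coincidental id match in another table from leaking in.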