Using Sphinx for the first time - configuring the sql_query key - mysql

I'm practicing with Sphinx and haven't done much yet beyond the configuration. The sql_query key confuses me: I read the Sphinx documentation for sql_query, but it still isn't clear what to put there. My web application has many SELECTs, I want to use Sphinx for search, and the search SQL changes often (depending on the user's search filters).
My search currently uses MySQL, and I want to integrate Sphinx into my web application. If sql_query is not optional, do I have to put the whole search SQL query into that field, or do I just pick the necessary fields from the tables so that indexing can start?
Can someone point me in the right direction so I can get Sphinx working well with my web application?

sql_query is mandatory. Sphinx runs it to fetch the data you want indexed from MySQL. It can contain joins, conditions, etc., but it must be a valid SQL query. You should have something like "SELECT id, field1, field2, fieldx FROM table". id must be a unique integer ID (typically the primary key). Each row returned by this query is treated as a document (which is what Sphinx returns when you search).
If you have multiple tables that are very different in meaning (users, articles, etc.), you need to create an index for each.
Read the tutorials at http://sphinxsearch.com/info/articles/ to understand how Sphinx works.

You can write one SQL query that returns the combined (union) set of records you want from the database. If you currently join several tables to select the best result set, you can do the same with Sphinx.
You may run into a few problems with your existing table structure in the database, such as:
Base table does not have an integer primary key field
Create a new table with two fields: one for the integer id and the other to hold the primary key of the base table. Do an inner join with that table and select the id field from it.
E.g. SELECT t1.id, t2.name, t2.description, t2.content FROM table_new t1 INNER JOIN table_2 t2 ON t1.document_id = t2.thread_id INNER JOIN REST_OF_YOUR_SELECT_QUERY
The t1.id is for the Sphinx search engine to do its internal indexing.
You filter data with a WHERE clause
You can do that in Sphinx by setting filters dynamically based on the conditions.
You select and join different tables to get results
This can also be done by setting up different sources and indexes based on your requirements.
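For the first case above (a base table without an integer primary key), the mapping-table idea could look roughly like the sketch below. It uses Python's built-in sqlite3 module instead of MySQL only so that it runs without a server. The table and column names (threads, thread_map, thread_key) are hypothetical. The final SELECT has the shape that would go into sql_query: an integer id first, then the fields to index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table whose primary key is a string, not an integer.
    CREATE TABLE threads (
        thread_key  TEXT PRIMARY KEY,
        name        TEXT,
        description TEXT
    );
    -- Mapping table: an auto-assigned integer id per base-table key.
    CREATE TABLE thread_map (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        thread_key TEXT NOT NULL UNIQUE REFERENCES threads(thread_key)
    );
    INSERT INTO threads VALUES
        ('abc-1', 'First',  'first thread'),
        ('xyz-9', 'Second', 'second thread');
    -- Give every base row an integer id.
    INSERT INTO thread_map (thread_key)
        SELECT thread_key FROM threads ORDER BY thread_key;
""")

# Integer id first, then the text fields to index.
documents = conn.execute("""
    SELECT m.id, t.name, t.description
    FROM thread_map m
    INNER JOIN threads t ON t.thread_key = m.thread_key
    ORDER BY m.id
""").fetchall()

for row in documents:
    print(row)
```

Because the ids are stored in the mapping table, each document keeps the same id from one reindex to the next.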
I hope this helps you understand what you need to add and modify, and how the Sphinx search engine can be configured to meet your requirements. Come back here if you need more help.

Related

Converting MySQL to a Sphinx Search Platform

Currently working on an in-house search engine for over 12 GB / month of MySQL data.
We currently have two tables, practice prescribing, and practice information.
Both tables contain a column, practice number which identifies the practice information with their prescribing information.
I'm trying to migrate the system from MySQL searching to Sphinx Search.
The issue I'm having is that the format of the practice number is STR:NUM:NUM.
Sphinx Search says that this is an invalid or NULL ID format and that an ID needs to be just NUM.
An example of our current IDs is YV0091, which has corresponding data in both tables.
The IDs cannot be changed or manipulated because they are standardised IDs in our industry.
What should I do to get around this?
Well, the document ID itself, in effect Sphinx's 'primary key', does need to be a simple integer. But it doesn't need to match an actual column in your database. (A bit like InnoDB: if a table doesn't have an integer primary key, it creates a 'rowid' internally.)
Alas, Sphinx doesn't have an 'autoincrement'-style way of allocating the ID, so you need to contrive it yourself, for example using a MySQL user variable:
sql_query_pre = SET @rowid:=1
sql_query = SELECT @rowid:=@rowid+1 AS id, practice_id, name, ...
sql_attr_string = practice_id
The example also stores your practice ID as a string attribute. This means you can still get it in queries: rather than using SELECT id FROM ... in SphinxQL, you can do SELECT practice_id FROM ... instead.
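To make the effect of the counter concrete, here is a small sketch in Python with the built-in sqlite3 module. It does not use MySQL user variables. Instead it numbers the rows in application code with enumerate, which yields the same kind of result set: a synthetic integer id with the original practice_id kept alongside it. The table name practices, the names, and the second ID are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE practices (practice_id TEXT PRIMARY KEY, name TEXT);
    INSERT INTO practices VALUES
        ('YV0091', 'Example Practice A'),
        ('AB1234', 'Example Practice B');
""")

rows = conn.execute(
    "SELECT practice_id, name FROM practices ORDER BY practice_id"
)

# Synthetic integer document id; practice_id travels along as an attribute.
documents = [
    {"id": doc_id, "practice_id": practice_id, "name": name}
    for doc_id, (practice_id, name) in enumerate(rows, start=1)
]

# Map each industry ID back to the synthetic id assigned in this run.
by_practice = {doc["practice_id"]: doc["id"] for doc in documents}
print(by_practice)
```

Ids assigned by a running counter depend on row order, so they can change between index builds. Only practice_id should be treated as a stable identifier.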

MySQL Command what does a point mean?

I'm a newbie in MySQL and have to write an implementation of a custom MySQL ASP.NET Identity storage provider.
I am following this tutorial, and the first steps are done.
https://learn.microsoft.com/en-us/aspnet/identity/overview/extensibility/implementing-a-custom-mysql-aspnet-identity-storage-provider
Now I have the following MySQL command:
"Select Roles.Name from UserRoles, Roles where UserRoles.UserId = @userId and UserRoles.RoleId = Roles.Id"
My problem is that I don't know what the tables have to look like for this query.
I would say:
Table name: Roles
Select: Roles and Name? Or is it one name?
The same goes for UserRoles.UserId and UserRoles.RoleId.
What does the point mean?
Thanks a lot
Your question is quite unclear; however, if I understood correctly, you can't work out how the database schema you are using is structured or what you'll get from this query.
The query you have written SELECTs the column called Name from the table called Roles. To do this, it uses data from two tables: one is the Roles table itself, the other is called UserRoles.
It extracts Name only from the Roles rows whose Id matches the RoleId of the UserRoles rows whose UserId equals the given @userId.
In other words, this SELECT query gives you a list of Names from the rows in the Roles table that satisfy the condition written after the SQL WHERE keyword: where UserRoles.UserId = @userId and UserRoles.RoleId = Roles.Id.
Finally, the point "." in SQL queries is used to disambiguate between fields (or columns, if you prefer) that have the same name but come from different tables. It is quite common for every table to have an Id field, for example. You can identify the correct Id field in your database by writing Table1.Id, Table2.Id, and so on. Even if you don't have naming conflicts among your tables' columns, adding the table name can be very good for code readability.
Edit:
As other users correctly pointed out in the comments to your question, you should also have a look at what an SQL JOIN operation is. Since you are searching data using information from different tables, you are actually doing an implicit JOIN on those tables.
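As a rough illustration of the table layout and the implicit JOIN, the sketch below builds two small tables and runs the query both ways. It uses Python's built-in sqlite3 module rather than MySQL so that it runs without a server. The column types and sample rows are assumptions, and the :userId placeholder stands in for the @userId parameter of the original command.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Roles (Id TEXT PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE UserRoles (
        UserId TEXT NOT NULL,
        RoleId TEXT NOT NULL REFERENCES Roles(Id),
        PRIMARY KEY (UserId, RoleId)
    );
    INSERT INTO Roles VALUES ('r1', 'Admin'), ('r2', 'Editor'), ('r3', 'Guest');
    INSERT INTO UserRoles VALUES ('u1', 'r1'), ('u1', 'r2'), ('u2', 'r3');
""")

# Implicit join, as in the question (comma-separated tables + WHERE).
implicit = conn.execute(
    "SELECT Roles.Name FROM UserRoles, Roles "
    "WHERE UserRoles.UserId = :userId AND UserRoles.RoleId = Roles.Id "
    "ORDER BY Roles.Name",
    {"userId": "u1"},
).fetchall()

# The same query written with an explicit INNER JOIN.
explicit = conn.execute(
    "SELECT Roles.Name FROM UserRoles "
    "INNER JOIN Roles ON UserRoles.RoleId = Roles.Id "
    "WHERE UserRoles.UserId = :userId "
    "ORDER BY Roles.Name",
    {"userId": "u1"},
).fetchall()

print(implicit, explicit)
```

Both forms return the same rows. The explicit JOIN just keeps the join condition separate from the filter on the user.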

Mysql: query not giving accurate result with IN clause and inner query

I'm trying to get zip codes from the zip_ids that are stored internally in the companies service table. The screenshots below should give you a clear idea.
I have written this query: [image not preserved]
companies service table: [image not preserved]
Please share your views. Thanks in advance.
As already mentioned, your database schema is not very well designed; it violates even first normal form. You would need another table that stores serv_area_id and zip_code (possibly with multiple rows for a single serv_area_id), search within that table, and then join it to your original table.
Nevertheless, to get the result you describe you cannot use the IN operator, because it compares a single value against a set of values given as a table (either a nested SELECT or an enumeration literal (val1, ..., valN)), not against a delimited string stored in one column. I would try some string matching as illustrated below. However, consider it an ugly hack rather than a correct solution(!)
SELECT zip FROM cities_extended WHERE (
SELECT GROUP_CONCAT(',', serv_are_zipcodes)
FROM company_service_areas WHERE ...
) LIKE concat('%(', id, ')%')
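For comparison, the normalized layout suggested at the start of this answer could look roughly like the following sketch. It uses Python's built-in sqlite3 module so that it runs without a MySQL server. The table service_area_zips, its columns, and the sample rows are hypothetical; only cities_extended(id, zip) is taken from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities_extended (id INTEGER PRIMARY KEY, zip TEXT NOT NULL);
    -- One row per (service area, zip id) pair instead of a packed string.
    CREATE TABLE service_area_zips (
        serv_area_id INTEGER NOT NULL,
        zip_id       INTEGER NOT NULL REFERENCES cities_extended(id),
        PRIMARY KEY (serv_area_id, zip_id)
    );
    INSERT INTO cities_extended VALUES (1, '10001'), (2, '10002'), (3, '90210');
    INSERT INTO service_area_zips VALUES (7, 1), (7, 3), (8, 2);
""")

# A plain join replaces the string matching.
zips = [
    zip_code
    for (zip_code,) in conn.execute(
        """
        SELECT c.zip
        FROM service_area_zips s
        INNER JOIN cities_extended c ON c.id = s.zip_id
        WHERE s.serv_area_id = ?
        ORDER BY c.zip
        """,
        (7,),
    )
]
print(zips)
```

With one row per zip id, an ordinary join (or an IN with a nested SELECT) works as expected, and the lookup can use an index.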

mysql query and index how to do it

I need a hand indexing a large table, and I have no idea how to index MySQL tables.
This is the query I use to fetch ordered rows from the table:
"SELECT posts.*, user.nickname AS nickname
FROM `posts`
LEFT JOIN user AS user ON (user.userid = posts.userid)
WHERE
posts.userid = '" . intval($bbinfo['userid']) . "'
ORDER BY posts.timestamp DESC
LIMIT $start, $_limit
"
How can I index this table? Do I have to do it after inserting each new post, or by altering the table? Where and when can I use an index, and how? Please help.
Just create the index and define how it works; after that you have nothing else to do. If the MySQL optimizer thinks your index should be used, it will use it. And when you insert or update data, the index is maintained automatically.
Now the hard part is the definition of the index.
You can think of an index as an ordering, like a phone book. Your phone book is ordered by city, then by last name, and then by first name. An index is an ordering stored alongside the table that the engine can use to find results faster than it could by reading the whole table.
In a phone book there is only one index, so the data itself is ordered by that index. In a database you can have several indexes, so they are stored alongside the table and contain pointers to the actual rows.
Indexes are very important when you search data. You can easily find people named Smith in New York. It's harder to find all the Smiths in all US cities (with a phone book).
In your query there are two clauses that may benefit from an index: you are filtering by user and then ordering by timestamp.
If you create an index on user and then timestamp, the engine can find the matching rows, already in the right order, by simply reading the index.
So I would create this one:
CREATE INDEX posts_user_and_timestamp_idx ON posts(userid, timestamp DESC);
This index can also be reused for any query that simply filters by user (like the phone book: you can easily extract the pages for one city). But it cannot be used for queries where the only filter is the timestamp (for that you would need an index on timestamp alone, just as it is hard to extract all the Smiths in all cities from the phone book).
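One way to check which queries can use such an index is to ask the database for its query plan. The sketch below does this with Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN, purely as a runnable stand-in: MySQL has its own EXPLAIN statement with different output, and its optimizer may choose differently. The posts columns other than userid and timestamp are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        postid    INTEGER PRIMARY KEY,
        userid    INTEGER NOT NULL,
        timestamp INTEGER NOT NULL,
        body      TEXT
    );
    CREATE INDEX posts_user_and_timestamp_idx ON posts(userid, timestamp DESC);
""")


def query_plan(sql, params=()):
    """Return SQLite's plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


# Filter on userid, sort on timestamp: matches the index column order.
plan_by_user = query_plan(
    "SELECT * FROM posts WHERE userid = ? ORDER BY timestamp DESC LIMIT 10",
    (42,),
)

# Filter on timestamp only: the leading index column is not constrained.
plan_by_time = query_plan(
    "SELECT * FROM posts WHERE timestamp > ?",
    (0,),
)

print(plan_by_user)
print(plan_by_time)
```

The first plan names the composite index, while the second falls back to scanning the table, which mirrors the phone book analogy.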
So the main difficulty with indexes is that they depend heavily on the queries you usually run against the database. If your queries on a table vary a lot, you will need many different indexes. And indexes take up a lot of space: on heavily indexed tables they can use several times more physical space than the data itself.
You should find a MySQL admin tool that works for you, since schema changes to your databases, including adding indexes, are a very common task.
I use MySQL Workbench for most schema manipulation, including setting indexes on tables. It is a free admin app for MySQL databases. If you don't have it, download it.
http://dev.mysql.com/downloads/workbench/5.1.html
Open your database in Workbench, right-click the table you want to add the index to, and choose Alter Table... Then click Indexes at the bottom of the window to see and edit the table's indexes.
You can also use phpMyAdmin, which is a little more complex and a little harder to install, IMHO.
I drilled down into my Program Files directory (Windows XP) to find the phpMyAdmin executable file, which launched the app.
In phpMyAdmin 3.2.1, open your schema and click on the table. This presents a GUI menu that lets you easily specify an index using the icon with the lightning bolt to the right of the column to be indexed.
You only need to add an index once. There is no need to do anything after every INSERT. Based on what you have in your post, I would try something like this:
CREATE INDEX posts_userid_idx ON posts(userid);
If that doesn't seem to work very well, I would advise you to check the MySQL documentation on CREATE INDEX and see whether any of the available options apply to your situation.
Based on your (revised) comment, you should also add a PRIMARY KEY on postid:
ALTER TABLE posts ADD PRIMARY KEY (postid);
And yes, you should be able to run both of those commands in MySQL Workbench as you would any other query.

Joining a table stored within a column of the results

I want to try and keep this as one query and not use PHP, but it's proving to be tough.
I have a table called applications, that stores all the applications and some basic information about them.
Then, I have a table with all the types of applications in it, and that table contains a reference to another table which stores more specific data about the specific type of application in question.
select applications.id as appid, applications.category, type.title as type, type.id as tid, type.valuefld, type.tablename
from applications
left join type on applications.typeid=type.id
left join department on type.deptid=department.id
where not isnull(work_cat)
and work_cat != ''
and applications.deleted=0
and datei between '10-04-14' and '11-04-14'
order by type, work_cat
Now, in the old version, there is another query on every single result. Over hundreds of results... that sucks.
This is the query I'd like to integrate so I can get all the data in one result row. (Old is ASP, I'm re-writing it in PHP)
query = "select sum("&adors.fields("valuefld")&") as cost, description from "&adors.fields("tablename")&" where appid = '"&adors.fields("tablename")&"'"
Prepared statements, I'm aware, are the best solution, but for now they are not an option.
You can't do this with a plain SQL query - you need to have a defined set of tables that your query is based on. The fact that your current implementation queries from whatever table is named by tablename from the first result-set means that to get this all in one query, you will have to restructure your data. You have to know what tables you're querying from rather than having it dynamic.
If the reason for these different tables is the different information stored in each requiring different record (column) structures, you might want to look into Key/Value pair storage in a large table. Once you combine the dynamically named ones into a single location you can integrate your two queries together.
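As a rough sketch of what combining the per-type tables buys you, the example below stores all cost rows in one shared table and fetches each application with its summed cost in a single query. It uses Python's built-in sqlite3 module so that it runs without a MySQL server. The table application_values, its columns, and the sample data are hypothetical, and this is a simplified variant of the key/value idea rather than a full design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE applications (id INTEGER PRIMARY KEY, category TEXT);
    -- One shared table instead of one cost table per application type.
    CREATE TABLE application_values (
        appid       INTEGER NOT NULL REFERENCES applications(id),
        description TEXT,
        cost        REAL NOT NULL
    );
    INSERT INTO applications VALUES (1, 'permit'), (2, 'license');
    INSERT INTO application_values VALUES
        (1, 'filing fee', 100.0),
        (1, 'inspection', 50.0),
        (2, 'filing fee', 75.0);
""")

# One query returns each application together with its summed cost.
totals = conn.execute("""
    SELECT a.id AS appid, a.category, SUM(v.cost) AS cost
    FROM applications a
    LEFT JOIN application_values v ON v.appid = a.id
    GROUP BY a.id, a.category
    ORDER BY a.id
""").fetchall()

print(totals)
```

Because the table name is now fixed, the per-row follow-up query from the old ASP code turns into an ordinary join.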