I'm currently working on a Symfony project at work, and we are using Lucene for our search engine.
I was trying to use an SQLite in-memory database for unit tests (we are using MySQL otherwise), but I stumbled upon a problem.
The search engine part of the project uses Lucene indexing. Basically, you query it and get back an ordered list of IDs, which you can then use to query your database with a WHERE IN () clause.
The problem is that there is an ORDER BY FIELD(id, ...) clause in the query, which orders the results in the same order as those returned by Lucene.
Is there any alternative to ORDER BY FIELD in SQLite? Or is there another way to order the results the same way Lucene does?
Thanks :)
Edit:
A simplified query might look like this:
SELECT i.* FROM item i
WHERE i.id IN(1, 2, 3, 4, 5)
ORDER BY FIELD(i.id, 5, 1, 3, 2, 4)
This is quite nasty and clunky, but it should work. Create a temporary table, and insert the ordered list of IDs, as returned by Lucene. Join the table containing the items to the table containing the list of ordered IDs:
CREATE TABLE item (
  id INTEGER PRIMARY KEY ASC,
  thing TEXT);

INSERT INTO item (thing) VALUES ('thing 1');
INSERT INTO item (thing) VALUES ('thing 2');
INSERT INTO item (thing) VALUES ('thing 3');

-- The autoincrementing id records the position of each item_id
-- in the ordering returned by Lucene
CREATE TEMP TABLE ordered (
  id INTEGER PRIMARY KEY ASC,
  item_id INTEGER);

INSERT INTO ordered (item_id) VALUES (2);
INSERT INTO ordered (item_id) VALUES (3);
INSERT INTO ordered (item_id) VALUES (1);

SELECT item.thing
FROM item
JOIN ordered
  ON ordered.item_id = item.id
ORDER BY ordered.id;
Output:
thing 2
thing 3
thing 1
Yes, it's the sort of SQL that will make people shudder, but I don't know of a SQLite equivalent for ORDER BY FIELD.
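As an aside, and not part of the temporary-table approach above: the same fixed ordering can also be spelled out inline with a CASE expression, which SQLite supports. A minimal sketch against the simplified query from the question:

SELECT i.*
FROM item i
WHERE i.id IN (1, 2, 3, 4, 5)
ORDER BY CASE i.id
           WHEN 5 THEN 1
           WHEN 1 THEN 2
           WHEN 3 THEN 3
           WHEN 2 THEN 4
           WHEN 4 THEN 5
         END;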
Related
I have an items table with a category_id field.
There is a specific rule to order the items by category_id.
I usually sort the data like this:
SELECT * FROM items ORDER BY FIELD(category_id, 2, 5, 1, 4, 3)
-- In this example, the "rule" is sorting in order of (2, 5, 1, 4, 3)
In this case, simply creating an index on the category_id field does not speed up sorting the items, because the index sorts category_id in plain ascending order, i.e. (1, 2, 3, 4, 5).
Is it possible to specify the sorting rule when I CREATE INDEX on the category_id field?
(So that a simple SELECT * FROM items ORDER BY category_id would work.)
Or do I have to create another field, such as sorted_category_id, whose values follow the ordering rule?
Adding the column to the items table, with an index on it, would indeed be a solution focused on speed. By making it a generated column, you ensure consistency, and by making it a virtual column, the extra data is not stored in the table itself but only in the index (if you create one). So proceed like this:
ALTER TABLE items ADD (
category_ord int GENERATED ALWAYS AS (FIELD(category_id, 2, 5, 1, 4, 3)) VIRTUAL
);
CREATE INDEX idx_items_category_ord ON items(category_ord);
SELECT * FROM items ORDER BY category_ord;
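As a quick sanity check of what the generated column holds: FIELD() returns the 1-based position of its first argument in the list, so every category_id is mapped to its rank in the rule (2, 5, 1, 4, 3). For example:

SELECT FIELD(2, 2, 5, 1, 4, 3) AS ord_for_cat2,  -- 1
       FIELD(5, 2, 5, 1, 4, 3) AS ord_for_cat5,  -- 2
       FIELD(1, 2, 5, 1, 4, 3) AS ord_for_cat1,  -- 3
       FIELD(4, 2, 5, 1, 4, 3) AS ord_for_cat4,  -- 4
       FIELD(3, 2, 5, 1, 4, 3) AS ord_for_cat3;  -- 5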
Alternative
Alternatively, the normalised way is to add a column to the category table. This will have a slight performance impact if you have many categories, but does not pose that consistency problem, and saves space. To implement that idea, proceed as follows:
If you don't have that category table, then create it:
CREATE TABLE category(
id int NOT NULL PRIMARY KEY,
ord int NOT NULL,
name varchar(100)
);
Populate the ord field (or whatever you want to call it) as desired:
INSERT INTO category(id, ord, name) VALUES
(1, 30, 'cat1'),
(2, 10, 'cat2'),
(3, 50, 'cat3'),
(4, 40, 'cat4'),
(5, 20, 'cat5');
And add an index on the ord column:
CREATE INDEX category_ord ON category(ord);
Now the query would be:
SELECT *
FROM items
INNER JOIN category
ON items.category_id = category.id
ORDER BY category.ord;
The database engine can now decide whether or not to use the index on the ord column, depending on its own analysis. If you want to force its use, you can use FORCE INDEX:
SELECT *
FROM items
INNER JOIN category FORCE INDEX (category_ord)
ON items.category_id = category.id
ORDER BY category.ord;
Note that the engine can also use your index on items.category_id, for value-by-value lookups.
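If items.category_id does not have an index yet, one can be added as well; a minimal sketch (the index name is just an example):

CREATE INDEX idx_items_category_id ON items(category_id);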
As Akina says, I can use Generated Columns.
https://dev.mysql.com/doc/refman/8.0/en/create-index.html
I have two tables in MySQL, categories and sub_categories.
Note that category_id is a FOREIGN KEY referencing category_key in the categories table.
I would like to INSERT catalog items into a third table (named "catalog"), such that every catalog ID is generated from its category and sub-category IDs, with an auto-incrementing number for each category/sub-category combination.
For example:
Suppose we have a row in the "categories" table whose category_key is "ABC", and a row in the "sub_categories" table whose sub_category_key is "EFG". Then one row in the catalog table will have the key "ABC-EFG-0001", the next one in the same sub-category will have "ABC-EFG-0002", and so on.
For other key values, for example category_key = "OMG" and sub_category_key = "YYY", the counter starts again from 1, so the ID will be "OMG-YYY-0001".
Can I have an example for an INSERT query to do that?
Thanks!
You can use ROW_NUMBER() (available as of MySQL 8.0) to assign the numbers and LPAD() to format them. Note that you don't need the categories table, since all the relevant information is available in sub_categories (but if you do need it, you just have to add a JOIN).
INSERT INTO catalog
SELECT
    sc.*,
    CONCAT(
        sc.category_key,
        '-',
        sc.sub_category_key,
        '-',
        LPAD(
            ROW_NUMBER() OVER (PARTITION BY sc.category_key, sc.sub_category_key ORDER BY sc.sub_category_key),
            4,
            '0'
        )
    )
FROM sub_categories sc;
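The original table definitions were posted as images and are not reproduced here, so for completeness, here is a minimal assumed schema under which the INSERT above runs; the column lists and types are guesses, and the real tables certainly have more columns:

-- Assumed minimal schema (a sketch only):
CREATE TABLE sub_categories (
    sub_category_key VARCHAR(10),
    category_key     VARCHAR(10)   -- key of the parent category
);

CREATE TABLE catalog (
    sub_category_key VARCHAR(10),
    category_key     VARCHAR(10),
    catalog_key      VARCHAR(32)   -- e.g. 'ABC-EFG-0001'
);

INSERT INTO sub_categories VALUES ('EFG', 'ABC'), ('YYY', 'OMG');
-- Running the INSERT above against this data would produce catalog_key
-- values 'ABC-EFG-0001' and 'OMG-YYY-0001'.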
I am having a problem when adding multiple rows at the same time.
I have a MySQL table in which I replace multiple rows with the same id.
Suppose I have 2 columns:
1) offer_id
2) categories
Using a PHP script I replace all the rows day by day, so I added a unique key for the offer_id and categories.
But the problem is that when there are two rows containing:
1) offer_id=2 and categories = ecomm
2) offer_id=2 and categories = market
my query runs as follows:
REPLACE INTO `affiliate_offer_get_categories_icube` (`offer_id`, `net_provider`, `categories`) VALUES
(2, 'icube', 'Marketplace');
REPLACE INTO `affiliate_offer_get_categories_icube` (`offer_id`, `net_provider`, `categories`) VALUES
(2, 'icube', 'Ecoommerce');
In the above statements I need to add two rows with the same offer_id but different categories,
but I am getting only one row as a result (I need to keep the values for both categories).
Sounds like you need your unique index to span both columns. Drop the unique index you have and create a new one with:
CREATE UNIQUE INDEX idx_whatever_name ON your_tablename (offer_id, categories);
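For this particular table, the full change might look like the sketch below; the name of the existing single-column unique index is a guess, so check SHOW INDEX first:

-- Find the name of the current unique index:
SHOW INDEX FROM affiliate_offer_get_categories_icube;

-- Drop the old single-column unique index (the name here is hypothetical):
ALTER TABLE affiliate_offer_get_categories_icube DROP INDEX offer_id;

-- Add a unique index spanning both columns:
ALTER TABLE affiliate_offer_get_categories_icube
  ADD UNIQUE INDEX idx_offer_categories (offer_id, categories);

-- After this, the two REPLACE statements from the question no longer collide
-- on offer_id alone, so both category rows are kept.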
I have a database schema with two tables, song and editedsong. These tables are identical, except for one extra column in editedsong called deleted. The editedsong table contains a reference to the id in the song table. I want to find all the songs which aren't deleted.
I have a UNION statement in which I GROUP BY the id of the result of two SELECT statements. I want to exclude results where the deleted column has the value 1. An example of the setup can be seen below.
CREATE TABLE IF NOT EXISTS song
(
  id int(11) NOT NULL AUTO_INCREMENT,
  title varchar(255),
  PRIMARY KEY (id)
);
CREATE TABLE IF NOT EXISTS editedsong
(
  id int(11) NOT NULL AUTO_INCREMENT,
  title varchar(255),
  deleted tinyint(1),
  PRIMARY KEY (id)
);
INSERT INTO song (id, title) VALUES
(1, 'Born in the USA');
INSERT INTO editedsong (id, title, deleted) VALUES
(1, 'Born in the USA', 1);
And the query is here:
SELECT * FROM
((SELECT *, 0 AS deleted FROM song WHERE id=1)
UNION
(SELECT * FROM editedsong WHERE id=1)) AS song
WHERE song.deleted!=1
GROUP BY song.id
The UNION statement is used instead of a join because there is a LOT of text in these two tables and a join results in writing to disk. This is a simplified form of the real query, but it reproduces the problem I'm experiencing. I would expect the query to yield no results, as the GROUP BY should preserve the first row and throw away all following rows. Why doesn't it do this? Is it because the WHERE is executed before the GROUP BY? If it is, what is a good way to overcome this problem?
http://sqlfiddle.com/#!2/5cdb6c/3
The reason that the code in the SQLFiddle doesn't work is that the WHERE clause is excluding the deleted record from editedsong before the GROUP BY is executed.
You can use HAVING to apply criteria after a GROUP BY clause.
This appears to work:
SELECT *, max(deleted) as md FROM
((SELECT *, 0 AS deleted FROM song)
UNION
(SELECT * FROM editedsong)) AS song
-- WHERE song.deleted!=1
GROUP BY song.id
HAVING md != 1
This returns the record from song, not the record from editedsong for records that haven't been deleted. If you want the other, reverse the order of the items in the UNION clause.
This syntax for GROUP BY is unusual, and I'm surprised it's supported. Most database systems I've worked with require every field in the output to have some treatment specified (MAX, COUNT, GROUP BY, etc.), so a SELECT * is incompatible with GROUP BY. MySQL must be making some assumption or applying some default behaviour here, but I think most servers wouldn't like it (and neither do I).
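For what it's worth, MySQL's ONLY_FULL_GROUP_BY SQL mode (enabled by default since 5.7.5) rejects this kind of query as well. A sketch of a more portable rewrite, in which every selected column is either grouped or aggregated; it assumes id, title and deleted are the only columns, and note that MAX(title) no longer guarantees which version of a differing title you get:

SELECT song.id,
       MAX(song.title)   AS title,
       MAX(song.deleted) AS md
FROM ((SELECT id, title, 0 AS deleted FROM song)
      UNION
      (SELECT id, title, deleted FROM editedsong)) AS song
GROUP BY song.id
HAVING MAX(song.deleted) != 1;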
I'm using MySQL and I have a database that I'm trying to use as the canonical version of the data that I will be storing in different indexes. That means I have to be able to retrieve a set of rows from it quickly by primary key, but I also need them sorted on the way out. I can't figure out how to get MySQL to do that efficiently.
My table looks like the following:
CREATE TABLE demo.widgets (
id INT AUTO_INCREMENT,
-- lots more information I need
awesomeness INT,
PRIMARY KEY (id),
INDEX IDX_AWESOMENESS (awesomeness),
INDEX IDX_ID_AWESOMENESS (id, awesomeness)
);
And I want to do something along the lines of:
SELECT *
FROM demo.widgets
WHERE id IN (
1,
2,
3,
5,
8,
13 -- you get the idea
)
ORDER BY awesomeness
LIMIT 50;
But unfortunately I can't seem to get good performance out of this. It always has to resort to a filesort. Is there a way to get better performance from this setup, or do I need to consider a different database?
This is explained in the MySQL documentation on ORDER BY Optimization. In particular, if the index used to select rows in the WHERE clause is different from the one used in the ORDER BY, MySQL cannot use an index for the ORDER BY.
To get an optimized query for this kind of fetch-and-sort, you need a key that orders by the sort column first and the primary key second, like so:
create table if not exists test.fetch_sort (
id int primary key,
val int,
key val_id (val, id)
);
insert into test.fetch_sort
values (1, 10), (2, 5), (3, 30);
explain
select *
from test.fetch_sort
where id in (1, 2, 3)
order by val;
This will give a query that only uses the index for searching/sorting.
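Translated back to the widgets table from the question, that means a composite index on (awesomeness, id); a sketch (the index name is made up, and whether the optimizer actually picks it for a SELECT * depends on how many rows the IN list matches):

ALTER TABLE demo.widgets ADD INDEX IDX_AWESOMENESS_ID (awesomeness, id);

EXPLAIN
SELECT *
FROM demo.widgets
WHERE id IN (1, 2, 3, 5, 8, 13)
ORDER BY awesomeness
LIMIT 50;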