Couchbase Secondary index on filter - couchbase

I am working with Couchbase 6.0. I know that we can create a secondary index on a filter.
So I have created an index like:
CREATE INDEX idx_zipcode
ON userbucket(zipcode)
WHERE status = "active";
I have a question here:
Can I create an index with a filter clause whose value is dynamic?
Something like this:
CREATE INDEX idx_zipcode
ON userbucket(zipcode)
WHERE status = ? ;
My second question is:
which one is better in terms of performance?
A single index on 2 fields
CREATE INDEX idx_1 ON userbucket(fname, lname)
or
Separate indexes on each field
CREATE INDEX idx_1 ON userbucket(fname)
CREATE INDEX idx_2 ON userbucket(lname)

No, we cannot create an index whose filter clause accepts a bind variable; the WHERE expression of an index must be a constant.
However, when you know that status and zipcode will both appear as predicates but their values are dynamic, a composite index like the one below is handy.
CREATE INDEX idx_zipcode_status ON userbucket(zipcode, status);
Refer to the Couchbase blog on creating the right index for the right performance.
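For example, a parameterized query like the following (a sketch; the $status named parameter and the literal zipcode value are just illustrations) can use idx_zipcode_status, because both index keys appear in the predicate:
SELECT META().id, zipcode
FROM userbucket
WHERE zipcode = "94040" AND status = $status;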
Regarding the second question, the same principle applies:
index selection for a query depends solely on the filters in the WHERE
clause of your query.
A composite secondary index is also fine when your query filters on both keys, or at least on the leading key.
CREATE INDEX idx_1 ON userbucket(fname, lname)
The above index would be exploited by queries like:
SELECT * FROM userbucket WHERE fname = 'fnam' AND lname = 'lnam';
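It should also serve a query that filters only on the leading key, for example (a sketch with an illustrative value):
SELECT * FROM userbucket WHERE fname = 'fnam';
A query that filters only on lname, however, cannot use this composite index efficiently, because lname is not the leading key.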

Related

How to index table for different single-column clauses

I have the following table:
CREATE TABLE Test (
device varchar(12),
pin varchar(4),
authToken varchar(32),
Primary Key (device)
);
At different points in the application I need to query this table by a different single-column clause, meaning I have the following queries:
SELECT * FROM Test WHERE device = ?;
SELECT * FROM Test WHERE authToken = ?;
SELECT * FROM Test WHERE pin = ?;
As I understand it, in this scenario a combined index of (device, authToken, pin) makes no sense, because that would only speed up the first query, not the second or third.
Reading speed is more important than writing for this table, so would simply indexing each column individually be the optimal solution here?
The straightforward answer is to create separate single-column indexes for each query:
create index ix1 on Test (device); -- not needed, since device is already the PK
create index ix2 on Test (pin);
create index ix3 on Test (authToken);
The first query (by device) uses the primary index. The second and third ones could be slower, since they suffer from the usual "secondary index" penalty: they always need to access the secondary index first and then the primary index, which can become slow if you are selecting a high number of rows.
Now, if you want to go overboard in terms of SELECT speed at the expense of slower modifications (INSERT, UPDATE, and DELETE), you can use "covering indexes" tailored to each query. These would look like:
create index ix4 on Test (device, pin, authToken); -- [not needed] optimal for WHERE device = ?
create index ix5 on Test (authToken, device, pin); -- optimal for WHERE authToken = ?
create index ix6 on Test (pin, device, authToken); -- optimal for WHERE pin = ?
Note: as indicated by Rick James, ix4 is redundant with the primary-key index that InnoDB tables already have. There's no need to create it; it's listed here only for completeness.
These "covering indexes" use only the secondary index, resolving the query without accessing the primary index at all. They are much faster when a high number of rows is retrieved.
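For example (a sketch; the explicit column list just makes the covering behavior visible), ix5 can answer this query entirely from the secondary index:
SELECT device, pin, authToken FROM Test WHERE authToken = ?;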
You don't need to index the device column, as it's already indexed (it's the primary key). For the other 2 columns (i.e. pin and authToken), yes, given the queries you shared, it's better to index each of them individually.
Note that the performance improvement becomes significant when a high number of such queries hits the server and the table holds a large dataset.
To answer:
"How to index table for different single-column clauses?"
CREATE INDEX Test_device_index ON Test(device);
CREATE INDEX Test_authToken_index ON Test(authToken DESC);
CREATE INDEX Test_pin_index ON Test(pin);
Here's the schema I'd suggest:
CREATE TABLE Test (
id SERIAL PRIMARY KEY,
device VARCHAR(255),
pin VARCHAR(255),
authToken VARCHAR(255),
UNIQUE KEY index_authToken (authToken),
UNIQUE KEY index_device (device),
KEY index_pin (pin)
);
Where you have an id type column that's not associated with any particular data, and you have UNIQUE constraints on authToken and device.
Remember to index any column used in a WHERE clause, and test your coverage with something like:
EXPLAIN SELECT ... FROM Test WHERE pin=?
If you see a full table scan in the plan, then you are missing an index.
It's also a good idea to use VARCHAR(255) as a default unless you have a very compelling reason to restrict it. Enforce length restrictions in your application layer, where they can easily be relaxed later. For example, changing from a 4-digit to a 6-digit PIN is a simple code change that can even be rolled out incrementally; it's not a schema alteration.

CREATE UNIQUE INDEX using where clause

I'm having trouble using a WHERE clause on CREATE UNIQUE INDEX in MySQL. I know you cannot just add a WHERE clause at the end of CREATE UNIQUE INDEX. Example below.
CREATE UNIQUE INDEX FAKE_TABLE_INDEX ON TABLE_NAME (COLUMN_NAME) WHERE INACTIVE = 0;
The query above gives me an error. Is there an alternative that achieves the same thing?
MySQL doesn't have filtered indexes. If I understand what they do (from reading the Microsoft docs), I think the closest analogous feature is a multi-column index:
CREATE INDEX fake_table_index ON table_name (inactive, column_name);
This is more expensive than the filtered index because it indexes all the values of inactive, not just where inactive = 0.
This also doesn't have the unique constraint that the filtered index does. It's only useful for optimizing queries, not enforcing uniqueness. You'll have to do that with a trigger if you need it.
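If you do need the uniqueness guarantee, a trigger along these lines is one option (a sketch, reusing the table and column names from the question; the trigger name and error message are made up):
DELIMITER //
CREATE TRIGGER trg_unique_active BEFORE INSERT ON TABLE_NAME
FOR EACH ROW
BEGIN
  -- reject the insert if another active row already has this value
  IF NEW.INACTIVE = 0 AND EXISTS (
      SELECT 1 FROM TABLE_NAME
      WHERE COLUMN_NAME = NEW.COLUMN_NAME AND INACTIVE = 0
  ) THEN
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Duplicate active COLUMN_NAME';
  END IF;
END//
DELIMITER ;
A matching BEFORE UPDATE trigger would be needed as well to keep the rule airtight.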

What is the difference between single or composite column indexes? [duplicate]

This question already has answers here:
When should I use a composite index?
In any relational database, we can create indexes to boost query speed. But creating more indexes can hurt update/insert speed, because the DB system has to update each index whenever new data comes in (insert, update, merge, etc.).
Let's use an example.
We can create an index called index1:
ALTER TABLE tablename ADD INDEX index1 (order_id ASC, buyer_id ASC);
Or we can create 2 indexes, index2 and index3:
ALTER TABLE tablename ADD INDEX index2 (order_id ASC);
ALTER TABLE tablename ADD INDEX index3 (buyer_id ASC);
In a query like this:
select * from tablename where order_id > 100 and buyer_id > 100;
which one is faster: using index1, or index2 and index3?
On the other side of the equation, when inserting or updating, I assume it will be much faster to maintain just one index instead of 2, but I haven't tested it against MySQL or SQL Server, so I can't be sure. If anyone has experience on that matter, please share it.
And the last thing is about int-typed values: I thought it's not possible, or not useful, to create an index on int columns because it doesn't improve query time. Is that true?
The performance of an index is tied to its selectivity. Whether to use two single-column indexes or one composite index has to be assessed in the context of the application, or of a particularly performance-critical query: the point of a WHERE predicate on indexed fields is that the index reduces the number of rows to process (and to feed into joins).
In your case, since an order usually has only one buyer, the index (order_id, buyer_id) adds little selectivity beyond order_id alone (though it can still help join operations); the reverse, (buyer_id, order_id), would be more useful, since it facilitates searching for the orders of a given buyer.
For the exact query you mentioned I would personally go for index1 (you will get a seek operation covering both conditions at once). The same index will also do the job if you filter by order_id only, because order_id is the first column of the index, so the same B-tree structure still helps even if you omit the buyer.
At the same time, index1 would not help much if you filter by buyer_id only, because the B-tree is ordered first by the missing order_id, as per the index creation statement. You will probably end up with an index scan on index1, while separate indexes would still work in that scenario (a seek on index3 is what you should expect).
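A quick way to verify which index the optimizer actually picks is to compare the plans (a sketch, reusing the table and column names from the question):
EXPLAIN SELECT * FROM tablename WHERE order_id > 100 AND buyer_id > 100;
EXPLAIN SELECT * FROM tablename WHERE buyer_id > 100; -- index1 is unlikely to help here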

What happens if I put index on each column in a table

Let us consider that I have a table with 60 columns. I need to perform all kinds of queries on that table and need to join it with other tables as well. Almost all of its columns are used for searching data, both in this table and via joins with other tables. This table is like a primary table (like a primary key) in the database, so all tables are in relation with this table.
Considering the above scenario, can I create an index on each column of the table (60 columns)?
Is that good practice?
In a single sentence:
Is it best practice to create an index on each column in a table?
What might happen if I create an index on each column in a table,
where the index might be a primary key, a unique key, or a plain index?
Please comment if this question is unclear and I will try to improve it.
MySQL's documentation is pretty clear on this (in summary: use indexes on columns you will use in WHERE, JOIN, and aggregation functions).
Therefore there is nothing inherently wrong with creating an index on every column of a table, even one with 60 columns. The more indexes there are, the slower inserts and some updates will be, because MySQL has to maintain the keys; but if you don't create the indexes, MySQL has to scan the entire table whenever only non-indexed columns are used in comparisons and joins.
I have to say that I'm astonished that you would
Have a table with 60 columns
Have all of those columns used either in a JOIN or WHERE clause without dependency on any other column in the same table
...but that's a separate issue.
It is not best practice to create index on each column in a table.
Indexes are most commonly used to improve query performance when the column is used in a where clause.
Suppose you use this query a lot:
select * from tablewith60cols where col10 = 'xx';
then it would be useful to have an index on col10.
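For example (a sketch; the index name is arbitrary, the table and column names are taken from the query above):
CREATE INDEX idx_col10 ON tablewith60cols (col10);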
Note that primary keys by default have an index on them, so when you join the table with other tables you should use the primary key to join.
Adding an index means that the database has to maintain it, i.e. keep it updated, so the more writes you have, the more work goes into updating the indexes.
Creating indexes up front "just in case" is not a good idea; create an index only when you need it (or when the future need is pretty obvious).
Creating more indexes in SQL only increases search speed, while inserts and updates become slower and the indexes take up more storage.

MySQL Index + Query Processing

Assume I have this table:
create table table_a (
id int,
name varchar(25),
address varchar(25),
primary key (id)
) engine = innodb;
When I run this query:
select * from table_a where id >= 'x' and name = 'test';
How will MySQL process it? Will it pull all the matching ids first (assume 1000 rows) and then apply the where clause name = 'test'?
Or does it apply the name filter at the same time as it looks up the ids?
As id is the PK (and there is no index on name), it will load all rows that satisfy the id-based criterion into memory, after which it will filter the result set by the name criterion. Adding a composite index containing both fields would mean that it only loads the records that satisfy both criteria. Adding a separate single-column index on the name field may not result in an index-merge operation, in which case that index would have no effect.
Do you have indexes on either column? That may affect the execution plan. The other thing is that one might compare id against a numeric literal (or cast the 'x' placeholder to an integer) to ensure a numeric comparison instead of a string comparison.
For the best result, you should have a single index which includes both of the columns id and name.
In your case, I can't say what effect the primary index has on that query; it depends on the DBMS and version. If you really don't want to add more indexes (because more indexes mean slower writes and updates), just populate your table with about 10,000,000 random rows, try it, and see the effect.
You can compare the execution times by running the query first with id leading in the where clause and then again with name first. To see an example of MySQL performance with indexes, check out http://www.mysqlperformanceblog.com/2006/06/02/indexes-in-mysql/
You can get information on how the query is processed by running EXPLAIN on the query.
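For example (a sketch; 100 is an arbitrary value standing in for the 'x' placeholder):
EXPLAIN SELECT * FROM table_a WHERE id >= 100 AND name = 'test';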
If the idea is to optimize that query then you might want to add an index like:
alter table table_a add unique index name_id_idx (name, id);