Higher cardinality column first in an index when involving a range? - mysql

CREATE TABLE `files` (
`did` int(10) unsigned NOT NULL DEFAULT '0',
`filename` varbinary(200) NOT NULL,
`ext` varbinary(5) DEFAULT NULL,
`fsize` double DEFAULT NULL,
`filetime` datetime DEFAULT NULL,
PRIMARY KEY (`did`,`filename`),
KEY `fe` (`filetime`,`ext`), -- This?
KEY `ef` (`ext`,`filetime`) -- or This?
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ;
There are a million rows in the table. The filetimes are mostly distinct. There are only a few distinct ext values. So, filetime has a high cardinality and ext has a much lower cardinality.
The query involves both ext and filetime:
WHERE ext = '...'
AND filetime BETWEEN ... AND ...
Which of those two indexes is better? And why?

First, let's try FORCE INDEX to pick either ef or fe. The timings are too short to get a clear picture of which is faster, but EXPLAIN shows a difference:
Forcing the range on filetime first. (Note: The order in WHERE has no impact.)
mysql> EXPLAIN SELECT COUNT(*), AVG(fsize)
FROM files FORCE INDEX(fe)
WHERE ext = 'gif' AND filetime >= '2015-01-01'
AND filetime < '2015-01-01' + INTERVAL 1 MONTH;
+----+-------------+-------+-------+---------------+------+---------+------+-------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+-----------------------+
| 1 | SIMPLE | files | range | fe | fe | 14 | NULL | 16684 | Using index condition |
+----+-------------+-------+-------+---------------+------+---------+------+-------+-----------------------+
Forcing the low-cardinality ext first:
mysql> EXPLAIN SELECT COUNT(*), AVG(fsize)
FROM files FORCE INDEX(ef)
WHERE ext = 'gif' AND filetime >= '2015-01-01'
AND filetime < '2015-01-01' + INTERVAL 1 MONTH;
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
| 1 | SIMPLE | files | range | ef | ef | 14 | NULL | 538 | Using index condition |
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
Clearly, the rows column says ef is better. But let's check with the Optimizer trace. The output is rather bulky; I'll show only the interesting parts. No FORCE is needed; the trace will show both options and then pick the better one.
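For reference, a minimal way to capture such a trace in a session (standard MySQL; the SELECT is just the query under test):
SET optimizer_trace = 'enabled=on';
SELECT COUNT(*), AVG(fsize)
FROM files
WHERE ext = 'gif' AND filetime >= '2015-01-01'
AND filetime < '2015-01-01' + INTERVAL 1 MONTH;
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace = 'enabled=off';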
...
"potential_range_indices": [
...
{
"index": "fe",
"usable": true,
"key_parts": [
"filetime",
"ext",
"did",
"filename"
]
},
{
"index": "ef",
"usable": true,
"key_parts": [
"ext",
"filetime",
"did",
"filename"
]
}
],
...
"analyzing_range_alternatives": {
"range_scan_alternatives": [
{
"index": "fe",
"ranges": [
"2015-01-01 00:00:00 <= filetime < 2015-02-01 00:00:00"
],
"index_dives_for_eq_ranges": true,
"rowid_ordered": false,
"using_mrr": false,
"index_only": false,
"rows": 16684,
"cost": 20022, <-- Here's the critical number
"chosen": true
},
{
"index": "ef",
"ranges": [
"gif <= ext <= gif AND 2015-01-01 00:00:00 <= filetime < 2015-02-01 00:00:00"
],
"index_dives_for_eq_ranges": true,
"rowid_ordered": false,
"using_mrr": false,
"index_only": false,
"rows": 538,
"cost": 646.61, <-- Here's the critical number
"chosen": true
}
],
...
"attached_conditions_computation": [
{
"access_type_changed": {
"table": "`files`",
"index": "ef",
"old_type": "ref",
"new_type": "range",
"cause": "uses_more_keyparts" <-- Also interesting
}
}
With fe (range column first), the range could be used, but it estimated scanning through 16684 rows fishing for ext='gif'.
With ef (low cardinality ext first), it could use both columns of the index and drill down more efficiently in the BTree. Then it found an estimated 538 rows, all of which are useful for the query -- no further filtering needed.
Conclusions:
INDEX(filetime, ext) used only the first column.
INDEX(ext, filetime) used both columns.
Put columns involved in = tests first in the index regardless of cardinality.
The query plan won't go beyond the first 'range' column.
"Cardinality" is irrelevant for composite indexes and this type of query.
("Using index condition" means that the Storage Engine (InnoDB) will use columns of the index beyond the one used for filtering.)

Related

How to convert tinyint back to boolean when reading from SQL?

mysql> select * from movies;
+----------+-------+---------+
| movie_id | title | watched |
+----------+-------+---------+
| 1 | bo | 0 |
| 2 | NEW | 0 |
| 3 | NEW 2 | 0 |
+----------+-------+---------+
CREATE TABLE MOVIES (
movie_id INTEGER NOT NULL AUTO_INCREMENT,
title VARCHAR(50) NOT NULL,
watched BOOLEAN NOT NULL,
PRIMARY KEY (movie_id)
);
I am having to store the "watched" field as a tinyint instead of a typical boolean. I am trying to find a way of converting it back to boolean when reading from the table, so I don't have to loop through all responses and convert manually.
ie. {movie_id: 1, title: 'bo', watched: 0} ---> {movie_id: 1, title: 'bo', watched: false}
I have tried select cast but am unfamiliar with the syntax
MySQL stores BOOLEAN as TINYINT(1), so it saves and handles all booleans as 0 and 1.
That is actually very practical: a comparison already yields 1 or 0, so you can, for example, SUM it directly without CASE WHEN or a FILTER (see the sketch after the query below).
If you want the literal words True/False you still need a condition, but the result is only text, of course:
SELECT
movie_id, title,
CASE WHEN watched = 0 THEN 'False' ELSE 'True' END AS watched
FROM movies;
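As an aside on the SUM remark above, a minimal sketch against the movies table (the column aliases are mine, not from the question): because a comparison returns 1 or 0, it can be aggregated directly:
SELECT SUM(watched) AS watched_count,
SUM(watched = 0) AS unwatched_count
FROM movies;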
This is similar to 'IF' in 'SELECT' statement - choose output value based on column values
Borrowing from the answer there,
SELECT movie_id, IF (watched > 0, true, false) as bwatched, ...
Note that this assumes your schema still includes "NOT NULL" for watched. Without that (or some extra code) NULL would become false.
The way "IF()" works is IF(expression , value / expression if true, v /e if false)

jq select compare two max's of array within object

Let's say I have to following JSON file:
{
"1.1.1.1": {
"history_ban": [
"2021-05-02 14:30",
"2022-01-01 12:00"
],
"history_unban": [
"2021-05-09 14:30",
"2022-01-08 12:00"
]
},
"2.2.2.2": {
"history_ban": [
"2022-01-16 07:00"
],
"history_unban": []
},
"3.3.3.3": {
"history_ban": [
"2022-01-15 22:40"
]
}
}
My goal is to get all the keys where:
Max history_ban date is smaller than "2022-01-16 09:00"
Max history_unban date is empty/non-existent or smaller than Max history_ban date
I believe I have the majority of the query working as I wanted, but the 'Compare max unban with max ban' is not working. My current (not working) query is as follows:
to_entries[] | select((.value.history_ban != null) and (.value.history_ban | max < "2022-01-16 09:00") and ((.value.history_unban | length == 0 ) or (.value.history_unban | max < .value.history_ban | max))) | .key
I know my error is within (.value.history_unban | max < .value.history_ban | max) because, if I replace it with (.value.history_unban | max < "somedate") I get a working query.
The error I get is
jq: error (at :22): Cannot index array with string "value"
exit status 5
What do I need to do to select/compare these two max values?
Just to be sure, my expected result in this example is
"2.2.2.2"
"3.3.3.3"
First, the error in your attempt: jq's | has the lowest precedence, so in (.value.history_unban | max < .value.history_ban | max) everything after the first | runs with the history_unban array as its input, and .value then tries to index that array with a string. Parenthesising both sides, (.value.history_unban | max) < (.value.history_ban | max), already fixes your query. More compactly, you could use the alternative operator // to apply the extra constraint only when history_unban is present and not empty:
jq -r '
to_entries[] | select(.value
| (.history_ban | max) as $maxban
| $maxban < "2022-01-16 09:00"
and (.history_unban | length == 0 // $maxban > max)
).key
'
2.2.2.2
3.3.3.3

Json hash save into relational database in rows and columns

I want to read the values from JSON and need to create a new JSON, so is there any way we can save JSON into table rows and columns in Oracle? That would help to perform calculations on it; the calculation is too complex.
Here is the JSON sample (the JSON contains many hashes):
{
"agri_Expense": {
"input": 6000,
"max": 7500,
"check": 7500
},
"income3": {
"Hiring_income": 239750
},
"Operational_Cost1": [
{
"Field_input3": 10000,
"Minimum": "0.05",
"Check_Input": 26750,
"Tractor_Cost": "Maintenance"
}
]
}
You do not need PL/SQL, and can do it entirely in SQL.
I want to read the values from json [...] so is there any way that
we can save json in table and columns in oracle
Yes, use SQL to create a table:
CREATE TABLE table_name ( json_column CLOB CHECK ( json_column IS JSON ) )
and then INSERT the value there:
INSERT INTO table_name ( json_column ) VALUES (
'{'
|| '"agri_Expense": {"input": 6000,"max": 7500,"check": 7500},'
|| '"income3": {"Hiring_income": 239750},'
|| '"Operational_Cost1": [{"Field_input3": 10000,"Minimum": "0.05","Check_Input": 26750,"Tractor_Cost": "Maintenance"}]'
|| '}'
)
then, if you want individual values, SELECT using JSON_TABLE:
SELECT j.*
FROM table_name t
CROSS JOIN JSON_TABLE(
t.json_column,
'$'
COLUMNS (
agri_expense_input NUMBER PATH '$.agri_Expense.input',
agri_expense_max NUMBER PATH '$.agri_Expense.max',
agri_expense_check NUMBER PATH '$.agri_Expense.check',
income3_hiring_income NUMBER PATH '$.income3.Hiring_income',
NESTED PATH '$.Operational_Cost1[*]'
COLUMNS (
oc1_field_input3 NUMBER PATH '$.Field_input3',
oc1_minimum NUMBER PATH '$.Minimum',
oc1_check_input NUMBER PATH '$.Check_Input'
)
)
) j
Which outputs:
AGRI_EXPENSE_INPUT | AGRI_EXPENSE_MAX | AGRI_EXPENSE_CHECK | INCOME3_HIRING_INCOME | OC1_FIELD_INPUT3 | OC1_MINIMUM | OC1_CHECK_INPUT
-----------------: | ---------------: | -----------------: | --------------------: | ---------------: | ----------: | --------------:
6000 | 7500 | 7500 | 239750 | 10000 | .05 | 26750
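Since the stated goal is to run calculations on those values, a small follow-up sketch (the "remaining budget" expression is only illustrative, not from the question):
SELECT j.agri_expense_max - j.agri_expense_input AS remaining_agri_budget
FROM table_name t
CROSS JOIN JSON_TABLE(
t.json_column,
'$'
COLUMNS (
agri_expense_input NUMBER PATH '$.agri_Expense.input',
agri_expense_max NUMBER PATH '$.agri_Expense.max'
)
) j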

Search in mysql json column to check multiple criteria on same index of array

I'm using MySQL Server 8.0.17.
I want to get records where uId = 'UR000001' and view = 'Y' occur in the same array element of the security column (shown in the table below).
Viewid Security
VW0000000002 {"security": [{"uId": "UR000001", "edit": "N", "view": "Y"}, {"uId": "UR000002", "edit": "N", "view": "Y"}]}
VW0000000013 {"security": [{"uId": "UR000001", "edit": "N", "view": "N"}, {"uId": "UR000002", "edit": "N", "view": "Y"}]}
VW0000000014 {"security": [{"uId": "UR000001", "edit": "N", "view": "Y"}, {"uId": "UR000002", "edit": "N", "view": "Y"}]}
The JSON_SEARCH function searches all array elements of the record, which is not what I want.
Here is the query I tried, but it returns results matching (uId = 'UR000001' OR view = 'Y'):
SELECT viewid,
Json_search(`security`, 'one', 'UR000001', NULL, '$.security[*].uId'),
Json_search(`security`, 'one', 'Y', NULL, '$.security[*].view')
FROM vw_viewmaster
WHERE Json_search(`security`, 'one', 'UR000001', NULL, '$.security[*].uId')
AND Json_search(`security`, 'one', 'Y', NULL, '$.security[*].view');
Actual Result:(uID='UR000001' OR View='Y')
VW0000000002 "$.security[0].uId" "$.security[0].view"
VW0000000013 "$.security[0].uId" "$.security[1].view"
VW0000000014 "$.security[0].uId" "$.security[0].view"
Expected result:(uID='UR000001' AND View='Y')
VW0000000002 "$.security[0].uId" "$.security[0].view"
VW0000000014 "$.security[0].uId" "$.security[0].view"
In MySQL 8.0, you can use the handy JSON function json_table() to convert a JSON array to rows. You can then search the resultset.
The following query gives you all viewids that have at least one array element whose uId attribute equals 'UR000001' and whose view attribute is 'Y':
select v.viewid
from vw_viewmaster v
where exists (
select 1
from json_table(
v.security -> '$.security',
'$[*]'
columns(
uid varchar(50) path '$.uId',
edit varchar(1) path '$.edit',
view varchar(1) path '$.view'
)
) x
where x.uid = 'UR000001' and x.view = 'Y'
);
For your dataset, this produces:
| viewid |
| ------------ |
| VW0000000002 |
| VW0000000014 |
If you want the details of the matching array object(s), then:
select v.viewid, x.*
from vw_viewmaster v
cross join json_table(
v.security -> '$.security',
'$[*]'
columns(
rowid for ordinality,
uid varchar(50) path '$.uId',
edit varchar(1) path '$.edit',
view varchar(1) path '$.view'
)
) x
where x.uid = 'UR000001' and x.view = 'Y'
As a bonus, rowid gives you the index of the matching object in the JSON array (the first object has index 1).
This yields:
| viewid | rowid | uid | edit | view |
| ------------ | ----- | -------- | ---- | ---- |
| VW0000000002 | 1 | UR000001 | N | Y |
| VW0000000014 | 1 | UR000001 | N | Y |
However, please note that if more than one object in the array satisfies the conditions, the above query would generate more than one row per row in the original table (this is why I used exists in the first query).
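A different route worth knowing (not used in the answer above, so treat it as a sketch): JSON_CONTAINS tests containment element by element, so a candidate object carrying both attributes only matches rows where a single array element has both values:
SELECT viewid
FROM vw_viewmaster
WHERE JSON_CONTAINS(security -> '$.security', '{"uId": "UR000001", "view": "Y"}');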

SQL Server - Creating Filtered Indexes

I have a table with a dateTime column on which I want to put a filtered index. This index will get rebuilt each week. Each time it's rebuilt, I want it to include rows two days old and newer, based on this column. Can I create such a filtered index? I've tried various approaches and I get syntax errors.
For example, the following Where clause on the index creation did not work:
WHERE (ReadTime > DateAdd(dd,-2,GetDate()))
You can't do this by referencing getdate() directly. You would need dynamic SQL.
The CREATE INDEX grammar only allows comparisons against a constant.
CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name
ON <object> ( column [ ASC | DESC ] [ ,...n ] )
[ INCLUDE ( column_name [ ,...n ] ) ]
[ WHERE <filter_predicate> ]
... /*Irrelevant grammar removed*/
<filter_predicate> ::=
<conjunct> [ AND <conjunct> ]
<conjunct> ::=
<disjunct> | <comparison>
<disjunct> ::=
column_name IN (constant ,...n)
<comparison> ::=
column_name <comparison_op> constant
<comparison_op> ::=
{ IS | IS NOT | = | <> | != | > | >= | !> | < | <= | !< }
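For completeness, a hedged sketch of the dynamic-SQL route (the table and index names are made up, and DROP INDEX IF EXISTS needs SQL Server 2016+). The filter is baked in as a literal, so the weekly job must recreate the index to move the cutoff forward:
DECLARE @cutoff char(10) = CONVERT(char(10), DATEADD(dd, -2, GETDATE()), 120); -- yyyy-mm-dd
DECLARE @sql nvarchar(max) = N'
DROP INDEX IF EXISTS IX_MyTable_ReadTime_Recent ON dbo.MyTable;
CREATE NONCLUSTERED INDEX IX_MyTable_ReadTime_Recent
ON dbo.MyTable (ReadTime)
WHERE ReadTime > ''' + @cutoff + N''';';
EXEC sys.sp_executesql @sql;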