SQL Dense and Sparse Indexes - mysql

I am preparing for my Database final and would like to understand these two questions. Could you please explain which answer is correct, and why?
Suppose that you have a relation with the schema R(X, Y, Z). Every value of X is unique, but the other columns could have duplicate values. Assume that a sparse index is created for relation R on attribute X. Which of the following queries would use this index effectively?
(a) SELECT MAX(X)
FROM R
(b) SELECT MAX(Y)
FROM R
GROUP BY X
(c) SELECT *
FROM R
WHERE X <> 30
(d) SELECT MAX(Y)
FROM R
WHERE X = 23
(e) none of the above uses the index effectively
I believe (a) could be the correct answer as we have an index for X and they are all unique values.
Suppose that you have a relation with the schema R(X, Y, Z). Every value of X is unique, but the other columns could have duplicate values. Assume that a dense index is created for relation R on attributes X and Y. Which of the following queries would use this index effectively?
(a) SELECT *
FROM R
WHERE X < Y
(b) SELECT DISTINCT X, Y
FROM R
WHERE X = 23 AND Y > 39
(c) SELECT X, Y
FROM R
(d) SELECT X
FROM R
WHERE Y = 23
(e) none of the above uses the index effectively
I believe (c) could be the correct answer, as we have an index on both X and Y.

MySQL does not have "dense" vs "sparse". Here are the optimal indexes:
(a) SELECT MAX(X) FROM R -- INDEX(X)
(b) SELECT MAX(Y) FROM R GROUP BY X -- INDEX(X,Y); INDEX(X) is not as good
(c) SELECT * FROM R WHERE X <> 30 -- No index is _likely_ to be useful
(d) SELECT MAX(Y) FROM R WHERE X = 23 -- Does not make sense if X is unique
(e) none of the above uses the index effectively -- Some: a,b,d
and
(a) SELECT * FROM R WHERE X < Y -- no index
(b) SELECT DISTINCT X, Y FROM R WHERE X = 23 AND Y > 39 -- Dumb due to uniqueness
(c) SELECT X, Y FROM R -- no index; just read the table
(d) SELECT X FROM R WHERE Y = 23 -- INDEX(Y,X), or, not as good, INDEX(Y)
(e) none of the above uses the index effectively -- ambiguous; note that (Y,X) is not the same as (X,Y)
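For readers who need the textbook distinction the exam is asking about: a dense index has one entry per record, while a sparse index has one entry per block of a file that is sorted on the indexed key. The following is only a minimal Python sketch of a sparse index lookup, not how MySQL stores anything; the data, the block size and the function names are all made up for illustration.

```python
from bisect import bisect_right

# Hypothetical data: rows (X, Y, Z) sorted on the unique key X,
# stored in fixed-size blocks.
rows = [(x, x % 5, x % 3) for x in range(10, 50, 2)]  # X = 10, 12, ..., 48
BLOCK_SIZE = 4
blocks = [rows[i:i + BLOCK_SIZE] for i in range(0, len(rows), BLOCK_SIZE)]

# Sparse index: one entry per block (the first X stored in that block).
sparse_index = [block[0][0] for block in blocks]


def lookup(x):
    """Return the row with key x, or None, reading at most one block."""
    pos = bisect_right(sparse_index, x) - 1  # last block whose first key <= x
    if pos < 0:
        return None  # x is smaller than every key in the file
    for row in blocks[pos]:
        if row[0] == x:
            return row
    return None


def max_x():
    """MAX(X): the last row of the block that the last index entry points to."""
    return blocks[len(sparse_index) - 1][-1][0]
```

In this sketch an equality lookup such as X = 23 and a MAX(X) both touch a single block, whereas a predicate like X <> 30 would still have to read almost every block.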

Related

MySQL GROUP BY if Multiple Numbered Columns are Close to Each Other (+/- 1)

I have a mysql table with a large list of coordinates (x, y, z). I want to find the most common spots, but when the same place is logged, it isn't identical. For example, x could be 496.0481 or 496.3904, but that is actually the same place.
When I do the following query I get a list of the absolute exact matches, but those are very few and far between:
SELECT x, y, z, COUNT(*) AS coords
FROM coordinates
GROUP BY x, y, z
ORDER BY coords DESC
LIMIT 10;
How can I adjust this to be grouped by each of x, y, and z to be +/- 1 to catch a larger area? I've tried a mix of IF and BETWEEN statements but can't seem to get anything to work.
If I do GROUP BY round(x), round(y), round(z), that gets a larger range but doesn't capture if the number goes from 496 to 497 even if they are just slightly different.
Thanks in advance for the help.
A very naive way:
select t1.x as x1, t1.y as y1, t1.z as z1, t2.x as x2, t2.y as y2, t2.z as z2
from coordinates t1
join coordinates t2 on sqrt(power(t2.x-t1.x, 2) + power(t2.y-t1.y, 2) + power(t2.z-t1.z, 2)) <= 1
For each coordinate (t1), the query finds every coordinate (t2) that lies within a distance of 1 of it, including t1 itself.
But this query has very bad performance, because it compares every pair of rows: O(n^2)
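One common way to avoid comparing every pair is to bucket the points into a grid first and only compare points in the same or neighbouring cells. This is a different technique from the join above, shown here only as a hedged Python sketch outside the database; the function name `close_pairs` and the sample points are assumptions for illustration.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


def close_pairs(points, radius=1.0):
    """Return index pairs (i, j), i < j, whose points lie within `radius`."""
    # Bucket each point into a cubic cell whose side length is `radius`.
    grid = defaultdict(list)
    for i, p in enumerate(points):
        grid[tuple(floor(c / radius) for c in p)].append(i)

    pairs = set()
    for cell, members in grid.items():
        # Two points within `radius` are always in the same or an adjacent cell,
        # so only the 27 surrounding cells need to be inspected.
        for offset in product((-1, 0, 1), repeat=3):
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            for i in members:
                for j in grid.get(neighbour, ()):
                    if i < j and dist(points[i], points[j]) <= radius:
                        pairs.add((i, j))
    return sorted(pairs)


# Hypothetical sample: three readings of roughly the same spot, one far away.
sample = [
    (496.0481, 10.0, 5.0),
    (496.3904, 10.2, 5.1),
    (497.01, 10.0, 5.0),
    (600.0, 0.0, 0.0),
]
near = close_pairs(sample)
```

Unlike grouping by round(x), this still pairs 496.0481 with 497.01, because neighbouring cells are always checked.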

MySQL multiple columns in IN clause

I have a database with four columns corresponding to the geographical coordinates x,y for the start and end position. The columns are:
x0
y0
x1
y1
I have an index for these four columns with the sequence x0, y0, x1, y1.
I have a list of about a hundred combination of geographical pairs. How would I go about querying this data efficiently?
I would like to do something like this as suggested on this SO answer but it only works for Oracle database, not MySQL:
SELECT * FROM my_table WHERE (x0, y0, x1, y1) IN ((4, 3, 5, 6), ... ,(9, 3, 2, 1));
I was thinking it might be possible to do something with the index? What would be the best approach (ie: fastest query)? Thanks for your help!
Notes:
I cannot change the schema of the database
I have about 100'000'000 rows
EDIT:
The code as-is was actually working, however it was extremely slow and did not take advantage of the index (as we have an older version of MySQL v5.6.27).
To make effective use of the index, we could rewrite the IN predicate. For example, this:
(x0, y0, x1, y1) IN ((4, 3, 5, 6),(9, 3, 2, 1))
can be rewritten like this:
( ( x0 = 4 AND y0 = 3 AND x1 = 5 AND y1 = 6 )
OR ( x0 = 9 AND y0 = 3 AND x1 = 2 AND y1 = 1 )
)
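With about a hundred tuples, the OR-of-ANDs form is tedious to write by hand. A minimal sketch of generating it in Python follows; the helper name `build_tuple_filter` is hypothetical, and the `%s` placeholder style is an assumption that matches common MySQL drivers for Python. Column names are interpolated directly, so they must come from trusted code, never from user input.

```python
def build_tuple_filter(columns, tuples):
    """Return (sql_fragment, params) equivalent to `(columns) IN (tuples)`."""
    if not tuples:
        # An empty list of tuples matches nothing.
        return "(1 = 0)", []
    # One parenthesised group of equality tests per tuple.
    one_row = "(" + " AND ".join(f"{col} = %s" for col in columns) + ")"
    fragment = "(" + " OR ".join([one_row] * len(tuples)) + ")"
    # Flatten the tuples in the same order as the placeholders.
    params = [value for row in tuples for value in row]
    return fragment, params


fragment, params = build_tuple_filter(
    ["x0", "y0", "x1", "y1"],
    [(4, 3, 5, 6), (9, 3, 2, 1)],
)
```

The fragment can then be appended after `SELECT * FROM my_table WHERE` and executed with `params` as the bound values.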
EDIT
The (a,b) IN ((7,43),(7,44),(8,1)) syntax has been supported in MySQL for many versions, but there were performance problems with it (at least with non-trivial sets) because of the suboptimal execution plan generated by the optimizer.
The optimizer has been improved in newer versions of MySQL: it now generates execution plans that make more effective use of the available indexes.
Note a similar, related problem with OR constructs. Here's an example query intended to get the "next page" of 20 rows ordered by columns seq and sub (a unique tuple). The last row fetched on the previous page was (seq,sub)=(7,42).
With much older versions of MySQL, this syntax would not be accepted:
WHERE (seq,sub) > (7,42)
ORDER BY seq, sub
LIMIT 20
And when MySQL did support the syntax, we would get an execution plan as if we had written:
WHERE ( seq > 7 )
OR ( seq = 7 AND sub > 42 )
ORDER BY seq, sub
LIMIT 20
We would get a much more efficient execution plan if we instead wrote something subtly different:
WHERE ( seq >= 7 )
AND ( seq > 7 OR sub > 42 )
ORDER BY seq, sub
LIMIT 20
With this form we'd expect the optimizer to use an available UNIQUE INDEX on (seq,sub) and return rows in index order from a range scan operation...
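The three WHERE clauses for the "next page" query select the same rows as long as neither column is NULL. A small Python sketch can confirm this by brute force, relying on the fact that Python compares tuples lexicographically, like the row comparison does; the function names are made up for illustration and NULL handling is deliberately ignored.

```python
from itertools import product


def row_greater(seq, sub):
    # Lexicographic tuple comparison, like (seq, sub) > (7, 42).
    return (seq, sub) > (7, 42)


def or_form(seq, sub):
    # ( seq > 7 ) OR ( seq = 7 AND sub > 42 )
    return seq > 7 or (seq == 7 and sub > 42)


def and_form(seq, sub):
    # ( seq >= 7 ) AND ( seq > 7 OR sub > 42 )
    return seq >= 7 and (seq > 7 or sub > 42)


# Compare all three forms on a grid of values around the boundary (7, 42).
checked = 0
for seq, sub in product(range(0, 15), range(0, 100)):
    assert row_greater(seq, sub) == or_form(seq, sub) == and_form(seq, sub)
    checked += 1
```

The forms differ only in how easily an optimizer can turn them into a range scan: the leading `seq >= 7` gives it a single lower bound on the first index column.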
I do not understand your point. The following query is valid MySQL syntax:
SELECT *
FROM my_table
WHERE (x0, y0, x1, y1) IN ((4, 3, 5, 6), ... ,(9, 3, 2, 1));
I would expect MySQL to use the composite index that you have described. But, if it doesn't you could do:
SELECT *
FROM my_table
WHERE x0 = 4 AND y0 = 3 AND x1 = 5 AND y1 = 6
UNION ALL
. . .
SELECT *
FROM my_table
WHERE x0 = 9 AND y0 = 3 AND x1 = 2 AND y1 = 1
The equality comparisons in the WHERE clause will take advantage of an index.
MySQL allows row constructor comparisons like you show, but the optimizer didn't know how to use an index to help performance until MySQL 5.7.
https://dev.mysql.com/doc/refman/5.7/en/row-constructor-optimization.html
You can concatenate the four values into a string and check them like that:
SELECT *
FROM my_table
WHERE CONCAT_WS(',', x0, y0, x1, y1) IN ('4,3,5,6', ..., '9,3,2,1');
Your query gives correct results in the MySQL version on my machine (v5.5.55). Maybe you are using an older one; please check that.
If you still want to solve this problem in your own version, or the solution above doesn't work, read on.
I am still not clear about the data types and ranges of your columns. So I am assuming that the data type is integer and the range is 0 to 9. If that is the case, you can pack the four values of each row into one four-digit number:
select * from s1 where 1000*x0+100*y0+10*x1+y1 in (4356,..., 9321);

For Loop in MySQL, looping through a table and applying it to a where statement

My problem is how to loop through a table and extract information from another table.
I have a table - X with 470 records:
A B C
111 12 18
121 21 29
127 37 101
I would like to write the following query:
create or replace view NEW as
For j = 1-3
Select * from Y
where imei = X.A(j) and id > X.B(j) and id < X.C(j)
Apologies, I am a matlab programmer so I have used that syntax above to explain what I want. How can I do this in MySql? I have looked up For Loops but mostly it loops through within the same table. I need to loop through a different table and use those criteria in the where statement of a different table.
To get 3 rows from a table, use LIMIT 3 in a subquery. To get related rows in another table, use JOIN.
CREATE OR REPLACE VIEW new AS
SELECT Y.*
FROM Y
JOIN (SELECT *
      FROM X
      LIMIT 3) AS X ON Y.imei = X.a AND Y.id > X.b AND Y.id < X.c
To make LIMIT 3 produce predictable results, you should have an ORDER BY clause in the subquery. Otherwise, it will select an arbitrary set of 3 rows from X.

mySQL, two-dimensional into one-dimensional

I've forgotten whatever I used to know about pivots, but this seems to me the reverse. Suppose I have a set of items A, B, C, D, … and a list of attributes W, X, Y, Z. I have in a spreadsheet something like
A B C D
W 1 P 3 Q
X 5 R 7 S
Y T 2 U 4
Z D 6 F 7
where the value of attribute W for item B is 'P'. In order to do some statistics on comparisons, I'd like to change it from a table to a list, i.e.,
W A 1
X A 5
Y A T
Z A D
W B P
X B R
Y C U
Z C F
W D Q
X D S
Y B 2
Z B 6
Etc.
I can easily write a nested loop macro in the spreadsheet to do it, but is there an easy way to import it into mySQL in the desired format? Queries to get the statistics needed are simple in SQL (and formulas not very hard in a spreadsheet) if the data is in the second format.
Since there apparently isn't a "spreadsheet" tag, I used "excel." :-)
There are a lot of questions that looked similar at first glance, but the five I looked at all wanted to discard one of the indices (A-D or W-Z), i.e. creating something like
W 1
W P
X 5
X R
EDITED
You can use PowerQuery to unpivot tables. See the answer by teylyn for the following question. I have Office 365 and didn't need to install the plugin first. The functionality was already available.
Convert matrix to 3-column table ('reverse pivot', 'unpivot', 'flatten', 'normalize')
Another way to unpivot data without using VBA is with PowerQuery, a free add-in for Excel 2010 and higher, available here: http://www.microsoft.com/en-us/download/details.aspx?id=39379
...
Click the column header of the first column to select it. Then, on the Transform ribbon, click the Unpivot Columns drop-down and select Unpivot other columns.
...
OLD ANSWER
If you import the spreadsheet as is, you can run a query to output the correct format.
For a fixed, small number of items, you can use UNION for each of the columns.
SELECT attr, 'A' AS 'item', A AS 'value'
FROM sheet
UNION
SELECT attr, 'B' AS 'item', B AS 'value'
FROM sheet
UNION
SELECT attr, 'C' AS 'item', C AS 'value'
FROM sheet
UNION
SELECT attr, 'D' AS 'item', D AS 'value'
FROM sheet;
Working example: http://sqlfiddle.com/#!9/c274e7/7
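If the reshaping is done before the import rather than in SQL, the same unpivot can be sketched in a few lines of Python. This is only an illustration: the function name `unpivot` is hypothetical, the header name `attr` is borrowed from the query above, and the rows are the sample grid from the question.

```python
def unpivot(header, rows):
    """Turn a wide table into (attribute, item, value) triples, item by item."""
    items = header[1:]  # the first column holds the attribute name
    out = []
    for col, item in enumerate(items, start=1):
        for row in rows:
            out.append((row[0], item, row[col]))
    return out


header = ["attr", "A", "B", "C", "D"]
rows = [
    ["W", "1", "P", "3", "Q"],
    ["X", "5", "R", "7", "S"],
    ["Y", "T", "2", "U", "4"],
    ["Z", "D", "6", "F", "7"],
]
triples = unpivot(header, rows)
```

Each triple keeps both indices (attribute and item), so the result can be loaded into a three-column table as it is.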

Select query - Fetch all data between two age min and max age

This query is giving me a headache.
I want to fetch all records that fall between two ages. Can anyone please help me? My query is written below:
SELECT tbl_trip.* FROM tbl_trip WHERE ((tbl_trip.minage >= '15' AND tbl_trip.maxage <= '15') OR (tbl_trip.maxage <= '28' AND tbl_trip.minage >= '28'))
With this query I want all records from the database whose age range overlaps 15 to 28. That should return rows with minage to maxage of, e.g., 1 to 16, 12 to 30, 16 to 24 or 27 to 28, but not rows like 3 to 13 or 29 to 100. Thanks in advance.
Here is the right way; Rohit wrote a confusing query.
select * from tbl_trip where ((minage>=12 and minage<=22) or (maxage>=12 and maxage<=22)) or ((minage<=12 and maxage>=22))
Thanks
As I understand it, you need to get all the records from the DB whose minage--maxage range intersects the given one. Below, the given range runs from X to Y and the row's range runs from A to B. Obviously, Y > X and B > A.
There are four situations in which these ranges cross:
X--Y is completely inside A--B. On the number line the values are ordered A, X, Y, B. Therefore, the condition is
A <= X and B >= Y
A--B is completely inside X--Y. On the number line: X, A, B, Y. Condition:
X <= A and B <= Y
A--B partially crosses X--Y on the left side of X--Y. On the number line: A, X, B, Y. Therefore, the condition is
A <= X and X <= B and B <= Y
The same goes for A--B crossing the X--Y range on its right side. Number line: X, A, Y, B. Condition:
X <= A and A <= Y and Y <= B
In conclusion, the final condition is:
(A <= X && B >= Y) || (X <= A && B <= Y) || (A <= X && X <= B && B <= Y) || (X <= A && A <= Y && Y <= B)
which simplifies to A <= Y && X <= B.
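The overlap test for two closed ranges can be written with just two comparisons, A <= Y and X <= B. The Python sketch below (function names are hypothetical) checks that short form by brute force against a case-by-case version that also covers a row range lying entirely inside the given one, and then applies it to the examples from the question with the given range 15 to 28.

```python
def overlaps(a, b, x, y):
    """True if the closed ranges [a, b] and [x, y] share at least one value."""
    return a <= y and x <= b


def overlaps_by_cases(a, b, x, y):
    contains = a <= x and y <= b  # row range contains the given range
    inside = x <= a and b <= y    # row range lies inside the given range
    left = a <= x <= b <= y       # row range overlaps the left edge
    right = x <= a <= y <= b      # row range overlaps the right edge
    return contains or inside or left or right


# Brute-force comparison over all small ranges with a <= b and x <= y.
for a in range(10):
    for b in range(a, 10):
        for x in range(10):
            for y in range(x, 10):
                assert overlaps(a, b, x, y) == overlaps_by_cases(a, b, x, y)

# The examples from the question, tested against the given range 15..28.
wanted = [(1, 16), (12, 30), (16, 24), (27, 28)]
unwanted = [(3, 13), (29, 100)]
results = [overlaps(a, b, 15, 28) for a, b in wanted + unwanted]
```

In SQL the short form corresponds to `minage <= 28 AND maxage >= 15`.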
Try this; it's working fine for me:
select * from tbl_trip
Where GREATEST(GREATEST(minage,12)-LEAST(maxage,22),0)=0
This query checks your given range between minage and maxage, and vice versa.
select * from tbl_trip
where (((minage>=15 and minage<=22) or (maxage>=15 and maxage<=22))
or((15>=minage and 15<=maxage) or (22<=maxage and 22>=minage)))
This also works; please check it properly before you use it:
select * from tbl_trip
where ((minage>=12 and minage<=22) or (maxage>=12 and maxage<=22))
or ((12>=minage and 22<=maxage))