How do I read this EXPLAIN plan? It looks confusing to me - MySQL

I have a table with 130 million rows in it. When I EXPLAIN the query, it shows 130 million in the rows column.
EXPLAIN SELECT * FROM TABLE1;
rows: 130 million
But when I add a WHERE condition on TABLE1.time_dim_id between 1900 and 2000:
EXPLAIN SELECT * FROM TABLE1 WHERE time_dim_id BETWEEN 1900 AND 2000;
rows: 60 million
I am surprised because there are no NULL values in time_dim_id, and
min(time_dim_id) is 1900 and
max(time_dim_id) is 2000.
Why does the second EXPLAIN plan not show 130 million again in the rows column?

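Try
ANALYZE TABLE TABLE1;
and then run EXPLAIN again. The rows column is an estimate based on index statistics, not an exact count, so stale statistics can throw it off. This assumes there is an index on TABLE1.time_dim_id.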
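The suggestion above can be sketched as the sequence below. This is a minimal sketch that reuses the table and column names from the question. The final COUNT(*) is an addition for comparison only: it gives the exact number of matching rows, which the EXPLAIN estimate is not guaranteed to equal.

```sql
-- Refresh the statistics the optimizer uses for its row estimates.
ANALYZE TABLE TABLE1;

-- Check whether time_dim_id is indexed (the advice assumes that it is).
SHOW INDEX FROM TABLE1;

-- The "rows" value reported here is an estimate, not an exact count.
EXPLAIN SELECT * FROM TABLE1 WHERE time_dim_id BETWEEN 1900 AND 2000;

-- Exact count for comparison. This reads the data, so it may be slow
-- on a table with 130 million rows.
SELECT COUNT(*) FROM TABLE1 WHERE time_dim_id BETWEEN 1900 AND 2000;
```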

Related

SQL select single row with two matching values

I'm probably having a bad day, but this is somehow escaping me:
I want to return the second row in this table only.
userId  val1  val2
1       11    12
2       13    14
3       13    15
4       16    17
Using SELECT * FROM table WHERE val1=13 AND val2=14 obviously returns 2 rows, the second and third. What's the correct way to select ONLY the second row, where val1 is 13 and val2 is 14?
EDIT: I'm an idiot.
Just use SELECT * FROM table WHERE val1=13 AND val2=14 as you already mentioned in your question, because it in fact returns only row number 2.
If it has been a very bad day and there is a typo in your question, so that val2 in the third row also equals 14 (the only way your query would return two rows), then this would do what you want:
SELECT *
FROM table
WHERE val1=13 AND val2=14
ORDER BY userId
LIMIT 1;
If you get 2 rows, then there must be 2 rows matching the condition.
Maybe you could try:
select count(*)
from table
where val1=13 AND val2=14;
to show the size of the result set.

MySQL HAVING & WHERE query different result

I am just a MySQL beginner. This is my first time asking on Stack Overflow, and my question is about queries using HAVING and WHERE:
SELECT
BOXNUMBER
,COUNT(BOXNUMBER) AS QTY
,CDATETIME
FROM
HSS_SNO
WHERE
year(CDATETIME) IN ('2008','2010','2014')
GROUP BY
BOXNUMBER ;
/* Affected rows: 0 Found rows: 13,928 Warnings: 0 Duration for 1 query: 0.031 sec. (+ 2.782 sec. network) */
SELECT
BOXNUMBER
,COUNT(BOXNUMBER) AS QTY
,CDATETIME
FROM
HSS_SNO
GROUP BY
BOXNUMBER
HAVING
year(CDATETIME) IN ('2008','2010','2014');
/* Affected rows: 0 Found rows: 13,922 Warnings: 0 Duration for 1 query: 0.047 sec. (+ 2.594 sec. network) */
I thought these queries would give me the same result, but the 'Found rows' counts differ.
Could you tell me why?
Thanks
Tobing
(Sorry for my English)
.......
WHERE
year(CDATETIME) IN ('2008','2010','2014')
GROUP BY
BOXNUMBER ;
The above query returns more rows because WHERE filters individual rows before they are grouped, so every BOXNUMBER with at least one row in those years appears in the result. In the other query
......
GROUP BY
BOXNUMBER
HAVING
year(CDATETIME) IN ('2008','2010','2014');
the condition is applied after grouping. CDATETIME is not aggregated, so MySQL evaluates HAVING against the CDATETIME of one arbitrary row per group, and a group is dropped when that row falls outside the listed years even if other rows in the group match. As a result you get fewer rows.
Have a look at this link, which helps in understanding the order of SQL query execution in detail: http://social.msdn.microsoft.com/Forums/sqlserver/en-US/70efeffe-76b9-4b7e-b4a1-ba53f5d21916/order-of-execution-of-sql-queries
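To make the difference between filtering rows and filtering groups concrete, here is a hedged sketch against the HSS_SNO table from the question. The aggregate-based HAVING is an illustration that does not come from the original answer: it keeps a group when at least one of its rows falls in the listed years, so it should return the same set of BOXNUMBER values as the WHERE query, although QTY then counts every row in the group rather than only the matching ones.

```sql
-- WHERE: rows are filtered first, then the surviving rows are grouped.
SELECT BOXNUMBER, COUNT(BOXNUMBER) AS QTY
FROM HSS_SNO
WHERE YEAR(CDATETIME) IN (2008, 2010, 2014)
GROUP BY BOXNUMBER;

-- HAVING with an aggregate: groups are formed first, then a group is kept
-- if any of its rows matches. In MySQL a comparison evaluates to 1 or 0,
-- so SUM(...) counts the matching rows in each group.
SELECT BOXNUMBER, COUNT(BOXNUMBER) AS QTY
FROM HSS_SNO
GROUP BY BOXNUMBER
HAVING SUM(YEAR(CDATETIME) IN (2008, 2010, 2014)) > 0;
```

Using an aggregate in HAVING avoids depending on which row MySQL happens to pick for a non-aggregated column.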

generating count data using group by in mysql

I have data like
column
1
1
57
57
57
1
1
57
57
I need to generate count data like
count
2
3
2
2
using a MySQL query. I have tried GROUP BY, but it groups all the equal values together, and I couldn't group by any other field. Is there an easy way to achieve this?
find the screenshot in this link
Update:
My end goal is to get the time spent by the user on each course. For example, the user with ID 1 spent time from 1177924991 to 1177925038 on course ID 1 (here 1177925038 is one second before the next course visit time).
Use this query:
SELECT COUNT(*)
FROM `tablename`
GROUP BY `info`
try using
SELECT COUNT(columnname)
FROM `tablename`
GROUP BY columnname
This will do the needful.
UPDATE
I am getting following result set on running this query :
Use This Query:
Select `course`, count(*) from `tablename`
group by `course`
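Note that a plain GROUP BY merges every equal value into one group, so it yields one total per course rather than one count per consecutive run. A sketch that counts runs instead is shown below. It rests on assumptions the thread does not confirm: MySQL 8.0 or later (for window functions), and a column that defines the row order, here given the hypothetical name `time` because the update mentions visit times. The difference between the two row numbers stays constant within a run of equal `course` values, so it can serve as a grouping key.

```sql
SELECT course, COUNT(*) AS run_length
FROM (
    SELECT course,
           `time`,
           -- Position in the whole sequence minus position within the
           -- same course: constant inside each consecutive run.
           ROW_NUMBER() OVER (ORDER BY `time`)
         - ROW_NUMBER() OVER (PARTITION BY course ORDER BY `time`) AS grp
    FROM `tablename`
) AS runs
GROUP BY course, grp
ORDER BY MIN(`time`);
```

For the sample sequence 1, 1, 57, 57, 57, 1, 1, 57, 57 this grouping produces run lengths 2, 3, 2, 2, provided the ordering column has no duplicate values.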

Filter integer result from Mysql query

I have a MySQL query. I want to keep only the integer results.
My query is:
SELECT * FROM table as p WHERE p.test between 0 AND 999
But the result is:
747
748
749
FO4001
FO4002
750
751
I want to ask two things:
1) Is there any way to exclude the results below?
FO4001
FO4002
2) Why do these appear in the result?
Try this one, use REGEXP to test if the value is all numeric.
SELECT *
FROM table1
WHERE x BETWEEN 0 AND 999
AND x REGEXP '^[0-9]+$';
SQLFiddle Demo
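As for why FO4001 and FO4002 appear: when a string is compared with a number, MySQL converts the string to a number, and a string that does not start with digits converts to 0, which lies between 0 and 999. A short sketch of that conversion, using literal values rather than the table from the question:

```sql
SELECT 'FO4001' + 0;                -- 0, with a "Truncated incorrect DOUBLE value" warning
SELECT 'FO4001' BETWEEN 0 AND 999;  -- 1 (true), because the string is compared as 0
SELECT '747' BETWEEN 0 AND 999;     -- 1 (true), compared as 747
SELECT 'FO4001' REGEXP '^[0-9]+$';  -- 0 (false), so the REGEXP test excludes it
```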

Need Help streamlining a SQL query to avoid redundant math operations in the WHERE and SELECT

Hey everyone, I am working on a query and am unsure how to make it process as quickly as possible and with as little redundancy as possible. I am really hoping someone there can help me come up with a good way of doing this.
Thanks in advance for the help!
Okay, so here is what I have as best I can explain it. I have simplified the tables and math to just get across what I am trying to understand.
Basically I have a smallish table that never changes and will always only have 50k records like this:
Values_Table
ID        Value1  Value2
1         2       7
2         2       7.2
3         3       7.5
4         33      10
….50000   44      17.2
And a couple tables that constantly change and are rather large, eg a potential of up to 5 million records:
Flags_Table
Index        Flag1  Type
1            0      0
2            0      1
3            1      0
4            1      1
….5,000,000  1      1
Users_Table
Index        Name     ASSOCIATED_ID
1            John     1
2            John     1
3            Paul     3
4            Paul     3
….5,000,000  Richard  2
I need to tie all 3 tables together. The most results that are likely to ever be returned from the small table is somewhere in the neighborhood of 100 results. The large tables are joined on the index and these are then joined to the Values_Table ON Values_Table.ID = Users_Table.ASSOCIATED_ID …. That part is easy enough.
Where it gets tricky for me is that I need to return, as quickly as possible, a list limited to 10 results where Value1 and Value2 are mathematically operated on to return a new_value, where that new_value is less than 10, the result is sorted by that new_value, and any other WHERE conditions I need can be applied to the flags. I do need to be able to page through the results with the limit, e.g. LIMIT 0,10 / 10,10 / 20,10 etc.
In a subsequent (or the same if possible) query I need to get the top 10 count of all types that matched that criteria before the limit was applied.
So for example I want to join all of these and return anything where Value1 + Value2 < 10 AND I also need the count.
So what I want is:
Index    Name     Flag1  New_Value
1        John     0      9
2        John     0      9
5000000  Richard  1      9.2
The second response would be:
ID (not index)  Count
1               2
2               1
I tried this a few ways and ultimately came up with the following somewhat ugly query:
SELECT INDEX, NAME, Flag1, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10
ORDER BY New_Value
LIMIT 0,10
And then for the count:
SELECT ID, COUNT(TYPE) as Count, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10
GROUP BY TYPE
ORDER BY New_Value
LIMIT 0,10
Being able to filter on the different flags and such in my WHERE clause is important. That may sound like a stupid thing to point out, but I mention it because, from what I could see, a quicker method would have been to use a HAVING clause, and I don't believe that will work in certain instances, depending on what I want my WHERE clause to filter against.
And when filtering using the flags table :
SELECT INDEX, NAME, Flag1, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10 AND Flag1 = 0
ORDER BY New_Value
LIMIT 0,10
...filtered count:
SELECT ID, COUNT(TYPE) as Count, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10 AND Flag1 = 0
GROUP BY TYPE
ORDER BY New_Value
LIMIT 0,10
That works fine but has to run the math multiple times for each row, and I get the nagging feeling that it is also running the math multiple times on the same row in the Values_table table. My thought was that I should just get only the valid responses from the Values_table first and then join those to the other tables to cut down on the processing; with how SQL optimizes things though I wasn't sure if it might not already be doing that. I know I could use a HAVING clause to only run the math once if I did it that way but I am uncertain how I would then best join things.
My questions are:
1) Can I avoid running that math twice and still make the query work (or I suppose if there is a good way to make the first one work as well that would be great)?
2) What is the fastest way to do this, as this is something that will be running very often?
It seems like this should be painfully simple but I am just missing something stupid.
I contemplated pulling into a temp table then joining that table to itself but that seems like I would trade math for iterations against the table and still end up slow.
Thank you all for your help in this and please let me know if I need to clarify anything here!
To clarify on a question: I can't use a 3rd column with the values pre-calculated because in reality the math is much more complex than addition; I just simplified it for illustration's sake.
Do you have a benchmark query to compare against? Usually it doesn't work to try to outsmart the optimizer. If you have acceptable performance from a starting query, then you can see where extra work is being expended (indicated by disk reads, cache consumption, etc.) and focus on that.
Avoid the temptation to break it into pieces and solve those. That's an antipattern. That includes temp tables especially.
Redundant math is usually ok - what hurts is disk activity. I've never seen a query that needed CPU work reduction on pure calculations.
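If the duplication is mainly a readability concern, the expression can be written once in a derived table and then referenced by name in both the filter and the sort. The following is a sketch under assumptions, not a measured optimization: `@some_variable` is a hypothetical stand-in for the asker's some_variable, the aliases v, u and f are introduced for illustration, and `Index` is quoted because INDEX is a reserved word in MySQL. Whether the optimizer evaluates the expression once or merges the derived table back into the outer query is up to the optimizer, so compare EXPLAIN output and timings before relying on it.

```sql
SET @some_variable = 1;  -- hypothetical value, for illustration only

SELECT u.`Index`, u.Name, f.Flag1, v.New_Value
FROM (
    -- The expression appears only here; the outer query refers to it by alias.
    SELECT ID, (Value1 * @some_variable + Value2) AS New_Value
    FROM Values_Table
) AS v
JOIN Users_Table AS u ON u.ASSOCIATED_ID = v.ID
JOIN Flags_Table AS f ON f.`Index` = u.`Index`
WHERE v.New_Value < 10
  AND f.Flag1 = 0          -- flag filters stay in the ordinary WHERE clause
ORDER BY v.New_Value
LIMIT 0, 10;
```

Keeping the flag conditions in WHERE addresses the concern that HAVING would not fit every filter.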
Gather your results and put them in a temp table
CREATE TEMPORARY TABLE TempTable AS
SELECT Users_Table.`Index`, Name, Type, ID, Flag1, (Value1 + Value2) AS New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.`Index` = Users_Table.`Index`
HAVING New_Value < 10; -- a column alias cannot be used in WHERE; MySQL accepts it in HAVING
Return Result for First Query
SELECT `Index`, Name, Flag1, New_Value
FROM TempTable
ORDER BY New_Value
LIMIT 0,10; -- the limit is applied here so the count below still sees every matching row
Return Results for count of Types
SELECT ID, COUNT(Type)
FROM TempTable
GROUP BY ID;
Is there any chance that you can add a third column to the values_table with the pre-calculated value? Even if the result of your calculation is dependent on other variables, you could run the calculation for the whole table but only when those variables change.