This is my first question on SO.
I am using the following query:
SELECT column1, column2, COUNT(*)
FROM myTable
GROUP BY DATE(logged_date)
HAVING COUNT(*)>10
myTable contains 2 million records, and the column logged_date is of type datetime.
The above query is taking around 15 seconds to execute.
Any help will be appreciated.
Welcome.
It would be best to also provide the table schema. Nevertheless, I will make some guesses:
The logged_date is a TIMESTAMP or a DATETIME column, is that so? That would be the reason for applying DATE() to that column.
Your best option, if this is a query you wish to optimize, is to add another column, which is logged_date_day (the first name is already confusing, the second as much :) )
This means maintaining both columns at the same time (but my next guess is that you only INSERT a row once and never update it again, so this is not too much of an effort).
You would then have to index the new column, and do the GROUP BY on that column.
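A minimal sketch of that idea, using Python's built-in sqlite3 module as a stand-in for MySQL so it can be run anywhere. The sample rows and the index name idx_logged_date_day are assumptions made for illustration, and SQLite's date() plays the role of MySQL's DATE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (column1 TEXT, column2 TEXT, logged_date TEXT)")

# Illustrative data: 11 rows on one day, 3 rows on another.
rows = [("a", "x", f"2020-01-01 00:00:{i:02d}") for i in range(11)]
rows += [("b", "y", f"2020-01-02 00:00:{i:02d}") for i in range(3)]
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?)", rows)

# Extra column holding only the day part, so no function is applied at query time.
conn.execute("ALTER TABLE myTable ADD COLUMN logged_date_day TEXT")
conn.execute("UPDATE myTable SET logged_date_day = date(logged_date)")

# Index the new column (index name is hypothetical) and group on it directly.
conn.execute("CREATE INDEX idx_logged_date_day ON myTable (logged_date_day)")

busy_days = conn.execute(
    "SELECT logged_date_day, COUNT(*) FROM myTable "
    "GROUP BY logged_date_day HAVING COUNT(*) > 10"
).fetchall()
print(busy_days)
```

In a real table the new column would be filled in at INSERT time rather than by a one-off UPDATE.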
PS
Technically speaking, SELECT column1 FROM some_table GROUP BY another_column is not a valid query. MySQL allows it when your sql_mode does not contain ONLY_FULL_GROUP_BY. I recommend that you look into this.
I am also worried about grouping on logged_date while selecting column1 and column2; this might not give you the expected results. It is better to group on all the columns, or to wrap column1 and column2 in an aggregate function such as MAX or MIN.
Nevertheless, you might consider something like this:
Make sure everything in the GROUP BY is keyed together:
alter table myTable add key (logged_date (10), column1,column2);
Changed query:
SELECT left(logged_date,10) as ldate , column1, column2, COUNT(*)
FROM myTable
GROUP BY ldate,column1,column2
HAVING COUNT(*)>10
What I am trying to do is select each distinct column1 value from table1 and then select all the columns from those rows returned from the above. Is this possible at all?
This is what I have so far; however, nothing is returned:
SELECT * FROM (SELECT DISTINCT column1 FROM table1)
I've thought about putting a unique/distinct restriction in the where clause of the query:
SELECT * FROM table1 WHERE some_unique_determiner column1
Any ideas how I could go about achieving the desired output?
OK, so answering my own question: what I needed to do was group the data by column1, without using a nested query. Many thanks to @VR46 for the help.
SELECT * FROM table1 GROUP BY column1
This returned all columns for each unique value of column1.
In your next posts, it would be better to include your table structures, sample input and desired output, so it is easier for us to understand.
If I did understand, there is one of two options:
Either you have duplicates, and you want to eliminate them so your correct query should be
select distinct COLUMNa,COLUMNb,COLUMNc... ETC
which will drop duplicates (rows where every selected column is the same).
Or you want to eliminate rows that have the same column1 and it doesn't matter if all the rest is the same or not.
In that case, you need to tell us which of the rows you want to keep: the most recent one, the oldest, a random one, etc. Right now it is impossible to write a query that selects all the columns after the distinct, since all the duplicates would still be returned, like this:
SELECT * FROM TABLE WHERE COLUMN1 IN(SELECT DISTINCT COLUMN1 FROM TABLE)
Which is a wrong query since it doesn't do anything.
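Once a rule for which row to keep has been chosen, the query becomes straightforward. The sketch below assumes a hypothetical auto-increment id column and the rule "keep the row with the lowest id per column1"; neither is stated in the question. It uses Python's sqlite3 module as a stand-in for MySQL, and the join itself is plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The id column is an assumption; the question does not show the table structure.
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT)"
)
conn.executemany(
    "INSERT INTO table1 (column1, column2) VALUES (?, ?)",
    [("a", "first"), ("a", "second"), ("b", "only"), ("c", "old"), ("c", "new")],
)

# For each column1 value, find the lowest id, then join back to fetch the full row.
kept = conn.execute(
    "SELECT t.id, t.column1, t.column2 "
    "FROM table1 AS t "
    "JOIN (SELECT column1, MIN(id) AS keep_id FROM table1 GROUP BY column1) AS k "
    "ON t.id = k.keep_id "
    "ORDER BY t.id"
).fetchall()
print(kept)
```

Swapping MIN(id) for MAX(id) would keep the newest row instead, assuming ids grow over time.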
I'm trying to find the most efficient way to determine if a table row exists.
I have in mind 3 options:
SELECT EXISTS(SELECT 1 FROM table1 WHERE some_condition);
SELECT 1 FROM table1 WHERE some_condition LIMIT 0,1;
SELECT COUNT(1) FROM table1 WHERE some_condition;
It seems that for MySQL the first approach is more efficient:
Best way to test if a row exists in a MySQL table
Is it true in general for any database?
UPDATE:
I've added a third option.
UPDATE2:
Let's assume the database products are MySQL, Oracle and SQL Server.
I would do
SELECT COUNT(1) FROM table1 WHERE some_condition;
But I don't think it makes a significant difference unless you call it a lot (in which case, I'd probably use a different strategy).
If you mean to use as a test if AT LEAST ONE row exists with some condition (1 or 0, true or false), then:
select count(1) from my_table where ... and rownum < 2;
Oracle can stop counting after it gets a hit.
EXISTS can be faster because it only has to report whether any row matches the subquery, so the database can stop at the first match instead of producing the whole result.
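To make the difference in what each form returns concrete, here is a small sketch using Python's sqlite3 module as a stand-in database. The status column and its values are invented for illustration, and the sketch says nothing about relative speed on MySQL, Oracle or SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table contents, just enough to exercise both query forms.
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO table1 (status) VALUES (?)",
    [("open",), ("open",), ("closed",)],
)

# EXISTS yields a single 1 or 0, regardless of how many rows match.
exists_open = conn.execute(
    "SELECT EXISTS(SELECT 1 FROM table1 WHERE status = ?)", ("open",)
).fetchone()[0]
exists_missing = conn.execute(
    "SELECT EXISTS(SELECT 1 FROM table1 WHERE status = ?)", ("archived",)
).fetchone()[0]

# COUNT yields the number of matching rows, which requires visiting all of them.
count_open = conn.execute(
    "SELECT COUNT(1) FROM table1 WHERE status = ?", ("open",)
).fetchone()[0]

print(exists_open, exists_missing, count_open)
```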
The different methods have different pros and cons:
SELECT EXISTS(SELECT 1 FROM table1 WHERE some_condition);
might be the fastest on MySQL, but
SELECT COUNT(1) FROM table1 WHERE some_condition;
as in @Luis's answer gives you the count.
More to the point I recommend you take a look at your business logic: Very seldom is it necessary to just see if a row exists, more often you will want to
either use these rows, so just do the select and handle the 0-rows case
or you will want to change these rows, in which case just do your update and check mysql_affected_rows()
If you want to INSERT a row if it doesn't already exist, take a look at INSERT .. ON DUPLICATE KEY or REPLACE INTO
EXISTS is defined in SQL generally; it is not only a MySQL function: http://www.techonthenet.com/sql/exists.php
and I usually use this function to test if a particular row exists.
However in Oracle I've seen many times the other approach suggested before:
SELECT COUNT(1) FROM table1 WHERE some_condition;
How can I get the minimum values (plural!) from a table without using a subquery? The table contains the following data (sorry for the mouse):
As you can see, I always want to select the minimum values. If there are the same values (table 2 & 3) the query shall give all rows, because there is no minimum. I'm using MySQL. I don't want to use a subquery if possible because of performance reasons. A min(value) and group by id doesn't work either, because of the unique ids.
Thanks in advance
ninsky
As far as I know, this cannot be done without a subquery in MySQL. For example:
select *
from YourTable
where value =
(
select min(value)
from YourTable
)
If you do not trust MySQL's performance here, you can split the query proposed by Andomar into two separate queries.
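Both variants can be sketched as follows, with Python's sqlite3 module standing in for MySQL. The sample values are made up so that two rows tie for the minimum; the original data was only shown as an image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER PRIMARY KEY, value INTEGER)")
# Illustrative data: ids 2 and 3 share the minimum value.
conn.executemany(
    "INSERT INTO YourTable (id, value) VALUES (?, ?)",
    [(1, 5), (2, 3), (3, 3), (4, 7)],
)

# Variant 1: a single statement with a scalar subquery.
with_subquery = conn.execute(
    "SELECT id, value FROM YourTable "
    "WHERE value = (SELECT MIN(value) FROM YourTable) "
    "ORDER BY id"
).fetchall()

# Variant 2: two separate queries, passing the minimum along as a parameter.
min_value = conn.execute("SELECT MIN(value) FROM YourTable").fetchone()[0]
two_step = conn.execute(
    "SELECT id, value FROM YourTable WHERE value = ? ORDER BY id", (min_value,)
).fetchall()

print(with_subquery, two_step)
```

Both variants return every row that ties for the minimum. An index on value would presumably help either one, though that is worth measuring on the real table.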
In a table that has over 1 million entries, we occasionally need to find all rows whose name column starts with a number.
This is what is currently being used, but it seems like there may be a more efficient way of doing this.
SELECT * FROM mytable WHERE name LIKE '0%' OR name LIKE '1%' OR name ...
etc...
Any suggestions?
select * from table where your_field regexp '^[0-9]'
Hey,
you should add an index with a length of 1 to the field in the db. The query will then be significantly faster.
ALTER TABLE `database`.`table` ADD INDEX `indexName` ( `column` ( 1 ) )
Felix
My guess is that the indexes on the table aren't being used efficiently (if at all)
Since this is a char field of some type, and if this is the primary query on this table, you could restructure your indexes (and my mysql knowledge is a bit short here, somebody help out) such that this table is ordered (clustered index in ms sql) by this field, thus you could say something like
select * from mytable where name >= char(48) and name < char(58)
Do some testing there, I'm not 100% on the details of how mysql would rank those characters, but that should get you going.
Another option is to add a new column that gives you a true/false on "starts_with_number". You could set up a trigger to populate that column. This might give the best and most predictable results.
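A rough sketch of the flag-column idea, run against SQLite through Python's sqlite3 module. The trigger name, index name and sample names are hypothetical, and MySQL's trigger syntax differs (there one would more likely set the new row's flag in a BEFORE INSERT trigger), so treat this as an illustration of the approach rather than MySQL-ready DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, name TEXT, starts_with_number INTEGER DEFAULT 0)"
)

# Trigger (hypothetical name) that fills the flag whenever a row is inserted.
conn.execute(
    "CREATE TRIGGER set_starts_with_number AFTER INSERT ON mytable "
    "BEGIN "
    "  UPDATE mytable "
    "  SET starts_with_number = (substr(NEW.name, 1, 1) BETWEEN '0' AND '9') "
    "  WHERE id = NEW.id; "
    "END"
)
# Index the flag so the lookup does not have to inspect every name.
conn.execute("CREATE INDEX idx_starts_with_number ON mytable (starts_with_number)")

conn.executemany(
    "INSERT INTO mytable (name) VALUES (?)",
    [("1st place",), ("alpha",), ("42",), ("beta9",)],
)

numeric_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM mytable WHERE starts_with_number = 1 ORDER BY name"
    )
]
print(numeric_names)
```

A matching trigger on UPDATE would be needed if names can change after insertion.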
If you're not actually using each and every field in the rows returned, and you really want to wring every drop of efficiency out of this query, then don't use select *, but instead specify only the fields you want to process.
I'm thinking...
SELECT * FROM myTable WHERE IF( LEFT( name, 1) = '#', 1,0)
I wanted to automate my table population for testing purposes.
I need to edit some columns in a certain table, but I must make sure that the values I put in a given column are not purely random.
So the values actually come from another table, subject to a certain condition.
How can I do that? Just like this code:
update table_one set `some_id`=(select some_id from another_table where another_table.primary_id=table_one.primary_id order by rand() limit 1)
It's something like my condition for the Subselect query. It should match the id of the current row I am updating.
I really forgot my SQL now. Thanks for the answers though.
You're almost there - all you need to do is qualify the column you're selecting in the subquery, so you know it comes from the correct table:
update table_one
set some_id=(
select another_table.some_id
from another_table
where another_table.primary_id=table_one.primary_id
order by rand()
limit 1
)
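The same correlated update can be tried out with Python's sqlite3 module, as a sketch. SQLite spells the function random() rather than MySQL's rand(), and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_one (primary_id INTEGER PRIMARY KEY, some_id INTEGER)")
conn.execute("CREATE TABLE another_table (primary_id INTEGER, some_id INTEGER)")

# Illustrative data: row 1 has two candidates, row 2 has one, row 3 has none.
conn.executemany("INSERT INTO table_one VALUES (?, ?)", [(1, None), (2, None), (3, None)])
conn.executemany("INSERT INTO another_table VALUES (?, ?)", [(1, 10), (1, 11), (2, 20)])

# For each row being updated, pick one random candidate with the same primary_id.
conn.execute(
    "UPDATE table_one SET some_id = ("
    "  SELECT another_table.some_id FROM another_table"
    "  WHERE another_table.primary_id = table_one.primary_id"
    "  ORDER BY random() LIMIT 1"
    ")"
)

result = dict(conn.execute("SELECT primary_id, some_id FROM table_one").fetchall())
print(result)
```

Rows with no match in another_table end up with NULL, since the subquery returns no row for them.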