I have a table A and a table B (and a table C, which is not really relevant). The relation between A and B is 1:n.
Table A
- id
- c_foreign_key
Table B
- id
- A_id
- datetime
Table A has about 400,000 entries, table B about 20 million.
I have a time range, let's say from 2014/01/01 to 2014/12/31.
What I want for each month in this range is:
Count all entries from table A, grouped by c_foreign_key, where the entry in table A has no entries in table B for the period from (month - 1 year) to month.
The result should look like this:
date   c_foreign_key  count(*)
--------------------------------
14/01  1              2000
14/01  2              3000
...
14/02  1              4000
14/02  2              6000
...
I already tried a left join and a "not in (select ...)" for each month, but the performance wasn't really good.
You should debug your SQL queries with EXPLAIN (see the MySQL EXPLAIN syntax documentation for more info). You should also add indexes on your datetime columns for better performance. EXPLAIN is usually used to see which indexes MySQL uses for your query.
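A minimal sketch of one way to phrase the per-month count, using NOT EXISTS together with a composite index. It is only an illustration under assumptions: Python's built-in sqlite3 module stands in for MySQL so that the example can be run anywhere, the table names a and b, the index name and the sample rows are made up, and the window is taken to be the year before the first day of the month. In MySQL you would use DATE_SUB instead of SQLite's datetime modifier and check the plan with EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, c_foreign_key INTEGER);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER, datetime TEXT);
-- Hypothetical composite index: lets the NOT EXISTS probe seek by a_id and date.
CREATE INDEX b_a_id_datetime ON b (a_id, datetime);
""")
# Made-up sample rows (not from the question).
conn.executemany("INSERT INTO a VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
conn.executemany("INSERT INTO b (a_id, datetime) VALUES (?, ?)", [
    (1, "2013-06-15 10:00:00"),  # inside the one-year window before 2014-01-01
    (3, "2012-03-01 09:00:00"),  # outside that window
])

def count_inactive(conn, month_start):
    """Count rows of a per c_foreign_key that have no b row in the year before month_start."""
    return conn.execute("""
        SELECT a.c_foreign_key, COUNT(*)
        FROM a
        WHERE NOT EXISTS (
            SELECT 1 FROM b
            WHERE b.a_id = a.id
              AND b.datetime >= datetime(:m, '-1 year')
              AND b.datetime < :m
        )
        GROUP BY a.c_foreign_key
        ORDER BY a.c_foreign_key
    """, {"m": month_start}).fetchall()

result = count_inactive(conn, "2014-01-01 00:00:00")
print(result)
```

Calling count_inactive once per month in the range gives the rows of the desired result; whether this beats the left join on 20 million rows depends on the index actually being used, which is what EXPLAIN would tell you.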
This may be a very lame question, but I am a bit confused and I am not a DB expert. Let me come to the question:
Say, I have two tables A & B.
Table A has 4 columns but 4 Million rows.
ID(P), Key_id(Foreign Key), Value
Table B has 200 columns but 20K rows.
ID(P), Field 1,....N
Now, my question is which one will be faster:
If I run a select query on table A to fetch a set of records based on Key_id values, with one join to another table, say C, to get the value for each Key_id?
Output:
Key_Value| Key_ID| Value
If I run a select query on table B to fetch a record with all 200 columns?
Output:
ID| all 200 columns...
Thanks in advance.
OK, so I was learning SQL joins and was curious to try all the joins on the following tables:
Table name Demo1:
A
1
1
1
1
1
Table name Demo2:
B
1
1
1
1
1
To my amazement, no matter which join I apply, I end up with the same 25 entries. I am sure about the cross join, since it gives all combinations, but what about the other joins? How are they returning the same answer for these two tables?
A join statement works like this: it picks up all entries from the first table,
then, for every entry, it picks all entries from the second table that satisfy the ON condition. Every value in both tables is 1, so an equality condition holds for every pair of rows.
Hence, the number of results in this case = number of records in the first table * number of records in the second table = 5 * 5 = 25.
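The explanation above can be checked with a small sketch. It assumes the join condition was Demo1.A = Demo2.B (the question does not show it) and uses Python's sqlite3 module purely as a convenient stand-in database; only INNER, LEFT and CROSS JOIN are tried here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Demo1 (A INTEGER);
CREATE TABLE Demo2 (B INTEGER);
""")
conn.executemany("INSERT INTO Demo1 VALUES (?)", [(1,)] * 5)
conn.executemany("INSERT INTO Demo2 VALUES (?)", [(1,)] * 5)

# Assumed ON condition: plain equality between the two columns.
queries = {
    "inner": "SELECT * FROM Demo1 INNER JOIN Demo2 ON Demo1.A = Demo2.B",
    "left": "SELECT * FROM Demo1 LEFT JOIN Demo2 ON Demo1.A = Demo2.B",
    "cross": "SELECT * FROM Demo1 CROSS JOIN Demo2",
}

def row_counts(conn):
    """Return the number of rows each join type produces."""
    return {name: len(conn.execute(sql).fetchall()) for name, sql in queries.items()}

counts = row_counts(conn)
print(counts)

# Change one value in Demo2 so that it no longer matches anything in Demo1.
conn.execute("UPDATE Demo2 SET B = 2 WHERE rowid = 1")
counts_after = row_counts(conn)
print(counts_after)
```

With identical values every join degenerates into the cross product; once one row of Demo2 stops matching, the inner and left joins drop to 5 * 4 = 20 rows while the cross join stays at 25.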
SELECT column_1 FROM table_1,table_2;
When I ran this on my database, it returned a huge number of rows with duplicate column_1 values. I could not understand why I got these results. Please explain what this query does.
It gives you the cross product of table_1 and table_2.
In more layman's terms, it means that for each record in Table A, you get every record from Table B (all possible combinations).
TableA with 3 records and Table B with 3 records gives 9 total records in the result:
TableA-1/B-1
TableA-1/B-2
TableA-1/B-3
TableA-2/B-1
TableA-2/B-2
TableA-2/B-3
TableA-3/B-1
TableA-3/B-2
TableA-3/B-3
This is often used as a basis for Cartesian queries (which themselves are the means to generate, say, a list of future dates based on a recurrence schedule: give me all possible results for the next 6 months, then restrict that set to those whose factor matches my day of the week).
This is a valid way of cross joining two tables, but it is not the preferred way. An explicit CROSS JOIN would be much clearer. A join condition (ON) would then be helpful to limit the results.
Imagine that I have 3 friends named Jhon, Ana and Nick, and that another table holds 2 T-shirts, a red one and a yellow one, and I want to know which T-shirt belongs to whom.
So in the query being tableA:Friends and tableB:Tshirts returns:
1|JHON | t-shirt_YELLOW
2|JHON | t-shirt_RED
3|ANA | t-shirt_YELLOW
4|ANA | t-shirt_RED
5|NICK | t-shirt_YELLOW
6|NICK | t-shirt_RED
As you can see, this join has no relational logic between friends and T-shirts, so evaluating all the possible combinations generates what you call duplicates.
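The friends and T-shirts example can be reproduced with a short sketch. The table and column names (Friends.name, Tshirts.color) are assumptions made for illustration, and sqlite3 is used only so the snippet runs without a server. It shows that the comma form and an explicit CROSS JOIN return the same rows, and why selecting a single column looks like duplicates.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Friends (name TEXT);
CREATE TABLE Tshirts (color TEXT);
""")
conn.executemany("INSERT INTO Friends VALUES (?)", [("JHON",), ("ANA",), ("NICK",)])
conn.executemany("INSERT INTO Tshirts VALUES (?)", [("YELLOW",), ("RED",)])

# Comma-separated FROM list: an implicit cross join.
implicit = conn.execute("SELECT name, color FROM Friends, Tshirts").fetchall()
# The same thing written explicitly.
explicit = conn.execute("SELECT name, color FROM Friends CROSS JOIN Tshirts").fetchall()

# Selecting only one column repeats each name once per T-shirt row.
names_only = conn.execute("SELECT name FROM Friends, Tshirts").fetchall()
name_counts = Counter(name for (name,) in names_only)

print(sorted(implicit))
print(name_counts)
```

Each friend appears twice in names_only because there are two T-shirt rows, which matches the "duplicate column_1 values" seen in the original query.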
I have a MySQL table of bouts set up like this.
|------------------------|
|bouts |
|------------------------|
|boutID |
|recording_athlete |
|boutdate (timestamp) |
|opponent |
|recording_athlete_points|
|------------------------|
Each actual meeting between two people is recorded twice in the table, each record with a unique boutID and boutdate (reflecting the moment when it was actually entered, but within 5 minutes of the other); the recording athlete of one record is the opponent of the other, and vice versa. The two records are not necessarily consecutive. There are additional meetings for the two participants each day, separated by longer time intervals: we're looking for the two closest in both timestamp and ID number (assuming that these are the two that belong together).
I'm trying to SELECT records that belong together into one row (I realize, and want, that each pairing will be output twice) such that the matched rows look something like this:
boutID|recording_athlete|boutdate|opponent|recording_athlete_points|boutID_b|boutdate_b|opponent_points
01|John|2012-05-10 20:33:04|Jane|15|04|2012-05-10 20:36:12|10
04|Jane|2012-05-10 20:36:12|John|10|01|2012-05-10 20:33:04|15
Here is what I have so far, and where I think I need to go, but just can't figure out what to use. Some sort of interval statement? Or do I need a totally different structure?
SELECT
A.`boutID`,
A.`recording_athlete`,
A.`boutDate`,
A.`opponent`,
A.`recording_athlete_points`,
B.`boutID` as `boutID_b`,
B.`boutDate` as `boutdate_b`,
B.`recording_athlete_points` as `opponent_points`
FROM bouts A
INNER JOIN bouts B on(A.`fullName` = B.`opponent` AND ????? )
ORDER by A.`boutDate`
SELECT
...
FROM bouts A
JOIN bouts B
on A.fullName = B.opponent
AND B.boutdate between subdate(A.boutdate, interval 5 minute)
and adddate(A.boutdate, interval 5 minute)
If you create an index on boutdate:
create index bouts_boutdate_index on bouts(boutdate);
this query should perform well.
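Here is a rough, runnable sketch of the 5-minute window join, with some assumptions spelled out: it joins on recording_athlete (the column shown in the table layout) rather than fullName, it adds a second condition A.opponent = B.recording_athlete that is not in the query above, and it runs on SQLite through Python's sqlite3 module, so datetime(..., '-5 minutes') takes the place of MySQL's subdate/adddate. Bouts 7 and 9 are made-up rows for a later meeting on the same day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bouts (
    boutID INTEGER PRIMARY KEY,
    recording_athlete TEXT,
    boutdate TEXT,
    opponent TEXT,
    recording_athlete_points INTEGER
);
CREATE INDEX bouts_boutdate_index ON bouts (boutdate);
""")
conn.executemany("INSERT INTO bouts VALUES (?, ?, ?, ?, ?)", [
    (1, "John", "2012-05-10 20:33:04", "Jane", 15),
    (4, "Jane", "2012-05-10 20:36:12", "John", 10),
    (7, "John", "2012-05-10 22:10:00", "Jane", 8),   # hypothetical later bout
    (9, "Jane", "2012-05-10 22:12:30", "John", 12),  # hypothetical later bout
])

pairs = conn.execute("""
    SELECT A.boutID, A.recording_athlete, A.opponent, A.recording_athlete_points,
           B.boutID AS boutID_b, B.recording_athlete_points AS opponent_points
    FROM bouts A
    JOIN bouts B
      ON A.recording_athlete = B.opponent
     AND A.opponent = B.recording_athlete   -- extra condition, an assumption
     AND B.boutdate BETWEEN datetime(A.boutdate, '-5 minutes')
                        AND datetime(A.boutdate, '+5 minutes')
    ORDER BY A.boutdate
""").fetchall()

for row in pairs:
    print(row)
```

The later meeting is paired separately from the earlier one because its timestamps fall outside the first pair's 5-minute window. If two meetings of the same pair could ever be entered within 5 minutes of each other, this join would return extra combinations and would need a tie-breaker.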
I have two tables:
Table A: a_id,timemarker (datetime)
Table B: b_id,start (datetime), stop(datetime), cat(varchar)
table A
149|2010-07-19 07:43:45
150|2010-07-19 08:01:34
151|2010-07-19 07:49:12
table B
565447|2010-07-19 07:30:00|2010-07-19 08:00:00
565448|2010-07-19 08:00:00|2010-07-19 08:20:00
I want to select all rows from Table A whose timemarker falls within a start/stop range in Table B.
thanks
Select any A that is within ANY [B.start, B.stop]:
select a.*
from a
where exists (select * from b where a.timemarker between b.start and b.stop);
The OP writes
i have trouble with my keys! the query executes very long. i have got in table a over 40k rows and in table b over 1.4 million rows... there is no relation within the tables – Norman 3 secs ago
Yes, because you're potentially comparing each A with every B: 40k * 1.4M = 56 billion comparisons.
But your question was "how do I do this", not "here's how I'm doing it, how can I make it faster".
If you want it faster, you'll need to add an index on B(start, stop).
SELECT a.* FROM a
INNER JOIN b ON
a.timemarker BETWEEN b.start AND b.stop
GROUP BY a.a_id
I think this should be less expensive. Of course an extra index will help, too :)
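To compare the two suggestions, here is a small sketch that runs both the EXISTS form and the JOIN ... GROUP BY form on the sample rows from the question. It is only a sketch under assumptions: sqlite3 stands in for MySQL, the index name b_start_stop is made up, and row 152 is a hypothetical extra row added to show a timemarker that falls outside every range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (a_id INTEGER PRIMARY KEY, timemarker TEXT);
CREATE TABLE b (b_id INTEGER PRIMARY KEY, start TEXT, stop TEXT, cat TEXT);
-- Hypothetical index name; the answers only say to index (start, stop).
CREATE INDEX b_start_stop ON b (start, stop);
""")
conn.executemany("INSERT INTO a VALUES (?, ?)", [
    (149, "2010-07-19 07:43:45"),
    (150, "2010-07-19 08:01:34"),
    (151, "2010-07-19 07:49:12"),
    (152, "2010-07-19 09:00:00"),  # hypothetical row outside every range
])
conn.executemany("INSERT INTO b (b_id, start, stop) VALUES (?, ?, ?)", [
    (565447, "2010-07-19 07:30:00", "2010-07-19 08:00:00"),
    (565448, "2010-07-19 08:00:00", "2010-07-19 08:20:00"),
])

exists_ids = [r[0] for r in conn.execute("""
    SELECT a.a_id FROM a
    WHERE EXISTS (SELECT 1 FROM b
                  WHERE a.timemarker BETWEEN b.start AND b.stop)
    ORDER BY a.a_id
""")]

join_ids = [r[0] for r in conn.execute("""
    SELECT a.a_id FROM a
    INNER JOIN b ON a.timemarker BETWEEN b.start AND b.stop
    GROUP BY a.a_id
    ORDER BY a.a_id
""")]

print(exists_ids, join_ids)
```

Both forms return the same ids here; which one is cheaper on 40k by 1.4 million rows is something only the query plan on the real data can show.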