Need to query 2 MySQL tables with COUNT(*) condition

I have 2 tables (cycles and merged_cycles). "cycles" has 2 fields I need to target (userid and cycleid) and "merged_cycles" also has 2 targeted fields (cycleid1 and cycleid2). I need to know all cycles.userid that have more than one record in "cycles", so long as the corresponding cycles.cycleid for any matching record does not appear in any record in "merged_cycles" in either merged_cycles.cycleid1 OR merged_cycles.cycleid2. I currently have it working using 2 different queries, but I was curious if it could be done in one. Here's what I have tried so far:
SELECT cycles.cycleid, cycles.userid, cycles.COUNT(*),
merged_cycles.cycleid1, merged_cycles.cycleid2
FROM cycles,merged_cycles
WHERE merged_cycles.cycleid1 != cycles.cycleid && merged_cycles.cycleid2 != cycles.cycleid
GROUP BY cycles.userid
HAVING cycles.count(*) > 1
Thanks for any suggestions!

I think this does what you want:
SELECT c.userid
FROM cycles c
WHERE NOT EXISTS (SELECT 1
                  FROM merged_cycles mc
                  WHERE c.cycleid IN (mc.cycleid1, mc.cycleid2)
                 )
GROUP BY c.userid
HAVING count(*) > 1;
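To check the NOT EXISTS plus GROUP BY/HAVING approach on a small dataset, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows and the extra "unmerged" count column are assumptions made up for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cycles (cycleid INTEGER PRIMARY KEY, userid INTEGER);
    CREATE TABLE merged_cycles (cycleid1 INTEGER, cycleid2 INTEGER);
    -- hypothetical sample rows: user 10 has 3 cycles, user 20 has 2, user 30 has 1
    INSERT INTO cycles VALUES (1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 30);
    -- cycles 3 and 5 have been merged
    INSERT INTO merged_cycles VALUES (3, 5);
""")

# Drop every cycle that appears in either merged_cycles column,
# then keep only users with more than one remaining cycle.
rows = conn.execute("""
    SELECT c.userid, COUNT(*) AS unmerged
    FROM cycles c
    WHERE NOT EXISTS (SELECT 1
                      FROM merged_cycles mc
                      WHERE c.cycleid IN (mc.cycleid1, mc.cycleid2))
    GROUP BY c.userid
    HAVING COUNT(*) > 1
""").fetchall()
print(rows)
```

With these rows, only user 10 still has more than one unmerged cycle; user 20 drops out because one of its two cycles appears in merged_cycles.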

Related

Get Multi Columns Count in Single Query

I am working on an application where I need to write a query on a table that returns counts for multiple columns in a single query.
After some research I was able to develop a query for a single sourceId, but what if I want results for multiple sourceIds?
select '3'as sourceId,
(select count(*) from event where sourceId = 3 and plateCategoryId = 3) as TotalNewCount,
(select count(*) from event where sourceId = 3 and plateCategoryId = 4) as TotalOldCount;
I need to get TotalNewCount and TotalOldCount for several source Ids, for example (3,4,5,6)
Can anyone help? How can I revise my query to return a result set of three columns that includes data for all sources in the list (3,4,5,6)?
Thanks
You can do all source ids at once:
select sourceId,
       sum(case when plateCategoryId = 3 then 1 else 0 end) as TotalNewCount,
       sum(case when plateCategoryId = 4 then 1 else 0 end) as TotalOldCount
from event
group by sourceId;
Use a where (before the group by) if you want to limit the source ids.
Note: The above works in both Vertica and MySQL, and being standard SQL should work in any database.
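As a rough illustration of the conditional-aggregation approach with a WHERE filter on the source ids, here is a small sketch using Python's sqlite3 module in place of MySQL or Vertica. The event rows are invented for the example; the column names sourceId and plateCategoryId are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (sourceId INTEGER, plateCategoryId INTEGER)")
# hypothetical sample rows: (sourceId, plateCategoryId)
conn.executemany(
    "INSERT INTO event VALUES (?, ?)",
    [(3, 3), (3, 3), (3, 4), (4, 4), (5, 3), (7, 3)],
)

# One pass over the table: each SUM(CASE ...) counts one category per source.
counts = conn.execute("""
    SELECT sourceId,
           SUM(CASE WHEN plateCategoryId = 3 THEN 1 ELSE 0 END) AS TotalNewCount,
           SUM(CASE WHEN plateCategoryId = 4 THEN 1 ELSE 0 END) AS TotalOldCount
    FROM event
    WHERE sourceId IN (3, 4, 5, 6)
    GROUP BY sourceId
    ORDER BY sourceId
""").fetchall()
print(counts)
```

Note that a source with no rows at all (6 in this sample) does not appear in the result; getting a zero row for it would need an outer join against a list of source ids.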

SQL unwanted results in NOT query

This looks like it should be a really easy question, but I've been looking for an answer for the past two days and can't find it. Please help!
I have two tables along the lines of
texts.text_id, texts.other_stuff...
text_pairs.pair_id, text_pairs.textA, text_pairs.textB
The second table defines pairs of entries from the first table.
What I need is the reverse of an ordinary LEFT JOIN query like:
SELECT texts.text_id
FROM texts
LEFT JOIN text_pairs
ON texts.text_id = text_pairs.textA
WHERE text_pairs.textB = 123
ORDER BY texts.text_id
How do I get exclusively the texts that are not paired with a given textB? I've tried
WHERE text_pairs.textB != 123 OR WHERE text_pairs.textB IS NULL
However, this returns all the pairs where textB is not 123. So, in a situation like
textA  textB
1      3
1      4
2      4
if I ask for textB != 3, the query returns 1 and 2. I need something that will just give me 2, since text 1 is paired with 3.
The comparison on the second table goes in the ON clause. Then you add a condition to see if there is no match:
SELECT t.text_id
FROM texts t LEFT JOIN
text_pairs tp
ON t.text_id = tp.textA AND tp.textB = 123
WHERE tp.textB IS NULL
ORDER BY t.text_id ;
This logic is often expressed using NOT EXISTS or NOT IN:
select t.*
from texts t
where not exists (select 1
from text_pairs tp
where t.text_id = tp.textA AND tp.textB = 123
);
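Both forms of this anti-join can be compared side by side. The sketch below runs them with Python's sqlite3 module on a few made-up rows (the sample data is an assumption, not from the question) and shows that they return the same text ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE texts (text_id INTEGER PRIMARY KEY);
    CREATE TABLE text_pairs (pair_id INTEGER PRIMARY KEY, textA INTEGER, textB INTEGER);
    -- hypothetical sample rows: text 1 is paired with 123, text 2 is not, text 3 has no pairs
    INSERT INTO texts VALUES (1), (2), (3);
    INSERT INTO text_pairs VALUES (1, 1, 123), (2, 1, 4), (3, 2, 4);
""")

# LEFT JOIN form: the textB test lives in ON, so unmatched texts survive with NULLs.
left_join_ids = [r[0] for r in conn.execute("""
    SELECT t.text_id
    FROM texts t
    LEFT JOIN text_pairs tp
           ON t.text_id = tp.textA AND tp.textB = 123
    WHERE tp.textB IS NULL
    ORDER BY t.text_id
""")]

# NOT EXISTS form of the same condition.
not_exists_ids = [r[0] for r in conn.execute("""
    SELECT t.text_id
    FROM texts t
    WHERE NOT EXISTS (SELECT 1
                      FROM text_pairs tp
                      WHERE t.text_id = tp.textA AND tp.textB = 123)
    ORDER BY t.text_id
""")]
print(left_join_ids, not_exists_ids)
```

Text 1 is excluded even though it also has a pair with textB = 4, which is the behavior that a plain textB != 123 filter in the WHERE clause cannot give.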

How to Find First Valid Row in SQL Based on Difference of Column Values

I am trying to find a reliable query which returns the first instance of an acceptable insert range.
Research:
Some of the links below address similar questions, but I could get none of them to work for me.
Find first available date, given a date range in SQL
Find closest date in SQL Server
MySQL difference between two rows of a SELECT Statement
How to find a gap in range in SQL
and more...
Objective Query Function:
InsertRange(1) = (StartRange(i) - EndRange(i-1)) > NewValue
Where InsertRange(1) is the value the query should return. In other words, this would be the first instance where the above condition is satisfied.
Table Structure:
Primary Key: StartRange
StartRange(i-1) < StartRange(i)
StartRange(i-1) + EndRange(i-1) < StartRange(i)
Example Dataset
Below is an example User table (3 columns), with a set range distribution. StartRanges are always ordered in a strictly ascending way, UserIDs are arbitrary strings, and only the sequences of StartRange and EndRange matter:
StartRange EndRange UserID
312 6896 user0
7134 16268 user1
16877 22451 user2
23137 25142 user3
25955 28272 user4
28313 35172 user5
35593 38007 user6
38319 38495 user7
38565 45200 user8
46136 48007 user9
My current Query
I am trying to use this query at the moment:
SELECT t2.StartRange, t2.EndRange
FROM user AS t1, user AS t2
WHERE (t1.StartRange - t2.StartRange+1) > NewValue
ORDER BY t1.EndRange
LIMIT 1
Example Case
Given the table, if NewValue = 800, then the returned answer should be 23137. This means, the first available slot would be between user3 and user4 (with an actual slot size = 813):
InsertRange(1) = (StartRange(i) - EndRange(i-1)) > NewValue
InsertRange = (StartRange(6) - EndRange(5)) > NewValue
23137 = 25955 - 25142 > 800
More Comments
My query above seemed to be working for the special case where StartRanges were tightly packed (i.e. StartRange(i) = StartRange(i-1) + EndRange(i-1) + 1). It no longer works with a less tightly packed set of StartRanges.
Keep in mind that SQL tables have no implicit row order. It seems fair to order your table by StartRange value, though.
We can start to solve this by writing a query to obtain each row paired with the row preceding it. In MySQL versions before 8.0, it's hard to do this beautifully because they lack a row numbering function.
This works (http://sqlfiddle.com/#!9/4437c0/7/0). It may have nasty performance because it generates O(n^2) intermediate rows. There's no row for user0; it can't be paired with any preceding row because there is none.
select MAX(a.StartRange) SA, MAX(a.EndRange) EA,
b.StartRange SB, b.EndRange EB , b.UserID
from user a
join user b ON a.EndRange <= b.StartRange
group by b.StartRange, b.EndRange, b.UserID
Then you can use that as a subquery and apply your conditions, which are:
gap >= 800: WHERE SB-EA >= 800
first matching row (lowest StartRange value): ORDER BY SB
just one row: LIMIT 1
Here's the query (http://sqlfiddle.com/#!9/4437c0/11/0)
SELECT SB-EA Gap,
EA+1 Beginning_of_gap, SB-1 Ending_of_gap,
UserId UserID_after_gap
FROM (
select MAX(a.StartRange) SA, MAX(a.EndRange) EA,
b.StartRange SB, b.EndRange EB , b.UserID
from user a
join user b ON a.EndRange <= b.StartRange
group by b.StartRange, b.EndRange, b.UserID
) pairs
WHERE SB-EA >= 800
ORDER BY SB
LIMIT 1
Notice that you may actually want the smallest matching gap instead of the first matching gap. That's called best fit, rather than first fit. To get that you use ORDER BY SB-EA instead.
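The first-fit query can be tried against the example dataset from the question. This is a minimal sketch using Python's sqlite3 module rather than MySQL; the helper name first_fit and its new_value parameter are hypothetical, introduced only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (StartRange INTEGER PRIMARY KEY, EndRange INTEGER, UserID TEXT)")
conn.executemany("INSERT INTO user VALUES (?, ?, ?)", [
    (312, 6896, "user0"), (7134, 16268, "user1"), (16877, 22451, "user2"),
    (23137, 25142, "user3"), (25955, 28272, "user4"), (28313, 35172, "user5"),
    (35593, 38007, "user6"), (38319, 38495, "user7"), (38565, 45200, "user8"),
    (46136, 48007, "user9"),
])

def first_fit(conn, new_value):
    """Return (gap, gap_start, gap_end, user_after_gap) for the first gap >= new_value, or None."""
    return conn.execute("""
        SELECT SB - EA AS Gap, EA + 1 AS Beginning_of_gap,
               SB - 1 AS Ending_of_gap, UserID AS UserID_after_gap
        FROM (
            -- pair each row b with the closest preceding EndRange
            SELECT MAX(a.EndRange) AS EA, b.StartRange AS SB, b.UserID
            FROM user a
            JOIN user b ON a.EndRange <= b.StartRange
            GROUP BY b.StartRange, b.EndRange, b.UserID
        ) pairs
        WHERE SB - EA >= ?
        ORDER BY SB
        LIMIT 1
    """, (new_value,)).fetchone()

print(first_fit(conn, 800))
```

For NewValue = 800 this finds the 813-wide gap between user3 and user4, matching the example case in the question. Swapping ORDER BY SB for ORDER BY SB - EA would turn the same sketch into best fit.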
Edit: There is another way to use MySQL to join adjacent rows, that doesn't have the O(n^2) performance issue. It involves employing user variables to simulate a row_number() function. The query involved is a hairball (that's a technical term). It's described in the third alternative of the answer to this question. How do I pair rows together in MYSQL?

MySQL take duplicate data and combine unique data

With my MySQL database, I want to take data from my temporary table and insert it into my main table, while removing any duplicate data but also taking into consideration the data I already have. This seems to require an update and/or an insert depending on what exists in "data_table" so I really have no idea how to write it or if it is even possible. If this isn't possible, I'd like to know how to accomplish this while not considering what is already in "data_table" which I would think is possible. Thank you for your help!
Existing data_table before running query:
data_table
id  age  gender  color
1   5    m       pink,red,purple
data_table_temp
id  age  gender  color
1   5    m       red
2   5    m       blue
3   5    m       red
4   5    m       orange
5   6    m       red
6   6    m       green
7   6    m       blue
After query:
data_table
id  age  gender  color
1   5    m       pink,red,purple,blue,orange
2   6    m       red,green,blue
Here is an approach to this problem which turned out to be harder than I expected.
The idea is to concat the colors that don't match and put them together. There is a bit of a problem assigning ids. Getting the "2" for the second row is a problem, so this just assigned the id sequentially:
select @id := @id + 1 as id,
       coalesce(dt.age, dtt.age) as age,
       coalesce(dt.gender, dtt.gender) as gender,
       -- existing colors first, then any temp colors not already in that list
       concat_ws(',', dt.color,
                 group_concat(distinct
                              case when coalesce(find_in_set(dtt.color, dt.color), 0) = 0
                                   then dtt.color
                              end)
                ) as color
from data_table_temp dtt left outer join
     data_table dt
     on dt.age = dtt.age and
        dt.gender = dtt.gender cross join
     (select @id := 0) var
group by coalesce(dt.age, dtt.age), coalesce(dt.gender, dtt.gender);
MySQL doesn't have any string functions to (easily) split a delimited string (like data_table.color).
However, if you have all of the data in data_table_temp's format (one color per row), you can generate the desired results like this:
SELECT DISTINCT age, GROUP_CONCAT(DISTINCT color)
FROM table WHERE [condition]
GROUP BY age;
Optionally adding in gender, as necessary.
Apologies for the half-answer
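To see what GROUP_CONCAT(DISTINCT ...) produces when the data is stored one color per row, here is a small sketch using Python's sqlite3 module on the data_table_temp rows from the question. SQLite stands in for MySQL here, and since neither guarantees the order of the concatenated values without an explicit ordering, the result is compared as sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_table_temp (id INTEGER, age INTEGER, gender TEXT, color TEXT)")
conn.executemany("INSERT INTO data_table_temp VALUES (?, ?, ?, ?)", [
    (1, 5, "m", "red"), (2, 5, "m", "blue"), (3, 5, "m", "red"),
    (4, 5, "m", "orange"), (5, 6, "m", "red"), (6, 6, "m", "green"),
    (7, 6, "m", "blue"),
])

# One output row per (age, gender), with duplicate colors collapsed.
combined = {
    (age, gender): set(colors.split(","))
    for age, gender, colors in conn.execute("""
        SELECT age, gender, GROUP_CONCAT(DISTINCT color)
        FROM data_table_temp
        GROUP BY age, gender
    """)
}
print(combined)
```

This only covers the temp-table side; merging with the comma-separated list already stored in data_table still needs something like the FIND_IN_SET approach above.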

Join Two tables multiple times

I need results from two tables, where one is the parent table and the other is a child table that also acts as its own parent for sub-level child entries.
If I run a SQL query like:
SELECT cc.collection_id, cc.title, cc.type, cc.alias as forum_alias,
SUBSTRING(cc.description,1,200) as short_desc,
COUNT(b1.boardmessage_id) as total_threads,
COUNT(b2.boardmessage_id) as total_replies
FROM contentcollections cc
JOIN boardmessages b1 ON b1.parent_id = cc.collection_id
JOIN boardmessages b2 ON b2.collection_id = cc.collection_id
WHERE cc.type=1
AND cc.is_active=1
AND b1.parent_type='collection'
AND b1.is_active=1
AND b2.parent_type IN('message','reply','reply_on_reply')
GROUP BY cc.collection_id
ORDER BY cc.created DESC;
it gives me the wrong output: total_threads and total_replies come back as the same number. However, if I do something like this:
SELECT cc.collection_id, cc.title,cc.type, cc.alias as forum_alias,
SUBSTRING(cc.description,1,200) as short_desc,
(SELECT COUNT(boardmessage_id)
FROM boardmessages
WHERE parent_type='collection'
AND collection_id=cc.collection_id
AND is_active=1) as total_threads,
(SELECT count(boardmessage_id)
FROM boardmessages
WHERE parent_type IN('message','reply','reply_on_reply')
AND collection_id=cc.collection_id AND is_active=1) as total_replies
FROM contentcollections cc
WHERE cc.type=? AND cc.is_active=?
ORDER BY cc.created DESC
It gives me the correct answer.
I suspect that using subqueries in the second option may slow down page rendering.
Please advise. Any help or suggestion would be greatly appreciated.
Thanks
Replace:
COUNT(b1.boardmessage_id) as total_threads,
COUNT(b2.boardmessage_id) as total_replies
With:
COUNT(DISTINCT b1.boardmessage_id) as total_threads,
COUNT(DISTINCT b2.boardmessage_id) as total_replies
if you only want each row to count once, instead of the default, which counts all combinations.
If you have 3 rows in b1 and 5 rows in b2, the join produces 3 x 5 = 15 rows, and both plain counts return 15. With DISTINCT you get 3 and 5 instead, since there are 3 distinct values in b1 and 5 distinct values in b2.
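The 3 x 5 = 15 effect is easy to reproduce. The sketch below uses Python's sqlite3 module with a simplified, assumed version of the schema (only the columns needed for the join, and both joins on collection_id); the row values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contentcollections (collection_id INTEGER PRIMARY KEY);
    CREATE TABLE boardmessages (
        boardmessage_id INTEGER PRIMARY KEY,
        collection_id   INTEGER,
        parent_type     TEXT
    );
    INSERT INTO contentcollections VALUES (1);
""")
# hypothetical data: 3 threads and 5 replies in collection 1
messages = [(i, 1, "collection") for i in range(1, 4)]
messages += [(i, 1, "reply") for i in range(4, 9)]
conn.executemany("INSERT INTO boardmessages VALUES (?, ?, ?)", messages)

# Joining the same table twice multiplies the rows: every thread pairs with every reply.
result = conn.execute("""
    SELECT COUNT(b1.boardmessage_id),
           COUNT(b2.boardmessage_id),
           COUNT(DISTINCT b1.boardmessage_id),
           COUNT(DISTINCT b2.boardmessage_id)
    FROM contentcollections cc
    JOIN boardmessages b1 ON b1.collection_id = cc.collection_id
                         AND b1.parent_type = 'collection'
    JOIN boardmessages b2 ON b2.collection_id = cc.collection_id
                         AND b2.parent_type = 'reply'
    GROUP BY cc.collection_id
""").fetchone()
print(result)
```

The first two columns show the inflated counts from the row multiplication, and the last two show the per-table counts that DISTINCT recovers.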