Fetching crossed records from database - mysql

I have the following table structure (simplified):
id | structure_id | structure_hash_id
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 2 | 4
5 | 2 | 1
6 | 3 | 2
As you can see, each structure contains many structure hashes. What I want to fetch, for each structure id, is how many of the structure hashes it contains also exist in other structures. So for this example it'd be:
structure_id #1: 2
structure_id #2: 1
structure_id #3: 1
The query I wrote for this is:
SELECT contains.structure_id, COUNT(contains.structure_hash_id)
FROM (
SELECT *
FROM structureTable st
WHERE structure_id = 1
) AS contains
INNER JOIN (
SELECT *
FROM structureTable st
WHERE structure_id != 1
) AS notcontains
ON contains.structure_hash_id = notcontains.structure_hash_id
GROUP BY contains.structure_id;
It works. I wrote it from memory because I deleted the original version, but you get the idea.
The problem is that the real table has ~500 million records and some other columns, so the query execution time for each structure_id is huge (> 15 min).
Also, I have to type the structure_id manually, whereas I'd like to get all of them in one result, as in the example at the top of this post.
How can I solve this problem?

You can achieve this with a self join and GROUP BY.
Here is one way to do that:
select
t1.structure_id,
count(t1.structure_id) as count
from structure t1
inner join structure t2 on t1.structure_id != t2.structure_id
and t1.structure_hash_id = t2.structure_hash_id
group by t1.structure_id
SQL Fiddle Example: http://sqlfiddle.com/#!9/678bf7/1/0
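To sanity-check the self join against the sample data from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module in place of MySQL, purely so the example is self-contained. It also swaps in COUNT(DISTINCT ...) for the plain COUNT, on the assumption that a hash shared with several other structures should be counted only once. The index name idx_hash_structure is hypothetical; an index along those lines may help on a large table, but that has not been measured here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE structure ("
    "id INTEGER PRIMARY KEY, structure_id INTEGER, structure_hash_id INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO structure VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 4), (5, 2, 1), (6, 3, 2)],
)
# Hypothetical composite index covering both join columns.
conn.execute(
    "CREATE INDEX idx_hash_structure ON structure (structure_hash_id, structure_id)"
)

query = """
SELECT t1.structure_id,
       COUNT(DISTINCT t1.structure_hash_id) AS shared_hashes
FROM structure t1
INNER JOIN structure t2
        ON t1.structure_hash_id = t2.structure_hash_id
       AND t1.structure_id != t2.structure_id
GROUP BY t1.structure_id
ORDER BY t1.structure_id
"""
# One row per structure_id: how many of its hashes also occur elsewhere.
shared_counts = conn.execute(query).fetchall()
print(shared_counts)
```

Structures that share no hash with any other structure do not appear in the result at all, because the inner join drops them.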

Related

Get records from multiple tables, but only show 1 per ID?

First of all, I'm sorry for the title; it's difficult to explain what I'm trying to achieve.
I have 2 tables: a table for property records, and a table for the images uploaded for each property.
In my listing_details table I enter 1 record per property, with a unique ID and a property slug. I have a prop_gallery table where I can have hundreds of records that share the same property slug, so I can relate them back to my property.
I'm trying to write a query to pull the records from both tables, but I only want to show each property once. At the moment it loops through all the records in the gallery and shows the property once for every record there is in the gallery. Hope this makes sense?
My query is...
$listings = $db->query('
SELECT *
FROM listing_details
JOIN prop_gallery
ON prop_gallery.prop_gallery_id = listing_details.prop_slug
WHERE (prop_slug LIKE prop_gallery_id OR prop_gallery_id LIKE prop_slug)
AND listing_details.prop_mandate = 1'
)->fetchAll();
If there's a property called Liams house then there will be a record for that in listing_details and if I've uploaded 10 pictures, there will be 10 records for that in prop_gallery.
When I loop through my results this means I'm now showing Liams house 10 times, when I want to show it just the once.
EDIT
Result of the above query
prop_id | prop_agent | prop_title | prop_slug | prop_mandate | id | prop_gallery_id | prop_gallery
37 | 2 | House in switzerland | house-in-switzerland | 1 | 4 | 6 | main1.png
37 | 2 | House in switzerland | house-in-switzerland | 1 | 4 | 6 | main2.png
37 | 2 | House in switzerland | house-in-switzerland | 1 | 4 | 6 | main3.png
You can use the ROW_NUMBER() window function (available in MySQL 8.0 and later). It works cleanly as long as there is a column you can sort the rows by; I assumed a column named recorded_at in listing_details.
For example:
SELECT *
FROM (
SELECT *,
-- number the joined rows within each property; recorded_at is an assumed sort column
row_number() over(partition by prop_slug order by recorded_at) as rn
FROM listing_details d
JOIN prop_gallery g
ON g.prop_gallery_id = d.prop_slug
WHERE (prop_slug LIKE prop_gallery_id OR prop_gallery_id LIKE prop_slug)
AND d.prop_mandate = 1
) x
where rn = 1
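As a rough, self-contained illustration of the ROW_NUMBER() approach, the sketch below runs the same idea through Python's sqlite3 module (standing in for MySQL 8.0+; it needs a SQLite build with window-function support). Several things here are assumptions: the gallery rows are ordered by the gallery's own id instead of recorded_at, prop_gallery_id is taken to hold the property slug, and the second property and its image names are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE listing_details (
    prop_id INTEGER, prop_title TEXT, prop_slug TEXT, prop_mandate INTEGER);
CREATE TABLE prop_gallery (
    id INTEGER PRIMARY KEY, prop_gallery_id TEXT, prop_gallery TEXT);
INSERT INTO listing_details VALUES
    (37, 'House in switzerland', 'house-in-switzerland', 1),
    (38, 'Liams house', 'liams-house', 1);
INSERT INTO prop_gallery (prop_gallery_id, prop_gallery) VALUES
    ('house-in-switzerland', 'main1.png'),
    ('house-in-switzerland', 'main2.png'),
    ('house-in-switzerland', 'main3.png'),
    ('liams-house', 'front.png'),
    ('liams-house', 'back.png');
""")

query = """
SELECT prop_id, prop_title, prop_gallery
FROM (
    SELECT d.prop_id, d.prop_title, g.prop_gallery,
           -- rn restarts at 1 for every property slug
           ROW_NUMBER() OVER (PARTITION BY d.prop_slug ORDER BY g.id) AS rn
    FROM listing_details d
    JOIN prop_gallery g ON g.prop_gallery_id = d.prop_slug
    WHERE d.prop_mandate = 1
) x
WHERE rn = 1
ORDER BY prop_id
"""
# Each property appears once, paired with its first gallery image.
one_per_property = conn.execute(query).fetchall()
print(one_per_property)
```

Listing the needed columns explicitly in the inner query, rather than SELECT *, also avoids clashes if both tables ever gain a column with the same name.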

sql sort by int, except for one int value which has a different order

I want to sort a list of records from the database. The records retrieved are sorted on a column with int values. The possible int values are 1,2,3,4,5. The required sort order is 1,3,2,4,5,
so I cannot use ORDER BY table.a ASC. What should my query be to retrieve the desired order? For example, my table has the following records:
---------------------
name | to_order |
--------------------
n1 | 1
--------------------
n2 | 2
--------------------
n3 | 3
--------------------
The result of my query should be (n1,1),(n3,3),(n2,2).
NOTE: I am using mysql
You can add a CASE expression to your ORDER BY clause:
SELECT name, to_order
FROM Table1
ORDER BY
(
CASE
WHEN to_order = 1 THEN 0
WHEN to_order = 3 THEN 1
ELSE 2
END
),
to_order
Add a case for 3 so that it is prioritized over the other numbers. I included 1 as a case since 1 should still be prioritized before 3.
I would write this as:
order by (case when to_order = 2 then 3
when to_order = 3 then 2
else to_order
end)
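Both CASE variants above can be checked side by side. The following is a small sketch using Python's sqlite3 module as a stand-in for MySQL; the rows n4 and n5 are added sample data that the question does not list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (name TEXT, to_order INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [("n1", 1), ("n2", 2), ("n3", 3), ("n4", 4), ("n5", 5)],
)

# Variant 1: bucket 1 and 3 ahead of everything else, then sort by value.
bucket_query = """
SELECT name, to_order FROM Table1
ORDER BY CASE WHEN to_order = 1 THEN 0
              WHEN to_order = 3 THEN 1
              ELSE 2 END,
         to_order
"""
# Variant 2: swap the sort keys of 2 and 3, leave the rest unchanged.
swap_query = """
SELECT name, to_order FROM Table1
ORDER BY CASE WHEN to_order = 2 THEN 3
              WHEN to_order = 3 THEN 2
              ELSE to_order END
"""
bucket_rows = conn.execute(bucket_query).fetchall()
swap_rows = conn.execute(swap_query).fetchall()
print(bucket_rows)
print(swap_rows)
```

For the values 1 to 5 the two variants produce the same order; the swap version is shorter, while the bucket version generalizes more easily to "these values first, everything else after".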
For the tightly constrained data you show, a case statement is by far the most efficient solution. If, however, you have significantly more than 5 values, or if you might extend to more than 5 values in the future, or if you may have to change the ordering at some time then you might want to decouple the entire process by moving the ordering criteria to a separate table.
OldOrder NewOrder
======== ========
1 1
2 3
3 2
4 4
5 5
So your query would join on the first field and sort by the second.
select ...
from ...
inner join SortTable st
on st.OldOrder = to_order
where ...
order by st.NewOrder, ...;
Then to add new values or change the sorting order, you just make changes to the table instead of rewriting all your code.
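The lookup-table idea can be sketched as follows, again with Python's sqlite3 module standing in for MySQL. The table Table1 and its rows are borrowed from the earlier answer plus two added sample rows, and the final UPDATE is a hypothetical reordering included only to show that the query itself does not change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (name TEXT, to_order INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [("n1", 1), ("n2", 2), ("n3", 3), ("n4", 4), ("n5", 5)],
)
# The ordering rules live in data, not in the query text.
conn.execute("CREATE TABLE SortTable (OldOrder INTEGER PRIMARY KEY, NewOrder INTEGER)")
conn.executemany(
    "INSERT INTO SortTable VALUES (?, ?)",
    [(1, 1), (2, 3), (3, 2), (4, 4), (5, 5)],
)

query = """
SELECT t.name
FROM Table1 t
INNER JOIN SortTable st ON st.OldOrder = t.to_order
ORDER BY st.NewOrder
"""
names_before = [row[0] for row in conn.execute(query)]

# Hypothetical change of requirements: value 5 should now sort first.
conn.execute("UPDATE SortTable SET NewOrder = 0 WHERE OldOrder = 5")
names_after = [row[0] for row in conn.execute(query)]
print(names_before, names_after)
```

One thing to keep in mind with an inner join is that any to_order value missing from SortTable silently drops its rows from the result.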

MySQL select only new records

How to write a MySQL query to achieve this task?
Table: writers
w_id w_name
---------------
1 Michael
2 Samantha
3 John
---------------
Table: articles
a_id w_id timestamp a_name
----------------------------------------
1 1 0000000001 PHP programming
2 3 0000000003 Other programming languages
3 3 0000000005 Another article
4 2 0000000015 Web design
5 1 0000000020 MySQL
----------------------------------------
Need to SELECT only those writers who published their first article not earlier than 0000000005. (only writers who published at least one article can be selected)
In this example the result would be:
2 Samantha
SQL code can be tested here http://sqlfiddle.com/#!2/7a308
Untested, but close:
SELECT w.w_id, w.w_name, MIN(a.timestamp) AS min_time
FROM writers w
JOIN articles a ON w.w_id = a.w_id
GROUP BY w.w_id, w.w_name
HAVING min_time >= 5
Here's one approach, using an inline view (or "derived table" as MySQL calls it) to get the earliest timestamp for each writer:
SELECT w.w_id
, w.w_name
-- , e.earliest_timestamp
FROM writers w
LEFT
JOIN ( SELECT a.w_id
, MIN(a.timestamp) AS earliest_timestamp
FROM articles a
GROUP BY a.w_id
) e
ON e.w_id = w.w_id
WHERE e.earliest_timestamp >= '0000000005'
ORDER BY w.w_id
This may not be the most efficient approach, but you can run just the query in the inline view (aliased as e) to see what it returns. We can then reference the result set from that query like we do a table (with some restrictions.)
(Other approaches can make better use of suitable indexes.)
I'm unclear on the datatype of earliest_timestamp column. The SQL above assumes it's character datatype. If it's integer rather than character, the WHERE clause could look like this:
WHERE e.earliest_timestamp >= 5
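To check the derived-table approach against the sample data, here is a minimal sketch that uses Python's sqlite3 module instead of MySQL. It assumes the timestamp column is an integer, and it uses a plain JOIN because the WHERE condition on e.earliest_timestamp discards writers without articles anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE writers (w_id INTEGER PRIMARY KEY, w_name TEXT);
CREATE TABLE articles (
    a_id INTEGER PRIMARY KEY, w_id INTEGER, timestamp INTEGER, a_name TEXT);
INSERT INTO writers VALUES (1, 'Michael'), (2, 'Samantha'), (3, 'John');
INSERT INTO articles VALUES
    (1, 1, 1,  'PHP programming'),
    (2, 3, 3,  'Other programming languages'),
    (3, 3, 5,  'Another article'),
    (4, 2, 15, 'Web design'),
    (5, 1, 20, 'MySQL');
""")

query = """
SELECT w.w_id, w.w_name
FROM writers w
JOIN ( SELECT a.w_id, MIN(a.timestamp) AS earliest_timestamp
       FROM articles a
       GROUP BY a.w_id
     ) e ON e.w_id = w.w_id
WHERE e.earliest_timestamp >= 5
ORDER BY w.w_id
"""
# Only writers whose first article is at timestamp 5 or later remain.
new_writers = conn.execute(query).fetchall()
print(new_writers)
```

John is excluded even though he has an article at timestamp 5, because his earliest article is at timestamp 3.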

GROUP BY does not remove duplicates

I have coded a watchlist system. In the overview of a user's watchlist, they see a list of records; however, the list shows duplicates, even though the database holds exactly the correct number of entries.
I've tried GROUP BY watch.watch_id and GROUP BY rec.record_id, but none of the groupings I've tried removes the duplicates. I'm not sure what I'm doing wrong.
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN members usr ON rec.user_id = usr.user_id
)
WHERE watch.user_id = 1
GROUP BY watch.watch_id
LIMIT 0, 25
The watchlist table looks like this:
+----------+---------+-----------+------------+
| watch_id | user_id | record_id | watch_date |
+----------+---------+-----------+------------+
| 13 | 1 | 22 | 1314038274 |
| 14 | 1 | 25 | 1314038995 |
+----------+---------+-----------+------------+
GROUP BY does not "remove duplicates". GROUP BY allows for aggregation. If all you want is to combine duplicated rows, use SELECT DISTINCT.
If you need to combine rows that are duplicate in some columns, use GROUP BY, but you need to specify what to do with the other columns. You can either omit them (by not listing them in the SELECT clause) or aggregate them (using functions like SUM, MIN, and AVG). For example:
SELECT watch.watch_id, COUNT(rec.street_number), MAX(watch.watch_date)
... GROUP by watch.watch_id
EDIT
The OP asked for some clarification.
Consider the "view" -- all the data put together by the FROMs and JOINs and the WHEREs -- call that V. There are two things you might want to do.
First, you might have completely duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 3
3 4 5
Then simply use DISTINCT
SELECT DISTINCT * FROM V;
a b c
- - -
1 2 3
3 4 5
Or, you might have partially duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 6
3 4 5
Those first two rows are "the same" in some sense, but clearly different in another sense (in particular, they would not be combined by SELECT DISTINCT). You have to decide how to combine them. You could discard column c as unimportant:
SELECT DISTINCT a,b FROM V;
a b
- -
1 2
3 4
Or you could perform some kind of aggregation on them. You could add them up:
SELECT a,b, SUM(c) "tot" FROM V GROUP BY a,b;
a b tot
- - ---
1 2 9
3 4 5
You could pick the smallest value:
SELECT a,b, MIN(c) "first" FROM V GROUP BY a,b;
a b first
- - -----
1 2 3
3 4 5
Or you could take the mean (AVG), the standard deviation (STD), and any of a bunch of other functions that take a bunch of values for c and combine them into one.
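The options for partially duplicate rows can be tried out together. Below is a small sketch that loads the three example rows into a table named V (a real table here instead of a view, for simplicity) using Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE V (a INTEGER, b INTEGER, c INTEGER)")
# The partially duplicate rows from the example above.
conn.executemany("INSERT INTO V VALUES (?, ?, ?)", [(1, 2, 3), (1, 2, 6), (3, 4, 5)])

# Option 1: discard column c and collapse what is left.
distinct_ab = conn.execute("SELECT DISTINCT a, b FROM V ORDER BY a, b").fetchall()

# Option 2: keep one row per (a, b) and say how to combine c.
sum_c = conn.execute(
    "SELECT a, b, SUM(c) AS tot FROM V GROUP BY a, b ORDER BY a, b"
).fetchall()
min_c = conn.execute(
    "SELECT a, b, MIN(c) AS first FROM V GROUP BY a, b ORDER BY a, b"
).fetchall()

# SELECT DISTINCT * leaves all three rows, since no two are fully identical.
distinct_all = conn.execute("SELECT DISTINCT * FROM V").fetchall()

print(distinct_ab, sum_c, min_c, len(distinct_all))
```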
What isn't really an option is just doing nothing. If you just list the ungrouped columns, the DBMS will either throw an error (Oracle does that -- the right choice, imo) or pick one value more or less at random (MySQL). But as Dr. Peart said, "When you choose not to decide, you still have made a choice."
While SELECT DISTINCT may indeed work in your case, it's important to note why what you have is not working.
You're selecting fields that are outside of the GROUP BY. Although MySQL allows this (when the ONLY_FULL_GROUP_BY SQL mode is disabled), the values it returns for the non-GROUP BY fields are undefined.
If you wanted to do this with a GROUP BY try something more like the following:
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN members usr ON rec.user_id = usr.user_id
)
WHERE watch.watch_id IN (
SELECT watch_id FROM watchlist WHERE user_id = 1
GROUP BY watch_id)
LIMIT 0, 25
I would never recommend using SELECT DISTINCT; it can be really slow on big datasets.
Try using things like EXISTS.
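As a hedged illustration of what an EXISTS-based query might look like, the sketch below selects each watched record once instead of joining and de-duplicating afterwards. It uses Python's sqlite3 module in place of MySQL; the street names, record 30, and the repeated watchlist entry (watch_id 15) are made-up sample data added to provoke a duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (record_id INTEGER PRIMARY KEY, street_name TEXT);
CREATE TABLE watchlist (
    watch_id INTEGER PRIMARY KEY, user_id INTEGER, record_id INTEGER);
INSERT INTO records VALUES (22, 'Main St'), (25, 'Oak Ave'), (30, 'Elm Rd');
-- watch_id 15 deliberately repeats record 22 for the same user
INSERT INTO watchlist VALUES (13, 1, 22), (14, 1, 25), (15, 1, 22);
""")

# A plain join returns one row per watchlist entry, so record 22 shows up twice.
joined = conn.execute("""
SELECT rec.record_id, rec.street_name
FROM watchlist watch
JOIN records rec ON rec.record_id = watch.record_id
WHERE watch.user_id = 1
ORDER BY rec.record_id
""").fetchall()

# EXISTS only asks whether a matching watchlist row is present,
# so every record is returned at most once.
watched = conn.execute("""
SELECT rec.record_id, rec.street_name
FROM records rec
WHERE EXISTS (
    SELECT 1 FROM watchlist watch
    WHERE watch.record_id = rec.record_id AND watch.user_id = 1
)
ORDER BY rec.record_id
""").fetchall()
print(joined, watched)
```

The trade-off is that columns from the watchlist table, such as watch_date, are no longer available in the outer SELECT, so this only fits when the record columns alone are needed.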
You are grouping by watch.watch_id and you have two results, which have different watch IDs, so naturally they would not be grouped.
Also, from the results displayed, they refer to different records, so that looks like a perfectly valid, expected result. If you are trying to select only distinct values, then you don't want to GROUP; you want to select distinct values:
SELECT DISTINCT ...
If you say your watchlist table is unique, then one (or both) of the other tables either (a) has duplicates, or (b) is not unique by the key you are using.
To suppress duplicates in your results, either use DISTINCT as @Laykes says, or try
GROUP BY watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
It sort of sounds like you expect all 3 tables to be unique by their keys, though. If that is the case, you are simply masking some other problem with your SQL by trying to retrieve distinct values.

How to create a mysql join query with hierarchical data

I need to create a join query for hierarchical data across 2 tables. These tables can have unlimited amounts of data and their structures are as follows:
group_id | group_name | group_order
1 | group 1 | 2
2 | group 2 | 1

field_id | field_name | parent_group | field_order
1 | field 1 | 1 | 1
2 | field 2 | 2 | 2
3 | field 3 | 2 | 1
I am currently able to get the correct format of data using 2 select queries with the second query inside a loop created from the results of the first query on the groups table.
The structure of the data I require from the result is as follows:
- group 2
  - field 3
  - field 2
- group 1
  - field 1
Is it possible to get these results from one MySQL query? I have read through the MySQL document on hierarchical data, but I am confused about how to incorporate the join.
Thanks for looking
You shouldn't need to think about it in terms of hierarchical data, you should just be able to select your fields and join on your group information. Try something like:
SELECT *
FROM Fields AS F
-- GROUPS is a reserved word as of MySQL 8.0.2, so the table name is quoted
INNER JOIN `Groups` AS G
ON G.group_id = F.parent_group
ORDER BY G.group_order, F.field_order
Then you will get each field as a row together with its group, already sorted by group order and then field order. Your loop should be able to handle the display you need.
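The single-query approach, plus the display loop, might look roughly like this. It is a sketch using Python's sqlite3 module as a stand-in for MySQL; the table names field_groups and fields are hypothetical (the question does not name its tables), and itertools.groupby is just one way to turn the flat rows into the nested listing.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE field_groups (
    group_id INTEGER PRIMARY KEY, group_name TEXT, group_order INTEGER);
CREATE TABLE fields (
    field_id INTEGER PRIMARY KEY, field_name TEXT,
    parent_group INTEGER, field_order INTEGER);
INSERT INTO field_groups VALUES (1, 'group 1', 2), (2, 'group 2', 1);
INSERT INTO fields VALUES
    (1, 'field 1', 1, 1), (2, 'field 2', 2, 2), (3, 'field 3', 2, 1);
""")

# One query: every field with its group, already in display order.
rows = conn.execute("""
SELECT g.group_name, f.field_name
FROM fields f
INNER JOIN field_groups g ON g.group_id = f.parent_group
ORDER BY g.group_order, f.field_order
""").fetchall()

# Fold consecutive rows of the same group into a nested listing.
lines = []
for group_name, members in groupby(rows, key=lambda row: row[0]):
    lines.append(f"- {group_name}")
    lines.extend(f"  - {field_name}" for _, field_name in members)
print("\n".join(lines))
```

Because of the inner join, a group without any fields would not be listed; a LEFT JOIN starting from the groups table would be needed if empty groups should still appear.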
one method
something that may convince you to change your db schema