I have a table that goes something like this:
samples
sample_id | field | value | list_id
1 | country | US | 10
2 | state | tx | 10
3 | country | US | 11
4 | state | tx | 11
5 | emp_size | 100 | 11
I have a query that retrieves list_ids 10 and 11 using the following code:
select * from samples where (field='country' and value='US') OR (field='state' and value='tx')
However, I realized later on that this is not the setup I want. Let's say I add (field='emp_size' and value='100') to my clause because I want to get list_id 11 only. The result still includes list_id 10 because I use OR in my query. Right now I'm not sure if there's a workaround for this using plain MySQL only, or if I should just manipulate the data using PHP.
Edit
For clarification, I want to get the list_ids that match all of the given parameters. If I ask for US and TX, it should return list_ids 10 and 11. But if I add another parameter, say emp_size, it should return only list_id 11.
You've got an EAV style data structure, so the best solution here is to self-join the table for each parameter/value combination that you are searching on.
SELECT s1.list_id
FROM samples s1
INNER JOIN samples s2
ON s1.list_id = s2.list_id
AND s2.field = 'state'
AND s2.value = 'tx'
INNER JOIN samples s3
ON s1.list_id = s3.list_id
AND s3.field = 'emp_size'
AND s3.value = '100'
WHERE s1.field = 'country'
AND s1.value = 'US';
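For comparison, here is a rough sketch of a different technique from the self-join above: keep the OR conditions from the question, then GROUP BY list_id and require that the number of distinct matched fields equals the number of criteria. It runs against an in-memory SQLite database through Python only so that it is easy to try; the helper name matching_list_ids is made up for this example, and it assumes each criterion uses a different field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (sample_id INTEGER PRIMARY KEY, field TEXT, value TEXT, list_id INTEGER)"
)
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?, ?)",
    [(1, "country", "US", 10), (2, "state", "tx", 10),
     (3, "country", "US", 11), (4, "state", "tx", 11), (5, "emp_size", "100", 11)],
)

def matching_list_ids(conn, criteria):
    """Return list_ids that satisfy every (field, value) pair in criteria.

    Hypothetical helper; assumes every pair names a different field.
    """
    where = " OR ".join("(field = ? AND value = ?)" for _ in criteria)
    params = [x for pair in criteria for x in pair]
    sql = (
        "SELECT list_id FROM samples WHERE " + where +
        " GROUP BY list_id HAVING COUNT(DISTINCT field) = ? ORDER BY list_id"
    )
    return [row[0] for row in conn.execute(sql, params + [len(criteria)])]

two = matching_list_ids(conn, [("country", "US"), ("state", "tx")])
three = matching_list_ids(conn, [("country", "US"), ("state", "tx"), ("emp_size", "100")])
print(two, three)  # [10, 11] [11]
```

The generated SELECT should work unchanged in MySQL. Its appeal is that the query shape stays the same no matter how many parameters are passed, whereas the self-join needs one extra join per parameter.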
First of all, I'm sorry for the title; it's difficult to explain what I'm trying to achieve.
I have 2 tables: a table for property records, and a table for the images uploaded for each property.
In my listing_details table I enter 1 record per property, with a unique ID and a property slug. I have a prop_gallery table where I can have hundreds of records that share the same property slug, so I can relate them back to my property.
I'm trying to write a query to pull the records from both tables, but I only want to show each property once. At the moment it loops through all the records in the gallery and shows the property once for every record there is in the gallery. Hope this makes sense?
My query is...
$listings = $db->query('
SELECT *
FROM listing_details
JOIN prop_gallery
ON prop_gallery.prop_gallery_id = listing_details.prop_slug
WHERE (prop_slug LIKE prop_gallery_id OR prop_gallery_id LIKE prop_slug)
AND listing_details.prop_mandate = 1'
)->fetchAll();
If there's a property called Liam's house, then there will be one record for it in listing_details, and if I've uploaded 10 pictures, there will be 10 records for it in prop_gallery.
When I loop through my results, this means I'm showing Liam's house 10 times, when I want to show it just once.
EDIT
Result of the above query
prop_id | prop_agent | prop_title | prop_slug | prop_mandate | id | prop_gallery_id | prop_gallery
37 | 2 | House in switzerland | house-in-switzerland | 1 | 4 | 6 | main1.png
37 | 2 | House in switzerland | house-in-switzerland | 1 | 4 | 6 | main2.png
37 | 2 | House in switzerland | house-in-switzerland | 1 | 4 | 6 | main3.png
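You can use the ROW_NUMBER() window function (available in MySQL 8.0 and later). It needs a column to sort the rows by within each property; I assumed listing_details has a column named recorded_at. Since every joined row for a property shares the same recorded_at, the gallery row that is kept is arbitrary; order by a prop_gallery column instead if you need a specific image.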
For example:
SELECT *
FROM (
  SELECT *,
    -- number the joined rows within each property; recorded_at is an assumed column
    row_number() over(partition by d.prop_slug order by d.recorded_at) as rn
  FROM listing_details d
  JOIN prop_gallery g
    ON g.prop_gallery_id = d.prop_slug
  WHERE (prop_slug LIKE prop_gallery_id OR prop_gallery_id LIKE prop_slug)
    AND d.prop_mandate = 1
) x
where rn = 1
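To see the idea in isolation, here is a small sketch run against an in-memory SQLite database through Python (window functions need SQLite 3.25 or later). The table layout is simplified, the join is on the slug as in the question's query, and ordering by the gallery's id column is my own assumption so that the first uploaded image is the one kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE listing_details (prop_id INTEGER, prop_title TEXT, prop_slug TEXT, prop_mandate INTEGER);
CREATE TABLE prop_gallery (id INTEGER PRIMARY KEY, prop_gallery_id TEXT, prop_gallery TEXT);
INSERT INTO listing_details VALUES (37, 'House in switzerland', 'house-in-switzerland', 1);
INSERT INTO prop_gallery (prop_gallery_id, prop_gallery) VALUES
    ('house-in-switzerland', 'main1.png'),
    ('house-in-switzerland', 'main2.png'),
    ('house-in-switzerland', 'main3.png');
""")

sql = """
SELECT prop_id, prop_title, prop_gallery
FROM (
    SELECT d.prop_id, d.prop_title, g.prop_gallery,
           -- rn restarts at 1 for every property; lowest gallery id comes first
           ROW_NUMBER() OVER (PARTITION BY d.prop_slug ORDER BY g.id) AS rn
    FROM listing_details d
    JOIN prop_gallery g ON g.prop_gallery_id = d.prop_slug
    WHERE d.prop_mandate = 1
) x
WHERE rn = 1
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [(37, 'House in switzerland', 'main1.png')]
```

Three gallery rows go in and one row per property comes out, which is the behaviour the question asks for.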
I have two tables whose IDs may not match. Names or addresses may also fail to match. I need to be able to filter not only on ID but also on first_name, last_name and street_1. I can do a JOIN on matching IDs, but sometimes the IDs match while other columns differ, and I need to show those rows as well.
Find IDs that do not match. If they do match, check whether any of the other fields do not match.
Here are my expected results:
id | first_name_2 | last_name_2 | street_1 | street_2
3 | Teresa | White | 834 Green Ridge Hill | 43 Arapahoe Park
6 | Rebecca | George | 39157 Nelson Hill | 7467 Acker Center
7 | Ann | Hawkins | 341 Tennessee Street | 8 Bunting Street
8 | Joyce | Moreno | 0277 Bunker Hill Drive | 6 Nancy Center
9 | Kimberly | Alvarez | 57332 Di Loreto Lane | 0437 Waubesa Avenue
IDs 3 and 6 are in the list because the last name does not match. ID 7 differs in both last name and street_1. IDs 8 and 9 are in the list because the IDs themselves do not match.
Here is my sample data for reference: http://sqlfiddle.com#!9/928568/2
I would do the following: left join the tables and treat NULLs as blank strings. Be aware that if a column legitimately contains an empty string (street_2, for example), this may return false positives:
SELECT *
FROM information I1
LEFT JOIN information_2 I2 ON I1.id = I2.id
WHERE ( I1.first_name_2 <> ifnull(I2.first_name_2, '')
OR I1.last_name_2 <> ifnull(I2.last_name_2, '')
OR I1.street_1 <> ifnull(I2.street_1, '')
OR I1.street_2 <> ifnull(I2.street_2, '')
);
Hi, I went through the sample data and I believe your requirement is to find all the rows in the first table that have no exact copy in the second table.
You can use the following SQL. I tested it on your fiddle and it gives the expected result:
SELECT
i.id, i.first_name_2, i.last_name_2, i.street_1, i.street_2
FROM
information i
LEFT JOIN
information_2 i2
ON
i.id=i2.id AND i.first_name_2=i2.first_name_2 AND i.last_name_2=i2.last_name_2
AND i.street_1=i2.street_1 AND i.street_2 = i2.street_2
where
i2.id is null
There is also a simpler way to do this if your database supports the MINUS set operator (Oracle's name for what standard SQL calls EXCEPT). Just write
SELECT * FROM information
MINUS
SELECT * FROM information_2
and you will get the same result.
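Both approaches can be checked side by side with a small sketch. It uses an in-memory SQLite database through Python, where the set operator is spelled EXCEPT. The three rows are made-up data, not the rows from the fiddle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = "id INTEGER, first_name_2 TEXT, last_name_2 TEXT, street_1 TEXT, street_2 TEXT"
conn.execute(f"CREATE TABLE information ({cols})")
conn.execute(f"CREATE TABLE information_2 ({cols})")
# Hypothetical sample rows, chosen to cover the three cases.
conn.executemany("INSERT INTO information VALUES (?, ?, ?, ?, ?)", [
    (1, "Ann", "Lee", "1 Main St", "Apt 2"),    # identical in both tables
    (2, "Bob", "Ray", "5 Oak Ave", "Unit 9"),   # last name differs
    (3, "Cy", "Poe", "7 Elm Rd", "Suite 4"),    # id missing from information_2
])
conn.executemany("INSERT INTO information_2 VALUES (?, ?, ?, ?, ?)", [
    (1, "Ann", "Lee", "1 Main St", "Apt 2"),
    (2, "Bob", "Roy", "5 Oak Ave", "Unit 9"),
])

# Anti-join: keep rows of information with no exact copy in information_2.
anti_join = conn.execute("""
    SELECT i.id FROM information i
    LEFT JOIN information_2 i2
      ON i.id = i2.id AND i.first_name_2 = i2.first_name_2
     AND i.last_name_2 = i2.last_name_2
     AND i.street_1 = i2.street_1 AND i.street_2 = i2.street_2
    WHERE i2.id IS NULL
    ORDER BY i.id
""").fetchall()

# Set difference: same question asked with EXCEPT.
except_rows = conn.execute("""
    SELECT id FROM (
        SELECT * FROM information
        EXCEPT
        SELECT * FROM information_2
    ) ORDER BY id
""").fetchall()
print(anti_join, except_rows)  # [(2,), (3,)] [(2,), (3,)]
```

One difference to keep in mind: if any compared column can be NULL, the two can disagree, because `=` never matches a NULL while the set operators treat two NULLs as the same value.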
I have coded a watchlist system. In the overview of a user's watchlist, they see a list of records; however, the list shows duplicates even though the database holds only the exact, correct number of rows.
I've tried GROUP BY watch.watch_id and GROUP BY rec.record_id, but none of the groupings I've tried seems to remove the duplicates. I'm not sure what I'm doing wrong.
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN members usr ON rec.user_id = usr.user_id
)
WHERE watch.user_id = 1
GROUP BY watch.watch_id
LIMIT 0, 25
The watchlist table looks like this:
+----------+---------+-----------+------------+
| watch_id | user_id | record_id | watch_date |
+----------+---------+-----------+------------+
| 13 | 1 | 22 | 1314038274 |
| 14 | 1 | 25 | 1314038995 |
+----------+---------+-----------+------------+
GROUP BY does not "remove duplicates". GROUP BY allows for aggregation. If all you want is to combine duplicated rows, use SELECT DISTINCT.
If you need to combine rows that are duplicates in some columns, use GROUP BY, but you need to specify what to do with the other columns. You can either omit them (by not listing them in the SELECT clause) or aggregate them (using functions like SUM, MIN, and AVG). For example:
SELECT watch.watch_id, COUNT(rec.street_number), MAX(watch.watch_date)
... GROUP by watch.watch_id
EDIT
The OP asked for some clarification.
Consider the "view" -- all the data put together by the FROMs and JOINs and the WHEREs -- call that V. There are two things you might want to do.
First, you might have completely duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 3
3 4 5
Then simply use DISTINCT
SELECT DISTINCT * FROM V;
a b c
- - -
1 2 3
3 4 5
Or, you might have partially duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 6
3 4 5
Those first two rows are "the same" in some sense, but clearly different in another sense (in particular, they would not be combined by SELECT DISTINCT). You have to decide how to combine them. You could discard column c as unimportant:
SELECT DISTINCT a,b FROM V;
a b
- -
1 2
3 4
Or you could perform some kind of aggregation on them. You could add them up:
SELECT a,b, SUM(c) "tot" FROM V GROUP BY a,b;
a b tot
- - ---
1 2 9
3 4 5
You could pick the smallest value:
SELECT a,b, MIN(c) "first" FROM V GROUP BY a,b;
a b first
- - -----
1 2 3
3 4 5
Or you could take the mean (AVG), the standard deviation (STD), and any of a bunch of other functions that take a bunch of values for c and combine them into one.
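If you want to try these options yourself, here is a minimal sketch that loads the three-row V from the second example into an in-memory SQLite database through Python and runs the queries above. V is created as a plain table here purely for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE V (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany("INSERT INTO V VALUES (?, ?, ?)", [(1, 2, 3), (1, 2, 6), (3, 4, 5)])

# Discard column c entirely.
distinct_ab = conn.execute("SELECT DISTINCT a, b FROM V ORDER BY a").fetchall()
# Collapse c by adding the values up.
totals = conn.execute("SELECT a, b, SUM(c) FROM V GROUP BY a, b ORDER BY a").fetchall()
# Collapse c by keeping the smallest value.
firsts = conn.execute("SELECT a, b, MIN(c) FROM V GROUP BY a, b ORDER BY a").fetchall()

print(distinct_ab)  # [(1, 2), (3, 4)]
print(totals)       # [(1, 2, 9), (3, 4, 5)]
print(firsts)       # [(1, 2, 3), (3, 4, 5)]
```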
What isn't really an option is just doing nothing. If you just list the ungrouped columns, the DBMS will either throw an error (Oracle does that -- the right choice, imo) or pick one value more or less at random (MySQL). But as Dr. Peart said, "When you choose not to decide, you still have made a choice."
While SELECT DISTINCT may indeed work in your case, it's important to note why what you have is not working.
You're selecting fields that are outside of the GROUP BY. Although MySQL allows this, the exact rows it returns for the non-GROUP BY fields is undefined.
If you wanted to do this with a GROUP BY try something more like the following:
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN est8_records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN est8_members usr ON rec.user_id = usr.user_id
)
WHERE watch.watch_id IN (
SELECT w.watch_id FROM watchlist w WHERE w.user_id = 1
GROUP BY w.watch_id)
LIMIT 0, 25
I would never recommend using SELECT DISTINCT; it can be really slow on big datasets.
Try using something like EXISTS instead.
You are grouping by watch.watch_id and you have two results, which have different watch IDs, so naturally they would not be grouped.
Also, the results displayed refer to different records, so that looks like a perfectly valid, expected result. If you are trying to select only distinct values, then you don't want to GROUP; you want to select distinct values:
SELECT DISTINCT ...
If you say your watchlist table is unique, then one (or both) of the other tables either (a) has duplicates, or (b) is not unique by the key you are using.
To suppress duplicates in your results, either use DISTINCT as @Laykes says, or try
GROUP BY watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
It sort of sounds like you expect all 3 tables to be unique by their keys, though. If that is the case, you are simply masking some other problem with your SQL by trying to retrieve distinct values.
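A quick way to check that theory is to look for join keys that occur more than once. The sketch below uses an in-memory SQLite database through Python; the watchlist rows are the ones from the question, while the records rows (including the duplicated record_id 22) are invented to reproduce the symptom.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE watchlist (watch_id INTEGER, user_id INTEGER, record_id INTEGER, watch_date INTEGER);
CREATE TABLE records (record_id INTEGER, city TEXT);
INSERT INTO watchlist VALUES (13, 1, 22, 1314038274), (14, 1, 25, 1314038995);
-- Hypothetical data: record_id 22 appears twice, so it is not a unique key.
INSERT INTO records VALUES (22, 'Austin'), (22, 'Austin'), (25, 'Dallas');
""")

join_sql = """
    SELECT {select} watch.watch_id, rec.city
    FROM watchlist watch
    LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
    WHERE watch.user_id = 1
    ORDER BY watch.watch_id
"""
plain = conn.execute(join_sql.format(select="")).fetchall()
deduped = conn.execute(join_sql.format(select="DISTINCT")).fetchall()

# Diagnostic: which join keys are duplicated in records?
dupe_keys = conn.execute(
    "SELECT record_id, COUNT(*) FROM records GROUP BY record_id HAVING COUNT(*) > 1"
).fetchall()

print(plain)      # [(13, 'Austin'), (13, 'Austin'), (14, 'Dallas')]
print(deduped)    # [(13, 'Austin'), (14, 'Dallas')]
print(dupe_keys)  # [(22, 2)]
```

DISTINCT hides the extra row, but the diagnostic query is what tells you where it came from. Running the same GROUP BY ... HAVING COUNT(*) > 1 check against members.user_id would cover the other join.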
I have a MySQL database where one column contains status codes. The column is of type int and the values will only ever be 100,200,300,400. It looks like below; other columns removed for clarity.
id | status
----------------
1 300
2 100
3 100
4 200
5 200
6 300
7 100
8 400
9 300
10 300
11 100
12 400
13 400
14 400
15 300
16 300
The id field is auto-generated and will always be sequential. I want to have a third column displaying a comma-separated string of the frequency distribution of the status codes of the previous 10 rows. It should look like this.
id | status | freq
-----------------------------------
1 300
2 100
3 100
4 200
5 200
6 300
7 100
8 400
9 300
10 300
11 100 300,100,200,400 -- from rows 1-10
12 400 100,300,200,400 -- from rows 2-11
13 400 100,300,200,400 -- from rows 3-12
14 400 300,400,100,200 -- from rows 4-13
15 300 400,300,100,200 -- from rows 5-14
16 300 300,400,100 -- from rows 6-15
I want the most frequent code listed first. And where two status codes have the same frequency it doesn't matter to me which is listed first but I did list the smaller code before the larger in the example. Lastly, where a code doesn't appear at all in the previous ten rows, it shouldn't be listed in the freq column either.
And to be very clear the row number that the frequency string appears on does NOT take into account the status code of that row; it's only the previous rows.
So what have I done? I'm pretty green with SQL. I'm a programmer and I find this SQL language a tad odd to get used to. I managed the following self-join select statement.
select *, avg(b.status) freq
from sample a
join sample b
on (b.id < a.id) and (b.id > a.id - 11)
where a.id > 10
group by a.id;
Using the aggregate function avg, I can at least demonstrate the concept. The derived table b provides the correct rows to the avg function but I just can't figure out the multi-step process of counting and grouping rows from b to get a frequency distribution and then collapse the frequency rows into a single string value.
Also I've tried using standard stored functions and procedures in place of the built-in aggregate functions, but it seems the b derived table is out of scope or something. I can't seem to access it. And from what I understand writing a custom aggregate function is not possible for me as it seems to require developing in C, something I'm not trained for.
Here's sql to load up the sample.
create table sample (
id int NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
status int
);
insert into sample(status) values(300),(100),(100),(200),(200),(300)
,(100),(400),(300),(300),(100),(400),(400),(400),(300),(300),(300)
,(100),(400),(100),(100),(200),(500),(300),(100),(400),(200),(100)
,(500),(300);
The sample has 30 rows of data to work with. I know it's a long question, but I just wanted to be as detailed as I could be. I've worked on this for a few days now and would really like to get it done.
Thanks for your help.
The only way I know of to do what you're asking is to use a BEFORE INSERT trigger. It has to be BEFORE INSERT because you want to update a value in the row being inserted, which can only be done in a BEFORE trigger. Unfortunately, that also means it won't have been assigned an ID yet, so hopefully it's safe to assume that at the time a new record is inserted, the last 10 records in the table are the ones you're interested in. Your trigger will need to get the values of the last 10 ID's and use the GROUP_CONCAT function to join them into a single string, ordered by the COUNT. I've been using SQL Server mostly and I don't have access to a MySQL server at the moment to test this, but hopefully my syntax will be close enough to at least get you moving in the right direction:
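DELIMITER //
CREATE TRIGGER sample_trigger BEFORE INSERT ON sample
FOR EACH ROW
BEGIN
  DECLARE _freq VARCHAR(50);
  -- Take the 10 most recent rows first, then count each status among them.
  SELECT GROUP_CONCAT(tbl.status ORDER BY tbl.Occurrences DESC, tbl.status) INTO _freq
  FROM (SELECT last10.status, COUNT(*) AS Occurrences
        FROM (SELECT status FROM sample ORDER BY id DESC LIMIT 10) AS last10
        GROUP BY last10.status) AS tbl;
  -- Assumes the sample table has a freq column to store the string in.
  SET NEW.freq = _freq;
END//
DELIMITER ;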
SELECT id, GROUP_CONCAT(status ORDER BY freq desc) FROM
(SELECT a.id as id, b.status, COUNT(*) as freq
FROM
sample a
JOIN
sample b ON (b.id < a.id) AND (b.id > a.id - 11)
WHERE
a.id > 10
GROUP BY a.id, b.status) AS sub
GROUP BY id;
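As a cross-check on what the freq column should contain, here is a small sketch that computes the same string in plain Python instead of SQL. It uses the first 16 status values from the question's insert statement, assumes ids run 1, 2, 3, ... with no gaps, and breaks ties by listing the smaller code first, as in the question's example. The function name freq_string is made up for this example.

```python
from collections import Counter

# First 16 status values from the question's INSERT statement (ids 1..16).
statuses = [300, 100, 100, 200, 200, 300, 100, 400, 300, 300,
            100, 400, 400, 400, 300, 300]

def freq_string(statuses, row_id, window=10):
    """Frequency string for the `window` rows before 1-based row_id, or '' if too few."""
    if row_id <= window:
        return ""
    previous = statuses[row_id - 1 - window:row_id - 1]
    counts = Counter(previous)
    # Most frequent first; ties broken by the smaller status code.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    return ",".join(str(s) for s in ordered)

freqs = {row_id: freq_string(statuses, row_id) for row_id in range(1, len(statuses) + 1)}
print(freqs[11])  # 300,100,200,400
print(freqs[14])  # 300,400,100,200
print(freqs[16])  # 300,400,100
```

These match the expected freq values for rows 11, 14 and 16 in the question, so they can serve as reference output when testing the SQL.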
Firstly, I'd like to apologize for the potentially misleading title. I am finding it difficult to describe what I am trying to do here.
With the current project I'm working on, we have setup a 'dynamic' database structure with MySQL that looks something like this.
item_details ( Describes the item_data )
fieldID | fieldValue | fieldCaption
1 | addr1 | Address Line 1
2 | country | Country
item_data
itemID | fieldID | fieldValue
12345 | 1 | Some Random Address
12345 | 2 | United Kingdom
So as you can see, if for example I wanted to lookup the address for the item 12345 I would simply do the statement.
SELECT fieldValue FROM item_data WHERE fieldID=1 and itemID=12345;
But here is where I am stuck... the database is relatively large, with around 80k rows, and I am trying to create a set of search functions within PHP.
I would like to be able to perform a query on the result set of a query as quickly as possible...
For example, search for an address within a certain country, i.e. search the fieldValue of the rows that have the same itemIDs as the results of this query:
SELECT itemID FROM item_data WHERE fieldID=2 AND fieldValue='United Kingdom';
Sorry If I am unclear, I have been struggling with this for the past couple of days...
Cheers
You can do this in a couple of ways. One is to use multiple joins to the item_data table with the fieldID limited to whatever it is you want to get.
SELECT *
FROM
  Item i -- assumes a table with one row per itemID
  INNER JOIN item_data country
    ON i.itemID = country.itemID
    AND country.fieldID = 2
  INNER JOIN item_data address
    ON i.itemID = address.itemID
    AND address.fieldID = 1
WHERE
  country.fieldValue = 'United Kingdom'
  AND address.fieldValue = 'Whatever'
As an aside, this structure is often referred to as an Entity-Attribute-Value (EAV) database.
Sorry in advance if this sounds patronizing, but (as you suggested) I'm not quite clear what you are asking for.
If you are looking for one query to do the whole thing, you could simply nest them. For your example, pretend there is a table named CACHED with the results of your UK query, and write the query you want against that, but replace CACHED with your UK query.
If the idea is that you have ALREADY done this UK query and want to (re-)use its results, you could save the results to a table in the DB (which may not be practical if there are a large number of queries executed), or save the list of IDs as text and paste that into the subsequent query (...WHERE ID in (...) ... ), which might be OK if your 'cached' query gives you a manageable fraction of the original table.
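To make the nesting idea concrete, here is a rough sketch using an in-memory SQLite database through Python. The UK query from the question becomes a subquery inside IN (...), and the outer query searches the address field of only those items. The second item (67890) and the '%Random%' search term are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item_data (itemID INTEGER, fieldID INTEGER, fieldValue TEXT)")
conn.executemany("INSERT INTO item_data VALUES (?, ?, ?)", [
    (12345, 1, "Some Random Address"), (12345, 2, "United Kingdom"),
    (67890, 1, "Another Random Address"), (67890, 2, "France"),  # hypothetical second item
])

sql = """
    SELECT itemID, fieldValue
    FROM item_data
    WHERE fieldID = 1
      AND fieldValue LIKE ?
      -- the UK query from the question, nested as a subquery
      AND itemID IN (SELECT itemID FROM item_data
                     WHERE fieldID = 2 AND fieldValue = ?)
"""
uk_addresses = conn.execute(sql, ("%Random%", "United Kingdom")).fetchall()
print(uk_addresses)  # [(12345, 'Some Random Address')]
```

Both addresses contain "Random", but only the item whose country is United Kingdom comes back. The same SELECT should run as-is in MySQL; from PHP you would bind the two parameters in a prepared statement rather than concatenating them into the string.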