Get most unique combinations of 2 pictures - mysql

I have a MySQL table that holds n pictures.
+------------+--------------+
| picture_id | picture_name |
+------------+--------------+
| 1 | ben.jpg |
| 2 | nick.jpg |
| 3 | mark.jpg |
| 4 | james.jpg |
| .. | ... |
| n | abraham.jpg |
+------------+--------------+
For a web application, I need to display 2 pictures simultaneously, where the user can vote for one picture or the other. After voting, the user gets a new set of two pictures.
(application user interface)
+---------------------+--------------------+
| Vote for picture 1  | Vote for picture 2 |
+---------------------+--------------------+
I would like to avoid displaying the same combinations as much as possible. I can create a helper table that will hold all possible combinations.
+----------------+--------------+--------------+
| combination_id | picture_id_1 | picture_id_2 |
+----------------+--------------+--------------+
| 1 | 1 | 2 |
| 2 | 1 | 3 |
| 3 | 1 | 4 |
| 4 | 1 | 5 |
| .. | .. | .. |
| (n^2-n)/2 | .. | .. |
+----------------+--------------+--------------+
But for 100 pictures, that would be (100^2 - 100)/2 = 4950 rows, and the table grows quadratically as pictures are added (which is not a big issue with today's computing, I suppose).
But how do I query this table so that the user sees as few duplicates as possible?
Expected result:
run 1: picture_id's = 4,5 (any numbers between 1 and n)
run 2: picture_id's = 2,7
run 3: picture_id's = 5 and 20
...

DEMO: http://rextester.com/VNWIOA4679 (100 sample pictures added). The query takes about 2 seconds for 1 user without any indexes.
I see no need for a helper table, as the data can easily be constructed on the fly with proper indexes. At 1000 pictures you're looking at 499,500 combinations a user could vote on, which is still easily managed within a database because we operate at the set level, not the record level.
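The row counts quoted in this thread follow from the number of unordered pairs, n(n-1)/2. A minimal sketch that checks the arithmetic; the function name `pair_count` is made up for illustration:

```python
from math import comb


def pair_count(n: int) -> int:
    """Number of unordered pairs of distinct pictures: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for n in (100, 1000):
    # math.comb(n, 2) computes the same quantity and serves as a cross-check.
    assert pair_count(n) == comb(n, 2)
    print(n, "pictures ->", pair_count(n), "pairs")
```

Because the count grows with the square of n, doubling the number of pictures roughly quadruples the number of pairs.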
Here's one way, assuming my own table structures. I can't think of a more efficient way to store and process the data.
With this approach, as new pictures are added the query generates a larger and larger combination set but always excludes the pairs the user has already voted on. There are no code changes for new pictures and no regenerating of sets; each run simply processes the pairs the user hasn't chosen between yet.
Create table SO46205797_Pics(
PICID int);
Insert into SO46205797_Pics values (1);
Insert into SO46205797_Pics values (2);
Insert into SO46205797_Pics values (3);
Insert into SO46205797_Pics values (4);
Insert into SO46205797_Pics values (5);
Insert into SO46205797_Pics values (6);
Insert into SO46205797_Pics values (7);
Create table SO46205797_UserPicResults (
USERID int,
PICID int,
PICID2 int,
PICChoiceID int);
Insert into SO46205797_UserPicResults values (1,1,2,1);
Insert into SO46205797_UserPicResults values (1,1,3,1);
Insert into SO46205797_UserPicResults values (1,1,4,4);
The above was just data setup; the real work happens in this query:
SELECT A.PICID, B.PICID, C.PICChoiceID
FROM SO46205797_Pics A
INNER JOIN SO46205797_Pics B
on A.PICID < B.PICID
LEFT JOIN SO46205797_UserPicResults C
on A.PICID = C.PicID
and B.PICID = C.PICID2
and C.USERID = 1
WHERE C.userID is null;
Note that if we remove the C.userID is null condition, we see all possible combinations of two photos for user 1, along with which one the user selected (I treat IDs 1,2 the same as IDs 2,1, which I think you want). Since we don't want to display a pair the user has already voted on, we use C.userID is null to exclude those combinations.
Also, when saving data to SO46205797_UserPicResults, you need to ensure PICID is always less than PICID2.
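One way to guarantee that ordering is to normalise the pair in application code before the row is written. This is only a sketch; `normalize_pair` is a hypothetical helper, not something from the answer:

```python
from typing import Tuple


def normalize_pair(pic_a: int, pic_b: int) -> Tuple[int, int]:
    """Return the two picture ids as (smaller, larger).

    Storing every vote in this order means (1, 2) and (2, 1) map to the
    same row key, which the A.PICID < B.PICID join condition relies on.
    """
    if pic_a == pic_b:
        raise ValueError("a pair needs two different pictures")
    return (min(pic_a, pic_b), max(pic_a, pic_b))


print(normalize_pair(4, 1))
```

The chosen picture is stored separately (PICChoiceID in the tables above), so reordering the pair does not lose which picture won.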
Obviously, an index on USERID, PICID, PICID2 (in that order) would be beneficial for SO46205797_UserPicResults (I'd probably make it a combined PK), as would making PICID the PK of SO46205797_Pics.
A different way to do this is with NOT EXISTS, which may be slightly faster:
SELECT A.PICID, B.PICID
FROM SO46205797_Pics A
INNER JOIN SO46205797_Pics B
on A.PICID < B.PICID
WHERE not exists (SELECT *
FROM SO46205797_UserPicResults C
WHERE A.PICID = C.PicID
and B.PICID = C.PICID2
and C.USERID = 1);
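To try the NOT EXISTS query end to end, here is a minimal sketch that runs it with Python's built-in sqlite3 module. SQLite is used only as a stand-in for MySQL so the example is self-contained, and picking the next pair with `random.choice` is an assumption about how the application might use the result:

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SO46205797_Pics (PICID INTEGER PRIMARY KEY);
    CREATE TABLE SO46205797_UserPicResults (
        USERID INTEGER, PICID INTEGER, PICID2 INTEGER, PICChoiceID INTEGER,
        PRIMARY KEY (USERID, PICID, PICID2)
    );
    INSERT INTO SO46205797_Pics VALUES (1), (2), (3), (4), (5), (6), (7);
    INSERT INTO SO46205797_UserPicResults VALUES (1, 1, 2, 1), (1, 1, 3, 1), (1, 1, 4, 4);
""")


def unseen_pairs(conn, user_id):
    """All pairs (smaller id, larger id) the given user has not voted on."""
    return conn.execute("""
        SELECT A.PICID, B.PICID
        FROM SO46205797_Pics A
        JOIN SO46205797_Pics B ON A.PICID < B.PICID
        WHERE NOT EXISTS (
            SELECT 1
            FROM SO46205797_UserPicResults C
            WHERE C.USERID = ? AND C.PICID = A.PICID AND C.PICID2 = B.PICID
        )
        ORDER BY A.PICID, B.PICID
    """, (user_id,)).fetchall()


pairs = unseen_pairs(conn, 1)
# 7 pictures give 21 pairs; user 1 has voted on 3 of them, so 18 remain.
print(len(pairs))
# Show a random unseen pair next (None once the user has seen everything).
next_pair = random.choice(pairs) if pairs else None
print(next_pair)
```

For large picture sets it would probably be better to pick the random pair inside the query rather than fetching every unseen pair, but that choice depends on the application.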
I considered maintaining a parent/child relationship for each image for each user; but this approach doesn't store the choices for all combinations.

The goal of this application is to let people vote for one picture against another, right? Then you need to have some kind of vote results table:
vote_results:
| vote_id | user_id | vote_up_picture_id | vote_down_picture_id | ...
Then, based on the data in this table, you can easily show a user picture pairs that they haven't seen yet:
select first.picture_id, second.picture_id
from pictures as first, pictures as second
where not exists(
select * from vote_results v
where v.user_id = 1 -- id of the current user (example value)
and ((v.vote_up_picture_id = first.picture_id and v.vote_down_picture_id = second.picture_id)
or (v.vote_up_picture_id = second.picture_id and v.vote_down_picture_id = first.picture_id))
) and first.picture_id != second.picture_id
order by rand()
limit 1
PS. As you can see, there is no need for a helper table with combination_id.


search data location wise in PHP SQL

I have parent and daughter location IDs in a MySQL database, where each daughter is linked to its parent. I will show the database design below.
I am able to fetch the data when I search by a daughter location ID, but I don't know how to combine the daughters' values when I select the parent location.
For example -
MainLocation (123) //total stock 23+10+56= 89
|
|
|---- DaughterLoc1 (456) //suppose stock 23
|
|---- DaughterLoc2 (789) //suppose stock 10 and total stock 10+56 = 66
|
|
|---DaughterLocA (963) //suppose stock 56
SQL : SELECT stock FROM table WHERE location = '456'
OUTPUT = 23 (Correct)
But when I search for location 123, I want the output to be 89.
My table design is like this below -
table: LocParent
-------------------------
| ID | stock | loc_id |
-------------------------
| 1 | 10 | 789 |
-------------------------
`location`
--------------------------------------------------------------------------------
| ID | main_loc | main_loc_id | loc_under | loc_under_id | stock |
--------------------------------------------------------------------------------
| 1 | MainLocation | 123 | DaughterLoc1 | 456 | 23 |
--------------------------------------------------------------------------------
| 2 | MainLocation | 123 | DaughterLoc2 | 789 | 10 |
--------------------------------------------------------------------------------
It is hard to tell from your sample structure what things actually look like still, and it is further complicated by multiple things called an "id". But, generally speaking, if your depth is finite, you can make small sub-queries, and if your depth is infinite (or unbound) you can make a recursive query.
Here is a sample database. It doesn't match yours, but hopefully it still makes sense. If it doesn't, it would help if you provided an actual schema and data (excluding irrelevant columns).
This table is self-referencing to make things easier for demo.
CREATE TABLE sample
(
id int AUTO_INCREMENT NOT NULL,
parent_id INT NULL,
stock int NOT NULL,
PRIMARY KEY (`id`),
CONSTRAINT FOREIGN KEY (`parent_id`) REFERENCES `sample` (`id`)
);
And here's some sample data. There are two records that are "root" and don't have parent values (IDs 1 and 5), two child values (IDs 2 and 3) and one grandchild value (ID 4)
INSERT INTO sample VALUES (1, null, 11);
INSERT INTO sample VALUES (2, 1, 22);
INSERT INTO sample VALUES (3, 1, 33);
INSERT INTO sample VALUES (4, 2, 4);
INSERT INTO sample VALUES (5, null, 55);
Finite/bound
If you have a finite/bound depth, you can make use of subqueries like the below. This one goes to a depth of 3 and sums to 70. Hopefully it is fairly easy to read, but I've included a couple of comments.
SELECT
s.id,
s.stock -- root
+
(
(
SELECT
COALESCE(SUM(c.stock), 0) -- child (0 when there are no children)
FROM
sample c
WHERE
c.parent_id = s.id
)
+
(
SELECT
COALESCE(SUM(p.stock), 0) -- grandchild (0 when there are no grandchildren)
FROM
sample c
JOIN
sample p
ON
p.parent_id = c.id
WHERE
c.parent_id = s.id
)
)
as three_level_sum
FROM
sample s
WHERE
s.id = 1;
Infinite/unbound
If you have an infinite hierarchy, however, things get more complicated. MySQL (8.0 and later) and other database platforms have a thing called "Common Table Expressions" (CTEs) that allow you to make recursive queries. These can be harder to wrap your head around because of the recursion, but it basically does the same as the previous version, just with unlimited depth. This version also returns the sum of 70.
WITH RECURSIVE sample_rec AS
(
SELECT
id AS root_id,
id,
parent_id,
stock
FROM
sample
WHERE
parent_id IS NULL
UNION ALL
SELECT
R.root_id,
E.id,
E.parent_id,
E.stock
FROM
sample E
INNER JOIN
sample_rec R
ON
E.parent_id = R.id
)
SELECT
SUM(stock)
FROM
sample_rec
WHERE
root_id = 1
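For a quick check of the recursive approach, the sketch below loads the same sample rows into SQLite through Python's built-in sqlite3 module (a stand-in for MySQL, chosen only to keep the example self-contained). Unlike the query above, this variant starts the recursion at whichever node is asked for, so it can also total a daughter location; the function name `total_stock` is made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES sample (id),
        stock INTEGER NOT NULL
    );
    INSERT INTO sample VALUES
        (1, NULL, 11), (2, 1, 22), (3, 1, 33), (4, 2, 4), (5, NULL, 55);
""")


def total_stock(conn, start_id):
    """Sum the stock of start_id and all of its descendants, at any depth."""
    row = conn.execute("""
        WITH RECURSIVE subtree (id, stock) AS (
            -- anchor: the node we start from
            SELECT id, stock FROM sample WHERE id = ?
            UNION ALL
            -- recursive step: children of rows already in the subtree
            SELECT s.id, s.stock
            FROM sample s
            JOIN subtree t ON s.parent_id = t.id
        )
        SELECT COALESCE(SUM(stock), 0) FROM subtree
    """, (start_id,)).fetchone()
    return row[0]


for node in (1, 2, 5):
    print(node, total_stock(conn, node))
```

Anchoring the recursion on the requested id rather than on every root keeps the working set to the one subtree being summed.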

MYSQL - Select Unique Common Columns between two tables - Most Efficient Query

I have two tables:
db_contacts
Phone | Name | Last_Name
--------------------
111 | Foo | Foo
222 | Bar | Bar
333 | John | Smith
444 | Tomy | Smith
users_contacts
User_ID | Phone
--------------------
1 | 123
1 | 111
2 | 222
2 | 333
3 | 111
3 | 333
4 | 444
Notice from above that:
User with ID 2 is the only one that has the phone number 222
User with ID 4 is the only one that has the phone number 444
I need to obtain these results with a MySQL query.
In other words: how can I select all the users that have a unique phone number, on the condition that this number exists in db_contacts?
I need my end result to be something like that:
User_ID | Phone | Name | Last_Name
------------------------------------
2 | 222 | Bar | Bar
4 | 444 | Tomy | Smith
PS: There is no Foreign key between the Phone columns, as a User can have a phone that is not in the db_contacts.
In real life, db_contacts contains about 1 million records and users_contacts about 5 million records.
What I tried, which takes a long time to execute:
SELECT *
FROM users_contacts
WHERE users_contacts.phone IN (
SELECT users_contacts.phone
FROM `users_contacts`
JOIN db_contacts ON db_contacts.phone = users_contacts.phone
GROUP BY users_contacts.phone
HAVING COUNT(users_contacts.phone) = 1
)
Update:
Thank you for your replies, I have provided my solution that fits my case perfectly.
I think you want:
select uc.*
from users_contacts uc
where not exists (select 1
from users_contacts uc2
where uc2.phone = uc.phone and uc2.user_id <> uc.user_id
);
For performance, you want an index on users_contacts(phone, user_id).
Another method is:
select max(user_id) as user_id, phone
from users_contacts
group by phone
having count(*) = 1;
The not exists version is probably going to be faster.
I would use a simple JOIN with a NOT EXISTS condition. This is usually the most efficient way to check that something has no duplicates; compared to your solution, it has the advantage of avoiding aggregation.
SELECT uc.User_ID, dc.*
FROM users_contacts uc
INNER JOIN db_contacts dc ON uc.Phone = dc.Phone
WHERE NOT EXISTS (
SELECT 1
FROM users_contacts uc1
WHERE uc1.Phone = dc.Phone AND uc1.User_ID != uc.User_ID
)
Hint: consider setting the following indexes:
users_contacts(Phone, User_ID)
db_contacts(Phone)
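The JOIN plus NOT EXISTS approach can be checked against the question's sample data. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so it runs anywhere); the index name is made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE db_contacts (Phone TEXT PRIMARY KEY, Name TEXT, Last_Name TEXT);
    CREATE TABLE users_contacts (User_ID INTEGER, Phone TEXT);
    -- composite index so the NOT EXISTS probe can be answered from the index
    CREATE INDEX idx_users_contacts_phone_user ON users_contacts (Phone, User_ID);

    INSERT INTO db_contacts VALUES
        ('111', 'Foo', 'Foo'), ('222', 'Bar', 'Bar'),
        ('333', 'John', 'Smith'), ('444', 'Tomy', 'Smith');
    INSERT INTO users_contacts VALUES
        (1, '123'), (1, '111'), (2, '222'), (2, '333'),
        (3, '111'), (3, '333'), (4, '444');
""")

rows = conn.execute("""
    SELECT uc.User_ID, dc.Phone, dc.Name, dc.Last_Name
    FROM users_contacts uc
    JOIN db_contacts dc ON dc.Phone = uc.Phone
    WHERE NOT EXISTS (
        SELECT 1
        FROM users_contacts other
        WHERE other.Phone = uc.Phone AND other.User_ID <> uc.User_ID
    )
    ORDER BY uc.User_ID
""").fetchall()

for row in rows:
    print(row)
```

Phone 123 belongs to a single user but is dropped by the join because it is not in db_contacts, which matches the expected result in the question.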
First, I would like to thank everyone who posted solutions; they all worked.
But response times were crucial for me, and the solutions provided took a couple of seconds to execute, which was too long.
In case anyone has a similar problem: I ended up creating a new table called users_unique_contacts, with an AFTER INSERT trigger on users_contacts that checks whether the newly created contact already exists in users_unique_contacts. If it doesn't exist, the trigger adds it; otherwise it removes it, since that means the number is not unique anymore.
My trigger went like this:
BEGIN
IF EXISTS (SELECT 1 = 1 FROM users_unique_contacts WHERE phone = new.phone LIMIT 1) THEN
BEGIN
DELETE FROM users_unique_contacts WHERE phone = new.phone LIMIT 1;
END;
ELSE
BEGIN
INSERT INTO users_unique_contacts (user_id,phone) VALUES (new.user_id, new.phone);
END;
END IF;
END
Now every time I want the unique numbers of a user, I query users_unique_contacts, and the execution time is a matter of milliseconds.

SQL join category from timeline into timestamp table

Consider the following structure:
create table timestamps(id int, stamp timestamp);
insert into timestamps values
(1,'2017-10-01 10:05:01'),
(2,'2017-10-01 11:05:01'),
(3,'2017-10-01 12:05:01'),
(4,'2017-10-01 13:05:01');
create table category_timeline(begin timestamp,end timestamp, category varchar(100));
insert into category_timeline values
('2017-10-01 10:01:03','2017-10-01 12:01:03','Cat1'),
('2017-10-01 12:01:03','2017-10-01 12:42:43','Cat3'),
('2017-10-01 12:42:43','2017-10-01 14:01:03','Cat2');
Sqlfiddle of same: SQL Fiddle
I have two tables, one (timestamps) containing timestamps, and one (category_timeline) containing a timeline of categories, that is, we assume the records in category_timeline form a continuous non-overlapping timeline assigning a category to each time period.
I want to assign the categories to the timestamps table, resulting in:
| id | stamp | category |
|----|----------------------|----------|
| 1 | 2017-10-01T10:05:01Z | Cat1 |
| 2 | 2017-10-01T11:05:01Z | Cat1 |
| 3 | 2017-10-01T12:05:01Z | Cat3 |
| 4 | 2017-10-01T13:05:01Z | Cat2 |
which is the result of the following query:
SELECT id, stamp, category FROM timestamps ts
LEFT JOIN category_timeline tl
ON ts.stamp >= tl.begin
AND ts.stamp < tl.end
However, as soon as the tables get bigger, this operation seems to get exponentially slower. Is there a better way to do this, using the assumption that any timestamp falls within exactly one period in the other table?
I would suggest this approach:
SELECT ts.id, ts.stamp,
(SELECT tl.category
FROM category_timeline tl
WHERE tl.end > ts.stamp
ORDER BY tl.end ASC
LIMIT 1
) as category
FROM timestamps ts ;
Be sure you have an index on category_timeline(end, category).
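The correlated-subquery lookup can be tried on the question's data with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the columns are renamed to `period_begin` and `period_end` (hypothetical names) because `begin` and `end` are keywords in SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE timestamps (id INTEGER PRIMARY KEY, stamp TEXT);
    CREATE TABLE category_timeline (period_begin TEXT, period_end TEXT, category TEXT);
    -- index on the period end, with category included so the lookup is covered
    CREATE INDEX idx_timeline_end ON category_timeline (period_end, category);

    INSERT INTO timestamps VALUES
        (1, '2017-10-01 10:05:01'), (2, '2017-10-01 11:05:01'),
        (3, '2017-10-01 12:05:01'), (4, '2017-10-01 13:05:01');
    INSERT INTO category_timeline VALUES
        ('2017-10-01 10:01:03', '2017-10-01 12:01:03', 'Cat1'),
        ('2017-10-01 12:01:03', '2017-10-01 12:42:43', 'Cat3'),
        ('2017-10-01 12:42:43', '2017-10-01 14:01:03', 'Cat2');
""")

# For each stamp, take the first period that ends after it. This relies on
# the question's assumption of a continuous, non-overlapping timeline.
rows = conn.execute("""
    SELECT ts.id, ts.stamp,
           (SELECT tl.category
            FROM category_timeline tl
            WHERE tl.period_end > ts.stamp
            ORDER BY tl.period_end ASC
            LIMIT 1) AS category
    FROM timestamps ts
    ORDER BY ts.id
""").fetchall()

categories = [category for _id, _stamp, category in rows]
print(categories)
```

The timestamps are stored as ISO-formatted text here, which compares correctly as strings; in MySQL the native timestamp columns from the question would be used instead.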

Reorder rows in a MySQL table

I have a table:
+--------+-------------------+-----------+
| ID | Name | Order |
+--------+-------------------+-----------+
| 1 | John | 1 |
| 2 | Mike | 3 |
| 3 | Daniel | 4 |
| 4 | Lisa | 2 |
| 5 | Joe | 5 |
+--------+-------------------+-----------+
The order can be changed by an admin, hence the Order column. On the admin side I have a form with an "Insert After:" select box for adding entries to the database. What query should I use to increment the order of every row after the inserted one?
I want to do this in a way that keeps server load to a minimum, because this table has 1200 rows at present. Is this the correct way to save the order of a table, or is there a better way?
Any help appreciated.
EDIT:
Here's what I want to do, thanks to itsmatt:
want to reorder row number 1 to be after row 1100, you plan to leave 2-1100 the same and then modify 1 to be 1101 and increment 1101-1200
You need to do this in two steps:
UPDATE MyTable
SET `Order` = `Order` + 1
WHERE `Order` > (SELECT `Order`
FROM MyTable
WHERE ID = <insert-after-id>);
...which will shift the order number of every row further down the list than the person you're inserting after.
Then:
INSERT INTO MyTable (Name, `Order`)
VALUES (Name, (SELECT `Order` + 1 FROM MyTable WHERE ID = <insert-after-id>));
This inserts the new row (assuming ID is auto increment) with an order number one more than that of the person you're inserting after.
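The two steps can be wrapped in a single transaction so that other readers never see the table half-shifted. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. It reads the anchor row's order first and passes it as a parameter instead of using the same-table subqueries, and it names the column `sort_order`; both are choices made for this illustration, as is the function name `insert_after`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Name TEXT NOT NULL,
        sort_order INTEGER NOT NULL
    );
    INSERT INTO MyTable (ID, Name, sort_order) VALUES
        (1, 'John', 1), (2, 'Mike', 3), (3, 'Daniel', 4), (4, 'Lisa', 2), (5, 'Joe', 5);
""")


def insert_after(conn, name, after_id):
    """Insert a new row directly after the row with ID after_id."""
    with conn:  # commits on success, rolls back if anything raises
        row = conn.execute(
            "SELECT sort_order FROM MyTable WHERE ID = ?", (after_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no row with ID {after_id}")
        anchor = row[0]
        # Step 1: make room by shifting every row further down the list.
        conn.execute(
            "UPDATE MyTable SET sort_order = sort_order + 1 WHERE sort_order > ?",
            (anchor,),
        )
        # Step 2: put the new row in the gap.
        cur = conn.execute(
            "INSERT INTO MyTable (Name, sort_order) VALUES (?, ?)",
            (name, anchor + 1),
        )
        return cur.lastrowid


new_id = insert_after(conn, "New", 4)  # insert after Lisa
names = [r[0] for r in conn.execute("SELECT Name FROM MyTable ORDER BY sort_order")]
print(names)
```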
Just add the new row in any normal way and let a later SELECT use ORDER BY to sort. 1200 rows is infinitesimally small by MySQL standards. You really don't have to (and don't want to) keep the physical table sorted. Instead, use keys and indexes to access the table in a way that will give you what you want.
You can use:
insert into tablename (name, `order`)
values ('name', (select `order`+1 from tablename where name='name'))
You can also use id=id_val in the inner select.
Hopefully this is what you're after; the question isn't altogether clear.

Update one MySQL table with values from another

I'm trying to update one MySQL table based on information from another.
My original table looks like:
id | value
------------
1 | hello
2 | fortune
3 | my
4 | old
5 | friend
And the tobeupdated table looks like:
uniqueid | id | value
---------------------
1 | | something
2 | | anything
3 | | old
4 | | friend
5 | | fortune
I want to update id in tobeupdated with the id from original based on value (strings stored in VARCHAR(32) field).
The updated table will hopefully look like:
uniqueid | id | value
---------------------
1 | | something
2 | | anything
3 | 4 | old
4 | 5 | friend
5 | 2 | fortune
I have a query that works, but it's very slow:
UPDATE tobeupdated, original
SET tobeupdated.id = original.id
WHERE tobeupdated.value = original.value
This maxes out my CPU and eventually leads to a timeout with only a fraction of the updates performed (there are several thousand values to match). I know matching by value will be slow, but this is the only data I have to match them together.
Is there a better way to update values like this? I could create a third table for the merged results, if that would be faster?
I tried MySQL - How can I update a table with values from another table?, but it didn't really help. Any ideas?
UPDATE tobeupdated
INNER JOIN original ON (tobeupdated.value = original.value)
SET tobeupdated.id = original.id
That should do it, and really it's doing exactly what yours is. However, I prefer 'JOIN' syntax for joins rather than multiple 'WHERE' conditions; I think it's easier to read.
As for running slow, how large are the tables? You should have indexes on tobeupdated.value and original.value.
EDIT:
We can also simplify the query:
UPDATE tobeupdated
INNER JOIN original USING (value)
SET tobeupdated.id = original.id
USING is shorthand for when both tables of a join have an identically named column (value in this query), i.e. an equi-join: http://en.wikipedia.org/wiki/Join_(SQL)#Equi-join
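To see the effect of the update on the question's sample rows, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. Because SQLite has no multi-table UPDATE ... JOIN syntax, the sketch swaps in a correlated subquery that does the same matching by value; the index names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE original (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE tobeupdated (uniqueid INTEGER PRIMARY KEY, id INTEGER, value TEXT);
    -- indexes on the matching column, as suggested above
    CREATE INDEX idx_original_value ON original (value);
    CREATE INDEX idx_tobeupdated_value ON tobeupdated (value);

    INSERT INTO original VALUES
        (1, 'hello'), (2, 'fortune'), (3, 'my'), (4, 'old'), (5, 'friend');
    INSERT INTO tobeupdated (uniqueid, value) VALUES
        (1, 'something'), (2, 'anything'), (3, 'old'), (4, 'friend'), (5, 'fortune');
""")

with conn:  # one transaction for the whole update
    conn.execute("""
        UPDATE tobeupdated
        SET id = (SELECT o.id FROM original o WHERE o.value = tobeupdated.value)
        WHERE EXISTS (SELECT 1 FROM original o WHERE o.value = tobeupdated.value)
    """)

result = conn.execute(
    "SELECT uniqueid, id, value FROM tobeupdated ORDER BY uniqueid"
).fetchall()
for row in result:
    print(row)
```

Rows whose value has no match in original keep a NULL id, which corresponds to the blank cells in the expected table.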
It depends on how those tables are used, but you might consider putting a trigger on the original table for inserts and updates. When an insert or update is done, update the second table based on only the one affected row from the original table. That will be quicker.