MySQL - ranking based on occurrence of a value - mysql

I have a table in a MySQL database to which records get added continuously. I want to rank the occurrence of a value and save it to a new field.
I want to populate the FruitRank field, based on the occurrence of the fruit, for that particular Person/Name.
Here is the expected table - Name and Fruit get added to the table in real time.
FruitRank field should be calculated and updated in real time. How to go about this?
Name | Fruit | FruitRank(new field)
Amy | apple | 1
Amy | apple | 1
Amy | apple | 1
Amy | orange| 2
Amy | orange| 2
Tom | grapes | 1
Tom | grapes | 1
Amy | kiwi | 3
Amy | kiwi | 3

Here are two possible approaches, at the database level, depending on your needs:
If there's a reason to store a record in the database for every occurrence of a Person/Fruit (e.g. you need to save the time the fruit was eaten), then there is no reason to store the rank value in the database, as that would require an UPDATE with each INSERT. You can get the rank with a simple query using COUNT(*).
If there is no reason to store every occurrence, then you should have only one entry per Person/Fruit combination with a rank value which is updated on every subsequent occurrence.
Rank Retrieved with Aggregate Function
Query to get the rank: (Assuming every occurrence is stored in Person_Fruit table)
SELECT person, fruit, COUNT(*)
FROM person_fruit
WHERE person = 'the_person'
AND fruit = 'the_fruit'
GROUP BY 1, 2;
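As a minimal sketch of the COUNT(*) approach above, the following runs the same kind of query against an in-memory SQLite database through Python's standard sqlite3 module. SQLite stands in for MySQL here only so the example is self-contained; the helper name occurrence_count and the sample rows are assumptions, not part of the original answer.

```python
import sqlite3


def occurrence_count(conn, person, fruit):
    """Return how many rows exist for this person/fruit pair."""
    row = conn.execute(
        "SELECT COUNT(*) FROM person_fruit WHERE person = ? AND fruit = ?",
        (person, fruit),
    ).fetchone()
    return row[0]


# Hypothetical sample data: every occurrence is stored as its own row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person_fruit (person TEXT, fruit TEXT)")
conn.executemany(
    "INSERT INTO person_fruit (person, fruit) VALUES (?, ?)",
    [
        ("Amy", "apple"), ("Amy", "apple"), ("Amy", "apple"),
        ("Amy", "orange"), ("Amy", "orange"),
        ("Tom", "grapes"),
    ],
)

print(occurrence_count(conn, "Amy", "apple"))  # 3
```

Nothing is stored besides the raw occurrences, so there is no counter to keep in sync on INSERT.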
Rank Stored in Database
Assuming table Fruits (id, person, fruit, rank), with a multi-column unique index on person and fruit, so you have only one row for each unique combination. (RANK is a reserved word as of MySQL 8.0.2, so the column name is quoted with backticks below.)
Prior to INSERT/UPDATE, check if Person/Fruit already exists:
SELECT id
FROM fruits
WHERE person = 'the_person' AND fruit = 'the_fruit';
If it doesn't, INSERT Person/Fruit with rank value of 1, as this is the first occurrence:
INSERT INTO fruits (id, person, fruit, `rank`)
VALUES (NULL, 'the_person', 'the_fruit', 1); /* NULL should be replaced by auto-generated value, if set up for that */
If it does, UPDATE the rank:
UPDATE fruits
SET `rank` = `rank` + 1
WHERE id = the_id; /* the_id is the id returned from the previous `SELECT`; alternatively, match on Person AND Fruit */
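The check-then-insert-or-update flow above could be sketched as follows, again using Python's sqlite3 module with an in-memory database as a stand-in for MySQL. The function name record_occurrence is hypothetical, and the counter column is called fruit_rank here (an assumption) to sidestep quoting of rank.

```python
import sqlite3


def record_occurrence(conn, person, fruit):
    """Insert the pair on first sight, otherwise bump its stored counter."""
    row = conn.execute(
        "SELECT id FROM fruits WHERE person = ? AND fruit = ?",
        (person, fruit),
    ).fetchone()
    if row is None:
        # First occurrence: start the counter at 1.
        conn.execute(
            "INSERT INTO fruits (person, fruit, fruit_rank) VALUES (?, ?, 1)",
            (person, fruit),
        )
    else:
        # Seen before: increment the row found by the SELECT.
        conn.execute(
            "UPDATE fruits SET fruit_rank = fruit_rank + 1 WHERE id = ?",
            (row[0],),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fruits (
        id INTEGER PRIMARY KEY,      -- auto-assigned by SQLite
        person TEXT NOT NULL,
        fruit TEXT NOT NULL,
        fruit_rank INTEGER NOT NULL,
        UNIQUE (person, fruit)       -- one row per person/fruit pair
    )
    """
)
for person, fruit in [("Amy", "apple"), ("Amy", "apple"), ("Amy", "orange")]:
    record_occurrence(conn, person, fruit)

stored = {
    (p, f): r
    for p, f, r in conn.execute("SELECT person, fruit, fruit_rank FROM fruits")
}
print(stored)
```

With concurrent writers, the gap between the SELECT and the INSERT can let two sessions both try to insert; the unique index then rejects one of them. In MySQL, INSERT ... ON DUPLICATE KEY UPDATE performs both steps in a single statement and may be worth considering for that reason.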

You can write an AFTER INSERT trigger on the current table that calculates the fruit rank and inserts the records into another table. You can use the following query to calculate the rank:
SELECT COUNT(*) INTO fruit_rank
FROM your_table
WHERE person = NEW.person AND fruit = NEW.fruit;
Once you get the rank, you can execute the INSERT query to insert the records into another table.
Here's an example of AFTER INSERT trigger.
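For illustration only, here is a small sketch of such a trigger written in SQLite's trigger dialect and driven from Python's sqlite3 module. MySQL trigger syntax differs (for example, it requires FOR EACH ROW), so treat this as a demonstration of the idea rather than MySQL code. The table names person_fruit and fruit_counts and the trigger name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person_fruit (person TEXT, fruit TEXT)")
conn.execute(
    """
    CREATE TABLE fruit_counts (
        person TEXT NOT NULL,
        fruit TEXT NOT NULL,
        occurrences INTEGER NOT NULL,
        PRIMARY KEY (person, fruit)
    )
    """
)

# After every insert, recount the rows for the inserted pair and store the
# result in the second table, replacing any previous count for that pair.
conn.execute(
    """
    CREATE TRIGGER person_fruit_after_insert
    AFTER INSERT ON person_fruit
    BEGIN
        INSERT OR REPLACE INTO fruit_counts (person, fruit, occurrences)
        SELECT NEW.person, NEW.fruit, COUNT(*)
        FROM person_fruit
        WHERE person = NEW.person AND fruit = NEW.fruit;
    END
    """
)

conn.executemany(
    "INSERT INTO person_fruit (person, fruit) VALUES (?, ?)",
    [("Amy", "apple"), ("Amy", "apple"), ("Amy", "apple"),
     ("Amy", "orange"), ("Amy", "orange")],
)

counts = {
    (p, f): n
    for p, f, n in conn.execute("SELECT person, fruit, occurrences FROM fruit_counts")
}
print(counts)
```

The application only ever inserts into person_fruit; the second table is maintained entirely by the trigger.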

From your sample data (the only available way to infer what table structure you have in mind), it is not clear why you should retain different records with exactly the same payload.
It basically seems that the only thing you're updating is the rank.
So in this case, the rank comes naturally with updates (rather than inserts), like:
UPDATE fruitRanks SET FruitRank=FruitRank+1 WHERE Fruit = 'apple' AND Name='Amy';
If your payloads do differ, then use a separate table (in conjunction with an AFTER UPDATE trigger); there's no need to retain the rank in each and every row then.
Or just drop the field and calculate it whenever you need it with grouping & aggregation function.
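One possible reading of the expected table in the question is that FruitRank orders each person's fruits by how often they occur, most frequent first. The sketch below computes that on demand with grouping and aggregation, using Python's sqlite3 module as a stand-in for MySQL. The table name fruit_log, the auto-increment id column, and the tie-break rule (the fruit seen first wins a tie) are all assumptions made to reproduce the sample output.

```python
import sqlite3


def fruit_ranks(conn):
    """Map (name, fruit) to its rank among that person's fruits."""
    rows = conn.execute(
        """
        SELECT name, fruit, COUNT(*) AS occurrences, MIN(id) AS first_seen
        FROM fruit_log
        GROUP BY name, fruit
        ORDER BY name, occurrences DESC, first_seen
        """
    ).fetchall()
    ranks = {}
    position = {}  # running rank counter per person
    for name, fruit, _occurrences, _first_seen in rows:
        position[name] = position.get(name, 0) + 1
        ranks[(name, fruit)] = position[name]
    return ranks


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fruit_log (id INTEGER PRIMARY KEY, name TEXT, fruit TEXT)"
)
# Rows in the order they appear in the question's sample table.
conn.executemany(
    "INSERT INTO fruit_log (name, fruit) VALUES (?, ?)",
    [("Amy", "apple")] * 3 + [("Amy", "orange")] * 2
    + [("Tom", "grapes")] * 2 + [("Amy", "kiwi")] * 2,
)

print(fruit_ranks(conn))
```

Because the rank is derived at query time, inserting a new row never requires updating existing rows.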

Related

How to generate a unique id based on different id category?

I have a table as shown below
| id | name | doc_no    |
|:---|-----:|:---------:|
| 1  | abc  | D11710001 |
| 2  | efg  | D21710001 |
| 3  | hij  | D31710001 |
| 4  | klm  | D41710001 |
| 5  | nop  | D51710001 |
| 1  | qrs  | D11710002 |
I want to generate a unique id based on the id given. For example, when I have an item to be stored in this table, it will generate a unique id based on the id of the table.
Note: The id in this table is a foreign key. The doc no can be manually modified by users into their own format.
The id format - D 'id' 'year' 'month' 0001(auto increment)
How can I write the SQL to generate a unique id while storing the data?
Continuing with the comment by @strawberry, I might recommend not storing the ID in your database. Besides the fact that accessing the auto increment ID at the same time you are inserting the record might be tricky, storing this generated ID would be duplicating the information already stored elsewhere in your table. Instead of storing your ID, just generate it when you query, e.g.
SELECT
id, name, doc_no,
CONCAT('D', id, DATE_FORMAT(date, '%y%m'), auto_id) AS unique_id
FROM yourTable;
This assumes that you would be storing the insertion date of each record in a date column called date. It also assumes that your table has an auto increment column called auto_id. Note that having the date of insertion stored may be useful to you in other ways, e.g. if you want to search for data in your table based on date or time.
You could create a trigger to update the column, or you can run an UPDATE statement right after your INSERT:
insert into <YOUR_TABLE>(NAME,DOC_NO) values('hello','dummy');
update <YOUR_TABLE> set DOC_NO=CONCAT('D',
CAST(YEAR(NOW()) AS CHAR(4)),
CAST(MONTH(NOW()) AS CHAR(4)),
LAST_INSERT_ID())
WHERE id=LAST_INSERT_ID();
Please note that the SQL above may cause a race condition when the server receives multiple requests simultaneously.
@Tim Biegeleisen has a good point, though: it is better to construct the id when you are SELECTing the data.
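If the document number is built at read time, the formatting itself is simple. Here is a minimal Python sketch of that step; the function name format_doc_no and its parameters are hypothetical. The sample rows in the question show a three-digit counter (D11710001), while the stated format mentions 0001, so the padding width is left as a parameter.

```python
from datetime import date


def format_doc_no(category_id, inserted_on, sequence, width=3):
    """Build 'D' + id + two-digit year + two-digit month + zero-padded counter."""
    return f"D{category_id}{inserted_on:%y%m}{sequence:0{width}d}"


print(format_doc_no(1, date(2017, 10, 5), 1))  # D11710001
print(format_doc_no(1, date(2017, 10, 20), 2))  # D11710002
```

How the per-category counter is obtained (for example, from an auto increment column, as the answer above assumes) is outside the scope of this sketch.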

MySQL complex subquery formulation

I have two tables, books and images. books has columns such as id, name, releasedate and purchasecount. images has bookid, bucketid and poster. bookid is the same as id in books (one book can have multiple images, although I haven't set a foreign key constraint), and each record points to an image file in a certain bucket for a certain bookid.
Table schema:
poster is unique in images, hence it is a primary key.
Covering index on books: (name, id, releasedate)
Covering index on images: (bookid,poster,bucketid)
My query is, given a name, find the top ten books (sorted by number of purchasecount) from the books table whose name matches that name, and for that book, return any (preferably the first) record (bucketid and poster) from the images table.
Obviously this can be solved by two queries by running the first, and using its results to query the images table, but that will be slow, so I want to use 'join' and subquery to do it in one go. However, what I am trying is not giving me correct results:
select books.id,books.name,year(releasedate),purchasecount,bucketid,poster from books
inner join (select bucketid,bookid, poster from images) t on
t.bookid = books.id where name like "%foo%" order by purchasecount desc limit 2;
Can anybody suggest an optimal query to fetch the result set as desired here (including any suggestion to change the table schema to improve search time) ?
Updated fiddle: http://sqlfiddle.com/#!9/17c5a8/1.
The example query should return two results - fooe and fool, and one (any of the multiple posters corresponding to each book) poster for each result. However I am not getting correct results. Expected:
fooe - 1973 - 459 - 11 - swt (or fooe - 1973 - 459 - 11 - pqr)
fool - 1963 - 456 - 12 - xxx (or fool - 1963 - 456 - 111 - qwe)
I agree with Strawberry about the schema. We can discuss ideas for better performance and all that. But here is my take on how to solve this after a few chats and changes to the question.
Note the data changes below, made to deal with various boundary conditions, including books with no images in that table and tie-breaks. A tie-break here means several images of a book sharing the max(upvotes). The OP changed the question a few times and added a new column to the images table.
The modified question became: return at most one row per book. Scratch that: always one row per book, even if there are no images. The image info to return is the one with the max upvotes.
Books table
create table books
( id int primary key,
name varchar(1000),
releasedate date,
purchasecount int
) ENGINE=InnoDB;
insert into books values(1,"fool","1963-12-18",456);
insert into books values(2,"foo","1933-12-18",11);
insert into books values(3,"fooherty","1943-12-18",77);
insert into books values(4,"eoo","1953-12-18",678);
insert into books values(5,"fooe","1973-12-18",459);
insert into books values(6,"qoo","1983-12-18",500);
Data Changes from original question.
Mainly the new upvotes column.
The below includes a tie-break row added.
create table images
( bookid int,
poster varchar(150) primary key,
bucketid int,
upvotes int -- a new column introduced by OP
) ENGINE=InnoDB;
insert into images values (1,"xxx",12,27);
insert into images values (5,"pqr",11,0);
insert into images values (5,"swt",11,100);
insert into images values (2,"yyy",77,65);
insert into images values (1,"qwe",111,69);
insert into images values (1,"blah_blah_tie_break",111,69);
insert into images values (3,"qwqqe",14,81);
insert into images values (1,"qqawe",8,45);
insert into images values (2,"z",81,79);
Visualization of a Derived Table
This is just to assist in visualizing an inner piece of the final query. It demonstrates the gotcha for tie-break situations, thus the rownum variable. That variable is reset to 1 each time the bookid changes otherwise it increments. In the end (our final query) we only want rownum=1 rows so that max 1 row is returned per book (if any).
Final Query
select b.id,b.purchasecount,xDerivedImages2.poster,xDerivedImages2.bucketid
from books b
left join
(   select i.bookid,i.poster,i.bucketid,i.upvotes,
    @rn := if(@lastbookid = i.bookid, @rn + 1, 1) as rownum,
    @lastbookid := i.bookid as dummy
    from
    (   select bookid,max(upvotes) as maxup
        from images
        group by bookid
    ) xDerivedImages
    join images i
    on i.bookid=xDerivedImages.bookid and i.upvotes=xDerivedImages.maxup
    cross join (select @rn:=0,@lastbookid:=-1) params
    order by i.bookid
) xDerivedImages2
on xDerivedImages2.bookid=b.id and xDerivedImages2.rownum=1
order by b.purchasecount desc
limit 10;
Results
+----+---------------+---------------------+----------+
| id | purchasecount | poster | bucketid |
+----+---------------+---------------------+----------+
| 4 | 678 | NULL | NULL |
| 6 | 500 | NULL | NULL |
| 5 | 459 | swt | 11 |
| 1 | 456 | blah_blah_tie_break | 111 |
| 3 | 77 | qwqqe | 14 |
| 2 | 11 | z | 81 |
+----+---------------+---------------------+----------+
The significance of the cross join is merely to introduce and set starting values for 2 variables. That is all.
The results are the top ten books in descending order of purchasecount with the info from images if it exists (otherwise NULL) for the most upvoted image. The image selected honors tie-break rules picking the first one as mentioned above in the Visualization section with rownum.
Final Thoughts
I leave it to the OP to wedge in the appropriate where clause at the end as the sample data given had no useful book name to search on. That part is trivial. Oh, and do something about the schema for the large width of your primary keys. But that is off-topic at the moment.
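As an alternative sketch that avoids user variables, the same "one image per book, highest upvotes first" result can be obtained with a correlated subquery. This is a different technique from the query above, shown here against an in-memory SQLite database via Python's sqlite3 module using the answer's sample data. Breaking ties by poster in alphabetical order is an assumption; the original answer simply takes whichever tied row comes first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT,
                        releasedate TEXT, purchasecount INTEGER);
    CREATE TABLE images (bookid INTEGER, poster TEXT PRIMARY KEY,
                         bucketid INTEGER, upvotes INTEGER);
    """
)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    [(1, "fool", "1963-12-18", 456), (2, "foo", "1933-12-18", 11),
     (3, "fooherty", "1943-12-18", 77), (4, "eoo", "1953-12-18", 678),
     (5, "fooe", "1973-12-18", 459), (6, "qoo", "1983-12-18", 500)],
)
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?, ?)",
    [(1, "xxx", 12, 27), (5, "pqr", 11, 0), (5, "swt", 11, 100),
     (2, "yyy", 77, 65), (1, "qwe", 111, 69),
     (1, "blah_blah_tie_break", 111, 69), (3, "qwqqe", 14, 81),
     (1, "qqawe", 8, 45), (2, "z", 81, 79)],
)

# For each book, join only the single image that sorts first by
# upvotes (descending) and then by poster; books without images get NULLs.
TOP_BOOKS_SQL = """
SELECT b.id, b.purchasecount, i.poster, i.bucketid
FROM books b
LEFT JOIN images i
  ON i.poster = (
      SELECT i2.poster
      FROM images i2
      WHERE i2.bookid = b.id
      ORDER BY i2.upvotes DESC, i2.poster
      LIMIT 1
  )
ORDER BY b.purchasecount DESC
LIMIT 10
"""
top_books = conn.execute(TOP_BOOKS_SQL).fetchall()
for row in top_books:
    print(row)
```

On this data the rows match the Results table above. A name filter such as WHERE b.name LIKE '%foo%' could be added before the ORDER BY.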

Update table row value to a random row value from another table

I have 2 MySQL tables.
One table has a column that lists all the states
colStates | column2 | column 3
------------------------------
AK | stuff | stuff
AL | stuff | stuff
AR | stuff | stuff
etc.. | etc.. | etc..
The second table has a column(randomStates) with all NULL values that need to be populated with a randomly selected state abbreviation.
Something like...
UPDATE mytable SET `randomStates`= randomly selected state value WHERE randomStates IS NULL
Can someone help me with this statement. I have looked around at other posts, but I don't understand them.
This works for me with trial data in SQLite (MySQL uses RAND() instead of RANDOM()):
UPDATE mytable
SET randomStates = (SELECT colStates FROM
(SELECT * FROM first_table ORDER BY RANDOM())
WHERE randomStates IS NULL)
Without the first SELECT portion, you end up with the same random value inserted into all the NULL randomStates fields (i.e. if you just do SELECT colStates FROM first_table ORDER BY RANDOM(), you don't get what you want).
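A small, self-contained way to try this kind of statement is an in-memory SQLite database driven from Python's sqlite3 module. The sketch below is a variation on the statement above: it adds LIMIT 1 and an outer WHERE so that rows which already have a value are left alone. The id column and the three sample states are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE first_table (colStates TEXT)")
conn.executemany(
    "INSERT INTO first_table (colStates) VALUES (?)",
    [("AK",), ("AL",), ("AR",)],
)
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, randomStates TEXT)")
conn.executemany("INSERT INTO mytable (randomStates) VALUES (?)", [(None,)] * 5)

# The inner reference to mytable.randomStates ties the subquery to the row
# being updated; the outer WHERE restricts the update to rows that are NULL.
conn.execute(
    """
    UPDATE mytable
    SET randomStates = (
        SELECT colStates
        FROM (SELECT colStates FROM first_table ORDER BY RANDOM())
        WHERE mytable.randomStates IS NULL
        LIMIT 1
    )
    WHERE randomStates IS NULL
    """
)
conn.commit()

assigned = [row[0] for row in conn.execute("SELECT randomStates FROM mytable")]
print(assigned)
```

Every row ends up with one of the listed states. Whether different rows receive different values depends on the engine re-evaluating the subquery for each row, so it is worth checking the output on your own database.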

How to SELECT same ID of values occuring in the same column?

I have a table called names in MySQL with the following columns: ID, type, row, value
The composite primary key is ID, type, row
The purpose of this table is to save all names and professions of a specified person in multiple rows - one data per row.
For example: Commonly in Spain people have two first names and two last names, like José Anastacio Rojas Laguna.
In Germany, many people have one first name but two last names. Some people also have several professions, such as teaching at a university and working as a doctor in a hospital at the same time. In this case, people in Germany would have Prof. Dr. in front of their names. For example: Prof. Dr. José Anastacio Rojas Laguna
In this case, I would store all these information in the table like this:
ID | type | row | value
1 | 0 | 1 | Prof.
1 | 0 | 2 | Dr.
1 | 1 | 1 | José
1 | 1 | 2 | Anastacio
1 | 2 | 1 | Rojas
1 | 2 | 2 | Laguna
An ID is given to one single person. Every person in the table has one unique ID and is identified by that ID. type defines, as the name says, the type of the name: 0 means profession, 1 means first name and 2 means last name. row defines the position in the name: 1 means 1st first name, 2 means 2nd first name, 3 means 3rd first name, etc. The same applies to profession and last name.
Now I would like to find out how I can SELECT the ID of a specified person by passing just some of that person's names. How can I determine the ID by giving only a few of the values, all of which occur under the same ID?
This will return users that have the name José Laguna with the same ID:
select t1.id, t1.value, t2.value
from yourTable t1
join (select * from yourTable
where value = 'Laguna') t2
on t1.id = t2.id
where t1.value = 'José'
I use 'José' here; you could use a variable @searchText instead.
SELECT *
FROM YourTable
WHERE ID IN (SELECT DISTINCT ID
FROM YourTable
WHERE value = 'José')
Or maybe use an IN if multiple parameters
WHERE value IN ('José', 'Laguna')
So here's something using GROUP_CONCAT. Tested with your sample data and works.
It groups together all of the person's titles into a single column, their given name into another single column, and all their family names into a third column. It wraps each of those columns with commas to ensure finding a particular name is accurate.
The snippet below will find anyone who:
Has at least one given name of "José" and
Has at least one family name of "Rojas"
All you have to do to find a different user is change the WHERE clause.
SELECT n.ID,n.type,n.row,n.value
FROM names n
INNER JOIN (
SELECT ID
FROM (
SELECT ID
,CONCAT(',',GROUP_CONCAT((CASE WHEN type=0 THEN value ELSE NULL END) ORDER BY value ASC),',') AS titles
,CONCAT(',',GROUP_CONCAT((CASE WHEN type=1 THEN value ELSE NULL END) ORDER BY value ASC),',') AS givenNames
,CONCAT(',',GROUP_CONCAT((CASE WHEN type=2 THEN value ELSE NULL END) ORDER BY value ASC),',') AS familyNames
FROM `names`
GROUP BY ID
) grouped
WHERE grouped.givenNames LIKE '%,José,%' AND grouped.familyNames LIKE '%,Rojas,%'
) people ON n.ID = people.ID
Note that an earlier version of this answer may not have worked as intended. The extra commas ensure that the name searched for is not matched as a substring of a longer name.
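Another way to find the IDs that contain all of the given values, different from the approaches above, is to filter with IN and then keep only the IDs whose number of distinct matches equals the number of values searched for. A minimal sketch with Python's sqlite3 module and an in-memory database follows; the function name find_person_ids and the second person (ID 2) are hypothetical additions for the demonstration.

```python
import sqlite3


def find_person_ids(conn, wanted):
    """Return IDs that have every value in `wanted` somewhere in their rows."""
    wanted = sorted(set(wanted))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT ID
        FROM names
        WHERE value IN ({placeholders})
        GROUP BY ID
        HAVING COUNT(DISTINCT value) = ?
        ORDER BY ID
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE names (
        ID INTEGER, type INTEGER, "row" INTEGER, value TEXT,
        PRIMARY KEY (ID, type, "row")
    )
    """
)
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?, ?)",
    [(1, 0, 1, "Prof."), (1, 0, 2, "Dr."),
     (1, 1, 1, "José"), (1, 1, 2, "Anastacio"),
     (1, 2, 1, "Rojas"), (1, 2, 2, "Laguna"),
     (2, 1, 1, "José"), (2, 2, 1, "Pérez")],  # hypothetical second person
)

print(find_person_ids(conn, ["José", "Laguna"]))  # [1]
print(find_person_ids(conn, ["José"]))  # [1, 2]
```

This version does not distinguish between name types; adding a condition on type to the WHERE clause would restrict a value to, say, family names only.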

MySQL pull data from same table, SUM multiple columns with conditions

I have created a users table that holds names and phone numbers (users).
id| name | phone
1 | Frank | 0345221234
2 | Sara | 0342555939
I got another table that holds a log with user's calls to different numbers (call_logs):
number | destination | price
0345221234 | destination | x /// This is Frank
0345221234 | destination | y /// This is also Frank
0342555939 | destination | z /// This is Sara
And then I have a table that holds numbers that Frank and Sara are allowed to call (allowed_numbers):
number
033485733
045727728
082358288
I would like to go through my users table and, based on each user's number, check the call log table and select the SUM of the price column for log records whose destination does not match the allowed numbers table, so that I know the cost of calls not in the allowed list.
Then I want to select the SUM of the price column for log records whose destination does match the allowed numbers table, so that I know how much the allowed calls cost.
Is there any way to do this in a single query, with subqueries and whatever else is needed, to achieve this result set:
users number | SUM(price) of allowed calls | SUM(price) of calls not allowed
Thank you!
SELECT call_logs.number
,SUM(IF(allowed_numbers.number IS NOT NULL,call_logs.price,0)) AS AllowedPrice
,SUM(IF(allowed_numbers.number IS NULL,call_logs.price,0)) AS NotAllowedPrice
FROM call_logs
LEFT JOIN allowed_numbers
ON call_logs.destination = allowed_numbers.number
GROUP BY call_logs.number;
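To see the conditional aggregation in action, here is a minimal sketch using Python's sqlite3 module and an in-memory database. SQLite is used only to keep the example self-contained, and the query uses CASE expressions in place of MySQL's IF(). The destinations and prices in call_logs are made-up sample values, with prices stored as whole numbers to keep the sums exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE call_logs (number TEXT, destination TEXT, price INTEGER);
    CREATE TABLE allowed_numbers (number TEXT);
    """
)
conn.executemany(
    "INSERT INTO allowed_numbers (number) VALUES (?)",
    [("033485733",), ("045727728",), ("082358288",)],
)
conn.executemany(
    "INSERT INTO call_logs (number, destination, price) VALUES (?, ?, ?)",
    [("0345221234", "033485733", 10),   # Frank, allowed destination
     ("0345221234", "099999999", 25),   # Frank, destination not allowed
     ("0342555939", "045727728", 5)],   # Sara, allowed destination
)

# The LEFT JOIN leaves allowed_numbers.number NULL for calls to numbers
# that are not in the allowed list, which the CASE expressions test for.
rows = conn.execute(
    """
    SELECT call_logs.number,
           SUM(CASE WHEN allowed_numbers.number IS NOT NULL
                    THEN call_logs.price ELSE 0 END) AS AllowedPrice,
           SUM(CASE WHEN allowed_numbers.number IS NULL
                    THEN call_logs.price ELSE 0 END) AS NotAllowedPrice
    FROM call_logs
    LEFT JOIN allowed_numbers
           ON call_logs.destination = allowed_numbers.number
    GROUP BY call_logs.number
    ORDER BY call_logs.number
    """
).fetchall()
for row in rows:
    print(row)
```

Users with no calls at all do not appear in this result; starting the query from the users table and left joining call_logs would include them.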