Get count of a value that repeats ONLY in different users - mysql

I would like to get the number of times a fruit is repeated BUT only when the user is different
fruitid user fruit
180 217 watermelon
1a6 2dd apple
1cf 2ac orange
1da 2dd orange
1f3 2dd banana
1a6 2dd apple
220 1da banana
254 2dd apple
2a0 2ac apple
2a5 229 apple
I tried this query, but the output is not what I expect, for several reasons:
SELECT
user,
fruit,
count(fruit) appearancesOfTheFruitAtDifferentUsers
FROM fruitBox
GROUP BY user
HAVING appearancesOfTheFruitAtDifferentUsers > 0
However:
It does not show me how many times a fruit was repeated with a different user in the tables
Suppresses several user-fruit rows; they should all appear with their respective count
user fruit appearancesOfTheFruitAtDifferentUsers
1da banana 1
217 watermelon 1
229 apple 1
2ac orange 2
2dd apple 4
I have tried some suggestions from comments, but there was no success.
The output I would like to obtain is:
user fruit appearancesOfTheFruitAtDifferentUsers
217 watermelon 1
2dd apple 3
2ac orange 2
2dd orange 2
2dd banana 2
1da banana 2
2dd apple 3
2ac apple 3
229 apple 3
Here, the count for apple is 3, because it appears for 3 different users. watermelon is 1 because it only appears once. Orange and Banana are each used by 2 users, so their count is 2.
In addition, I would also like to delete row #6 (excluding the table header) as it is a duplicate fruit in the same ID
In a few words, I want the table to show how many times a fruit is repeated in different users and if there is a user with two equal fruits, it only shows one in the table.

If your MySQL version supports window functions (8.0 or later), you can try the COUNT window function.
Query #1
SELECT *
FROM (
    SELECT user_id,
           type,
           count(*) OVER (PARTITION BY type ORDER BY type) c
    FROM logs
) t1
WHERE c > 1;
user_id type c
17ea9b33e6f signup 5
17ea9c0e9ce signup 5
17ea9d21366 signup 5
17ea9e04dc7 signup 5
17ea9e04df8 signup 5
17ea27674d1 work 2
17ea27674d8 work 2
View on DB Fiddle
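The result above comes from a logs table rather than the fruit table in the question. As a rough sketch of how the same window-function idea might carry over, the snippet below loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here purely so the example is runnable (window functions need SQLite 3.25 or newer). The helper name count_with_window and the inner DISTINCT step are assumptions of this sketch: as far as I know, MySQL does not accept DISTINCT inside an aggregate used as a window function, so the user/fruit pairs are deduplicated first and COUNT(*) per fruit then equals the number of different users.

```python
import sqlite3

# Sample rows copied from the question: (fruitid, user, fruit).
ROWS = [
    ("180", "217", "watermelon"), ("1a6", "2dd", "apple"),
    ("1cf", "2ac", "orange"),     ("1da", "2dd", "orange"),
    ("1f3", "2dd", "banana"),     ("1a6", "2dd", "apple"),  # duplicate row
    ("220", "1da", "banana"),     ("254", "2dd", "apple"),
    ("2a0", "2ac", "apple"),      ("2a5", "229", "apple"),
]

# Deduplicate (user, fruit) pairs, then count the rows in each fruit partition.
WINDOW_QUERY = """
SELECT user, fruit,
       COUNT(*) OVER (PARTITION BY fruit) AS appearances
FROM (SELECT DISTINCT user, fruit FROM fruitBox) AS pairs
ORDER BY fruit, user
"""


def count_with_window(rows):
    """Load the rows into a throwaway database and run the window query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE fruitBox (fruitid TEXT, user TEXT, fruit TEXT)")
    con.executemany("INSERT INTO fruitBox VALUES (?, ?, ?)", rows)
    result = con.execute(WINDOW_QUERY).fetchall()
    con.close()
    return result


window_result = count_with_window(ROWS)
for row in window_result:
    print(row)
```

Because the pairs are deduplicated before counting, each user/fruit combination appears once, which also takes care of the duplicate row mentioned in the question.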

The database term for "different" is distinct, and this term is also an SQL keyword.
Join the table to a subquery that provides the count of distinct users having that fruit:
select
    user,
    t.fruit,
    fruit_user_count
from mytable t
join (
    select
        fruit,
        count(distinct user) fruit_user_count
    from mytable
    group by 1
) c on c.fruit = t.fruit
See live demo.
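To check the join against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database via Python's sqlite3 module (SQLite is an assumption made for portability; the SQL itself should read the same in MySQL). Two details are additions of this sketch rather than part of the answer: SELECT DISTINCT, which collapses a user who has the same fruit more than once, and GROUP BY fruit written out by name instead of by position.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mytable (fruitid TEXT, user TEXT, fruit TEXT)")
# Sample rows copied from the question, including the duplicated 1a6 row.
con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    ("180", "217", "watermelon"), ("1a6", "2dd", "apple"),
    ("1cf", "2ac", "orange"),     ("1da", "2dd", "orange"),
    ("1f3", "2dd", "banana"),     ("1a6", "2dd", "apple"),
    ("220", "1da", "banana"),     ("254", "2dd", "apple"),
    ("2a0", "2ac", "apple"),      ("2a5", "229", "apple"),
])

rows = con.execute("""
    SELECT DISTINCT t.user, t.fruit, c.fruit_user_count
    FROM mytable t
    JOIN (
        SELECT fruit, COUNT(DISTINCT user) AS fruit_user_count
        FROM mytable
        GROUP BY fruit
    ) c ON c.fruit = t.fruit
    ORDER BY t.fruit, t.user
""").fetchall()

# Collapse to one count per fruit for a quick sanity check.
fruit_counts = {fruit: n for _, fruit, n in rows}
print(fruit_counts)
for row in rows:
    print(row)
```

Dropping the DISTINCT keyword from the outer SELECT returns one output row per input row instead, which may be closer to what the question's expected output shows.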

Related

Is it possible to insert data based on data that needs to be inserted in a single statement in a SQL database?

So, let's say I have a database of fruits, along with recipes based on those fruits.
id  fruit name
1   apple
2   pear

id  recipe        fruit_id
1   apple pie     1
2   baked apple   1
3   poached pear  2
Is it possible, using a single SQL insert in MySQL, to add a new fruit (banana) and 1 or more associated recipes ('bananas foster', 'banana bread')?
Thanks

Splitting a row depending on a known variable stored in another database

I'm in SAS so the code is not pure MySQL.
I have a table like this
ID Fruits
1 Apple-Water-melon
2 Pine-Apple-Kiwi
I also have another table listing every distinct fruit:
ID Fruits
x Apple
x Kiwi
x Pine-apple
x Water-melon
How can I get a final table like this?
ID Fruits
1 Apple
1 Water-melon
2 Pine-apple
2 Kiwi
Is it possible to parse the first table and split the variable wherever a piece matches a value found in the second table?
Thanks,
You can use the FINDW() function to check if the FRUIT value appears in the FRUITS list.
data have ;
input ID Fruits $40.;
cards;
1 Apple-Water-melon
2 Pine-Apple-Kiwi
;
data list;
fruitid +1 ;
input fruit $40. ;
cards;
Apple
Kiwi
Pine-apple
Water-melon
;
proc sql ;
create table want as
select a.*,b.*
from have a
left join list b
on 0 ne findw(Fruits,fruit,'-','ti')
;
quit;
Results
Obs ID Fruits fruitid fruit
1 1 Apple-Water-melon 1 Apple
2 2 Pine-Apple-Kiwi 1 Apple
3 2 Pine-Apple-Kiwi 2 Kiwi
4 2 Pine-Apple-Kiwi 3 Pine-apple
5 1 Apple-Water-melon 4 Water-melon
How do you want to eliminate the match for APPLE embedded in the middle of PINE-APPLE? Do you want to give priority to multi-word fruit names first and then remove those fruits from the fruit list string?
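One way to read that suggestion is "longest name first": match the longer fruit names before the shorter ones and blank out whatever has already been matched. The sketch below illustrates the idea in Python rather than SAS; the function name split_known_fruits and the blanking trick are assumptions of this example, not something taken from the answer. It uses plain substring matching without checking delimiters, so it is only a starting point.

```python
def split_known_fruits(fruits, known):
    """Split a '-'-joined string into known fruit names, longest names first."""
    # Longer names go first so 'Pine-apple' wins over the embedded 'Apple'.
    candidates = sorted(known, key=len, reverse=True)
    remaining = fruits.lower()  # case-insensitive, like the 'i' modifier of FINDW
    found = []
    for name in candidates:
        key = name.lower()
        pos = remaining.find(key)
        while pos != -1:
            found.append((pos, name))
            # Blank out the matched span (same length, so positions stay valid)
            # to stop shorter names from matching inside it.
            remaining = remaining[:pos] + "\0" * len(key) + remaining[pos + len(key):]
            pos = remaining.find(key)
    # Return the names in the order they appear in the original string.
    return [name for _, name in sorted(found)]


KNOWN = ["Apple", "Kiwi", "Pine-apple", "Water-melon"]
for row_id, fruits in [(1, "Apple-Water-melon"), (2, "Pine-Apple-Kiwi")]:
    for fruit in split_known_fruits(fruits, KNOWN):
        print(row_id, fruit)
```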

MySQL, perform a query that selects and filters the results

I have a table with this structure:
id seller class ref color pricekg
1 manta apple apple-red red 0.147
2 manta pear pear-green green 0.122
3 poma apple apple-red red 0.111
4 arnie melon melon-green green 0.889
5 pinas pineapple pinneaple-brown brown 0.890
6 gordon apple apple-red red 0.135
I would need to get some fruits from some sellers, with some preferences.
My first objective is to know who sells what I'm looking for, and once I know that, pick the best one.
When I do the first query I get this:
Query ->
SELECT *
FROM `fruits`
WHERE `seller`
IN ("manta", "poma", "pinas", "gordon")
AND `class` IN ("apple", "pineapple")
ORDER BY id
Result 1 ->
1 manta apple apple-red red 0.147
3 poma apple apple-red red 0.111
5 pinas pineapple pinneaple-brown brown 0.890
6 gordon apple apple-red red 0.135
So far so good; however, I get 3 sellers who have red apples with the apple-red ref.
Now this is the part that I can't resolve...
With this result, I would like to filter out the duplicated apple refs (since I want to buy from only one seller).
If there are duplicates, select the one from the seller manta.
If there are duplicates and none of them is from the seller manta, then select the one with the lowest cost per kilogram.
So, after result 1, the expected result of the second query (or subquery, or a single query if that is possible; I really don't know which would be best) would be:
1 manta apple apple-red red 0.147
5 pinas pineapple pinneaple-brown brown 0.890
Or in case manta didn't sell these it would be:
3 poma apple apple-red red 0.111
5 pinas pineapple pinneaple-brown brown 0.890
Is it possible to do this with only one query?
Or could I create a view or a temporary table from the result and then execute one more query to filter the duplicates?
How could I do this?
This is a prioritization query.
I think you want:
select f.*
from fruits f
where f.seller in ('manta', 'poma', 'pinas', 'gordon') and
      f.class in ('apple', 'pineapple') and
      f.id = (select f2.id
              from fruits f2
              where f2.seller in ('manta', 'poma', 'pinas', 'gordon') and
                    f2.class in ('apple', 'pineapple') and
                    f2.ref = f.ref
              order by (f2.seller = 'manta') desc, -- put manta sellers first
                       f2.pricekg asc -- then order by lowest price per kg
              limit 1
             );
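As a quick check of the prioritization idea, the sketch below runs the same correlated subquery against the sample table in an in-memory SQLite database through Python's sqlite3 module. SQLite and the helper name pick are assumptions made so the example is runnable, and the price column is taken to be pricekg as in the question's table. The seller list is passed in as parameters so the "manta does not sell it" case can be tried as well.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE fruits (id INTEGER, seller TEXT, class TEXT,"
    " ref TEXT, color TEXT, pricekg REAL)"
)
# Rows copied from the question (the 'pinneaple' spelling is in the source data).
con.executemany("INSERT INTO fruits VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "manta", "apple", "apple-red", "red", 0.147),
    (2, "manta", "pear", "pear-green", "green", 0.122),
    (3, "poma", "apple", "apple-red", "red", 0.111),
    (4, "arnie", "melon", "melon-green", "green", 0.889),
    (5, "pinas", "pineapple", "pinneaple-brown", "brown", 0.890),
    (6, "gordon", "apple", "apple-red", "red", 0.135),
])

PICK_QUERY = """
SELECT f.id, f.seller, f.ref
FROM fruits f
WHERE f.seller IN ({marks}) AND f.class IN ('apple', 'pineapple')
  AND f.id = (SELECT f2.id
              FROM fruits f2
              WHERE f2.seller IN ({marks})
                AND f2.class IN ('apple', 'pineapple')
                AND f2.ref = f.ref
              ORDER BY (f2.seller = 'manta') DESC,  -- manta first
                       f2.pricekg ASC               -- then the cheapest
              LIMIT 1)
ORDER BY f.id
"""


def pick(sellers):
    """Return one (id, seller, ref) row per ref for the given sellers."""
    marks = ", ".join("?" * len(sellers))
    # The seller list is bound twice: once for the outer query, once for the subquery.
    return con.execute(PICK_QUERY.format(marks=marks), list(sellers) * 2).fetchall()


with_manta = pick(["manta", "poma", "pinas", "gordon"])
without_manta = pick(["poma", "pinas", "gordon"])
print(with_manta)
print(without_manta)
```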

use a transaction database to calculate the probability of an item appearing in a future transaction using R or SQL

I have a database of transactions like in the table below
user_id order_id order_number product_name n
<int> <int> <int> <fctr> <int>
1 11878590 3 Pistachios 1
1 11878590 3 Soda 1
1 12878790 4 Yogurt 1
1 12878790 4 Cheddar Popcorn 1
1 12878790 4 Cinnamon Toast Crunch 1
2 12878791 11 Milk Chocolate Almonds 1
2 12878791 11 Half & Half 1
2 12878791 11 String Cheese 1
11 12878792 19 Whole Milk 1
11 12878792 19 Pistachios 1
11 12878792 19 Soda 1
11 12878792 19 Paper Towel Rolls 1
The table has multiple users who each have multiple transactions. Some users only have 3 transactions, other users have 15, etc. This is all in one table.
I'm trying to calculate a transition matrix for a Markov model. I want to find the probability that an item will be in a new basket given that it was present in the previous basket of transactions.
I want my final table to look something like this
user_id product_name probability_present probability_absent
1 Soda .5 .5
1 Pistachios .5 .5
I'm having trouble figuring out how to get the data into a form so that I can calculate the probabilities and specifically coming up with a way to compare all of the t,t-1 combinations.
I have code that I've written to get things into this form, but I'm stuck at this point. I've written my code using the dplyr R package, but I could translate something in SQL into the R code. I can post my code in R if it will be helpful, but it is pretty simple at this point as I just had to do a few joins to get the table into this shape.
What else do I have to do to get the table/values that I'm trying to calculate?
This seems to give you the desired probabilities:
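SELECT t.user_id,
       t.product_name,
       -- share of the user's orders that contain the product
       COUNT(DISTINCT t.order_number) / o.n_orders AS prob_present,
       1 - COUNT(DISTINCT t.order_number) / o.n_orders AS prob_absent
FROM tbl t
JOIN (SELECT user_id, COUNT(DISTINCT order_number) AS n_orders
      FROM tbl
      GROUP BY user_id) o ON o.user_id = t.user_id
WHERE t.user_id = 1
GROUP BY t.user_id, t.product_name, o.n_orders;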
Or at least it gives you the numbers you have. If this is not right, please provide a slightly more complex example dataset.
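The question also describes a t versus t-1 comparison: the probability that a product is in a basket given that it was in the same user's previous basket. The following is a minimal sketch of that calculation in plain Python, under the assumption that order_number gives the order sequence per user. The function name transition_probabilities and the three-order sample rows are made up for illustration (the sample in the question has only two orders per user, which is too short to show much).

```python
from collections import defaultdict


def transition_probabilities(rows):
    """rows: iterable of (user_id, order_number, product_name).

    Returns {(user_id, product): P(product in basket t | product in basket t-1)}.
    Products that never appear in a "previous" basket get no entry.
    """
    baskets = defaultdict(lambda: defaultdict(set))
    for user_id, order_number, product in rows:
        baskets[user_id][order_number].add(product)

    probs = {}
    for user_id, orders in baskets.items():
        ordered = [orders[n] for n in sorted(orders)]
        seen = defaultdict(int)    # times the product was in basket t-1
        stayed = defaultdict(int)  # times it was also in basket t
        for prev, cur in zip(ordered, ordered[1:]):
            for product in prev:
                seen[product] += 1
                if product in cur:
                    stayed[product] += 1
        for product, n in seen.items():
            probs[(user_id, product)] = stayed[product] / n
    return probs


# Hypothetical data: one user, three consecutive orders.
SAMPLE = [
    (1, 1, "Soda"), (1, 1, "Pistachios"),
    (1, 2, "Soda"),
    (1, 3, "Soda"), (1, 3, "Yogurt"),
]
probs = transition_probabilities(SAMPLE)
for key in sorted(probs):
    print(key, probs[key])
```

The probability of being absent after being present is one minus each value; the absent-to-present direction would need a similar pass over products missing from the previous basket.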

Edit product selling location using mysql

I'm building an e-commerce platform (PHP + MySQL) and I want to add an attribute (feature) to products: the ability to enable or disable the selling status for specific cities.
Here are simplified tables:
cities
id name
==========
1 Roma
2 Berlin
3 Paris
4 London
products
id name cities
==================
1 TV 1,2,4
2 Phone 1,3,4
3 Book 1,2,3,4
4 Guitar 3
In this simple example it is easy to query (using FIND_IN_SET or LIKE) to check the availability of a product for a specific city.
This is OK for the 4 cities in this example, or even 100 cities, but will it be practical for a large number of cities and a very large number of products?
For better "performance" or better database design, should I add another table to JOIN in the query (productid, cityid, status)?
availability
id productid cityid status
=============================
1 1 1 1
2 1 2 1
3 1 4 1
4 2 1 1
5 2 3 1
6 2 4 1
7 3 1 1
8 3 2 1
9 3 3 1
10 3 4 1
11 4 3 1
For better "performance" or better database design should I add another table
Yes, you should definitely create another table to hold that information, like the one you posted, rather than storing it in a comma-separated list, which goes against normalization. Also, with the comma-separated list there is no way to get good performance when you JOIN to find out which products are available in which cities.
At any point in time if you want to get back a comma separated list like 1,2,4 of values then you can do a GROUP BY productid and use GROUP_CONCAT(cityid) to get the same.
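A small sketch of both points, using Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL (an assumption made so the example runs anywhere; SQLite also provides GROUP_CONCAT). The composite primary key on (productid, cityid), used in place of the surrogate id column from the question, is a design choice of this sketch.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE availability (
    productid INTEGER REFERENCES products(id),
    cityid    INTEGER REFERENCES cities(id),
    status    INTEGER,
    PRIMARY KEY (productid, cityid)
);
INSERT INTO cities VALUES (1, 'Roma'), (2, 'Berlin'), (3, 'Paris'), (4, 'London');
INSERT INTO products VALUES (1, 'TV'), (2, 'Phone'), (3, 'Book'), (4, 'Guitar');
INSERT INTO availability VALUES
    (1, 1, 1), (1, 2, 1), (1, 4, 1),
    (2, 1, 1), (2, 3, 1), (2, 4, 1),
    (3, 1, 1), (3, 2, 1), (3, 3, 1), (3, 4, 1),
    (4, 3, 1);
""")

# Which products are sold in Berlin? A plain indexed join, no string parsing.
berlin_products = [name for (name,) in con.execute("""
    SELECT p.name
    FROM products p
    JOIN availability a ON a.productid = p.id
    JOIN cities c ON c.id = a.cityid
    WHERE c.name = 'Berlin' AND a.status = 1
    ORDER BY p.id
""")]

# Rebuild the comma-separated list per product when it is needed for display.
# (In MySQL, GROUP_CONCAT(cityid ORDER BY cityid) makes the order explicit.)
city_lists = {
    productid: set(csv.split(","))
    for productid, csv in con.execute("""
        SELECT productid, GROUP_CONCAT(cityid)
        FROM availability
        WHERE status = 1
        GROUP BY productid
    """)
}

print(berlin_products)
print(city_lists)
```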