How to do a COUNTIF formula with mysql [duplicate] - mysql

This question already has an answer here:
MySQL pivot row into dynamic number of columns
(1 answer)
Closed 5 years ago.
I'm looking at some keyword rankings and need to bring back data on the number of keywords a URL has increased rank for, decreased rank for, or stayed the same for. My data looks like this:
URL | Keyword | Position | Previous position | Keyword Movement
example.com/page1 | things to do london | 38 | 101| Up
example.com/page2 | mens shoes size 8 | 48 | 94 | Up
example.com/page3 | notebooks | 22 | 2 | Down
example.com/page4 | macbook pros for sale | 52 | 52 | Same
example.com/page1 | homebrew supplies | 56 | 46 | Down
example.com/page2 | sql tutorials | 70 | 39 | Down
example.com/page3 | random seo keywords | 88 | 36 | Down
example.com/page4 | best albums of 2017 | 94 | 95 | Up
example.com/page5 | fender stratocaster | 19 | 9 | Down
example.com/page6 | qotsa | 91 | 34 | Down
I'd like to have a table showing the URL, the number of keyword increases, the number of keyword decreases, and the number of keywords staying the same. In Excel this can be done with a COUNTIF formula, but I'm not sure how to do it with MySQL. I'd like a table looking like the following:
URL |Keywords Up |Keywords Down |Keywords remain
example.com/page1 | 1 | 1 | 0
example.com/page2 | 1 | 1 | 0
example.com/page3 | 0 | 2 | 0
example.com/page4 | 1 | 0 | 1
example.com/page5 | 0 | 1 | 0
example.com/page6 | 0 | 1 | 0
I'm looking for a way of doing a countif on the "Movement" column.
Thanks.

You can use conditional aggregation here on the Movement column: tally each of the three movement types, grouping by URL.
SELECT
URL,
SUM(CASE WHEN Movement = 'Up' THEN 1 ELSE 0 END) AS keywords_up,
SUM(CASE WHEN Movement = 'Down' THEN 1 ELSE 0 END) AS keywords_down,
SUM(CASE WHEN Movement = 'Same' THEN 1 ELSE 0 END) AS keywords_remain
FROM yourTable
GROUP BY URL
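As a runnable sketch of the query above, here is the same conditional aggregation executed with Python's standard-library sqlite3 module as a stand-in for MySQL (the SQL is identical in both engines); the table name "rankings" is invented for the demo.

```python
# Conditional aggregation ("COUNTIF per group") demo using sqlite3 as a
# stand-in for MySQL; table name "rankings" is made up for this sketch.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rankings (url TEXT, keyword TEXT, movement TEXT)")
conn.executemany(
    "INSERT INTO rankings VALUES (?, ?, ?)",
    [
        ("example.com/page1", "things to do london", "Up"),
        ("example.com/page1", "homebrew supplies", "Down"),
        ("example.com/page2", "mens shoes size 8", "Up"),
        ("example.com/page2", "sql tutorials", "Down"),
        ("example.com/page3", "notebooks", "Down"),
        ("example.com/page3", "random seo keywords", "Down"),
        ("example.com/page4", "macbook pros for sale", "Same"),
        ("example.com/page4", "best albums of 2017", "Up"),
        ("example.com/page5", "fender stratocaster", "Down"),
        ("example.com/page6", "qotsa", "Down"),
    ],
)

# Each SUM(CASE ...) counts only the rows matching one movement type.
rows = conn.execute("""
    SELECT url,
           SUM(CASE WHEN movement = 'Up'   THEN 1 ELSE 0 END) AS keywords_up,
           SUM(CASE WHEN movement = 'Down' THEN 1 ELSE 0 END) AS keywords_down,
           SUM(CASE WHEN movement = 'Same' THEN 1 ELSE 0 END) AS keywords_remain
    FROM rankings
    GROUP BY url
    ORDER BY url
""").fetchall()
for r in rows:
    print(r)  # e.g. ('example.com/page1', 1, 1, 0)
```

In MySQL specifically, a boolean expression evaluates to 0 or 1, so `SUM(Movement = 'Up')` is a common shorthand for the same CASE expression.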

Related

SSRS Report Builder calculated fields with SQL table (attribute-value)

I have a SQL database of monthly values, with a value type and a value for each customer and month, and the problem comes when I try to aggregate the information in a tablix matrix inside an SSRS report.
Example of the database
CustomerId | Year | Month | ValueTypeId | Value
---------------------------------------------
1 | 2020 | 1 | 1 | 500
1 | 2020 | 1 | 2 | 10
1 | 2020 | 2 | 1 | 200
1 | 2020 | 2 | 2 | 15
2 | 2020 | 1 | 1 | 100
2 | 2020 | 1 | 2 | 10
2 | 2020 | 2 | 1 | 1500
2 | 2020 | 2 | 2 | 15
And I have created an example data-label field (SalesMTD) for a specific ValueTypeId, taking into account the two parameters I have (Year and Month):
=IIF(Fields!ValueTypeId.Value="1" and Fields!Year.Value=Parameters!Year.Value and
Fields!Month.Value=Parameters!Month.Value,CDbl(Fields!ReportingValue.Value),CDbl(0))
When I create a tablix matrix and put [Sum(SalesMTD)] in the cells, then with Year = 2020 and Month = 2 the result is 200 + 1500 = 1700, as expected.
And if I build the same table as the example above, adding the new field, the result is something like this:
CustomerId | Year | Month | ValueTypeId | Value | SalesMTD
-----------------------------------------------------------
1 | 2020 | 1 | 1 | 500 | 0
1 | 2020 | 1 | 2 | 10 | 0
1 | 2020 | 2 | 1 | 200 | 200
1 | 2020 | 2 | 2 | 15 | 0
2 | 2020 | 1 | 1 | 100 | 0
2 | 2020 | 1 | 2 | 10 | 0
2 | 2020 | 2 | 1 | 1500 | 1500
2 | 2020 | 2 | 2 | 15 | 0
The problem comes when we try to calculate an average with [Avg(SalesMTD)]: instead of computing (200 + 1500) / 2, the system seems to compute (0 + 0 + 200 + 0 + 0 + 0 + 1500 + 0) / 8, which is wrong for my purposes.
Can someone help me with this? I looked for an AverageIf that ignores 0 values but couldn't find one, and in any case 0 could be a legitimate value if it is real. I think the problem is really in the calculated field, which puts 0 in the rows that don't match the conditions, when those rows should be ignored or NULL. That is valid for sums, but I have found it is not correct for averages and other calculations.
Thanks in advance
Your problem, as you pointed out, is that the AVG() function includes zeros. However, it does not include NULL values.
So, if we convert zeros to NULL values (Nothing in SSRS expressions), we can fix your issue.
Use the following expression
=Avg(
IIF(Fields!SalesMTD.Value=0, Nothing, Fields!SalesMTD.Value)
)
In the screenshot below, I took your sample data and added a SUM, a standard AVG, and finally the expression above (labelled AVG2).
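The same zero-vs-NULL distinction exists in SQL's AVG(), so the fix can be sketched at the database level too. Below is a minimal demonstration using Python's standard-library sqlite3 (MySQL's AVG() behaves the same way); the table name "sales" is invented for the sketch, and NULLIF plays the role of the IIF/Nothing expression.

```python
# AVG() includes zeros but skips NULLs: converting 0 to NULL via NULLIF
# reproduces the SSRS IIF(...=0, Nothing, ...) fix in plain SQL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sales_mtd REAL)")
conn.executemany("INSERT INTO sales VALUES (?)",
                 [(0,), (0,), (200,), (0,), (0,), (0,), (1500,), (0,)])

avg_with_zeros, avg_nullif = conn.execute(
    "SELECT AVG(sales_mtd), AVG(NULLIF(sales_mtd, 0)) FROM sales"
).fetchone()
print(avg_with_zeros)  # 212.5 -> (200 + 1500) / 8, zeros counted
print(avg_nullif)      # 850.0 -> (200 + 1500) / 2, zeros skipped
```

Note the caveat from the question still applies: if 0 can be a real value, converting it to NULL will hide legitimate rows, so the cleaner fix is to avoid emitting 0 for non-matching rows in the first place.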

Automatically inserting additional columns in MySQL 8.0 [duplicate]

This question already has an answer here:
MySQL pivot row into dynamic number of columns
(1 answer)
Closed 3 years ago.
Say I have a table like so
+----+----------+------+
| id | name | type |
+----+----------+------+
| 1 | apple | F |
| 1 | pear | F |
| 1 | cucumber | V |
| 2 | orange | F |
| 2 | grass | NULL |
| 2 | broccoli | V |
| 3 | flower | NULL |
| 3 | tomato | NULL |
+----+----------+------+
I want to end up with a table that counts the number of elements for each type (including NULL types) AND for each id, like this:
+----+-----------------+--------------+--------------+
| id | type_NULL_count | type_F_count | type_V_count |
+----+-----------------+--------------+--------------+
| 1 | 0 | 2 | 1 |
| 2 | 1 | 1 | 1 |
| 3 | 2 | 0 | 0 |
+----+-----------------+--------------+--------------+
This is rather easy to do, but is there a way (a query I can write or something else) such that when I go back and edit one of the type fields in the first table, I end up with a properly updated count table?
For example, let's say I want to add a new type (type X) and change the type field for flower from NULL to X. Is there a way to end up with the following table without having to rewrite the query or add more statements?
+----+-----------------+--------------+--------------+--------------+
| id | type_NULL_count | type_F_count | type_V_count | type_X_count |
+----+-----------------+--------------+--------------+--------------+
| 1 | 0 | 2 | 1 | 0 |
| 2 | 1 | 1 | 1 | 0 |
| 3 | 1 | 0 | 0 | 1 |
+----+-----------------+--------------+--------------+--------------+
I'm not sure if this is the best way to do this, so I am open to suggestions
Having a secondary table whose number of columns changes based on your first table is not a viable option.
Do you need to keep the result in a table, or will it be displayed as a report?
I think a better way to do this is to use the SQL below to calculate counts by id plus type, and then display them however you like with your data-display tool.
select id, type, count(*) count
from d
group by 1,2
order by 1,2
The output would be (with NULL shown for the missing types):
id  type  count
1   F     2
1   V     1
2   F     1
2   V     1
2   NULL  1
3   X     1
3   NULL  1
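To show why the long format sidesteps the schema problem, here is the query run with Python's standard-library sqlite3 as a stand-in for MySQL, against the sample data with "flower" already changed to the new type X; the table is named "d" as in the answer. No new column is needed for the new type.

```python
# Long-format counts by (id, type): new types become new rows, not new
# columns, so no schema change is needed when a type is added.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE d (id INTEGER, name TEXT, type TEXT)")
conn.executemany("INSERT INTO d VALUES (?, ?, ?)", [
    (1, "apple", "F"), (1, "pear", "F"), (1, "cucumber", "V"),
    (2, "orange", "F"), (2, "grass", None), (2, "broccoli", "V"),
    (3, "flower", "X"), (3, "tomato", None),   # flower changed NULL -> X
])

rows = conn.execute("""
    SELECT id, type, COUNT(*) AS count
    FROM d
    GROUP BY 1, 2
    ORDER BY 1, 2
""").fetchall()
for r in rows:
    print(r)  # NULL types come back as Python None
```

Pivoting these rows into one column per type is then the display tool's job (or a dynamically built pivot query, as in the linked duplicate).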

Get amount of records with specific value, but only once per unique field

I'm not looking for a complete answer, but maybe some pointers as to what kind of mysql commands I should look at to figure this out.
I have a series of sensors (30+) connected to my network. At different intervals I request their status and each of the devices replies with n-amount of booleans, where n can be anything from zero to 120 (so the response can be an empty object, a list of 120 booleans, or any amount in between).
Per received boolean I create a new record, together with the device's mac address and a timestamp. For example (see also this sqlfiddle):
+----+-------------------+---------------------+--------+
| id | device_address | timestamp | status |
+----+-------------------+---------------------+--------+
| 1 | f2:49:d2:17:5d:8d | 2018-09-22 15:54:51 | 0 |
| 2 | fd:30:ec:08:67:9a | 2018-09-22 15:54:56 | 0 |
| 3 | f8:8d:d9:64:a4:7c | 2018-09-22 15:54:58 | 1 |
| 4 | f2:49:d2:17:5d:8d | 2018-09-22 15:55:51 | 0 |
| 5 | f2:49:d2:17:5d:8d | 2018-09-22 15:55:52 | 0 |
| 6 | fd:30:ec:08:67:9a | 2018-09-22 15:55:56 | 1 |
| 7 | f8:8d:d9:64:a4:7c | 2018-09-22 15:55:58 | 1 |
| 8 | f2:49:d2:17:5d:8d | 2018-09-22 15:56:52 | 0 |
| 9 | f2:49:d2:17:5d:8d | 2018-09-22 15:57:52 | 1 |
| 10 | f2:49:d2:17:5d:8d | 2018-09-22 15:58:52 | 1 |
+----+-------------------+---------------------+--------+
Or, with the mac address replaced for better readability:
+----+-------------------+---------------------+--------+
| id | device_address | timestamp | status |
+----+-------------------+---------------------+--------+
| 1 | A | 2018-09-22 15:54:51 | 0 |
| 2 | BB | 2018-09-22 15:54:56 | 0 |
| 3 | CCC | 2018-09-22 15:54:58 | 1 |
| 4 | A | 2018-09-22 15:55:51 | 0 |
| 5 | A | 2018-09-22 15:55:52 | 0 |
| 6 | BB | 2018-09-22 15:55:56 | 1 |
| 7 | CCC | 2018-09-22 15:55:58 | 1 |
| 8 | A | 2018-09-22 15:56:52 | 0 |
| 9 | A | 2018-09-22 15:57:52 | 1 |
| 10 | A | 2018-09-22 15:58:52 | 1 |
+----+-------------------+---------------------+--------+
In the end I want to be able to graph these values, grouped in intervals. For example, when I graph the last 2 hours' worth of data, I want to use 5 minute intervals. Per interval I want to know how many (unique) devices had a status of 1 at least once in that period, and how many only had zeroes. Devices that don't appear within the timeblock at all (because they didn't return a boolean) are not relevant to that timeblock.
The above records would fall within two of such 5 minute timeblocks:
15:50:00 to 15:54:59 - ids 1 2 3
15:55:00 to 15:59:59 - ids 4 5 6 7 8 9 10
The kind of response I'd like is something like this:
+---------------------+---------------------------------+-------------------------+
| timeblock start | dev w/ at least one status of 1 | dev w/ only status of 0 |
+---------------------+---------------------------------+-------------------------+
| 2018-09-22 15:50:00 | 1 | 2 |
| 2018-09-22 15:55:00 | 2 | 1 |
+---------------------+---------------------------------+-------------------------+
The final result does not have to be like this exactly; other results that let me deduce these numbers would also work. The same is true for the timestamp field: this 2018-09-22 15:50:00 format would be great, but other formats that let me deduce what the timeblock was would also work.
Doing something like this gets me the different timeblocks and the number of unique devices within each timeblock, but it counts the total number of 1s and 0s instead of combining the results per unique device.
SELECT timestamp,
SUM(status) as ones, COUNT(status)-SUM(status) as zeroes,
COUNT(DISTINCT(device_address)) as unique_devices
FROM records
GROUP BY UNIX_TIMESTAMP(timestamp) DIV 300
ORDER BY timestamp ASC
result:
+----------------------+------+--------+----------------+
| timestamp | ones | zeroes | unique devices |
+----------------------+------+--------+----------------+
| 2018-09-22T15:54:51Z | 1 | 2 | 3 |
| 2018-09-22T15:57:52Z | 4 | 3 | 3 |
+----------------------+------+--------+----------------+
Use conditional aggregation
SELECT timestamp,
count(distinct case when status = 1 then device_address end) as ones,
count(distinct case when status = 0 then device_address end) as zeros
FROM records
GROUP BY UNIX_TIMESTAMP(timestamp) DIV 300
ORDER BY timestamp ASC
sqlfiddle demo
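A runnable sketch of this answer, using Python's standard-library sqlite3: UNIX_TIMESTAMP(ts) DIV 300 is MySQL syntax, so CAST(strftime('%s', ts) AS INTEGER) / 300 stands in for it here, and MIN(timestamp) is used for the first column so the grouping is well-defined. Note that a device logging both 0s and 1s in one block (device A in the second block) is counted in both columns; for a strict "only zeros" count, subtract the ones count from COUNT(DISTINCT device_address).

```python
# Per-5-minute-block distinct-device counts via COUNT(DISTINCT CASE ...);
# sqlite's strftime('%s', ...) replaces MySQL's UNIX_TIMESTAMP().
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (device_address TEXT, timestamp TEXT, status INTEGER)")
conn.executemany("INSERT INTO records VALUES (?, ?, ?)", [
    ("A",   "2018-09-22 15:54:51", 0),
    ("BB",  "2018-09-22 15:54:56", 0),
    ("CCC", "2018-09-22 15:54:58", 1),
    ("A",   "2018-09-22 15:55:51", 0),
    ("A",   "2018-09-22 15:55:52", 0),
    ("BB",  "2018-09-22 15:55:56", 1),
    ("CCC", "2018-09-22 15:55:58", 1),
    ("A",   "2018-09-22 15:56:52", 0),
    ("A",   "2018-09-22 15:57:52", 1),
    ("A",   "2018-09-22 15:58:52", 1),
])

rows = conn.execute("""
    SELECT MIN(timestamp) AS timeblock_start,
           COUNT(DISTINCT CASE WHEN status = 1 THEN device_address END) AS ones,
           COUNT(DISTINCT CASE WHEN status = 0 THEN device_address END) AS zeros
    FROM records
    GROUP BY CAST(strftime('%s', timestamp) AS INTEGER) / 300
    ORDER BY timeblock_start
""").fetchall()
for r in rows:
    print(r)
```

With this sample data the second block yields ones = 3 and zeros = 1, because device A has both statuses in that block.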

Converting lump sums to transactions

I have a database that tracks the size of claims.
Each claim has fixed information that is stored in claim (such as claim_id and date_reported_to_insurer).
Each month, I get a report which is added to the table claim_month. This includes fields such as claim_id, month_id [101 is 31/01/2018, 102 is 28/02/2018, etc] and paid_to_date.
Since most claims don't change from month to month, I only add a record for claim_month when the figure has changed since last month. As such, a claim may have a June report and an August report, but not a July report. This would be because the amount paid to date increased in June and August, but not July.
The problem that I have now is that I want to be able to check the amount paid each month.
Consider the following example data:
+----------------+----------+----------------+--------------+
| claim_month_id | claim_id | month_id | paid_to_date |
+----------------+----------+----------------+--------------+
| 1 | 1 | 6 | 1000 |
+----------------+----------+----------------+--------------+
| 5 | 1 | 7 | 1200 |
+----------------+----------+----------------+--------------+
| 7 | 2 | 6 | 500 |
+----------------+----------+----------------+--------------+
| 12 | 1 | 9 | 1400 |
+----------------+----------+----------------+--------------+
| 18 | 2 | 8 | 600 |
+----------------+----------+----------------+--------------+
If we assume that this is all of the information regarding claim 1 and 2, then that would suggest that they are both claims that occurred during June 2018. Their transactions should look like the following:
+----------------+----------+----------------+------------+
| claim_month_id | claim_id | month_id | paid_month |
+----------------+----------+----------------+------------+
| 1 | 1 | 6 | 1000 |
+----------------+----------+----------------+------------+
| 5 | 1 | 7 | 200 |
+----------------+----------+----------------+------------+
| 7 | 2 | 6 | 500 |
+----------------+----------+----------------+------------+
| 12 | 1 | 9 | 200 |
+----------------+----------+----------------+------------+
| 18 | 2 | 8 | 100 |
+----------------+----------+----------------+------------+
The algorithm I'm using for this is
SELECT claim_month_id,
month_id,
claim_id,
new.paid_to_date - old.paid_to_date AS paid_to_date_change
FROM claim_month AS new
LEFT JOIN claim_month AS old
ON new.claim_id = old.claim_id
AND ( new.month_id > old.month_id
OR old.month_id IS NULL )
GROUP BY new.claim_month_id
HAVING old.month_id = Max(old.month_id)
However this has two issues:
It seems really inefficient at dealing with claims with multiple
records. I haven't run any benchmarking, but it's pretty obvious.
It doesn't show new claims. In the above example, it would only show lines 2, 3 and 5.
Where am I going wrong with my algorithm, and is there a better logic to use to do this?
Use the LAG() window function (MySQL 8.0+) to get the previous paid_to_date of each claim_id, and subtract it from the current paid_to_date. The third argument to LAG() makes the previous value default to 0 on the first row of each claim, so new claims show their full first payment.
SELECT
claim_month_id,
claim_id,
month_id,
paid_to_date - LAG(paid_to_date, 1, 0) OVER (PARTITION BY claim_id ORDER BY month_id) AS paid_month
FROM claim_month
The output table is:
+----------------+----------+----------+------------+
| claim_month_id | claim_id | month_id | paid_month |
+----------------+----------+----------+------------+
| 1 | 1 | 6 | 1000 |
| 5 | 1 | 7 | 200 |
| 12 | 1 | 9 | 200 |
| 7 | 2 | 6 | 500 |
| 18 | 2 | 8 | 100 |
+----------------+----------+----------+------------+
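The answer above can be verified end to end with Python's standard-library sqlite3, which supports the same LAG() syntax (window functions require SQLite 3.25+, bundled with Python 3.7+ in most builds); the table and column names come from the question.

```python
# LAG(paid_to_date, 1, 0) turns cumulative paid-to-date figures into
# per-month deltas; the default 0 handles each claim's first record.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE claim_month
                (claim_month_id INTEGER, claim_id INTEGER,
                 month_id INTEGER, paid_to_date INTEGER)""")
conn.executemany("INSERT INTO claim_month VALUES (?, ?, ?, ?)", [
    (1, 1, 6, 1000), (5, 1, 7, 1200), (7, 2, 6, 500),
    (12, 1, 9, 1400), (18, 2, 8, 600),
])

rows = conn.execute("""
    SELECT claim_month_id, claim_id, month_id,
           paid_to_date - LAG(paid_to_date, 1, 0)
               OVER (PARTITION BY claim_id ORDER BY month_id) AS paid_month
    FROM claim_month
    ORDER BY claim_id, month_id
""").fetchall()
for r in rows:
    print(r)  # (claim_month_id, claim_id, month_id, paid_month)
```

Note that a missing month (claim 1 has no month 8) simply attributes the whole change to the next month present, which matches the question's expected output.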

MySQL return prioritizes value else return other value

I have a table (This is a mock-table),
+------------+------+--------+
| Name | Code | Active |
+------------+------+--------+
| Sales | 55 | 1 |
| Sales | 55 | 0 |
| IT | 22 | 1 |
| Production | 33 | 1 |
| Production | 33 | 0 |
| Marketing | 77 | 0 |
| Marketing | 77 | 0 |
+------------+------+--------+
And I want to return a list of distinct names and codes. However, I want to determine whether the department is active or not: if Sales has a 1 in Active and a 0 in Active, they are active, but if they had only zeros then they are not.
I've tried a variety of methods and read through a few dozen SO posts, but am not making any progress.
The output I am trying to achieve is:
+------------+------+--------+
| Name | Code | Active |
+------------+------+--------+
| Sales | 55 | 1 |
| IT | 22 | 1 |
| Production | 33 | 1 |
| Marketing | 77 | 0 |
+------------+------+--------+
How can I prioritize the Active column to a value of 1, but still return an entry if all entries with the same code have a value of 0 (such as marketing)?
GROUP BY Name and Code and take the maximum value of Active (assuming 0 and 1 are the only possible values for the Active column):
SELECT Name,Code,MAX(Active) active
FROM tablename
GROUP BY Name,Code
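A quick check of this answer against the mock data, using Python's standard-library sqlite3 as a stand-in for MySQL; the table name "departments" is invented for the sketch.

```python
# MAX(active) per (name, code) group: any 1 wins, all-zero groups stay 0.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (name TEXT, code INTEGER, active INTEGER)")
conn.executemany("INSERT INTO departments VALUES (?, ?, ?)", [
    ("Sales", 55, 1), ("Sales", 55, 0), ("IT", 22, 1),
    ("Production", 33, 1), ("Production", 33, 0),
    ("Marketing", 77, 0), ("Marketing", 77, 0),
])

rows = conn.execute("""
    SELECT name, code, MAX(active) AS active
    FROM departments
    GROUP BY name, code
    ORDER BY code
""").fetchall()
for r in rows:
    print(r)
```

Marketing comes back with active = 0 because all of its rows are 0, while the other departments get 1 from their single active row.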