Sum a generated list in MySQL, in a single query

I need to sum the result that I'm getting from an existing query. The solution has to extend the current query and remain a single query
(by this I mean NOT: DO 1; DO 2; DO 3;).
My current query is:
SELECT SUM((count)/(SELECT COUNT(*) FROM mobile_site_statistics WHERE campaign_id='1201' AND start_time BETWEEN CURDATE()-1 AND CURDATE())*100) AS percentage FROM mobile_site_statistics WHERE device NOT LIKE '%Pingdom%' AND campaign_id='1201' AND start_time BETWEEN (CURDATE()-1) AND CURDATE() GROUP BY device ORDER BY 1 DESC LIMIT 10;
This returns:
+------------+
| percentage |
+------------+
| 47.3813 |
| 19.7940 |
| 5.6672 |
| 5.0801 |
| 3.9603 |
| 3.8500 |
| 3.1294 |
| 2.9924 |
| 2.9398 |
| 2.7136 |
+------------+
What I need is just the total of that table (the total percentage used by the top 10 devices). It has to be a single query that includes the initial query, because another program consumes the query.
Is this possible? Every way I have tried so far has failed. We tried temporary tables, but that turned into multiple queries.

Just do a
SELECT SUM(percentage) AS total FROM (<YOUR_QUERY>) a
and replace the sub-query <YOUR_QUERY> with your initial query
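The wrapping pattern above can be tried out in a small sketch. This one uses Python's built-in sqlite3 module rather than MySQL, and the table `stats`, its columns, and the sample rows are made up for illustration; only the "SELECT SUM(...) FROM (<YOUR_QUERY>) a" shape comes from the answer.

```python
import sqlite3

# Hypothetical stand-in data: hits per device (not from the original question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (device TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO stats VALUES (?, ?)",
    [("phone_a", 50), ("phone_b", 30), ("tablet_c", 15), ("other_d", 5)],
)

# Inner query: one percentage per device, top 2 only (plays the role of <YOUR_QUERY>).
inner = """
    SELECT SUM(hits) * 100.0 / (SELECT SUM(hits) FROM stats) AS percentage
    FROM stats
    GROUP BY device
    ORDER BY 1 DESC
    LIMIT 2
"""

# Outer query: still a single statement, summing the rows the inner query produces.
total = conn.execute(
    f"SELECT SUM(percentage) AS total FROM ({inner}) a"
).fetchone()[0]
print(total)
```

The derived table needs an alias (here `a`) in MySQL, which is why the answer includes it.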

Related

How to reuse variables in the select statement of mysql

I would like to use MySQL variables to avoid repeating the same expression. In the following example I would like to sum the salary of each employee and also show twice that sum. As you can see, the second column is wrong.
MariaDB [Messdaten]> select * from t;
+----+----------+--------+
| id | employee | salery |
+----+----------+--------+
| 1 | 10 | 1000 |
| 2 | 10 | 2000 |
| 3 | 20 | 3000 |
| 4 | 20 | 4000 |
+----+----------+--------+
4 rows in set (0.000 sec)
MariaDB [Messdaten]> select employee, @x:=sum(salery), 2*@x from t group by employee;
+----------+-----------------+-------+
| employee | @x:=sum(salery) | 2*@x |
+----------+-----------------+-------+
| 10 | 3000 | 14000 |
| 20 | 7000 | 14000 |
+----------+-----------------+-------+
2 rows in set (0.001 sec)
Of course I could use select employee, sum(salery), 2*sum(salery), but in my real use case the expressions are very long and therefore hard to read.
What is going wrong here? And if this is a limitation of MySQL, are there any workarounds?
You can use a subquery like this to get the correct result while only summing (or executing a more complex expression) once:
SELECT
    employee,
    totalSalary,
    totalSalary*2 AS doubleSalary
FROM (
    SELECT
        employee,
        sum(salary) AS totalSalary
    FROM employees
    GROUP BY employee
) AS employeeSalaries;
The unexpected variable behaviour is described in the MySQL docs here.
HAVING, GROUP BY, and ORDER BY, when referring to a variable that is assigned a value in the select expression list do not work as expected because the expression is evaluated on the client and thus can use stale column values from a previous row.
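As a check of the subquery approach against the table from the question, here is a minimal sketch using Python's sqlite3 module instead of MariaDB. The table name `t` and the column spelling `salery` follow the question; the aliases `total`, `doubled`, and `s` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, employee INTEGER, salery INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 10, 1000), (2, 10, 2000), (3, 20, 3000), (4, 20, 4000)],
)

# The aggregate is computed once in the derived table and reused by name outside it.
rows = conn.execute(
    """
    SELECT employee, total, 2 * total AS doubled
    FROM (SELECT employee, SUM(salery) AS total FROM t GROUP BY employee) AS s
    ORDER BY employee
    """
).fetchall()

for row in rows:
    print(row)
```

Each employee now gets its own doubled value, instead of the stale 14000 the variable-based query showed for both rows.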

Why is Last selecting First in MS Access?

I have the following SQL:
SELECT tbl_G_stats_atp.PK_G, tbl_G_stats_atp.InjuryCnt
FROM tbl_G_stats_atp
WHERE (((tbl_G_stats_atp.ID_A)=89) AND ((tbl_G_stats_atp.DATE_S)<37500))
GROUP BY tbl_G_stats_atp.PK_G, tbl_G_stats_atp.InjuryCnt;
It produces this result:
+---------+-----------+
| PK_G | InjuryCnt |
+---------+-----------+
| 1203857 | 0 |
| 1203881 | 0 |
| 1203890 | 0 |
| 1203913 | 0 |
| 1203916 | 0 |
| 1203989 | 0 |
| 1204001 | 0 |
| 1204102 | 0 |
| 1204172 | 0 |
+---------+-----------+
I want to select the last record so have used this SQL:
SELECT Last(tbl_G_stats_atp.PK_G) AS LastOfPK_G, tbl_G_stats_atp.InjuryCnt
FROM tbl_G_stats_atp
WHERE (((tbl_G_stats_atp.ID_A)=89) AND ((tbl_G_stats_atp.DATE_S)<37500))
GROUP BY tbl_G_stats_atp.InjuryCnt
ORDER BY Last(tbl_G_stats_atp.PK_G);
However it returns the first record (1203857).
I realise I can use this SQL as a replacement:
SELECT Max(tbl_G_stats_atp.PK_G) AS MaxOfPK_G, tbl_G_stats_atp.InjuryCnt
FROM tbl_G_stats_atp
WHERE (((tbl_G_stats_atp.ID_A)=89) AND ((tbl_G_stats_atp.DATE_S)<37500))
GROUP BY tbl_G_stats_atp.InjuryCnt;
However I'd like to understand why it's doing this. I may in future want to select the last record on a non-numeric field...
You have to be careful with First and Last because records do not have an intrinsic order. Even with an ORDER BY clause, results may not be what you expect. I have avoided Last/First, but I just did a simple test and was able to return the value from the last record added to a table, with no WHERE, GROUP BY, or ORDER BY clauses included.
If you want to return all fields from that record, consider:
SELECT TOP 1 tbl_G_stats_atp.* FROM tbl_G_stats_atp WHERE ID_A=89 AND DATE_S<37500 ORDER BY PK_G DESC;
Even then, there must be a field of unique values that can be relied on to order the records so that the desired record is brought to the top. Usually an autonumber ID is positive and increasing (I've never seen otherwise) and should accomplish that. Or perhaps a date/time field will serve.
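The idea of relying on an explicit sort key rather than on Last can be sketched outside Access as well. The example below uses Python's sqlite3 module, where "LIMIT 1" plays the role of Access's "TOP 1"; the ID_A and DATE_S values are made-up sample data, and the rows are inserted out of order on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_G_stats_atp "
    "(PK_G INTEGER, ID_A INTEGER, DATE_S INTEGER, InjuryCnt INTEGER)"
)
# Insert in a deliberately shuffled order: storage order is not a reliable "last".
conn.executemany(
    "INSERT INTO tbl_G_stats_atp VALUES (?, 89, 37000, 0)",
    [(1204001,), (1203857,), (1204172,), (1203913,)],
)

# Sort by the unique key and keep one row; SQLite spells TOP 1 as LIMIT 1.
row = conn.execute(
    """
    SELECT * FROM tbl_G_stats_atp
    WHERE ID_A = 89 AND DATE_S < 37500
    ORDER BY PK_G DESC
    LIMIT 1
    """
).fetchone()
print(row)
```

Because the result depends only on the ORDER BY column, the same pattern works for a non-numeric key as long as its sort order matches what "last" is supposed to mean.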

Doing Math with SQL entries

I am trying to use MySQL to look at the values in our database and output the sum of the values in a column between two dates. So far I have gotten it to at least select the correct range of dates using this code:
SELECT fuelDate, SUM(gallons)
FROM fuel566243
WHERE fuelDate BETWEEN '2019-01-04' AND '2019-01-24'
GROUP BY fuelDate
ORDER BY fuelDate
The issue with this code is that the SUM column it generates just displays all the values of the gallons column, but doesn't add them up. When I run the code above, it gives me the output below. How do I then get the sum of the gallons column? Is there a way to do that in SQL and output a result, or is it easier to use a foreach loop in PHP to cycle through the array and add the values?
+------------+----------+
| fuelDate | Gallons |
+------------+----------+
| 2019-01-04 | 53.8885 |
| 2019-01-15 | 198.1700 |
| 2019-01-17 | 167.2750 |
| 2019-01-23 | 176.5620 |
| 2019-01-24 | 181.0240 |
+------------+----------+
The GROUP BY defines the rows that are being returned. You have included gallons as an aggregation key, so you have specified that you want a separate row for each value.
Simply remove the key from the GROUP BY and the SELECT:
SELECT fuelDate, SUM(gallons)
FROM fuel566243
WHERE fuelDate BETWEEN '2019-01-04' AND '2019-01-24'
GROUP BY fuelDate
ORDER BY fuelDate;
If you want the total during the period, then you want an aggregation with no GROUP BY:
SELECT SUM(gallons)
FROM fuel566243
WHERE fuelDate BETWEEN '2019-01-04' AND '2019-01-24';
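To make the difference between the two queries concrete, here is a small sketch using Python's sqlite3 module in place of MySQL. It loads the five values shown in the question as one row per date (an assumption, since the real table may hold several rows per day) and runs both the grouped and the ungrouped aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fuel566243 (fuelDate TEXT, gallons REAL)")
conn.executemany(
    "INSERT INTO fuel566243 VALUES (?, ?)",
    [
        ("2019-01-04", 53.8885),
        ("2019-01-15", 198.1700),
        ("2019-01-17", 167.2750),
        ("2019-01-23", 176.5620),
        ("2019-01-24", 181.0240),
    ],
)

where = "WHERE fuelDate BETWEEN '2019-01-04' AND '2019-01-24'"

# With GROUP BY: one row (and one sum) per date.
per_day = conn.execute(
    f"SELECT fuelDate, SUM(gallons) FROM fuel566243 {where} "
    "GROUP BY fuelDate ORDER BY fuelDate"
).fetchall()

# Without GROUP BY: a single row holding the total for the whole period.
total = conn.execute(
    f"SELECT SUM(gallons) FROM fuel566243 {where}"
).fetchone()[0]

print(len(per_day), "daily rows; total gallons:", round(total, 4))
```

No PHP loop is needed for the total; the database returns it as a single value.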

MySQL query with list of values

I have a table with more than 50 million rows.
trackpoint:
+----+------------+-------------------+
| id | created_at | tag |
+----+------------+-------------------+
| 1 | 1484407910 | visitorDevice643 |
| 2 | 1484407913 | visitorDevice643 |
| 3 | 1484407916 | visitorDevice643 |
| 4 | 1484393575 | anonymousDevice16 |
| 5 | 1484393578 | anonymousDevice16 |
+----+------------+-------------------+
where 'created_at' is the timestamp at which the row was added.
I also have a list of timestamps, for example:
timestamps = [1502744400, 1502830800, 1502917200]
I need to select the rows whose created_at falls in each interval between timestamp i and timestamp i+1 of that list.
Using the Django ORM it looks like this:
step = 86400
for ts in timestamps[:-1]:
    trackpoint_set.filter(created_at__gte=ts, created_at__lt=ts + step).values('tag').distinct().count()
Because the actual timestamps list is much longer and the table has many rows, I end up getting a 500 time-out.
So my question is: how can I do this in ONE raw SQL query that joins the rows with the list of values, so that the result looks like [(1502744400, 650), (1502830800, 1550)...]
where the first value is the timestamp and the second is the count of unique tags in each interval?
First, index created_at. Next, build a query that restricts created_at to a single interval, like created_at in (timestamp, timestamp+1). Then run that query for each timestamp one by one rather than for all of them at once.
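If a single statement is still preferred, one possible alternative (not part of the answer above) is to compute the interval start from created_at and group by it. The sketch below assumes the timestamps are evenly spaced by `step`, as in the question's Django loop, and uses Python's sqlite3 module with made-up rows; in MySQL the integer division would be written with DIV or FLOOR().

```python
import sqlite3

timestamps = [1502744400, 1502830800, 1502917200]
step = 86400
start, end = timestamps[0], timestamps[-1]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trackpoint (id INTEGER PRIMARY KEY, created_at INTEGER, tag TEXT)"
)
conn.execute("CREATE INDEX idx_created_at ON trackpoint (created_at)")
# Hypothetical sample rows spread over the two intervals, plus one outside them.
conn.executemany(
    "INSERT INTO trackpoint (created_at, tag) VALUES (?, ?)",
    [
        (start + 10, "visitorDevice643"),
        (start + 20, "visitorDevice643"),
        (start + 30, "anonymousDevice16"),
        (start + step + 5, "anonymousDevice16"),
        (end + 1, "visitorDevice643"),  # outside the covered range, filtered out
    ],
)

# Integer division maps every created_at to the start of its interval.
rows = conn.execute(
    """
    SELECT ? + ((created_at - ?) / ?) * ? AS bucket,
           COUNT(DISTINCT tag) AS unique_tags
    FROM trackpoint
    WHERE created_at >= ? AND created_at < ?
    GROUP BY bucket
    ORDER BY bucket
    """,
    (start, start, step, step, start, end),
).fetchall()
print(rows)
```

Intervals that contain no rows simply do not appear in the result, so the caller would have to fill in zero counts for them.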

How to sample rows in MySQL using RAND(seed)?

I need to fetch a repeatable random set of rows from a table using MySQL. I implemented this with the MySQL RAND function, using the bigint primary key of the row as the seed. Interestingly, this produces numbers that don't look random at all. Can anyone tell me what's going on here and how to get it to work properly?
select id from foo where rand(id) < 0.05 order by id desc limit 100
In one example, out of 600 rows not a single one was returned. When I changed the select to include "id, rand(id)" and removed the rand clause from the where, this is what I got:
| 163345 | 0.315191733944408 |
| 163343 | 0.814825518815616 |
| 163337 | 0.313726862253367 |
| 163334 | 0.563177533972242 |
| 163333 | 0.312994424545201 |
| 163329 | 0.312261986837035 |
| 163327 | 0.811895771708242 |
| 163322 | 0.560980224573035 |
| 163321 | 0.310797115145994 |
| 163319 | 0.810430896291911 |
| 163318 | 0.560247786864869 |
| 163317 | 0.310064677437828 |
Look how many 0.31xxx lines there are. Not at all random.
PS: I know this is slow but in my app the where clause limits the number of rows to a few 1000.
Use the same seed for all the rows to do that, like:
select id from foo where rand(42) < 0.05 order by id desc limit 100
See the rand() docs for why it works that way. Change the seed if you want another set of values.
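The "seed once, then draw a value per row" idea can be illustrated outside MySQL. The sketch below uses Python's random module as an analogy only; it does not reproduce MySQL's RAND() values, and the helper `sample_ids` and the id range are hypothetical.

```python
import random


def sample_ids(ids, seed, fraction):
    """Keep roughly `fraction` of ids, drawing from one generator seeded once."""
    rng = random.Random(seed)  # a single seed for the whole pass, not one per row
    return [i for i in ids if rng.random() < fraction]


ids = list(range(1, 1001))
first = sample_ids(ids, 42, 0.05)
second = sample_ids(ids, 42, 0.05)

# Same seed and same row order give the same sample; another seed gives another set.
print(first == second, len(first))
```

As in the SQL version, repeatability also depends on the rows being visited in the same order each time.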
Multiply the decimal number returned by id:
select id from foo where rand() * id < 5 order by id desc limit 100