I have a table with more than 50 million rows.
trackpoint:
+----+------------+-------------------+
| id | created_at | tag |
+----+------------+-------------------+
| 1 | 1484407910 | visitorDevice643 |
| 2 | 1484407913 | visitorDevice643 |
| 3 | 1484407916 | visitorDevice643 |
| 4 | 1484393575 | anonymousDevice16 |
| 5 | 1484393578 | anonymousDevice16 |
+----+------------+-------------------+
where 'created_at' is the timestamp at which the row was added.
I also have a list of timestamps, for example:
timestamps = [1502744400, 1502830800, 1502917200]
I need to select all rows whose created_at falls in each interval between timestamps[i] and timestamps[i+1].
Using the Django ORM, it looks like this:
step = 86400
for ts in timestamps[:-1]:
    trackpoint_set.filter(created_at__gte=ts, created_at__lt=ts + step).values('tag').distinct().count()
Because the real list of timestamps is much longer and the table has many rows, I end up getting a 500 time-out.
So my question is: how can I join the rows with the list of values in ONE raw SQL query, so that the result looks like [(1502744400, 650), (1502830800, 1550)...]
where the first value is the timestamp and the second is the count of unique tags in that interval?
First, add an index on created_at. Next, build the query with a range condition on created_at, from one timestamp up to the next one. Then run the query for each timestamp one by one rather than all at once.
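For comparison, here is a minimal sketch of the single-query variant the question asks about. It is not from the original answer. It assumes the intervals are evenly spaced (step = 86400, as in the question) and uses Python's built-in sqlite3 with a small in-memory table, so the helper name distinct_tags_per_interval and the sample rows are made up for illustration. On MySQL the integer division would need DIV or FLOOR() instead of "/".

```python
import sqlite3


def distinct_tags_per_interval(conn, start, stop, step=86400):
    """Return [(interval_start, distinct_tag_count), ...] from one GROUP BY query.

    Assumes evenly spaced intervals of `step` seconds beginning at `start`.
    Intervals that contain no rows are simply absent from the result.
    """
    sql = """
        SELECT ? + ((created_at - ?) / ?) * ? AS bucket,  -- integer division in SQLite
               COUNT(DISTINCT tag) AS tags
        FROM trackpoint
        WHERE created_at >= ? AND created_at < ?
        GROUP BY bucket
        ORDER BY bucket
    """
    return conn.execute(sql, (start, start, step, step, start, stop)).fetchall()


# Hypothetical sample data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trackpoint (id INTEGER PRIMARY KEY, created_at INTEGER, tag TEXT)"
)
conn.execute("CREATE INDEX idx_trackpoint_created_at ON trackpoint (created_at)")
conn.executemany(
    "INSERT INTO trackpoint (created_at, tag) VALUES (?, ?)",
    [
        (1502744400, "a"),
        (1502744500, "a"),  # same tag again, counted once
        (1502744600, "b"),
        (1502830800, "c"),
        (1502917200, "d"),  # equals `stop`, so it is excluded
    ],
)

result = distinct_tags_per_interval(conn, 1502744400, 1502917200)
print(result)
```

Whether one grouped query or many small indexed range queries is faster depends on the data and the database, so it is worth measuring both.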
I have a MySQL table named rbsess with columns RBSessID (key), ClientID (int), RBUnitID (int), RentAmt (fixed-point int), RBSessStart (DateTime), and PrevID (int, references RBSessID).
It's not transactional or linked. What it does track is when a client was moved into a room and what the rent was at the time of move-in. The query to find what the rent was for a particular client on a particular date is:
SET @DT='Desired date/time';
SET @ClientID=Desired client id;
SELECT a.RBSessID
     , a.ClientID
     , a.RBUnitID
     , a.RentAmt
     , a.RBSessStart
     , b.RBSessStart AS RBSessEnd
     , a.PrevID
  FROM rbsess a
  LEFT
  JOIN rbsess b
    ON b.PrevID=a.RBSessID
 WHERE a.ClientID=@ClientID
   AND (a.RBSessStart<=@DT OR a.RBSessStart IS NULL)
   AND (b.RBSessStart>@DT OR b.RBSessStart IS NULL);
This will output something like:
+----------+----------+----------+---------+---------------------+-----------+--------+
| RBSessID | ClientID | RBUnitID | RentAmt | RBSessStart | RBSessEnd | PrevID |
+----------+----------+----------+---------+---------------------+-----------+--------+
| 2 | 4 | 1 | 57500 | 2020-11-22 00:00:00 | NULL | 1 |
+----------+----------+----------+---------+---------------------+-----------+--------+
I also have
SELECT * FROM rbsess WHERE rbsess.ClientID=@ClientID AND rbsess.PrevID IS NULL; -- for finding the first move-in date
SELECT TIMESTAMPDIFF(DAY,@DT,LAST_DAY(@DT)) AS CountDays; -- for finding the number of days until the end of the month
SELECT DAY(LAST_DAY(@DT)) AS MaxDays; -- for finding the number of days in the month
SELECT (TIMESTAMPDIFF(DAY,@DT,LAST_DAY(@DT))+1)/DAY(LAST_DAY(@DT)) AS ProRateRatio; -- for finding the ratio to calculate the pro-rated rent for the move-in month
SELECT ROUND(40000*(SELECT (TIMESTAMPDIFF(DAY,@DT,LAST_DAY(@DT))+1)/DAY(LAST_DAY(@DT)) AS ProRateRatio)) AS ProRatedRent; -- for finding a pro-rated rent amount based on a rent amount
I'm having trouble putting all of these together into a single query that, given a start date and an optional end date, outputs the rent owed (pro-rated or full) for each month in the period in a single statement. I can add a table of received payments and integrate it afterwards; I'm just having a hard time expressing this seemingly simple real-world concept in a MySQL query. I'm using PHP with a MySQL back end. Temporary tables as intermediary steps are more than acceptable.
Even a nudge would be helpful. I'm not super-experienced with MySQL queries, just your basic CREATE, SELECT, INSERT, DROP, and UPDATE.
Examples as requested by GMB:
//Example data in rbsess table:
+----------+----------+----------+---------+---------------------+--------+
| RBSessID | ClientID | RBUnitID | RentAmt | RBSessStart | PrevID |
+----------+----------+----------+---------+---------------------+--------+
| 1 | 4 | 1 | 40000 | 2020-10-22 00:00:00 | NULL |
| 2 | 4 | 1 | 57500 | 2020-11-22 00:00:00 | 1 |
| 3 | 2 | 5 | 40000 | 2020-11-29 00:00:00 | NULL |
+----------+----------+----------+---------+---------------------+--------+
Expected results would be a list of the rent amounts owed for every month in a given date range, including pro-rated amounts for months with partial occupancy. For example, with the example data above, a date range spanning all of 2020, and the client with ClientID=4, the query would produce an amount for each month within the range, similar to:
Month | Amt
2020-10-1 | 12903
2020-11-1 | 45834
2020-12-1 | 57500
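The pro-rating arithmetic from the ProRateRatio and ProRatedRent queries above can be sketched outside SQL as follows. This is only an illustration of the move-in month calculation, not the full per-month query being asked for. The function name prorated_rent is made up, and Python's round() is assumed to be close enough to MySQL's ROUND() for this purpose (the two differ on exact .5 values).

```python
import calendar
from datetime import date


def prorated_rent(rent_amt, move_in):
    """Pro-rate rent for the move-in month, counting the move-in day itself.

    Mirrors (TIMESTAMPDIFF(DAY, @DT, LAST_DAY(@DT)) + 1) / DAY(LAST_DAY(@DT)).
    """
    days_in_month = calendar.monthrange(move_in.year, move_in.month)[1]
    days_owed = days_in_month - move_in.day + 1
    # Note: Python rounds halves to even, MySQL ROUND() rounds halves away from zero.
    return round(rent_amt * days_owed / days_in_month)


# RBSessID 1 from the example data: 40000 rent, move-in on 2020-10-22.
october = prorated_rent(40000, date(2020, 10, 22))
print(october)
```

For the example row this is 10 days owed out of 31, which gives the October amount shown in the expected results.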
I want to update, once a week, a field in the MySQL table "Persons" with the average of the difference between two fields of the "Tasks" table, end_date and start_date:
PERSON:
+----------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| average_speed | int(11) | NO | | 0 | |
+----------------+-------------+------+-----+---------+-------+
TASKS:
+----------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| person_id | int(11) | NO | | NULL | |
| start_date | date | NO | | NULL | |
| end_date | date | NO | | NULL | |
+----------------+-------------+------+-----+---------+-------+
(tables are not complete).
average_speed = AVG(task.end_date - task.start_date)
Now, the Tasks table is really big, and **I don't want to compute the average over every task for every person every week**. (That's a solution, but I'm trying to avoid it.)
What's the best way to update the average_speed?
I thought about adding two columns to the person table:
"last_count": count of the tasks computed so far for each person
"last_sum": sum of (end_date - start_date) so far for each person
So that on a new update I could do something like average_speed = (last_sum + new_sum) / (last_count + new_count), where new_count is the number of tasks in the last week and new_sum is the sum of their durations.
Is there a better solution/architecture?
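The running sum/count idea above can be sketched like this. The function name merge_average and the sample numbers are made up for illustration; last_sum, last_count, new_sum and new_count are the values described in the question.

```python
def merge_average(last_sum, last_count, new_sum, new_count):
    """Fold one week's totals into the stored totals and return the new state.

    Returns (total_sum, total_count, average). The average is 0 when there
    are no tasks at all, matching the column default in the question.
    """
    total_sum = last_sum + new_sum
    total_count = last_count + new_count
    average = total_sum / total_count if total_count else 0
    return total_sum, total_count, average


# Hypothetical numbers: 4 earlier tasks totalling 100, 1 new task of 50.
state = merge_average(100, 4, 50, 1)
print(state)
```

Storing the sum and the count, rather than only the average, is what keeps the update exact: an average alone cannot be combined with new data without knowing how many tasks it covers.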
EDIT:
To answer a comment, the query I would run is something like this:
SELECT
    count(t.id) as last_count,
    sum(TIMESTAMPDIFF(MINUTE, t.start_date, t.end_date)) as last_sum,
    avg(TIMESTAMPDIFF(MINUTE, t.start_date, t.end_date))
from tasks as t
where t.end_date BETWEEN DATE_SUB(CURDATE(), INTERVAL 1 WEEK) AND CURDATE()
And I can rely on a PHP script to fetch the result and do some calculations.
Having a periodic update to the table is a bad way to go for all the reasons you've listed above, and others.
If you have access to the code that writes to the Tasks table, that’s the best place to put the update. Add an Average field and calculate and set the value when you write the task end time.
If you don’t have access to the code, you can add a calculated field to the table that shows the average and let SQL figure it out during the execution of a query. This can slow queries down a little, but the data is always valid and SQL is smart enough to only calculate that value when it is needed.
A third (ugly) option is a trigger on the table that updates the value when appropriate. I’m not a fan of triggers because they hide business logic in unexpected places, but sometimes you just have to get the job done.
I have this table (Pickups):
+-----------+------------+-------------+------------+
| worker_id | box_weight | bag_weight | date |
+-----------+------------+-------------+------------+
| 1 | 2 | 5 | 11-07-2018 |
| 1 | 7 | 9 | 11-07-2018 |
| 2 | 8 | 11 | 11-07-2018 |
| 2 | 7 | 12 | 11-07-2018 |
+-----------+------------+-------------+------------+
and I want to use Eloquent in Laravel 5.4 to get the sums of box_weight and bag_weight, like this:
+-----------+-----------------+-----------------+------------+
| worker_id | sum(box_weight) | sum(bag_weight) | date |
+-----------+-----------------+-----------------+------------+
| 1 | 9 | 14 | 11-07-2018 |
| 2 | 15 | 23 | 11-07-2018 |
+-----------+-----------------+-----------------+------------+
So far I have only been able to retrieve the sum of a single column, not both in the same call.
Please find the answer below. Since you didn't mention whether you want the sum per worker id for each date or across all dates, I assume per date only. If you want the sum across all dates per worker id, remove date from groupBy.
Eloquent query:
Pickup::select(['worker_id', 'date', DB::raw('sum(box_weight)'), DB::raw('sum(bag_weight)')])
    ->groupBy('worker_id', 'date')
    ->get();
or with the Query Builder:
DB::table('pickups')
    ->select(['worker_id', 'date', DB::raw('sum(box_weight)'), DB::raw('sum(bag_weight)')])
    ->groupBy('worker_id', 'date')
    ->get();
Are you looking for the MySQL query or for Laravel's Query Builder/Eloquent?
I'm assuming you want it grouped by worker_id and not by date; if you also want it by date, just add date to the groupBy.
In the future, show us what you've tried and what you're trying to accomplish in more detail.
If you're looking for the MySQL query, Rom's answer will do just fine:
SELECT worker_id, sum(box_weight), sum(bag_weight), date
FROM pickups
GROUP BY worker_id
If you're going from the Eloquent model:
// Assuming Pickup is your model name
Pickup::selectRaw('worker_id, sum(box_weight), sum(bag_weight), date')
    ->groupBy('worker_id')->get();
Using DB:
DB::table('pickups')->selectRaw('worker_id, sum(box_weight), sum(bag_weight), date')
    ->groupBy('worker_id')->get();
// Or even
DB::select(DB::raw('SELECT worker_id, sum(box_weight), sum(bag_weight), date
    FROM pickups
    GROUP BY worker_id'));
This will give you a collection of pickups; append toArray() to the query if you wish to convert it to an array.
selectRaw is used because ->sum() cannot be combined with ->select(). ->sum() works fine for the sum of a single column, but not for multiple outputs, and the same goes for select(), which cannot interpret sum(column) as a column.
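To check what the grouped query returns independently of Laravel, here is a small sketch that runs the same kind of SQL against the sample data from the question. It uses Python's built-in sqlite3 rather than MySQL, groups by both worker_id and date, and is only meant to illustrate the result shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pickups (worker_id INTEGER, box_weight INTEGER, "
    "bag_weight INTEGER, date TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO pickups VALUES (?, ?, ?, ?)",
    [
        (1, 2, 5, "11-07-2018"),
        (1, 7, 9, "11-07-2018"),
        (2, 8, 11, "11-07-2018"),
        (2, 7, 12, "11-07-2018"),
    ],
)

# Both sums come back from a single query, one row per worker and date.
rows = conn.execute(
    "SELECT worker_id, SUM(box_weight), SUM(bag_weight), date "
    "FROM pickups GROUP BY worker_id, date ORDER BY worker_id"
).fetchall()
print(rows)
```

The rows match the desired output table in the question: 9 and 14 for worker 1, 15 and 23 for worker 2.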
I have a table which looks like this
|Application No | Status | Amount | Type |
==========================================
|90909090 | Null | 3,000 | Null |
|90909090 | Forfeit| Null | A |
What I want to achieve is to combine the values together and end with a result like
|Application No | Status | Amount | Type |
==========================================
|90909090 | Forfeit| 3,000 | A |
I am new to SQL queries and have no idea how to do this.
Thanks in advance
No need to join, use max() aggregate function and group by:
select applicationno, max(status), max(amount), max(type)
from yourtable
group by applicationno
However, if you have several non-null values for an application number in a field, then you may have to define a more granular rule than a simple aggregation via max.
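A quick way to see why max() works here is that aggregate functions skip NULLs, so each column collapses to its single non-null value. The sketch below shows this with Python's built-in sqlite3. The table name applications and the column names are made up for illustration, and the amount is stored as the integer 3000 rather than the formatted text "3,000".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE applications (application_no INTEGER, status TEXT, "
    "amount INTEGER, type TEXT)"
)
# The two partial rows from the question; None is stored as NULL.
conn.executemany(
    "INSERT INTO applications VALUES (?, ?, ?, ?)",
    [
        (90909090, None, 3000, None),
        (90909090, "Forfeit", None, "A"),
    ],
)

# MAX() ignores NULLs, so each group keeps the one non-null value per column.
merged = conn.execute(
    "SELECT application_no, MAX(status), MAX(amount), MAX(type) "
    "FROM applications GROUP BY application_no"
).fetchall()
print(merged)
```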
INTRO: Given a table with a column 'time' of unique dates (or datetimes) and another column with some random integer called 'users'.
I usually run a query like this:
select table.dates, count(table.dates)
from table
group by year(table.dates), month(table.dates)
order by table.dates desc
which will return the number of users per month, albeit in an unformatted way. (I know it's not the standard way, but I check my values and this seems to work)
Here is my problem:
DATA: a table with non-unique year/month dates, and a corresponding user count on each row.
PROBLEM: I wish to sum the user counts for identical dates, and again show a user count for every month.
EDIT: Perhaps you can ignore the INTRO, and here is an example of the data I need to work with:
| Date |user count |
------------------------
|2015-01 | 9 |
|2014-09 | 5 |
|2014-09 | 2 |
|2014-08 | 5 |
|2014-09 | 7 |
|2014-08 | 2 |
|2014-07 | 3 |