MySQL: how to remove similar database rows based on username and timestamp

I have a database table that I use for logging. I would like to search the table to find all entries with a particular action on the same day by the same user and only keep one of them.
The columns I think need to be used are:
"activity_action" - where the action sought is 'Daily Website Access'
"user_email" - Looking for the same user on the same day
"activity_timestamp" - where I want to ignore the time and just check whether the action was on the same day

If you are looking for a delete statement, here is one way to do it in MySQL:
delete t
from mytable t
inner join mytable t1
on t1.user_email = t.user_email
and t1.activity_action = t.activity_action
and date(t1.activity_timestamp) = date(t.activity_timestamp)
and t1.activity_timestamp < t.activity_timestamp
where t.activity_action = 'Daily Website Access'
This deletes each record that has the sought activity_action and for which another record exists with the same action and email, on the same day, and with an earlier timestamp. In other words, it deletes duplicates per action/user/day while retaining the earliest.
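For anyone who wants to try the idea on a throwaway data set first, here is a minimal sketch in Python using an in-memory SQLite database as a stand-in for MySQL (an assumption for illustration only). SQLite has no multi-table DELETE, so the self-join condition is expressed with EXISTS instead; the sample rows and the id column are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, user_email TEXT, "
    "activity_action TEXT, activity_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO mytable (user_email, activity_action, activity_timestamp) "
    "VALUES (?, ?, ?)",
    [
        ("a@example.com", "Daily Website Access", "2020-01-01 08:00:00"),
        ("a@example.com", "Daily Website Access", "2020-01-01 17:30:00"),  # duplicate
        ("a@example.com", "Daily Website Access", "2020-01-02 09:00:00"),
        ("b@example.com", "Daily Website Access", "2020-01-01 10:00:00"),
        ("a@example.com", "Login", "2020-01-01 08:05:00"),
        ("a@example.com", "Login", "2020-01-01 08:06:00"),  # other action, kept
    ],
)

# Delete a row when an earlier row exists for the same user, action and day.
conn.execute(
    """
    DELETE FROM mytable
    WHERE activity_action = 'Daily Website Access'
      AND EXISTS (
        SELECT 1 FROM mytable t1
        WHERE t1.user_email = mytable.user_email
          AND t1.activity_action = mytable.activity_action
          AND date(t1.activity_timestamp) = date(mytable.activity_timestamp)
          AND t1.activity_timestamp < mytable.activity_timestamp
      )
    """
)

remaining = conn.execute(
    "SELECT user_email, activity_action, activity_timestamp "
    "FROM mytable ORDER BY id"
).fetchall()
for row in remaining:
    print(row)
```

Only the 17:30 'Daily Website Access' row for a@example.com is removed; rows with other actions are left alone because of the outer filter on activity_action.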

Related

Where clause with multi AND & OR conditions

I have a table agenda in which the admin can make a reservation for himself or for someone else (another user). If the admin makes the reservation for himself, the admin's id is stored in agenda.user_id.
If the admin makes a reservation for another user, agenda.user_id stores the id of the user for whom the reservation is made, and the admin's id is stored in another column, agenda.booked_user.
All reservations are also stored in the agenda_users table, which has these columns: id, agenda_id, user_id. agenda_users.user_id refers to agenda.user_id.
I want to retrieve all the reservations the admin has made, both for himself and for other users.
I wrote a query with some AND and OR conditions:
SELECT agenda.*
FROM agenda,agenda_users
WHERE agenda_users.agenda_id=agenda.id
AND (agenda_users.user_id=$user_id
AND agenda_users.user_id=agenda.user_id)
OR (agenda_users.user_id=agenda.user_id
AND agenda.booked_user=agenda.$user_id)
AND checkout IS NULL
AND NOW() < DATE_ADD(date_end, INTERVAL 6 HOUR) ORDER BY type ASC,date_start ASC
I cannot figure out the right query to grab all the reservations the admin has made for himself and for other users.
Replacing the old-style joins with explicit joins leaves you with this SQL:
SELECT agenda.*
FROM agenda
INNER JOIN agenda_users ON agenda_users.user_id=agenda.user_id AND agenda_users.agenda_id=agenda.id
WHERE
(agenda_users.user_id=$user_id) OR (agenda.booked_user=$user_id)
AND checkout IS NULL
AND NOW() < DATE_ADD(date_end, INTERVAL 6 HOUR) ORDER BY type ASC,date_start ASC;
This SQL is almost human-readable (and understandable). 😉
EDIT: Added extra () because AND has higher precedence than OR.
SELECT agenda.*
FROM agenda
INNER JOIN agenda_users ON agenda_users.user_id=agenda.user_id AND agenda_users.agenda_id=agenda.id
WHERE
((agenda_users.user_id=$user_id) OR (agenda.booked_user=$user_id))
AND checkout IS NULL
AND NOW() < DATE_ADD(date_end, INTERVAL 6 HOUR) ORDER BY type ASC,date_start ASC;
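To see what the precedence rule does on a small data set, here is a minimal sketch in Python with an in-memory SQLite database standing in for MySQL (an assumption; both give AND higher precedence than OR). The sample rows, the admin id 7 and the reduced column set are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agenda (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "booked_user INTEGER, checkout TEXT)"
)
conn.executemany(
    "INSERT INTO agenda (id, user_id, booked_user, checkout) VALUES (?, ?, ?, ?)",
    [
        (1, 7, None, None),          # admin 7 booked for himself, still open
        (2, 7, None, "2020-01-05"),  # admin 7 booked for himself, checked out
        (3, 9, 7, None),             # admin 7 booked for user 9, still open
        (4, 9, 7, "2020-01-06"),     # admin 7 booked for user 9, checked out
    ],
)
admin_id = 7

# Without parentheses this parses as: a OR (b AND checkout IS NULL)
loose = [r[0] for r in conn.execute(
    "SELECT id FROM agenda "
    "WHERE user_id = ? OR booked_user = ? AND checkout IS NULL ORDER BY id",
    (admin_id, admin_id),
)]

# With parentheses the checkout filter applies to both branches.
strict = [r[0] for r in conn.execute(
    "SELECT id FROM agenda "
    "WHERE (user_id = ? OR booked_user = ?) AND checkout IS NULL ORDER BY id",
    (admin_id, admin_id),
)]

print(loose)
print(strict)
```

The unparenthesized version lets the checked-out reservation 2 through, because the checkout test only binds to the part after OR.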
This is too long for a comment. So I am posting this as an answer and may adjust it, once you clarify doubts about the data model.
There is a parent table agenda and it has a child table agenda_users. So one agenda has several users. But the agenda table itself has two users, too. One is the person who made the reservation, but rather than using one column for that user, you are using sometimes one column and sometimes the other. You say that when an admin makes a reservation for another user, the admin gets stored in the column booked_user, although it's obviously not the booked user, but the booking user. I wonder whether you have understood the data model yourself, because the explanation sounds just wrong.
Then, an agenda should typically be identified by its id (hence the name), so the agenda_users should be linked via its agenda_id only. Are you sure that the user_id of the two tables must match, too? That would mean an agenda.id is unique only in combination with a user_id? It is possible, but doesn't seem likely.
Your query has some issues, too.
agenda.$user_id is probably supposed to mean $user_id only?
The parentheses are probably wrong, too, as AND has precedence over OR, so the checkout and date_end criteria will only work for the part after OR.
Then you are missing qualifiers. This doesn't make the query wrong, but makes it more difficult to read. What table do checkout and date_end belong to? I assume it's the agenda table and will write my query accordingly, because you mentioned the columns of the agenda_users table and these two columns were not among them.
You want to select data from agenda. So, do so; don't join another table. If you have criteria based on the other table, then use IN or EXISTS for the lookup. In your case, though, - but I can only guess here - it seems you don't need the agenda_users table at all.
SELECT *
FROM agenda
WHERE (user_id = $user_id OR booked_user = $user_id)
AND checkout IS NULL
AND NOW() < DATE_ADD(date_end, INTERVAL 6 HOUR)
ORDER BY type, date_start;
It is cleaner, in my opinion, to use UNION instead of very complex WHERE conditions.
Note that you know the user_id you are filtering for, so you don't need a join.
/* Reservations the admin made for himself */
SELECT
A.*
FROM
agenda A
WHERE
A.user_id = $user_id
UNION ALL
/* Reservations the admin made for other users */
SELECT
A.*
FROM
agenda A
WHERE
A.booked_user = $user_id
AND A.user_id <> $user_id
Don't forget to add the rest of the WHERE conditions to both subqueries of the union.
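As a rough check of the UNION ALL approach, here is a minimal sketch in Python with an in-memory SQLite database in place of MySQL (an assumption for illustration). The rows, the admin id 7 and the reduced column set are hypothetical, and only the checkout condition is carried into both branches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agenda (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "booked_user INTEGER, checkout TEXT)"
)
conn.executemany(
    "INSERT INTO agenda (id, user_id, booked_user, checkout) VALUES (?, ?, ?, ?)",
    [
        (1, 7, None, None),          # admin 7 for himself
        (2, 9, 7, None),             # admin 7 for user 9
        (3, 7, 7, None),             # admin 7 recorded in both columns
        (4, 9, 8, None),             # a different admin, not wanted
        (5, 7, None, "2020-01-05"),  # already checked out, not wanted
    ],
)
admin_id = 7

ids = [r[0] for r in conn.execute(
    """
    SELECT A.id FROM agenda A
    WHERE A.user_id = ? AND A.checkout IS NULL
    UNION ALL
    SELECT A.id FROM agenda A
    WHERE A.booked_user = ? AND A.user_id <> ? AND A.checkout IS NULL
    ORDER BY 1
    """,
    (admin_id, admin_id, admin_id),
)]
print(ids)
```

Row 3 shows why the second branch excludes the admin's own user_id: without that condition, UNION ALL would return the row twice.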

Comparing each column in a row to every row in the database (SQL)

I am building a bot that matches users based on a score. The score is calculated from data in a database when the user requests it.
I have only 1 table in that database and a few columns (user,age,genre,language,format,...etc).
What I want is this: once the user clicks the "find match" button in the chatbot, that user's data, which is already in the database, is compared column by column with every other user's row in the same table.
For example, the user's genre preference is compared with the genre preference of every other user, and each match adds 1 point. Then language is compared, again with 1 point per match, and so on through every column of every row. In the end, the users with the most matching points are recommended to this user.
What's the best approach to do that?
I am using nodejs and mysql database.
Thank you.
I see this as a self join and conditional expressions:
select t.*,
(t1.genre = t.genre) + (t1.language = t.language) + (t1.format = t.format) as score
from mytable t
inner join mytable t1 on t1.user <> t.user
where t1.user = ?
order by score desc
The question mark represents the id of the currently logged-in user, for whom you want to find matching users. The query returns all other users and counts how many values they have in common across the table columns: each matching value increases the score by 1. Results are sorted by descending score.
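Here is a minimal sketch of that scoring query run from Python against an in-memory SQLite database (a stand-in for MySQL, which is an assumption; both evaluate a comparison such as t1.genre = t.genre to 1 or 0). The user names and preference values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE mytable ("user" TEXT PRIMARY KEY, genre TEXT, '
    "language TEXT, format TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("alice", "scifi", "en", "ebook"),  # the user asking for a match
        ("bob", "scifi", "en", "ebook"),    # 3 values in common with alice
        ("carol", "scifi", "fr", "print"),  # 1 value in common
        ("dave", "crime", "en", "ebook"),   # 2 values in common
    ],
)

# t1 is the current user, t ranges over every other user.
rows = conn.execute(
    """
    SELECT t."user",
           (t1.genre = t.genre)
         + (t1.language = t.language)
         + (t1.format = t.format) AS score
    FROM mytable t
    INNER JOIN mytable t1 ON t1."user" <> t."user"
    WHERE t1."user" = ?
    ORDER BY score DESC
    """,
    ("alice",),
).fetchall()

for name, score in rows:
    print(name, score)
```

If any of the compared columns can be NULL, the comparison yields NULL rather than 0, so the score for that row would also be NULL; the sample data avoids that case.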

Getting rows in Microsoft Access to refer to other rows

I have a Microsoft Access table of data with 3 fields: "part_number", "date_of_price_change" and "new_price", but I need to convert the "new_price" field to show the "price_change", rather than the full "new_price" of each part on each date.
This will obviously involve some process that looks at each unique part number on each date and looks up the price of the record with the same part number with the next earliest date and deduct the prices to get the price change.
Problem is, I have no idea how to do this in Access and there are too many records for Excel. Can anyone assist with how to do this in Access? (Note that date changes can happen any time and are not periodic).
Many thanks in advance.
Ana
Add the new column price_change with the Currency data type, then run a query like the one below. Back up the table first, for example with an append query into a new table, just in case. Since it is a new column, it may not matter.
UPDATE MyTable AS T1
SET T1.price_change = T1.new_price - Nz((SELECT TOP 1 T2.new_price FROM MyTable AS T2 WHERE T2.part_number = T1.part_number AND T2.date_of_price_change < T1.date_of_price_change ORDER BY T2.date_of_price_change DESC), 0)
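The "previous price for the same part" lookup can be tried outside Access as well. Below is a minimal sketch in Python with an in-memory SQLite database (an assumption; it is not Access, so COALESCE and LIMIT 1 replace Nz and TOP 1). The part numbers, dates and prices are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (part_number TEXT, date_of_price_change TEXT, "
    "new_price REAL, price_change REAL)"
)
conn.executemany(
    "INSERT INTO MyTable (part_number, date_of_price_change, new_price) "
    "VALUES (?, ?, ?)",
    [
        ("P1", "2020-01-01", 10.0),
        ("P1", "2020-03-15", 12.5),
        ("P1", "2020-07-02", 11.0),
        ("P2", "2020-02-10", 4.0),
    ],
)

# For each row, subtract the price of the latest earlier row of the same part.
# A part's first record has no earlier row, so 0 is subtracted.
conn.execute(
    """
    UPDATE MyTable
    SET price_change = new_price - COALESCE(
        (SELECT T2.new_price
         FROM MyTable AS T2
         WHERE T2.part_number = MyTable.part_number
           AND T2.date_of_price_change < MyTable.date_of_price_change
         ORDER BY T2.date_of_price_change DESC
         LIMIT 1),
        0)
    """
)

changes = conn.execute(
    "SELECT part_number, date_of_price_change, price_change "
    "FROM MyTable ORDER BY part_number, date_of_price_change"
).fetchall()
for row in changes:
    print(row)
```

The date comparison inside the subquery is what restricts the lookup to earlier records; without it, every row would be compared against the part's most recent price.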

How can I delete records in a table so there is one reading a minute?

I have a table in my MySQL database that stores GPS readings. It stores readings for different users taken at different times.
It should contain one reading per minute per user. However, due to a bug in the app that collects the reading, sometimes it has one reading per second per user, which is way too much data for our requirements.
Is there a way I can delete rows from the table using a query such that each user does not have more than one reading per minute? I want to avoid having to write a program to do this if possible!
Thanks!
First note that deleting a lot of records from a table can be slow. It is often better to create a new table that holds only the rows you want to keep:
create table new_gps as
select gps.*
from gps join
(select userid, min(dt) as mindt
from gps
-- date(dt) keeps the same clock minute on different days in separate groups
group by userid, date(dt), floor(time_to_sec(dt) / 60)
) gpsmin
on gps.userid = gpsmin.userid and gps.dt = gpsmin.mindt;
You can use the same idea for the delete:
delete gps
from gps left join
(select userid, min(dt) as mindt
from gps
group by userid, date(dt), floor(time_to_sec(dt) / 60)
) gpsmin
on gps.userid = gpsmin.userid and gps.dt = gpsmin.mindt
where gpsmin.userid is null;
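To check the "keep the earliest reading per user per minute" logic on sample data, here is a minimal sketch in Python with an in-memory SQLite database standing in for MySQL (an assumption). SQLite lacks multi-table DELETE, so a correlated subquery is used, and the minute bucket is built with strftime over the date and the minute. The lat and lon columns and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gps (userid INTEGER, dt TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO gps VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-01 10:00:05", 0.0, 0.0),
        (1, "2020-01-01 10:00:06", 0.0, 0.0),  # same minute, dropped
        (1, "2020-01-01 10:00:59", 0.0, 0.0),  # same minute, dropped
        (1, "2020-01-01 10:01:00", 0.0, 0.0),
        (1, "2020-01-02 10:00:30", 0.0, 0.0),  # same clock minute, next day
        (2, "2020-01-01 10:00:07", 0.0, 0.0),
    ],
)

# Delete every reading that is not the earliest one in its user/minute bucket.
conn.execute(
    """
    DELETE FROM gps
    WHERE dt <> (
        SELECT MIN(g.dt) FROM gps g
        WHERE g.userid = gps.userid
          AND strftime('%Y-%m-%d %H:%M', g.dt) = strftime('%Y-%m-%d %H:%M', gps.dt)
    )
    """
)

kept = conn.execute("SELECT userid, dt FROM gps ORDER BY userid, dt").fetchall()
for row in kept:
    print(row)
```

The reading from the next day survives because the bucket includes the date, not just the time of day.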
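An easier way is to add a unique key on the dt column with the IGNORE option, so that MySQL erases all duplicate rows and prevents new duplicates. Note that this only removes rows whose dt values are identical, and that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so it only works on older versions.
The only thing to do is:
ALTER IGNORE TABLE gps ADD UNIQUE KEY (dt);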

Setting columns in a table to match highest contextual value in another table using SQL

I’m trying to fix my forum’s botched database with the help of SQL queries in phpMyAdmin. The columns being used are as follows:
mybb_posts is a table that stores information for a single Post in each row, while mybb_users is a table that stores information for a single User in each row.
mybb_users.uid – The ID of a Forum User
mybb_users.lastpost – The Timestamp of the last Post a User made
mybb_posts.uid – Refers to which User made a Post
mybb_posts.dateline – The Timestamp that appears on a Post
I want to set each user’s lastpost equal to the maximum dateline among the posts whose uid matches the user’s uid. To express that as best as I can with my limited experience with SQL:
SET mybb_users.uid = MAX(mybb_posts.dateline WHERE mybb_posts.uid = mybb_users.uid)
I’ve given it a few tries, including that shameful display, but all resulted in errors.
I think this should do it:
UPDATE
mybb_users,
(SELECT uid, MAX(dateline) AS date FROM mybb_posts GROUP BY uid) AS lastposts
SET mybb_users.lastpost = lastposts.date
WHERE mybb_users.uid = lastposts.uid
So what goes on here? The subquery (third line) selects the maximum dateline for every user (thanks to the GROUP BY). The WHERE makes sure the temporary table from the subquery and mybb_users are matched on uid.
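For a quick dry run away from the live forum database, here is a minimal sketch in Python with an in-memory SQLite database standing in for MySQL (an assumption). SQLite does not accept the multi-table UPDATE form, so the same result is expressed with a correlated subquery; the pid column and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mybb_users (uid INTEGER PRIMARY KEY, lastpost INTEGER)")
conn.execute(
    "CREATE TABLE mybb_posts (pid INTEGER PRIMARY KEY, uid INTEGER, dateline INTEGER)"
)
conn.executemany(
    "INSERT INTO mybb_users VALUES (?, ?)",
    [(1, 0), (2, 0), (3, 555)],  # user 3 has no posts
)
conn.executemany(
    "INSERT INTO mybb_posts (uid, dateline) VALUES (?, ?)",
    [(1, 100), (1, 300), (1, 200), (2, 50)],
)

# Set lastpost to the newest dateline per user; users without posts are
# skipped, as they would be by the join in the MySQL version.
conn.execute(
    """
    UPDATE mybb_users
    SET lastpost = (
        SELECT MAX(dateline) FROM mybb_posts
        WHERE mybb_posts.uid = mybb_users.uid
    )
    WHERE EXISTS (
        SELECT 1 FROM mybb_posts WHERE mybb_posts.uid = mybb_users.uid
    )
    """
)

result = conn.execute("SELECT uid, lastpost FROM mybb_users ORDER BY uid").fetchall()
print(result)
```

User 3 keeps its old lastpost value because no post row matches it.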