Eliminating duplicates in Access Query results - ms-access

I have an access database with records exported from our incident management system. I’m trying to resolve a duplicate counting issue we have due to the way we count incidents. Counting is fine as long as we have a single Operator for each incident. I’m having an issue with the use case where the operation of a vehicle involves two Operators or the other use case where we have an incident where two Operators collide with one another. For any incident an Operator is given a charge, Avoidable or Unavoidable. For an incident for either of the duplicate cases, Operator A can be charged with an Avoidable and Operator B can be charged an Unavoidable. However at the macro level we view this as one incident even though we have two records in the database with charges for both Operators.
Sample data
Incident_Number EmpName IncType Charge_1
1A Joe Collision Avoidable
1B Tom Collision Avoidable
1B Sue Collision Unavoidable
1C Harry Collision Avoidable
1C John Collision Unavoidable
1C Kathi Collision Unavoidable
1D Larry Collision Unavoidable
Sample of how I would like the query results
Incident_Number EmpName IncType Charge_1
1A Joe Collision Avoidable
1B Tom Collision Avoidable
1C Harry Collision Avoidable
1D Larry Collision Unavoidable
In trying to get this to work I tried this test, but it didn't prevent the duplicates. Is our problem just the way we store our data? Should I be trying a DISTINCT on the Charge_1 column?
SELECT *
FROM tst2019 as c1
WHERE Incident_Number <>
(SELECT MAX(Incident_Number) FROM tst2019 as c2
WHERE c2.charge_1=c1.charge_1);

Gustav,
Here's how I used your solution:
SELECT
Incident_Number,
Date_of_Incident,
Mode,
Incident_Type,
charge_1,
First(employee_name) AS empname,
Division
FROM
tst2019
GROUP BY
Incident_Number,
Date_of_Incident,
Mode,
Incident_Type,
Charge_1,
Division;
I did have one anomaly though, for the Cable Car case where operation of the vehicle requires two employees. For the same Incident Number the query results had one row for Avoidable and one row for Unavoidable. Other than that, this works exactly the way I would like, so thank you for the help.
Worst case, I could always do a manual check for the Cable Car incidents. Question: would it make sense to run a SQL query specific to the "Avoidables", then do a second append query specific to the "Unavoidables" that omits any result where the incident number already exists in the initial "Avoidable" query results? If this is possible, could you provide some guidance on how this could be done? Thanks again for your assistance in solving this duplicates issue.
You must group by Incident_Number:
SELECT
Incident_Number,
First(EmpName) As Operator,
IncType,
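Min(Charge_1) As Status
FROM
tst2019
GROUP BY
Incident_Number,
IncType
Note that First just picks "an" operator, not necessarily the "first" one, as nothing in your table defines who or what is first. Min(Charge_1) returns "Avoidable" whenever any operator in the incident has that charge, because it sorts before "Unavoidable". Leave Charge_1 out of the GROUP BY; grouping by it returns one row per distinct charge within an incident.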
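To check the grouping idea against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module rather than Access. It assumes the charge column is Charge_1 as in the sample data. SQLite has no First(), so MIN(EmpName) stands in as "pick any one operator"; that substitution is my assumption, not part of the original answer.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("1A", "Joe", "Collision", "Avoidable"),
    ("1B", "Tom", "Collision", "Avoidable"),
    ("1B", "Sue", "Collision", "Unavoidable"),
    ("1C", "Harry", "Collision", "Avoidable"),
    ("1C", "John", "Collision", "Unavoidable"),
    ("1C", "Kathi", "Collision", "Unavoidable"),
    ("1D", "Larry", "Collision", "Unavoidable"),
]


def one_row_per_incident(rows):
    """Return one row per incident, preferring 'Avoidable' as the status."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tst2019 "
        "(Incident_Number TEXT, EmpName TEXT, IncType TEXT, Charge_1 TEXT)"
    )
    conn.executemany("INSERT INTO tst2019 VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT Incident_Number,
               MIN(EmpName)  AS Operator,  -- stand-in for Access First()
               IncType,
               MIN(Charge_1) AS Status     -- 'Avoidable' sorts before 'Unavoidable'
        FROM tst2019
        GROUP BY Incident_Number, IncType  -- Charge_1 is deliberately not grouped
        ORDER BY Incident_Number
        """
    ).fetchall()
    conn.close()
    return result


for row in one_row_per_incident(ROWS):
    print(row)
```

Note that the operator and the status are aggregated independently, so the name shown is not necessarily the operator who received the reported charge.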

Related

SQL - Add To Existing Average

I'm trying to build a reporting table to track server traffic and popularity overall. Each SID is a unique game server hosting a particular game, and each UCID is a unique player key connecting to that server.
Say I have a table like so:
SID UCID AvgTime NumConnects
-----------------------------------------
1 AIE9348ietjg 300.55 5
1 Po328gieijge 500.66 7
2 AIE9348ietjg 234.55 3
3 Po328gieijge 1049.88 18
We can see that there are 2 unique players, and 3 unique servers, with SID 1 having 2 players that have connected to it at some point in the past. The AvgTime is the average amount of time those players spent on that server (in seconds), and the NumConnects is the size of the average (ie. 300.55 is averaged out of 5 elements).
Now I run a job in the background where I process a raw connection table and pull out player connections like so:
SID UCID ConnectTime DisconnectTime
-----------------------------------------
1 AIE9348ietjg 90.35 458.32
2 Po328gieijge 30.12 87.15
2 AIE9348ietjg 173.12 345.35
This table has no ID or other fluff to help condense my example. There may be multiple connect/disconnect records for multiple players in this table. What I want to do is add to my existing AvgTime for each SID these new values.
There is a formula from here I am trying to use (taken from this math stackexchange: https://math.stackexchange.com/questions/1153794/adding-to-an-average-without-unknown-total-sum/1153800#1153800)
Average = (Average * Size + NewValue) / (Size + 1)
How can I write an update query to update each server's rows in the traffic table above, adding to the average using the above formula for each pair of records? I tried something like the following, but it didn't work (it returned NULL):
UPDATE server_traffic st
LEFT JOIN connect_log l
ON st.SID = l.SID AND st.UCID = l.UCID
SET AvgTime = (AvgTime * NumConnects + SUM(l.DisconnectTime - l.ConnectTime) / NumConnects + COUNT(l.UCID)
I would prefer an answer in MySql, but I'll accept MS SQL as well.
EDIT
I understand that statistics and calculations are generally not to be stored in tables and that you can run reports that would crunch the numbers for you. My requirement is that users can go to a website and view the popularity of various servers. This needs to be done in a way that
A: running a complex query per user doesn't crash or slow down the system
B: the page returns the data within a few seconds at most
See this example here: https://bf4stats.com/pc/shinku555555
This is a web page for battlefield 4 stats - notice that the load is almost near instant for this player, and I get back a load of statistics without waiting for some complex report query to return the data. I'm assuming they store these calculations in preprocessed tables where the webpage just needs to do a simple select to return back the values. That's the same approach I want to take with my Database and Web Application design.
Sorry if this is off topic to the original question - but hopefully this adds additional context that helps people understand my needs.
Aggregate functions like SUM and COUNT cannot be used directly at the row level of an UPDATE; they must be contained in an aggregate query. Consider joining to an aggregate subquery in the UPDATE...LEFT JOIN. Also, adjust the parentheses in SET to match the formula above.
Also note that, since you use LEFT JOIN, rows with no matching IDs will have NULL for the aggregate fields. NULL cannot be used in arithmetic operations, so the whole expression returns NULL. You can convert NULL to zero with IFNULL(), but the formula's division may still fail (for example, when the denominator is zero).
UPDATE server_traffic s
LEFT JOIN
(SELECT SID, UCID, COUNT(UCID) As GrpCount,
SUM(DisconnectTime - ConnectTime) AS SumTimeDiff
FROM connect_log
GROUP BY SID, UCID) l
ON s.SID = l.SID AND s.UCID = l.UCID
SET s.AvgTime = (s.AvgTime * s.NumConnects + l.SumTimeDiff) / (s.NumConnects + l.GrpCount)
Aside - reconsider saving calculations/statistics within tables as they can always be run by queries even by timestamps. Ideally, database tables should store raw values.
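For reference, the arithmetic behind the batched update can be checked outside the database. This is only a sketch in Python; merge_average is a hypothetical helper name, and the guard for an empty result is my own addition for the NULL/zero case mentioned above.

```python
def merge_average(avg, size, new_values):
    """Fold new_values into an existing average computed from `size` elements.

    Returns the updated (average, size) pair.
    """
    count = len(new_values)
    if size + count == 0:
        # Nothing to average yet; avoid dividing by zero.
        return avg, size
    new_avg = (avg * size + sum(new_values)) / (size + count)
    return new_avg, size + count


# SID 1 / UCID AIE9348ietjg from the question: AvgTime 300.55 over 5 connects,
# plus one new session lasting 458.32 - 90.35 seconds.
new_avg, new_size = merge_average(300.55, 5, [458.32 - 90.35])
print(round(new_avg, 2), new_size)  # 311.79 6
```

The denominator is the old size plus the number of new values, which is why the parentheses around the sum matter in the SET clause. The stored NumConnects would also need to grow by the same count for the next run to be correct.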

Mysql combinations for 2 data sets

I would like a table or query in mysql of all permutations of two separate datasets, but with rules.
I have a table of jobs, and a table of drivers.
I'd like to produce table or query of all combinations of jobs to drivers. Each job must have a driver, but each driver doesn't necessarily need a job. Like this:
In this example I have 4 jobs and 3 drivers.
Job1 | Job2 | Job3 | Job4
1 1 1 1
1 1 1 2
1 1 1 3
1 1 2 1
1 1 3 1
This, I can't do, so if someone could help me that would be awesome. I believe that the number of rows in this example would be 4 to the power 3 (jobs to the power of drivers) which is 64 rows.
But the second part of this is what I call the "rules". Each job will have defined drivers that can do the job.
For example Job 1 can only be done by driver 1 or 3.
Job 2 can only be done by driver 1.
etc
I was thinking that if I did a create table, then running delete queries, but I am really at a loss. I would like to just create the query using the rules to start with in an attempt to speed it up.
This will eventually help me to make a plan for each job by showing all the ways that these jobs can be assigned.
Sorry for being vague but I'm hoping the community can help me out here.
Edit:
I think my maths may have been wrong. According to this: combination calculator where I input 3 to choose from (drivers) and 4 numbers chosen (jobs) order is important and repetition is allowed (not sure what that is) then it produces 81.
Although the question is quite unclear and smells like homework, here are my two cents:
Set up one table with the jobs, jobs:
job_id|job_name
Next, set up a table with drivers:
driver_id|driver_name
Now we need two maps:
First the "rules", i.e. which drivers are capable of which jobs, job_capabilities:
driver_id|job_id
This will contain one row for each capability job => driver. The combination (job_id, driver_id) should be unique: a driver is recorded as capable of a particular job only once.
The second map contains the assignments themselves, assigned_jobs:
driver_id|job_id
This actually has the same layout, but for a given time period (which is missing here) one driver can only work on one job, so driver_id and a date-time should be unique. This is skipped here for clarity.
Now we can construct an SQL like
SELECT job_capabilities.driver_id, jobs.job_id
FROM jobs, job_capabilities
WHERE jobs.job_id = job_capabilities.job_id
AND jobs.job_id = 42;
We could just use that to insert into assigned_jobs with
INSERT INTO assigned_jobs .... (Select from above) ...
probably enhanced by an ON DUPLICATE KEY UPDATE ... clause.
To now get the assigned jobs, we can alter that statement a bit:
SELECT assigned_jobs.driver_id, drivers.driver_name, jobs.job_id, jobs.job_name
FROM drivers, jobs, assigned_jobs
WHERE jobs.job_id = assigned_jobs.job_id
AND drivers.driver_id = assigned_jobs.driver_id
AND jobs.job_id = 42;
This is not tested and probably not valid SQL, but a first approach I would use.
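For the enumeration the question asks about, the row counts can be sanity-checked with a small Python sketch before writing any SQL. The rules for Job1 and Job2 come from the question; the rules for Job3 and Job4 are hypothetical (the question only says "etc"), and all_assignments is an illustrative helper name.

```python
from itertools import product

JOBS = ["Job1", "Job2", "Job3", "Job4"]
DRIVERS = [1, 2, 3]

# Which drivers may take each job.
# Job1 and Job2 are from the question; Job3 and Job4 are assumed unrestricted.
CAPABLE = {
    "Job1": [1, 3],
    "Job2": [1],
    "Job3": [1, 2, 3],
    "Job4": [1, 2, 3],
}


def all_assignments(jobs, capable):
    """Return every way to give each job exactly one capable driver."""
    return list(product(*(capable[job] for job in jobs)))


unrestricted = all_assignments(JOBS, {job: DRIVERS for job in JOBS})
restricted = all_assignments(JOBS, CAPABLE)

print(len(unrestricted))  # 81, i.e. 3 drivers to the power of 4 jobs
print(len(restricted))    # 18, i.e. 2 * 1 * 3 * 3
print(restricted[0])      # (1, 1, 1, 1)
```

Each tuple is one row of the Job1 | Job2 | Job3 | Job4 table from the question. Applying the rules while generating, rather than deleting rows afterwards, keeps the result small from the start.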

RowNumber for group in SSRS 2005

I have a table in a SSRS report that is displaying only a group, not the table details. I want to find out the row number for the items that are being displayed so that I can use color banding. I tried using "Rowcount(Nothing)", but instead I get the row number of the detail table.
My underlying data is something like
ROwId Team Fan
1 Yankees John
2 Yankees Russ
3 Red Socks Mark
4 Red Socks Mary
...
8 Orioles Elliot
...
29 Dodgers Jim
...
43 Giants Harry
My table showing only the groups looks like this:
ROwId Team
2 Yankees
3 Red Socks
8 Orioles
29 Dodgers
43 Giants
I want it to look like
ROwId Team
1 Yankees
2 Red Socks
3 Orioles
4 Dodgers
5 Giants
You can do this with a RunningValue expression, something like:
=RunningValue(Fields!Team.Value, CountDistinct, "DataSet1")
DataSet1 being the name of the underlying dataset.
Creating a simple report over the sample data and comparing the RowNumber and RunningValue approaches shows that RunningValue gives the required results.
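To see why a running distinct count numbers the groups rather than the detail rows, here is a rough Python sketch of the same idea. It only models the counting logic, not SSRS itself, and running_count_distinct is a hypothetical helper name.

```python
def running_count_distinct(values):
    """For each row, return how many distinct values have been seen so far."""
    seen = set()
    result = []
    for value in values:
        seen.add(value)
        result.append(len(seen))
    return result


# Detail rows in dataset order, as in the question's sample data.
teams = ["Yankees", "Yankees", "Red Socks", "Red Socks", "Orioles"]
print(running_count_distinct(teams))  # [1, 1, 2, 2, 3]
```

Every detail row of a team gets the same number, so the group row shows 1, 2, 3, ... and the value modulo 2 can drive the colour banding.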
You can easily achieve this with a little bit of vbcode. Go to Report - Properties - code and type something like:
Dim rownumber As Integer = 0
Function writeRow() As Integer
    rownumber = rownumber + 1
    Return rownumber
End Function
Then on your cell, call this function by using =Code.writeRow()
As soon as you start using groups inside the tables, the RowNumber and RunningValue functions start showing some weird behaviours, so it's easier to just write a bit of code to do what you want.
I am not convinced the suggestions above are a one-size-fits-all solution. In my scenario, I have a grouping that has multiple columns. I could not use the accepted RunningValue solution because I don't have a single column to use in the function, unless I combine them all (say, in a computed column) to make a single unique column.
I could not use the VB code function as is for the same reason: I had to use the same value across multiple columns, and across multiple properties for that matter. One workaround would be, knowing the number of uses (say N columns * M properties), to update the row number only on every N*M-th call. However, I could not find a function that counts columns, so if I added a column I would also need to increase my N constant. I also did not want to add a new column to my grouping, as also suggested, because I could not figure out how to hide it. Finally, I could not write VB code where function A returns nothing but updates the value (i.e. is called only once per group row) and another function, GetRowNumber, simply returns the rownumber variable: the colouring was done before the call, so I always had one column out of sync with the rest.
The only two other solutions I could think of are to put the combined column mentioned earlier in the query itself, or to use DENSE_RANK and sort on all group columns, i.e.
DENSE_RANK() OVER (ORDER BY GroupCol1, GroupCol2, ...) AS RowNumber
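As an illustration of what DENSE_RANK over several group columns produces, here is a small Python sketch. The column names League and Team and the rows are made up for the example; dense_rank is a hypothetical helper, not an SSRS or SQL Server API.

```python
def dense_rank(rows, key_columns):
    """Give every row the 1-based rank of its group key, with no gaps."""
    keys = sorted({tuple(row[col] for col in key_columns) for row in rows})
    rank_of = {key: index + 1 for index, key in enumerate(keys)}
    return [rank_of[tuple(row[col] for col in key_columns)] for row in rows]


# Hypothetical detail rows grouped on two columns.
rows = [
    {"League": "AL", "Team": "Yankees", "Fan": "John"},
    {"League": "AL", "Team": "Yankees", "Fan": "Russ"},
    {"League": "AL", "Team": "Orioles", "Fan": "Elliot"},
    {"League": "NL", "Team": "Dodgers", "Fan": "Jim"},
]
print(dense_rank(rows, ["League", "Team"]))  # [2, 2, 1, 3]
```

All detail rows in a group share one number regardless of how many columns define the group, which is what makes the value usable for banding across every column of the group row.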

mysql combining records from one table

I have a single table that uses test# as the primary key. Here is what that table looks like:
Test# Name VerbalScore readingScore Notes
1 Bobby 92 Good job
2 Bobby 40 You Suck Bobby
The problem is I want to view and be able to see when there are multiple verbal scores for the same Name (so be able to see if the person took the same test more than once).
I want to have some kind of select statement to get this result from the above table:
1 Bobby 92 40 Good job, You Suck Bobby
Is that possible?
I am not totally sure I understand what you mean by "see when there are multiple verbal scores" but with mysql 5+, try
SELECT
Name,
GROUP_CONCAT(VerbalScore),
GROUP_CONCAT(readingScore),
GROUP_CONCAT(Notes)
FROM
myTable
GROUP BY
Name;
GROUP_CONCAT is a mysql specific grouping function.
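A quick way to try the idea without a MySQL server is SQLite, which also has a group_concat aggregate. The sketch below uses Python's sqlite3 module. It assumes test 1 holds only a verbal score and test 2 only a reading score, since the flattened table in the question does not make that clear, and TestNo is a stand-in name for the Test# column.

```python
import sqlite3

# Assumed reading of the question's table: one score per test row.
SAMPLE = [
    (1, "Bobby", 92, None, "Good job"),
    (2, "Bobby", None, 40, "You Suck Bobby"),
]


def combine_by_name(rows):
    """Collapse all tests of one person into a single row per Name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE myTable (TestNo INTEGER PRIMARY KEY, Name TEXT, "
        "VerbalScore INTEGER, readingScore INTEGER, Notes TEXT)"
    )
    conn.executemany("INSERT INTO myTable VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT Name,
               GROUP_CONCAT(VerbalScore),   -- NULL scores are skipped
               GROUP_CONCAT(readingScore),
               GROUP_CONCAT(Notes)
        FROM myTable
        GROUP BY Name
        """
    ).fetchall()
    conn.close()
    return result


print(combine_by_name(SAMPLE))
```

The order of the concatenated notes is not guaranteed here; MySQL's GROUP_CONCAT accepts an ORDER BY inside the call if a fixed order matters.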

Count a specific value from multiple columns and group by values in another column... in mysql

Hey. I have 160 columns that are filled with data when a user fills a report form out and submit it. A few of these sets of columns contain similar data, but there needs to be multiple instance of this data per record set as it may be different per instance in the report.
For example, an employee opens a case by a certain type at one point in the day, then at another point in the day they open another case of a different type. I want to create totals per user based on the values in these columns. There is one column set that I want to target right now, case type. I would like to be able to see all instances of the value "TSTO" in columns CT1, CT2, CT3... through CT20. Then have that sorted by the employee ID number, which is just one column in the table.
Any ideas? I am struggling with this one.
So far I have SELECT CT1, CT2, CT3, CT4, CT5, CT6, CT7, CT8, CT9, CT10, CT11, CT12, CT13, CT14, CT15, CT16, CT17, CT18, CT19, CT20 FROM REPORTS GROUP BY OFFICER
This will display the values of all the case type entries in a record set but I need to count them, I tried to use,
SELECT CT1, CT2, CT3, CT4, CT5, CT6, CT7, CT8, CT9, CT10, CT11, CT12, CT13, CT14, CT15, CT16, CT17, CT18, CT19, CT20 FROM REPORTS COUNT(TSTO) GROUP BY OFFICER
but it just spits an error. I am fairly new to mysql databasing and php, I feel I have a good grasp but query'ing the database and the syntax involved is a tad bit confused and/or overwhelming right now. Just gotta learn the language. I will keep looking and I have found some similar things on here but I don't understand what I am looking at (completely) and I would like to shy away from using code that "works" but I don't understand fully.
Thank you very much :)
Edit -
So this database is an activity report server for the days work for the employees. The person will often open cases during the day. These cases vary in type, and their different types are designated by a four letter convention. So your different case types could be TSTO, DOME, ASBA, etc etc. So the user will fill out their form throughout the day then submit it down to the database. That's all fine :) Now I am trying to build a page which will query the database by user request for statistics of a user's activities. So right now I am trying to generate statistics. Specifically, I want to be able to generate the statistic of, and in human terms, "HOW MANY OCCURENCES OF "USER INPUTTED CASE TYPE" ARE THERE FOR EMPLOYEEIDXXX"
So when a user submits a form they will type in this four letter case type up to 20 times in one form, there is 20 fields for this case type entry, thus there is 20 columns. So these 20 columns for case type will be in one record set, one record set is generated per report. Another column that is generated is the employeeid column, which basically identifies who generated the record set through their form.
So I would like to be able to query all 20 columns of case type, across all record sets, for a defined type of case (TSTO, DOME, ASBA, etc etc) and then group that to corresponding user(s).
So the output would look something like,
316 TSTO's for employeeid108
I hope this helps to clear it up a bit. Again I am fairly fresh to all of this so I am not the best with the vernacular and best practices etc etc...
Thanks so much :)
Edit 2 -
So to further elaborate on what I have going on, I have an HTML form that has 164 fields. Each of these fields ultimately puts a value into a column in a single record set in my DB, each submission. I couldn't post images or more than two URLs so I will try to explain it the best I can without screenshots.
So what happens is this information gets in the DB. Then there is the query'ing. I have a search page which uses an HTML form to select the type of information to be searched for. It then displays a synopsis of each report that matches the query. The user then enters the REPORT ID # for the report they want to view in full into another small form (an input field with a submit button) which brings them to a page with the full report displayed when they click submit.
So right now I am trying to do totals and realizing my DB will be needing some work and tweaking to make it easier to create querys for it for different information needed. I've gleaned some good information so far and will continue to try and provide concise information about my setup as best I can.
Thanks.
Edit 3 -
Maybe you can go to my photobucket and check them out, should let me do one link, there is five screenshots, you can kind of see better what I have happening there.
http://s1082.photobucket.com/albums/j376/hughessa
:)
The query you are looking for would be very long and complicated for your current db schema.
Every table like (some_id, column1, column2, column3, column4... ) where columns store the same type of data can be also represented by a table (some_id, column_number, column_value ) where instead of 1 row with values for 20 columns you have 20 rows.
So your table should rather look like:
officer ct_number ct_value
1 CT1 TSTO
1 CT2 DOME
1 CT3 TSTO
1 CT4 ASBA
(...)
2 CT1 DOME
2 CT2 TSTO
For a table like this, if you wanted to find how many occurrences of each ct_value there are for officer 1, you would use a simple query:
SELECT officer, ct_value, count(ct_value) AS ct_count
FROM reports WHERE officer=1 GROUP BY ct_value
giving results
officer ct_value ct_count
1 TSTO 2
1 DOME 1
1 ASBA 1
If you wanted to find out how many TSTO's are there for different officers you would use:
SELECT officer, ct_value, count( officer ) as ct_count FROM reports
WHERE ct_value='TSTO' GROUP BY officer
giving results
officer ct_value ct_count
1 TSTO 2
2 TSTO 1
Also any type of query for your old schema can be easily converted to new schema.
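As a rough sketch of that conversion, the following Python turns wide rows (one column per case-type slot) into the long (officer, ct_number, ct_value) form. The dict-based row layout and the wide_to_long name are assumptions for illustration; only four of the twenty CT columns are shown.

```python
def wide_to_long(wide_rows, ct_columns):
    """Turn one wide row per report into one long row per filled-in case type."""
    long_rows = []
    for row in wide_rows:
        for column in ct_columns:
            value = row.get(column)
            if value:  # skip slots the user left empty
                long_rows.append((row["officer"], column, value))
    return long_rows


# Hypothetical wide rows matching the example table above.
wide = [
    {"officer": 1, "CT1": "TSTO", "CT2": "DOME", "CT3": "TSTO", "CT4": "ASBA"},
    {"officer": 2, "CT1": "DOME", "CT2": "TSTO", "CT3": None, "CT4": ""},
]
long_rows = wide_to_long(wide, ["CT1", "CT2", "CT3", "CT4"])
print(len(long_rows))  # 6
print(long_rows[0])    # (1, 'CT1', 'TSTO')
```

Once the data is in this shape, counting a case type per officer is a plain filter and group, with no need to name all twenty columns.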
However, if you need to store additional information about every particular report, I suggest having two tables:
Submissions
submission_id report_id ct_number ct_value
primary key
auto-increment
------------------------------------------------
1 1 CT1 TSTO
2 1 CT2 DOME
3 1 CT3 TSTO
4 1 CT4 ASBA
5 2 CT1 DOME
6 2 CT2 TSTO
with report_id pointing to a record in another table with as many columns as you need for additional data:
Reports
report_id officer date some_other_data
primary key
auto-increment
--------------------------------------------------------------------
1 1 2011-04-29 11:28:15 Everything went ok
2 2 2011-04-29 14:01:00 There were troubles
Example:
How many TSTO's are there for different officers:
SELECT r.officer, s.ct_value, count( officer ) as ct_count
FROM submissions s JOIN reports r ON s.report_id = r.report_id
WHERE s.ct_value='TSTO'
GROUP BY r.officer
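The two-table layout and the last query can be tried end to end with Python's built-in sqlite3 module. This is only a sketch under the assumption that SQLite is close enough to MySQL for this query; the table contents are the sample rows above, and tsto_counts is a hypothetical helper name.

```python
import sqlite3


def tsto_counts(case_type="TSTO"):
    """Count occurrences of one case type per officer in the two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE reports (
            report_id INTEGER PRIMARY KEY,
            officer INTEGER,
            date TEXT,
            some_other_data TEXT
        );
        CREATE TABLE submissions (
            submission_id INTEGER PRIMARY KEY,
            report_id INTEGER REFERENCES reports(report_id),
            ct_number TEXT,
            ct_value TEXT
        );
        INSERT INTO reports VALUES
            (1, 1, '2011-04-29 11:28:15', 'Everything went ok'),
            (2, 2, '2011-04-29 14:01:00', 'There were troubles');
        INSERT INTO submissions (report_id, ct_number, ct_value) VALUES
            (1, 'CT1', 'TSTO'), (1, 'CT2', 'DOME'), (1, 'CT3', 'TSTO'),
            (1, 'CT4', 'ASBA'), (2, 'CT1', 'DOME'), (2, 'CT2', 'TSTO');
        """
    )
    rows = conn.execute(
        """
        SELECT r.officer, s.ct_value, COUNT(*) AS ct_count
        FROM submissions s JOIN reports r ON s.report_id = r.report_id
        WHERE s.ct_value = ?
        GROUP BY r.officer, s.ct_value
        ORDER BY r.officer
        """,
        (case_type,),
    ).fetchall()
    conn.close()
    return rows


print(tsto_counts())  # [(1, 'TSTO', 2), (2, 'TSTO', 1)]
```

Passing the case type as a bound parameter also covers the "user inputted case type" requirement without building the SQL string from form input.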