SPSS: How do I generate ID numbers from a client ID variable that contains duplicate IDs, in the order of the first date of each ID?

Previously, I asked how to generate ID numbers from a client ID variable that contains duplicate IDs. I will use the same example data in this question, but I would like to know how to generate ID numbers in the order of the first date of each ID. My client ID variable is a string and has to remain a string.
My data looks like:
ClientID TimeStamp
15137.45692 15/03/2021
10489.15789 03/02/2021
14143.96745 01/01/2021
15137.45692 15/01/2021
15137.45692 27/02/2021
14143.96745 08/03/2021
I would like it to look like:
ID ClientID TimeStamp
1 14143.96745 01/01/2021
1 14143.96745 08/03/2021
2 15137.45692 15/01/2021
2 15137.45692 27/02/2021
2 15137.45692 15/03/2021
3 10489.15789 03/02/2021
The previous code I tried was this:
sort cases by ClientID.
compute ID=1.
if $casenum>1 ID=lag(ID)+(ClientID<>lag(ClientID)).
exe.
However, whilst it gave me ID numbers for each ID, those ID numbers weren't ordered by TimeStamp.

To create the new ID, the data needs to be sorted by ClientID. But then the new IDs follow the sort order of ClientID, whereas the order you want is by each client's first date of appearance. So first we need to calculate the first date for every ClientID, then we can use that as the sort key before creating the new ID.
Note: you need to make sure TimeStamp is defined as a date variable.
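* Add the earliest TimeStamp of each ClientID to every one of its rows.
aggregate outfile=* mode=addvariables /break=ClientID /firstDate=min(TimeStamp).
* Sort by first date, then ClientID to keep each client together, then TimeStamp within the client.
sort cases by firstDate ClientID TimeStamp.
* Start at 1 and add 1 whenever ClientID differs from the previous row.
compute ID=1.
if $casenum>1 ID=lag(ID)+(ClientID<>lag(ClientID)).
exe.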
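For readers who want to check the logic outside SPSS, here is a minimal Python sketch of the same three steps (first date per client, sort, increment on change). It is an illustration only, not part of the SPSS answer: the function name assign_ids is made up, the rows are the sample data from the question, and the extra sort on each row's own date is an assumption added to reproduce the within-client order shown in the desired output.

```python
from datetime import datetime

# Sample rows from the question: (ClientID kept as a string, TimeStamp as dd/mm/yyyy).
rows = [
    ("15137.45692", "15/03/2021"),
    ("10489.15789", "03/02/2021"),
    ("14143.96745", "01/01/2021"),
    ("15137.45692", "15/01/2021"),
    ("15137.45692", "27/02/2021"),
    ("14143.96745", "08/03/2021"),
]


def assign_ids(rows):
    """Number clients by their earliest date; return (ID, ClientID, TimeStamp) rows."""
    parsed = [(cid, datetime.strptime(ts, "%d/%m/%Y")) for cid, ts in rows]

    # Step 1: earliest date per client (the AGGREGATE ... min(TimeStamp) step).
    first_date = {}
    for cid, ts in parsed:
        if cid not in first_date or ts < first_date[cid]:
            first_date[cid] = ts

    # Step 2: sort by first date, then ClientID, then the row's own date.
    parsed.sort(key=lambda r: (first_date[r[0]], r[0], r[1]))

    # Step 3: start at 1 and add 1 whenever ClientID changes (the lag() step).
    result, current_id, previous = [], 0, None
    for cid, ts in parsed:
        if cid != previous:
            current_id += 1
            previous = cid
        result.append((current_id, cid, ts.strftime("%d/%m/%Y")))
    return result


numbered = assign_ids(rows)
for row in numbered:
    print(*row)
```

On the sample data this gives ID 1 to 14143.96745, ID 2 to 15137.45692 and ID 3 to 10489.15789, matching the table in the question.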

Related

SPSS: How do I generate ID numbers from client ID variable that contains duplicate IDs

I have a dataset which contains thousands of rows, with each person assigned a ClientID. I would like to use the ClientID variable to generate a new ID variable which starts at 1. Some ClientIDs are duplicated, so I would like to make sure that duplicate ClientIDs are given the same ID number. Client IDs are strings and my data has to be sorted by TimeStamp.
My data looks like:
ClientID TimeStamp
15137.45692 15/03/2021
10489.15789 03/02/2021
14143.96745 01/01/2021
15137.45692 15/01/2021
15137.45692 27/02/2021
14143.96745 08/03/2021
I would like it to look like:
ID ClientID TimeStamp
1 14143.96745 01/01/2021
2 15137.45692 15/01/2021
3 10489.15789 03/02/2021
2 15137.45692 27/02/2021
1 14143.96745 08/03/2021
2 15137.45692 15/03/2021
How do I do this?
I would do it in Excel, but I have over 250k rows of data and Excel keeps crashing.
Thanks
The following syntax creates ID=1 and then adds 1 only in case of a new ClientID:
sort cases by ClientID.
compute ID=1.
if $casenum>1 ID=lag(ID)+(ClientID<>lag(ClientID)).
exe.
EDIT:
Here's another nice way to do it, using the RANK command:
RANK VARIABLES=ClientID (A) /RANK /PRINT=NO /TIES=CONDENSE.

SQL combining or,and query

I have one table called student. I want to select the names of students who live in Chennai or Madurai and were born on December 8, 1996. The table columns are (name, city, DOB). The result should be sorted by name. I wrote the query below and got the error "Invalid relational operator".
SELECT name
FROM student
WHERE city='chennai' OR 'madurai' AND DOB='december 8 1996'
ORDER BY name;
You have to name the column in each test of the WHERE clause.
Also, if you are mixing AND and OR, you need parentheses to ensure they are applied in the order you intend.
Also, the date should be in yyyy-mm-dd format, assuming that you have defined DOB as a DATE type, which you should have if it holds a date.
SELECT name
FROM student
WHERE (city='chennai' OR city='madurai' ) AND DOB='1996-12-08'
ORDER BY name;
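To see why the parentheses matter, here is a small sketch that runs both forms of the WHERE clause against an in-memory SQLite database from Python. The four student rows are invented for illustration, and SQLite stores the dates as plain text here, so this shows only the AND/OR precedence, not the date handling of the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, city TEXT, DOB TEXT)")
# Hypothetical rows, invented for this example.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        ("Arun", "chennai", "1996-12-08"),
        ("Bala", "madurai", "1996-12-08"),
        ("Chitra", "chennai", "1995-01-01"),
        ("Deepa", "salem", "1996-12-08"),
    ],
)

# Without parentheses, AND binds tighter than OR, so this is read as
# city='chennai' OR (city='madurai' AND DOB='1996-12-08').
unparenthesized = [row[0] for row in conn.execute(
    "SELECT name FROM student "
    "WHERE city='chennai' OR city='madurai' AND DOB='1996-12-08' "
    "ORDER BY name"
)]

# With parentheses, the date test applies to both cities.
parenthesized = [row[0] for row in conn.execute(
    "SELECT name FROM student "
    "WHERE (city='chennai' OR city='madurai') AND DOB='1996-12-08' "
    "ORDER BY name"
)]

print(unparenthesized)  # includes Chitra, who has the wrong birth date
print(parenthesized)    # only the students matching city and date
```

The unparenthesized query lets through every student from chennai regardless of birth date, which is the mistake the parentheses prevent.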

insert ignore or replace ignore not working

I'm moving data from one SQL table to a second table using
insert ignore or replace into
It isn't working, I believe because I don't have a unique key. I also don't know where I would put the key.
I need the second table to display the last zone the number was seen on that date. I can add a time column if needed.
Example Data:
number zone date
1 zone3 01-02-03
1 zone1 01-02-03
1 zone3 01-02-03
2 zone1 01-02-03
3 zone2 01-02-03
If I put number as a unique key it doesn't get added when the date changes.
If I add date as a unique key only one row gets added on that date.
The query:
REPLACE INTO database.table2 (number,zone,date)
SELECT number,zone,date
FROM database.table1
GROUP BY number,date;
I hoped that grouping by number and date would prevent duplicate records, but it is still adding multiples.

Query: COUNT in Access To Only Count Unique Values

I have a table like so:
Customer Purchase Date Product
Frank 7/28/2015 Hammer
Bob 7/29/2015 Shovel
Bob 7/29/2015 Pickaxe
Bill 7/30/2015 Pliers
The Purchase Date field records a new entry for every purchase. So, if in one visit a customer purchases four items, my database creates four entries each with the same date.
I'm trying to write a query that displays the number of visits for each customer. Output like so:
Frank 1
Bob 1
Bill 1
But when I use the COUNT function on the date in my query, it returns:
Frank 1
Bob 2
Bill 1
I want my query to only count unique dates, but the COUNT function doesn't work. Everywhere I read, it also says that the SQL COUNT (Distinct) doesn't work in Access. Access help says that if I set the Query Properties to Unique Values "Yes", it should only return unique values, but it doesn't work. I tried Unique Record "Yes" also, but that didn't work either.
Please help! Thanks!
Try this:
SELECT cust, COUNT(cust) AS CustomerCount
FROM (SELECT DISTINCT Table1.Customer AS cust, Table1.PurchaseDate
FROM Table1)
GROUP BY cust
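The idea is to reduce the table to distinct (customer, date) pairs first and then count the rows per customer. As a rough check of that pattern, the sketch below runs the same query shape against an in-memory SQLite database from Python, using the four rows from the question. SQLite stands in for Access here, and the column name PurchaseDate without a space follows the answer rather than the question's table, so treat both as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Customer TEXT, PurchaseDate TEXT, Product TEXT)")
# The four rows from the question.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        ("Frank", "7/28/2015", "Hammer"),
        ("Bob", "7/29/2015", "Shovel"),
        ("Bob", "7/29/2015", "Pickaxe"),
        ("Bill", "7/30/2015", "Pliers"),
    ],
)

# Inner query: one row per distinct (customer, date) pair, i.e. one per visit.
# Outer query: count those rows per customer.
visits = conn.execute(
    "SELECT cust, COUNT(cust) AS CustomerCount "
    "FROM (SELECT DISTINCT Customer AS cust, PurchaseDate FROM Table1) "
    "GROUP BY cust "
    "ORDER BY cust"
).fetchall()

# For comparison: a plain COUNT counts every purchase row.
purchases = conn.execute(
    "SELECT Customer, COUNT(PurchaseDate) FROM Table1 "
    "GROUP BY Customer ORDER BY Customer"
).fetchall()

print(visits)
print(purchases)
```

Bob's two purchases on the same date collapse into a single visit in the first result, while the plain COUNT still reports 2 for him.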

MySql query to get two rows into one with latest data from one row

Wondering if anyone could please help me with the following query.
http://sqlfiddle.com/#!2/79a49/1
This is the scenario:
A vehicle can be checked out with a unique ra_no at any time. So vehicle_id = 70 can be checked out with ra_no = test, together with a branch_id, date_created and mileage.
It can then be checked back in with the same ra_no and its own branch_id, date_created and mileage.
This query needs to return one row per ra_no for a vehicle_id (e.g. vehicle_id 70). That row needs to display the current (latest) status and the latest mileage, along with the correct checked-out date and location and the correct checked-in date and location.
You would group by ra_no, but I can't seem to get the correct info out.
This needs to be converted to a view so that I can create a model for it in Yii.
Thanks!