MySQL query to find the count of each item in a column

I have a table with the following fields:
user_id, occurred_at, event_type, event_name, location, device, user_type
Question to solve:
Weekly Engagement: a measure of how active a user is, i.e. whether the user finds value in the product/service each week.
The task: calculate the weekly engagement per device.
I have found a solution for this: first find the distinct devices, then use a separate query that computes the weekly engagement per device by manually listing the device names in an IN clause.
Query 1:
SELECT DISTINCT device AS devices
FROM events1;
Output:
dell inspiron notebook, iphone 5, iphone 4s, windows surface, macbook air, iphone 5s, macbook pro, kindle fire, ipad mini, nexus 7, nexus 5, samsung galaxy s4, lenovo thinkpad, samsumg galaxy tablet, acer aspire notebook, asus chromebook, htc one, nokia lumia 635, samsung galaxy note, acer aspire desktop, mac mini, hp pavilion desktop, dell inspiron desktop, ipad air, amazon fire phone, nexus 10
Query 2:
SELECT week(occurred_at) AS weeknum,
       COUNT(DISTINCT user_id) AS weekly_users,
       COUNT(DISTINCT CASE WHEN device IN ('macbook pro', 'acer aspire notebook', 'acer aspire desktop', 'lenovo thinkpad', 'mac mini', 'dell inspiron desktop', 'dell inspiron notebook', 'windows surface', 'macbook air', 'asus chromebook', 'hp pavilion desktop') THEN user_id ELSE NULL END) AS computer,
       COUNT(DISTINCT CASE WHEN device IN ('iphone 5s', 'nokia lumia 635', 'amazon fire phone', 'iphone 4s', 'htc one', 'iphone 5', 'samsung galaxy s4') THEN user_id ELSE NULL END) AS phone,
       COUNT(DISTINCT CASE WHEN device IN ('kindle fire', 'samsung galaxy note', 'ipad mini', 'nexus 7', 'nexus 10', 'samsumg galaxy tablet', 'nexus 5', 'ipad air') THEN user_id ELSE NULL END) AS tablet
FROM events1
WHERE event_type = 'engagement'
  AND event_name = 'login'
GROUP BY 1
ORDER BY 1;
(final output screenshot omitted: a table of weeknum, weekly_users, computer, phone and tablet counts)
My doubt:
Is there a way to get this result with a single query that finds weekly users for each device ('macbook pro', 'acer aspire notebook', 'acer aspire desktop', 'lenovo thinkpad', etc.) instead of first finding the device names and manually entering them across two separate queries?
Here we have 26 device names, so we are able to enter them manually. But what if there are many more devices and entering them by hand becomes impractical? What would we do in those situations? Can anyone suggest a solution?
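One way to avoid maintaining the lists by hand (a sketch, not the only approach) is a lookup table that maps each device name to a category; adding a device then means adding a row, not editing the query. The sketch below uses SQLite through Python so it is self-contained; table and column names mirror the question, and SQLite's strftime('%W', ...) stands in for MySQL's WEEK().

```python
import sqlite3

# Assumed schema: events1 as in the question, plus a device_category lookup
# table. All names besides events1's columns are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events1 (user_id INT, occurred_at TEXT, event_type TEXT,
                      event_name TEXT, location TEXT, device TEXT, user_type TEXT);
CREATE TABLE device_category (device TEXT PRIMARY KEY, category TEXT);
""")
conn.executemany("INSERT INTO device_category VALUES (?, ?)", [
    ("macbook pro", "computer"), ("iphone 5", "phone"), ("ipad mini", "tablet"),
])
conn.executemany("INSERT INTO events1 VALUES (?, ?, 'engagement', 'login', '', ?, '')", [
    (1, "2014-05-05", "macbook pro"),
    (2, "2014-05-05", "iphone 5"),
    (3, "2014-05-06", "ipad mini"),
])
# The join replaces every hand-written IN list: each device's category comes
# from the lookup table, and GROUP BY covers whatever devices exist.
rows = conn.execute("""
SELECT strftime('%W', e.occurred_at) AS weeknum,
       dc.category,
       COUNT(DISTINCT e.user_id) AS weekly_users
FROM events1 e
JOIN device_category dc ON dc.device = e.device
WHERE e.event_type = 'engagement' AND e.event_name = 'login'
GROUP BY weeknum, dc.category
ORDER BY weeknum, dc.category
""").fetchall()
for r in rows:
    print(r)
```

In MySQL the same join works unchanged, and rows come back as (week, category, users) instead of one column per category. If you truly need one column per device or category (a dynamic pivot), MySQL has no built-in way; the usual workaround is to build the SQL text dynamically (e.g. with GROUP_CONCAT into a prepared statement) or pivot in application code.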


How to show the title for nsys profile?

I have noticed that when I use nsys on my machine
nsys profile --stats=true -o output-report ./input
It outputs the data like this:
NVIDIA Nsight Systems version 2022.4.2.50-32196742v0
[5/8] Executing 'cudaapisum' stats report
Time (%) Total Time (ns) Num Calls Avg (ns) Med (ns) Min (ns) Max (ns) StdDev (ns) Name
-------- --------------- --------- ------------ ------------ ---------- ----------- ------------ ----------------------
46.7 100,404,793 3 33,468,264.3 22,463.0 12,434 100,369,896 57,938,512.8 cudaMallocManaged
39.5 84,938,847 1 84,938,847.0 84,938,847.0 84,938,847 84,938,847 0.0 cudaDeviceSynchronize
13.8 29,677,781 3 9,892,593.7 9,610,457.0 9,514,092 10,553,232 574,154.9 cudaFree
0.0 82,478 1 82,478.0 82,478.0 82,478 82,478 0.0 cuLibraryLoadData
0.0 40,588 1 40,588.0 40,588.0 40,588 40,588 0.0 cudaLaunchKernel
0.0 892 1 892.0 892.0 892 892 0.0 cuModuleGetLoadingMode
The section is labeled "Executing 'cudaapisum' stats report" instead of a normal title like "CUDA API Statistics". So I'm wondering if there's a flag I can use to output the stats like the one below.
The output below isn't from my machine; it's from an AWS machine.
NVIDIA Nsight Systems version 2021.1.1.66-6c5c5cb
CUDA API Statistics:
Time(%) Total Time (ns) Num Calls Average Minimum Maximum Name
------- --------------- --------- ----------- --------- --------- ---------------------
61.5 250696605 3 83565535.0 36197 250541972 cudaMallocManaged
32.8 133916228 1 133916228.0 133916228 133916228 cudaDeviceSynchronize
5.7 23226526 3 7742175.3 6373371 9064987 cudaFree
0.0 56395 1 56395.0 56395 56395 cudaLaunchKernel
And the other thing I have to mention is that on my machine it automatically outputs the profile file to a .nsys-rep extension not the .qdrep extension. Are both of them the same or different?
I've been trying to find information in the nsys documentation, but I couldn't find any. I've tried searching in stackoverflow & nvidia's forum on Nsight but none came up so far. Maybe I've missed something. Any help will be appreciated.
Note: both outputs use the same command, just slightly different input files.
And the other thing I have to mention is that on my machine it automatically outputs the profile file to a .nsys-rep extension not the .qdrep extension. Are both of them the same or different?
.nsys-rep is the new extension name for .qdrep files; the format itself is the same. The change happened with version 2021.4.
Specifically, from the release notes of the aforementioned version:
Result file rename
In order to make the Nsight tools family more consistent, all versions of Nsight Systems starting with 2021.4 will use the “.nsys-rep” extension for generated report files by default.
Older versions of Nsight Systems used “.qdrep”.
Nsight Systems GUI 2021.4 and higher will continue to support opening older ".qdrep" reports.
Versions of Nsight Systems GUI older than 2021.4 will not be able to open “.nsys-rep” reports.
Please note that the versions of the tool on your local machine and the AWS machine are different.
So I'm wondering if there's a flag that I can use to output the stats like the one below
There isn't a flag to control the output you are mentioning. You could modify your workflow slightly: profile your application without the --stats CLI switch and collect the report file (.nsys-rep/.qdrep). Then you can use the nsys stats command and apply specific stats reports to your report file.
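A sketch of that workflow (report names vary between nsys versions, so check the output of nsys stats --help on your install before relying on these exact names):

```shell
# 1. Profile without --stats; this just writes the report file.
nsys profile -o output-report ./input        # produces output-report.nsys-rep

# 2. Post-process the report, picking only the stats reports you want.
nsys stats --report cudaapisum output-report.nsys-rep
```

This also lets you re-run different stats reports against the same profile without profiling the application again.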
If you have feature requests for the Nsight Systems tool, please let us know through the NVIDIA Developer Forum.

Google Fit estimated steps through REST API decrease in time with some users

We are using the Google Fit REST API in a process with thousands of users to get daily steps. For most users the process works fine, but we are seeing a specific behaviour with some of them: their steps increase during the day, but at some point they decrease significantly.
We are finding a few issues related to this mainly with Huawei Health apps (and some Xiaomi health apps).
We use this dataSourceId to get daily steps: derived:com.google.step_count.delta:com.google.android.gms:estimated_steps
An example of one of our requests to get data for 15th March (Spanish Times):
POST https://www.googleapis.com/fitness/v1/users/me/dataset:aggregate
Accept: application/json
Content-Type: application/json;encoding=utf-8
Authorization: Bearer XXXXXXX
{
  "aggregateBy": [{
    "dataTypeName": "com.google.step_count.delta",
    "dataSourceId": "derived:com.google.step_count.delta:com.google.android.gms:estimated_steps"
  }],
  "bucketByTime": { "durationMillis": 86400000 },
  "startTimeMillis": 1615244400000,
  "endTimeMillis": 1615330800000
}
For most users this works well (it returns the same data the user sees in the Google Fit app), but for some users, as described, the numbers increase during the day at first and decrease later. For those users, the data shown in the Google Fit app is significantly greater than what the REST API returns.
We have even traced this with a specific user during the day. Using buckets of 'durationMillis': 3600000, we have painted a histogram of hourly steps in one day (with a custom made process).
For the same day, in different moments of time (a couple of hours difference in this case), we get this for the EXACT SAME USER:
20210315-07 | ########################################################## | 1568
20210315-08 | ############################################################ | 1628
20210315-09 | ########################################################## | 1574
20210315-10 | ####################### | 636
20210315-11 | ################################################### | 1383
20210315-12 | ###################################################### | 1477
20210315-13 | ############################################### | 1284
20210315-14 | #################### | 552
vs. this, that was retrieved A COUPLE OF HOURS LATER:
20210315-08 | ################# | 430
20210315-09 | ######### | 229
20210315-10 | ################# | 410
20210315-11 | ###################################################### | 1337
20210315-12 | ############################################################ | 1477
20210315-13 | #################################################### | 1284
20210315-14 | ###################### | 552
("20210315-14" means 14.00 at 15th March of 2021)
This is the returning JSON in the first case:
[{"startTimeNanos":"1615763400000000000","endTimeNanos":"1615763460000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":6,"mapVal":[]}]},
{"startTimeNanos":"1615788060000000000","endTimeNanos":"1615791600000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":1568,"mapVal":[]}]},
{"startTimeNanos":"1615791600000000000","endTimeNanos":"1615795080000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":1628,"mapVal":[]}]},
{"startTimeNanos":"1615795200000000000","endTimeNanos":"1615798500000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":1574,"mapVal":[]}]},
{"startTimeNanos":"1615798860000000000","endTimeNanos":"1615802400000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":636,"mapVal":[]}]},
{"startTimeNanos":"1615802400000000000","endTimeNanos":"1615806000000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":1383,"mapVal":[]}]},
{"startTimeNanos":"1615806000000000000","endTimeNanos":"1615809480000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":1477,"mapVal":[]}]},
{"startTimeNanos":"1615809660000000000","endTimeNanos":"1615813200000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":1284,"mapVal":[]}]},
{"startTimeNanos":"1615813380000000000","endTimeNanos":"1615815420000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":552,"mapVal":[]}]}]
This is the returning JSON in the latter case:
[{"startTimeNanos":"1615788300000000000","endTimeNanos":"1615791600000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":517,"mapVal":[]}]},
{"startTimeNanos":"1615791600000000000","endTimeNanos":"1615794540000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":430,"mapVal":[]}]},
{"startTimeNanos":"1615796400000000000","endTimeNanos":"1615798200000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":229,"mapVal":[]}]},
{"startTimeNanos":"1615798980000000000","endTimeNanos":"1615802400000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":410,"mapVal":[]}]},
{"startTimeNanos":"1615802400000000000","endTimeNanos":"1615806000000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":1337,"mapVal":[]}]},
{"startTimeNanos":"1615806000000000000","endTimeNanos":"1615809480000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":1477,"mapVal":[]}]},
{"startTimeNanos":"1615809660000000000","endTimeNanos":"1615813200000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":1284,"mapVal":[]}]},
{"startTimeNanos":"1615813380000000000","endTimeNanos":"1615815420000000000","dataTypeName":"com.google.step_count.delta","originDataSourceId":"raw:com.google.step_count.delta:com.huawei.health:","value":[{"intVal":552,"mapVal":[]}]}]
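Reduced to (start, end, intVal) tuples (times truncated from nanoseconds to epoch seconds), the two snapshots above can be compared directly; this short sketch quantifies how much disappears between the two pulls:

```python
# Datapoints from the first and second pulls, as (start_s, end_s, steps).
first = [
    (1615763400, 1615763460, 6), (1615788060, 1615791600, 1568),
    (1615791600, 1615795080, 1628), (1615795200, 1615798500, 1574),
    (1615798860, 1615802400, 636), (1615802400, 1615806000, 1383),
    (1615806000, 1615809480, 1477), (1615809660, 1615813200, 1284),
    (1615813380, 1615815420, 552),
]
later = [
    (1615788300, 1615791600, 517), (1615791600, 1615794540, 430),
    (1615796400, 1615798200, 229), (1615798980, 1615802400, 410),
    (1615802400, 1615806000, 1337), (1615806000, 1615809480, 1477),
    (1615809660, 1615813200, 1284), (1615813380, 1615815420, 552),
]

def total(points):
    # Total steps reported by one snapshot.
    return sum(v for _, _, v in points)

def shared_windows(a, b):
    # Datapoints whose (start, end) window survives between the two pulls.
    return {(s, e) for s, e, _ in a} & {(s, e) for s, e, _ in b}

print(total(first), total(later))          # 10108 6236
print(len(shared_windows(first, later)))   # 4
```

So the later pull not only reports fewer total steps; most of the earlier datapoint windows have been replaced by differently-bounded ones, which is consistent with the source rewriting its history rather than appending.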
As you can see, all points come from originDataSourceId "raw:com.google.step_count.delta:com.huawei.health".
It looks like some Google Fit process is making adjustments, removing steps or datapoints, but we cannot find a way to detect what is removed or why, so we cannot explain to the user what is happening, or what he or we can do to make his app data match ours (or the other way around). His Google Fit app shows a number that differs from the one the REST API returns.
User has already disabled the "googlefit app tracking activities" option.
I would love to know, or try to get some hints to know:
What can I do to debug this further?
Any hint about why this is happening?
Is there any way, from a configuration point of view (for the user), to prevent this from happening?
Is there any way, from a development point of view, to prevent this from happening?
Thanks and regards.
UPDATE AFTER Andy Turner's question (thanks for the comment!).
We were able to "catch" this over several hours: at 18.58 (around 6K steps), 21.58 (around 25K steps), 22.58 (around 17K steps) and 23.58 (around 26K steps). We exported the datasets for those times, and here is the result.
Another important detail: the data comes only from "raw:com.google.step_count.delta:com.huawei.health". We went through other datasets that might look suspicious, and all were empty (apart from derived ones and so on).
If we interpret this correctly, it is probably Huawei that sends one value at one moment and a different one the next, so it is probably some misconfiguration on the Huawei side.
Here are the datasets exported:
https://gist.github.com/jmarti-theinit/8d98996873a9c499a14899a9b62162f3
Result of the GIST is:
Length of 18.58 points 165
Length of 21.58 points 503
Length of 22.58 points 294
Length of 23.58 points 537
How many points in 21.58 that exist in 18.58 => 165
How many points in 22.58 that exist in 18.58 => 57
How many points in 22.58 that exist in 21.58 => 294
How many points in 23.58 that exist in 18.58 => 165
How many points in 23.58 that exist in 21.58 => 503
How many points in 23.58 that exist in 22.58 => 294
So our bet is that points are removed and added by the devices behind Huawei (for example, only 57 points are common between 18.58 and 22.58), and we cannot control anything more from Google Fit's side. Is that correct? Is there anything else we could check?
We're having similar issues using the REST API.
Here you have what coincides with the case of Jordi:
we are also from Spain (and our users too), although we use servers in Spain and the US
we get the same daily steps value as the Google Fit app for some users, but not for others
daily steps increase during the current day, but when we repeat the request on subsequent days, the total sometimes decreases
we are making the same request, from the start of the day to the end of the day, with 86400000 as the bucket duration and the same data type and data source id
We are in the final development phase, so we are testing with only a few users. Our users have Xiaomi Mi Band devices.
We think that the problem could be a desynchronization of the servers that we're hitting, because if we test with other apps like this one, they show the correct values. We've created another google cloud console oauth client credentials and new email accounts to test with a brand new users and oauth clients, but the results are the same.
This is the recommended way to get the daily step total, and we are using exactly the same request:
https://developers.google.com/fit/scenarios/read-daily-step-total
Even with the "Try it" option in the documentation, the results are wrong.
What else can we do to help you resolve the issue?
Thank you very much!

Search for businesses that carry my products on Google Maps, by zip code

My requirement is as follows.
Say my product is available for sale in many stores. When a user goes to find the locations of the stores that carry my product:
1) the location should be detected from the user's IP address (nice to have)
2) provide a zip code field and let the user filter by distance (10 mi, 25 mi, 50 mi), fetching all business locations that carry my products near the provided zip code
3) show the results on a map and also in the following format:
Costco, 10 Lawrence Expressway, Sunnyvale CA, hours, phone number, website
Safeway, 1500 Almaden Expressway, San Jose CA, hours, phone number, website
Whole Foods, Blossom Hill Rd, Los Gatos CA, hours, phone number, website
Any information or examples will be really appreciated. Thanks.
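The distance-filter step (2) can be sketched independently of any maps provider: geocode the zip code to a latitude/longitude (e.g. with a geocoding API), store coordinates with each store, and filter with the haversine formula. Everything below is illustrative; the store coordinates are approximate stand-ins, not real addresses.

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical store list: (name, lat, lon), coordinates illustrative only.
stores = [
    ("Costco, Sunnyvale CA", 37.3894, -122.0312),
    ("Safeway, San Jose CA", 37.2634, -121.8810),
    ("Whole Foods, Los Gatos CA", 37.2266, -121.9747),
]

def within(lat, lon, radius_miles):
    """Stores within radius_miles of the geocoded zip-code centroid."""
    return [name for name, slat, slon in stores
            if haversine_miles(lat, lon, slat, slon) <= radius_miles]

print(within(37.3541, -121.9552, 10))  # searching around a Santa Clara point
```

For production use you would normally keep the coordinates in the database and pre-filter with a bounding box before applying haversine, or use the database's spatial functions (e.g. MySQL's ST_Distance_Sphere); Google Maps Platform also offers geocoding and map display for steps (1) and (3).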

One physical USB hard disk presents two ~2 TB block devices, but I want one large 4 TB device

I have a vendor-preformatted hard disk (Verbatim Store 'n' Save USB 3.0 4TB). When I connect it to my CentOS 6 server I see two block devices, /dev/sdd and /dev/sde, each about 2 TB in size, with corresponding partitions. dmesg gives:
usb 4-4: new SuperSpeed USB device number 3 using xhci_hcd
usb 4-4: LPM exit latency is zeroed, disabling LPM.
usb 4-4: New USB device found, idVendor=18a5, idProduct=0400
usb 4-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 4-4: Product: USB 3.0 Desktop HD
usb 4-4: Manufacturer: Verbatim
usb 4-4: SerialNumber: 30624C151155
usb 4-4: configuration #1 chosen from 1 choice
scsi7 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 3
usb-storage: waiting for device to settle before scanning
usb-storage: device scan complete
scsi 7:0:0:0: Direct-Access TOSHIBA MD04ACA400 FP2A PQ: 0 ANSI: 6
scsi 7:0:0:1: Direct-Access TOSHIBA MD04ACA400 FP2A PQ: 0 ANSI: 6
sd 7:0:0:0: Attached scsi generic sg4 type 0
sd 7:0:0:1: Attached scsi generic sg5 type 0
sd 7:0:0:0: [sdd] 4294965248 512-byte logical blocks: (2.19 TB/1.99 TiB)
sd 7:0:0:0: [sdd] Write Protect is off
sd 7:0:0:0: [sdd] Mode Sense: 1f 00 00 08
sd 7:0:0:0: [sdd] Assuming drive cache: write through
sd 7:0:0:1: [sde] 3519071920 512-byte logical blocks: (1.80 TB/1.63 TiB)
sd 7:0:0:1: [sde] Write Protect is off
sd 7:0:0:1: [sde] Mode Sense: 1f 00 00 08
fdisk gives:
fdisk -l /dev/sdd
Disk /dev/sdd: 2199.0 GB, 2199022206976 bytes
255 heads, 63 sectors/track, 267349 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x1428089d
Device Boot Start End Blocks Id System
/dev/sdd1 1 267350 2147480576 c W95 FAT32 (LBA)
I want to have only one 4 TB device.
I tried to use parted to set the disk label type to GPT, but without success.
With 512-byte logical sectors, an MBR partition table cannot address more than 2 TiB, which is why the drive ships split up. But note from the dmesg output that the enclosure's USB bridge presents the disk as two separate SCSI LUNs (7:0:0:0 and 7:0:0:1): switching to GPT removes the 2 TiB partition limit, yet it cannot merge the two LUNs back into one device, because that split happens in the enclosure's firmware.
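If reformatting is acceptable, one workaround (a sketch, and it destroys the existing data) is to stitch the two LUNs together at the block layer with LVM, using the device names from the dmesg output above:

```shell
# WARNING: wipes both devices.
pvcreate /dev/sdd /dev/sde            # mark both LUNs as LVM physical volumes
vgcreate usb4tb /dev/sdd /dev/sde     # one volume group spanning both LUNs
lvcreate -l 100%FREE -n data usb4tb   # one ~4 TB logical volume
mkfs.ext4 /dev/usb4tb/data
```

The two LUNs remain separate as far as the USB bridge is concerned; LVM only presents them as a single logical device to the filesystem. Connecting the bare drive through a different enclosure or directly via SATA would be the way to get a true single 4 TB block device.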

Stock management of assemblies and their sub-parts (relationships)

I have to track the stock of individual parts and kits (assemblies) and can't find a satisfactory way of doing this.
Sample bogus and hyper simplified database:
Table prod:
prodID 1
prodName Flux capacitor
prodCost 900
prodPrice 1350 (900*1.5)
prodStock 3
-
prodID 2
prodName Mr Fusion
prodCost 300
prodPrice 600 (300*2)
prodStock 2
-
prodID 3
prodName Time travel kit
prodCost 1200 (900+300)
prodPrice 1560 (1200*1.3)
prodStock 2
Table rels
relID 1
relSrc 1 (Flux capacitor)
relType 4 (is a subpart of)
relDst 3 (Time travel kit)
-
relID 2
relSrc 2 (Mr Fusion)
relType 4 (is a subpart of)
relDst 3 (Time travel kit)
prodPrice: it is calculated from the cost, but not linearly. In this example, for costs of 500 or less the markup is 200%; for costs of 500-1000 the markup is 150%; for costs above 1000 the markup is 130%.
That's why the time travel kit is much cheaper than the individual parts bought separately.
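The tiered markup above can be written as a small function (thresholds taken from the example: up to 500 sells at 200% of cost, 500-1000 at 150%, above 1000 at 130%; the function name is illustrative):

```python
def prod_price(cost):
    """Sell price from cost, using the tiered markup from the question."""
    if cost <= 500:
        factor = 2.0   # 200% markup tier
    elif cost <= 1000:
        factor = 1.5   # 150% markup tier
    else:
        factor = 1.3   # 130% markup tier
    return round(cost * factor, 2)

print(prod_price(300))   # Mr Fusion       -> 600.0
print(prod_price(900))   # Flux capacitor  -> 1350.0
print(prod_price(1200))  # Time travel kit -> 1560.0
```

This reproduces the three sample prices, including the kit (900 + 300 = 1200 cost, 1560 price) coming out cheaper than buying the two parts separately (1350 + 600 = 1950).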
prodStock: here is my problem. I can sell kits or the individual parts, so the stock of the kits is virtual.
The problem when I buy:
Some providers sell me the Time Travel kit as a whole (with one barcode) and some sell me the individual parts (each with a different barcode), so when I receive stock I don't know how to record it.
The problem when I sell:
If I only sold kits, calculating the stock would be easy: "I have 3 Flux capacitors and 2 Mr Fusions, so I have 2 Time travel kits and a spare Flux capacitor."
But I can sell kits or individual parts, so I have to track the stock of the individual parts and of the possible kits at the same time (and I have to adjust the sell price accordingly).
Probably this is really simple, but I can't see a simple solution.
Summing up: I have to find a way of tracking the stock, and the database/program has to do it (I can't ask the clerk to correct the stock).
I'm using PHP + MySQL, but this is more a logical problem than a programming one.
Update: sadly, Eagle's solution won't work:
the relationships can be (and are) recursive (one kit uses another kit)
there are kits that use more than one of the same part (2 Flux capacitors + 1 Mr Fusion)
I really need to store a value for the stock of each kit. The same database is used for the web page where users buy the parts, and I have to show the available stock (otherwise they won't even try to buy). I can't afford to calculate the stock on every user search on the web page.
But I liked the idea of a boolean marking the stock as virtual.
Okay, first of all: since the prodStock for the Time travel kit is virtual, you cannot store it in the database; it will essentially be a calculated field. It would help to have a boolean column on the table that says whether prodStock is calculated. I'll pretend you have this field and call it isKit (where TRUE means the row is a kit and its prodStock should be calculated).
Now to calculate the amount of each item that is in stock:
SELECT p.prodID, p.prodName, p.prodCost, p.prodPrice, p.prodStock
FROM prod p
WHERE NOT p.isKit
UNION ALL
SELECT p.prodID, p.prodName, p.prodCost, p.prodPrice, MIN(c.prodStock) AS prodStock
FROM prod p
INNER JOIN rels r ON (p.prodID = r.relDst AND r.relType = 4)
INNER JOIN prod c ON (r.relSrc = c.prodID AND NOT c.isKit)
WHERE p.isKit
GROUP BY p.prodID, p.prodName, p.prodCost, p.prodPrice
I used the alias c for the second prod to stand for "component". I explicitly wrote NOT c.isKit since this won't work recursively. UNION ALL is used rather than UNION for efficiency; both return the same results here.
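To make the behaviour concrete, here is the query run as-is against the sample data (SQLite through Python so it is self-contained; the SQL is plain enough that MySQL behaves the same, with isKit stored as 0/1):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prod (prodID INT, prodName TEXT, prodCost INT, prodPrice INT,
                   prodStock INT, isKit INT);
CREATE TABLE rels (relID INT, relSrc INT, relType INT, relDst INT);
INSERT INTO prod VALUES (1, 'Flux capacitor', 900, 1350, 3, 0);
INSERT INTO prod VALUES (2, 'Mr Fusion', 300, 600, 2, 0);
INSERT INTO prod VALUES (3, 'Time travel kit', 1200, 1560, NULL, 1);
INSERT INTO rels VALUES (1, 1, 4, 3);   -- Flux capacitor is a subpart of the kit
INSERT INTO rels VALUES (2, 2, 4, 3);   -- Mr Fusion is a subpart of the kit
""")
rows = conn.execute("""
SELECT p.prodID, p.prodName, p.prodStock FROM prod p WHERE NOT p.isKit
UNION ALL
SELECT p.prodID, p.prodName, MIN(c.prodStock) AS prodStock
FROM prod p
INNER JOIN rels r ON (p.prodID = r.relDst AND r.relType = 4)
INNER JOIN prod c ON (r.relSrc = c.prodID AND NOT c.isKit)
WHERE p.isKit
GROUP BY p.prodID, p.prodName
""").fetchall()
print(sorted(rows))  # kit stock = MIN(3, 2) = 2
```

With 3 Flux capacitors and 2 Mr Fusions in stock, the kit row comes back with a calculated stock of 2, matching the "2 Time travel kits and a spare Flux capacitor" reasoning in the question.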
Caveats:
This won't work recursively (e.g. if a kit requires components from another kit).
This only works on kits that require only one of a particular item (e.g. if a time travel kit were to require 2 flux capacitors and 1 Mr. Fusion, this wouldn't work).
I didn't test this so there may be minor syntax errors.
This only calculates the prodStock field; to do the other fields you would need similar logic.
If your query is much more complicated than what I assumed, I apologize, but I hope that this can help you find a solution that will work.
As for how to handle the data when you buy a kit: this approach assumes you store prodStock only on the component parts. For example, if you purchase a time machine from a supplier, instead of increasing prodStock on the time machine product, you would increase it on the flux capacitor and the Mr. Fusion.
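The two caveats raised in the update (recursive kits, and more than one of the same part) are easier to express in application code than in pre-8.0 MySQL. A sketch of the calculation, assuming a hypothetical per-relationship quantity (the relQty column the rels table would need): buildable stock of a kit is the minimum over its components of floor(stock(component) / qty), applied recursively.

```python
# Physical parts carry stored stock; kits carry a bill of materials.
stock = {"Flux capacitor": 3, "Mr Fusion": 2}
bom = {  # kit -> [(component, quantity needed)], components may be kits too
    "Time travel kit": [("Flux capacitor", 2), ("Mr Fusion", 1)],
    "Deluxe kit": [("Time travel kit", 1), ("Mr Fusion", 1)],
}

def buildable(item):
    """How many of `item` can be supplied: stored stock for parts,
    recursive min over components for kits."""
    if item in stock:
        return stock[item]
    return min(buildable(part) // qty for part, qty in bom[item])

print(buildable("Time travel kit"))  # min(3 // 2, 2 // 1) = 1
print(buildable("Deluxe kit"))       # min(1 // 1, 2 // 1) = 1
```

Like the question's own framing, this treats each kit's virtual stock independently; it does not model two kits competing for the same physical parts. To keep the web page fast, the computed values could be cached in prodStock for isKit rows and refreshed whenever a purchase or sale touches a component. On MySQL 8+, a WITH RECURSIVE query could express the same walk in SQL.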