How can I calculate rankings of unique values in MySQL?

Let's say I have a simple table with 'appID' and 'userID', which is populated with a row for each app ever used by a specific user. (These IDs refer to other tables with information about the app and the user, which do not matter for this question.)
For example:
appID userID
1 1
1 2
2 2
Here, we can see app #1 was used by two users, but app #2 was used by one user.
I want to generate statistics for the most commonly used app using a MySQL query, if possible. The query would return a list of sorted results with the appID and the total number of unique users using it.
I did some research but cannot figure out an easy way to do this in SQL. If anyone can help, I'd appreciate it. If it requires a very long and involved stored procedure, I may just switch to doing some of the calculations in code (at least initially), since it will be easier to troubleshoot.

SELECT appID, COUNT(DISTINCT userID) AS uniqueUsers
FROM myTable
GROUP BY appID
ORDER BY uniqueUsers DESC
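With the sample rows above, that would return something like:
appID uniqueUsers
1 2
2 1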

Related

Many-to-many relationship database, which to pick with a lot of users

We are currently working on a game with a database where the player can have certain items that are stored in the database. I always learned, at school and in the field, to use option 1, but my colleague is saying option 2.
We have to pick one option, and we are asking the question now: which of the two options is the best and fastest?
And also, which one is the best and fastest with 50K users?
Option 1. You are correct.
If you use option 2 you'll be sorry. It's denormalized. Updating those comma-separated lists of itemID values is ridiculously difficult. You'll have to use transactions to read the value string, change it, and write it back.
Also, option 1 can exploit database indexes for much more efficient searching. Ask your friend how you will find all users with itemId = 15. Ask him to write that query for you. With Option 1 you can use
SELECT UserId
FROM tbl
WHERE ItemId = 15
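For reference, a rough sketch of what the option 1 link table could look like, with an index that lets that query avoid a full scan (the names and types here are assumptions, not the actual schema):
CREATE TABLE linktable (
  UserId INT NOT NULL,
  ItemId INT NOT NULL,
  PRIMARY KEY (UserId, ItemId),
  KEY idx_itemid (ItemId)  -- lets WHERE ItemId = 15 be answered from an index
);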
You can use a query to generate option 2 for display. Display is all it's good for.
SELECT UserId, GROUP_CONCAT(DISTINCT ItemId ORDER BY ItemID) ItemId
FROM linktable
GROUP BY UserId

Returning first matching row / improving query performance

I have a database of public records with over 1 million rows. I need to query the database using a user's full name and birthdate. The query would look like this:
SELECT *
FROM TexasBexarCountyMisdemeanorPublicRecords
WHERE ([FULL-NAME] = 'EXAMPLELASTNAME, EXAMPLEFIRSTNAME')
AND (BIRTHDATE = '1989-10-18 00:00:00')
Currently this query takes 2 minutes and 45 seconds to complete successfully. Unfortunately, because of the data structure and the information the user provides, I can't think of any other way of querying the database.
The purpose of this query is to provide a list of records pertaining to a single user. These records define their ownership by having a user's full name and date of birth. All the columns have string values. It is also possible that a user has more than one record.
This query is extremely inefficient. Is there a better way to search through the records looking for a FULL-NAME and BIRTHDATE match?
I was thinking of stopping the search after the first match; however, that would not be useful for users that have several records.
I already have a primary key column and can find records using that ID quickly. However, there is no way I know that ID when the user is logging in for the very first time. Thus I need to search through the records using their first name, last name, and date of birth, and then save those IDs for future reference.
If I add more criteria to the search, would that make it more efficient? For example, if I pass values for SEX, RACE, etc.
Thanks for the help.
It returns quickly on ID because it's indexed. Index the fields you are searching on, and also consider covering indexes if appropriate.
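A minimal sketch of the kind of index that suggests, assuming the column names from the query above and MySQL-style backtick quoting for the hyphenated column (adjust to your actual schema and SQL dialect):
CREATE INDEX idx_fullname_birthdate
ON TexasBexarCountyMisdemeanorPublicRecords (`FULL-NAME`, BIRTHDATE);
With both columns in one composite index, the name + birthdate lookup can be resolved from the index instead of a full table scan.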

mysql group_concat alternative or multiple rows as columns

Before I start my question, I'll briefly cover what the problem is:
I have a table that stores around 4 million 'parameter' values. These values have an id, a simulation id and a parameter id.
The parameter id maps to a parameter table that basically just maps the id to a text-like representation of the parameter (x, y, etc.).
The simulation table has around 170k entries that map parameter values to a job.
There is also a score table which stores the score of each simulation; simulations have varying numbers of scores, for example one might have one score and another might have three. The score table has a simulation_id column for selecting this.
Each job has an id and an objective.
Currently I'm trying to select all the parameter_values whose parameter is 'x' and where the job id is 17, and fetch the score for each. The variables of the select will change, but in principle these are really the only things I'm interested in.
Currently I'm using this statement:
SELECT simulation.id, value, name,
       (SELECT GROUP_CONCAT(score)
        FROM score
        WHERE score.simulation_id = simulation.id) AS score
FROM simulation, parameter_value, parameter
WHERE simulation.id = parameter_value.simulation_id
  AND simulation.job_id = 17
  AND parameter_value.parameter_id = parameter.id
  AND parameter.name = "$x1"
This works nicely except it's taking around 3 seconds to execute. Can this be done any faster?
I don't know if it would be faster to run a query before this to pre-calculate the parameter_ids I'm searching for, and then use a WHERE parameter_id IN (1, 2, 3, 4) (sketched below).
But I was under the impression SQL would optimize this anyway?
I have created indexes wherever possible but can't get faster than the 2.7-second mark.
So my questions would be:
Should I pre-calculate some values and avoid the joins?
Is there an alternative to GROUP_CONCAT for getting the scores?
And are there any other optimizations I could make to this?
I should also add that the scores must be in the same row, or at least be returned sorted, so I can easily read them from the result set.
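For reference, a rough sketch of the pre-calculated-ID idea mentioned above, with GROUP_CONCAT ordered so the scores come back sorted; the id value 3 is only a placeholder, and the name column is dropped because the parameter table is no longer joined:
SELECT id FROM parameter WHERE name = "$x1";

SELECT simulation.id, value,
       (SELECT GROUP_CONCAT(score ORDER BY score)
        FROM score
        WHERE score.simulation_id = simulation.id) AS score
FROM simulation
JOIN parameter_value ON parameter_value.simulation_id = simulation.id
WHERE simulation.job_id = 17
  AND parameter_value.parameter_id IN (3);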
Thanks,
Lewis

count(*) using left join taking a long time to respond

I have a MySQL query for showing the count of data on the application listing page.
query
SELECT COUNT(*) FROM cnitranr left join cnitrand on cnitrand.tx_no=cnitranr.tx_no
EXPLAIN screenshot
Indexes on cnitranr: tx_no (primary), approx 1 crore (10 million) rows [ENGINE MyISAM]
Index on cnitrand: tx_no (secondary), approx 2 crore (20 million) rows [ENGINE MyISAM]
Profiler output is like this
Can anyone suggest possibilities for optimizing this query, or should I run a cron job to pre-compute the count? Please help.
You would need to implement a materialized view.
Since MySQL does not support them directly, you would need to create a table like this:
CREATE TABLE totals (cnt INT)
and write triggers on both tables that increment and decrement cnt on INSERT, UPDATE and DELETE to each of the tables.
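A minimal sketch of what such triggers could look like, assuming the total simply tracks rows added to and removed from cnitrand; the real increment logic has to mirror the LEFT JOIN semantics (for example, the first matching cnitrand row for a given tx_no does not add a row to the join result), so treat this purely as a skeleton:
-- seed the running total once from the real query
INSERT INTO totals (cnt)
SELECT COUNT(*) FROM cnitranr LEFT JOIN cnitrand ON cnitrand.tx_no = cnitranr.tx_no;

CREATE TRIGGER cnitrand_after_insert AFTER INSERT ON cnitrand
FOR EACH ROW UPDATE totals SET cnt = cnt + 1;

CREATE TRIGGER cnitrand_after_delete AFTER DELETE ON cnitrand
FOR EACH ROW UPDATE totals SET cnt = cnt - 1;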
Note that if you have a record with many linked records in either table, the DML affecting such a record would be slow.
On large data volumes, you very rarely need exact counts, especially for pagination. As I said in a comment above, Google, Facebook etc. only show approximate numbers on paginated results.
It's very unlikely that a person would want to browse through 20M+ records on a page only able to show 100 or so.
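If an approximate figure is acceptable, a rough per-table estimate can also be read from information_schema instead of counting (this approximates single-table row counts, not the joined count, and for InnoDB it is only an estimate):
SELECT TABLE_NAME, TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('cnitranr', 'cnitrand');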

MySQL query that can get distinct ids for logins and how many there were

I have a database that gets a new row added every time a user logs into the system. What I would like to be able to do first is make a query that gets the distinct entries by the hashed user ids. This I can do; SELECT DISTINCT is our friend, it seems. After this, however, I would still like to be able to get a count, per user id, of how many times people logged in.
For instance, if Max logged in 3 times and Sally logged in 2 times I would have five rows in my DB. After running SELECT DISTINCT by their user ids it would just give me one Max and one Sally user id. After having this information, however, I'd like to be able to create a hash, map, whatever, that stores the following information:
Max => 3
Sally => 2
Is there a way, using pure (or mostly pure) SQL, that this could be achieved in an efficient manner, or should I simply get out ALL of the login db rows and search and compile the information myself? I know I could do it this way, but somehow that feels slower. Thanks.
You can do this in SQL using the "GROUP BY" operator. For example:
SELECT user_id, count(*) FROM USERS GROUP BY user_id
SELECT username,COUNT(*) AS LoginCount FROM logintable GROUP BY username
SELECT login, COUNT(login) FROM yourtable GROUP BY login
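With the Max/Sally example above, any of these would return one row per user with the login count, e.g.:
Max 3
Sally 2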
Sure...
select hashed_user_id, count(*)
from your_table_name
group by hashed_user_id
The answer could be more specific if you give the table structure;
however, this is typically an aggregate function with a GROUP BY - something like this:
select user, count(user)
from access_log
group by user
Trying to do a count like you're saying is unnecessary overhead. My advice would be to add an extra field, 'logincount', to your users table and +1 it every time a user logs in. Much quicker.
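A quick sketch of that approach, assuming a users table keyed by user_id (the table and column names are placeholders):
ALTER TABLE users ADD COLUMN logincount INT NOT NULL DEFAULT 0;

UPDATE users SET logincount = logincount + 1 WHERE user_id = 42;
The trade-off is that you lose the per-login detail (timestamps, etc.) unless you also keep the original login rows.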