To simplify: I have a database of registered users, and I want to count how many emails there are for each email domain name (note that I do not know all the domain names in advance).
for example,
Users table-
id | email
------------------
1 | test#hotmail.com
2 | test#somesite.aaa<--unknown to me
3 | test#unknownsite.aaa<--unknown to me
4 | test#hotmail.com
5 | test#somesite.aaa<--unknown to me
6 | test#yahoo.com
7 | test#yahoo.com
8 | test#hotmail.com
( note: I want to count each email without specifying which email exactly )
so the result I want is
suffix | count
hotmail.com | 3
somesite.aaa | 2
unknownsite.aaa| 1
yahoo.com | 2
Again, I stress this: I do not know unknownsite.aaa, nor can I mention it in a statement, because it is unknown to me. I hope I am clear.
So essentially I want to build a statistic of which email hosts my website's users use.
But as I said, and I will repeat it a third time, I do not know every mail host that exists.
I am going to investigate this more; I have a feeling this is something MySQL cannot handle.
EDIT:
I came up with the following solution, but it seems a tiny bit redundant, which I hate :P
select substring_index(`email`,'#',-1),count(*) as count from users group by substring_index(`email`,'#',-1) order by count ASC;
SELECT SUBSTRING_INDEX(email, '#', -1) AS suffix, COUNT(*) AS count
FROM users
GROUP BY suffix
See it on sqlfiddle.
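The grouping idea above can be tried without a MySQL server. What follows is only a minimal sketch using Python's built-in sqlite3 module as a stand-in: SQLite has no SUBSTRING_INDEX, so substr and instr are swapped in to take everything after the first '#', whereas SUBSTRING_INDEX(email, '#', -1) takes everything after the last one. The function name count_by_domain and the alias n are made up for illustration.

```python
import sqlite3


def count_by_domain(rows):
    """Return (suffix, count) pairs for a list of (id, email) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    # Assumption: SQLite stands in for MySQL here, so substr/instr replace
    # SUBSTRING_INDEX. No domain name is ever listed in the query itself.
    result = conn.execute(
        """
        SELECT substr(email, instr(email, '#') + 1) AS suffix, COUNT(*) AS n
        FROM users
        GROUP BY suffix
        ORDER BY suffix
        """
    ).fetchall()
    conn.close()
    return result


# Sample rows copied from the question's Users table.
sample = [
    (1, "test#hotmail.com"),
    (2, "test#somesite.aaa"),
    (3, "test#unknownsite.aaa"),
    (4, "test#hotmail.com"),
    (5, "test#somesite.aaa"),
    (6, "test#yahoo.com"),
    (7, "test#yahoo.com"),
    (8, "test#hotmail.com"),
]
print(count_by_domain(sample))
```

Because the domain is computed from each row and then grouped, unknown hosts such as unknownsite.aaa show up in the result on their own.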
Use RLIKE or REGEXP in MySQL:
http://dev.mysql.com/doc/refman/5.0/en/regexp.html
I asked a similar question earlier today, but I've run into another issue that I need assistance with.
I have a logging system that scans a server and catalogs every user that's online at that given moment. Here is what my table looks like:
-----------------
| ab_logs |
-----------------
| id |
| scan_id |
| found_user |
-----------------
id is an auto-incrementing primary key. It has no real value other than that.
scan_id is an integer that is incremented after each successful scan of all users. It exists so I can separate results from different scans.
found_user stores which user was found online during the scan.
The above will generate a table that could look like this:
id | scan_id | found_user
----------------------------
1 | 1 | Nick
2 | 2 | Nick
3 | 2 | John
4 | 3 | John
So on the first scan the system found only Nick online. On the 2nd it found both Nick and John. On the 3rd only John was still online.
My problem is that I want the total number of unique users that have connected to the server as of each scan. In other words, I want the cumulative number of distinct users seen up to and including each scan. Think of a running counter.
From the example above, the result I want from the sql is:
1
2
2
EDIT:
This is what I have tried so far, but it's wrong:
SELECT COUNT(DISTINCT(found_user)) FROM ab_logs WHERE DATE(timestamp) = CURDATE() GROUP BY scan_id
What I tried returns this:
1
2
1
The code below should give you the results you are looking for
select s.scan_id, count(*) from
(select distinct
t.scan_id
,t1.found_user
from
tblScans t
inner join tblScans t1 on t.scan_id >= t1.scan_id) s
group by
s.scan_id;
It assumes the names are unique, and it includes the current scan and every previous scan in the count. (tblScans here appears to stand in for the question's ab_logs table.)
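The self-join approach can be checked against the sample data from the question. This is a minimal sketch, assuming Python's sqlite3 module as a stand-in for MySQL and the question's ab_logs table; it uses COUNT(DISTINCT ...) over the join instead of the DISTINCT subquery, which should give the same running totals. The function name running_unique_users is hypothetical.

```python
import sqlite3


def running_unique_users(rows):
    """Return (scan_id, distinct users seen up to that scan) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ab_logs ("
        "id INTEGER PRIMARY KEY, scan_id INTEGER, found_user TEXT)"
    )
    conn.executemany("INSERT INTO ab_logs VALUES (?, ?, ?)", rows)
    # Pair every scan (t) with all rows from that scan or earlier (t1),
    # then count the distinct users in each scan's group.
    result = conn.execute(
        """
        SELECT t.scan_id, COUNT(DISTINCT t1.found_user) AS users_so_far
        FROM ab_logs t
        JOIN ab_logs t1 ON t1.scan_id <= t.scan_id
        GROUP BY t.scan_id
        ORDER BY t.scan_id
        """
    ).fetchall()
    conn.close()
    return result


# Sample rows copied from the question.
logs = [(1, 1, "Nick"), (2, 2, "Nick"), (3, 2, "John"), (4, 3, "John")]
print(running_unique_users(logs))  # [(1, 1), (2, 2), (3, 2)]
```

The join grows quadratically with the number of log rows, so on a large table it is probably worth restricting the scans being compared.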
Try with group by clause:
SELECT scan_id, count(*)
FROM mytable
GROUP BY scan_id
For some reason, I am unable to export a table of subscribers from my phpList (ver. 3.0.6) admin pages. I've searched the web, and several others have had this problem, but no workarounds have been posted. As a workaround, I would like to query the MySQL database directly to retrieve a similar table of subscribers, but I need help with the SQL command. Note that I don't want to export or back up the MySQL database; I want to query it in the same way that the "export subscribers" button is supposed to do in the phpList admin pages.
In brief, I have two tables to query. The first table, user contains an ID and email for every subscriber. For example:
id | email
1 | e1#gmail.com
2 | e2#gmail.com
The second table, user_attribute, contains a userid, attributeid, and value. Note in the example below that userid 1 has values for all three possible attributes, while userids 2 and 3 are either missing one or more of the three attributeids or have blank values for some.
userid | attributeid | value
1 | 1 | 1
1 | 2 | 4
1 | 3 | 6
2 | 1 | 3
2 | 3 |
3 | 1 | 4
I would like to execute a SQL statement that would produce a row of output for each id/email that would look like this (using id 3 as an example):
id | email | attribute1 | attribute2 | attribute3
3 | e3#gmail.com | 4 | "" | ""
Can someone suggest a SQL query that could accomplish this task?
A related query I would like to run is to find every id/email that does not have a value for attribute3. In the example above, this would be ids 2 and 3. Note that id 3 does not even have a blank value for attributeid 3; it is simply missing.
Any help would be appreciated.
John
I know this is a very old post, but I just had to do the same thing. Here's the query I used. Note that you'll need to modify the query based on the custom attributes you have set up. You can see I had name, city and state, as shown in the AS clauses below. You'll need to map those to the attribute ids. Also, the state attribute has a table of state names that I linked to. I excluded users who are blacklisted (unsubscribed), have more than 2 bounces, or are unconfirmed.
SELECT
users.email,
(SELECT value
FROM `phplist_user_user_attribute` attrs
WHERE
attrs.userid = users.id and
attributeid=1
) AS name,
(SELECT value
FROM `phplist_user_user_attribute` attrs
WHERE
attrs.userid = users.id and
attributeid=3
) AS city,
(SELECT st.name
FROM `phplist_user_user_attribute` attrs
LEFT JOIN `phplist_listattr_state` st
ON attrs.value = st.id
WHERE
attrs.userid = users.id and
attributeid=4
) AS state
FROM
`phplist_user_user` users
WHERE
users.blacklisted=0 and
users.bouncecount<3 and
users.confirmed=1
;
I hope someone finds this helpful.
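For the simplified two-table layout in the question, the same result can also be sketched with one LEFT JOIN and conditional aggregation instead of one subquery per attribute. This is a swapped-in technique, not the query above, and it rests on assumptions: Python's sqlite3 module stands in for MySQL, the table names user and user_attribute come from the question, e3#gmail.com is taken from the question's expected output, and each user has at most one row per attributeid. The names build_db, PIVOT_SQL and MISSING_ATTR3_SQL are made up.

```python
import sqlite3


def build_db():
    """Create the question's two sample tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "user" (id INTEGER PRIMARY KEY, email TEXT)')
    conn.execute(
        "CREATE TABLE user_attribute "
        "(userid INTEGER, attributeid INTEGER, value TEXT)"
    )
    conn.executemany(
        'INSERT INTO "user" VALUES (?, ?)',
        [(1, "e1#gmail.com"), (2, "e2#gmail.com"), (3, "e3#gmail.com")],
    )
    conn.executemany(
        "INSERT INTO user_attribute VALUES (?, ?, ?)",
        [(1, 1, "1"), (1, 2, "4"), (1, 3, "6"),
         (2, 1, "3"), (2, 3, ""), (3, 1, "4")],
    )
    return conn


# One output row per user; a missing attribute becomes an empty string.
PIVOT_SQL = """
SELECT u.id, u.email,
       COALESCE(MAX(CASE WHEN a.attributeid = 1 THEN a.value END), '') AS attribute1,
       COALESCE(MAX(CASE WHEN a.attributeid = 2 THEN a.value END), '') AS attribute2,
       COALESCE(MAX(CASE WHEN a.attributeid = 3 THEN a.value END), '') AS attribute3
FROM "user" u
LEFT JOIN user_attribute a ON a.userid = u.id
GROUP BY u.id, u.email
ORDER BY u.id
"""

# Users whose attribute 3 is either blank or not present at all.
MISSING_ATTR3_SQL = """
SELECT u.id, u.email
FROM "user" u
LEFT JOIN user_attribute a ON a.userid = u.id AND a.attributeid = 3
WHERE a.value IS NULL OR a.value = ''
ORDER BY u.id
"""

conn = build_db()
pivot_rows = conn.execute(PIVOT_SQL).fetchall()
missing_rows = conn.execute(MISSING_ATTR3_SQL).fetchall()
print(pivot_rows)
print(missing_rows)
```

In MySQL the table name would be quoted with backticks rather than double quotes, and a real phpList install uses the prefixed table names shown in the answer above.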
I have two tables which allow a user to request songs. Of course a song can be requested by multiple users:
| Id | Song_Name |     | Requested_Id | By_IP   |
+====+===========+     +==============+=========+
| 1  | song1     |     | 1            | 1.1.1.1 |
| 2  | song2     |     | 1            | 2.2.2.2 |
| 3  | song3     |     | 1            | 3.3.3.3 |
                       | 2            | 2.2.2.2 |
In order to prevent one user from requesting a song multiple times (abuse), I need to check whether a certain song has already been requested by the user who is trying to request it again. So I'm doing a LEFT JOIN between the first and the second table and a GROUP BY on the row's Id, which returns one row for each song.
PROBLEM: GROUP BY returns unpredictable values for fields which are not grouped. That is known. But how can I make sure that SELECT returns the row containing a specific IP, in case this IP exists in the group? If the IP does not exist, any other row of the group can be returned by SELECT.
Thanks a lot!
UPDATE: I need to show the song in a list, independent of how many users (or even none at all) have requested it. So SELECT definitely needs to return one row for every song. But in case that for example the user with IP 3.3.3.3 is trying to request song1, (which was already requested by him) I expect the query to return this:
| Id | Song_Name | By_IP |
+====+===========+=========+
| 1 | song1 | 3.3.3.3 | (3.3.3.3 in case it exists, otherwise anything else)
| 2 | song2 | 2.2.2.2 |
I also need the grouping with the other requests (IPs), because I need to get the whole number of requests per song as well. Therefore I use Count().
WORKAROUND: Since it seems to be pretty complicated to do what I need (if it is possible at all), I'm now using a workaround: the GROUP_CONCAT() aggregate function. This gives me all IPs of the group separated by ",", so I can search whether the one I'm looking for already exists there. The only drawback is that the (default) maximum length of the returned string is 1024. That means I can't handle a large number of users, but for now it should be fine.
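One possible alternative to the GROUP_CONCAT() workaround is to let an aggregate answer the yes/no question directly, which avoids the string length limit. This is only a sketch under assumptions: Python's sqlite3 module stands in for MySQL, the table names Songs and requested_songs are borrowed from the answer below, and the function names and the requested_by_ip flag are made up. It returns a flag instead of the IP itself.

```python
import sqlite3


def build_db():
    """Create the two sample tables from the question in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Songs (Id INTEGER PRIMARY KEY, Song_Name TEXT)")
    conn.execute("CREATE TABLE requested_songs (Requested_Id INTEGER, By_IP TEXT)")
    conn.executemany(
        "INSERT INTO Songs VALUES (?, ?)",
        [(1, "song1"), (2, "song2"), (3, "song3")],
    )
    conn.executemany(
        "INSERT INTO requested_songs VALUES (?, ?)",
        [(1, "1.1.1.1"), (1, "2.2.2.2"), (1, "3.3.3.3"), (2, "2.2.2.2")],
    )
    return conn


def songs_with_request_flag(conn, ip):
    """One row per song: id, name, total requests, 1 if `ip` requested it."""
    return conn.execute(
        """
        SELECT s.Id, s.Song_Name,
               COUNT(r.Requested_Id) AS total_requests,
               MAX(CASE WHEN r.By_IP = ? THEN 1 ELSE 0 END) AS requested_by_ip
        FROM Songs s
        LEFT JOIN requested_songs r ON r.Requested_Id = s.Id
        GROUP BY s.Id, s.Song_Name
        ORDER BY s.Id
        """,
        (ip,),
    ).fetchall()


conn = build_db()
print(songs_with_request_flag(conn, "3.3.3.3"))
```

The LEFT JOIN keeps songs that nobody has requested, COUNT() still gives the total per song, and the MAX(CASE ...) column is 1 only when the given IP appears somewhere in that song's group.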
It is still unclear what you want. There is no request date in the table; without a date, how do you know when a particular song was requested?
Select Songs.id, Songs.Song_name, requested_songs.By_IP
from Songs
INNER JOIN requested_songs
on Songs.id = requested_songs.Requested_id
Group BY requested_songs.Requested_id
order by requested_songs.Requested_id ASC
;
Are you sure you're not overthinking your solution a bit? If all you want to do is eliminate duplicates, just put a UNIQUE index on your second table on both columns.
If you're trying to do something more complicated with that GROUP BY, please provide a sample resultset, as Quassnoi requested.
Just group by Song_Name and By_IP, like this:
SELECT * FROM `songs` JOIN users GROUP BY song_name, ip
I am thinking of returning a randomly ordered SQL response where the results are mixed up randomly, with a limit.
The thing is I need All the rows back, basically divided into groups (chunks of rows). I hope I am clear.
For example, from table A:
ID | NAME | PROFESSION
++++++++++++++++++++++++++++++++
1 | Jack | Carpenter
2 | Rob | Manager
3 | Phil | Driver
4 | Mary | Cook
5 | Tim | Postman
6 | Bob | Programmer
The query would return something like this:
With a limit of 0,2:
6 | Bob | Programmer
4 | Mary | Cook
With a limit of 2,2:
1 | Jack | Carpenter
5 | Tim | Postman
With a limit of 4,2:
3 | Phil | Driver
2 | Rob | Manager
Note: all the table rows were returned. On my page I need << and >> buttons that show the user the requested "group" of data.
How do I go about writing such a query?
A better name for the problem you describe would be randomly shuffled records. It is true that the order is random, but since the order needs to be remembered, you have no choice but to save it in a column. You can do this by populating a field with random values and ordering your records by it. That way the records are in no specific order, yet the order is remembered for future SELECT queries. Whenever you get tired of the order, you can update the field with newly generated random values to shuffle the records again. This is the technique media players use to shuffle a playlist without playing a song twice.
[EDIT]
While the first given solution stands as the general answer, there's a hack you can use in MySQL to randomly order records. In this way, all you need to store for remembering an order is its seed.
SELECT * FROM tbl ORDER BY RAND(s);
For instance, if you want each user to see the records in a different random order, you can use their user_id as the seed. This way the order in which a given user sees the records always stays the same, while still being random and different from that of other users.
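The seed idea can also be illustrated in application code. This is a rough Python analog, not MySQL's RAND(s): the same seed always yields the same order, so pages cut from that order never overlap and together cover every row. The function name shuffled_page and its parameters are hypothetical.

```python
import random


def shuffled_page(ids, seed, offset, limit):
    """Return one page of `ids` taken from a shuffle that depends only on `seed`."""
    order = sorted(ids)  # start from a fixed order so the seed alone decides the result
    random.Random(seed).shuffle(order)
    return order[offset:offset + limit]


# Using a user id (here 42) as the seed: three pages of two rows each.
ids = [1, 2, 3, 4, 5, 6]
pages = [shuffled_page(ids, 42, offset, 2) for offset in (0, 2, 4)]
print(pages)
```

Only the seed has to be remembered between page requests, which mirrors what the ORDER BY RAND(s) hack relies on.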
I can think of two things here:
If the data in the table is huge, add a column that tells the group to which a row belongs. When the user clicks on >> or << buttons, get the rows for that particular group.
If you are dealing with small amount of data, you could do this in the code itself.
If you use ORDER BY RAND() then you will have to flag selected records somewhere, which is not advisable.
You can use an algorithm that combines total_pages and ID, e.g.
SELECT *
FROM my_table
ORDER BY MOD(ID, total_pages);
Add a column to the table called something like random_col
Then each time you need to randomise the table you run
UPDATE table SET random_col = RAND()
And now each time you want to retrieve results you run a normal select
SELECT * FROM table ORDER BY random_col ASC LIMIT x,y
And the results will appear in the same order until you randomise them again by running the 'UPDATE'
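The stored random column can be tried end to end as follows. This is a minimal sketch, assuming Python's sqlite3 module as a stand-in for MySQL (SQLite's random() replaces RAND(), and LIMIT ? OFFSET ? replaces LIMIT x,y); the table name people and the helper names are made up, and id is added as a tie-breaker in the ORDER BY.

```python
import sqlite3


def make_table():
    """Create the sample table from the question plus a random_col column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, "
        "profession TEXT, random_col INTEGER)"
    )
    conn.executemany(
        "INSERT INTO people (id, name, profession) VALUES (?, ?, ?)",
        [(1, "Jack", "Carpenter"), (2, "Rob", "Manager"), (3, "Phil", "Driver"),
         (4, "Mary", "Cook"), (5, "Tim", "Postman"), (6, "Bob", "Programmer")],
    )
    return conn


def reshuffle(conn):
    """Give every row a fresh random sort key."""
    conn.execute("UPDATE people SET random_col = random()")


def page(conn, offset, limit):
    """Return one page of rows in the currently stored random order."""
    return conn.execute(
        "SELECT id, name, profession FROM people "
        "ORDER BY random_col, id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()


conn = make_table()
reshuffle(conn)
for offset in (0, 2, 4):
    print(page(conn, offset, 2))
```

Paging back and forth returns the same groups until reshuffle() is called again, at which point every user sees the new order.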
I have a table (pretty big one) with lots of columns, two of them being "post" and "user".
For a given "post", I want to know which "user" posted the most.
I was first thinking about getting all the entries WHERE (post='wanted_post') and then throw a PHP hack to find which "user" value I get the most, but given the large size of my table, and my poor knowledge of MySQL subtle calls, I am looking for a pure-MySQL way to get this value (the "user" id that posted the most on a given "post", basically).
Is it possible ? Or should I fall back on the hybrid SQL-PHP solution ?
Thanks,
Cystack
It sounds like this is what you want... am I missing something?
SELECT user
FROM myTable
WHERE post='wanted_post'
GROUP BY user
ORDER BY COUNT(*) DESC
LIMIT 1;
EDIT: Explanation of what this query does:
Hopefully the first three lines make sense to anyone familiar with SQL. It's the last three lines that do the fun stuff.
GROUP BY user -- This collapses rows with identical values in the user column. If this was the last line in the query, we might expect output something like this:
+-------+
| user |
+-------+
| bob |
| alice |
| joe |
ORDER BY COUNT(*) DESC -- COUNT(*) is an aggregate function that works along with the previous GROUP BY clause. It tallies all of the rows that are "collapsed" by the GROUP BY for each user. It might be easier to understand what it's doing with a slightly modified statement and its potential output:
SELECT user,COUNT(*)
FROM myTable
WHERE post='wanted_post'
GROUP BY user;
+-------+-------+
| user | count |
+-------+-------+
| bob | 3 |
| alice | 1 |
| joe | 8 |
This is showing the number of posts per user.
However, it's not strictly necessary to actually output the value of an aggregate function in this case--we can just use it for the ordering, and never actually output the data. (Of course if you want to know how many posts your top-poster posted, maybe you do want to include it in your output, as well.)
The DESC keyword tells the database to sort in descending order, rather than the default of ascending order.
Naturally, the sorted output would look something like this (assuming we leave the COUNT(*) in the SELECT list):
+-------+-------+
| user | count |
+-------+-------+
| joe | 8 |
| bob | 3 |
| alice | 1 |
LIMIT 1 -- This is probably the easiest to understand, as it just limits how many rows are returned. Since we're sorting the list from most-posts to fewest-posts, and we only want the top poster, we just need the first result. If you wanted the top 3 posters, you might instead use LIMIT 3.
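The whole query can be run against the example counts above. This is a small sketch, assuming Python's sqlite3 module as a stand-in for MySQL; the table and column names follow the answer, the rows for other_post are invented to show that the WHERE clause filters them out, and the function name top_poster is hypothetical.

```python
import sqlite3


def top_poster(rows, wanted_post):
    """Return (user, post_count) for the user who posted most on `wanted_post`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (post TEXT, user TEXT)")
    conn.executemany("INSERT INTO myTable VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT user, COUNT(*) AS n
        FROM myTable
        WHERE post = ?
        GROUP BY user
        ORDER BY COUNT(*) DESC
        LIMIT 1
        """,
        (wanted_post,),
    ).fetchone()
    conn.close()
    return result  # None when nobody posted on wanted_post


# bob: 3, alice: 1, joe: 8 on wanted_post, as in the example output above.
rows = (
    [("wanted_post", "bob")] * 3
    + [("wanted_post", "alice")]
    + [("wanted_post", "joe")] * 8
    + [("other_post", "alice")] * 10
)
print(top_poster(rows, "wanted_post"))  # ('joe', 8)
```

If two users are tied for the top count, which of them comes back is not defined by this query; adding a second ORDER BY key would make it deterministic.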