MySQL - Iteratively count rows that contain a particular CSV string

2-column MySQL Table:
| id| class |
|---|---------|
| 1 | A,B |
| 2 | B,C,D |
| 3 | C,D,A,G |
| 4 | E,F,G |
| 5 | A,F,G |
| 6 | E,F,G,B |
The requirement is to generate a report showing, for each individual CSV value in the class column, how many rows contain it.
For example, A is present in 3 rows (ids 1, 3, 5), C is present in 2 rows (ids 2, 3), and G is present in 4 rows (ids 3, 4, 5, 6), so the output report should be
A - 3
B - 3
C - 2
...
...
G - 4
Essentially, column id can be ignored.
The approach I can think of: first, all the values of the class column need to be picked and split on commas; then a distinct list of the unique values (A, B, C, ...) is built; and finally, for each value in that list, the rows containing it are counted.
While I know basic SQL queries, this is way too complex for me. I have been unable to find a matching CSV split function in MySQL. (I am new to SQL, so I don't know much.)
An alternative approach that I got working: download the class column values to a file and feed it to a Perl script, which builds a distinct array of A, B, C, ..., then reads the downloaded CSV file again for each element of that array, increments the counts, and finally publishes the report. But this is in Perl, which is a separate execution, while the client needs it as an SQL report.
Help will be appreciated.
Thanks

You may try a split-string-into-rows function to get the distinct values, and then use the COUNT function to find the number of occurrences of each.
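As a rough sketch of the split-then-count idea, the snippet below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it can be run anywhere. The table name `classes` is made up for the example, and the recursive CTE is just one way to do the split, not the function the answer refers to. MySQL 8 also supports WITH RECURSIVE, but there `||` is a logical OR by default, so the concatenation would need CONCAT() and probably a CAST to a wide enough string type.

```python
import sqlite3

# Sample data copied from the question. The table name `classes` is hypothetical.
SAMPLE_ROWS = [
    (1, "A,B"),
    (2, "B,C,D"),
    (3, "C,D,A,G"),
    (4, "E,F,G"),
    (5, "A,F,G"),
    (6, "E,F,G,B"),
]

SPLIT_AND_COUNT = """
    WITH RECURSIVE split(id, value, rest) AS (
        -- Anchor: nothing split off yet; append a comma so every value ends with one.
        SELECT id, '', class || ',' FROM classes
        UNION ALL
        -- Step: peel the text before the first comma off the front of `rest`.
        SELECT id,
               trim(substr(rest, 1, instr(rest, ',') - 1)),
               substr(rest, instr(rest, ',') + 1)
        FROM split
        WHERE rest <> ''
    )
    SELECT value, COUNT(DISTINCT id) AS row_count
    FROM split
    WHERE value <> ''
    GROUP BY value
    ORDER BY value
"""


def count_csv_values(rows):
    """Return (value, number_of_rows_containing_it) pairs for the given rows."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute("CREATE TABLE classes (id INTEGER PRIMARY KEY, class TEXT)")
    conn.executemany("INSERT INTO classes (id, class) VALUES (?, ?)", rows)
    result = conn.execute(SPLIT_AND_COUNT).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for value, row_count in count_csv_values(SAMPLE_ROWS):
        print(f"{value} - {row_count}")
```

COUNT(DISTINCT id) is used so that a value repeated within one row is still counted once for that row.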

Related

Combining multiple select queries in to one table

I've been racking my brains for a while and I'm sure there is a simple solution to it, but for the life of me it's not obvious.
I want to query a database in MySQL Workbench to return a set of serial numbers for a given part number, which is of a fairly basic form:
SELECT serial_num AS sn12345
FROM process
WHERE part_number = 12345
Thus my output is
sn12345
---------------
0000001
0000002
etc
Now I have a number of part numbers I want to get the serial numbers of so that my output is like
sn12345 | sn12346 | sn12347 |
------------------------------------------
0000001 | 0000005 | 0000008 |
0000002 | 0000006 | 0000009 |
Assume that there are more columns than just these. However, I do not want to UNION the queries, as I want the output in individual columns. Also, there may be a different number of serial number entries for each part number, e.g. 100 for one, 1000 for a second, 5 for a third, and so on, so I'll probably have a lot of NULL entries.
Thanks in advance!
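One possible way to get that side-by-side layout is to number the serial numbers within each part number and then pivot on that row number. This is only a sketch: it runs against an in-memory SQLite database (via Python's sqlite3) rather than MySQL, the sample rows are invented apart from the serial numbers shown in the question, and the part numbers are hard-coded in the query. ROW_NUMBER() needs SQLite 3.25+ or MySQL 8+.

```python
import sqlite3

# Invented sample rows: part 12347 has one more serial number than the others,
# so the shorter columns are padded with NULL (None in Python).
SAMPLE_ROWS = [
    (12345, "0000001"), (12345, "0000002"),
    (12346, "0000005"), (12346, "0000006"),
    (12347, "0000008"), (12347, "0000009"), (12347, "0000010"),
]

PIVOT_QUERY = """
    WITH numbered AS (
        -- Give each serial number a position within its own part number.
        SELECT part_number,
               serial_num,
               ROW_NUMBER() OVER (PARTITION BY part_number
                                  ORDER BY serial_num) AS rn
        FROM process
    )
    -- One output row per position, one column per (hard-coded) part number.
    SELECT MAX(CASE WHEN part_number = 12345 THEN serial_num END) AS sn12345,
           MAX(CASE WHEN part_number = 12346 THEN serial_num END) AS sn12346,
           MAX(CASE WHEN part_number = 12347 THEN serial_num END) AS sn12347
    FROM numbered
    GROUP BY rn
    ORDER BY rn
"""


def serials_side_by_side(rows):
    """Return one tuple per output row, with None where a column has run out."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute("CREATE TABLE process (part_number INTEGER, serial_num TEXT)")
    conn.executemany("INSERT INTO process VALUES (?, ?)", rows)
    result = conn.execute(PIVOT_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in serials_side_by_side(SAMPLE_ROWS):
        print(row)
```

Each additional part number needs its own MAX(CASE ...) column, so for a long or changing list of part numbers the query text would have to be generated.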

MySQL full text search matching similar results

I'll try to explain my situation: I'm trying to create a search engine for products on my website, so when the user needs to find a product I need to show similar ones, here's an example.
User searches:
assassins creed OR assassinscreed OR aSsAssIn's CreeD, assuming no letters or numbers are misspelled (these 3 queries should produce the same result)
Expected results:
Assassin's Creed AND Assassin's Creed: Unity AND Assassin's Creed: Special Edition
What have I tried so far
I have created a MySQL field for the search engine which contains a parsed name of the product (Assassin's Creed: Unity -> assassinscreedunity)
I parse the search query
I search using MySQL's INSTR()
My problem
I'm fine with using this, but I have heard it can be slow when the number of rows increases. I've created a full-text index on my table, but I don't think it would help, so I need another solution.
Thanks for any answer, and ask me anything before downvoting.
First of all, you should keep track of performance issues in your queries more precisely than 'heard it can be slow' and 'think it would help'. One starting point may be the Slow Query Log.
If you have a table which contains the same parsed name in more than one row, consider normalizing your database. In the specific case, store unique parsed names in one table, and only the id of the corresponding parsed name in the table you described in your question. This way, you only need to check the smaller table with unique names and can then quickly find all matching entries in the main table by id.
Example:
Consider the following table with your structure
id | product_name | rating
-----------------------------------
1 | assassinscreedunity | 5
2 | assassinscreedunity | 2
3 | monkeyisland | 3
4 | monkeyisland | 5
5 | assassinscreedunity | 4
6 | monkeyisland | 4
you would have to scan all six entries to find relevant rows.
In contrast, consider two tables like this:
id | p_id | rating
--------------------
1 | 1 | 5
2 | 1 | 2
3 | 2 | 3
4 | 2 | 5
5 | 1 | 4
6 | 2 | 4
id | name
--------------------------
1 | assassinscreedunity
2 | monkeyisland
In this case, you only have to scan two entries (compared to six) and can then efficiently look up relevant rows using the integer id.
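A minimal sketch of that two-table layout, using the sample data above. It runs on an in-memory SQLite database through Python's sqlite3 as a stand-in for MySQL, and the table names `product_names` and `products` are made up for the example. The substring test is applied to the small table of unique names only; the main table is then reached through the integer key.

```python
import sqlite3

SCHEMA_AND_DATA = """
    -- Small table: one row per unique parsed name (hypothetical table names).
    CREATE TABLE product_names (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- Main table: refers to the name by integer id only.
    CREATE TABLE products (
        id     INTEGER PRIMARY KEY,
        p_id   INTEGER REFERENCES product_names(id),
        rating INTEGER
    );
    CREATE INDEX idx_products_p_id ON products (p_id);

    INSERT INTO product_names VALUES (1, 'assassinscreedunity'), (2, 'monkeyisland');
    INSERT INTO products VALUES
        (1, 1, 5), (2, 1, 2), (3, 2, 3), (4, 2, 5), (5, 1, 4), (6, 2, 4);
"""

SEARCH_QUERY = """
    SELECT p.id, n.name, p.rating
    FROM product_names AS n
    JOIN products AS p ON p.p_id = n.id   -- integer lookup into the main table
    WHERE instr(n.name, ?) > 0            -- string scan over unique names only
    ORDER BY p.id
"""


def find_products(parsed_query):
    """Return (id, name, rating) for products whose parsed name contains the query."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.executescript(SCHEMA_AND_DATA)
    result = conn.execute(SEARCH_QUERY, (parsed_query,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(find_products("assassinscreed"))
```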
To further enhance the performance, you could extend the concept of a parsed name and use hashes. For example, you could calculate the SHA1 hash of your parsed name, which is a 160-bit value. You can find entries in your database for this value very efficiently. To match substrings, you can add them to the second table as well. Since the hash only needs to be computed once, you can still use the database to match by an integer. Another option worth looking at is fuzzy hashing.
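To make the hashing idea concrete, here is a small Python sketch. The helper names, the `MIN_PREFIX` cutoff and the in-memory dictionary (standing in for the second table) are all assumptions for illustration. It also simplifies "substrings" to prefixes of the parsed name, which is enough for the Assassin's Creed example but would not match a query taken from the middle of a title.

```python
import hashlib
import re
from collections import defaultdict

MIN_PREFIX = 4  # hypothetical cutoff: shorter prefixes are not indexed

TITLES = [
    "Assassin's Creed",
    "Assassin's Creed: Unity",
    "Assassin's Creed: Special Edition",
    "Monkey Island",
]


def parse_name(text):
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def sha1_hex(parsed):
    """SHA1 of a parsed name as 40 hex characters (160 bits)."""
    return hashlib.sha1(parsed.encode("utf-8")).hexdigest()


def build_index(titles):
    """Map the hash of every indexed prefix to the titles that start with it."""
    index = defaultdict(set)
    for title in titles:
        parsed = parse_name(title)
        for end in range(MIN_PREFIX, len(parsed) + 1):
            index[sha1_hex(parsed[:end])].add(title)
    return index


def search(index, query):
    """Exact hash lookup of the parsed query; no string scanning at query time."""
    return index.get(sha1_hex(parse_name(query)), set())


if __name__ == "__main__":
    idx = build_index(TITLES)
    for q in ("assassins creed", "assassinscreed", "aSsAssIn's CreeD"):
        print(q, "->", sorted(search(idx, q)))
```

The trade-off is storage: every indexed prefix becomes its own row, so the lookup table grows with the total length of the names.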
In addition, you should read up on the Rabin–Karp algorithm or string searching in general.

export phpList subscribers via sql in mysql database

For some reason, I am unable to export a table of subscribers from my phpList (ver. 3.0.6) admin pages. I've searched the web, and several others have had this problem, but no workarounds have been posted. As a workaround, I would like to query the MySQL database directly to retrieve a similar table of subscribers. But I need help with the SQL command. Note that I don't want to export or back up the MySQL database; I want to query it in the same way that the "export subscribers" button is supposed to do in the phpList admin pages.
In brief, I have two tables to query. The first table, user contains an ID and email for every subscriber. For example:
id | email
1 | e1#gmail.com
2 | e2#gmail.com
The second table, user_attribute contains a userid, attributeid, and value. Note in the example below that userid 1 has values for all three possible attributes, while userid's 2 and 3 are either missing one or more of the three attributeid's, or have blank values for some.
userid | attributeid | value
1 | 1 | 1
1 | 2 | 4
1 | 3 | 6
2 | 1 | 3
2 | 3 |
3 | 1 | 4
I would like to execute a SQL statement that would produce a row of output for each id/email that would look like this (using id 3 as an example):
id | email | attribute1 | attribute2 | attribute3
3 | e3#gmail.com | 4 | "" | "" |
Can someone suggest SQL query language that could accomplish this task?
A related query I would like to run is to find all id/email that do not have a value for attribute3. In the example above, this would be id's 2 and 3. Note that id 3 does not even have a blank value for attributeid3, it is simply missing.
Any help would be appreciated.
John
I know this is a very old post, but I just had to do the same thing. Here's the query I used. Note that you'll need to modify the query based on the custom attributes you have set up. I had name, city and state, as shown in the AS clauses below; you'll need to map those to your own attribute ids. Also, the state attribute has a table of state names that I linked to. I excluded users who are blacklisted (unsubscribed), have more than 2 bounces, or are unconfirmed.
SELECT
    users.email,
    (SELECT value
     FROM `phplist_user_user_attribute` attrs
     WHERE
         attrs.userid = users.id AND
         attributeid = 1
    ) AS name,
    (SELECT value
     FROM `phplist_user_user_attribute` attrs
     WHERE
         attrs.userid = users.id AND
         attributeid = 3
    ) AS city,
    (SELECT st.name
     FROM `phplist_user_user_attribute` attrs
     LEFT JOIN `phplist_listattr_state` st
         ON attrs.value = st.id
     WHERE
         attrs.userid = users.id AND
         attributeid = 4
    ) AS state
FROM
    `phplist_user_user` users
WHERE
    users.blacklisted = 0 AND
    users.bouncecount < 3 AND
    users.confirmed = 1
;
I hope someone finds this helpful.
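For the simplified two-table example in the question, the same result can also be sketched with one LEFT JOIN per attribute instead of correlated subqueries. This is an alternative formulation, not the query above. It uses the question's shortened table names (`user`, `user_attribute`) rather than the real phpList ones, assumes at most one row per userid/attributeid pair, and runs on in-memory SQLite through Python's sqlite3 as a stand-in for MySQL. The second query covers the related request: users with no value for attribute 3, whether blank or missing.

```python
import sqlite3

SCHEMA_AND_DATA = """
    CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE user_attribute (userid INTEGER, attributeid INTEGER, value TEXT);

    INSERT INTO user VALUES (1, 'e1#gmail.com'), (2, 'e2#gmail.com'), (3, 'e3#gmail.com');
    INSERT INTO user_attribute VALUES
        (1, 1, '1'), (1, 2, '4'), (1, 3, '6'),
        (2, 1, '3'), (2, 3, ''),
        (3, 1, '4');
"""

# One LEFT JOIN per attribute; COALESCE turns a missing row into an empty string.
PIVOT_QUERY = """
    SELECT u.id, u.email,
           COALESCE(a1.value, '') AS attribute1,
           COALESCE(a2.value, '') AS attribute2,
           COALESCE(a3.value, '') AS attribute3
    FROM user AS u
    LEFT JOIN user_attribute AS a1 ON a1.userid = u.id AND a1.attributeid = 1
    LEFT JOIN user_attribute AS a2 ON a2.userid = u.id AND a2.attributeid = 2
    LEFT JOIN user_attribute AS a3 ON a3.userid = u.id AND a3.attributeid = 3
    ORDER BY u.id
"""

# Users whose attribute 3 is either blank or not present at all.
MISSING_ATTRIBUTE3_QUERY = """
    SELECT u.id, u.email
    FROM user AS u
    LEFT JOIN user_attribute AS a3 ON a3.userid = u.id AND a3.attributeid = 3
    WHERE a3.value IS NULL OR a3.value = ''
    ORDER BY u.id
"""


def run(query):
    """Run a query against a fresh copy of the sample data."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.executescript(SCHEMA_AND_DATA)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run(PIVOT_QUERY))
    print(run(MISSING_ATTRIBUTE3_QUERY))
```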

How to get the right "version" of a database entry?

Update: Question refined, I still need help!
I have the following table structure:
table reports:
ID | time | title | (extra columns)
1 | 1364762762 | xxx | ...
Multiple object tables that have the following structure
ID | objectID | time | title | (extra columns)
1 | 1 | 1222222222 | ... | ...
2 | 2 | 1333333333 | ... | ...
3 | 3 | 1444444444 | ... | ...
4 | 1 | 1555555555 | ... | ...
In the object tables, on an object update a new version with the same objectID is inserted, so that the old versions are still available. For example see the entries with objectID = 1
In the reports table, a report is inserted but never updated/edited.
What I want to be able to do is the following:
For each entry in my reports table, I want to be able to query the state of all objects, like they were, when the report was created.
For example, let's look at the sample report above with ID 1. At the time it was created (see the time column), the current version of objectID 1 was the entry with ID 1 (entry ID 4 did not exist at that point).
ObjectID 2 also existed, with its current version being entry ID 2.
I am not sure how to achieve this.
I could use a query that selects the object versions by the time column:
SELECT *
FROM (
    SELECT *
    FROM objects
    WHERE time < [reportTime]
    ORDER BY time DESC
) AS versions  -- MySQL requires an alias for a derived table
GROUP BY objectID
Let's not talk about the performance of this query; it is just to make clear what I want to do. My problem is the comparison of the time columns. I don't think this is a good way to make sure that I get the right object versions, because the system time may change "for any reason", and the time column would then contain wrong data, which would lead to wrong results.
What would be another way to do so?
I thought about not using a time column for this, but instead a GLOBAL incremental value, so that I know the insertion order across the database tables.
If you are inserting new versions of the object, and your problem is the time column (I assume you are using this column to determine which version is newer), I suggest you use an auto-incrementing ID column for the versions. Even if the time value is not reliable for you, the ID will be, since it is always increasing. So the higher the ID, the newer the version.
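A rough sketch of that suggestion, using the sample rows from the question: among the versions that existed at the report time, the row with the highest ID per objectID is taken as the current one. It runs on in-memory SQLite through Python's sqlite3 as a stand-in for MySQL, and the title values are placeholders. Note that the cut-off itself is still the time column here; removing it entirely would need something like storing the highest object ID seen when the report is created, which is an assumption beyond what the question's schema contains.

```python
import sqlite3

SCHEMA_AND_DATA = """
    CREATE TABLE objects (
        ID       INTEGER PRIMARY KEY AUTOINCREMENT,
        objectID INTEGER,
        time     INTEGER,
        title    TEXT
    );
    -- Rows from the question; titles are placeholders.
    INSERT INTO objects (ID, objectID, time, title) VALUES
        (1, 1, 1222222222, 'object 1, v1'),
        (2, 2, 1333333333, 'object 2, v1'),
        (3, 3, 1444444444, 'object 3, v1'),
        (4, 1, 1555555555, 'object 1, v2');
"""

# The highest ID per objectID decides which version is newest, not the time order.
VERSIONS_AT_QUERY = """
    SELECT o.ID, o.objectID
    FROM objects AS o
    JOIN (
        SELECT objectID, MAX(ID) AS max_id
        FROM objects
        WHERE time < ?
        GROUP BY objectID
    ) AS latest ON latest.max_id = o.ID
    ORDER BY o.objectID
"""


def versions_at(report_time):
    """Return (ID, objectID) of the version of each object current at report_time."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.executescript(SCHEMA_AND_DATA)
    result = conn.execute(VERSIONS_AT_QUERY, (report_time,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(versions_at(1364762762))  # creation time of the sample report with ID 1
```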

mysql lookup table

Lookup table - unique row identity
The other lookup-table examples I have found just do not make sense to me. From what I have seen, they give a row an ID, put that ID in another table that also has an ID, add these IDs to yet more tables that may reference them, and still create lookup tables with more IDs (this is how all the examples I can find seem to work). What I have done is this:
product_item - table
id | title | supplier | price
------------------------------------------
1 | title11 | supplier1 | price1
etc.
It then goes on to include more items (I'm sure you get the idea).
product_feature - table
id | title | iskeyfeature
--------------------------
1 | feature1 | true
feature_desc - table
id | title | desc
-----------------------------
1 | desc1 | text description
product_lookup - table
item_id | feature_id | feature_desc
------------------------------------
1 | 1 | 1
1 | 2 | 2
1 | 3 | 3
1 | 64 | 15
(As these only need to be referenced in the lookup table, there can be multiple ids per item, or multiple items per feature.)
What I want to do, without adding item_id to every feature row or description row, is retrieve only the columns from the multiple tables whose ids are referenced in the same row of the lookup table. I want to know if it is possible to select all the referenced columns from a lookup row if I only know the item_id, e.g. for item_id = 1, return all rows where item_id = 1, together with the columns referenced in the same row. Every item can have multiple features, and every feature can be attached to multiple items; this will not matter if I can just get the pattern right for constructing this query from a single known value.
Any assistance or just some direction will be greatly appreciated. I'm using phpMyAdmin, and I'm sure this would be easier with some PHP voodoo, but I am learning MySQL from tutorials etc. and would like to know how to do it with SQL directly.
Having a NULL value in a column is not the major concern that would lead to this design - it's the problem with adding new attribute columns in the future, at which MySQL is disgracefully bad.
If you want to make a query that returns everything about an item in one row, you need to LEFT OUTER JOIN back to the product_lookup table for each feature_id. This is about every 10th mysql question on Stack Overflow, so you should be able to find tons of examples.
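As a starting point, here is a sketch of the simpler row-per-feature variant: starting from a known item_id, each lookup row is joined to the feature and description it references. It is not the one-row-per-item form described above, which would need one extra join per feature_id. The sample rows beyond those in the question are invented, and the code runs on in-memory SQLite through Python's sqlite3 as a stand-in for MySQL. The column name `desc` is a reserved word, so it is quoted with backticks (which SQLite accepts as well).

```python
import sqlite3

SCHEMA_AND_DATA = """
    CREATE TABLE product_item (id INTEGER PRIMARY KEY, title TEXT, supplier TEXT, price TEXT);
    CREATE TABLE product_feature (id INTEGER PRIMARY KEY, title TEXT, iskeyfeature INTEGER);
    CREATE TABLE feature_desc (id INTEGER PRIMARY KEY, title TEXT, `desc` TEXT);
    CREATE TABLE product_lookup (item_id INTEGER, feature_id INTEGER, feature_desc INTEGER);

    -- Invented sample rows in the shape shown in the question.
    INSERT INTO product_item VALUES (1, 'item1', 'supplier1', 'price1'),
                                    (2, 'item2', 'supplier2', 'price2');
    INSERT INTO product_feature VALUES (1, 'feature1', 1), (2, 'feature2', 0), (3, 'feature3', 0);
    INSERT INTO feature_desc VALUES (1, 'desc1', 'text 1'), (2, 'desc2', 'text 2'), (3, 'desc3', 'text 3');
    INSERT INTO product_lookup VALUES (1, 1, 1), (1, 2, 2), (1, 3, 3), (2, 1, 1);
"""

# Start from the lookup rows of one item and follow each id to its own table.
ITEM_FEATURES_QUERY = """
    SELECT i.title AS item, f.title AS feature, d.`desc` AS description
    FROM product_lookup AS l
    JOIN product_item AS i ON i.id = l.item_id
    LEFT JOIN product_feature AS f ON f.id = l.feature_id
    LEFT JOIN feature_desc AS d ON d.id = l.feature_desc
    WHERE l.item_id = ?
    ORDER BY l.feature_id
"""


def item_features(item_id):
    """Return (item, feature, description) rows for one item_id."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.executescript(SCHEMA_AND_DATA)
    result = conn.execute(ITEM_FEATURES_QUERY, (item_id,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in item_features(1):
        print(row)
```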