Empty Rows in MySQL SELECT with LEFT JOIN - mysql

I have three tables called: users, facilities, and staff_facilities.
users contains average user data, the most important fields in my case being users.id, users.first, and users.last.
facilities also contains a fair amount of data, but none of it is necessarily pertinent to this example except facilities.id.
staff_facilties consists of staff_facilities.id (int,auto_inc,NOT NULL),staff_facilities.users_id (int,NOT NULL), and staff_faciltities.facilities_id (int,NOT NULL). (That's a mouthful!)
staff_facilities references the ids for the other two tables, and we are calling this table to look up users' facilities and facilities' users.
This is my select query in PHP:
SELECT users.id, users.first, users.last FROM staff_facilities LEFT JOIN users ON staff_facilities.users_id=users.id WHERE staff_facilities.facilties_id=$id ORDER BY users.last
This query works great on our development server, but when I drop it into the client's production environment often times blank rows appear in the results set. Our development server is using the replicated tables and data that already exist on the client's production server, but the hardware and software vary quite a bit.
These rows are devoid of any information, including the three id fields that require NOT NULL values to be entered into the database. Running the query through the MySQL management tools on the backend returns the same results. Searching the table for NULL fields has not turned up anything.
The other strange thing is that the number of empty rows is changing based on the varying results caused by the WHERE clause id check. It's usually around one to three empty rows, but they are consistent when using the same parameter.
I've many times dealt with the returning of nearly duplicate rows due to LEFT JOINS, but I've never had this happen before. As far as displaying the information goes, I can easily hide it from the end user. My concern is primarily that this problem will be compounded as time passes and the number of records grows larger. As it sits, this system has just been installed, and we already have 2000+ records in the staff_facilities table.
Any insight or direction would be appreciated. I can provide further more detailed examples and information as well.

You are only selecting columns from the table on the right side of the join. Of course some of them are completely null, you did a left join. So those records match to an id in the table on the left side of the join but not to any data on the right side of the join. Since you aren't returning any columns from the left table, you see no data.

Related

MySQL 1:N Data Mapping

Something really bugs me and im not sure what is the "correct" approach.
If i make a select to get contacts from my database there are a decent amount of joins involved.
It will look something like this (around 60-70 columns):
SELECT *
FROM contacts
LEFT JOIN company
LEFT JOIN person
LEFT JOIN address
LEFT JOIN person_communication
LEFT JOIN company_communication
LEFT JOIN categories
LEFT JOIN notes
company and person are 1:1 cardinality so its straight forward.
But "address", "communication" and "categories" are 1:n cardinality.
So depending on the amount of rows in the 1:n tables i will get a lot of "double" rows (I don't know whats the real term for that, the rows are not double i know that the address or phone number etc is different). For myself as a contact, a fairly filled contact, i get 85 rows back.
How do you guys work with that?
In my PHP application i always wrote some "Data-Mapper" where the array key was the "contact.ID aka primary" and then checked if it exists and then pushed the additional data into it. Also PHP is not really type strict what makes it easy.
Now I'm learning GO(golang) and i thought screw that LOOOONG select and data mapping just write selects for all the 1:n.... yeah no, not enough connections to load a table full of contacts. I know that i can increase the connections but the error seems to imply that this would be the wrong way.
I use the following driver: https://github.com/go-sql-driver/mysql
I also tried GROUP_CONCAT but then i running in trouble parsing it back.
Do i have to do my mapping approach again or is there some nice solution out there? I found it quite dirty at points tho?
The solution is simple: you need to execute more than one query!
The cause of all the "duplicate" rows is that you're generating a result called a Cartesian product. You are trying to join to several tables with 1:n relationships, but each of these has no relationship to the other, so there's no join condition restricting them with respect to each other.
Therefore you get a result with every combination of all the 1:n relationships. If you have 3 matches in address, 5 matches in communication, and 5 matches in categories, you'd get 3x5x5 = 75 rows.
So you need to run a separate SQL query for each of your 1:n relationships. Don't be afraid—MySQL can handle a few queries. You need them.

How to efficiently program a big/big left join query in mySQL?

Our problem lies in performing a left join on two large tables (both having millions of entries).
The first one is a table that contains input supplied by the end-user of our program. It contains answers to a variety of questions. Every question belongs to a certain questionnaire. The most important columns are an identifier for the given response, an identifier for the questionnaire form, the datetime the answer is given and an identifier for the user that supplied the answer.
The second table contains information on daily progress of the users regarding the completion of questionnaires. It contains information on the amount of answers a certain user has given on a certain day for a given activity. The most important columns in this table are the user id, the questionnaire id and the date.
The second database is updated right after a new answer enters the first database. Updating is performed by code (workers) that runs on a different server. We would to like make the system robust against failure of this other server. An important step to ensure that the table with the results ('responses') remains in sync with the progress ('progress_questionnaires') table is to be able to check whether a combination of user_id, questionnaire_id and datetime from the 'responses' table is also present in the 'progress_questionnaires' table. A query that captures the required results, but does not perform on large databases (NxN, in which N is couple of millions entries), is displayed below.
A query that captures the required results is:
SELECT r.chapter_id, r.user_id, CAST(first_created as date) as date, 1 as original
FROM responses r
LEFT JOIN progress_questionnaires pq ON r.questionnaire_id = pq.questionnaire_id AND r.user_id = pq.user_id AND CAST(r.first_created as date) = pq.date
WHERE pa.activity_id IS NULL
GROUP BY r.questionnaire_id, r.user_id, CAST(r.first_created as date)
As stated before, this query does capture the required results, but does not perform well on large tables. All key columns are properly indexed as far as we know.
We would be very happy if someone could help us out.
P.S. We are using MariaDB, SQL version 5.5.43. I hope I supplied al necessary information, but logically I would be happy to supply additional information where necessary.

SQL Inner joining a column from two tables with a single user id

I've been trying to get this SQL query running for a while now and can't seem to get the last little bit going.
The backend database to all this data is a Drupal install with data spread out across a number of modules, so I need to do a lot of joining to get a certain view table set up that I need for a third-party application.
It's hard to explain the entire schema, but here's the sqlfiddle:
http://sqlfiddle.com/#!2/68df0/2/0
So basically, I have a userid which I map to a profile id through a join. Then I need to use that profile ID to pull the related data about that profile from two other tables. (there should only be one row with each pid in each of the other tables)
So in the end, I would like to see a table with username, othername, and key_id.
I got the first two pieces in there, but just can't seem to figure out how to join in the othername, and keep getting null.
The server is running MySQL.
Thanks guys!
LEFT JOIN other_name
ON profile_link.pid=other_name.pid;

MySQL Query: Return all rows with a certain value in one column when value in another column matches specific criteria

This may be a little difficult to answer given that I'm still learning to write queries and I'm not able to view the database at the moment, but I'll give it a shot.
The database I'm trying to acquire information from contains a large table (TransactionLineItems) that essentially functions as a store transaction log. This table currently contains about 5 million rows and several columns describing products which are included in each transaction (TLI_ReceiptAlias, TLI_ScanCode, TLI_Quantity and TLI_UnitPrice). This table has a foreign key which is paired with a primary key in another table (Transactions), and this table contains transaction numbers (TRN_ReceiptNumber). When I join these two tables, the query returns one row for every item we've ever sold, and each row has a receipt number. 16 rows might have the same receipt number, meaning that all of these items were sold in a single transaction. Below that might be 12 more rows, each sharing another receipt number. All transactions are broken down into multiple rows like this.
I'm attempting to build a query which returns all rows sharing a single receipt number where at least one row with that receipt number meets certain criteria in another column. For example, three separate types of gift cards all have values in the TLI_ScanCode column that begin with "740000." I want the query to return rows with values beginning with these six digits in the TLI_ScanCode column, but I would also like to return all rows which share a receipt number with any of the rows which meet the given scan code criteria. Essentially, I need the query to return all rows for every receipt number which is also paired in at least one row with a gift card-related scan code.
I attempted to use a subquery to return a column of all receipt numbers paired with gift card scan codes, using "WHERE A.TRN_ReceiptAlias IN (subquery..." to return only those rows with a receipt number which matched one of the receipt numbers returned by the subquery. This appeared to run without issue for five minutes before the server ground to a halt for another twenty while it processed the query. The query appeared to complete successfully, but given that I was working with IT to restore normal store operations during this time I failed to obtain the results of the query (apart from the associated shame and embarrassment).
I'd like to know if there is a way to write a query to obtain this information without causing the server to hang. I'm assuming that either: a) it wasn't very smart to use a subquery in this manner on such a large table, or b) I don't know enough about SQL to obtain the information I need. I'm assuming the answer is both A and B, but I'd very much like to learn how to do this the right way. Any help would be greatly appreciated. Thanks!
SELECT *
FROM a as a1
JOIN b
ON b.id = a.id
JOIN a as a2
ON a2.id = b.id
WHERE b.some_criteria = 'something';
Include an index on (b.id,b.some_criteria)
You aren't the first person, nor will you be the last to bring down your system with an inefficient query.
The most important lesson is that "Decision Support" and "Analytics" really don't co-exist with a transaction system. You really want to pull the data into a datamart or datawarehouse or some other database that isn't your transaction database, so that you don't take the business offline.
In terms of understanding why your initial query was so inefficient, you want to familiarize yourself with the EXPLAIN EXTENDED syntax that returns you plan information that should help you debug your query and work on making it perform acceptably. If you update your question with the actual explain plan output for it, that would be helpful in determining what the issue is.
Just from the outline you provided, it does sound like a self join would make sense rather than the subquery.

SQL to update column in modified table

I am a reasonably competent SQL programmer but my skills are still pretty much in the domain of simple INSERT, SELECT, UPDATE statements with an occasional LIKE etc thrown in. What I am currently trying to do is rather more complex. Here is the scenario.
I have three tables.
Table 1, *users* identifies users via a User ID, uid. Users can have one or more sub accounts
Table 2 *accounts* keeps a record of subaccounts for each user with, amongst other things the columns uid and sid where uid is the one defined in the *users* table.
Table 3, *data* is currently storing some data, in a data column that is being associated with a particular subaccount, sid.
The thing I have just realized is that there is no particular reason to block users from using those data across subaccounts. No problem - I can change my data subset search SQL to work with the uid instead. However, given the frequency of such searches, it seems well worth while simply sticking in a uid column in *data*.
To do that I would need to write some smart SQL that would get uid,sid pairs from the *accounts* table and use that information to update the newly created uid column in the data table. This I have to admit is beyond my knowledge of SQL.
I should mention that the system using these data is now in production and has several 100s of users so the option of just acting like they are not there is not available. Not terribly relevant I think but I should mention that uid and sid are alphanumeric strinsg with both columns being indexed.
I would be most grateful to anyone here who might be able to help out with it.
Mysql can do updates based on joins and based on reading of your schema here's what I'd do...
UPDATE accounts a, data d
set d.uid=a.uid
where a.sid=d.sid
and d.uid is NULL