I have a MySQL query:
SELECT TE.company_id,
       SUM(TE.debit - TE.credit) AS summation
FROM Transactions T
JOIN Transaction_E TE2
  ON (T.parent_id = TE2.transaction_id)
JOIN Transaction_E TE
  ON (TE.transaction_id = T.id AND TE.company_id IS NOT NULL)
JOIN Accounts A
  ON (TE2.account_id = A.id AND A.deactivated_timestamp = 0)
WHERE (TE.company_id IN (1,2))
  AND A.user_id = 2341
GROUP BY TE.company_id;
When I explain the query, the plan for it is like (in summary):
| select_type | table | type | rows |
|-------------|-------|------|------|
| SIMPLE      | A     | ref  |    2 |
| SIMPLE      | TE2   | ref  |   17 |
| SIMPLE      | T     | ref  |    1 |
| SIMPLE      | TE    | ref  |    1 |
But if I do a COUNT(*) on the same query (instead of the SUM(..)), it shows that there are ~40k rows for a particular company_id. What I don't understand is why the query plan shows so few rows being scanned while at least 40k rows are being processed. What does the rows column in the query plan represent? Does it not represent the number of rows that get processed in that table? In that case it should be at most 2*17*1*1 = 34 rows?
The query plan shows a high-level estimate of the number of rows the optimizer expects to read per table to produce the end result.
It is a tool for judging how the optimizer is 'seeing' your query, and for helping it a bit when query performance is poor or could be improved.
There is also always a possibility that the query plan was built from an earlier snapshot of statistics, so it should not be taken at face value, especially where cardinality is concerned.
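If stale statistics are suspected, refreshing them and re-running EXPLAIN is a quick check; a minimal sketch using the table names from the question:
ANALYZE TABLE Transactions, Transaction_E, Accounts;
-- then re-run EXPLAIN on the query above and compare the rows estimates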
Well, first let's get rid of the computational bug:
SELECT TE.company_id, SUM(TE.summation) AS summation
FROM
( SELECT transaction_id, company_id,
         SUM(debit - credit) AS summation
    FROM Transaction_E
    WHERE company_id IN (1,2)
    GROUP BY transaction_id, company_id
) TE
JOIN Transactions T ON TE.transaction_id = T.id
JOIN Transaction_E TE2 ON T.parent_id = TE2.transaction_id
JOIN Accounts A ON TE2.account_id = A.id
               AND A.deactivated_timestamp = 0
WHERE A.user_id = 2341
GROUP BY TE.company_id;
Your query is probably summing up the same rows multiple times before doing the GROUP BY; my variant pre-aggregates inside the derived table, which avoids that inflation of the aggregate (see the sketch below).
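To see the inflation concretely, here is a minimal sketch with hypothetical throwaway tables: one amount row joined to two matching child rows doubles the SUM before any GROUP BY can undo it:
CREATE TEMPORARY TABLE demo_amounts (transaction_id INT, amount INT);
INSERT INTO demo_amounts VALUES (1, 100);
CREATE TEMPORARY TABLE demo_children (transaction_id INT);
INSERT INTO demo_children VALUES (1), (1);
-- the join matches twice, so this returns 200 instead of 100
SELECT SUM(a.amount)
FROM demo_amounts a
JOIN demo_children c ON c.transaction_id = a.transaction_id;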
I got rid of TE.company_id IS NOT NULL because it was redundant.
See what the EXPLAIN says about this, then let's discuss your question about EXPLAIN further.
My client was given the following query, and he uses it daily to count the messages sent to businesses on his website. I have looked at the MySQL slow query log, and it has the following stats for this query, which indicate to me that it needs optimising.
Count: 183 Time=44.12s (8073s) Lock=0.00s (0s)
Rows_sent=17337923391683297280.0 (-1), Rows_examined=382885.7
(70068089), Rows_affected=0.0 (0), thewedd1[thewedd1]#localhost
The query is:
SELECT
businesses.name AS BusinessName,
messages.created AS DateSent,
messages.guest_sender AS EnquirersEmail,
strip_tags(messages.message) AS Message,
users.name AS BusinessName
FROM
messages
JOIN users ON messages.from_to = users.id
JOIN businesses ON users.business_id = businesses.id
My SQL is not very good, but would a LEFT JOIN rather than a JOIN help to reduce the number of rows returned? I've run an EXPLAIN, and it seems to make no difference between the LEFT JOIN and the JOIN.
Basically, I think it would be good to reduce the number of rows returned, as it is absurdly big.
Short answer: There is nothing "wrong" with your query, other than the duplicate BusinessName alias.
Long answer: You can add indexes on the foreign/primary key columns to speed up the joins, which will do more than changing the query.
If you're using a GUI client such as MySQL Workbench, you can manage a table's indexes from its table editor.
Just don't be tempted to index all the columns, as that may slow down future inserts; stick to the id and _id columns unless you know what you're doing.
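For example, assuming the column names from the query above (a sketch; the index names are illustrative, and any of these may already exist as foreign key indexes):
ALTER TABLE messages ADD INDEX idx_messages_from_to (from_to);
ALTER TABLE users ADD INDEX idx_users_business_id (business_id);
-- users.id and businesses.id should already be covered by their PRIMARY KEYs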
he uses it daily to count the messages sent to businesses
If this is done per day, why not limit it to messages sent in specific recent days?
As an example, to count messages sent per business per day, for just a few recent days (example: 3 or 4 days), try this:
SELECT businesses.name AS BusinessName
     , DATE(messages.created) AS DateSent -- DATE() assumes created holds a datetime; drop it if created is already a date
     , COUNT(*) AS n
FROM messages
JOIN users ON messages.from_to = users.id
JOIN businesses ON users.business_id = businesses.id
WHERE messages.created BETWEEN current_date - INTERVAL 3 DAY AND current_date
GROUP BY businesses.id
       , DateSent
ORDER BY DateSent DESC
       , n DESC
       , businesses.id
;
Note: businesses.name is functionally dependent on businesses.id (in the GROUP BY terms), which is the primary key of businesses.
Example result:
+--------------+------------+---+
| BusinessName | DateSent | n |
+--------------+------------+---+
| business1 | 2021-09-05 | 3 |
| business2 | 2021-09-05 | 1 |
| business2 | 2021-09-04 | 1 |
| business2 | 2021-09-03 | 1 |
| business3 | 2021-09-02 | 5 |
| business1 | 2021-09-02 | 1 |
| business2 | 2021-09-02 | 1 |
+--------------+------------+---+
7 rows in set
This assumes your basic join logic is correct, which might not be true.
Other data could be returned as aggregated results if necessary; and since the query is now limited to just recent data, the number of rows examined should be much more reasonable.
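If the number of rows examined is still high, an index that leads with the date column would let the range filter prune early; a sketch, assuming no such index exists yet (the name is illustrative):
ALTER TABLE messages ADD INDEX idx_messages_created_from_to (created, from_to);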
This question is a bit complicated to me, and I can't explain it in one sentence, so the title may seem quite ambiguous.
I have 3 tables in my MySQL database, their structure is shown below:
word_list (5 million rows)
+-----+--------+
| wid | word |
+-----+--------+
| 1 | foo |
| 2 | bar |
| 3 | hello |
+-----+--------+
paper_word_relation (10 million rows)
+-----+-------+
| pid | word |
+-----+-------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 3 |
+-----+-------+
paper_citation_relation (80K rows)
+----------+--------+
| pid_from | pid_to |
+----------+--------+
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 1 |
| 2 | 3 |
+----------+--------+
I want to find out, for each word W in the list, how many papers contain W and cite papers that also contain W.
I use two inner joins to do this job, but it seems extremely slow when the word is popular - above 50s (quite fast if the word is rarely used - below 0.1s). Here is my code:
SELECT COUNT(*) FROM (
SELECT a.pid_from, a.pid_to, b.word FROM paper_citation_relation AS a
INNER JOIN paper_word_relation AS b ON a.pid_from = b.pid
INNER JOIN paper_word_relation AS c ON a.pid_to = c.pid
WHERE b.word = 2 AND c.word = 2) AS d
How can I do this faster? Is my query not efficient enough, or is it a problem with the amount of data?
The only solution I can come up with is to delete the words which occur fewer than 2 times in the paper_word_relation table (about 4 million words occur only once).
Thanks!
If you are only concerned with getting the count, you should not first gather the results into a derived table and then count the rows out; that may create an unnecessary temporary table storing lots of data in memory. You can count the rows directly.
I also think that you need to count the unique number of papers: because of the many-to-many relationships in the paper_citation_relation table, duplicate rows may be produced for a single paper.
SELECT COUNT(DISTINCT a.pid_from)
FROM paper_citation_relation AS a
INNER JOIN paper_word_relation AS b ON a.pid_from = b.pid
INNER JOIN paper_word_relation AS c ON a.pid_to = c.pid
WHERE b.word = 2 AND c.word = 2
For performance, you will need the following indexes:
Composite Index on (pid_from, pid_to) in the paper_citation_relation table.
Composite Index on (pid, word) in the paper_word_relation table.
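As DDL, those might look like this (the index names are illustrative):
ALTER TABLE paper_citation_relation ADD INDEX idx_pcr_from_to (pid_from, pid_to);
ALTER TABLE paper_word_relation ADD INDEX idx_pwr_pid_word (pid, word);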
We may also possibly optimize the query further by dropping one join and using conditional AND/OR filtering in HAVING instead. You will need to benchmark it, though.
SELECT COUNT(*)
FROM (
SELECT a.pid_from
FROM paper_citation_relation AS a
INNER JOIN paper_word_relation AS b
ON (a.pid_from = b.pid OR
a.pid_to = b.pid)
GROUP BY a.pid_from
HAVING SUM(a.pid_from = b.pid AND b.word = 2) > 0 AND
       SUM(a.pid_to = b.pid AND b.word = 2) > 0
) AS t
After the first 1:n join you get the same pid_to multiple times, and your next join is no longer 1:n but n:m, creating a possibly huge intermediate result before the final DISTINCT. It's similar to a CROSS JOIN, and it gets worse for popular words, e.g. 10*10 vs. 1000*1000 rows.
You must remove the duplicates before the join; this should return the same number as @MadhurBhaiya's answer:
SELECT Count(*) -- no more DISTINCT needed
FROM
(
SELECT DISTINCT cr.pid_to -- reducing m to 1
FROM paper_citation_relation AS cr
JOIN paper_word_relation AS wr
ON cr.pid_from = wr.pid
WHERE wr.word = 2
) AS dt
JOIN paper_word_relation AS wr
ON dt.pid_to = wr.pid -- 1:n join again
WHERE wr.word = 2
If you want to count the number of papers which have been cited, you need to get a distinct list of pids (either pid_from or pid_to) from paper_citation_relation first, and then join to the specific word:
SELECT Count(*)
FROM
( -- get a unique list of cited or citing papers
SELECT pid_from AS pid -- citing
FROM paper_citation_relation
UNION -- DISTINCT by default
SELECT pid_to -- cited
FROM paper_citation_relation
) AS dt
JOIN paper_word_relation AS wr
ON wr.pid = dt.pid
WHERE wr.word = 2 -- now check for the searched word
The number returned by this might be slightly higher (it counts a paper regardless of whether it is cited or citing).
For information: in the following examples, big_table contains millions of rows and small_table hundreds.
Here is the basic query I'm trying to run:
SELECT b.id
FROM big_table b
LEFT JOIN small_table s
ON b.small_id=s.id
WHERE s.name like 'something%'
ORDER BY b.name
LIMIT 10, 10;
This is slow, and I can understand why both indexes can't be used.
My initial idea was to split the query into parts.
This is fast:
SELECT id FROM small_table WHERE name like 'something%';
This is also fast:
SELECT id FROM big_table WHERE small_id IN (1, 2) ORDER BY name LIMIT 10, 10;
But, put together, it becomes slow:
SELECT id FROM big_table
WHERE small_id
IN (
SELECT id
FROM small_table WHERE name like 'something%'
)
ORDER BY name
LIMIT 10, 10;
Unless the subquery is re-evaluated for every row, it shouldn't be slower than executing both queries separately, right?
I'm looking for any help optimizing the initial query and understanding why the second one doesn't work.
EXPLAIN result for the last query:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | PRIMARY | small_table | range | PRIMARY, ix_small_name | ix_small_name | 768 | NULL | 1 | Using where; Using index; Using temporary; Using filesort |
| 1 | PRIMARY | big_table | ref | ix_join_foreign_key | ix_join_foreign_key | 9 | small_table.id | 11870 | |
Temporary solution:
SELECT id FROM big_table ignore index(ix_join_foreign_key)
WHERE small_id
IN (
SELECT id
FROM small_table ignore index(PRIMARY)
WHERE name like 'something%'
)
ORDER BY name
LIMIT 10, 10;
(result & explain is the same with an EXISTS instead of IN)
EXPLAIN output becomes:
| 1 | PRIMARY | big_table | index | NULL | ix_big_name | 768 | NULL | 20 | |
| 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 8 | func | 1 | |
| 2 | MATERIALIZED | small_table | range | ix_small_name | ix_small_name | 768 | NULL | 1 | Using where; Using index |
If anyone has a better solution, I'm still interested.
The problem you are facing is that you have conditions on the small table but are trying to avoid a sort on the large table. In MySQL, I think you need to do at least a full table scan.
One step is to write the query using exists, as others have mentioned:
SELECT b.id
FROM big_table b
WHERE EXISTS (SELECT 1
FROM small_table s
WHERE s.name LIKE 'something%' AND s.id = b.small_id
)
ORDER BY b.name;
The question is: Can you trick MySQL into doing the ORDER BY using an index? One possibility is to use the appropriate index. In this case, the appropriate index is: big_table(name, small_id, id) and small_table(id, name). The ordering of the keys in the index is important. Because the first is a covering index, MySQL might read through the index in order by name, choosing the appropriate ids.
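In DDL form, that suggestion would read something like this (a sketch; the index names are illustrative):
CREATE INDEX ix_big_name_small_id ON big_table (name, small_id, id);
CREATE INDEX ix_small_id_name ON small_table (id, name);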
You are looking for an EXISTS or IN query. As MySQL is known to be weak on IN, I'd try EXISTS, in spite of liking IN better for its simplicity.
select id
from big_table b
where exists
(
select *
from small_table s
where s.id = b.small_id
and s.name like 'something%'
)
order by name
limit 10, 10;
It would be helpful to have a good index on big_table. It should first contain small_id to find the match, then name for the sorting. The id is automatically included in MySQL indexes, as far as I know (otherwise it should also be added to the index). Thus you'd have an index containing all fields needed from big_table (a so-called covering index) in the desired order, so all data can be read from the index alone and the table itself doesn't have to be accessed.
create index idx_big_quick on big_table(small_id, name);
You can try this:
SELECT b.id
FROM big_table b
JOIN small_table s
ON b.small_id = s.id
WHERE s.name like 'something%'
ORDER BY b.name;
or
SELECT b.id FROM big_table b
WHERE EXISTS(SELECT 1 FROM small_table s
WHERE s.name LIKE 'something%' AND s.id = b.small_id)
ORDER BY b.name;
NOTE: you don't seem to need LEFT JOIN; a left outer join will almost always result in a full table scan of big_table.
PS: make sure you have an index on big_table.small_id.
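For example, a minimal sketch (skip it if such an index already exists; the name is illustrative):
ALTER TABLE big_table ADD INDEX ix_big_small_id (small_id);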
Plan A
SELECT b.id
FROM big_table b
JOIN small_table s ON b.small_id=s.id
WHERE s.name like 'something%'
ORDER BY b.name
LIMIT 10, 10;
(Note removal of LEFT.)
You need
small_table: INDEX(name, id)
big_table: INDEX(small_id), or, for 'covering': INDEX(small_id, name, id)
It will use the index on s to find 'something%' and walk through it. But it must find all such rows, and JOIN to b to find all matching rows there. Only then can it do the ORDER BY, OFFSET, and LIMIT. There will be a filesort (which may happen in RAM).
The column order in the indexes is important.
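As DDL, Plan A's indexes might be written like this (a sketch; the names are illustrative, and the second is the 'covering' variant):
CREATE INDEX ix_small_name_id ON small_table (name, id);
CREATE INDEX ix_big_small_cover ON big_table (small_id, name, id);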
Plan B
The other suggestion may work well; it depends on various things.
SELECT b.id
FROM big_table b
WHERE EXISTS
( SELECT *
FROM small_table s
WHERE s.name LIKE 'something%'
AND s.id = b.small_id
)
ORDER BY b.name
LIMIT 10, 10;
That needs these:
big_table: INDEX(name), or for 'covering', INDEX(name, small_id, id)
small_table: INDEX(id, name), which is 'covering'
(Caveat: If you are doing something other than SELECT b.id, my comments about covering may be wrong.)
Which is faster (A or B)? That cannot be predicted without understanding the frequency of 'something%' and how 'many' the many-to-1 mapping is.
Settings
If these tables are InnoDB, then be sure that innodb_buffer_pool_size is set to about 70% of available RAM.
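A sketch of checking and adjusting it at runtime (the value is only an example for a server with ~16 GB of RAM; since MySQL 5.7 the variable can be resized online, earlier versions need a restart):
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SET GLOBAL innodb_buffer_pool_size = 11811160064; -- ~11 GB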
Pagination
Your use of OFFSET implies that you are 'paging' through the data? OFFSET is an inefficient way to do that. See my blog on pagination, but note that only Plan B will work with it.
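The usual alternative is 'keyset' pagination: remember the (name, id) of the last row on the current page and seek past it, instead of making the server count and discard OFFSET rows. A sketch, with placeholder values standing in for the last row of the previous page:
SELECT b.id, b.name
FROM big_table b
WHERE (b.name, b.id) > ('last_seen_name', 12345) -- placeholders, not real values
ORDER BY b.name, b.id
LIMIT 10;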
I have a database with tables xxx_facileforms_forms, xxx_facileforms_records and xxx_facileforms_subrecords.
Column headers for xxx_facileforms_subrecords:
id | record | element | title | name | type | value
Filtering records with element = '101', the query returns the proper records, but when I add a subquery to filter an additional element = '4871' from the same table, 0 records are returned.
SELECT
F.id AS form_id,
R.id AS record_id,
PV.value AS prim_val,
COUNT(PV.value) AS count
FROM
xxx_facileforms_forms AS F
INNER JOIN xxx_facileforms_records AS R ON F.id = R.form
INNER JOIN xxx_facileforms_subrecords AS PV ON R.id = PV.record AND PV.element = '101'
WHERE R.id IN (SELECT record FROM xxx_facileforms_records WHERE record = R.id AND element = '4871')
GROUP BY PV.value
Does this look right?
Thank You!
EDIT
Thank you for the support and ideas! Yes, I left a lot to guesswork, sorry. Some input/output table data might help make it clearer.
_facileforms_form:
id | formname
---+---------
1 | myform
_facileforms_records:
id | form | submitted
----+------+--------------------
163 | 1 | 2014-06-12 14:18:00
164 | 1 | 2014-06-12 14:19:00
165 | 1 | 2014-06-12 14:20:00
_facileforms_subrecords:
id   | record | element | title  | name | type        | value
-----+--------+---------+--------+------+-------------+----------
5821 | 163    | 101     | ticket |      | radio group | flight
5822 | 163    | 4871    | status |      | select list | canceled
5823 | 164    | 101     | ticket |      | radio group | flight
5824 | 165    | 101     | ticket |      | radio group | flight
5825 | 165    | 4871    | status |      | select list | canceled
Successful query result:
form_id | record_id | prim_val | count
--------+-----------+----------+------
1       | 163       | flight   | 2

So I have to return the value data (and count those records) from the records where the _subrecord element 4871 is present (in this case 163 and 165).
And again Thank You!
No, it doesn't look quite right. There's a predicate R.id IN (subquery), but that subquery itself has a reference to R.id; it's a correlated subquery. It looks like something is doubled up there. (We're assuming here that id is a UNIQUE or PRIMARY key in each table.)
The subquery references an identifier element... the only other reference we see to that identifier is in the _subrecords table (we don't see any reference to that column in _records). If there's no element column in _records, then that's a reference to the element column in PV, and that predicate in the subquery will never be true at the same time the PV.element='101' predicate is true.
Kudos for qualifying the column references with a table alias; that makes the query (and the EXPLAIN output) much easier to read, because the reader doesn't need to go digging around in the table definitions to figure out which table does and doesn't contain which columns. But please take that pattern to the next step, and qualify all column references in the query, including column references in the subqueries.
Since the reference to element isn't qualified, we're left to guess whether the _records table contains a column named element.
If the goal is to return only the rows from R with element='4871', we could just do...
WHERE R.element='4871'
But, given that you've gone to the bother of using a subquery, I suspect that's not really what you want.
It's possible you're trying to return all rows from R for a _form, but only for the _form where there's at least one associated _record with element='4871'. We could get that result returned with either an IN (subquery) or an EXISTS (correlated subquery) predicate, or an anti-join pattern. I'd give examples of those query patterns, but I could only take guesses at the specification; I would only be guessing at what you actually want to return.
But I'm guessing that's not really what you want. I suspect that _records doesn't actually contain a column named element.
The query is already restricting the rows returned from PV to those that have element='101'.
This is a case where some example data and the example output would help explain the actual specification; and that would be a basis for developing the required SQL.
FOLLOWUP
I'm just guessing... maybe what you want is something pretty simple. Maybe you want to return rows that have an element value of either '101' or '4871'.
The IN comparison operator is a convenient way of expressing the OR condition, that a column be equal to a value in a list:
SELECT F.id AS form_id
, R.id AS record_id
, PV.value AS prim_val
, COUNT(PV.value) AS count
FROM xxx_facileforms_forms F
JOIN xxx_facileforms_records R
ON R.form = F.id
JOIN xxx_facileforms_subrecords PV
ON PV.record = R.id
AND PV.element IN ('101','4871')
GROUP BY PV.value
NOTE: This query (like the OP's query) uses a non-standard MySQL extension to GROUP BY, which allows non-aggregate expressions (e.g. bare columns) to be returned in the SELECT list.
The values returned for the non-aggregate expressions (in this case, F.id and R.id) will be values from a row included in the group. But because there can be multiple rows, with different values on those rows, it's not deterministic which of the values will be returned. (Other databases would reject this statement unless we wrapped those columns in an aggregate function, such as MIN() or MAX().)
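For a deterministic result, the bare columns can be wrapped in an aggregate (MySQL 5.7+ also offers ANY_VALUE() to state explicitly that any row's value is acceptable); a sketch of the same query using MIN():
SELECT MIN(F.id) AS form_id
     , MIN(R.id) AS record_id
     , PV.value AS prim_val
     , COUNT(PV.value) AS count
FROM xxx_facileforms_forms F
JOIN xxx_facileforms_records R
  ON R.form = F.id
JOIN xxx_facileforms_subrecords PV
  ON PV.record = R.id
 AND PV.element = '101'
GROUP BY PV.value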
FOLLOWUP
I noticed that you added information about the question into an answer... this information would better be added to the question as an EDIT, since it's not an answer to the question. I took the liberty of copying that, and reformatting.
The example makes it much more clear what you are trying to accomplish.
I think the easiest to understand is the EXISTS predicate: it checks whether a row meeting some criteria exists, and we exclude rows where no such row exists. This uses a correlated subquery against the _subrecords table to check for the existence of a matching row:
SELECT f.id AS form_id
, r.id AS record_id
, pv.value AS prim_val
, COUNT(pv.value) AS count
FROM xxx_facileforms_forms f
JOIN xxx_facileforms_records r
ON r.form = f.id
JOIN xxx_facileforms_subrecords pv
ON pv.record = r.id
AND pv.element = '101'
-- only include rows where there's also a related 4871 subrecord
WHERE EXISTS ( SELECT 1
FROM xxx_facileforms_subrecords sx
WHERE sx.element = '4871'
AND sx.record = r.id
)
--
GROUP BY pv.value
(I'm thinking this is where OP was headed with the idea that a subquery was required.)
Given that there's a GROUP BY in the query, we could actually accomplish an equivalent result with a regular join operation, to a second reference to the _subrecords table.
A join operation is often more efficient than using an EXISTS predicate.
(Note that the existing GROUP BY clause will eliminate any "duplicates" that might otherwise be introduced by a JOIN operation, so this will return an equivalent result.)
SELECT f.id AS form_id
, r.id AS record_id
, pv.value AS prim_val
, COUNT(pv.value) AS count
FROM xxx_facileforms_forms f
JOIN xxx_facileforms_records r
ON r.form = f.id
JOIN xxx_facileforms_subrecords pv
ON pv.record = r.id
AND pv.element = '101'
-- only include rows where there's also a related 4871 subrecord
JOIN xxx_facileforms_subrecords sx
ON sx.record = r.id
AND sx.element = '4871'
--
GROUP BY pv.value
I have a database ~800k records showing ticket purchases. All tables are InnoDB. The slow query is:
SELECT e.id AS id, e.name AS name, e.url AS url, p.action AS action, gk.key AS `key`
FROM event AS e
LEFT JOIN participation AS p ON p.event=e.id
LEFT JOIN goldenkey AS gk ON gk.issuedto=p.person
WHERE p.person='139160'
OR p.person IS NULL;
This query is coming from PDO, hence the quoting of p.person. All columns used in the JOINs and WHERE are indexed. p.event is foreign-key constrained to e.id, and gk.issuedto and p.person are foreign-key constrained to person.id in an unmentioned person table. All of these are INTs. The table e is small - only 10 rows. Table p is ~500,000 rows, and gk is empty at this time.
This query runs on a person's details page. We want to get a list of all events, then if there is a participation row their participation and if there is a golden key row then their golden key.
Slow query log gives:
Query_time: 12.391201 Lock_time: 0.000093 Rows_sent: 2 Rows_examined: 466104
EXPLAIN SELECT gives:
+----+-------------+-------+------+---------------+----------+---------+----------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+----------+---------+----------------+------+-------------+
| 1 | SIMPLE | e | ALL | NULL | NULL | NULL | NULL | 10 | |
| 1 | SIMPLE | p | ref | event | event | 4 | msadb.e.id | 727 | Using where |
| 1 | SIMPLE | gk | ref | issuedto | issuedto | 4 | msadb.p.person | 1 | |
+----+-------------+-------+------+---------------+----------+---------+----------------+------+-------------+
This query runs in 7~12 seconds on the first run for a given p.person, then <0.05s on subsequent runs. Dropping the OR p.person IS NULL does not improve query time. This query slowed right down when the size of p was increased from ~20k to ~500k rows (an import of old data).
Does anyone have any suggestions on how to improve performance? Remember the overall aim: retrieve a list of all events, then, if there is a participation row, their participation, and if there is a golden key row, their golden key. If multiple queries would be more efficient, I can do that.
If you can do away with p.person IS NULL try the following and see if it helps:
SELECT e.id AS id, e.name AS name, e.url AS url, p.action AS action, gk.key AS `key`
FROM event AS e
LEFT JOIN participation AS p ON (p.event=e.id AND p.person='139160')
LEFT JOIN goldenkey AS gk ON gk.issuedto=p.person
For grins... add the keyword STRAIGHT_JOIN to your SELECT:
SELECT STRAIGHT_JOIN ... rest of query...
I'm not sure how many indexes you have or what your table schema is, but try to avoid using NULL values by default; they can slow down your queries dramatically.
If you are doing a lookup for one particular person - which I'm guessing you are, since you have the person id filter in there - I would try to reverse the query: first search the participation rows for that person, then UNION an additional query which gives you all the events.
SELECT
    e.id AS id, e.name AS name, e.url AS url,
    p.action AS action, gk.key AS `key`
FROM participation AS p
JOIN event AS e ON p.event = e.id
LEFT JOIN goldenkey AS gk ON gk.issuedto = p.person
WHERE p.person = '139160'
UNION
SELECT
    e.id, e.name, e.url,
    NULL, NULL
FROM event AS e