MySQL - Select result where field is empty or doesn't exist

I've got a MySQL table with movies and flexible data. Not every movie has the same fields filled in.
However, I want to run a query that finds all movies where a specific field is empty or doesn't exist.
This is an example of what my database table looks like:
| id | article_id | fieldname | content |
|----|------------|-----------|------------------|
| 1 | 1 | title | Star Wars |
| 2 | 1 | director | George Lucas |
| 3 | 1 | actor | Harrison Ford |
|----|------------|-----------|------------------|
| 4 | 2 | title | Jurassic Park |
| 5 | 2 | duration | Jeff Goldblum |
|----|------------|-----------|------------------|
| 6 | 3 | title | E.T. |
| 7 | 3 | actor | |
| 8 | 3 | director | Steven Spielberg |
How can I get all movies where "actor" is empty or doesn't exist?

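SELECT * FROM table_name WHERE content IS NULL AND fieldname = 'actor'
finds the rows where an "actor" field exists but its content is NULL. It does not match an empty string (that needs an additional content = '' check), and it cannot return movies that have no "actor" row at all; for those you need a LEFT JOIN or NOT EXISTS against the same table.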
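A minimal sketch of a query that covers both cases (the "actor" row is missing, or its content is NULL or an empty string). It runs against an in-memory SQLite database through Python's sqlite3 module purely so the example is self-contained; the table name movie_fields is an assumption, and the SQL itself should carry over to MySQL unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie_fields ("
    "id INTEGER PRIMARY KEY, article_id INTEGER, fieldname TEXT, content TEXT)"
)
# Sample rows copied from the question; the empty actor is stored as ''.
conn.executemany(
    "INSERT INTO movie_fields VALUES (?, ?, ?, ?)",
    [
        (1, 1, "title", "Star Wars"),
        (2, 1, "director", "George Lucas"),
        (3, 1, "actor", "Harrison Ford"),
        (4, 2, "title", "Jurassic Park"),
        (5, 2, "duration", "Jeff Goldblum"),
        (6, 3, "title", "E.T."),
        (7, 3, "actor", ""),
        (8, 3, "director", "Steven Spielberg"),
    ],
)

# Self-join: 'a' is the actor row of the same movie, if there is one.
# a.id IS NULL      -> the movie has no actor row at all
# a.content IS NULL -> the actor row exists but holds NULL
# a.content = ''    -> the actor row exists but is an empty string
query = """
SELECT DISTINCT m.article_id
FROM movie_fields AS m
LEFT JOIN movie_fields AS a
       ON a.article_id = m.article_id
      AND a.fieldname = 'actor'
WHERE a.id IS NULL
   OR a.content IS NULL
   OR a.content = ''
ORDER BY m.article_id
"""
movies_without_actor = [row[0] for row in conn.execute(query)]
print(movies_without_actor)
```

With the sample data this returns movies 2 (no "actor" row) and 3 (empty "actor").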

Related

How to get only the second duplicated record in Laravel 5.5?

Let's say I have a user table like this:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 3 | Joseph | joseph410#mail.com | 21 |
| 4 | George | gge.48#mail.com | 28 |
| 5 | Joseph | jh.city89#mail.com | 24 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 8 | Joseph | psa.jos#mail.com | 34 |
| 9 | Joseph | joseph.la#mail.com | 28 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+---------+------------+------+
The actual database contains more data, and some of it is duplicated across more than two records. The point is that I want to get only the first and the second of the duplicated rows with the name "Joseph". How can I achieve this? My code so far:
User::withTrashed()->groupBy('name')->havingRaw('count("name") >= 1')->get();
With that code, the result is:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 3 | Joseph | joseph410#mail.com | 21 |
| 4 | George | gge.48#mail.com | 28 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+---------+------------+------+
And I use this code to try to get the second duplicated row:
User::withTrashed()->groupBy('name')->havingRaw('count("name") >= 2')->get();
The result is still the same as above:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 3 | Joseph | joseph410#mail.com | 21 |
| 4 | George | gge.48#mail.com | 28 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+---------+------------+------+
I want the result to contain the record with id "5" and name "Joseph", like this:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 4 | George | gge.48#mail.com | 28 |
| 5 | Joseph | jh.city89#mail.com | 24 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+---------+------------+------+
But it seems only the first duplicate row is retrieved and I can't get the second one. Can anybody give me a suggestion?
Let's start from your query:
User::withTrashed()->groupBy('name')->havingRaw('count("name") >= 1')->get();
This returns every group of rows whose count is 1 or more, which is every name: the same result as SELECT DISTINCT name.
If you want only duplicated records, you need the groups whose count is larger than 1.
The other thing to notice is that a non-aggregated column is chosen arbitrarily. Say you select name, count(name), email (email is neither in the GROUP BY clause nor aggregated) and 4 rows share the same name. You'll see:
+--------+-------------+-------+
| Name | Count(Name) | Email |
+--------+-------------+-------+
| Joseph | 4 | X |
+--------+-------------+-------+
What do you expect instead of X? Which one of the 4 emails? In SQL Server it's actually forbidden to select a non-aggregated column, and other databases (MySQL without ONLY_FULL_GROUP_BY, for example) will just give you an arbitrary one of the 4 counted rows.
See this answer for more details, it's explained very well: Do all columns in a SELECT list have to appear in a GROUP BY clause
So we'll use having count(name) > 1 and select only the grouped column name along with its count:
DB::table('users')->selectRaw('name, count(name)')->groupBy('name')->havingRaw('count(name) > 1')->get();
This should give you (I didn't test it) this:
+--------+-------------+
| name | Count(name) |
+--------+-------------+
| Joseph | 4 |
+--------+-------------+
This gives you all names that have 2 or more instances. You can control the number of duplicates in the having clause: for example, having count(name) = 3 gives you all names that occur exactly 3 times.
So how do you get the second duplicate? I have a question for that:
What is the first (original) duplicate? Is it the one with the oldest created_at, the oldest updated_at, or some other condition? Because of that, you should run another query with an order by clause that returns the duplicates in the order most convenient for you. For example:
select * from `users` where `name` in (select `name` from users group by `name` having count(`name`) > 1) order by `id` asc
which will give:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 3 | Joseph | joseph410#mail.com | 21 |
| 5 | Joseph | jh.city89#mail.com | 24 |
| 8 | Joseph | psa.jos#mail.com | 34 |
| 9 | Joseph | joseph.la#mail.com | 28 |
+----+-----------+---------+------------+------+
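If "second" is taken to mean "second by ascending id" (an assumption; any other ordering column would work the same way), the ordered list above can be narrowed down to just the second row of each duplicated name. The sketch below counts how many earlier rows share the name and keeps the rows where that count is exactly 1. It uses SQLite through Python's sqlite3 only to be runnable as-is; the emails are left out for brevity. On MySQL 8.0 or later, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) would be an alternative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "John", 24), (2, "Josh", 29), (3, "Joseph", 21), (4, "George", 28),
        (5, "Joseph", 24), (6, "Kim", 32), (7, "Bob", 38), (8, "Joseph", 34),
        (9, "Joseph", 28), (10, "Jonathan", 22),
    ],
)

# A row is the "second duplicate" when exactly one earlier row
# (smaller id) has the same name.
query = """
SELECT u.id, u.name
FROM users AS u
WHERE (SELECT COUNT(*)
       FROM users AS earlier
       WHERE earlier.name = u.name
         AND earlier.id < u.id) = 1
ORDER BY u.id
"""
second_rows = conn.execute(query).fetchall()
print(second_rows)
```

In Laravel the same SQL could be passed through the query builder's raw methods; that part is not shown here.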

MySQL: Select ids from table where multiple values match a single column at least twice

I've got a table that looks like this:
+----+--------------------------------+
| id | slug |
+----+--------------------------------+
| 1 | gift |
| 1 | psychological-manipulation |
| 1 | christmas |
| 1 | giving |
| 1 | the-town-santa-forgot |
| 1 | santa-claus |
| 1 | mp3 |
| 1 | christmas |
| 2 | entertainment-culture |
| 2 | christmas |
| 2 | culture |
| 2 | literature |
| 2 | christmas-music |
| 2 | christmas-window |
| 2 | broadcasting-nec |
| 2 | how-the-grinch-stole-christmas |
| 2 | the-polar-express |
| 2 | banker |
| 2 | christmas |
| 2 | potter |
| 2 | christmas-eve |
| 2 | bailey |
| 2 | its-a-wonderful-life |
| 2 | the-polar-express |
| 2 | disney |
| 2 | tim-burton |
| 2 | a-christmas-carol |
| 2 | the-nightmare-before-christmas |
| 2 | chuck-jones |
+----+--------------------------------+
I want to get unique ids from the table where at least two of a list of slugs match for a given id.
For example, let's say I've got the slug values:
gift
christmas
giving
I would want all unique ids that have a matching record for at least 2 of those.
i.e. only an id that had both the gift AND christmas slug or the giving AND christmas slug or the gift AND giving slug, etc...
You can use the distinct modifier to count the number of different slugs per ID:
SELECT id
FROM mytable
WHERE slug IN ('gift', 'christmas', 'giving')
GROUP BY id
HAVING COUNT(DISTINCT slug) >= 2
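A small runnable check of this query, using a subset of the rows from the question. It also shows why DISTINCT matters: id 2 has 'christmas' twice, so a plain COUNT(slug) would wrongly let it through. SQLite (via Python's sqlite3) stands in for MySQL here, and the table name mytable follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, slug TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "gift"), (1, "psychological-manipulation"), (1, "christmas"),
        (1, "giving"), (1, "christmas"),
        (2, "entertainment-culture"), (2, "christmas"), (2, "culture"),
        (2, "christmas"),
    ],
)

template = """
SELECT id
FROM mytable
WHERE slug IN ('gift', 'christmas', 'giving')
GROUP BY id
HAVING {count_expr} >= 2
ORDER BY id
"""

# Counting distinct slugs: only ids matching two different slugs qualify.
with_distinct = [r[0] for r in conn.execute(template.format(count_expr="COUNT(DISTINCT slug)"))]
# Counting rows: a repeated slug is counted twice, so id 2 slips in.
without_distinct = [r[0] for r in conn.execute(template.format(count_expr="COUNT(slug)"))]

print(with_distinct, without_distinct)
```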

MySQL - pivot indeterminate number of rows to multiple columns

Given the following two tables:
+- Members -+
| ID | Name |
+----+------+
| 1 | Bob |
| 2 | Jim |
| 3 | Judy |
etc...
This table represents the members' children. Each parent may have many or no children
+- Children -------------+-----+
| ID | ParentID | Name | Age |
+----+----------+--------+-----+
| 1 | 3 | Jeff | 4 |
| 2 | 3 | Casey | 3 |
| 3 | 1 | Steven | 10 |
| 4 | 2 | Mary | 7 |
| 5 | 1 | Esther | 8 |
| 6 | 2 | Abe | 11 |
| 7 | 3 | Paul | 6 |
etc...
I need to create a table that looks like this:
+----+------+--------+------+---------+------+--------+------+
| ID | Name | Child1 | Age1 | Child2 | Age2 | Child3 | Age3 |
+----+------+--------+------+---------+------+--------+------+
| 1 | Bob | Steven | 10 | Esther | 8 | | |
| 2 | Jim | Abe | 11 | Mary | 7 | | |
| 3 | Judy | Paul | 6 | Jeff | 4 | Casey | 3 |
+----+------+--------+------+---------+------+--------+------+
I've tried various pivot table approaches, but every one that I've seen requires a known number of rows in the second table for each row in the first table. I essentially need an unknown number of columns. A group_concat isn't going to meet my requirements.
Is this possible with MySQL or do I need to do this in the backend?
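For what it's worth, here is a rough sketch of the backend route mentioned in the question: fetch the rows with an ordinary join and spread the children into Child1/Age1, Child2/Age2, ... in application code, so the number of columns can follow the data. It assumes children are ordered by age descending, which is what the desired table appears to show. The table and column names (members, children, parent_id) are adapted from the question, and SQLite is used only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE children (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT, age INTEGER)"
)
conn.executemany("INSERT INTO members VALUES (?, ?)", [(1, "Bob"), (2, "Jim"), (3, "Judy")])
conn.executemany(
    "INSERT INTO children VALUES (?, ?, ?, ?)",
    [
        (1, 3, "Jeff", 4), (2, 3, "Casey", 3), (3, 1, "Steven", 10), (4, 2, "Mary", 7),
        (5, 1, "Esther", 8), (6, 2, "Abe", 11), (7, 3, "Paul", 6),
    ],
)

rows = conn.execute(
    """
    SELECT m.id, m.name, c.name, c.age
    FROM members AS m
    LEFT JOIN children AS c ON c.parent_id = m.id
    ORDER BY m.id, c.age DESC
    """
).fetchall()

# Collect one growing row per member: [id, name, child1, age1, child2, age2, ...]
grouped = {}
for member_id, member_name, child_name, child_age in rows:
    entry = grouped.setdefault(member_id, [member_id, member_name])
    if child_name is not None:  # members without children keep just id and name
        entry.extend([child_name, child_age])

# Pad every row to the widest one so the result is rectangular.
width = max(len(entry) for entry in grouped.values())
pivot = [tuple(entry + [None] * (width - len(entry))) for entry in grouped.values()]

for line in pivot:
    print(line)
```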

MySQL LEFT JOIN of three tables returns too many rows

I've been using MySQL for quite a while and am really confused by the result of a simple LEFT JOIN on three tables.
I have the following three tables (I created an example, to narrow it down)
a) persons
+----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+----------------+
| PersonID | int(11) | NO | PRI | NULL | auto_increment |
| Name | varchar(50) | YES | | NULL | |
| Age | int(11) | YES | | NULL | |
+----------+-------------+------+-----+---------+----------------+
b) person_fav_artists
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| FavInterpretID | int(10) | NO | PRI | NULL | auto_increment |
| PersonID | int(10) | NO | | 0 | |
| Interpret | varchar(100) | YES | | NULL | |
+----------------+--------------+------+-----+---------+----------------+
c) person_fav_movies
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| FavMovieID | int(10) | NO | PRI | NULL | auto_increment |
| PersonID | int(10) | NO | | 0 | |
| Movie | varchar(100) | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
My example tables are used to store any number of artists and movies for a single person.
Whether this makes sense or not doesn't really matter, since it's just a simple example.
Now I have the following data in the tables:
mysql> SELECT * FROM persons;
+----------+------+------+
| PersonID | Name | Age |
+----------+------+------+
| 1 | Jeff | 22 |
| 2 | Lisa | 15 |
| 3 | Jon | 30 |
+----------+------+------+
mysql> SELECT * FROM person_fav_artists;
+----------------+----------+----------------+
| FavInterpretID | PersonID | Interpret |
+----------------+----------+----------------+
| 1 | 1 | Linkin Park |
| 2 | 1 | Muse |
| 3 | 2 | Madonna |
| 4 | 2 | Katy Perry |
| 5 | 2 | Britney Spears |
| 6 | 1 | Fort Minor |
| 7 | 1 | Jay Z |
+----------------+----------+----------------+
mysql> SELECT * FROM person_fav_movies;
+------------+----------+-------------------+
| FavMovieID | PersonID | Movie |
+------------+----------+-------------------+
| 1 | 1 | American Pie 1 |
| 2 | 1 | American Pie 2 |
| 3 | 1 | American Pie 3 |
| 4 | 3 | A Game of Thrones |
| 5 | 3 | Eragon |
+------------+----------+-------------------+
Now I'm simply joining the tables with the following query:
Select * FROM persons
LEFT JOIN person_fav_artists USING (PersonID)
LEFT JOIN person_fav_movies USING (PersonID);
which returns the following result:
+----------+------+------+----------------+----------------+------------+-------------------+
| PersonID | Name | Age | FavInterpretID | Interpret | FavMovieID | Movie |
+----------+------+------+----------------+----------------+------------+-------------------+
| 1 | Jeff | 22 | 1 | Linkin Park | 1 | American Pie 1 |
| 1 | Jeff | 22 | 1 | Linkin Park | 2 | American Pie 2 |
| 1 | Jeff | 22 | 1 | Linkin Park | 3 | American Pie 3 |
| 1 | Jeff | 22 | 2 | Muse | 1 | American Pie 1 |
| 1 | Jeff | 22 | 2 | Muse | 2 | American Pie 2 |
| 1 | Jeff | 22 | 2 | Muse | 3 | American Pie 3 |
| 1 | Jeff | 22 | 6 | Fort Minor | 1 | American Pie 1 |
| 1 | Jeff | 22 | 6 | Fort Minor | 2 | American Pie 2 |
| 1 | Jeff | 22 | 6 | Fort Minor | 3 | American Pie 3 |
| 1 | Jeff | 22 | 7 | Jay Z | 1 | American Pie 1 |
| 1 | Jeff | 22 | 7 | Jay Z | 2 | American Pie 2 |
| 1 | Jeff | 22 | 7 | Jay Z | 3 | American Pie 3 |
| 2 | Lisa | 15 | 3 | Madonna | NULL | NULL |
| 2 | Lisa | 15 | 4 | Katy Perry | NULL | NULL |
| 2 | Lisa | 15 | 5 | Britney Spears | NULL | NULL |
| 3 | Jon | 30 | NULL | NULL | 4 | A Game of Thrones |
| 3 | Jon | 30 | NULL | NULL | 5 | Eragon |
+----------+------+------+----------------+----------------+------------+-------------------+
17 rows in set (0.00 sec)
So far so good.
My question now is whether it's "normal" that 12 rows are returned for the person 'Jeff', despite the fact that he only has four artists and three movies assigned to him.
I think I may understand why the result is as it is, but it seems quite wasteful to return so many rows for so little actual data.
So is there something wrong with my query, or is this behaviour on purpose?
The result I'd like to have would be like the following (only for Jeff):
+----------+------+------+----------------+----------------+------------+-------------------+
| PersonID | Name | Age | FavInterpretID | Interpret | FavMovieID | Movie |
+----------+------+------+----------------+----------------+------------+-------------------+
| 1 | Jeff | 22 | 1 | Linkin Park | 1 | American Pie 1 |
| 1 | Jeff | 22 | 2 | Muse | 2 | American Pie 2 |
| 1 | Jeff | 22 | 3 | Fort Minor | 3 | American Pie 3 |
| 1 | Jeff | 22 | 4 | Jay Z | 1 | NULL | <- 'American Pie 1/2/3' would be OK as well.
+----------+------+------+----------------+----------------+------------+-------------------+
Thanks for your help!
Nothing wrong with query or the results, it is just returning all possible combinations. One option would be to split into two separate queries if the amount of data is going to be large.
You are getting the correct result with the 12 records, because that is the correct set of tuples for the way you are asking for the data. I am not sure why you are joining these 3 tables together, because the 2 related tables inherently do not hold the same type of data. What I would suggest is that you select person & movies and then union that with person & artists. Because a union wants the columns to be the same, I would suggest adding a type column to differentiate artists from movies, and selecting the artist or movie name AS string_value.
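A minimal sketch of that UNION idea, using the sample data from the question. The column aliases person_id, name, type and string_value are illustrative choices, and SQLite (through Python's sqlite3) stands in for MySQL. Jeff ends up with 4 + 3 = 7 rows instead of 4 x 3 = 12.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (PersonID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")
conn.execute("CREATE TABLE person_fav_artists (PersonID INTEGER, Interpret TEXT)")
conn.execute("CREATE TABLE person_fav_movies (PersonID INTEGER, Movie TEXT)")
conn.executemany("INSERT INTO persons VALUES (?, ?, ?)",
                 [(1, "Jeff", 22), (2, "Lisa", 15), (3, "Jon", 30)])
conn.executemany("INSERT INTO person_fav_artists VALUES (?, ?)", [
    (1, "Linkin Park"), (1, "Muse"), (2, "Madonna"), (2, "Katy Perry"),
    (2, "Britney Spears"), (1, "Fort Minor"), (1, "Jay Z"),
])
conn.executemany("INSERT INTO person_fav_movies VALUES (?, ?)", [
    (1, "American Pie 1"), (1, "American Pie 2"), (1, "American Pie 3"),
    (3, "A Game of Thrones"), (3, "Eragon"),
])

# One row per favourite, tagged with its kind, instead of one row
# per (artist, movie) combination.
query = """
SELECT p.PersonID AS person_id, p.Name AS name,
       'artist' AS type, a.Interpret AS string_value
FROM persons AS p
JOIN person_fav_artists AS a ON a.PersonID = p.PersonID
UNION ALL
SELECT p.PersonID AS person_id, p.Name AS name,
       'movie' AS type, m.Movie AS string_value
FROM persons AS p
JOIN person_fav_movies AS m ON m.PersonID = p.PersonID
ORDER BY person_id, type, string_value
"""
favourites = conn.execute(query).fetchall()
jeff_rows = [row for row in favourites if row[1] == "Jeff"]
for row in jeff_rows:
    print(row)
```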
This behaviour is on purpose.
Your first table has 1 row for Jeff.
The second table has 4 rows for Jeff, so the joined table gives
1x4 rows.
The third table has 3 rows for Jeff, so the joined table gives
1x4x3 = 12 rows.
You now got all possible combinations.
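The multiplication can be checked directly. The sketch below runs the same two LEFT JOINs (written with explicit ON conditions rather than USING) and compares the row count per person with artists x movies, where a side without any match still contributes one NULL-filled row. SQLite via Python's sqlite3 is used only so this is runnable as-is; the tables are trimmed to the columns that matter.

```python
import sqlite3

persons = [(1, "Jeff", 22), (2, "Lisa", 15), (3, "Jon", 30)]
artists = [(1, "Linkin Park"), (1, "Muse"), (2, "Madonna"), (2, "Katy Perry"),
           (2, "Britney Spears"), (1, "Fort Minor"), (1, "Jay Z")]
movies = [(1, "American Pie 1"), (1, "American Pie 2"), (1, "American Pie 3"),
          (3, "A Game of Thrones"), (3, "Eragon")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (PersonID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")
conn.execute("CREATE TABLE person_fav_artists (PersonID INTEGER, Interpret TEXT)")
conn.execute("CREATE TABLE person_fav_movies (PersonID INTEGER, Movie TEXT)")
conn.executemany("INSERT INTO persons VALUES (?, ?, ?)", persons)
conn.executemany("INSERT INTO person_fav_artists VALUES (?, ?)", artists)
conn.executemany("INSERT INTO person_fav_movies VALUES (?, ?)", movies)

# Rows produced by the double LEFT JOIN, counted per person.
join_counts = dict(conn.execute("""
    SELECT p.Name, COUNT(*)
    FROM persons AS p
    LEFT JOIN person_fav_artists AS a ON a.PersonID = p.PersonID
    LEFT JOIN person_fav_movies AS m ON m.PersonID = p.PersonID
    GROUP BY p.PersonID, p.Name
"""))

# Expected: artists x movies, with an empty side counting as 1
# because LEFT JOIN keeps the person with NULLs on that side.
expected = {}
for person_id, name, _age in persons:
    n_artists = sum(1 for pid, _ in artists if pid == person_id)
    n_movies = sum(1 for pid, _ in movies if pid == person_id)
    expected[name] = max(n_artists, 1) * max(n_movies, 1)

print(join_counts, expected)
```

The per-person counts add up to the 17 rows shown in the question.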
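I think it's normal, since it takes all combinations of favourite movie and favourite artist; this is how joins work.
Replacing LEFT JOIN with INNER JOIN does not remove those combinations (Jeff still gets 12 rows); it only drops persons who lack a match in either table (Lisa and Jon here):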
SELECT *
FROM persons
INNER JOIN person_fav_artists USING (PersonID)
INNER JOIN person_fav_movies USING (PersonID);

Tricky database design

I need to design a database to store user values : for each user, there is a specific set of columns.
For instance, Jon wants to store values in a table with 2 columns : name, age.
And Paul wants to store values in a 3 columns table : fruit, color, weight.
At this point, I have 2 options.
Option 1 - Store data as text values
I would have a first table 'profiles' with the users' preferences :
+----+---------+--------+-------------+
| id | user_id | label | type |
+----+---------+--------+-------------+
| 1 | 1 | name | VARCHAR(50) |
| 2 | 1 | age | INT |
| 3 | 2 | fruit | VARCHAR(50) |
| 4 | 2 | color | VARCHAR(50) |
| 5 | 2 | weight | DOUBLE |
+----+---------+--------+-------------+
And then store the data as text in another table:
+----+------------+--------+
| id | id_profile | value |
+----+------------+--------+
| 1 | 1 | Aron |
| 2 | 2 | 17 |
| 3 | 1 | Vince |
| 4 | 2 | 27 |
| 5 | 1 | Elena |
| 6 | 2 | 78 |
| 7 | 3 | Banana |
| 8 | 4 | Yellow |
| 9 | 5 | 124.8 |
+----+------------+--------+
After that, I would programmatically create and populate a clean table.
Option 2 - One column per type
With this option, I would have a first table 'profiles2' like this:
+----+---------+--------+------+
| id | user_id | label | type |
+----+---------+--------+------+
| 1 | 1 | name | 3 |
| 2 | 1 | age | 1 |
| 3 | 2 | fruit | 3 |
| 4 | 2 | color | 3 |
| 5 | 2 | weight | 2 |
+----+---------+--------+------+
where type refers to one of a fixed set of types: 1=INT, 2=DOUBLE, 3=VARCHAR(50)
And a data table like this:
+----+-------------+-----------+--------------+---------------+
| id | id_profile2 | int_value | double_value | varchar_value |
+----+-------------+-----------+--------------+---------------+
| 1 | 1 | NULL | NULL | Aron |
| 2 | 2 | 17 | NULL | NULL |
| 3 | 1 | NULL | NULL | Vince |
| 4 | 2 | 27 | NULL | NULL |
| 5 | 1 | NULL | NULL | Elena |
| 6 | 2 | 78 | NULL | NULL |
| 7 | 3 | NULL | NULL | Banana |
| 8 | 4 | NULL | NULL | Yellow |
| 9 | 5 | NULL | 124.8 | NULL |
+----+-------------+-----------+--------------+---------------+
Here I have cleaner tables, but I still need a programmatic trick to get everything in order.
The questions
Has anyone ever faced this situation?
What do you think of my 2 options?
Is there a better, less tricky solution?
Thanks a lot!
Edit
Hi again,
My model had a bug: it was impossible to retrieve a "line" of information, i.e. the entries in the "values" table could not be grouped back into rows.
After some wandering around the EAV model, it turned out not to be suitable, because it's designed to store specific pieces of information rather than bulk data.
I ended up with this model:
First table, 'labels':
+----+------------+------+----------+
| id | profile_id | datatype | name |
+----+------------+------+----------+
| 1 | 1 | 1 | Nom |
| 2 | 1 | 1 | Age |
| 3 | 2 | 2 | Fruit |
| 4 | 2 | 2 | Couleur |
| 5 | 2 | 2 | Poids |
+----+------------+------+----------+
Then a very simple 'nodes' table, just to keep track of the lines of information:
+----+------------+
| id | profile_id |
+----+------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 2 |
+----+------------+
and a set of tables corresponding to different datatypes :
+----+---------+----------+--------+
| id | node_id | label_id | value |
+----+---------+----------+--------+
| 1 | 1 | 1 | John |
| 2 | 2 | 1 | Doe |
| 3 | 3 | 3 | Orange |
| 4 | 3 | 4 | Orange |
| 5 | 4 | 3 | Banane |
| 6 | 4 | 4 | Jaune |
+----+---------+----------+--------+
With this model, queries are ok. Data input is a bit tricky but I will manage with a clean code.
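As an illustration of what such a query might look like, here is a sketch that rebuilds the "lines" of profile 2 from the nodes and value tables using conditional aggregation. The table name values_varchar and the output column names are assumptions (the post does not name the per-datatype tables), and SQLite via Python's sqlite3 is used only to keep it runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, profile_id INTEGER)")
conn.execute(
    "CREATE TABLE values_varchar ("
    "id INTEGER PRIMARY KEY, node_id INTEGER, label_id INTEGER, value TEXT)"
)
conn.executemany("INSERT INTO nodes VALUES (?, ?)", [(1, 1), (2, 1), (3, 2), (4, 2)])
conn.executemany(
    "INSERT INTO values_varchar VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "John"), (2, 2, 1, "Doe"), (3, 3, 3, "Orange"),
        (4, 3, 4, "Orange"), (5, 4, 3, "Banane"), (6, 4, 4, "Jaune"),
    ],
)

# One output row per node; each label becomes a column.
# Label ids 3 (Fruit) and 4 (Couleur) come from the 'labels' table in the post.
query = """
SELECT n.id,
       MAX(CASE WHEN v.label_id = 3 THEN v.value END) AS fruit,
       MAX(CASE WHEN v.label_id = 4 THEN v.value END) AS couleur
FROM nodes AS n
JOIN values_varchar AS v ON v.node_id = n.id
WHERE n.profile_id = 2
GROUP BY n.id
ORDER BY n.id
"""
lines = conn.execute(query).fetchall()
print(lines)
```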
Cheers
Take a look at EAV data models.
Option 3: make two different tables.
One table is obviously for people. The other is obviously for fruit. They should be in different tables.
Why not just have a user table with name and ID, then a userValues table that holds key-value pairs? That way John can have key "fruit" and value "mango", and another key "tires" and value "goodyear". Bob can have key "coin" and value "penny", and key "age" and value "42". Anyone can have any value they like and you have maximum flexibility. Speed won't be great, and you'll have to cast strings to typed values, but it's always a tradeoff.
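A rough sketch of that key-value layout, including the string-to-number cast it requires. The table and column names (users, user_values, attr_key, attr_value) are made up for the example, and SQLite via Python's sqlite3 is used only so it runs as-is. In MySQL the cast target would be SIGNED rather than INTEGER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
# Every value is stored as text, whatever it represents.
conn.execute("CREATE TABLE user_values (user_id INTEGER, attr_key TEXT, attr_value TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "John"), (2, "Bob")])
conn.executemany(
    "INSERT INTO user_values VALUES (?, ?, ?)",
    [
        (1, "fruit", "mango"), (1, "tires", "goodyear"),
        (2, "coin", "penny"), (2, "age", "42"),
    ],
)

# Reading a numeric attribute means casting the stored string on the way out.
ages = conn.execute(
    """
    SELECT u.name, CAST(v.attr_value AS INTEGER) AS age
    FROM users AS u
    JOIN user_values AS v ON v.user_id = u.id
    WHERE v.attr_key = 'age'
    """
).fetchall()

# Any user can carry any set of keys.
john_keys = [row[0] for row in conn.execute(
    "SELECT attr_key FROM user_values WHERE user_id = 1 ORDER BY attr_key"
)]
print(ages, john_keys)
```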
Cheers,
Daniel