I'm extremely new to writing SQL queries - I am hoping to create some charts in a front end application, but have to manipulate the data to create a view because the front end is not well suited to running complicated queries.
Here is my current situation:
I have a table that holds client data along with the date each record was created. Here is a sample, in no particular order.
| ID | post_date | post_title |
-------------------------------------------
| 1654 | 2017-09-04 | Bill Smith (5678)|
| 1658 | 2017-09-05 | Jan Jones (3423) |
| 1878 | 2017-08-17 | Jim Tanz (7890) |
| 1659 | 2017-09-06 | Jan Jones (3425) |
I would like to display unique values by last name, but at the moment the whole name is in one column. The ID is unique because it is incremented for each record. The number in parentheses (a transaction ID) appended to the last name is also unique and comes from another application that we pull the name from.
I have been able to split the post_title column, but only into two columns: FName and LastName (TrID). That doesn't let me pick distinct entries by last name for a client count, because the TrIDs are all different.
My intent was to create a view with three columns, then display distinct entries by last name and count the clients each month to see whether there has been any client growth, but I am still at a very early step.
Any assistance would be greatly appreciated (and remembered forever :>)
Thanks!
A few string operations may work:
SELECT t.post_title
,LEFT(t.post_title, LOCATE(' ', post_title) - 1) AS FName
,SUBSTR(t.post_title, LOCATE(' ', post_title)+1, LOCATE(' ',post_title,LOCATE(' ', post_title)+1)-LOCATE(' ', post_title)-1) AS LName
,REPLACE(REPLACE(TRIM(RIGHT(t.post_title,LOCATE(' ', REVERSE(post_title)))), '(', ''), ')','') AS ID
FROM (SELECT 'Bill Smith (5678)' AS post_title
UNION SELECT 'Jan Jones (3423)'
UNION SELECT 'Jim Tanz (7890)') t;
You can use SUBSTRING_INDEX to separate the string, so to retrieve the first name:
SUBSTRING_INDEX(post_title," ",1)
SUBSTRING_INDEX returns everything up to the nth occurrence of the delimiter, so getting the last name is a bit messier. With a count of 2 we get everything up to the second space; from that result we then take the last space-separated piece (count -1, which counts from the right). The last name is therefore:
SUBSTRING_INDEX(SUBSTRING_INDEX(post_title," ",2)," ",-1)
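To see why the nested call isolates the last name, here is a rough Python emulation of how SUBSTRING_INDEX treats positive and negative counts. The function `substring_index` is a hypothetical helper written for illustration, not part of MySQL or of any library, and it only aims to mirror the behaviour described above.

```python
def substring_index(text, delim, count):
    """Approximate MySQL's SUBSTRING_INDEX(text, delim, count) for illustration."""
    # str.split rejects an empty separator; returning '' here is this sketch's choice.
    if count == 0 or delim == "":
        return ""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Negative count: everything to the right of the count-th delimiter from the end.
    return delim.join(parts[count:])


title = "Bill Smith (5678)"
first_name = substring_index(title, " ", 1)                            # "Bill"
first_two = substring_index(title, " ", 2)                             # "Bill Smith"
last_name = substring_index(substring_index(title, " ", 2), " ", -1)   # "Smith"
```

The inner call trims the string to "Bill Smith", and the outer call with -1 keeps only what follows the last space in that shorter string.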
Scenario 1: Splitting post_title into three fields:
SELECT
SUBSTRING_INDEX(post_title," ",1) as firstName,
SUBSTRING_INDEX(SUBSTRING_INDEX(post_title," ",2)," ",-1) as lastName,
SUBSTRING_INDEX(REPLACE(REPLACE(post_title,"(",""),")","")," ",-1) as post_ID
FROM tableName;
Output:
+-----------+----------+---------+
| firstName | lastName | post_ID |
+-----------+----------+---------+
| Bill | Smith | 5678 |
| Jan | Jones | 3423 |
| Jim | Tanz | 7890 |
| Jan | Jones | 3425 |
+-----------+----------+---------+
Scenario 2: Grouping functions
You could also use the derived lastName column to group and count by last name:
SELECT
COUNT(*) as Qty,
SUBSTRING_INDEX(SUBSTRING_INDEX(post_title," ",2)," ",-1) as lastName
FROM tableName
GROUP BY lastName;
Output:
+-----+----------+
| Qty | lastName |
+-----+----------+
| 2 | Jones |
| 1 | Smith |
| 1 | Tanz |
+-----+----------+
And so on. Hard to tailor this any further, as I'm not fully sure what you're intending to do, but hopefully the above is of use.
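For the stated goal of counting clients per month, the logic can be prototyped outside the database before it is turned into a view. The following is only a sketch in Python: the names `split_title` and `clients_per_month` are made up for this example, and it assumes every post_title looks like "First Last (digits)" and that post_date is a YYYY-MM-DD string, as in the sample rows.

```python
import re
from collections import defaultdict

# Assumed shape: first name, last name, then a numeric transaction ID in parentheses.
TITLE_RE = re.compile(r"^(?P<first>\S+)\s+(?P<last>.+?)\s*\((?P<trid>\d+)\)\s*$")


def split_title(post_title):
    """Split 'Jan Jones (3423)' into ('Jan', 'Jones', '3423')."""
    match = TITLE_RE.match(post_title)
    if match is None:
        raise ValueError(f"unexpected post_title format: {post_title!r}")
    return match.group("first"), match.group("last"), match.group("trid")


def clients_per_month(rows):
    """Count distinct last names per YYYY-MM for (post_date, post_title) rows."""
    names_by_month = defaultdict(set)
    for post_date, post_title in rows:
        _, last, _ = split_title(post_title)
        names_by_month[post_date[:7]].add(last)
    return {month: len(names) for month, names in sorted(names_by_month.items())}


sample = [
    ("2017-09-04", "Bill Smith (5678)"),
    ("2017-09-05", "Jan Jones (3423)"),
    ("2017-08-17", "Jim Tanz (7890)"),
    ("2017-09-06", "Jan Jones (3425)"),
]
monthly = clients_per_month(sample)  # {'2017-08': 1, '2017-09': 2}
```

Note that counting by last name alone treats two different clients who share a surname as one, so the first name may need to be part of the key.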
So I have the following key/value pair table, where users submit data through a form and each question on the form is added to the table here as an individual row. Submission_id identifies each form submission.
+----+---------------+--------------+--------+
| id | submission_id | key | value |
+----+---------------+--------------+--------+
| 1 | 10 | manufacturer | Apple |
| 2 | 10 | model | 5s |
| 3 | 10 | firstname | Paul |
| 4 | 15 | manufacturer | Apple |
| 5 | 15 | model | 5s |
| 6 | 15 | firstname | Paul |
| 7 | 20 | manufacturer | Apple |
| 8 | 20 | model | 5s |
| 9 | 20 | firstname | Andrew |
+----+---------------+--------------+--------+
From the data above you can see that the submissions with id of 10 and 15 both have the same values (just different submission id). This is basically because a user has submitted the same form twice and so is a duplicate.
I'm trying to find a way to order this table so that any duplicate submissions appear next to each other. Given the above table, I am trying to build a query that gives me the result below:
+---------------+
| submission_id |
+---------------+
| 10 |
| 15 |
| 20 |
+---------------+
So I want to check whether submissions have the same values for the manufacturer, model and firstname keys. If they do, I want to get their submission ids and place them adjacently in the result. In the actual table there are other keys, but I only want to match duplicates based on these 3 keys (manufacturer, model, firstname).
I’ve been going back and forth to the drawing board quite some time now and have tried looking for some possible solutions but cannot get something reliable.
That's not a key value table. It's usually called an Entity-Attribute-Value table/relation/pattern.
Looking at the problem, it would be trivial if the table were laid out in conventional 1st + 2nd Normal form - you just do a join on the values, group by those and take a count....
SELECT manufacturer, model, firstname, COUNT(DISTINCT submission_id)
FROM atable
GROUP BY manufacturer, model, firstname
HAVING COUNT(DISTINCT submission_id)>1;
Or a join....
SELECT a.manufacturer, a.model, a.firstname
, a.submission_id, b.submission_id
FROM atable a
JOIN atable b
ON a.manufacturer=b.manufacturer
AND a.model=b.model
AND a.firstname=b.firstname
WHERE a.submission_id<b.submission_id
;
Or using sorting and comparing adjacent rows....
-- Sketch only: relies on MySQL user variables, whose evaluation order
-- within a single SELECT is not guaranteed.
SELECT *
FROM
(
SELECT @prev_submission_id AS prev_submission_id
, @prev_manufacturer AS prev_manufacturer
, @prev_model AS prev_model
, @prev_firstname AS prev_firstname
, a.submission_id
, a.manufacturer
, a.model
, a.firstname
, @prev_submission_id:=a.submission_id AS currsid
, @prev_manufacturer:=a.manufacturer AS currman
, @prev_model:=a.model AS currmodel
, @prev_firstname:=a.firstname AS currname
FROM atable a
ORDER BY a.manufacturer, a.model, a.firstname, a.submission_id
) ilv
WHERE prev_manufacturer=manufacturer
AND prev_model=model
AND prev_firstname=firstname
AND prev_submission_id<>submission_id;
So the solution is to simply make your data look like a normal relation....
SELECT ilv.`values`
, COUNT(ilv.submission_id)
, GROUP_CONCAT(ilv.submission_id)
FROM
(SELECT a.submission_id
, GROUP_CONCAT(CONCAT(a.`key`, '=',a.value) ORDER BY a.`key`) AS `values`
FROM atable a
WHERE a.`key` IN ('manufacturer', 'model', 'firstname') -- only the keys that define a duplicate
GROUP BY a.submission_id
) ilv
GROUP BY ilv.`values`
HAVING COUNT(ilv.submission_id)>1;
Hopefully the join and sequence based solutions should now be obvious.
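Another way to make the EAV rows look like a normal relation is conditional aggregation, which gives one column per key so the earlier GROUP BY and ORDER BY ideas apply directly. The sketch below runs the SQL through Python's built-in sqlite3 module purely so it can be tried without a MySQL server; the table name atable comes from the answer, the column quoting is SQLite style, and the MySQL version would use backticks around `key`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE atable (id INTEGER, submission_id INTEGER, "key" TEXT, "value" TEXT)'
)
conn.executemany(
    "INSERT INTO atable VALUES (?, ?, ?, ?)",
    [
        (1, 10, "manufacturer", "Apple"), (2, 10, "model", "5s"), (3, 10, "firstname", "Paul"),
        (4, 15, "manufacturer", "Apple"), (5, 15, "model", "5s"), (6, 15, "firstname", "Paul"),
        (7, 20, "manufacturer", "Apple"), (8, 20, "model", "5s"), (9, 20, "firstname", "Andrew"),
    ],
)

# One row per submission, one column per key of interest.
FLAT = """
SELECT submission_id,
       MAX(CASE WHEN "key" = 'manufacturer' THEN "value" END) AS manufacturer,
       MAX(CASE WHEN "key" = 'model'        THEN "value" END) AS model,
       MAX(CASE WHEN "key" = 'firstname'    THEN "value" END) AS firstname
FROM atable
GROUP BY submission_id
"""

# Sorting by the three values places duplicate submissions next to each other.
ordered_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT submission_id FROM ({FLAT}) AS flat "
        "ORDER BY manufacturer, model, firstname, submission_id"
    )
]

# Value combinations that were submitted more than once.
duplicate_groups = conn.execute(
    f"SELECT manufacturer, model, firstname, COUNT(*) FROM ({FLAT}) AS flat "
    "GROUP BY manufacturer, model, firstname HAVING COUNT(*) > 1"
).fetchall()
```

With the sample rows, submissions 10 and 15 end up adjacent in `ordered_ids`, and `duplicate_groups` contains the single Apple / 5s / Paul combination with a count of 2.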
In a database with millions of records, is there a way to keep the best records (i.e. the rows with the most populated columns)?
SEQ_ID | PERSON_ID | GENDER | DOB | COUNTRY
1 | A000001 | Male | 01-01-1970 |
2 | A000001 | | | Indonesia
Would it be possible to combine the 2 records into 1? E.g.:
SEQ_ID | PERSON_ID | GENDER | DOB | COUNTRY
1 | A000001 | Male | 01-01-1970 | Indonesia
With your example, you can use aggregation:
select MIN(SEQ_ID), PERSON_ID, MAX(GENDER), MAX(DOB), MAX(COUNTRY)
from t
group by PERSON_ID;
It is not clear if this will work generally on your data, but this does answer the question that you asked. If this doesn't work, you should ask a new question, with more appropriate data and explanation of the logic you want to use.
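The aggregation works because MAX skips NULLs, so each column picks up whichever row has a value. Here is a small check of that behaviour using Python's built-in sqlite3 module, assuming the blank cells in the example are stored as NULL; the table name t comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (SEQ_ID INTEGER, PERSON_ID TEXT, GENDER TEXT, DOB TEXT, COUNTRY TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A000001", "Male", "01-01-1970", None),  # blanks assumed to be NULL
        (2, "A000001", None, None, "Indonesia"),
    ],
)

# Aggregates ignore NULL, so each MAX returns the one populated value per person.
merged = conn.execute(
    "SELECT MIN(SEQ_ID), PERSON_ID, MAX(GENDER), MAX(DOB), MAX(COUNTRY) "
    "FROM t GROUP BY PERSON_ID"
).fetchall()
```

If two rows for the same person hold different non-NULL values in a column, MAX simply returns the greater one, which may not be the value you want to keep.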
I have a problem creating a query that returns no duplicate values from my table. Unfortunately, the Full Name column holds the name and surname in different orders.
For example:
+----+----------------------+
| ID | Full Name |
+----+----------------------+
| 1 | Marshall Wilson |
| 2 | Wilson Marshall |
| 3 | Lori Hill |
| 4 | Hill Lori |
| 5 | Casey Dean Davidson |
| 6 | Davidson Casey Dean |
+----+----------------------+
I would like to get this result:
+----+-----------------------+
| ID | Full Name |
+----+-----------------------+
| 1 | Marshall Wilson |
| 3 | Lori Hill |
| 5 | Casey Dean Davidson |
+----+-----------------------+
My goal is a query that works like a SELECT DISTINCT that treats Name and Surname as the same regardless of their order.
Any thoughts?
It requires lots of String operations, and usage of multiple Derived Tables. It may not be efficient.
We first tokenize the FullName into the individual words it is made of. For that we use a number generator table gen. In this case, I have assumed that the maximum number of substrings is 3. You can easily extend it further by adding more Selects, like SELECT 4 UNION ALL .. and so on.
We use Substring_Index() with the Replace() function to get a substring out, using a single space character (' ') as the delimiter. Trim() is used to remove any leading/trailing spaces left.
Now, the trick is to use this result-set as a Derived table, and do a Group_Concat() on the words such that they are sorted in ascending order. This way even the duplicate names (same substrings in a different order) will get the same words_sorted value. Eventually, we simply need to Group By on words_sorted to weed out the duplicates.
Query #1
SELECT
MIN(dt2.ID) AS ID,
MIN(dt2.FullName) AS FullName
FROM
(
SELECT
dt1.ID,
dt1.FullName,
GROUP_CONCAT(IF(word = '', NULL, word) ORDER BY word ASC) words_sorted
FROM
(
SELECT e.ID,
e.FullName,
TRIM(REPLACE(
SUBSTRING_INDEX(e.FullName, ' ', gen.idx),
SUBSTRING_INDEX(e.FullName, ' ', gen.idx-1),
'')) AS word
FROM employees AS e
CROSS JOIN (SELECT 1 AS idx UNION ALL
SELECT 2 UNION ALL
SELECT 3) AS gen -- You can add more numbers if more than 3 substrings
) AS dt1
GROUP BY dt1.ID, dt1.FullName
) AS dt2
GROUP BY dt2.words_sorted
ORDER BY ID;
| ID | FullName |
| --- | ------------------- |
| 1 | Marshall Wilson |
| 3 | Hill Lori |
| 5 | Casey Dean Davidson |
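The sorted-words idea can also be sketched in a few lines of Python, which may help when checking what the query should return. The function name `dedupe_by_word_set` is made up for this example. Unlike the query, which takes MIN(ID) and MIN(FullName) independently, this version keeps the name that belongs to the lowest ID.

```python
def dedupe_by_word_set(rows):
    """Keep the lowest-ID row for each set of name words, ignoring word order."""
    first_seen = {}
    for row_id, full_name in sorted(rows):
        # The sorted words play the role of words_sorted in the SQL.
        key = tuple(sorted(full_name.split()))
        first_seen.setdefault(key, (row_id, full_name))
    return sorted(first_seen.values())


employees = [
    (1, "Marshall Wilson"),
    (2, "Wilson Marshall"),
    (3, "Lori Hill"),
    (4, "Hill Lori"),
    (5, "Casey Dean Davidson"),
    (6, "Davidson Casey Dean"),
]
unique_rows = dedupe_by_word_set(employees)
```

For the sample data this keeps IDs 1, 3 and 5 with "Marshall Wilson", "Lori Hill" and "Casey Dean Davidson".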
I have data stored in a MySQL database in the following format:
+------------+------------+-----------+
| id | field | value |
+============+============+===========+
| 1 | first | Bob |
+------------+------------+-----------+
| 1 | last | Smith |
+------------+------------+-----------+
| 2 | first | Jim |
+------------+------------+-----------+
| 2 | last | Jones |
+------------+------------+-----------+
and I would like it returned as follows:
+------------+------------+-----------+
| id | first | last |
+============+============+===========+
| 1 | Bob | Smith |
+------------+------------+-----------+
| 2 | Jim | Jones |
+------------+------------+-----------+
I know this seems like a silly way to store data, but it's just a simple example of what I really have. The table is formatted this way from a WordPress plugin, and I'd like to make it work without having to rewrite the plugin.
From what I've read, I can't use PIVOT with MySQL. Is there something similar to PIVOT that I can use to achieve what I'm going for?
Try this pivot query:
SELECT id,
MAX(CASE WHEN field = 'first' THEN value ELSE NULL END) AS first,
MAX(CASE WHEN field = 'last' THEN value ELSE NULL END) AS last
FROM yourTable
GROUP BY id
Try this;)
select
id,
max(if(field='first', value, null)) as first,
max(if(field='last', value, null)) as last
from yourtable
group by id
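When the plugin stores many fields, writing one MAX(...) line per field by hand gets tedious. A possible approach is to generate the pivot query from a list of field names. The sketch below does this in Python and runs it against an in-memory SQLite database only so it can be tried locally; `build_pivot_sql` is a hypothetical helper, the double-quote identifier quoting is SQLite style (MySQL would use backticks), and the table name yourTable is taken from the answers.

```python
import sqlite3


def build_pivot_sql(fields, table="yourTable"):
    """Return (sql, params) that pivots each name in `fields` into its own column."""
    def quote(name):
        # Standard SQL identifier quoting; MySQL would use backticks instead.
        return '"' + name.replace('"', '""') + '"'

    columns = ", ".join(
        f"MAX(CASE WHEN field = ? THEN value END) AS {quote(name)}" for name in fields
    )
    sql = f"SELECT id, {columns} FROM {quote(table)} GROUP BY id ORDER BY id"
    # Field values are bound as parameters; only the aliases are interpolated.
    return sql, list(fields)


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "yourTable" (id INTEGER, field TEXT, value TEXT)')
conn.executemany(
    'INSERT INTO "yourTable" VALUES (?, ?, ?)',
    [(1, "first", "Bob"), (1, "last", "Smith"), (2, "first", "Jim"), (2, "last", "Jones")],
)

sql, params = build_pivot_sql(["first", "last"])
pivoted = conn.execute(sql, params).fetchall()
```

With the sample rows, `pivoted` holds one tuple per id: Bob Smith for id 1 and Jim Jones for id 2.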
I have two MySQL tables:
users(id_user, name, age, gender ).
ways(#id_user,id_way, start, end, date).
What I want is to retrieve all the ways with their corresponding users details.
So my result would be like this:
id_way | start | end | date | id_user | name | age | gender
---------------------------------------------------------------------------
2 | place1 | place2 | 12/06/2013 | 145 | john | 28 | m
Have you tried JOIN?
SELECT ways.id_way, ways.start, ways.end, ways.date, users.*
FROM ways JOIN users USING (id_user)
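One thing to be aware of is that a plain JOIN only returns ways whose id_user exists in users. The sketch below shows the difference from a LEFT JOIN using Python's built-in sqlite3 module; the second way row, with user 999, is invented for the example, and the column names start, end and date are quoted because they double as SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id_user INTEGER PRIMARY KEY, name TEXT, age INTEGER, gender TEXT);
    CREATE TABLE ways (id_way INTEGER PRIMARY KEY, id_user INTEGER,
                       "start" TEXT, "end" TEXT, "date" TEXT);
    INSERT INTO users VALUES (145, 'john', 28, 'm');
    INSERT INTO ways VALUES (2, 145, 'place1', 'place2', '12/06/2013');
    -- Hypothetical row whose user is missing from the users table.
    INSERT INTO ways VALUES (3, 999, 'place3', 'place4', '13/06/2013');
    """
)

# Inner join: only ways that have a matching user.
inner_rows = conn.execute(
    "SELECT ways.id_way, users.name FROM ways JOIN users USING (id_user) "
    "ORDER BY ways.id_way"
).fetchall()

# Left join: every way, with NULL user columns when there is no match.
left_rows = conn.execute(
    "SELECT ways.id_way, users.name FROM ways LEFT JOIN users USING (id_user) "
    "ORDER BY ways.id_way"
).fetchall()
```

Here `inner_rows` contains only way 2, while `left_rows` also lists way 3 with no user name.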