MySQL ORDER BY tree depth - mysql

I have a table 'pe' with columns id, name, and lcltyid.
I have a table 'wp_exrz_locality_localities' with columns id, name, and parent.
The locality table is a tree: parent contains the id of another locality row. pe.lcltyid is a key referencing wp_exrz_locality_localities.id.
Basically, what I want to do is retrieve all 'pe' entries sorted by their "tree depth".
However, the total depth of the tree can be any amount at any time, and I need the depth in a form that I can use in a subquery for sorting.
Originally I thought I needed a stored procedure/function to get the depth of a lclty entry. After I made the procedure I found out that procedures can't be used in expressions. Then I tried to make a function, but binary logging is enabled by my host and "log_bin_trust_function_creators = 0", so no stored functions for me.
Lastly, I am trying to understand recursion but can't seem to make it work. I am just trying to create a recursive statement that retrieves the "depth", meaning the number of parents of an individual node up to the top node (where parent = 0). I just get the error "syntax to use near 'RECURSIVE node_ancestors...":
WITH RECURSIVE node_ancestors(id, parent) AS (
SELECT id, id FROM `wp_exrz_locality_localities` WHERE id IN (1, 2, 3)
UNION ALL
SELECT na.id, wp_exrz_locality_localities.parent
FROM node_ancestors AS na, wp_exrz_locality_localities
WHERE wp_exrz_locality_localities.id = na.parent AND wp_exrz_locality_localities.parent != 0
)
SELECT id, COUNT(parent) AS depth FROM node_ancestors GROUP BY id;
Any help is greatly appreciated.
EDIT: an example:
table pe:
id---name---lcltyid
2---first---4
3---second---3
table wp_exrz_locality_localities:
id---name---parent
1---USA---0
3---SanFran---1
4---California---3
SELECT * FROM `pe` ORDER BY ([lcltydepth]) ASC;
desired output:
id---name---lcltyid
3---second---3
2---first---4
where lcltydepth is 3 for the "first" pe and 2 for the "second", because the second one is attached to a state with only the US above it, and the first one is attached to a city, with the state and the US above it. So it would order them by the number of parents required to get to the last parent, the one with parent = 0.
I hope this helps?
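A note on the error: MySQL versions before 8.0 do not support WITH RECURSIVE, which would explain a syntax error near 'RECURSIVE'. The sketch below is not MySQL itself; it assumes Python's built-in sqlite3 module (SQLite also supports recursive CTEs) and loads the sample rows from the question to show the same recursive depth count feeding an ORDER BY. The `depths` CTE name and the `DEPTH_QUERY` variable are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_exrz_locality_localities (id INTEGER PRIMARY KEY, name TEXT, parent INTEGER);
CREATE TABLE pe (id INTEGER PRIMARY KEY, name TEXT, lcltyid INTEGER);
INSERT INTO wp_exrz_locality_localities VALUES (1, 'USA', 0), (3, 'SanFran', 1), (4, 'California', 3);
INSERT INTO pe VALUES (2, 'first', 4), (3, 'second', 3);
""")

# Same recursion as in the question, but started from every locality and
# joined back to pe so the depth can drive the sort.
DEPTH_QUERY = """
WITH RECURSIVE node_ancestors(id, parent) AS (
    -- anchor: each node counts itself once
    SELECT id, id FROM wp_exrz_locality_localities
    UNION ALL
    -- step: one more row per ancestor, stopping at the root (parent = 0)
    SELECT na.id, l.parent
    FROM node_ancestors AS na
    JOIN wp_exrz_locality_localities AS l ON l.id = na.parent
    WHERE l.parent != 0
),
depths AS (
    SELECT id, COUNT(parent) AS depth FROM node_ancestors GROUP BY id
)
SELECT pe.id, pe.name, pe.lcltyid, d.depth
FROM pe
JOIN depths AS d ON d.id = pe.lcltyid
ORDER BY d.depth ASC
"""

rows = conn.execute(DEPTH_QUERY).fetchall()
for row in rows:
    print(row)
```

With the sample data this lists 'second' (depth 2) before 'first' (depth 3). On MySQL 8.0 or later the same SQL should be usable with little or no change, but that is untested here.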

Related

MySQL: COUNT values by level (if a level does not exist, the count should be 0)

I am trying to group by a field's value.
(table)level_count:
id level
1 high
2 high
3 hgih
4 low
.
.
.
Even if medium does not exist in level, I want a result like:
level count
high 3
medium 0
low 1
I tried IN and DISTINCT, but they only find the values that exist.
Is there any way to count the different levels, including values that do not exist?
As mentioned in the comments already: The DBMS must know what levels exist (otherwise it wouldn't know that level 'medium' was missing from table level_count).
So create a table levels with one column level. Add three rows: 'high', 'medium', 'low'.
Add a foreign key on level_count(level) to levels(level). This keeps you from storing typos like 'hgih' in your sample data.
The query is then:
select l.level, count(lc.level)
from levels l
left outer join level_count lc on lc.level = l.level
group by l.level;
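To see the effect of the lookup table, the foreign key, and the outer join together, here is a minimal sketch. It assumes Python's built-in sqlite3 module rather than MySQL, and it stores row 3 as 'high' so the counts match the desired output from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked to
conn.executescript("""
CREATE TABLE levels (level TEXT PRIMARY KEY);
CREATE TABLE level_count (
    id INTEGER PRIMARY KEY,
    level TEXT NOT NULL REFERENCES levels(level)
);
INSERT INTO levels VALUES ('high'), ('medium'), ('low');
INSERT INTO level_count VALUES (1, 'high'), (2, 'high'), (3, 'high'), (4, 'low');
""")

# The foreign key rejects a misspelled level.
try:
    conn.execute("INSERT INTO level_count VALUES (5, 'hgih')")
    typo_rejected = False
except sqlite3.IntegrityError:
    typo_rejected = True

# COUNT(lc.level) skips the NULLs produced by the outer join,
# so a level without rows is reported as 0.
counts = dict(conn.execute("""
    SELECT l.level, COUNT(lc.level)
    FROM levels AS l
    LEFT OUTER JOIN level_count AS lc ON lc.level = l.level
    GROUP BY l.level
"""))
print(counts, typo_rejected)
```

Counting lc.level rather than * is what makes 'medium' come out as 0 instead of 1.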
My original code uses PHP with PDO.
SQL:
SELECT `level` , count(`level`) FROM `level_count` GROUP BY `level`
I then push the result into a level array, which looks like:
level['high'] = 3, level['medium'] = 0, level['low'] = 1
It works, but I think it is not ideal.
So I wonder whether there is any SQL syntax that can achieve this from the outset.

How to use mysql values as column names to reduce the number of rows?

SELECT uo.id as visited_object, uo.id, uo.parent_object, object_username,
object_title, object_type, lat, lng, uop.propertyName,
uop.propertyContent,
(SELECT avg(child_value) FROM `uf_object_childs` uoc
LEFT JOIN uf_objects uo ON uoc.object_id=uo.id
WHERE uoc.child_type='rev_rating') as object_rating
FROM `uf_objects` uo
LEFT JOIN uf_object_properties uop ON uo.id=uop.objectId
WHERE uo.object_type='page'
The query above returns the following
How can I change it so that it returns only a single row in which the propertyName values are used as column names and propertyContent is used as their respective values?
That means I would get only one row with three extra columns named cover_photo, city and address, whose values are http://...., New York and Lincoln Square, New York ... respectively.
Can anyone help me figure this out?
If you know which properties you want columns for, you can do it quite simply like this, assuming each property exists only once for each 'object'.
Basically, you group by the object id, and aggregate the property value. I'm using max, but you could use min just as well. It's just a way to aggregate the properties and pick one of the values. Which one it is, is determined by the property name using the IF function (you could use CASE as well).
SELECT
uo.id as visited_object, uo.id, uo.parent_object, object_username,
object_title, object_type, lat, lng,
(SELECT avg(child_value) FROM `uf_object_childs` uoc
LEFT JOIN uf_objects uo ON uoc.object_id=uo.id
WHERE uoc.child_type='rev_rating') as object_rating,
-- propertyName/propertyContent are no longer selected directly;
-- each property is picked out of the grouped rows instead
MAX(IF(uop.propertyName = 'cover_photo', uop.propertyContent, null)) AS cover_photo,
MAX(IF(uop.propertyName = 'city', uop.propertyContent, null)) AS city,
MAX(IF(uop.propertyName = 'address', uop.propertyContent, null)) AS address
FROM `uf_objects` uo
LEFT JOIN uf_object_properties uop ON uo.id=uop.objectId
WHERE uo.object_type='page'
GROUP BY
uo.id
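The group-and-aggregate pivot described above can be tried on a cut-down schema. This is only a sketch: it assumes Python's built-in sqlite3 module, keeps just a few of the columns, uses CASE because SQLite has no IF() function, and the property values (including the example.com URL) are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE uf_objects (id INTEGER PRIMARY KEY, object_title TEXT, object_type TEXT);
CREATE TABLE uf_object_properties (objectId INTEGER, propertyName TEXT, propertyContent TEXT);
INSERT INTO uf_objects VALUES (1, 'Some page', 'page');
INSERT INTO uf_object_properties VALUES
    (1, 'cover_photo', 'http://example.com/cover.jpg'),
    (1, 'city', 'New York'),
    (1, 'address', 'Lincoln Square, New York');
""")

# One output row per object: each MAX(CASE ...) sees NULL for every joined
# row except the one carrying the wanted property, so it returns that value.
rows = conn.execute("""
    SELECT uo.id, uo.object_title,
           MAX(CASE WHEN uop.propertyName = 'cover_photo' THEN uop.propertyContent END) AS cover_photo,
           MAX(CASE WHEN uop.propertyName = 'city' THEN uop.propertyContent END) AS city,
           MAX(CASE WHEN uop.propertyName = 'address' THEN uop.propertyContent END) AS address
    FROM uf_objects AS uo
    LEFT JOIN uf_object_properties AS uop ON uo.id = uop.objectId
    WHERE uo.object_type = 'page'
    GROUP BY uo.id, uo.object_title
""").fetchall()
print(rows)
```

The three property rows collapse into a single row with cover_photo, city and address as columns.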
Note though, that the reason you have to do this with this, if I may say so, suboptimal query, is because your data model isn't very good either. The object data (id, user, parent, object_title, etc) is stored redundantly which is usually a bad sign.
Normally, you'd make actual columns for properties like 'cover_photo', so you don't have to duplicate the object_title and those other fields.
Alternatively, you could create a separate table of object_properties that contains a label and a value per object. This is especially useful if you don't know what the properties will be, but if you do, the suggestion above is better. And for such a separate table, you would still need a query to 'pivot' the data into the flattened form you asked for, but at least the data itself wouldn't be stored redundantly.

SQLAlchemy foreign keys mapped to list of ids, not entities

In the usual Customer with Orders example, this kind of SQLAlchemy code...
data = db.query(Customer)\
.join(Order, Customer.id == Order.cst_id)\
.filter(Order.amount>1000)
...would provide instances of the Customer model that are associated with e.g. large orders (amount > 1000). The resulting Customer instances would also include a list of their orders, since in this example we used backref for that reason:
class Order:
    ...
    customer = relationship("Customer", backref=backref('orders'))
The problem with this, is that iterating over Customer.orders means that the DB will return complete instances of Order - basically doing a 'select *' on all the columns of Order.
What if, for performance reasons, one wants to e.g. read only 1 field from Order (e.g. the id) and have the .orders field inside Customer instances be a simple list of IDs?
customers = db.query(Customer)....
...
pdb> print customers[0].orders
[2,4,7]
Is that possible with SQLAlchemy?
What you could do is make a query this way:
(
    session.query(Customer.id, Order.id)
    .select_from(Customer)
    .join(Customer.orders)
    .filter(Order.amount > 1000)
)
It doesn't produce exactly the result you asked for, but it gives you a list of tuples that looks like [(customer_id, order_id), ...].
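If a per-customer list of IDs is what you are after, a list of (customer_id, order_id) tuples like that can be regrouped in plain Python. A minimal sketch, with `group_order_ids` as a made-up helper name and hard-coded pairs standing in for the query result:

```python
from collections import defaultdict


def group_order_ids(pairs):
    """Turn [(customer_id, order_id), ...] into {customer_id: [order_id, ...]}."""
    grouped = defaultdict(list)
    for customer_id, order_id in pairs:
        grouped[customer_id].append(order_id)
    return dict(grouped)


# Stand-in for the rows the query would return.
pairs = [(1, 2), (1, 4), (2, 5), (1, 7)]
order_ids_by_customer = group_order_ids(pairs)
print(order_ids_by_customer)
```

This keeps the order IDs in the order the rows arrived and never builds Order instances.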
I am not entirely sure whether you can eagerly load only the order IDs into the Customer object, but I think it should be possible; you might want to look at joinedload and subqueryload, and perhaps go through the relationship-loading docs.
If that works in this case, you could write it as:
(
    session.query(Customer)
    .select_from(Customer)
    .join(Customer.orders)
    .options(joinedload(Customer.orders))  # joinedload comes from sqlalchemy.orm
    .filter(Order.amount > 1000)
)
and also use load_only to avoid loading other columns (noload controls whether a relationship is loaded, not which columns are).
I ended up doing this optimally - with array aggregation:
# assumes: from sqlalchemy import func, Integer, ARRAY
data = db.query(Customer).with_entities(
    Customer,
    func.ARRAY_AGG(
        Order.id,
        type_=ARRAY(Integer, as_tuple=True)).label('order_ids')
).outerjoin(
    Order, Customer.id == Order.cst_id
).group_by(
    Customer.id
)
This returns tuples of (CustomerEntity, list) - which is exactly what I wanted.
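For readers without an ARRAY_AGG aggregate available, the same shape of result can be approximated in plain SQL. The sketch below is not the SQLAlchemy query above: it assumes Python's built-in sqlite3 module, made-up `customers`/`orders` tables, and SQLite's GROUP_CONCAT standing in for array aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, cst_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders VALUES (2, 1, 1500), (4, 1, 20), (7, 1, 3000);
""")

# Only the order ids are read; no other order column is selected.
rows = conn.execute("""
    SELECT c.id, c.name, GROUP_CONCAT(o.id) AS order_ids
    FROM customers AS c
    LEFT OUTER JOIN orders AS o ON o.cst_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# GROUP_CONCAT yields a comma-separated string, or NULL for a customer
# without orders, so parse it into a sorted list of ints.
result = {
    cid: (name, sorted(int(x) for x in ids.split(",")) if ids else [])
    for cid, name, ids in rows
}
print(result)
```

The outer join keeps customers without orders, who end up with an empty list.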

How can I sanitize my DB from these duplicates

I have a table with the following fields:
id | domainname | domain_certificate_no | keyvalue
An example for the output of a select statement can be as:
'57092', '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_1', '55525772666'
'57093', '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_2', '22225554186'
'57094', '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_3', '22444356259'
'97168', '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_1', '55525772666'
'97169', '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_2', '22225554186'
'97170', '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_3', '22444356259'
I need to sanitize my DB as follows: I want to remove the domain names whose first domain_certificate_no has a repeated keyvalue. In this example, I look at the domain_certificate_no 02aa6aa.netsolstores.com_1: since it is number 1 and has a repeated value for the key, I want to remove the whole chain (02aa6aa.netsolstores.com_2 and 02aa6aa.netsolstores.com_3 as well) by deleting the domain name that this chain belongs to, which is 02aa6aa.netsolstores.com.
How can I automate this check for the whole DB? I need a query that checks every domain name matching the pattern '%.%.%' (EDIT: and sharing a domain name, in this example netsolstores.com): if certificate no. 1 of that domain name has a repeated key value, then delete it; otherwise do not. Please note that it is OK for a domain_certificate_no to have a repeated value if it is not number 1.
EDIT: I only compare repeated values within the same second-level domain name. For example, in this question I compare the values that share the domain name .netsolstores.com. If I have another domain name with sublevel domains, I do the same. The point is that I don't need to compare across the whole DB, only the values with a shared domain name (but a different subdomain).
I'm not sure what happens with '02aa6aa.netsolstores.com_1' in your example.
The following keeps only the minimum id for any repeated key:
with t as (
select t.*,
substr(domain_certificate_no,
instr(domain_certificate_no, '_') + 1, 1000) as version,
left(domain_certificate_no, instr(domain_certificate_no, '_') - 1) as dcn
from t
)
select t.*
from t join
(select keyvalue, min(dcn) as mindcn
from t
group by keyvalue
) tsum
on t.keyvalue = tsum.keyvalue and
t.dcn = tsum.mindcn
For the data you provide, this seems to do the trick. This will not return the "_1" version of the repeats. If that is important, the query can be pretty easily modified.
Although I prefer to be more positive (thinking about the rows to keep rather than delete), the following should delete what you want:
with t as (
select t.*,
substr(domain_certificate_no,
instr(domain_certificate_no, '_') + 1, 1000) as version,
left(domain_certificate_no, instr(domain_certificate_no, '_') - 1) as dcn
from t
),
tokeep as (
select t.*
from t join
(select keyvalue, min(dcn) as mindcn
from t
group by keyvalue
) tsum
on t.keyvalue = tsum.keyvalue and
t.dcn = tsum.mindcn
)
delete from t
where t.id not in (select id from tokeep)
There are other ways to express this that are possibly more efficient (depending on the database). This, though, keeps the structure of the original query.
By the way, when trying new DELETE code, be sure that you stash a copy of the table. It is easy to make a mistake with DELETE (and UPDATE). For instance, if you leave out the WHERE clause, all the rows will disappear, after the long painful process of logging all of them. You might find it faster to simply select the desired results into a new table, validate them, then truncate the old table and re-insert them.
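The "select into a new table, validate, then swap" advice can be sketched as follows. This assumes Python's built-in sqlite3 module and a table literally named t holding the sample rows; `t_keep` and the `parsed` CTE are made-up names, and substr() replaces left() because SQLite has no left() function. Like the queries above, it keeps the smallest dcn per keyvalue and does not add the "_1 only" or same-second-level-domain conditions from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER PRIMARY KEY, domainname TEXT,
                domain_certificate_no TEXT, keyvalue TEXT);
INSERT INTO t VALUES
 (57092, '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_1', '55525772666'),
 (57093, '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_2', '22225554186'),
 (57094, '02a1fae.netsolstores.com', '02a1fae.netsolstores.com_3', '22444356259'),
 (97168, '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_1', '55525772666'),
 (97169, '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_2', '22225554186'),
 (97170, '02aa6aa.netsolstores.com', '02aa6aa.netsolstores.com_3', '22444356259');

-- Stage the rows to keep instead of deleting in place.
CREATE TABLE t_keep AS
WITH parsed AS (
    SELECT t.*,
           substr(domain_certificate_no, 1,
                  instr(domain_certificate_no, '_') - 1) AS dcn
    FROM t
)
SELECT p.id, p.domainname, p.domain_certificate_no, p.keyvalue
FROM parsed AS p
JOIN (SELECT keyvalue, MIN(dcn) AS mindcn FROM parsed GROUP BY keyvalue) AS tsum
  ON p.keyvalue = tsum.keyvalue AND p.dcn = tsum.mindcn;
""")

# Validate the staged rows before touching the original table.
kept_ids = [r[0] for r in conn.execute("SELECT id FROM t_keep ORDER BY id")]

# Swap the contents inside one transaction so a failure rolls back.
with conn:
    conn.execute("DELETE FROM t")
    conn.execute("INSERT INTO t SELECT * FROM t_keep")

remaining_ids = [r[0] for r in conn.execute("SELECT id FROM t ORDER BY id")]
print(kept_ids, remaining_ids)
```

On the sample data only the 02a1fae.netsolstores.com chain survives.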

Linq Group on a multi-level object with select statement

I've got 3 dataset objects that are nested with each other using entity set objects. I am selecting the data like this
var newList = from s in MainTable
              from a in s.SubTable1 where a.ColumnX == "value"
              from b in a.Detail where b.Name == "searchValue"
              select new {
                  ID = s.ID,
                  Company = a.CompanyName,
                  Name = b.Name,
                  Date = s.DueDate,
                  Colour = b.Colour,
                  Town = a.Town
              };
and this works fine, but the trouble is there are many records in the Detail object-list/table for each Name value so I get a load of duplicate rows and thus I only want to display one record per b.Name. I have tried putting
group s by b.Name into g
before the select, but then this seems to stop the select enabling me to select the columns I want (there are more, in practice). How do I use the group command in this circumstance while still keeping the output rows in a "flat" format?
Appending comment as answer to close question:
Of course, if you group your results you can't select a column of a child, because there may be more than one child; you have to specify an aggregate for the column, for example sum, max, etc.
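The same point, illustrated outside LINQ: once rows are grouped by Name, every other output column has to say how the group's members collapse into one value. A small Python sketch with invented rows and a made-up helper `one_row_per_name`; it is an analogy to the LINQ query, not a translation of it.

```python
from itertools import groupby
from operator import itemgetter

# Flattened rows, as the ungrouped query would produce them (sample data).
rows = [
    {"ID": 1, "Name": "widget", "Colour": "red"},
    {"ID": 2, "Name": "widget", "Colour": "blue"},
    {"ID": 3, "Name": "gadget", "Colour": "green"},
]


def one_row_per_name(rows):
    """Group by Name and pick an explicit aggregate for every other column."""
    ordered = sorted(rows, key=itemgetter("Name"))  # groupby needs sorted input
    result = []
    for name, group in groupby(ordered, key=itemgetter("Name")):
        members = list(group)
        result.append({
            "Name": name,
            "ID": min(m["ID"] for m in members),  # aggregate: smallest ID
            "Colour": members[0]["Colour"],       # aggregate: first in group
            "Count": len(members),
        })
    return result


flat = one_row_per_name(rows)
print(flat)
```

Each group still produces a single flat row, but every column's value comes from a deliberate choice such as min or first.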