Another add data to multiple records in PostGIS table - gis

Nicklas, thanks for the answer to my previous question....
Forgive me for my ignorance and for asking what may turn out to be rather simple questions, but databases are not my area of expertise.
The select statement that gives me the number of crashes in each percinct:
SELECT P.precinct, count(C) FROM nycpp P, nyccrash C WHERE _st_contains(P.the_geom, C.crashpoint) GROUP BY P.precinct ORDER BY P.precinct;
I want to add just the count to my nycpp table the variable that will hold the count is number_of crashes....
Thanks again the assistance
Chris

Hallo
I assume that precinct is the unique id in nycpp, then you can try:
update nycpp set number_of crashes=a.n_crashes from
(SELECT P.precinct, count(C) as n_crashes FROM nycpp P, nyccrash C
WHERE _st_contains(P.the_geom, C.crashpoint)
GROUP BY P.precinct
ORDER BY P.precinct) a
where nycpp.precinct=a.precinct;
But why do you use _st_contains instead of st_contains
the underscore version will not use your spatial indexes, but st_contains will do a first indexscan finding intersecting bouning boxes before running theunderscore version.
So, you probably absolutely want to use st_contains instead of contains. If your tables are big enough for needing index: In this query the spatial indexes is important on both tables and an index on precinct. Don't forget to analyse the table after creating the indexes to make them work.
BTW, I think you are supposed to mark the questions as answered if you are satisfied with the answer so others don't have to try to answer them.
HTH
Nicklas

Related

SQL Left/Inner/Normal Join vs Where while conditional statement

I am relatively new to the SQL programming, so please go easy on me.
I am currently writing a query, which would output the result based on the value from one of the outer parameters. The structure is currently looking like following:
#ShowEntireCategory bit = 0
select distinct
p.pk
p.name
--other columns
from dbo.Project P
--bunch of left joins
where p.Status = 'Open'
--other conditions
What I am trying to implement is: when the value of ShowEntireCategory is 1 (changed programmatically through radiobutton selection) it will show records of all subcategories, which are inside of the the category. When it is 0, it will only show records from the selected subcategory, while other subcategories in that category remains untouched.
I have been performing a research on the best approach, and it narrowed down to either WHERE statements or JOINs.
What I want to know is: which of these approaches I should use for my scenario? In my case the priority is optimization (minimum time to execute) and ease of implementation.
NOTE: My main goal here is not to receive a ready to use code here (though an example code snippets would be welcome), I just want to know a better approach, so I can continue researching in that direction.
Thank you in advance!
UPDATE
I have performed additional research on the database structure, and managed to narrow down to parameters relevant to the question
One is dbo.Project table, which contains: PK, CategoryKey (FK) (connected to the one in second table), Name, Description, and all other parameters which are irrelevant.
Second one is dbo.Area table, which contains: PK, AreaNumber, Name, CategoryKey (FK), IsCategory (1 = is category, 0 = not category).
Sorry, but I work in fast-paced environment, this is as much as I was able to squeeze. Please let me know if it is not enough.
With the information you provided the best solution would be to use a combination of WHERE clauses and JOINS. You would likely need to use a WHERE clause on the second table (described in the update) to select all rows which are categories. Then, you would JOIN this result with your other tables/data. Finally, you can use a CASE clause (details found here) to check your variable and determine if you need all categories or just some (which can be dealt with through an additional WHERE clause).
Not sure this entirely answers your question, but for a more detailed answer we would need a more detailed description of the database schema.

Extremely basic SQL Misunderstanding

I'm preparing for an exam in databases and SQL and I'm solving an exercise:
We have a database of 4 tables that represent a human resources company. The tables are:
applicant(a-id,a-name,a-city,years-of-study),
job(job-name,job-id),
qualified(a-id,job-id)
wish(a-id,job-id).
the table applicant represents the table of applicants obviously. And jobs is the table of available jobs. the table qualified shows what jobs a person is qualified for, and the table wish shows what jobs a person is interested in.
The question was to write a query that displays for each job-id, the number of applicants that are both qualified and interested to work in.
Here is the solution the teacher wrote:
Select q1.job_id
, count(q1.a_id)
from qualified as q1
, wish as w1
Where q1.a_id = w1.a_id
and q1.job_id = w1.job_id
Group by job_id;
That's all well and good, I'm not sure why we needed that "as q1" and "as w1", but i can see why it works.
And here is the solution I wrote:
SELECT job-id,COUNT(a-id) FROM job,qualified,wish WHERE (qualified.a-id=wish.a-id)
GROUP BY job-id
Why is my solution wrong? And also - From which table will it select the information? Suppose I write SELECT job-id FROM job,qualified,wish. From which table will it take the information? because job-id exists in all 3 of these tables.
You can only refer to tables mentioned in the FROM clause. If it's ambiguous (because more than one has a column of the same name) then you need to be explicit by qualifying the name. Usually the qualifier is an alias but it could also be the table name itself if an alias wasn't specified.
There's a concept of a "natural join" which joins tables on common column(s) between two tables. Not all systems support that notation but I think MySQL does. I believe these systems usually collapse the joined pairs into a single column.
select q1.job_id, count(q1.a_id) from qualified as q1, wish as w1
where q1.a_id = w1.a_id and q1.job_id = w1.job_id
group by job_id;
I don't think I've worked on any systems that would have accepted the query above because the grouping column would have been strictly unclear even though the intention really is not. So if it truly does work correctly on MySQL then my guess is that it recognizes the equivalence of the columns and cuts you some slack on the syntax.
By the way, your query appears to be incorrect because you only included a single column in a join that requires two columns. You also included a third table which means that your result will effectively do a cross join of every row in that table. The grouping is going to still going to reduce it to one row per job_id but the count is going to be multiplied by the number of rows in the job table. Perhaps you added that table thinking it would hurt to add it just in case you need it but that is not what it means at all.
Your query will list non-existing jobs in case the database has orphan records in applicant and qualified, and might also omit jobs that have no qualified and willing candidates.
I'm not exactly sure, because I have no idea if there's any database that will accept COUNT(a-id) when there's no information about the table from which to take this value.
edit: Interestingly it looks like both of these problems are shared by both of the solutions, but shawnt00 has a point: your solution makes a huge pointless cartesian of three tables: see it without the group by.
My current best guess for a working answer would therefore be http://sqlfiddle.com/#!9/09d0c/6

What do the dots mean in this SQL query?

I'm new to MySQL. Can anyone describe lines below which I get theme from the demo of the jqgrid, what is the meaning of a.id? What is the meaning of these dots?
$SQL = "SELECT a.id, a.invdate, b.name, a.amount,a.tax,a.total,a.note FROM invheader a, clients b WHERE a.client_id=b.client_id ORDER BY $sidx $sord LIMIT $start , $limit";
You can find the example here:
http://trirand.com/blog/jqgrid/jqgrid.html
in the advanced>Multi select
You've asked several questions here. To address the dots:
In the FROM clause, a is used as an alias for the invheader table. This means you can reference that table by the short alias a instead of the full table name.
Therefore, a.id refers the the id column of the invheader table.
It is generally considered bad practice to simply give your tables the aliases a, b, c, etc. and I would recommend you use something more useful.
I suggest you read some basic MySQL tutorials as this is a fundamental principal.
The dot(.) is used to separate the board scope.So Songs.songId mean that first find the table named Songs and then in the Songs table find the field named songId.
In my opinion, that DOT NOTATION is used for fetching information from right side of the syntax. That means, a.id which means you fetch the data from "a table". In this case, you use the aliases name, then it runs '.id'which means it fetches data 'ID'from a table.If it is wrong, please comment that wrong statement. thank you

Do i really need to include table names or AS in JOINS if columns are different?

I noticed te other day I can joins in mysql just as easily by doing,
SELECT peeps, persons, friends FROM tablea JOIN tableb USING (id) WHERE id = ?
In stead of using,
SELECT a.peeps, a.persons, b.friends FROM tablea a JOIN tableb b USING (id) WHERE id = ?
It only works if there is no matching column names, why should I do the second rather than the first?
No, you don't need to, but in my humble opinion you really should. It's almost always better in my experience to be explicit with what you're trying to do.
Consider the feelings of the poor guy (or girl) who has to come behind you and try to figure out what you were trying to accomplish and in which tables each column resides. Explicitly stating the source of the column allows one to look at the query and glean that information without deep knowledge of the schema.
Query 1 will work (as long as there are no ambiguous column names).
Query 2 will
be clearer
be more maintainable (think of someone who doesn't know the database schema by heart)
survive the addition of an ambiguous column name to one of the tables
So, don't be lazy because of that pitiful few saved keystrokes.
It's not necessary if you have no duplicate column names. If you do, the query will fail.

MS-Access design pattern for last value for a grouping

It's common to have a table where for example the the fields are account, value, and time. What's the best design pattern for retrieving the last value for each account? Unfortunately the last keyword in a grouping gives you the last physical record in the database, not the last record by any sorting. Which means IMHO it should never be used. The two clumsy approaches I use are either a subquery approach or a secondary query to determine the last record, and then joining to the table to find the value. Isn't there a more elegant approach?
could you not do:
select account,last(value),max(time)
from table
group by account
I tested this (granted for a very small, almost trivial record set) and it produced proper results.
Edit:
that also doesn't work after some more testing. I did a fair bit of access programming in a past life and feel like there is a way to do what your asking in 1 query, but im drawing a blank at the moment. sorry.
After literally years of searching I finally found the answer at the link below #3. The sub-queries above will work, but are very slow -- debilitatingly slow for my purposes.
The more popular answer is a tri-level query: 1st level finds the max, 2nd level gets the field values based on the 1st query. The result is then joined in as a table to the main query. Fast but complicated and time-consuming to code/maintain.
This link works, still runs pretty fast and is a lot less work to code/maintain. Thanks to the authors of this site.
http://access.mvps.org/access/queries/qry0020.htm
The subquery option sounds best to me, something like the following psuedo-sql. It may be possible/necessary to optimize it via a join, that will depend on the capabilities of the SQL engine.
select *
from table
where account+time in (select account+max(time)
from table
group by account
order by time)
This is a good trick for returning the last record in a table:
SELECT TOP 1 * FROM TableName ORDER BY Time DESC
Check out this site for more info.
#Tom
It might be easier for me in general to do the "In" query that you've suggested. Generally I do something like
select T1.account, T1.value
from table T as T1
where T1 = (select max(T2.time) from table T as T2 where T1.account = T2.Account)
#shs
yes, that select last(value) SHOULD work, but it doesn't... My understanding although I can't produce an authorative source is that the last(value) gives the last physical record in the access file, which means it could be the first one timewise but the last one physically. So I don't think you should use last(value) for anything other than a really bad random row.
I'm trying to find the latest date in a group using the Access 2003 query builder, and ran into the same problem trying to use LAST for a date field. But it looks like using MAX finds the lates date.
Perhaps the following SQL is clumsy, but it seems to work correctly in Access.
SELECT
a.account,
a.time,
a.value
FROM
tablename AS a INNER JOIN [
SELECT
account,
Max(time) AS MaxOftime
FROM
tablename
GROUP BY
account
]. AS b
ON
(a.time = b.MaxOftime)
AND (a.account = b.account)
;