Can deletion only be done with Row-Identifier Columns? - socrata

From what I have seen so far, the only rows that can be deleted are those addressed by a row identifier. There doesn't seem to be more information about this in the API documentation.
For example, if I wanted to delete 10 rows, would I have to set the row identifier for each row on the Socrata Dataset Metadata page? Is there no way to do all the deletions at once?
What if a certain row has null values (meaning that column cannot be set as a row identifier)? How do we delete those rows?
Feedback appreciated, thank you.

All datasets have an Internal identifier by default, as described here:
http://dev.socrata.com/docs/row-identifiers.html
You can use the internal identifier to delete rows in the same way you would delete rows by a 'Publisher-Specified Identifier' (established using the Socrata Dataset Metadata page). Information on deleting a row by its identifier is here:
https://dev.socrata.com/publishers/direct-row-manipulation.html
You can delete rows one at a time or in bulk as described here:
dev.socrata.com/publishers/upsert.html
To find the internal identifiers simply use $select=:id in your SODA query. For example:
https://soda.demo.socrata.com/resource/4tka-6guv.json?$select=:id,region
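For the bulk case, here is a minimal sketch of building a deletion payload. It assumes the upsert endpoint accepts a JSON array in which each object names a row by `:id` and carries `":deleted": true`; check this against the upsert page linked above. The helper name and the identifier values are hypothetical, and nothing is sent over the network.

```python
import json

def build_delete_payload(row_ids):
    """Return a JSON upsert payload that marks each given row for deletion."""
    # Assumption: each object names the row via ":id" and flags it with ":deleted".
    return json.dumps([{":id": rid, ":deleted": True} for rid in row_ids])

# Hypothetical internal identifiers, as a $select=:id query might return them.
payload = build_delete_payload(["row-aaaa.bbbb_cccc", "row-dddd.eeee_ffff"])
print(payload)
```

The resulting string would then be sent as the body of a single upsert request, so all ten rows could be removed in one call rather than one at a time.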

Related

MySQL count selected rows in one table to update value in another table

I have created a table ("texts" table) for storing ocr text from scanned documents. The table now has 100,000 + records. It stores a separate record for each page in the document. I set up the table originally so it stored the documents' title and its location against each record, which was obviously bad design as the info was duplicated for many records. I have subsequently created a separate table which now only stores one record for each document ("documents" table). The original table still contains a record for each page in the document, but the only columns now are the ocr text and the id of the document record in the documents table.
The documents table has a column "total_pages". I am trying to update this value using the following query:
UPDATE documents SET total_pages=(SELECT Count(*) from texts where texts.docs_id=documents.id)
This just seems to take forever to execute and I have had to crash out of it on a couple of occasions. There are over 8000 records in the documents table.
I have tested the query by limiting it to just one document
UPDATE documents SET total_pages=(SELECT Count(*) from texts where texts.docs_id=documents.id and documents.id=1)
This works eventually with just one record, but it takes a very long time to execute. I am guessing that my full query needs a bit of optimization! Any help greatly appreciated.
This is your query:
UPDATE documents
SET total_pages = (SELECT Count(*)
from texts
where texts.docs_id = documents.id)
For performance, you want an index on texts(docs_id). That will probably fix your performance problem. In fact, it might make it unnecessary to store this value in the master table.
If you do decide to store the count, be sure that you keep the value up-to-date. That would typically require a trigger to handle inserts and deletes (and perhaps updates, if docs_id changes).
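A small runnable sketch of the index-then-update approach. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the engine differs from the question; the index name idx_texts_docs_id and the ocr_text column are assumptions, while the two SQL statements themselves should carry over to MySQL largely unchanged.

```python
import sqlite3

def update_page_counts(conn):
    """Index texts.docs_id, then store each document's page count."""
    # The index lets the correlated subquery look up pages per document
    # instead of scanning the whole texts table for every documents row.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_texts_docs_id ON texts (docs_id)")
    conn.execute(
        "UPDATE documents SET total_pages = "
        "(SELECT COUNT(*) FROM texts WHERE texts.docs_id = documents.id)"
    )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, total_pages INTEGER);
    CREATE TABLE texts (id INTEGER PRIMARY KEY, docs_id INTEGER, ocr_text TEXT);
    INSERT INTO documents (id) VALUES (1), (2);
    INSERT INTO texts (docs_id, ocr_text) VALUES (1, 'p1'), (1, 'p2'), (2, 'p1');
""")
update_page_counts(conn)
print(conn.execute("SELECT id, total_pages FROM documents ORDER BY id").fetchall())
```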

Delete sql rows with multiple values within where IDs do not have a match in another table

I have two tables:
Table: Options
Id    xItems
-     ItemA,ItemB,ItemC,etc
Table: Items
Id
-
I am attempting to delete all Items rows that are not listed within Options.xitems.
I attempted to execute the SQL statement:
DELETE FROM items
Where items.id NOT IN (SELECT xitems FROM options)
However the problem is that multiple values are contained within XItems and I only managed to delete rows where Item.Id was the first or only value.
Would appreciate any kind help
EDIT: The following update added from the OP's post as an Answer.
The server is MySQL (tags edited accordingly), which allows one to enter an SQL statement below to execute against any database table or tables. I am a front end dev and get confused with this stuff.
John, I ran the code you posted. Here is the actual code I am applying against backed-up test tables:
DELETE FROM xbak514q_ecom_prodoptionsel
WHERE NOT FIND_IN_SET(xbak514q_ecom_prodoptionsel.id, (SELECT xprodoptionsel FROM xbak514q_ecom_prodoptions))
which returned the following error:
A problem was encountered while executing the SQL statement submitted.
The error was reported as: The MySQL extension encountered a problem
submitting an SQL statement. MySQL reported the error as: Subquery
returns more than 1 row
This database was configured by a software company that set up an e-comm site. The items, product options and selection items (add-ons) are quite extensive. Should I consider reformatting the tables?
Again, thanks for your kind help.
Have you tried using FIND_IN_SET()?
DELETE FROM items
WHERE NOT FIND_IN_SET(items.id, (SELECT xitems FROM options))
NOTE:
FIND_IN_SET() is MySQL-only, so since the question is tagged for more than one database, this may or may not be the solution for you. The function takes the search string as its first argument and a single comma-separated string as its second.
RECOMMENDATION:
You should NEVER store data in the database as a comma-separated list like that. It causes HUGE issues in the future. Please consider normalizing your database. If you want a way to do that, just post a comment and I'll write up a query that will normalize it for you.
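The normalization recommended above could look roughly like the sketch below. It runs on Python's built-in sqlite3 module rather than MySQL, and the option_items junction table and its column names are hypothetical. The idea is to split each xitems string into one row per item, after which an ordinary NOT IN subquery matches every listed item, not only the first.

```python
import sqlite3

def delete_unreferenced_items(conn):
    """Normalize options.xitems into option_items, then delete unmatched items."""
    # option_items is a hypothetical junction table: one row per (option, item) pair.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS option_items (option_id INTEGER, item_id TEXT)"
    )
    rows = conn.execute("SELECT id, xitems FROM options").fetchall()
    for option_id, xitems in rows:
        for item_id in (xitems or "").split(","):
            if item_id.strip():
                conn.execute(
                    "INSERT INTO option_items (option_id, item_id) VALUES (?, ?)",
                    (option_id, item_id.strip()),
                )
    # With one item per row, a plain NOT IN subquery sees every listed item.
    conn.execute("DELETE FROM items WHERE id NOT IN (SELECT item_id FROM option_items)")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (id INTEGER PRIMARY KEY, xitems TEXT);
    CREATE TABLE items (id TEXT PRIMARY KEY);
    INSERT INTO options (id, xitems) VALUES (1, 'ItemA,ItemB'), (2, 'ItemC');
    INSERT INTO items (id) VALUES ('ItemA'), ('ItemB'), ('ItemC'), ('ItemD');
""")
delete_unreferenced_items(conn)
print(conn.execute("SELECT id FROM items ORDER BY id").fetchall())
```

Once the junction table exists, the comma-separated column can be retired, and the same DELETE works in any SQL database without FIND_IN_SET().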

best way to get the last inserted record in sql server

Hi all, I have an identity column and a computed primary key column in my table. I need to get the last inserted record immediately after inserting it into the database, so I have written the following queries. Can someone tell me which is the best one to choose?
SELECT
t.[StudentID]
FROM
[tbl_Student] t
WHERE
t.ID = IDENT_CURRENT('tbl_Student')
The other is using MAX as follows
Select
MAX(StudentID)
from tbl_Student
Of the two queries above, which is the best one to choose?
MAX and IDENT_CURRENT, according to TechNet, would behave much the same, and both would be equally unreliable.
"IDENT_CURRENT is not limited by scope and session; it is limited to a specified table. IDENT_CURRENT returns the identity value generated for a specific table in any session and any scope. For more information, see IDENT_CURRENT (Transact-SQL)."
Basically, to return the last insert within the current scope, regardless of any potential triggers or inserts / deletes from other sessions, you should use SCOPE_IDENTITY. Of course, that's assuming you're running the query in the same scope as the actual insert in the first place. :)
If you are, you also have the alternative of simply using the OUTPUT clause to get the inserted ID values into a table variable / temporary table, and selecting from there.
The original answer, where my assumptions about IDENT_CURRENT were wrong:
Use the first one. IDENT_CURRENT should give you the last item for the current connection. If someone else would insert another student concurrently IDENT_CURRENT will give you the correct value for both clients, while MAX might give you a wrong value.
EDIT:
As was mentioned in the other answer, IDENT_CURRENT and MAX are equally unreliable in case of concurrent usage. I would still go for IDENT_CURRENT, but if you want to get the last identity used by the current scope or session, you can use the functions @@IDENTITY and SCOPE_IDENTITY. This TechNet article explains the detailed differences between IDENT_CURRENT, @@IDENTITY and SCOPE_IDENTITY.

Is there a method in MySQL that can INSERT data only if COUNT is equal to zero?

In a registration script, I first have to check whether an email is already present in the database.
If it is present, no data should be inserted; if not, I can proceed with the INSERT INTO.
In either case, at the end of the query I have to know the result so that I can communicate it to the end user. I have already written a script that does this, but it requires at least two queries. My goal is to do it with only one query.
First you'll want to put a unique key on the e-mail address field. This will prevent you from inserting multiple records with the same e-mail address.
Once you've done that, you can use INSERT IGNORE and check the number of affected rows returned from the query. If it's zero, you know it was a duplicate. If it's one, then you know it wasn't. Alternatively, you can just use a regular INSERT and catch the duplicate key error generated by the database to know if it was a duplicate record or not.
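A minimal sketch of the unique-key-plus-affected-rows approach. It uses Python's built-in sqlite3 module, where the equivalent of MySQL's INSERT IGNORE is spelled INSERT OR IGNORE; the users table and the register helper are hypothetical names.

```python
import sqlite3

def register(conn, email):
    """Insert the email if it is new; return True when a row was actually added."""
    # SQLite spelling of MySQL's INSERT IGNORE: a duplicate is skipped silently.
    cur = conn.execute("INSERT OR IGNORE INTO users (email) VALUES (?)", (email,))
    # rowcount is the number of affected rows: 1 for a new email, 0 for a duplicate.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint is what makes the single-query check possible.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
print(register(conn, "a@example.com"))
print(register(conn, "a@example.com"))
```

The boolean result is what the script would report back to the user, and only one statement is sent per registration attempt.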

insert a row into a sql table from a number greater than 1 and increment all contents of column in following rows

I'm quite new to sql. I am using a mysql db with opensource cms. I want to insert a row into the zone table which has all of the locale names stored inside.
I want to insert a row at position 3561, and increment the value of zone id for all of the following rows. Can you help?
Also, if you know of any good tutorial resources that you could recommend and perhaps a decent online reference (both free please - I'm skint) then I'd be grateful.
Cheers
You don't want to do this. The zone_id should be an id in the zone table that serves no purpose other than identifying the row in the table. Generally, these are auto-incremented ids that simply add 1 to the previous largest id.
You can insert, delete, and modify the rows in the zone table. The id will always refer to the same row. This helps ensure relational integrity. So, you can refer to a row in the table using the id, rather than some other feature that might get updated.
If, for some reason, you need to output the rows in the table with a sequential id, there are ways to do this in most databases, including MySQL.
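One such way, sketched here under assumptions, is the ROW_NUMBER() window function. It numbers rows at query time and leaves the stored zone_id values untouched. The example runs on Python's built-in sqlite3 module (which needs SQLite 3.25 or later for window functions); MySQL supports the same function from version 8.0. The name column is an assumption about the zone table.

```python
import sqlite3

def zones_with_position(conn):
    """Return (position, zone_id, name) rows, numbered sequentially by name."""
    # ROW_NUMBER() is computed for the query result only; zone_id is not modified.
    return conn.execute(
        "SELECT ROW_NUMBER() OVER (ORDER BY name) AS position, zone_id, name "
        "FROM zone ORDER BY position"
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zone (zone_id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO zone (zone_id, name) VALUES (10, 'Beta'), (20, 'Alpha'), (35, 'Gamma');
""")
for row in zones_with_position(conn):
    print(row)
```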
Thanks for your help. I received an answer from the opencart forum and it seems there is a php function to achieve this built into the admin.