Updating JSON in SQLite with JSON1

The SQLite JSON1 extension has some really neat capabilities. However, I have not been able to figure out how I can update or insert individual JSON attribute values.
Here is an example:
CREATE TABLE keywords
(
    id   INTEGER PRIMARY KEY,
    lang INTEGER NOT NULL,
    kwd  TEXT NOT NULL,
    locs TEXT NOT NULL DEFAULT '{}'
);
CREATE INDEX kwd ON keywords(lang,kwd);
I am using this table to store keyword searches, recording the locations from which each search was initiated in the locs object. A sample entry in this table would look like the one shown below:
id:1,lang:1,kwd:'stackoverflow',locs:'{"1":1,"2":1,"5":1}'
The location object attributes here are indices to the actual locations stored elsewhere.
Now imagine the following scenarios
A search for stackoverflow is initiated from location index "2". In this case I simply want to increment the value at that index so that after the operation the corresponding row reads
id:1,lang:1,kwd:'stackoverflow',locs:'{"1":1,"2":2,"5":1}'
A search for stackoverflow is initiated from a previously unknown location index "7" in which case the corresponding row after the update would have to read
id:1,lang:1,kwd:'stackoverflow',locs:'{"1":1,"2":1,"5":1,"7":1}'
It is not clear to me that this can in fact be done. I tried something along the lines of
UPDATE keywords json_set(locs,'$.2','2') WHERE kwd = 'stackoverflow';
which failed with a syntax error near json_set. I'd be most obliged to anyone who might be able to tell me how, or whether, this should or can be done.

It is not necessary to create such complicated SQL with subqueries to do this.
The SQL below would solve your needs.
UPDATE keywords
SET locs = json_set(locs,'$.7', IFNULL(json_extract(locs, '$.7'), 0) + 1)
WHERE kwd = 'stackoverflow';
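Applied to the sample row above, the same statement handles both scenarios; only the path changes:
-- location "2" already exists, so its count is incremented from 1 to 2
UPDATE keywords
SET locs = json_set(locs, '$.2', IFNULL(json_extract(locs, '$.2'), 0) + 1)
WHERE kwd = 'stackoverflow';
-- locs is now '{"1":1,"2":2,"5":1}'

-- location "7" does not exist yet, so IFNULL supplies 0 and the key is created with value 1
UPDATE keywords
SET locs = json_set(locs, '$.7', IFNULL(json_extract(locs, '$.7'), 0) + 1)
WHERE kwd = 'stackoverflow';
-- locs is now '{"1":1,"2":2,"5":1,"7":1}'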
I know this is old, but it is one of the first results when searching for this, and it deserves a better solution.

I could have just deleted this question, but given that the SQLite JSON1 extension appears to be relatively poorly understood, I felt it would be more useful to provide an answer here for the benefit of others. What I set out to do is possible, but the SQL syntax is rather more convoluted. The statement
UPDATE keywords SET locs =
  (SELECT json_set(json(keywords.locs), '$.N',
            IFNULL(
              (SELECT json_extract(keywords.locs, '$.N') FROM keywords WHERE id = 1),
              0) + 1)
   FROM keywords WHERE id = 1)
WHERE id = 1;
where N stands for the location index in question ("2" or "7" in the scenarios above), will accomplish both of the updates I described in my original question. Given how complicated this looks, a few explanations are in order:
The UPDATE keywords part does the actual updating, but it needs to know what to update
The SELECT json_set part is where we establish the value to be updated
If the relevant value does not exist in the first place, we do not want to do a + 1 on a NULL value, so we do an IFNULL test
The WHERE id = 1 bits ensure that we target the right row
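Plugging the second scenario (the previously unknown location index "7") into the template gives:
UPDATE keywords SET locs =
  (SELECT json_set(json(keywords.locs), '$.7',
            IFNULL(
              (SELECT json_extract(keywords.locs, '$.7') FROM keywords WHERE id = 1),
              0) + 1)
   FROM keywords WHERE id = 1)
WHERE id = 1;
-- the locs value for row 1 now reads '{"1":1,"2":1,"5":1,"7":1}'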
Having now worked with JSON1 in SQLite for a while, I have a tip to share with others going down the same road. It is easy to waste your time writing extremely convoluted and hard-to-maintain SQL in an effort to perform in-place JSON manipulation. Consider using SQLite temporary tables (CREATE TEMP TABLE ...) to store intermediate results and writing a sequence of simpler SQL statements instead. This makes the code a whole lot easier to understand and to maintain.
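For instance, a hypothetical two-step version of the location-count update (the loc_counts table and cnt column are illustrative names):
-- stage the current count for location "7" in a temporary table
CREATE TEMP TABLE loc_counts AS
SELECT id, IFNULL(json_extract(locs, '$.7'), 0) AS cnt
FROM keywords
WHERE kwd = 'stackoverflow';

-- apply the increment from the staged values
UPDATE keywords
SET locs = json_set(locs, '$.7',
                    (SELECT cnt + 1 FROM loc_counts WHERE loc_counts.id = keywords.id))
WHERE id IN (SELECT id FROM loc_counts);

DROP TABLE loc_counts;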

Related

SQLAlchemy db.text().bindparams() without clobbering

I just discovered that if you use the same name in bindparams twice in the same query, the second value clobbers the first one. For a contrived example:
db.session.query(MyTable).filter(
    db.or_(
        db.text("my_table.field = :value").bindparams(value=value1),
        db.text("my_table.field = :value").bindparams(value=value2),
    )
)
Here you would only get things with value2. value1 would not appear in the query.
Is there a general purpose way to fix this?
Btw, db.text() bits in my real query access nested jsonb properties, so please don't answer telling me to just use the column objects in this query in place of db.text().

MySQL SET Type in PostgreSQL? [duplicate]

This question already has an answer here: convert MySQL SET data type to Postgres (1 answer). Closed 9 years ago.
I'm trying to use the MySQL SET type in PostgreSQL, but I found only arrays, which have quite similar functionality but don't meet my requirements.
Does PostgreSQL have a similar datatype?
You can use the following workarounds:
1. BIT strings
You can define your set of maximum N elements simply as BIT(N).
It is a little awkward to populate and retrieve - you will have to use bit masks as set members. But bit strings really shine for set operations: intersection is simply &, union is |.
This type is stored very efficiently - bit per bit, with a small overhead for the length.
Also, it is nice that the length is not really limited (but you have to decide on it upfront).
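A minimal sketch of the idea, assuming a hypothetical three-member set with one bit position per member:
-- each bit position stands for one set member, e.g. 'one', 'two', 'three'
CREATE TABLE features (id int PRIMARY KEY, flags bit(3) NOT NULL);
INSERT INTO features VALUES (1, B'101'), (2, B'011');
-- set operations become bit operations
SELECT a.flags & b.flags AS intersection,  -- B'001'
       a.flags | b.flags AS union_set      -- B'111'
FROM features a, features b
WHERE a.id = 1 AND b.id = 2;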
2. HSTORE
HSTORE type is an extension, but very easy to install. Simply executing
CREATE EXTENSION hstore
for most installations (9.1+) will make it available. Rumor has it that PostgreSQL 9.3 will have HSTORE as standard type.
It is not really a set type, but more like a Perl hash or a Python dictionary: it keeps an arbitrary set of key => value pairs.
With that, it is not very efficient (certainly not BIT-string efficient), but it does provide the functions essential for sets: || is union, while intersection is a little awkward - use
slice(a,akeys(b)) || slice(b,akeys(a))
You can read more about HSTORE in the PostgreSQL documentation.
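For example, treating the keys as the set members and using dummy values:
CREATE EXTENSION IF NOT EXISTS hstore;
-- union of {a,b} and {b,c}
SELECT 'a=>1, b=>1'::hstore || 'b=>1, c=>1'::hstore;             -- "a"=>"1", "b"=>"1", "c"=>"1"
-- intersection, via the slice/akeys trick above
SELECT slice('a=>1, b=>1'::hstore, akeys('b=>1, c=>1'::hstore)); -- "b"=>"1"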
What about an array with a check constraint:
create table foobar
(
    myset text[] not null,
    constraint check_set
        check ( array_length(myset,1) <= 2
                and (myset = array[''] or 'one' = ANY(myset) or 'two' = ANY(myset))
              )
);
This would match the definition of SET('one', 'two') as explained in the MySQL manual.
The only thing this would not do is "normalize" the array. So
insert into foobar values (array['one', 'two']);
and
insert into foobar values (array['two', 'one']);
would be displayed differently than in MySQL (where both would wind up as 'one','two').
The check constraint will, however, get messy with more than 3 or 4 elements.
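A quick check of how the constraint behaves:
insert into foobar values (array['one','two']);  -- accepted
insert into foobar values (array['three']);      -- rejected: neither 'one' nor 'two' is present
insert into foobar values (array['a','b','c']);  -- rejected: more than two elements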
Building on a_horse_with_no_name's answer above, I would suggest something just a little more complex:
CREATE FUNCTION set_check(in_value anyarray, in_check anyarray)
RETURNS bool LANGUAGE sql IMMUTABLE AS
$$
WITH basic_check AS (
    SELECT bool_and(v = ANY($2)) AS condition, count(*) AS ct
      FROM unnest($1) v
     GROUP BY v
), length_check AS (
    SELECT count(*) = 0 AS test FROM unnest($1)
)
SELECT bool_and(condition AND ct = 1)
  FROM basic_check
UNION
SELECT test FROM length_check WHERE test;
$$;
Then you should be able to do something like:
CREATE TABLE set_test (
my_set text[] CHECK (set_check(my_set, array['one'::text,'two']))
);
This works:
postgres=# insert into set_test values ('{}');
INSERT 0 1
postgres=# insert into set_test values ('{one}');
INSERT 0 1
postgres=# insert into set_test values ('{one,two}');
INSERT 0 1
postgres=# insert into set_test values ('{one,three}');
ERROR: new row for relation "set_test" violates check constraint "set_test_my_set_check"
postgres=# insert into set_test values ('{one,one}');
ERROR: new row for relation "set_test" violates check constraint "set_test_my_set_check"
Note this assumes that every value in your set must be unique (we are talking about sets here). The function should perform well and should meet your needs, and it has the advantage of handling sets of any size.
Storage-wise it is completely different from MySQL's implementation. It will take up more space on disk, but it should handle sets with as many members as you like, provided you aren't running up against storage limits... so this should offer a superset of the functionality of MySQL's implementation. One significant difference, though, is that this does not collapse the array into distinct values; it just prohibits duplicates. If you need that too, look at a trigger.
This solution also leaves the ordinality of the input data intact, so '{one,two}' is distinct from '{two,one}'. If you need to change that behavior, you may want to look into exclusion constraints on PostgreSQL 9.2.
Are you looking for enumerated data types?
PostgreSQL 9.1 Enumerated Types
From reading the page referenced in the question, it seems a SET is a way of storing up to 64 named boolean values in one column. PostgreSQL does not provide a direct equivalent. You could use independent boolean columns, or some size of integer and twiddle the bits directly. Adding two new tables (one for the valid names, and the other to join names to detail rows) could also make sense, especially if there is a possibility of needing to associate other data with individual values; see the sketch below.
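A hypothetical sketch of that two-table layout (all names are illustrative):
CREATE TABLE set_members (member_id int PRIMARY KEY, name text UNIQUE NOT NULL);
CREATE TABLE detail_members (
    detail_id int NOT NULL,  -- would reference your detail table
    member_id int NOT NULL REFERENCES set_members,
    PRIMARY KEY (detail_id, member_id)
);
INSERT INTO set_members VALUES (1, 'one'), (2, 'two');
-- detail row 42 carries the set {'one','two'}
INSERT INTO detail_members VALUES (42, 1), (42, 2);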
Some time ago I wrote a similar extension:
https://github.com/okbob/Enumset
but it is not complete.
More complete, and closer to MySQL, is the functionality from pltoolbox:
http://okbob.blogspot.cz/2010/12/bitmapset-for-plpgsql.html
http://pgfoundry.org/frs/download.php/3203/pltoolbox-1.0.2.tar.gz
http://postgres.cz/wiki/PL_toolbox_%28en%29
The function find_in_set can be emulated via arrays:
http://okbob.blogspot.cz/2009/08/mysql-functions-for-postgresql.html

SQL select everything with arbitrary IN clause

This will sound silly, but trust me it is for a good (i.e. over-engineered) cause.
Is it possible to write a SQL query using an IN clause which selects everything in that table without knowing anything about the table? Keep in mind this would mean you can't use a subquery that references the table.
In other words I would like to find a statement to replace "SOMETHING" in the following query:
SELECT * FROM table_a WHERE table_a.id IN (SOMETHING)
so that the results are identical to:
SELECT * FROM table_a
by doing nothing beyond changing the value of "SOMETHING"
To satisfy the curious I'll share the reason for the question.
1) I have a FactoryObject abstract class which grants all models that extend it some glorious factory method magic using two template methods: getData() and load()
2) Models must implement the template methods. getData() is a static method that accepts ID constraints, pulls rows from the database, and returns a set of associative arrays. load() is not static; it accepts an associative array and populates the object from it.
3) The non-abstract part of FactoryObject implements getObject() and getObjects() methods. These call getData(), create objects, and call load() on the arrays returned by getData() to produce populated, ready-to-return objects.
getObjects() requires ID constraints as an input, either in the form of a list or in the form of a subquery, which are then passed to getData(). I wanted to make it possible to pass in no ID constraints to get all objects.
The problem is that only the models know about their tables. getObjects() is implemented at a higher level and so it doesn't know what to pass getData(), unless there was a universal "return everything" clause for IN.
There are other solutions. I can modify the API to require getData to accept a special parameter and return everything, or I can implement a static getAll[ModelName]s() method at the model level which calls:
static function getAllModelObjects() {
    return getObjects("select [model].id from [model]");
}
This is reasonable and may fit the architecture anyway, but I was curious so I thought I would ask!
Works on SQL Server:
SELECT * FROM table_a WHERE table_a.id IN (table_a.id)
Okay, I hate saying no, so I had to come up with another solution for you.
Since MySQL is open source, you can get the source and incorporate a new feature that understands the infinity symbol. Then you just need to get the MySQL community to buy into the usefulness of this feature (steer the conversation away from security as much as possible in your attempts to do so), and then get your company to upgrade its DBMS to the new version once the feature has been implemented.
Problem solved.
The answer is simple. The workaround is to add some criteria like these:
# to query on a number column
AND (-1 in (-1) OR sample_table.sample_column in (-1))
# or to query on a string column
AND ('%' in ('%') OR sample_table.sample_column in ('%'))
Therefore, in your example, the two following queries should return the same result as soon as you pass -1 as the parameter value.
SELECT * FROM table_a;
SELECT * FROM table_a WHERE (-1 in (-1) OR table_a.id in (-1));
And whenever you want to filter on particular values, you can pass them as the parameter. For example, in the following query, only the records with an id of 1, 2, or 6 are returned.
SELECT * FROM table_a WHERE (-1 in (1, 2, 6) OR table_a.id in (1, 2, 6));
In this case, we have a default value like -1 or % and we have a parameter that can be anything. If the parameter is the default value, nothing is filtered.
I suggest the % character as the default value if you are querying over a text column, or -1 if you are querying over the PK of the table. But it is totally up to you to substitute % or -1 with any reserved character or number you decide on.
Similar to @brandonmoore's answer:
select * from table_a where table_a.id not in ('0')
How about:
select * from table_a where table_a.id not in ('somevaluethatwouldneverpossiblyexistintable_a.id')
EDIT:
As much as I would like to continue thinking of a way to solve your problem, I know there isn't one, so I figure I'll go ahead and be the first person to tell you, so I can at least get credit for the answer. It's truly a bittersweet victory though :/
If you provide more info though maybe I or someone else can help you think of another workaround.

Query on custom metadata field?

This is a request from my client to tweak an existing Perl script. However, it is the actual database structure on their end that confuses me.
The requirement looks pretty simple:
only pull records where _X begins with 1, 2, or 9.
However, the underlying database is not that simple, here is the guideline from their DBA:
"_X is a custom metadata field. The database stores this data in rows, not columns, within the customData table. In order to query the custom data table in an efficient manner you need to know the Field_ID for the custom field you get that from the fielddef table:
SELECT Field_ID FROM FieldDef WHERE Name = "_X";
This returns:
10012
"Now you can query CustomData. For example:
SELECT Record_ID FROM CustomData where Field_ID="10012" AND StringValue="2012-04";
He also suggests that in my case it would probably be:
"SELECT Record_ID FROM CustomData where Field_ID="10012" AND (StringValue LIKE '1%' || StringValue LIKE '2%' || StringValue LIKE '9%')
The weird thing is that the existing Perl script doesn't contain anything like "SELECT Record_ID FROM"; it only has queries like "SELECT StringValue FROM".
So that is why I am very confused here: what does "stores this data in rows, not columns" mean? Why query the FieldDef table for the Field_ID first and then CustomData? I won't be able to reach any of them over the weekend, but I'd really like to get some idea of how the whole thing fits together. I hope the experts here can help me sort out the structure.
More info (table schema):
http://pastebin.com/ZiDTCCC0
The existing Perl script (focus on lines 72-136):
http://pastebin.com/JHpikTeZ
Thanks in advance.
What they seem to be using is a kind of Entity-Attribute-Value model, with the attributes identified by integer IDs that are defined in another table (FieldDef).
You explained pretty well how you queried it (although you can do it in one query, with a join or a subquery), and your problem seems to be that you don't know how the Perl script does it. Unfortunately, without us seeing the Perl script, we can't either :]
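For reference, a single-query version of the DBA's two steps might look like this (table and column names taken from the question; untested against their schema):
SELECT cd.Record_ID, cd.StringValue
FROM CustomData cd
JOIN FieldDef fd ON fd.Field_ID = cd.Field_ID
WHERE fd.Name = '_X'
  AND (cd.StringValue LIKE '1%'
    OR cd.StringValue LIKE '2%'
    OR cd.StringValue LIKE '9%');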

linq-to-sql How can I get a few rows that don't match my existing rows?

I have a few rows of data pulled into business objects via linq-to-sql from large tables.
Now I want to get a few rows that don't match to test my comparison functions.
Using what I thought would work, I get a NotSupportedException:
Local sequence cannot be used in LINQ to SQL implementation of query operators except the Contains() operator.
Here's the code:
// This table has a 2-field primary key; the other has a single-field key
var AllNonMatches = from c in dc.Acaps
where !Matches.Rows.Any((row) => row.Key.Key == c.AppId & row.Key.Value == c.SeqNbr)
select c;
foreach (var item in AllNonMatches.Take(100)) //Exception here
{}
The table has a compound primary key: AppId and SeqNbr.
Matches.Rows is defined as a dictionary keyed on KeyValuePair(appid, seqnbr),
and the local sequence the error refers to appears to be that local dictionary.
Could you provide more information on the structure and the name(s) of the table(s), please?
Not sure what you're trying to do...
edit:
Ok.. I think I get it now...
It appears you can't merge/join local tables (dictionary) with a SQL table.
If you can, I'm afraid I don't know how to do it.
The simplest solution I can think of is to put those results in a table ("Match" for instance) with foreign keys related to your table "Acaps" and then use linq-to-sql, like:
var AllNonMatches = dc.Acaps.Where(p=>p.Matchs==null).Take(100).ToList();
Sorry I couldn't come up with anything better =(
What about this:
var AllNonMatches = from c in dc.Acaps
where !(Matches.Rows.ContainsKey(c.AppId) && Matches.Rows.ContainsValue(c.SeqNbr))
select c;
That will work fine. Note that && is the conditional (short-circuiting) AND operator, as opposed to the bitwise & used in the question's query; it avoids evaluating the right-hand side when the left-hand side is already false.