Couchbase N1QL queries

I have two questions regarding N1QL queries in Couchbase.
1: Suppose I have a user table where userid is the document key, and I
run a query like this
select * from mybucket use keys["1234"];
2: Suppose userid is not the document key, and I create a secondary index on userid
select * from mybucket where userid=1234;
So my question is: which query would perform faster?
My second question is:
Suppose I have a user table where userid is the document key
select * from mybucket where meta().id="1234";
This query does not run and gives me "No index available on keyspace".
Since userid is the document key, I expected it to run like "use keys". I tried to create a secondary index on userid, but it says the index cannot be created because this field is not part of the document (obviously, since it is the document key).

The first query will run fastest. Naming the specific key in a USE KEYS clause lets Couchbase retrieve the document directly in a single request. The second approach, using an index, will be slightly slower, because the system first has to ask the index for the document ID and then retrieve the document itself. The second approach will still be very fast, just not quite as fast as the first one.
Yeah, depending on what version you are using, we may not be fully optimizing that third case (the meta().id query). Use USE KEYS if you can.
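To make the difference between the two access paths concrete, here is a toy model in Python. It is only a sketch: plain dicts stand in for the document store and the secondary index, the names (documents, userid_index, fetch_by_key, fetch_by_userid) are made up for illustration, and nothing here calls a real Couchbase API. It just counts how many lookups each path needs.

```python
# Toy model only: plain dicts stand in for the document store and the
# secondary index. All names are hypothetical; no Couchbase API is used.
documents = {"1234": {"userid": 1234, "name": "alice"}}  # document key -> document
userid_index = {1234: "1234"}                            # indexed userid -> document key


def fetch_by_key(doc_key):
    """USE KEYS path: one request, straight to the document."""
    return documents[doc_key], 1


def fetch_by_userid(userid):
    """Index path: ask the index for the key, then fetch the document."""
    doc_key = userid_index[userid]               # request 1: index lookup
    doc, fetch_requests = fetch_by_key(doc_key)  # request 2: document fetch
    return doc, 1 + fetch_requests


doc_a, hops_a = fetch_by_key("1234")
doc_b, hops_b = fetch_by_userid(1234)
print(hops_a, hops_b)  # 1 2
```

Both paths return the same document; the index path simply pays for one extra round trip.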

Is it correct to delete a tuple from my query result which has been populated from a sql statement?

I am learning SQL queries. If my query result comes back like the one in the picture, is that a valid query result? If not, what should I do to get the result back as a solid table that doesn't let me edit it? I am using phpMyAdmin.
The query was to list all the students who haven't provided NOK details.
My SQL statement is given below:
select CONCAT(s.fname,' ', s.lname) as Names,s.bannerNo
from Student s
where NOT EXISTS (select * from NOK n where n.studentID = s.bannerNo)
Student table:
Student (bannerNo,fName,lName, street,city,postcode,mobile,email,DOB,gender,category,nationality,special needs, comments,status,courseNo,staffNo)
Primary Key bannerNo
NOK Table:
NOK (StudentID,fName,lName,relationship,street,city,postcode,phoneNo)
Primary Key StudentID
Foreign key StudentID references Student(bannerNo)
If I understand correctly you want to delete a tuple on the visual tool (phpmyadmin) that shows you the result of the query.
I don't know the tool, but it does show the "Delete" button, as your image proves.
Let me translate this into SQL terms: you are seeing a "resultset" and you want to know if it's valid to manipulate it, so it will affect the underlying data. Did I get it right?
If so, then this question is similar to the manipulation of data using views. A view is a projection of one or more tables. Then... are views updatable? The answer is: some of them are.
Which ones? That depends on the database engine. Oracle is quite good at those; other databases allow it only in a small number of cases.
So, what's the main idea, then? Well in most databases the main concept is about "traceability". Does each resulting row correspond to a single row in one underlying table? If this is the case, then most likely you'll be able to update/delete it. If not, you're out of luck.
Why? Because, the database engine needs to have some way back to the source of the data you are viewing. If your query includes the primary key of the table, that's good news. If not, then it will be more difficult. Maybe some other column has a unique constraint or something that can be of use.
Bottom line, it depends on:
The database engine (MySQL in this case).
The "traceability" of the specific query. Can it be traced back to the rows? Seems so in your query, since the PK bannerNo is right there.
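Here is a small sketch of what "traceability" means in practice. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs on its own), so string concatenation is written with || instead of CONCAT, and the sample rows are invented. Because the result set carries the primary key bannerNo, each result row can be mapped back to exactly one Student row and deleted.

```python
import sqlite3

# SQLite stands in for MySQL here; table contents are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (bannerNo INTEGER PRIMARY KEY, fName TEXT, lName TEXT);
    CREATE TABLE NOK (StudentID INTEGER PRIMARY KEY REFERENCES Student(bannerNo),
                      fName TEXT);
    INSERT INTO Student VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Poe');
    INSERT INTO NOK VALUES (1, 'Dan');
""")

# Students without NOK details; the primary key is part of the result set.
rows = conn.execute("""
    SELECT s.fName || ' ' || s.lName AS Names, s.bannerNo
    FROM Student s
    WHERE NOT EXISTS (SELECT * FROM NOK n WHERE n.StudentID = s.bannerNo)
    ORDER BY s.bannerNo
""").fetchall()

# "Deleting a result row" is really a DELETE keyed on the primary key it carries.
name, banner_no = rows[0]
deleted = conn.execute("DELETE FROM Student WHERE bannerNo = ?", (banner_no,)).rowcount
remaining = [r[0] for r in conn.execute("SELECT bannerNo FROM Student ORDER BY bannerNo")]
print(rows, deleted, remaining)
```

If the query returned only the concatenated name, there would be no reliable way to find the source row again, since two students can share a name.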

Query optimization for inserts to a database

I need a way to insert a lot of rows concurrently into my SQL DB.
I have a rule that every time I insert into my transaction table, I need a unique ID composed of currentTime+transactionSource+sequenceNumber. My problem is that when I test my service using JMeter, the service goes down once the concurrent insert load reaches 3000 rows. The cause is duplication of the unique IDs I generate. My assumption is that the duplicates happen because one insert has not finished when another insert starts, so both generate the same ID.
Can anyone suggest the best way to do this? Thank you.
MySQL has three wonderful methods to ensure that an id is unique:
auto_increment columns
uuid()
uuid_short()
Use them! The most common way to implement a unique id is the first one:
create table t (
t_id int auto_increment primary key,
. . .
)
I strongly, strongly advise you not to maintain your own id. You get race conditions (as you have seen). Your code will be less efficient than the code in the database. If you need the separate components, you can implement them as columns in the table.
In other words, your fundamental problem is your "rule". And there are zillions of databases in the world that work perfectly well without such a rule.
Why don't you let the database handle the insert id and then update the row with a secondary field containing the format you want? If you have duplicates, you can always append the row id to this identifier so it will always be unique.
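A minimal sketch of that idea, using Python's built-in sqlite3 as a stand-in for MySQL (so AUTOINCREMENT replaces auto_increment). The table name, column names, and the business_id format are assumptions for illustration. The database hands out the id, the components are stored as ordinary columns, and the formatted identifier is derived from them. The example inserts sequentially, so it shows the schema idea rather than a concurrency test.

```python
import sqlite3

# SQLite stands in for MySQL; names and the ID format are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transactions (
        t_id       INTEGER PRIMARY KEY AUTOINCREMENT,  -- assigned by the database
        created_at TEXT NOT NULL,                      -- component: current time
        source     TEXT NOT NULL                       -- component: transaction source
    )
""")


def insert_transaction(created_at, source):
    """Insert a row and return the id the database generated for it."""
    cur = conn.execute(
        "INSERT INTO transactions (created_at, source) VALUES (?, ?)",
        (created_at, source),
    )
    return cur.lastrowid


def business_id(t_id):
    """Derive the formatted identifier from the stored components plus the row id."""
    created_at, source = conn.execute(
        "SELECT created_at, source FROM transactions WHERE t_id = ?", (t_id,)
    ).fetchone()
    return f"{created_at}-{source}-{t_id}"


# 3000 rows with an identical timestamp and source still get distinct ids.
ids = [insert_transaction("20240101120000", "WEB") for _ in range(3000)]
labels = {business_id(i) for i in ids}
print(len(set(ids)), len(labels))  # 3000 3000
```

Since the database-assigned id is part of the derived identifier, two rows can never collide even when the time and source are the same.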

Clustered index in django

I have two tables in my models:
1)Owner:
OwnerName
2)Car
OwnerKey
CarName
CarPrice
When I create a row in the Owner table, I also add the cars for that owner to the Car table, so all the cars for a particular owner are stored sequentially in the Car table. Now I want to ask whether I should use a clustered index or not. Once the cars for a particular owner are saved, no cars are added or deleted for that owner; only the price changes. What should I do for faster access? And how do I implement a clustered index via Django?
Your question is asking for information that requires studying the SQL, not just the Django code.
If you have only a thousand owners and cars, things will probably be fast enough. If you have a million owners or cars, you need indexes, not necessarily "clustered".
A "clustered" key is implemented in MySQL as a PRIMARY KEY. You can have only one per table, and its values must be unique.
In Django, do something like this to get a PK:
column1 = models.IntegerField(primary_key=True)
Please provide the table schema you already have. That is, go beyond Django and get SHOW CREATE TABLE. (What you have provided is too vague; hence my answer is too vague.)
References:
https://code.djangoproject.com/ticket/8316
Two primary keys specified in MySQL database -- It's really talking about having a "composite" PRIMARY KEY.
There are no "clustered" indexes other than the PRIMARY KEY. However, a secondary key can have similar performance if it is a "covering index". That term refers to an index that contains all the columns that are found in the SELECT. Also, the columns are ordered correctly.
Let's see your SELECT statements in order to better judge what is needed.
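As a rough illustration of a covering index, here is a sketch using Python's built-in sqlite3 module instead of MySQL (an assumption, chosen so the example runs standalone). The table and column names are guesses based on the question. The index contains every column the SELECT touches, with the filter column first, so the query can be answered from the index alone.

```python
import sqlite3

# SQLite stands in for MySQL; schema names are guesses based on the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (
        id        INTEGER PRIMARY KEY,
        owner_id  INTEGER NOT NULL,
        car_name  TEXT NOT NULL,
        car_price INTEGER NOT NULL
    );
    -- Filter column first, then the other columns the SELECT reads.
    CREATE INDEX idx_car_owner ON car (owner_id, car_name, car_price);
    INSERT INTO car (owner_id, car_name, car_price)
    VALUES (1, 'Civic', 9000), (1, 'Golf', 11000), (2, 'Polo', 7000);
""")

query = "SELECT car_name, car_price FROM car WHERE owner_id = ?"

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (1,)))
cars = conn.execute(query + " ORDER BY car_name", (1,)).fetchall()
print(plan)
print(cars)
```

One trade-off to keep in mind: because car_price is part of the index, every price update also has to update the index entry.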

Store Follows of users in a table

I've got the following situation: I want to store data that represents whether a user is following another user. Another table, which I cannot touch, stores the users, with the username as the primary key (unfortunately no id...).
Note that if one user follows another, it doesn't mean that the other one follows the first.
Right now, I have designed the table with two varchar(128) columns, which hold the usernames, and a UNIQUE INDEX on those two columns.
The problem is that I need to parse an old-style system now, and with about 15% done I already have 550k entries in this table.
The index is bigger than 16MB, while the data is just 14MB.
What could I do to store this data in a better way? As said, I cannot use ids instead of the usernames, because the user table uses the username as its primary key.
As you have noticed, creating a separate index on all columns essentially forces MySQL to duplicate all the data in the index.
Instead of creating a separate unique index, you can create a primary key consisting of both of your fields. MySQL (InnoDB) uses the primary key as the clustered index, so your uniqueness constraint is still enforced without a second copy of the data.
You might consider building your own mapping table that maps ID -> username.
You could then use the IDs to map the followers.
This adds some overhead when you want to retrieve all the data, because you have to join back to the mapping table to get the usernames.
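The composite primary key suggestion can be sketched like this. Python's built-in sqlite3 stands in for MySQL here, so the example only demonstrates the constraint behaviour (no duplicates, direction matters); it says nothing about InnoDB's storage layout or index size. The table and column names are assumptions.

```python
import sqlite3

# SQLite stands in for MySQL; names are hypothetical. This shows the
# uniqueness behaviour only, not InnoDB's clustered storage.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE follows (
        follower VARCHAR(128) NOT NULL,
        followee VARCHAR(128) NOT NULL,
        PRIMARY KEY (follower, followee)  -- replaces the separate UNIQUE INDEX
    )
""")


def follow(follower, followee):
    """Record that follower follows followee; return False if already recorded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO follows VALUES (?, ?)", (follower, followee))
        return True
    except sqlite3.IntegrityError:
        return False


first = follow("alice", "bob")      # new relationship
duplicate = follow("alice", "bob")  # rejected by the primary key
reverse = follow("bob", "alice")    # a different, independent relationship
print(first, duplicate, reverse)    # True False True
```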

MySQL deduce foreign key relationship for random queries

I am a MySQL novice and am looking for a solution to the following problem:
I would like to create a CMS with cppcms that supports modules. Since I want to reduce the chance of (accidental) access to private data, I want a module which handles data access and rights. Since this module is supposed to be unaware of data structures created by other modules, I would like it to deduce the data owner through foreign key relations. My idea would be to search for a path (over foreign keys) which links a row to a user id.
Sum up:
What I am trying to do
Taking a random query, determine the affected rows
for the affected rows determine a relationship/path (via foreign keys) to a user/userid (a column in an existing table)
return only the rows for which a relationship could be determined and a condition holds (e.g. the userid found in the related query matches a fixed user id, such as the user currently accessing the system)
(As far as I know foreign keys only enforce the existence of a key in another table, however the precondition I assume is, that every row is linked to a user over a path of foreign key relations)
My Problem/Question:
Is there an existing solution or a better approach to the problem? Prepared statements won't do the trick since I don't know all data structures/queries in advance.
How do I get the foreign key relations? Is there another way besides "SHOW CREATE TABLE" and then parsing the result string?
How can I determine the rows that would be affected, without modifying them? I would like to filter this set afterwards by determining if I can link it to the current user (not the MySQL user but the system user).
Could I try executing the query, then select the affected rows, and if I determine an access violation simply do a rollback? The problem with this: how do I apply the changes to only the subset of rows for which it is legal (e.g. I attempt to change 5 rows but may only change 2; how do I change only those 2)? One idea was to find a way to create a temporary table with the result set; this solution has several drawbacks: foreign key relations are not possible for temporary tables, so they are 'lost'.
P.S.: I am coding in C++, therefore I would prefer C++-compatible library recommendations, however I am open to other suggestions. While googling I stumbled over Doctrine and I am currently researching it.
P.P.S.: The database engine is InnoDB (it has to be, because of the foreign keys).
UPDATE: Explanation Attempt of Part 2:
I am trying to filter which columns of tables a user is allowed to see. To do so I would like to find a connection in the database over foreign keys (with foreign keys I ensure that I can get to all data over joins, and they are a hint as to which columns I have to join on). Since I plan on a more complex system (e.g. a forum) I don't want to join all data in a temporary table and run a user query on that. I would rather evaluate the user query and check for the result whether I can map it with a join to the user's id. For example I could use this to enforce that an edit button is only enabled for the posts created by the user. (I know there are easier ways to do this, but I basically want to allow programmers to write their own queries without giving them the chance to edit or view data that they are not allowed to see. My assumption is that the programmer is not an evildoer but simply forgets constraints, thus I want to enforce them in software.)
Getting here would be pretty good, but I have a little more complex need.
First a basic example. Let's say its like facebook and all the friends of a person are allowed to see his pictures.
pictures = id **userid** file (bool)visibleForFriends album
friendship = **userid1** **userid2**
users = userid
What I want to happen is:
1. Programmer input "SELECT * FROM pictures WHERE album=2"
2. System gets all matching records (e.g. set of ids)
3. System sees foreign key userid, tries to match current userid against the pictures userid, adds all matching to the returned result part
4. System notices special column visibleForFriends
5. System tries to determine all Friends (SELECT userid1 FROM friendship WHERE userid2=currentUserID join (have to read up on joins) SELECT userid2 FROM friendship WHERE userid1=currentUserID)
6. System adds all rows where visibleForFriends is true and pictures.userid=Result from 5.
While the friendship part is some extra code (I think doable once I have got started on the first bit), I still need to figure out how to automatically follow the foreign keys to see the connection. Ignoring the special friendship case, I would like the system to work on this as well:
pictures = id **albumid** file (bool)visibleForFriends album
albums = id **userid**
users = userid
Now the system should go pictures.albumid ==> albums.id -> albums.userid ==> users.userid.
I hope the examples clarified the question a bit. One problem is that in point one of the example (programmer query input) I don't want to let "DELETE *" take effect on anything not owned by the user. So I have to filter which rows to actually delete.
In response to part 1 of your question: provided the MySQL user you access the database with has access rights to information_schema, you can use the following query to list the existing foreign key relations within a specific database:
SELECT
TABLE_NAME,
COLUMN_NAME,
REFERENCED_TABLE_NAME,
REFERENCED_COLUMN_NAME
FROM
information_schema.KEY_COLUMN_USAGE
WHERE
TABLE_SCHEMA = 'dbname' AND REFERENCED_COLUMN_NAME IS NOT NULL;
I am slightly confused by part 2 and am unsure how to give an appropriate response to it. I hope you find the above query helpful in your project, though!
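Once you have the foreign key rows from that query, following them to a user table is a graph search. Here is a minimal sketch in Python (chosen for brevity; the same logic ports to C++). The fk_rows list is hard-coded to match the pictures/albums/users example from the question rather than read from a live database, and fk_path is a hypothetical helper name. It does a breadth-first search over the foreign key edges.

```python
from collections import deque

# Rows shaped like the KEY_COLUMN_USAGE query result:
# (TABLE_NAME, COLUMN_NAME, REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME).
# Hard-coded sample matching the question's example schema.
fk_rows = [
    ("pictures", "albumid", "albums", "id"),
    ("albums", "userid", "users", "userid"),
]


def fk_path(rows, start, target):
    """Return the shortest chain of foreign keys from start to target, or None."""
    graph = {}
    for table, column, ref_table, ref_column in rows:
        graph.setdefault(table, []).append((column, ref_table, ref_column))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == target:
            return path
        for column, ref_table, ref_column in graph.get(table, []):
            if ref_table not in seen:
                seen.add(ref_table)
                step = (table, column, ref_table, ref_column)
                queue.append((ref_table, path + [step]))
    return None


path = fk_path(fk_rows, "pictures", "users")
for table, column, ref_table, ref_column in path:
    print(f"{table}.{column} -> {ref_table}.{ref_column}")
```

Each step in the returned path corresponds to one JOIN condition, so the path can be turned into the join that links a pictures row to its owner.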
Is there an existing solution/Better approach to the problem?
Yes, I think so. You're describing a multi-tenant database. In a multi-tenant database in which the users share tables (also known as "shared everything"), each table should have a column for the user id. In effect, each row knows its owner.
This will vastly simplify your SQL, since you need no joins to determine who a row belongs to. It will probably speed up your SQL a lot, too.
This SO answer has a decent summary of the issues and alternatives.
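A rough sketch of the owner-column approach, using Python's built-in sqlite3 in place of MySQL (an assumption made so the example is self-contained). The schema follows the pictures example from the question, and the helper names are hypothetical. Every read and every delete carries the current user's id as an extra condition, so no joins are needed to decide ownership.

```python
import sqlite3

# SQLite stands in for MySQL; sample data and helper names are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pictures (
        id     INTEGER PRIMARY KEY,
        userid INTEGER NOT NULL,   -- owner column: each row knows its owner
        album  INTEGER NOT NULL,
        file   TEXT NOT NULL
    );
    INSERT INTO pictures VALUES
        (1, 10, 2, 'a.jpg'), (2, 20, 2, 'b.jpg'), (3, 10, 3, 'c.jpg');
""")


def pictures_in_album(user_id, album):
    """Return only the rows in the album that belong to user_id."""
    return conn.execute(
        "SELECT id, file FROM pictures WHERE album = ? AND userid = ? ORDER BY id",
        (album, user_id),
    ).fetchall()


def delete_album(user_id, album):
    """Delete the album's rows owned by user_id; other users' rows are untouched."""
    with conn:
        return conn.execute(
            "DELETE FROM pictures WHERE album = ? AND userid = ?",
            (album, user_id),
        ).rowcount


visible = pictures_in_album(10, 2)
deleted = delete_album(10, 2)
left = [r[0] for r in conn.execute("SELECT id FROM pictures ORDER BY id")]
print(visible, deleted, left)
```

User 20's picture in the same album survives the delete, which is the "only change the rows that are legal" behaviour the question asks about.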