I have the following problem with my Couchbase design: a view query returns 40000 records. I need to apply an additional filter to them and get the "top 100, order by" result.
I currently do this filtering and sorting in my application code, which means I must fetch 40000 records from Couchbase (and this is time-consuming). Is there a way to execute the filter/sort on the Couchbase node, without fetching all 40000 records to the app server?
My data is tourist tours, e.g.:
{
"OT": "tour",
"dd": 20140720,
"city": 1206,
"hotel": 9656,
"stars": 2,
"resort": 23415,
"country": 34,
"price": 24139,
"priceType": 1,
"tickets": "QQYY",
"nights": 5,
"food": 4,
"oper": 18,
"adult": 1,
"ch": 0,
"ch1": 0,
"ch2": 0,
"ch3": 0,
"avail": 1,
"stop": "Q"
}
and I need to select the top 10 cheapest tours from London to Turkey between 20140622 and 20140710 in 4- or 5-star hotels.
My view looks like:
function (doc, meta) {
  if (meta.type === 'json' && doc.OT === 'tour') {
    // Key with the fixed prefix "A"
    emit(["A", doc.country, doc.city, doc.adult, doc.ch, doc.priceType, doc.dd, doc.price], null);
    // Key prefixed with the resort id
    emit(["R" + doc.resort, doc.country, doc.city, doc.adult, doc.ch, doc.priceType, doc.dd, doc.price], null);
  }
}
RESORT+COUNTRY+DEPARTURE_CITY+ADULT_COUNT+CHILD_COUNT+PRICE_TYPE+DEPARTURE_DATE lets me select ~40000 records out of 3000000+ (for a DEPARTURE_DATE range), but sometimes (depending on user input) I still need to filter them by stars (stars IN (4,5), for example). Also, the view sorts by date and then price, which isn't what I need. For example, I get:
20140101 110$ <- top but not cheapest
20140101 120$
20140102 100$ <- cheapest but not top
20140102 105$
On the other hand, sometimes I also need to fetch the N cheapest tours where DepartureDate BETWEEN X AND Y and Price BETWEEN A AND Z.
All those scenarios require additional filters applied to a HUGE dataset (even a highly selective view like the one shown above still produces a HUGE dataset in my case), and I do not want to fetch the whole dataset to the client (app server) for such processing. I realize that processing on the Couchbase nodes will consume more CPU on them, but I would rather add more Couchbase nodes to the cluster.
Anyway, someone needs to do this filtering work, and I believe it is much more efficient to do it where the data actually lives, without additional network overhead.
Dimzon,
You can sort on the value of the key emitted by the view using the descending parameter. There is also a limit parameter that will restrict the number of responses you obtain per query.
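As a rough sketch of how those parameters fit into a view query, the snippet below builds a request URL for the view REST endpoint (port 8092). The host, bucket, design document and view names are hypothetical placeholders, and the key values are simply taken from the sample document above. The `{}` at the end of the upper bound is the usual trick for "anything after this prefix", since objects collate last.

```python
import json
from urllib.parse import urlencode


def build_view_url(host, bucket, design_doc, view, low_key, high_key, limit, descending=False):
    """Build a view query URL; low_key/high_key are the bounds of the key range."""
    # With descending=true the index is walked from high keys to low keys,
    # so the start and end of the range have to be swapped.
    start, end = (high_key, low_key) if descending else (low_key, high_key)
    params = {
        # Keys are passed as JSON-encoded values.
        "startkey": json.dumps(start, separators=(",", ":")),
        "endkey": json.dumps(end, separators=(",", ":")),
        "limit": limit,
    }
    if descending:
        params["descending"] = "true"
    return f"http://{host}:8092/{bucket}/_design/{design_doc}/_view/{view}?{urlencode(params)}"


# Hypothetical names; the key layout follows the view shown in the question.
prefix = ["A", 34, 1206, 1, 0, 1]
url = build_view_url(
    "localhost", "tours", "tours", "by_route",
    low_key=prefix + [20140622],
    high_key=prefix + [20140710, {}],
    limit=10,
)
print(url)
```

Note that `limit` cuts the result in index order, so with a key that ends in date and then price it returns the first rows by date, not necessarily the cheapest ones.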
Anon,
Andrew
Related
An SQL table schema:
time, country, active_users
If I just want to show the total number of active users over time, the simple select below will do that:
SELECT time, sum(active_users) AS active_users FROM your_table GROUP BY time ORDER BY time
returned data will be like,
[{
"time": 1585878969,
"active_users": 2300
},....]
If I want active_users over time by country, then:
SELECT time, country, sum(active_users) AS active_users FROM your_table GROUP BY time, country ORDER BY time, country
returned data will be like,
[{
"time": 1585878969,
"active_users": 1300,
"country": "India"
}, {
"time": 1585878969,
"active_users": 1000,
"country": "China"
}....]
I want data in the below format,
[{
"time": 1585878969,
"India": 1300,
"China": 1000
}....]
Is it possible to create dynamic columns from the values of one field, with their values taken from another field?
If such a thing is possible, what should the query be?
Other helpful users may correct me, but I don't think it is possible to alter MySQL responses like this. MySQL always responds in a column-value way, so you would have to create a column, e.g. "China", and store this data in it to get a native response like this.
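If the reshaping is done in application code instead, a minimal sketch could look like the following. It assumes the rows arrive as dictionaries shaped like the second query's output; the function name `pivot_by_time` is made up for this example.

```python
from collections import defaultdict


def pivot_by_time(rows):
    """Turn (time, country, active_users) rows into one dict per time, keyed by country."""
    grouped = defaultdict(dict)
    for row in rows:
        per_time = grouped[row["time"]]
        # Sum in case the same (time, country) pair appears more than once.
        per_time[row["country"]] = per_time.get(row["country"], 0) + row["active_users"]
    return [{"time": t, **countries} for t, countries in sorted(grouped.items())]


rows = [
    {"time": 1585878969, "active_users": 1300, "country": "India"},
    {"time": 1585878969, "active_users": 1000, "country": "China"},
]
print(pivot_by_time(rows))
# [{'time': 1585878969, 'India': 1300, 'China': 1000}]
```

This keeps the query itself simple and lets the set of countries vary freely, at the cost of doing the pivot outside the database.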
My question is about creating a proper schema or way of storing some data I will be collecting.
My app runs Laravel 6.
So I have a number of 'campaigns', an example of which is like this:
{
"campaign_name": "Campaign 1",
"keywords": ["keyword 1", "keyword 2", "keyword 3"], // there may be hundreds of keywords
"urls": ["google.com", "bing.com", "example.com"], // there may be many urls
"business_names": ["Google", "Bing", "Example"], // there may be many business_names
"locations": [
{
"address": "location 1", //this is a postal address
"lat": "-37.8183",
"lng": "144.957"
},
{
"address": "location 2", //this is a postal address
"lat": "-37.7861",
"lng": "145.312"
}
// there may be 50-100 locations.
]
}
Each url (and each business name) will get matched up with each keyword along with each location.
ie:
google.com
- keyword 1 location 1
- keyword 1 location 2
- keyword 1 location 3
- keyword 2 location 1
- keyword 2 location 2
// etc etc. there may be hundreds of keywords and hundreds of locations.
bing.com
- keyword 1 location 1
- keyword 1 location 2
// etc etc as above.
Each of these combinations will have time series data points that I want to store and ultimately query.
I can see how a number of tables could be set up to handle this, but is there a way to simplify it slightly by storing some JSON?
Most of my migrations on projects have been pretty simple with just a single relation but this is a bit harder for me.
Any help is appreciated. I would ideally like to avoid a number of tables and complex pivots or associations if possible (understanding the benefits of normalization...)
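To make the size of the problem concrete, here is a small sketch of how the combinations described above multiply out. The campaign is trimmed down to a few entries, and the idea of using one (url, keyword, location) tuple as the identity of a series is only an assumption for illustration.

```python
from itertools import product

# A trimmed-down campaign; the real one may have hundreds of entries per list.
campaign = {
    "urls": ["google.com", "bing.com"],
    "keywords": ["keyword 1", "keyword 2"],
    "locations": ["location 1", "location 2", "location 3"],
}


def series_keys(campaign):
    """Return one (url, keyword, location) tuple per tracked time series."""
    return list(product(campaign["urls"], campaign["keywords"], campaign["locations"]))


series = series_keys(campaign)
print(len(series))  # 2 urls * 2 keywords * 3 locations = 12
print(series[0])    # ('google.com', 'keyword 1', 'location 1')
```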
I'm fairly new to couchbase and have tried to find the answer to a particular query I'm trying to create with not much success so far.
I've debated between using a view or N1QL for this particular case and settled with N1QL but haven't managed to get it to work so maybe a view is better after all.
Basically I have the document key (Group_1) for the following document:
Group_1
{
"cbType": "group",
"ID": 1,
"Name": "Group Atlas 3",
"StoreList": [
2,
4,
6
]
}
I also have 'store' documents whose keys correspond to the entries in this document's StoreList (Store_2, Store_4 and Store_6, with storeID values of 2, 4 and 6). I basically want to obtain all 3 documents listed.
What I do have that works is I obtain this document with its id by doing:
var result = CouchbaseManager.Bucket.Get<dynamic>(couchbaseKey);
mygroup = JsonConvert.DeserializeObject<Group> (result.ToString());
I can then loop through its StoreList and obtain all its stores in the same manner, but I don't need anything else from the group. All I want are the stores, and I would have preferred to do this in a single operation.
Does anyone know how to run a N1QL query directly against a specified document value?
Something like this (this is totally imaginary, non-working code; I'm just trying to clearly illustrate what I'm getting at):
SELECT * FROM mycouchbase WHERE documentkey IN
Group_1.StoreList
Thanks
UPDATE:
So Nic's solution does not work;
This is the closest I get to what I need atm:
SELECT b from DataBoard c USE KEYS ["Group_X"] UNNEST c.StoreList b;
"results":[{"b":2},{"b":4},{"b":6}]
Which returns the list of IDs of the Stores I want for any given group (Group_X) - I haven't found a way to get the full Stores instead of just the ID in the same statement yet.
Once I have, I'll post the full solution as well as all the speed bumps I've encountered in the process.
I apologize if I have a misunderstanding of your question, but I'm going to give it my best shot. If I misunderstood, please let me know and we'll work from there.
Let's use the following scenario:
group_1
{
"cbType": "group",
"ID": 1,
"Name": "Group Atlas 3",
"StoreList": [
2,
4,
6
]
}
store_2
{
"cbType": "store",
"ID": 2,
"name": "some store name"
}
store_4
{
"cbType": "store",
"ID": 4,
"name": "another store name"
}
store_6
{
"cbType": "store",
"ID": 6,
"name": "last store name"
}
Now let's say you want to query the stores from a particular group (group_1), but include no other information about the group. You essentially want to use N1QL's UNNEST and JOIN operators.
This might leave you with a query like so:
SELECT
stores.name
FROM `bucket-name-here` AS groups
UNNEST groups.StoreList AS groupstore
JOIN `bucket-name-here` AS stores ON KEYS ("store_" || TO_STRING(groupstore))
WHERE
META(groups).id = 'group_1';
A few assumptions are made in this. Both your documents exist in the same bucket and you only want to select from group_1. Of course you could use a LIKE and switch the group id to a percent wildcard.
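To illustrate what the UNNEST plus key-based JOIN is doing conceptually, here is a small in-memory imitation in Python. It is not Couchbase code: the bucket is just a dictionary keyed by document id, and the number from StoreList is converted to a string before the store key is built.

```python
# The bucket modelled as a plain dict: document key -> document body.
bucket = {
    "group_1": {"cbType": "group", "ID": 1, "Name": "Group Atlas 3", "StoreList": [2, 4, 6]},
    "store_2": {"cbType": "store", "ID": 2, "name": "some store name"},
    "store_4": {"cbType": "store", "ID": 4, "name": "another store name"},
    "store_6": {"cbType": "store", "ID": 6, "name": "last store name"},
}


def store_names_for_group(bucket, group_key, prefix="store_"):
    """Return the names of all stores referenced by one group document."""
    group = bucket[group_key]                    # select the group by its document key
    names = []
    for store_id in group["StoreList"]:          # UNNEST: one row per array element
        store = bucket.get(prefix + str(store_id))  # JOIN ON KEYS: direct key lookup
        if store is not None:                    # an inner join drops missing documents
            names.append(store["name"])
    return names


print(store_names_for_group(bucket, "group_1"))
# ['some store name', 'another store name', 'last store name']
```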
Let me know if something doesn't make sense.
Best,
Try this query:
select b.Name
from bucketname a join bucketname b ON KEYS a.StoreList
where a.Name="Group Atlas 3"
Based on your update, you can do the following:
SELECT b, s
FROM DataBoard c USE KEYS ["Group_X"]
UNNEST c.StoreList b
JOIN store_bucket s ON KEYS "Store_" || TO_STRING(b);
I have a similar requirement and I got what I needed with a query like this:
SELECT store
FROM `bucket-name-here` grp
JOIN `bucket-name-here` store ON KEYS grp.StoreList
WHERE grp.cbType = 'group'
AND grp.ID = 1
I want to create a table stops for all stops with these columns: id, stop_name, stop_lat, stop_long, route, arrivaltime. But I don't know how to store arrivaltime in the table, since this column is a big array.
Like this:
{
"id": 1,
"stops_name": "Amersham ",
"arrival_time": {
"mon-fri": [ "05:38", "06:07","06:37",.....50 entries],
"sat": ["05:34","06:01","06:31",...........50 entries],
"son": ["06:02","06:34","07:04",...........50 entries]
},
"stops_lat": 83.837994,
"stops_long": 18.700423
}
Can this be managed with MySQL?
Generally speaking you would split the "arrival times" out into a new table, referencing back to the table of stops. You would also generally store each time as a single row, and then select the entire collection of rows.
This works best because it lets you query on the 'time' column and search for time ranges, etc and only get the relevant rows.
For the "day", I would most likely use a SET column, which can hold one or more values. Also consider that you will likely need to store info on public holidays or other special dates as well:
https://dev.mysql.com/doc/refman/5.6/en/set.html
Stops: id, stops_name, stops_lat, stops_long (1, "Amersham", 83.837994, 18.700423)
Stops_arrivals: id, stops_id, day, time (1, 1, "Mon", "05:38"), (2, 1, "Mon", "06:07"), etc
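The two-table layout above can be tried out quickly with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for MySQL here, the day is stored as plain text rather than a SET column, and the helper name `arrivals_between` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stops (
        id INTEGER PRIMARY KEY,
        stops_name TEXT,
        stops_lat REAL,
        stops_long REAL
    );
    CREATE TABLE stops_arrivals (
        id INTEGER PRIMARY KEY,
        stops_id INTEGER REFERENCES stops(id),
        day TEXT,
        time TEXT
    );
""")
conn.execute("INSERT INTO stops VALUES (1, 'Amersham', 83.837994, 18.700423)")
# One row per arrival time instead of one big array per stop.
conn.executemany(
    "INSERT INTO stops_arrivals (stops_id, day, time) VALUES (?, ?, ?)",
    [(1, "Mon", "05:38"), (1, "Mon", "06:07"), (1, "Mon", "06:37"), (1, "Sat", "05:34")],
)


def arrivals_between(conn, stop_id, day, start, end):
    """Return arrival times for one stop and day within [start, end]."""
    rows = conn.execute(
        "SELECT time FROM stops_arrivals "
        "WHERE stops_id = ? AND day = ? AND time BETWEEN ? AND ? ORDER BY time",
        (stop_id, day, start, end),
    )
    return [row[0] for row in rows]


print(arrivals_between(conn, 1, "Mon", "06:00", "07:00"))
# ['06:07', '06:37']
```

Zero-padded "HH:MM" strings sort correctly as text, which is why the range comparison works in this sketch; in MySQL a TIME column would be the more natural choice.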
I am trying to learn MongoDB. Suppose there are two tables and they are related, for example like this:
1st table has
First name- Fred, last name- Zhang, age- 20, id- s1234
2nd table has
id- s1234, course- COSC2406, semester- 1
id- s1234, course- COSC1127, semester- 1
id- s1234, course- COSC2110, semester- 1
How do I insert this data into MongoDB? I wrote it like this, but I'm not sure whether it is correct:
db.users.insert({
given_name: 'Fred',
family_name: 'Zhang',
Age: 20,
student_number: 's1234',
Course: ['COSC2406', 'COSC1127', 'COSC2110'],
Semester: 1
});
Thank you in advance
This assumes that what you want to model uses "student_number" and "Semester" together as what is basically a unique identifier for the entries. But there is a way to do this without accumulating the array contents in code.
You can make use of the upsert functionality in the .update() method, with the help of a few other operators in the statement.
I am going to assume you are doing this inside a loop of sorts, so everything on the right-hand side is actually a variable:
db.users.update(
{
"student_number": student_number,
"Semester": semester
},
{
"$setOnInsert": {
"given_name": given_name,
"family_name": family_name,
"Age": age
},
"$addToSet": { "courses": course }
},
{ "upsert": true }
)
What this does in an "upsert" operation is first looks for a document that may exist in your collection that matches the query criteria given. In this case a "student_number" with the current "Semester" value.
When that match is found, the document is merely "updated". So what is being done here is using the $addToSet operator in order to "update" only unique values into the "courses" array element. This would seem to make sense to have unique courses but if that is not your case then of course you can simply use the $push operator instead. So that is the operation you want to happen every time, whether the document was "matched" or not.
In the case where no "matching" document is found, a new document will then be inserted into the collection. This is where the $setOnInsert operator comes in.
So the point of that section is that it will only be called when a new document is created as there is no need to update those fields with the same information every time. In addition to this, the fields you specified in the query criteria have explicit values, so the behavior of the "upsert" is to automatically create those fields with those values in the newly created document.
After a new document is created, then the next "upsert" statement that uses the same criteria will of course only "update" the now existing document, and as such only your new course information would be added.
Overall working like this allows you to "pre-join" the two tables from your source with an appropriate query. Then you are just looping the results without needing to write code for trying to group the correct entries together and simply letting MongoDB do the accumulation work for you.
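The behaviour described above can be imitated in plain Python to see what ends up in the collection. This is not MongoDB driver code: the collection is just a list of dictionaries, and `upsert_course` is a hypothetical helper that mimics the match, `$setOnInsert` and `$addToSet` steps.

```python
def upsert_course(collection, student_number, semester, details, course):
    """Imitate an upsert with $setOnInsert (details) and $addToSet (course)."""
    for doc in collection:
        if doc["student_number"] == student_number and doc["Semester"] == semester:
            break  # a matching document exists, so it will only be updated
    else:
        # No match: create the document from the query fields plus the
        # insert-only details, as the upsert would.
        doc = {"student_number": student_number, "Semester": semester, **details, "courses": []}
        collection.append(doc)
    # $addToSet semantics: add the course only if it is not already present.
    if course not in doc["courses"]:
        doc["courses"].append(course)
    return doc


users = []
details = {"given_name": "Fred", "family_name": "Zhang", "Age": 20}
# Rows as they might come out of the "pre-joined" source query; the last one is a repeat.
source_rows = [
    ("s1234", "COSC2406", 1),
    ("s1234", "COSC1127", 1),
    ("s1234", "COSC2110", 1),
    ("s1234", "COSC2406", 1),
]
for student_number, course, semester in source_rows:
    upsert_course(users, student_number, semester, details, course)

print(len(users))           # 1
print(users[0]["courses"])  # ['COSC2406', 'COSC1127', 'COSC2110']
```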
Of course you can always just write the code to do this yourself and it would result in fewer "trips" to the database in order to insert your already accumulated records if that would suit your needs.
As a final note, though it does require some additional complexity, you can get better performance out of the operation as shown by using the newly introduced "batch updates" functionality. For this, your MongoDB server version will need to be 2.6 or higher. But that is one way of still reducing the logic while maintaining fewer actual "over the wire" writes to the database.
You can either have two separate collections, one with student details and the other with courses, and link them by "id".
Or you can have a single document with the courses as an inner array of documents, as below:
{
"FirstName": "Fred",
"LastName": "Zhang",
"age": 20,
"id": "s1234",
"Courses": [
{
"courseId": "COSC2406",
"semester": 1
},
{
"courseId": "COSC1127",
"semester": 1
},
{
"courseId": "COSC2110",
"semester": 1
},
{
"courseId": "COSC2110",
"semester": 2
}
]
}
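With the embedded layout, reading a student's courses for one semester is a simple filter over the inner array. A minimal sketch in plain Python, with the document held as a dictionary and a made-up helper name:

```python
student = {
    "FirstName": "Fred",
    "LastName": "Zhang",
    "age": 20,
    "id": "s1234",
    "Courses": [
        {"courseId": "COSC2406", "semester": 1},
        {"courseId": "COSC1127", "semester": 1},
        {"courseId": "COSC2110", "semester": 1},
        {"courseId": "COSC2110", "semester": 2},
    ],
}


def courses_in_semester(student, semester):
    """Return the course ids the student takes in the given semester."""
    return [c["courseId"] for c in student["Courses"] if c["semester"] == semester]


print(courses_in_semester(student, 1))  # ['COSC2406', 'COSC1127', 'COSC2110']
print(courses_in_semester(student, 2))  # ['COSC2110']
```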