MySQL to MongoDB conversion

How do I convert the following into a MongoDB query?
sets_progress = Photo.select('count(status) as count, status, photoset_id')
.where('photoset_id IN (?)', sets_tracked_array)
.group('photoset_id, status')

There is no one-to-one mapping of a SQL query to a NoSQL implementation. You'll need to precalculate your data to match the way you want to access that data.
If your data set is small enough, this query can be turned into a map-reduce job. More here: http://www.mongodb.org/display/DOCS/MapReduce
Here's a decent tutorial that takes a query that GROUPs and converts it to map-reduce: http://www.mongovue.com/2010/11/03/yet-another-mongodb-map-reduce-tutorial/
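For what it's worth, MongoDB's aggregation framework (added after map-reduce) can express this particular GROUP BY directly. A sketch in the mongo shell, assuming the photos live in a photos collection and setsTrackedArray holds the tracked photoset IDs (both names are assumptions, not from the original):
db.photos.aggregate([
  // WHERE photoset_id IN (?)
  { $match: { photoset_id: { $in: setsTrackedArray } } },
  // GROUP BY photoset_id, status, with a count per group
  { $group: {
      _id: { photoset_id: "$photoset_id", status: "$status" },
      count: { $sum: 1 }
  } }
]);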

Related

Knex.js Select Average and Round

I am switching an application from PHP/MySQL to Express and am using knex to connect to the MySQL database. In one of my queries I use a statement like this (I have shortened it for brevity):
SELECT ROUND(AVG(Q1),2) AS Q1 FROM reviews WHERE id=? AND active='1'
I am able to use ROUND if I use knex.raw, but I am wondering if there is a way to write this using the query builder. Using the query builder makes dealing with the output on the view side so much easier than trying to navigate the objects returned from the raw query.
Here is what I have so far in knex.
let id = req.params.id;
knex('reviews')
  // Can you wrap a ROUND around the average? Or do a ROUND at all?
  .avg('Q1 as Q1')
  .where('id', '=', id)
Thanks so much!
You can use raw inside select. In this case:
knex('reviews')
  .select(knex.raw('ROUND(AVG(Q1), 2) AS Q1'))
Check the knex documentation on raw queries for more examples and good practices when dealing with raw statements.
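Putting it together with the filters from the original SQL (a sketch; the active = '1' condition is taken from the question's query, and the knex builder is thenable like a promise):
let id = req.params.id;
knex('reviews')
  .select(knex.raw('ROUND(AVG(Q1), 2) AS Q1'))
  .where('id', id)
  .andWhere('active', '1')
  .then((rows) => {
    // rows[0].Q1 holds the rounded average, as in the raw SQL version
  });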

Multiple, unknown number of fields passed into a query

Is it possible to create a generic query that would work for different types of documents? For example, I have "cases" and "factories".
They have different sets of fields, e.g.:
{
  id: 'case_o1',
  name: 'Case numero uno',
  amount: 40
}
{
  id: 'factory_002',
  location: 'Venezuela',
  workers: 200,
  operating: true
}
Is it possible to create a generic query where I would pass the type of an entity (case or factory) and additional parameters and it would filter results based on those?
I could of course use a JavaScript view, but that doesn't allow me to filter by multiple fields. Let's say I want to fetch all factories located in Venezuela with a number of workers between 20 and 55.
I started with this, but then I got stuck:
select * from `mybucket` as entity
where position(meta(entity).id, $entity_type) == 0
How do I pass multiple predicates and have the query recognize them?
I can of course list fields like this:
where position(meta(entity).id, $entity_type) == 0
and entity.location == 'Venezuela'
and entity.workers > $workers_min
and entity.workers < $workers_max
but then:
1. I'm going to have to create a separate query for each entity.
2. Even then it won't solve my problem: I have no idea how to ignore predicates. What if next time $workers_min and $workers_max are not passed? Does that mean I have to create a query for every single predicate (column)?
For security reasons I cannot generate free-form queries and pass them to the Couchbase server; all the queries are already stored in the database, and our API just picks them out of a document and executes them.
I think it's possible to create a query that would "short-circuit" for args that are undefined (e.g. WHERE $location IS MISSING OR entity.location == $location, or something like that).
Is it possible at all to create a query that would be able to effectively filter and order a dataset based on arbitrary parameters? Or is there no way?
@Agzam: sorry, I was writing my comment when you posted that. Anyway: what you are asking for is possible using COALESCE in not-too-complex expressions, but it is a REALLY bad idea, because it will defeat most internal database optimizations, including the use of any existing index. So unless you are dealing with a relatively small database (and you are sure it will remain approximately the same size), I suggest you try a different approach. This is, in fact, the reason I implemented sqlapi.
If you need to have all queries stored in the database beforehand, it would probably be much better to sort the given arguments by name and precalculate and store a query for each possible combination.
You can do it by assigning a default value to the variable when it is not used. For instance, if $location is not used you can set it to -1 as the default value.
Then the where condition would be:
WHERE ($location=-1 OR entity.location = $location)
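Extending that idea to all of the optional predicates from the question, the stored query's WHERE clause could look like this (a sketch; it assumes the API always binds every named parameter, passing the -1 sentinel for any argument the caller omitted):
WHERE position(meta(entity).id, $entity_type) == 0
  AND ($location = -1 OR entity.location = $location)
  AND ($workers_min = -1 OR entity.workers > $workers_min)
  AND ($workers_max = -1 OR entity.workers < $workers_max)
As the other answer notes, though, OR-expressions like these tend to prevent index use, so this trades flexibility for scan performance.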

Django MySQL count distinct gives different result to Postgres

I'm trying to count distinct string values for a filtered set of results in a Django query against a MySQL database versus the same data in a Postgres database. However, I'm getting really confusing results.
In the code below, NewOrder represents queries against the data in the Postgres database, and OldOrder is the same data in a MySQL instance.
(In the old database, completed orders had status=1; in the new DB, the complete status is 'Complete'. In both, the 'email' field is the same.)
OldOrder.objects.filter(status=1).count()
6751
NewOrder.objects.filter(status='Complete').count()
6751
OldOrder.objects.filter(status=1).values('email').distinct().count()
3747
NewOrder.objects.filter(status='Complete').values('email').distinct().count()
3825
print NewOrder.objects.filter(status='Complete').values('email').distinct().query
SELECT DISTINCT "order_order"."email" FROM "order_order" WHERE "order_order"."status" = Complete
print OldOrder.objects.filter(status=1).values('email').distinct().query
SELECT DISTINCT "order_order"."email" FROM "order_order" WHERE "order_order"."status" = 1
And here is where it gets really bizarre:
new_orders = NewOrder.objects.filter(status='Complete').values_list('email', flat=True)
len(set(new_orders))
3825
old_orders = OldOrder.objects.filter(status=1).values_list('email',flat=True)
len(set(old_orders))
3825
Can anyone explain this discrepancy, and possibly point me to why the results would be different between Postgres and MySQL? My only guess is a character encoding issue, but then I'd expect the results of the Python set() to be different too.
Sounds like you're probably using a case-insensitive collation in MySQL. There's no equivalent in PostgreSQL; the closest is the citext data type, but usually you just compare lower(...) of strings, or use ILIKE for pattern matching.
I don't know how to say it in Django, but I'd see if the count of the set of distinct lowercased email addresses is the same as the old DB.
According to the Django docs, something like this might work (values() accepts expressions as keyword arguments in Django 1.11+, and Lower needs importing):
from django.db.models.functions import Lower
NewOrder.objects.filter(status='Complete').values(lower_email=Lower('email')).distinct().count()

How can I replicate MySQL's WHERE 1 query in MongoDB?

I can do this in MySQL:
WHERE 1 AND 1 AND 1
How can I do the same in MongoDB? What is MongoDB's equivalent of WHERE 1?
UPDATE:
So, I don't know how to choose the best answer ^^ and I've expanded the question. As @mark-hillick noticed, I'm looking for the best way to build a query.
Now I'm using this approach (Express + Mongoose):
// req.query - the GET/POST parameters in Express
var query = {};
for (var q in req.query) {
  if (req.query[q]) { // simplified example
    query[q] = req.query[q];
  }
}
Collection.find(query);
Your suggestions?
There is a SQL-to-MongoDB mapping chart in the MongoDB docs that you will find useful.
It has a ton of examples of what you do within MongoDB when you want to do the same operation as a WHERE in MySQL. For example:
SELECT a,b FROM users WHERE age=33
is
db.users.find({age:33}, {a:1,b:1})
or
SELECT * FROM users WHERE a=1 and b=1
is
db.users.find({a:1,b:1})
MongoDB is a document-oriented database, and documents in MongoDB consist of key-value pairs. So in MongoDB you can't run a single-value query like you did in MySQL. Assuming you hold your data in a field named a, a similar query in MongoDB could look like:
db.test.find({$and: [{a:1}, {a:1}, {a:1}]});
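That said, WHERE 1 is always true and simply matches every row, so the closest direct equivalent in MongoDB is an empty query document:
// matches every document in the collection, like SELECT ... WHERE 1
db.test.find({});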
If you're trying to build query clauses, AND is implicit in Mongo. Therefore, if you have the following:
db.col.find({name:"dave"})
you can just add another clause:
db.col.find({name:"dave", age:33})
and so on.
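Tying that back to the Express + Mongoose loop in the question: since AND is implicit, you can keep adding keys to a single query object. A sketch with an assumed whitelist (allowedFields is hypothetical, not from the original) so clients can't filter on arbitrary fields:
var allowedFields = ['name', 'age']; // hypothetical whitelist of queryable fields
var query = {};
allowedFields.forEach(function (field) {
  if (req.query[field]) { // skip missing/empty parameters
    query[field] = req.query[field];
  }
});
Collection.find(query);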

MySQL MATCH...AGAINST vs. simple LIKE "%term%"

What's wrong with:
$term = $_POST['search'];

function buildQuery($exploded, $count, $query)
{
    if (count($exploded) > $count) {
        $query .= ' AND column LIKE "%' . $exploded[$count] . '%"';
        return buildQuery($exploded, $count + 1, $query);
    }
    return $query;
}

$exploded = explode(' ', $term);
$query = buildQuery($exploded, 1,
    'SELECT * FROM table WHERE column LIKE "%' . $exploded[0] . '%"');
and then querying the DB to retrieve the results in a certain order, instead of using the MyISAM-only SQL MATCH...AGAINST?
Would it degrade performance dramatically?
The difference is in the algorithms that MySQL uses behind the scenes to find your data. Full-text searches also allow you to sort based on relevancy. A LIKE search will in most conditions do a full table scan, so depending on the amount of data, you could see performance issues with it. The full-text engine can also have performance issues when dealing with large row sets.
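For comparison, the MATCH...AGAINST form under discussion looks like this (a sketch reusing the placeholder table and column names from the question; it needs a FULLTEXT index, which on the MySQL versions discussed here means a MyISAM table):
ALTER TABLE `table` ADD FULLTEXT INDEX ft_column (`column`);

SELECT * FROM `table`
WHERE MATCH(`column`) AGAINST('term1 term2');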
On a different note, one thing I would add to this code is something to escape the exploded values, perhaps a call to mysql_real_escape_string().
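Inside buildQuery that would look something like this (a sketch; note that the mysql_* functions are deprecated in modern PHP in favor of parameterized queries via mysqli or PDO):
$query .= ' AND column LIKE "%' . mysql_real_escape_string($exploded[$count]) . '%"';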
You can check out the presentation I did for MySQL University:
http://forge.mysql.com/wiki/Practical_Full-Text_Search_in_MySQL
Slides are also here:
http://www.slideshare.net/billkarwin/practical-full-text-search-with-my-sql
In my test, using LIKE '%pattern%' was more than 300x slower than using a MySQL FULLTEXT index. My test data was 1.5 million posts from the StackOverflow October data dump.