ThinkingSphinx: When using search_for_ids how to get the associated sphinx weights? - thinking-sphinx

Is it possible when using search_for_ids to also get the associated sphinx weights?

Answered by Pat on GitHub:
It is indeed possible, albeit a little less elegant than usual:
ThinkingSphinx.search_for_ids('ruby', :select => '*, weight()').raw
This will return an array of hashes with Sphinx's raw results.
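Since .raw returns plain hashes rather than model instances, you can pull the ids and weights out yourself. A minimal sketch in Ruby (the hash keys 'sphinx_internal_id' and 'weight()' are assumptions — inspect one raw row to confirm the exact keys for your index):

```ruby
# Example raw rows as .raw might return them (keys are assumptions).
raw = [
  { 'sphinx_internal_id' => 12, 'weight()' => 2500 },
  { 'sphinx_internal_id' => 7,  'weight()' => 1800 }
]

# Build an id => weight lookup from the raw rows.
weights_by_id = raw.each_with_object({}) do |row, acc|
  acc[row['sphinx_internal_id']] = row['weight()']
end

ids = weights_by_id.keys  # the same ids search_for_ids would give you
```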

Related

Why do I get a NullReferenceException when using ToDictionary on an Entity Framework query?

I'm very surprised: it seems my lambda expressions are executed as C# code instead of being converted to SQL.
If that is really the case, it's a bit sad. For example:
context.Set<Post>().ToDictionary(post => post.Id, post => post.Comments.Count())
This code will apparently load the posts into C# objects first and then count the comments in memory. I came to that conclusion because, in a similar piece of real-world code, I was getting a NullReferenceException because post.Comments was null (note that in my code, the posts had been loaded without the Comments relation just before this line of code executed).
Using this instead would then be much more efficient:
context.Set<Post>()
.Select(post => new { Key = post.Id, Value = post.Comments.Count() })
.ToDictionary(entry => entry.Key, entry => entry.Value)
Since I believe this code is generic enough to work in any situation, I wonder:
Am I understanding correctly what is happening?
Why hasn't this been implemented as a generic solution for ToDictionary, as it has been for ToArray and ToList?
There is no Queryable.ToDictionary method, so ToDictionary takes context.Set<Post>() as an IEnumerable. That means that, as you correctly understood, context.Set<Post>() is first evaluated and then processed in memory.
That's highly inefficient, because then the comments for each Post are loaded by a separate query if lazy loading is enabled; otherwise Post.Comments is null.
So projecting to an anonymous type is the only option to do this efficiently.
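The project-then-materialize shape can be illustrated in Ruby (this only mirrors the shape of the fix; it cannot show the SQL translation that makes the C# version efficient — the Post struct and sample data are made up):

```ruby
# A made-up Post with an id and a list of comments.
Post = Struct.new(:id, :comments)
posts = [Post.new(1, %w[a b]), Post.new(2, %w[c])]

# Project each post to a [key, value] pair first, then materialize the
# dictionary - the same shape as Select(...).ToDictionary(...).
counts_by_id = posts.map { |post| [post.id, post.comments.size] }.to_h
```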

Using a cts query to retrieve the collections associated with a given document URI - MarkLogic

I need to retrieve the collections to which a given document belongs in MarkLogic.
I know an xdmp function does that, but I need to use it inside a cts query to retrieve the data and then filter records from it.
xdmp:document-get-collections("uri of document") can't be run inside a cts query to give the appropriate data.
Any idea how this can be done using a cts query?
Thanks
A few options come to mind:
Option One: Use cts:values()
cts:values(cts:collection-reference())
If you check out the documentation, you will see that you can also restrict this to certain fragments by passing a query as one of the parameters.
Update (11-10-2017):
The comment attached to this asked for a sample of restricting the results of cts:values() to a single document (for practical purposes, I will say fragment == document).
The documentation for cts:values explains this. It is the 4th parameter - a query to restrict the results. Get to know this pattern, as it is part of many MarkLogic features. It is your friend. The query I would use for this problem statement would be a cts:document-query().
An Example:
cts:values(
  cts:collection-reference(),
  (),
  (),
  cts:document-query('/path/to/my/document')
)
Full Example:
cts:search(
  collection(),
  cts:collection-query(
    cts:values(
      cts:collection-reference(),
      (),
      (),
      cts:document-query('/path/to/my/document')
    )
  )
)[1 to 10]
Option Two: Use cts:collection-match()
If you need more control over returning just some of the collections from a document, use cts:collection-match(). Like the first option, you can restrict the results to just some fragments. However, it has the added benefit of accepting a pattern.
Attention:
Both return a sequence - perfect for feeding into other parts of your query. However, under the hood, I believe they work differently. The second option is run against a lexicon: the larger the list of unique collection names and the more complex your pattern match, the longer resolution takes. I use collection-match in projects, but usually only when I can limit the possible choices by restricting the results to a smaller number of documents.
You can't do this in a single step. You have to run code first to retrieve collections associated with a document. You can use something like xdmp:document-get-collections for that. You then have to feed that into a cts query that you build dynamically:
let $doc-collections := xdmp:document-get-collections($doc-uri)
return
  cts:search(collection(), cts:collection-query($doc-collections))[1 to 10]
HTH!
Are you looking for cts:collection-query()?
Insert two XML files to the same collection:
xquery version "1.0-ml";
xdmp:document-insert("/a.xml", <root><sub1><a>aaa</a></sub1></root>,
  map:map() => map:with("collections", ("coll1")));
xdmp:document-insert("/b.xml", <root><sub2><a>aaa</a></sub2></root>,
  map:map() => map:with("collections", ("coll1")));
Search the collection:
xquery version "1.0-ml";
let $myColl := xdmp:document-get-collections("/a.xml")
return
  cts:search(/root,
    cts:and-query((
      cts:collection-query($myColl),
      cts:element-query(xs:QName("a"), "aaa")
    )))

Which way is more optimal for an Active Record query?

I am working on a website that uses a MySQL database, and I want to know which of these two forms is better.
Using the object to create other queries:
@inform = Dailyinform.where(:scheduled_start => (@@datestart..@@dateend)).order("scheduled_start DESC").searchServer(@@serverSearch).searchDomain(@@domain_name)
@dailyInforms = @inform.where.not(:status => 'COMPLETED').searchFailed.page(params[:page]).per_page(5000)
@restarts = @inform.searchRelaunched.order("scheduled_start DESC").page(params[:page]).per_page(5000)
or directly with the Active Record model base:
@dailyInforms = Dailyinform.where.not(:status => 'COMPLETED').where(:scheduled_start => (@@datestart..@@dateend)).order("scheduled_start DESC").searchFailed(@@serverSearch).page(params[:page]).per_page(5000)
@restarts = Dailyinform.where(:scheduled_start => (@@datestart..@@dateend)).order("scheduled_start DESC").search2(@@serverSearch).page(params[:page]).per_page(5000)
My question arose because I felt that the second way is faster than the first. Maybe it's just my impression! Thanks
When it comes to performance, I don't think there will be any difference. Active Record is good at being lazy when it comes to making queries; it won't perform any until you actually try to use the result.
For this reason I would say that the first method is preferable since it is more DRY.
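That laziness can be sketched with a toy relation class (not real Rails code; the class and condition names are made up). Chaining only accumulates a description of the query, so deriving two relations from a shared base duplicates no database work:

```ruby
# A toy stand-in for an Active Record relation. Nothing "runs" until
# the results are actually asked for.
class FakeRelation
  attr_reader :conditions

  def initialize(conditions = [])
    @conditions = conditions
  end

  # Each chained call returns a NEW lazy relation; the base is untouched.
  def where(condition)
    FakeRelation.new(@conditions + [condition])
  end

  # Only here would a real relation hit the database.
  def to_a
    @conditions
  end
end

base     = FakeRelation.new.where(:date_range)
failed   = base.where(:not_completed)  # no query executed yet
restarts = base.where(:relaunched)     # still no query
```

Each derived relation only "executes" when its results are used, so sharing the base relation (the first style in the question) costs nothing extra.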

MongoDB translation for SQL INSERT ... SELECT

How can I simply duplicate documents from collectionABC that match a condition like {conditionB: 1}, add a timestamp field like ts_imported, and copy them into collectionB - without knowing the details contained within the original documents?
I could not find a simple MongoDB equivalent of MySQL's INSERT ... SELECT ....
You can use JavaScript from the mongo shell to achieve a similar result:
db.collectionABC.find({ conditionB: 1 }).forEach(function (i) {
  i.ts_imported = new Date();
  db.collectionB.insert(i);
});
I realise that this is an old question, but there is a better way of doing it now. MongoDB has something called the aggregation pipeline (v3.6 and above, maybe some older versions too - I haven't checked). The aggregation pipeline allows you to do more complex things, like perform joins, add fields, and save documents into a different collection. For the OP's case, the pipeline would look like this:
var pipeline = [
  { $match: { conditionB: 1 } },
  { $addFields: { ts_imported: ISODate() } },
  { $out: 'collectionB' }
];

// now run the pipeline
db.collectionABC.aggregate(pipeline)
Relevant docs:
Aggregation pipeline
$out stage
some important limits
MongoDB does not have that kind of querying ability, whereby you can (inside the query) insert into another collection based upon variables from the first collection.
You will need to pull the document out first and then operate on it.
You could technically use a map-reduce (MR) job for this, but I have a feeling it will not work for your scenario.
It seems that this works in the context of sequence generation
http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/

Why can't we use :key => value in CodeIgniter's DB library?

I used to use something like this with plain PHP PDO:
$data = array(
  ":name" => "james",
  ":location" => "Palo Alto, CA"
);
SQL:
SELECT * FROM people WHERE name LIKE :name and location = :location
When I started using CodeIgniter, it wouldn't let me use named placeholders anymore; it only accepts the traditional ? marks.
Any way to fix that?
Unfortunately, no, there isn't. Not with CodeIgniter natively.
It helps to remember that CodeIgniter's roots are in PHP 4-compliant code (and some of what they did is not even the most recent PHP 4 - they use a custom file-searching system which is substantially slower than glob, which was around by PHP 4.3 (4.4? It was around for the minimum required version)). This means that the old '?' was really the best option at the time.
If you feel better about using the newer style, then you might be better off using the PDO classes directly. They're better and faster anyway. (Frankly, I only use CI's DB classes for compliance; I have a very strong preference for PDO, especially since all of the modern frameworks seem to use it.) I will warn you, though, that using PDO completely bypasses the Active Record layer offered by CodeIgniter: you will not be able to use $this->db->select()->from('table')->where($array)->limit(1,4);. More importantly, you will need to know the differences between the different dialects of SQL, something CodeIgniter lets you avoid (and your code won't be DB-agnostic anymore).
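For readers unfamiliar with the named-placeholder style, here is a toy illustration in Ruby of the substitution that named binding performs (real drivers also escape and quote values safely; this naive sketch does not, and the function name is made up):

```ruby
# Naively substitute :name-style placeholders from a params hash.
# Placeholders with no matching key are left untouched.
def bind_named(sql, params)
  sql.gsub(/:(\w+)/) { |match| params.fetch(":#{$1}") { match } }
end

sql   = "SELECT * FROM people WHERE name LIKE :name AND location = :location"
bound = bind_named(sql, ':name' => "'james'", ':location' => "'Palo Alto, CA'")
```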
Maybe you will be more comfortable using Active Record in CodeIgniter and doing something like
$this->db->like();
Look here: http://codeigniter.com/user_guide/database/active_record.html