Report reindex taking too long after destroy - MySQL

I have a product report that lists all registered products.
When I destroy (delete) one of the items from the product list, I need the item to be removed from the report list.
I use Sunspot Solr with MySQL.
I tried the following way:
after_destroy { ProductsReport.reindex; Sunspot.commit }
But because of my gigantic list of products it takes too long to execute.
Is there a simpler or better-performing way to do it?
By the way, my system is built in Ruby on Rails.
Thanks in advance.

You may very well be able to optimize this operation, but the details of how to do it depend on your data model and your Solr setup. I also question whether a full reindex is needed on each delete. Can you just delete the Solr document for the deleted record?
Regardless, I recommend updating your search cluster asynchronously using a queueing service. Popular options for Rails apps include DelayedJob and Resque.
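A minimal sketch of that queued approach, assuming the delayed_job gem (the SolrCleanup class and its method are illustrative names, not part of Sunspot):

class Product < ActiveRecord::Base
  # Enqueue the Solr cleanup instead of blocking the request that destroys the record.
  after_destroy do
    # delayed_job's `delay` proxy turns this call into a background job
    SolrCleanup.delay.remove_document(self.class.name, id)
  end
end

class SolrCleanup
  def self.remove_document(class_name, id)
    # Remove just this one document from the index, then commit.
    Sunspot.remove_by_id(class_name.constantize, id)
    Sunspot.commit
  end
end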

The previous answer is correct - instead of reindexing, you should only remove the document in question from Solr. There is no need to reindex all the documents if only one document has changed.
In Sunspot you can do this with Sunspot.remove(doc).
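For example, a minimal sketch (assuming Product is the searchable model from the question):

class Product < ActiveRecord::Base
  searchable do
    # ... indexed fields ...
  end

  # Remove only this record's document from Solr; no full reindex needed.
  after_destroy do
    Sunspot.remove(self)   # sunspot_rails also offers remove_from_index on the instance
    Sunspot.commit
  end
end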

Related

Manually input orders in WooCommerce databases?

I have a question about manually inputting information into a MySQL database (specifically, the WooCommerce order tables). I have some orders that I get from another database and have to insert into a WooCommerce database, which, as I have found out by now, consists of several different tables rather than just one. WooCommerce is a plugin for WordPress. Does anybody have any idea how that could be done?
Some additional information: I am working with WordPress 4.4.2 and WooCommerce 2.5.5.
I would not try to directly insert data into WooCommerce's database, because this approach is not really documented and would require intricate knowledge of WooCommerce's schema.
I would rather use WooCommerce's REST API to create the orders. The documentation has examples for various programming languages. There is a bulk creation API as well, but I found no examples of how exactly to use it, since the example in the documentation covers bulk updates only.
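For example, with the wc-api-php client library (a rough sketch; the credentials and field values are placeholders, and on WooCommerce 2.5 you may need the legacy 'v3' API instead of the version shown here):

<?php
require 'vendor/autoload.php';

use Automattic\WooCommerce\Client;

// Consumer key/secret are generated under WooCommerce > Settings > API in wp-admin.
$woocommerce = new Client(
    'https://example.com',
    'ck_your_consumer_key',
    'cs_your_consumer_secret',
    ['wp_api' => true, 'version' => 'wc/v1']
);

// Minimal order payload; map the fields from your source database here.
$order = [
    'payment_method' => 'bacs',
    'set_paid'       => true,
    'billing'        => ['first_name' => 'John', 'last_name' => 'Doe'],
    'line_items'     => [
        ['product_id' => 93, 'quantity' => 2],
    ],
];

$response = $woocommerce->post('orders', $order);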

Data migration for Couchbase documents (i.e. changing an existing field type)?

I am coming from an object-relational database background. I understand Couchbase is schema-less, but data migration will still happen as the application develops.
In SQL we have management tools to alter tables, or I can write a migration script in SQL to migrate from a version 1 table to a version 2 table.
But with documents, say we have a JSON document UserProfile:
UserProfile
{
    "Owner": "Rich guy!",
    "Car": "Cool car"
}
We might want to add a last-visit field there and allow a user to have multiple cars, so the new updated document will look as follows:
UserProfile
{
    "Owner": "Rich guy!",
    "Car": ["Cool car", "Another car"],
    "LastVisit": "2015-09-29"
}
But for easier maintenance, I want all other UserProfile documents to follow the same format, with the "Car" field as an array.
From my experience in SQL, I could write a migration script that supports migrating between different versions of a table, from version 1 to version 2...N.
So how should I write such migration code? Will I really just have to write an app (executable) using the Couchbase SDK to migrate all the documents each time?
What would be a good way to do a migration like this?
Essentially, your problem breaks down into two parts:
Finding all the documents that need to be updated.
Retrieving and updating said documents.
You can do this in one of two ways: using a view that gives you the document ids, or using a DCP stream to get all the documents from the bucket. The view only gives you the ids of the documents, so you basically iterate over all the ids, and then retrieve, update and store each one using regular key-value methods. The DCP protocol, on the other hand, gives you the actual documents.
The advantage of using a view is that it's very simple to implement, works with any language SDK, and it lets you write your own logic around the process to make it more robust and safe. The disadvantage is having to build a view just for this, and also that if the data keeps changing, you must retrieve the ENTIRE view result at once, because if you try to page over the view with offsets, the ordering of results can change, thus giving you an inconsistent snapshot of the data.
The advantage of using DCP to stream all documents is that you're guaranteed to get a consistent snapshot of your data even if it's constantly changing, and also that you get the whole document directly as part of the stream, so you don't need to retrieve it separately - just update and store back to the database. The disadvantage is that it's currently only implemented in the Java SDK and is considered an experimental feature. See this blog for a simple implementation.
The third - and most convenient for an SQL user - way to do this is through the N1QL query language introduced in Couchbase 4. It has the same data manipulation commands as you would expect in SQL, so you could basically issue a command along the lines of UPDATE myBucket SET prop = {'field': 'value'} WHERE condition = 'something'. The advantage of this is pretty clear: it both finds and updates the documents all at once, without writing a single line of program code. The disadvantage is that the DML commands are considered "beta" in the 4.0 release of Couchbase, and that if the data set is too large, then it might not actually work due to timing out at some point. And of course, the fact that you need Couchbase 4.0 in the first place.
I don't know of any official tool currently to help with data model migrations, but there are some helpful code snippets depending on the SDK you use (see e.g. bulk updates in java).
For now you will have to write your own script. The basic process is as follows:
Make sure all your documents have a model_version attribute that you increment after each migration.
Before a migration update your application code so it can handle both the old and the new model_version, and so that new documents are written in the new model.
Write a script that iterates through all the old-model documents in your bucket (you need a view that emits the document key), makes the update you want, increments model_version, and saves the document back (see the sketch below).
In a high-concurrency environment it's important to have good error handling and monitoring; you could have, for example, a view that counts how many documents are in each model_version.
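A rough sketch of such a script with the Couchbase Java SDK 2.x (the bucket, design document, and view names are placeholders; the view is assumed to emit only documents below the target model_version):

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonArray;
import com.couchbase.client.java.view.ViewQuery;
import com.couchbase.client.java.view.ViewResult;
import com.couchbase.client.java.view.ViewRow;

public class ProfileMigration {
    public static void main(String[] args) {
        Cluster cluster = CouchbaseCluster.create("127.0.0.1");
        Bucket bucket = cluster.openBucket("profiles");

        ViewResult result = bucket.query(ViewQuery.from("migrations", "old_model"));
        for (ViewRow row : result) {
            JsonDocument doc = bucket.get(row.id());
            if (doc == null) continue; // deleted since the view was built

            // Wrap the old scalar "Car" value in an array and bump the version.
            Object car = doc.content().get("Car");
            if (!(car instanceof JsonArray)) {
                doc.content().put("Car", JsonArray.from(car));
            }
            doc.content().put("model_version", 2);

            // replace() carries the CAS from get(), so a concurrent update
            // fails loudly instead of being silently overwritten.
            bucket.replace(doc);
        }
        cluster.disconnect();
    }
}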
You can use Couchmove, which is a Java migration tool working like Flyway DB.
You can execute N1QL queries with this tool to migrate your documents and keep track of your changes.
If I understood correctly, the crux here is getting and then updating every Couchbase document. This can be done with a view, provided that you understand that views are only 'eventually consistent' (unlike read/write operations, which are strongly consistent).
If (at migration time) no new documents are added to your bucket, then your view will be up to date and should return the entire set of documents to be migrated. Easy.
On the other hand, if new documents continue to be written into your bucket and these documents need to be migrated, then you will have to run your migration code continually to catch all these new docs (since the view won't return them until it is updated, a few seconds later).
In this 2nd scenario, while migration is happening, your bucket will contain a heterogeneous collection of docs: some that have been migrated already, some that are about to be migrated and some that your view has not 'seen' yet (because they were recently added) and would only be migrated once you re-run the migration code.
To make the migration process efficient, you'll need to find a way to differentiate between already-migrated items and yet-to-be-migrated items. You can add a field to each doc with its 'version number' and update it during the migration. Your view should be defined to only select documents with older 'version number' and ignore already-migrated items.
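For example, the view's map function could look roughly like this (a sketch, assuming version 2 is the target and the bucket holds only UserProfile documents):

function (doc, meta) {
  // Emit only documents that still need migrating; already-migrated
  // docs (model_version >= 2) never show up in the view at all.
  if (!doc.model_version || doc.model_version < 2) {
    emit(meta.id, null);
  }
}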
I suggest you read more about couchbase views - here and on their site.
Regarding your migration, there are two aspects here: (1) getting the list of document ids that need to be updated and (2) the actual update.
The actual update is simple: you retrieve the doc and save it again with the new format. There's no explicit schema. Where in SQL you once added a column and populated it, you now just add a field in the JSON doc (of all the docs). All migrated docs should have this field. Side note: things get a little more complicated if (while you're migrating) a document can be updated by another process. This requires special handling (read about CAS if that's the case).
Getting all the relevant doc keys requires that you define a view and query it. It's beyond the scope of this answer (and is very well documented). Once you have all the keys, you simply iterate over them one by one and update the documents.
With N1QL, Couchbase provides the same schema migration capabilities as you have in RDBMS or object-relational database. For the example in your question, you can place the following query in a migration script:
UPDATE UserProfile
SET Car = TO_ARRAY(Car),
    LastVisit = NOW_STR();
This will migrate all the documents in your bucket to your new schema. Note that update statements in Couchbase provide document-level atomicity, not statement-level atomicity. But since this update is idempotent (repeatable), you can run it multiple times if you run into errors. Note: similar to the last paragraph of David's answer above.

Configuring Sphinx to index a dynamic set of tables

I'm in the process of setting up a new WordPress 3.0 multisite instance and would like to use Sphinx on the database server to power search for the primary website. Ideally, this primary site would offer the ability to search against its content (posts, pages, comments, member profiles, activity updates, etc.) as well as all of the other sites that are a part of the network. Because we'll be adding new sites to the network on a regular basis, I'd like to be able to dynamically add those newly generated tables to the Sphinx .conf file (instead of editing the file and reindexing every time we add a new site).
Unfortunately, MySQL doesn't seem to support wildcards when specifying the table(s) in a query string. The best solution I've come across for grabbing a dynamic set of tables is grepping, but I'm pretty certain I don't know how to do this within the .conf file (unless it's possible through magical sorcery).
Is it possible to dynamically specify tables to add to the Sphinx index? Or is this going to cause such performance issues that I'm using the wrong tool?
You could try to dynamically modify the .conf file instead.
You could query from a MySQL view that aggregates the many tables. You'd have to recreate the view with each change to the list of blogs, but I believe that all the hooks exist to support that and it should be easy enough to construct the view query.
The bigger problem may be in trying to find a suitable unique record ID for the posts in Sphinx. It has to be a straight INT, but the post IDs from the different blogs will collide with each other.
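A sketch for two blogs (in practice you'd regenerate the view whenever a blog is added, and the * 1000000 offset is just one way to keep per-blog post IDs from colliding):

CREATE OR REPLACE VIEW sphinx_posts AS
    SELECT ID + 1 * 1000000 AS sphinx_id,  -- blog 1 (wp_posts)
           post_title, post_content
      FROM wp_posts
     WHERE post_status = 'publish'
UNION ALL
    SELECT ID + 2 * 1000000 AS sphinx_id,  -- blog 2 (wp_2_posts)
           post_title, post_content
      FROM wp_2_posts
     WHERE post_status = 'publish';

The sql_query in sphinx.conf can then just be SELECT sphinx_id, post_title, post_content FROM sphinx_posts.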
I think you can create triggers (INSERT/UPDATE/DELETE) in MySQL on the tables of interest (e.g. posts, comments, etc.) and migrate the data to centralized global tables that Sphinx indexes in real time.
The point is: how can you create those triggers automatically? Either you can run a cron job to scan for new tables in MySQL, or I believe you can write a simple WordPress plugin that hooks in when a blog is activated.
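A sketch of one such trigger, assuming a central wp_all_posts table that Sphinx indexes (you'd generate one set of triggers per blog table, e.g. from the plugin's activation hook):

DELIMITER //
CREATE TRIGGER wp_2_posts_ai AFTER INSERT ON wp_2_posts
FOR EACH ROW
BEGIN
    -- Copy the new post into the central table, namespacing the ID
    -- with the blog number to avoid collisions across blogs.
    REPLACE INTO wp_all_posts (sphinx_id, blog_id, post_title, post_content)
    VALUES (NEW.ID + 2 * 1000000, 2, NEW.post_title, NEW.post_content);
END//
DELIMITER ;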

Mysql adapter for Zend_Translate

I'm currently in the planning phase of a rather large project that I'll develop in the Zend Framework. One of the problems I'm facing is that the customers will want to translate not only the content but also the interface. I'm currently using gettext and Poedit to manage my language files, but this is not an option for the customer as they, for one, won't have FTP access to the site.
Hence, I'm thinking of a MySQL back end with an interface on the front end for the customer to manage his own translations of the interface. There is, however, still no MySQL adapter for Zend_Translate.
So, does anybody know of an adapter script for Zend_Translate so it can work with a MySQL table? Or any arguments against using MySQL, and possibly other solutions for this problem?
You could solve this problem in different ways:
Extend Zend_Translate_Adapter to create your own. New adapters are only responsible for getting the translations out of their source. That is, you would only need to fetch the translations from the database. Look at other adapters and see how they are implemented.
Fetch the data from the database and pass it to Zend_Translate_Adapter_Array (see the sketch below).
Use Zend_Translate_Adapter_Csv or Ini. As there would be more reading than writing of the translations, this solution would cut down the number of queries to the database. When the client adds a new language or changes an existing one, simply write it to a file, not the database.
If you decide to go with the database adapter, maybe you could "tag" somehow the translations, so that on the home page you fetch only the translations for the home page, on the contact page only the translations for the contact page...
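A minimal sketch of the second option (the translations table and its columns are made-up names):

<?php
// Fetch msgid => msgstr pairs for the current locale from a
// hypothetical `translations` table.
$db = Zend_Db::factory('Pdo_Mysql', array(
    'host'     => 'localhost',
    'username' => 'app',
    'password' => 'secret',
    'dbname'   => 'app',
));
$pairs = $db->fetchPairs(
    'SELECT msgid, msgstr FROM translations WHERE locale = ?',
    array('en')
);

// Hand the pairs straight to the array adapter.
$translate = new Zend_Translate('array', $pairs, 'en');
echo $translate->_('welcome_message');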
HTH!
Default Zend adapters handle caching well, so I'd stick to them, unless you really need database.
Instead of storing the translation data in the database, you may operate directly on the translation files (e.g. PO templates). This would be the best choice if you just needed to add (append to the file) new translation strings.
You may use Zend_Translate's option to log untranslated messages (to a file or any log adapter, including a database),
and then handle the logs, or even create a listener that translates the saved strings.
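In ZF1 that looks roughly like this (a sketch; the paths and locale are placeholders):

<?php
$writer = new Zend_Log_Writer_Stream('/tmp/untranslated.log');
$log    = new Zend_Log($writer);

$translate = new Zend_Translate('gettext', '/path/to/lang', 'en', array(
    'log'             => $log,   // any Zend_Log instance, including a DB writer
    'logUntranslated' => true,   // log every msgid that has no translation
));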
Here's how: http://cloetensbrecht.be/zend_translate_mysql.html

XML/MySQL: Storing complete XML in MySQL and performing CRUD operations

Is it possible to store a complete XML file directly in a MySQL DB and perform CRUD operations on the XML data present in MySQL?
Scenario: I am getting an XML file which has product-related details like product_id, product_spec, product_price, and many more, and I have to store all these details in a MySQL database. Whenever a user enters the portal and selects a particular product from the product catalog, the shopping cart is populated depending on his selection; i.e., the shopping cart performs read operations on the MySQL DB to get the relevant data for the product. As an aside, the XML file I am getting from the 3rd party is very large, as it has millions of products with all the relevant details.
If it is possible, what are the ways to do it?
I'd appreciate guidance.
For very large XML files, I think what you want is a streaming parser, as opposed to a DOM-based parser. See http://us3.php.net/manual/en/book.xml.php and Google for examples and read books. The question is too broad to offer much more than that, I think.
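A rough sketch with PHP's XMLReader (the element and column names are guesses at your feed's format):

<?php
$reader = new XMLReader();
$reader->open('products.xml'); // streams from disk, never loads the whole file

$pdo  = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');
$stmt = $pdo->prepare(
    'INSERT INTO products (product_id, spec, price) VALUES (?, ?, ?)'
);

// Fast-forward to the first <product> element.
while ($reader->read() && $reader->name !== 'product');

while ($reader->name === 'product') {
    // Parse just this one <product> element; the rest of the file stays streamed.
    $product = new SimpleXMLElement($reader->readOuterXml());
    $stmt->execute([
        (string) $product->product_id,
        (string) $product->product_spec,
        (string) $product->product_price,
    ]);
    $reader->next('product'); // jump to the next <product> sibling
}
$reader->close();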
Beginning with MySQL 5.1.5, two functions providing basic XPath 1.0 capabilities are available as explained in the chapter 11.10 XML Functions in the reference manual.
Check out MySQL 5.1's New XML Functions or Using XML in MySQL 5.1 and 6.0 for some examples.
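For example, if the raw XML sits in a TEXT column, ExtractValue() and UpdateXML() give you XPath-based read and update access (a sketch; the table layout is made up):

-- Read: pull the price out of the stored XML for one product.
SELECT ExtractValue(xml_data, '/product/product_price') AS price
  FROM product_xml
 WHERE ExtractValue(xml_data, '/product/product_id') = '12345';

-- Update: rewrite the price element in place.
UPDATE product_xml
   SET xml_data = UpdateXML(xml_data,
                            '/product/product_price',
                            '<product_price>19.99</product_price>')
 WHERE ExtractValue(xml_data, '/product/product_id') = '12345';

Note that ExtractValue() in a WHERE clause cannot use an index, so for millions of products you'd typically shred the XML into regular columns at import time and keep the raw XML only if you really need it.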