I read an article about schema-less databases which sounds cool. (http://bret.appspot.com/entry/how-friendfeed-uses-mysql)
But what isn't clear to me is how they run search queries on this data. Since the data is stored as JSON, how do we search it?
Attributes that are needed for filtering/searching are first indexed in separate tables. This makes those attributes queryable with ordinary SQL.
Let me quote what this post says: http://bret.appspot.com/entry/how-friendfeed-uses-mysql
Indexes are stored in separate tables. To create a new index, we create a new table storing the attributes we want to index on all of our database shards.
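To make that concrete, here is a minimal sketch of the pattern the article describes (the table and column names are illustrative, not FriendFeed's exact schema): entities live in an opaque blob column, and each attribute you want to search gets its own narrow index table that the application keeps in sync.

    -- Entities are stored as opaque serialized blobs.
    CREATE TABLE entities (
        id   BINARY(16) NOT NULL,
        body MEDIUMBLOB NOT NULL,  -- the serialized (e.g. JSON) document
        PRIMARY KEY (id)
    );

    -- One index table per searchable attribute, e.g. user_id.
    CREATE TABLE index_user_id (
        user_id   BINARY(16) NOT NULL,
        entity_id BINARY(16) NOT NULL,
        PRIMARY KEY (user_id, entity_id)
    );

    -- Searching means probing the index table, then fetching the blobs.
    SELECT e.body
    FROM index_user_id i
    JOIN entities e ON e.id = i.entity_id
    WHERE i.user_id = UNHEX('0123456789abcdef0123456789abcdef');

The application writes a row into index_user_id whenever it stores an entity, so the attribute never has to be parsed out of the JSON at query time.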
I'd imagine they have a separate search engine with its own index - probably not even in MySQL, something like Solr.
They're using Sphinx for that.
Related
Is it possible to search for something that is in two databases? For example, I want to do a "starts with" search on a column in Postgres as well as a column in MySQL, where one is "name" and the other is "email".
Copying the data over is not reliable, as new data is constantly being created in both databases.
Yes, it is possible. For the "starts with" part, you should be able to use the standard Postgres string functions, of which starts_with is one, along with indexes on the desired columns.
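For example (a sketch against a hypothetical users table; starts_with itself requires Postgres 11+, and the text_pattern_ops operator class is what lets the equivalent LIKE 'prefix%' form use the index):

    -- Index that supports left-anchored pattern matching.
    CREATE INDEX users_name_prefix_idx ON users (name text_pattern_ops);

    -- Two equivalent ways to express "starts with":
    SELECT * FROM users WHERE starts_with(name, 'abc');
    SELECT * FROM users WHERE name LIKE 'abc%';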
Getting the data from MySQL is the more complicated part.
You would most likely want to use a foreign data wrapper (FDW) from Postgres to access the MySQL data, and then handle unioning it (or any other desired processing) with the Postgres data to return a combined result set.
You could write your own FDW if you have particularly specific requirements, or you could try an open-source one, such as this one from EnterpriseDB. EnterpriseDB is a Postgres consultancy and offers their own Postgres version, but the doc on the GitHub page says it is compatible with base Postgres as well as their own version.
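A rough sketch of how that could look with EnterpriseDB's mysql_fdw (the connection details and table/column names here are hypothetical):

    CREATE EXTENSION mysql_fdw;

    CREATE SERVER mysql_srv
        FOREIGN DATA WRAPPER mysql_fdw
        OPTIONS (host '127.0.0.1', port '3306');

    CREATE USER MAPPING FOR CURRENT_USER
        SERVER mysql_srv
        OPTIONS (username 'app', password 'secret');

    -- Expose the MySQL table that holds the "email" column.
    CREATE FOREIGN TABLE mysql_accounts (email text)
        SERVER mysql_srv
        OPTIONS (dbname 'appdb', table_name 'accounts');

    -- Combined "starts with" search across both databases.
    SELECT name AS value, 'postgres' AS source
    FROM users
    WHERE name LIKE 'abc%'
    UNION ALL
    SELECT email AS value, 'mysql' AS source
    FROM mysql_accounts
    WHERE email LIKE 'abc%';

How fast the MySQL half is depends on whether the wrapper can push the LIKE condition down to MySQL rather than filtering the fetched rows in Postgres.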
I'm building a Laravel app whose core features are driven by rather large JSON objects (the largest ones are between 1000 and 1500 lines).
I know there are better database choices than MySQL for storing files and blocks of data, but for various reasons I will need to use MySQL for this application.
So my question is: how do I store my JSON objects most effectively in MySQL? I will not need to run any queries on the column that holds the data; there will be other columns for identifying it. Something like this:
id, title, created-at, updated-at, JSON-blobthingy
Any ideas?
You could use the JSON data type if you have MySQL version 5.7.8 or above.
You could store the JSON file on the server, and simply reference its location via MySQL.
You could also use one of the TEXT types.
The best answer I can give is to use MySQL 5.7. This version supports the new JSON column type, which (obviously) handles large JSON documents very well.
https://dev.mysql.com/doc/refman/5.7/en/json.html
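As a sketch, the table from the question could look like this with the native JSON type (column names follow the asker's placeholders, and the JSON path in the query is of course hypothetical):

    CREATE TABLE documents (
        id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
        title      VARCHAR(255) NOT NULL,
        created_at DATETIME     NOT NULL,
        updated_at DATETIME     NOT NULL,
        data       JSON         NOT NULL,  -- the "JSON-blobthingy" column
        PRIMARY KEY (id)
    );

    -- MySQL validates the document on INSERT, and you can still reach
    -- into it later if requirements change:
    SELECT id, JSON_EXTRACT(data, '$.some.key') FROM documents;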
You could compress the data before inserting it if you don't need it to be searchable. I'm using the 'zlib' library for that.
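If you would rather keep the compression in the database instead of in application code, MySQL's built-in COMPRESS()/UNCOMPRESS() functions (also zlib-based) are an alternative. This sketch assumes a hypothetical BLOB column named data_compressed on the documents table sketched above:

    INSERT INTO documents (title, created_at, updated_at, data_compressed)
    VALUES ('big doc', NOW(), NOW(),
            COMPRESS('{"huge": "json", "goes": "here"}'));

    SELECT UNCOMPRESS(data_compressed) FROM documents
    WHERE title = 'big doc';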
You can simply use the LONGBLOB type, which can hold up to 4GB of data, for the column holding the large JSON object; you can insert, update, and read this column normally, as if it were text or anything else!
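The same table sketched above would then just declare the column as LONGBLOB; this works on any MySQL version, with no validation of the contents:

    CREATE TABLE documents_blob (
        id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
        title      VARCHAR(255) NOT NULL,
        created_at DATETIME     NOT NULL,
        updated_at DATETIME     NOT NULL,
        data       LONGBLOB     NOT NULL,
        PRIMARY KEY (id)
    );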
I'm using Slick 2.1.0 with Scala to do insertions and queries against a database. However, I might be using it for table creation as well, with a possible need to update table schemas later. Can schema updates like this be done with Slick, or can it only do table creation?
I would say no. The fact that it maps tables and queries to entities makes it static: if you alter a table, you also need to modify the code that represents that table, and you have to do that manually. The best Slick can do is generate the schema on demand using its schema code generation tool; I have also never seen anything like migrations in the documentation.
If you use Play! there are some alternatives, such as evolutions, but as far as I know that is as far as it gets.
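For reference, a Play evolution is just a plain SQL script with up/down sections, so the manual schema change lives there rather than in your Slick table definitions. A minimal hypothetical example, conf/evolutions/default/2.sql:

    # --- !Ups
    ALTER TABLE users ADD COLUMN email VARCHAR(255);

    # --- !Downs
    ALTER TABLE users DROP COLUMN email;

You would still update the corresponding Slick table class by hand so it matches the new schema.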
I am evaluating a Mondrian-Saiku solution for a client.
After analyzing their current database schemas, I realize that what constitutes their 'fact table data' is currently stored in XMLs. The XMLs themselves are stored as blob datatypes in a MySQL table. Think of it like this: the table holds all the transactions of the company; the details of each transaction are stored in their own XML; each XML string is stored as one of the field values in a given transaction row.
This presents a slight dilemma since the Mondrian XML schema requires the explicit use of column names.
Short of having to extract and transfer the XML data to new tables (not realistic for my purposes due to the size of the data and dependencies from other systems), is there any way I can work with my client's existing setup for the purposes of a Mondrian-Saiku implementation?
You need to expose the data in a traditional table shape. What is the database here? Can you create a database view that does some XML processing on the XML in the blob and exposes the columns?
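If the database is MySQL (which the question suggests), its ExtractValue() function can pull values out of the XML with XPath, so a view along these lines might work; the table, column, and element names here are hypothetical:

    CREATE VIEW transaction_facts AS
    SELECT
        t.id,
        ExtractValue(CONVERT(t.details USING utf8),
                     '/transaction/amount')      AS amount,
        ExtractValue(CONVERT(t.details USING utf8),
                     '/transaction/customer_id') AS customer_id
    FROM transactions t;

Mondrian could then point its fact table at transaction_facts and reference amount and customer_id as ordinary columns, though the per-row XML parsing will not be cheap.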
Alternatively, something like Composite or JBoss Teiid might help here. These tools allow you to expose virtually anything as a standard-looking table. It may not be quick enough, though!
I have indexed a MySQL database using Solr and everything is perfect. Now I have another database that uses exactly the same schema as my first database, but with different data in it.
What I want is to use Solr to index the second database as well, using the same Solr schema that I created for my first database, since they are exactly the same!
I read that Solr cores allow you to run multiple instances that use different configuration sets and indexes, but in my case the configuration is exactly the same; the only thing that changes is the database name.
My question is: what is the best way to create two Solr instances that use the same configuration?
Cheers
You could use two cores and share a schema. Just read the Wiki. But in practice you might want to keep the flexibility and just copy the schema for a second core.
How about using only one Solr instance, but having a field in the schema that contains a value indicating which db/source the record came from?
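For instance, if you load the data with the DataImportHandler, each database's import query can tag its rows with a literal value that maps to a source field in the shared schema (field and table names here are hypothetical):

    -- Import query for the first database:
    SELECT id, title, body, 'db1' AS source FROM articles;

    -- Import query for the second database:
    SELECT id, title, body, 'db2' AS source FROM articles;

Searches that should only hit one database then just add a filter query such as fq=source:db1.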