This is probably a noob question, so I apologize in advance.
The HBase console, as far as I understand, is an extension (or a script running over) JIRB. Also, it comes with several HBase-specific commands, one of which is 'get' - to retrieve columns\values from a table.
However, it seems like 'get' only writes to screen and doesn't output values at all.
Is there any native hbase console command which will allow me to retrieve a value (e.g. a set of rows\columns), put them into a variable and retrieve their values?
Thanks
No, there is not a native console command in 0.92. If you dig into the source code, there is a class Hbase::Table that could be used to do what you want. I believe this is going to be more exposed in 0.96. At this point, I have resorted to adding my own Ruby to my shell to handle a variety of common tasks (like using SingleColumnValueFilters on scans).
Related
I have many, large JSON files that I'd like to run some analytics against. I'm just getting started with SparkSQL and am trying to make sure I understand the benefits between having SparkSQL read the JSON records into an RDD/DataFrame from file (and have the schema inferred) or to run a SparkSQL query on the files directly. If you have any experience using SParkSQL either way I'd be interested to hear which method is preferred and why.
Thank you, in advance, for your time and help!
You can call explain() as an action instead of show() or count() on a dataset. Then Spark will show you the selected physical plan.
You can find the above picture here. As far as I know there should be no difference. But I prefer to use the read() method. When I use an IDE, I can see all the available methods. When you do it with SQL, there could be a mistake like slect instead of select, but you will get the error first, when you run your code.
My php script pulls about 1000 names from the mysql db on a certain page. These names are used for a javascript autocomplete script.
I think there's a better method to do this. I would like to update the names with a cronjob once a day (php) and store the names locally in a text file? Where else can I store it? It's not sensitive info.
It should be readable and writable to php.
Since you only need the data updated once a day, have a cron-script generate a static json file in some fixed location. Then read this with ajax on the client and make sure it caches it on the client.
Or potentially, generate the file whenever the database is updated (if this is applicable, I don't know your application)
You could try Memcache. But that could be like using a sledge-hammer to crack a nut.
Edit What about storing the data as simple file and let users (JavaScript) download it. Clients would not query the server for every key stroke because they could search for matching values themself. Format could be JSON because it is simple and JavaScript native.
It's unlikely reading from a text file will be much faster than a database query - MySQL already does a lot of caching that should make your query speedy.
If you need to make this query often and performance is a problem often you could consider using a caching module for PHP.
Related
The best way of PHP Caching
I have lots of stuff in an app.config, and when changes are necessary, an app restart is required. Bad for my 24x7 web server system (it really is 24x7, not even 23x7). I would like to use a good strategy for keeping the config information in a DB table and query/use it as needed. I googled around a bit and am coming up dry. Does anyone have any suggestions before I re-invent the wheel?
Thanks.
I needed exactly this for my recent application, and couldn't use any application server specific techniques as I needed some console apps run on cronjobs to access them too.
I basically made a couple of small tables to create a registry-style configuration database. I have a table of keys (which all have parent-keys so they can be arranged in a tree structure) and a table of values which are attached to keys. All keys and values are named, so my access functions look like this:
openKey("/my_app");
createKey("basic_settings");
openKey("basic_settings");
createValue("log_directory","c:\logs");
getValue("/my_app/basic_settings","log_directory");
The tree structure allows you to logically separate similar data (e.g. you can have a "log_directory" value under several different keys) and avoids having the overly verbose names you find in properties files.
All the values are just strings (varchar2 in the db), so there's some overhead in converting booleans and numbers: but it's only config data, so who cares?
I also create a "settings_changed" value that has a datetime string in it: so any app can quickly tell if it needs to refresh it's configuration (you obviously need to remember to set it when you change anything though).
There may be tools out there to do this kind of thing already: but this was only a days worth of coding and works a treat. I added command line tools to edit and upload/download parts or all of the tree, then made a quick graphical editor in Java Swing.
Is possible? I plan to backup mysql database using vb.net. If you know any beginner tutorial on how to do this, then please let me know.
If I understand the question correctly, you want to start mysql.exe from VB.Net and supply some command-line parameters to tell it what to do (in your case, do a database backup). (FWIW, you might be better off using mysqldump.exe for that.)
You can start any process via the System.Diagnostics.Process.Start method. You'll probably want either this variant that gives you full control over the startup, or this one that just lets you specify the process (EXE) name and arguments. Here's a page with lots of examples. They're in C#, but it shows using these methods; the syntactic differences don't (I suspect) get in the way much.
Off-the-cuff example:
Imports System.Diagnostics
Process.Start("mysqldump.exe", "myDatabaseName")
' Or include the path:
Process.Start("C:\Program Files\MySQL\MySQL Server\bin\mysqldump.exe", "myDatabaseName")
I'd love to do this:
UPDATE table SET blobCol = HTTPGET(urlCol) WHERE whatever LIMIT n;
Is there code available to do this? I known this should be possible as the MySQL Docs include an example of adding a function that does a DNS lookup.
MySQL / windows / Preferably without having to compile stuff, but I can.
(If you haven't heard of anything like this but you would expect that you would have if it did exist, A "proly not" would be nice.)
EDIT: I known this would open a whole can-o-worms re security, however in my cases, the only access to the DB is via the mysql console app. Its is not a world accessible system. It is not a web back end. It is only a local data logging system
No, thank goodness — it would be a security horror. Every SQL injection hole in an application could be leveraged to start spamming connections to attack other sites.
You could, I suppose, write it in C and compile it as a UDF. But I don't think it really gets you anything in comparison to just SELECTing in your application layer and looping over the results doing HTTP GETs and UPDATEing. If we're talking about making HTTP connections, the extra efficiency of doing it in the database layer will be completely dwarfed by the network delays anyway.
I don't know of any function like that as part of MySQL.
Are you just trying to retreive HTML data from many URLs?
An alternative solution might be to use Google spreadsheet's importHtml function.
Google Spreadsheets Lets You Import Online Data
Proly not. Best practises in a web-enviroment is to have database-servers isolated from the outside, both ways, meaning that the db-server wouldn't be allowed to fetch stuff from the internet.
Proly not.
If you're absolutely determined to get web content from within an SQL environ, there are as far as I know two possibilities:
Write a custom MySQL UDF in C (as bobince mentioned). The could potentially be a huge job, depending on your experience of C, how much security you want, how complete you want the UDF to be: eg. Just GET requests? How about POST? HEAD? etc.
Use a different database which can do this. If you're happy with SQL you could probably do this with PostgreSQL and one of the snap-in languages such as Python or PHP.
If you're not too fussed about sticking with SQL you could use something like eXist. You can do this type of thing relatively easily with XQuery, and would benefit from being able to easily modify the results to fit your schema (rather than just lumping it into a blob field) or store the page "as is" as an xhtml doc in the DB.
Then you can run queries very quickly across all documents to, for instance, get all the links or quotes or whatever. You could even apply XSL to such a result with very little extra work. Great if you're storing the pages for reference and want to adapt the results into a personal "intranet"-style app.
Also since eXist is document-centric it has lots of great methods for fuzzy-text searching, near-word searching, and has a great full-text index (much better than MySQL's). Perfect if you're after doing some data-mining on the content, eg: find me all documents where a word like "burger" within 50 words of "hotdog" where the word isn't in a UL list. Try doing that native in MySQL!
As an aside, and with no malice intended; I often wonder why eXist is over-looked when people build CMSs. Its a database that can store content in its native format (XML, or its subset (x)HTML), query it with ease in its native format, and can translate it from its native format with a powerful templating language which looks and acts like its native format. Sometimes SQL is just plain wrong for the job!
Sorry. Didn't mean to waffle! :-$