How do you handle database access (I/O) to/from Magento Commerce?

So, I want to import, export, and modify the database. I have read that I have to do that via XML, but I don't really understand their documentation system, and I haven't found any good tutorials out there that explain this. I am slowly reading the very expensive (and short) book, which somewhat answers my questions, but I crave more.
As a second question, I want to have an order system where I can send out information or emails with my own code. I assume this would be some type of plug-in that would override existing behavior or be called at a certain point. Any info would be helpful.

Some parts of the Magento data can be imported and exported via the backend (System -> Import/Export), namely products and customers.
If you want to deal with the complete DB, use your DB tool of choice; I prefer mysqldump (for example: mysqldump -u user -p magento_db > magento.sql).
When dealing with exported CSV files, use OpenOffice; from my experience it deals with the separator characters better than Excel.
As for your second question: as far as I understood, you will have to develop a module if you want to do something different from the existing functionality while keeping the original mail functions. If you don't want or need to keep the original functions, you can opt to override the module, which is much easier as far as I can see. A Google search for "overriding magento module" should turn up at least one decent tutorial.

I found what I was looking for here:
(on the Magento site: Resources -> Magento Core API -> Product API, or whichever API you want)
The problem is there is no Order API yet (or at least none that I've seen).
http://www.magentocommerce.com/wiki/doc/webservices-api/api/catalog_product#examples
This details how you'd write an external PHP script to obtain, edit, or delete products (or anything else with an API).
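To illustrate, here is a minimal sketch of such an external script, written in Java against the standard Magento 1 XML-RPC endpoint using the Apache XML-RPC client library (the shop URL and the API credentials are placeholders):

    import java.net.URL;
    import org.apache.xmlrpc.client.XmlRpcClient;
    import org.apache.xmlrpc.client.XmlRpcClientConfigImpl;

    public class MagentoProductList {
        public static void main(String[] args) throws Exception {
            // Point the client at the shop's XML-RPC endpoint (placeholder host)
            XmlRpcClientConfigImpl config = new XmlRpcClientConfigImpl();
            config.setServerURL(new URL("http://www.example.com/magento/api/xmlrpc"));
            XmlRpcClient client = new XmlRpcClient();
            client.setConfig(config);

            // Log in with an API user/key created in the Magento backend (placeholders)
            String session = (String) client.execute("login", new Object[]{"apiUser", "apiKey"});

            // Every resource is reached through "call", e.g. catalog_product.list
            Object products = client.execute("call", new Object[]{session, "catalog_product.list"});
            System.out.println(products);

            client.execute("endSession", new Object[]{session});
        }
    }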
Modules still look daunting, but I am reading through the (very thin) Magento book (the only one available).
I hope this helps someone else.

Related

Converting large JSON file to XLS/CSV file (Kickstarter campaigns)

As part of my Master's thesis, I'm trying to run some statistics on which factors affect whether crowdfunding campaigns get funded or not. I've been trying to get data from the largest platform, Kickstarter.com. Unfortunately, they have removed all the non-successful campaigns from their website (unless you have the direct link).
Luckily, I'm not the only one looking for this data.
Webrobots.io has a scraper robot that crawls all Kickstarter projects and collects the data in JSON format (http://webrobots.io/kickstarter-datasets/).
The latest dataset can be found at:
http://webrobots.io/wp-content/uploads/2015/10/Kickstarter_2015-10-22.json_.zip
However, my programming skills are limited, and I don't know how to convert it into an Excel file where I can manipulate the data and run my analysis. I found a few online converters, but the file is far too big for them (approx. 300 MB).
Can someone please help me get the file converted?
It will earn you an acknowledgement in my Master's thesis when it gets published :)
Thanks in advance!!!
I guess the answer to this varies massively depending on a few things.
1. What subject is the master's covering? (Mainly to appease the many people who will probably assume you're hoping for someone to do your homework for you! This might explain why the thread has already been down-voted.)
2. You mention your programming skills are limited... What programming skills do you have? What language would you be using to achieve this goal? Bear in mind that even with a fully coded solution, if it's not in a language you know, you might not be able to compile it!
3. What kind of information do you want from the JSON file?
With regards to question 3, I've looked at the JSON file, and it contains hierarchical data, which is pretty difficult to replicate in a flat file, i.e. an Excel or CSV file (I should know; we had to do this a lot in a previous job of mine).
But, I would look at the following plan of action to achieve what you're after:
1. Use a JSON parser to deserialize the data into a class structure (Visual Studio can create the classes for you... see this S/O thread: How to show the "paste Json class" in visual studio 2012 when clicking on Paste Special?)
2. Once you've got the objects in memory, you can then step through them one by one, pick out the data you want, append it to a comma-separated string (in C# I'd use the StringBuilder), and write the rows of data out to a file on disk.
3. Once this is complete, you'll have the data you want.
Depending on what data you want from the JSON file, step 2 could be the most difficult part, as you'd need to step into the different levels of the data hierarchy (see the sketch below).
Hope this points you in the right direction.
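If it helps, here is the same plan condensed into a small Java program using the Jackson JSON library. It assumes the dump holds one JSON object per line and that fields such as name, goal, pledged, and state exist at the top level; inspect a few lines of the real file and adjust the field paths accordingly:

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.io.BufferedReader;
    import java.io.PrintWriter;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class KickstarterToCsv {
        public static void main(String[] args) throws Exception {
            ObjectMapper mapper = new ObjectMapper();
            try (BufferedReader in = Files.newBufferedReader(
                     Paths.get("Kickstarter_2015-10-22.json"), StandardCharsets.UTF_8);
                 PrintWriter out = new PrintWriter("projects.csv", "UTF-8")) {
                out.println("name,goal,pledged,state");
                String line;
                while ((line = in.readLine()) != null) {   // one JSON object per line (assumed)
                    JsonNode n = mapper.readTree(line);
                    out.printf("\"%s\",%s,%s,%s%n",
                        n.path("name").asText().replace("\"", "\"\""),   // crude CSV escaping
                        n.path("goal").asText(),
                        n.path("pledged").asText(),
                        n.path("state").asText());
                }
            }
        }
    }

Reading line by line keeps memory use flat, so the ~300 MB size is not a problem, and the resulting CSV opens directly in Excel.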
You may want to look at this blog:
http://jdunkerley.co.uk/2015/09/04/downloading-and-parsing-met-office-historic-station-data-with-alteryx/
He uses a process with Alteryx that may line up with what you are trying to do. I am looking to do something similar, but haven't tried it yet. I'll update this answer if I get it to work.

Including Solr or Lucene in an existing web application

I want to add simple search functionality to an existing Java web application.
The search should run on existing database fields.
It is a web application deployed on WildFly, with REST services and a MySQL DB.
After some research, my first impression was that Solr would give me what I want.
BUT: as I'm not allowed to deploy one more application to customers' environments, standalone Solr doesn't fit any more.
As I understand it, there are two ways around this:
Using EmbeddedSolr
"Self-build solr" (http://javaskeleton.blogspot.de/2011/07/adding-solr-to-existing-web-application.html)
Which way should I go to implement search in my web app?
Or should I switch to Lucene ?
The 2nd way seems dated, and although the post seems to have been removed, I think I got what the author meant from the title.
IMO the 1st way is better, because you will be using Solr as it should be used: as a black box, without mixing things up with your webapp.
Having said that, keep in mind that embedded Solr isn't a good choice for a production environment, because it is a standalone module and, above all, not scalable.
I suggest you write your Solr client code in a decoupled way: your webapp should deal only with the SolrServer abstract class. Behind the scenes you'll instantiate an EmbeddedSolrServer for the moment. Later, if you want to scale your search service, this design will let you switch to another implementation (LBHttpSolrServer, SolrCloud) with a small refactoring effort.
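For instance (a minimal sketch against the pre-5.0 SolrJ API, in which SolrServer is the abstract base class; the core name and Solr home path are placeholders):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.common.SolrDocumentList;

    public class SearchService {
        private final SolrServer server;   // the webapp only ever sees this abstraction

        public SearchService(SolrServer server) {
            this.server = server;
        }

        public SolrDocumentList search(String queryString) throws Exception {
            return server.query(new SolrQuery(queryString)).getResults();
        }
    }

    // Wiring it up, embedded for now -- switching to HttpSolrServer or
    // LBHttpSolrServer later only changes these few lines:
    //
    //   CoreContainer container = new CoreContainer("/path/to/solr/home");
    //   container.load();
    //   SolrServer server = new EmbeddedSolrServer(container, "mycore");
    //   SearchService service = new SearchService(server);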
So I will describe the way I've chosen.
First of all: yes, Lucene is my friend.
In my web app I created a @WebListener. At the start of the web app this creates one index, deleting it first if it already exists.
The content of the index consists of some database field values of the three object types that have to be searched.
In my SearchService (REST) I build up my search query and access this index.
Additionally I want to extend the existing REST services (not yet done): when objects of the types included in the index are created, updated, or deleted (CUD), the index has to be updated as well.
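Condensed, the listener and the search service look roughly like this (a self-contained sketch using the Lucene 5+ API; the index path and field names are made up):

    import java.nio.file.Paths;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class IndexSketch {
        public static void main(String[] args) throws Exception {
            Directory dir = FSDirectory.open(Paths.get("/var/myapp/index")); // placeholder path

            // In the @WebListener: recreate the index at startup
            IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
            cfg.setOpenMode(IndexWriterConfig.OpenMode.CREATE);  // overwrites an existing index
            try (IndexWriter writer = new IndexWriter(dir, cfg)) {
                Document doc = new Document();
                doc.add(new StringField("id", "42", Field.Store.YES));  // DB primary key
                doc.add(new TextField("name", "value from a DB field", Field.Store.YES));
                writer.addDocument(doc);   // one Document per searchable DB row
            }

            // In the REST SearchService: parse the user's input and search
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                Query q = new QueryParser("name", new StandardAnalyzer()).parse("value");
                TopDocs hits = searcher.search(q, 10);
                System.out.println(hits.totalHits + " hit(s)");
            }
        }
    }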
Feel free to give me suggestions or best practices.

How do I extend this Netbeans JSF2 CRUD example to have a single create and edit form for all entities?

I recently discovered this very useful NetBeans tutorial for creating a simple JSF 2 CRUD application: http://netbeans.org/kb/docs/web/jsf20-crud.html. The final product has somewhat limited usability, as one is confronted with a myriad of web pages. I would like an example of how to consolidate the Create and Edit forms (using the same project if possible). This seems more in keeping with how a person would actually enter such information and would reduce the risk of data entry mistakes. Why enter a client and their billing address on separate screens? One should be able to add or remove addresses, if need be, on the client's edit form. Or, if a new client has multiple addresses, enter them all on the client's create form. The application just seemed incomplete, with no further tips on how to improve it. If anyone knows of a useful book that covers this, I would gladly read that as well. Thanks.
I didn't realize the complexity of my problem and found that I couldn't get what I needed using JSF2 with the information resources available. Through my searches, I also found that many others were asking about Master-Detail CRUD applications, which I then learned was what I needed, but in slightly different ways and not getting any solid examples. A problem properly stated is half solved and I didn't know the problem statement. Armed with more knowledge, I was shocked to find that the answers were not readily available outside of some videos on YouTube showcasing Oracle ADF. In the end, I was able to quickly build the application I desired using the Play! Framework. In a way, by not having my question answered I was able to find a solution that would prove to be a better fit for my needs; though I would have gladly bought a cookbook if someone had pointed one out.

How can I get started on programmatically analyzing web site content?

I've been looking for a new hobby programming project, and I think it would be interesting to dabble in ways to programmatically gather information from websites and then analyze that data to do things like aggregate or filter it. For example, if I wanted to write an application that could take Craigslist listings and then display only the ones matching a specific city, not just a geographical area. That's just a simple example, but you could go as advanced and sophisticated as how Google analyzes a site's content to decide how to rank it.
I know next to nothing about that subject and I think it would be fun to learn more about it, or hopefully do a very modest programming project in that topic. My problem is, I know so little that I don't even know how to find more information about the subject.
What are these types of programs called? What are some useful keywords to use when searching on Google? Where can I get some introductory reading material? Are there interesting papers I should read?
All I need is someone to disabuse me of my ignorance, so that I can do some research on my own.
cURL (http://en.wikipedia.org/wiki/CURL) is a good tool to fetch a website's contents and hand it off to a processor.
If you are proficient with a particular language, see if it supports cURL. If not, PHP (php.net) may be a good place to start.
When you have retrieved a website's content via cURL, you can use the language's text processing functionality to parse the data. You can use regular expressions (http://www.regular-expressions.info/) or functions such as PHP's strstr() to find and extract the particular data you seek.
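If PHP isn't your language, the same first experiment works in plain Java with nothing beyond the standard library (the URL and the regex here are placeholders; for serious scraping you would eventually swap the regex for a proper HTML parser):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class FetchAndExtract {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://example.com/listings");   // placeholder page
            StringBuilder html = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    html.append(line).append('\n');
                }
            }

            // Pull out every <h2>...</h2> heading as a stand-in for "the data you seek"
            Matcher m = Pattern.compile("<h2[^>]*>(.*?)</h2>", Pattern.DOTALL).matcher(html);
            while (m.find()) {
                System.out.println(m.group(1).trim());
            }
        }
    }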
Programs that "scan" other sites are usually called web crawlers or spiders.
I recently completed a project that uses the Google Search Appliance; it basically crawls the whole .com domain of the web server.
The GSA is a very powerful tool that indexes pretty much all the URLs it encounters and serves the results.
http://code.google.com/apis/searchappliance/documentation/60/xml_reference.html

Mechanical Turk: Using HTML in the API

Question for anyone who's used Mechanical Turk: Is it possible to take an HTML template created on Mechanical Turk's website, and then create more HITs based on that template from the command line tools or API?
According to the API docs, it's not possible to create new HTML and add it... from the API. However, what I want to do here is use a HIT template I have already created. It would seem like there should be a way to use that template (and load new data into it via the API), since Amazon has already approved it and I'm using it for HITs already. But I haven't seen a way in the documentation to do so.
The main reason I want the HTML is so I can apply styles that I can't apply by using a questions file. If there was some sort of "rich" question file, that might solve the problem.
You could post a job on Mechanical Turk to have a person take your template and insert your data into it for each HIT you want to create.
(yes, this is at least half sarcasm)
I know this is an old question, but the API has been updated to allow this using HITLayout: http://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/ApiReference_HITLayoutArticle.html.
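For anyone finding this now, creating a HIT from an existing layout looks roughly like this with the AWS SDK for Java (the layout ID and the parameter name are placeholders for whatever your own template defines; check the current SDK docs for the exact client setup):

    import com.amazonaws.services.mturk.AmazonMTurk;
    import com.amazonaws.services.mturk.AmazonMTurkClientBuilder;
    import com.amazonaws.services.mturk.model.CreateHITRequest;
    import com.amazonaws.services.mturk.model.HITLayoutParameter;

    public class CreateHitFromLayout {
        public static void main(String[] args) {
            AmazonMTurk client = AmazonMTurkClientBuilder.defaultClient();

            CreateHITRequest request = new CreateHITRequest()
                .withHITLayoutId("3XXXXEXAMPLELAYOUTID")   // placeholder: your template's layout ID
                .withHITLayoutParameters(
                    new HITLayoutParameter().withName("image_url")   // placeholder template parameter
                                            .withValue("http://example.com/img.jpg"))
                .withTitle("Tag this image")
                .withDescription("Pick the best tags for the image")
                .withReward("0.05")                        // the reward is a string, in USD
                .withAssignmentDurationInSeconds(600L)
                .withLifetimeInSeconds(86400L)
                .withMaxAssignments(3);

            String hitId = client.createHIT(request).getHIT().getHITId();
            System.out.println("Created HIT: " + hitId);
        }
    }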
As far as I know, there is no way to use manually created question templates from the API.
If you're planning on doing programmatic access, it may be easier to use the API in its entirety (i.e., specify your questions via XML and create HITs from those questions):
http://www.codeplex.com/MTurkDotNet (.NET SDK)
The API is pretty easy to use, and there are several code samples.
Alternatively, you can use the "External Question" question type, which may be better suited: you can host the entire question form yourself.