Is there any way to analyze unstructured data of Sakai LMS? Does anyone have any experience of Sakai unstructured data analytics? Please share your experience and thoughts. I am trying to learn.
Thank you.
Related
I work for a web hosting company that wants to integrate different data sources with BigQuery. The question now is which reporting/BI tool would be ideal for getting the data out of BigQuery so that retrieval, analysis, and reporting are fast and easy.
I'm looking into the options suggested by Google here: https://cloud.google.com/bigquery/partners/ but I was wondering if someone out there has more hands-on experience and could make a recommendation.
The company works with a MySQL-based billing system (with client, support, and service data), which is the main source of information, along with chat, CMS, and in-house systems that provide other information used to maintain the web infrastructure the business depends on.
Thank you.
It's really hard to answer this. Depends on the personnel you have at hand.
For idea validation we mostly use Data Studio.
Some of our people know Tableau, but once you leave GCP everything becomes slow: queries and interface updates take 30-60 seconds, because these tools relay the data and keep their own copy of it.
We have wired some data to ElasticSearch as well, and we use Kibana.
But once it's all validated, we consolidate the reports into our own dashboards, mainly because we are mostly developers and can do the programming. If you have a data analyst or data scientist with their own tools, let them use what they are comfortable with.
Always iterate and version. As a developer, you should be driven by a good product manager who tells you exactly which charts to build.
I need a step-by-step tutorial or some papers from which I can learn how to create my own JSON schema mining tool. It would be great if someone could share sources or ideas. Thanks!
We need to pull data into our data warehouse. One of the data sources is internal.
We have two options:
1. Ask the data source team to expose the data through an API.
2. Ask the data source team to dump the data daily and grant us read-only DB credentials to access the dump.
Could anyone please give some suggestions?
Thanks a lot!
A lot of this really depends on the size and nature of the data, what kind of tools you are using, whether the data source team has experience building APIs, etc.
I think we need a lot more information to make an informed recommendation here. I'd really recommend you have a conversation with your DBAs, see what options they have available to you, and seriously consider their recommendations. They probably have far more insight than we will concerning what would be most effective for your problem.
API solution cons:
Cost. Your data source team will have to build the API. Then you will have to build the client application that reads the data from the API and inserts it into the DB. You will also have to host the API somewhere and design the deployment process. It's quite a lot of work and I don't think it's worth it.
Performance. Not necessarily, but usually a data warehouse means dealing with a lot of data. With an API you will most likely have to transform the data first before you can use your database's bulk insert features.
The daily DB dump solution looks much better to me, but I would change it slightly if I were you: I would use a flat file. Most databases have a feature to bulk insert data from a file, and it usually turns out to be the fastest way to accomplish the task.
So from what I can tell from your question, I think you should do the following:
1. Agree with your data source team on the data file format. This way you can work independently and even use different RDBMSs.
2. Choose a good share that both the data source team's DB and your DB have fast access to.
3. Ask the data source team to implement the export logic to a file.
4. Implement the import logic from a file.
Please note that items 3 and 4 should take only a few lines of code. As I said, most databases have built-in and OPTIMIZED functionality to export/import data to and from a file.
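To give a rough idea of how small items 3 and 4 can be, here is a minimal sketch using Python's standard library. It is only an illustration under assumptions: SQLite stands in for both databases, and the `clients` table, its columns, and the helper names are hypothetical. On a real system you would more likely call the database's native bulk loader (for example `LOAD DATA INFILE` in MySQL or `COPY` in PostgreSQL) instead of row inserts.

```python
import csv
import os
import sqlite3
import tempfile


def export_table(conn, table, path):
    """Source side (item 3): write every row of `table` to a CSV file with a header row."""
    cur = conn.execute(f"SELECT * FROM {table}")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])
        writer.writerows(cur)


def import_table(conn, table, path):
    """Warehouse side (item 4): load the CSV file into `table` in one batch."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        columns = ", ".join(header)
        placeholders = ", ".join("?" for _ in header)
        conn.executemany(
            f"INSERT INTO {table} ({columns}) VALUES ({placeholders})", reader
        )
    conn.commit()


# Hypothetical source database owned by the data source team.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE clients (id INTEGER, name TEXT)")
source.executemany("INSERT INTO clients VALUES (?, ?)", [(1, "Ada"), (2, "Linus")])

# Hypothetical warehouse database with the agreed table layout.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE clients (id INTEGER, name TEXT)")

# The temp directory plays the role of the shared location from item 2.
dump_path = os.path.join(tempfile.mkdtemp(), "clients.csv")
export_table(source, "clients", dump_path)
import_table(warehouse, "clients", dump_path)
loaded = warehouse.execute("SELECT id, name FROM clients ORDER BY id").fetchall()
```

The header row is what lets the two sides work independently: the importer reads the column names from the file rather than assuming an order.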
Hope it helps!
A friend wants to start scraping data for a data-heavy site he wants me to try to build. I'm a (relatively new) Rails developer and don't know much about the data side of all this. If he's contracting out the scraping, any idea what sort of format can/should I get the data in to easily import it into a PostgreSQL database once I get the site started up?
Hope this isn't too vague a question. I don't know where to start looking for this.
The CSV file format is compatible with almost any database system and is quite a good place to start. Even if you later change your mind about which database system to use, you won't have to worry much about changing the format.
If you are thinking about data mining, then NoSQL database systems (MongoDB, CouchDB, etc.) may be a better solution. In that case the file format can be JSON as well.
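If you go the CSV route, it helps to agree on the header with whoever does the scraping and to check each delivery against it before importing. The sketch below is one way to do that with Python's standard library; the column names and the `records_to_csv` / `check_csv` helpers are made up for illustration.

```python
import csv
import io

COLUMNS = ["name", "url", "price"]  # hypothetical columns agreed with the scraper


def records_to_csv(records):
    """Serialize scraped records to CSV text with a header row and a fixed column order."""
    buf = io.StringIO()
    # extrasaction="ignore" drops keys that are not part of the agreed columns.
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)  # missing keys are written as empty fields
    return buf.getvalue()


def check_csv(text):
    """Return the parsed rows, raising ValueError if the header is not the agreed one."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return list(reader)


scraped = [
    {"name": "Widget, large", "url": "https://example.com/w", "price": "9.99"},
    {"name": "Gadget", "url": "https://example.com/g", "extra": "dropped"},
]
csv_text = records_to_csv(scraped)
rows = check_csv(csv_text)
```

A file in this shape can then be loaded into PostgreSQL with its `COPY` command (or `\copy` in `psql`) using the CSV format with a header row.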
We have been using MongoDB as our main datastore for a little while, but I've been finding it difficult to please our business users with its lack of relations.
What I'm hoping to do is dump data from Mongo to MySQL so that these heavy MySQL users can query in a format they are used to.
Does anyone know an elegant way to do this?
I'm hoping to avoid some heavy scripting to get there...
Read all the data! Write all the data!
Use your language-specific drivers for MongoDB and MySQL to fetch all the data from MongoDB and insert it into MySQL according to your table schemas. You will need some coding for sure.
Update: You can give this tool a try:
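As a rough sketch of what that coding could look like: the main work is usually flattening nested documents into the columns of your table. The example below assumes nothing about your real schema; plain dicts stand in for documents returned by a MongoDB driver, SQLite stands in for MySQL, and the `users` table and `flatten` helper are hypothetical.

```python
import sqlite3


def flatten(doc, prefix=""):
    """Flatten nested dicts into one level, joining keys with underscores."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


# Stand-ins for documents a MongoDB driver would return.
documents = [
    {"_id": "a1", "name": "Ada", "address": {"city": "London", "zip": "N1"}},
    {"_id": "b2", "name": "Linus", "address": {"city": "Helsinki"}},
]

# Columns of the hypothetical relational table, in insert order.
COLUMNS = ["_id", "name", "address_city", "address_zip"]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (_id TEXT PRIMARY KEY, name TEXT, "
    "address_city TEXT, address_zip TEXT)"
)

# Fields missing from a document become NULL in the table.
rows = [tuple(flatten(doc).get(col) for col in COLUMNS) for doc in documents]
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", rows)

result = conn.execute(
    "SELECT name, address_city, address_zip FROM users ORDER BY name"
).fetchall()
```

This only handles nested objects. Arrays inside documents would typically need their own child table with a foreign key back to the parent row, which is where most of the schema design effort goes.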
http://grahamis.com/blog/2010/06/10/squealer-intro/
https://github.com/delitescere/squealer
Strange question. Do you expect arbitrary documents to fit into an RDBMS schema?
You must be dreaming. Try the CSV export...everything else is up to you and your coding skills.
The answer is Kettle.
Thanks to Adam, our engineer, who found the solution.
http://kettle.pentaho.com/
They have added a tool to connect to Mongo.
You can then migrate the data to MySQL.