I was hoping someone could help me answer a couple of questions regarding Tableau. I am not very familiar with the platform, but I have a client who is looking for a reporting/analytics/data visualization platform that they could use for many of their internal apps (for their employees) and external (customer-facing internet with login) applications.
The driver is that each of their internal teams has used many disparate technologies such as SSRS, Crystal, and custom ASP.NET controls (Kendo/Telerik, etc.), but now they have the opportunity to choose a common platform that could serve most or all of their future reporting and data visualization needs for enterprise and customer-facing solutions.
They are looking for a platform that provides everything from simple grids with basic filter/sort/group, all the way to rich charting and ad-hoc reporting with slicing and dicing of data.
They will not always be creating dashboards in these apps, since some are customer-facing, but they may want dashboards for internal (intranet) apps. They will definitely want the ability to build true internal BI dashboards that report on data from all of these online apps, across all of the customers to whom they provide their SaaS/customer-facing web apps.
One of our main concerns revolves around security of data: some of these customer-facing web apps are multi-tenant, so we'd need to ensure that data is always filtered by the client's tenant id. We also have a very customized security model, with data-driven roles and permissions that may prevent showing certain types of data (e.g. SSN, salary, etc.).
Does Tableau fit this model? Can it meet most or all of these requirements, or is it meant more for internal data?
It should be quite possible by setting up a reverse proxy to front your multi-tenant web application. There is a document on how to set up Apache as a reverse proxy with Tableau, with or without SSL.
I am familiar with configuring Apache as a reverse proxy, so here are the details of how to set up the reverse proxy rules with the Apache web server.
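As a minimal sketch (Apache 2.4, with mod_proxy, mod_proxy_http, and mod_ssl loaded); the public hostname, certificate paths, and internal Tableau address below are placeholders, not values taken from the Tableau document:

```apache
# Minimal reverse-proxy sketch for fronting Tableau (Apache 2.4).
# reports.example.com and tableau-internal are placeholder names.
<VirtualHost *:443>
    ServerName reports.example.com            # public-facing hostname
    SSLEngine on
    SSLCertificateFile    /etc/ssl/certs/reports.crt
    SSLCertificateKeyFile /etc/ssl/private/reports.key

    ProxyRequests Off                         # reverse proxy only, never a forward proxy
    ProxyPreserveHost On                      # pass the public hostname through to Tableau

    ProxyPass        / http://tableau-internal:80/
    ProxyPassReverse / http://tableau-internal:80/
</VirtualHost>
```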
There may also be documentation for fronting Tableau with IIS or Nginx, so it's worth doing some googling on those yourself.
You need to harden your web server configuration by limiting external access to read-only pages, while internal users can access all pages. Since you mentioned that external users are only allowed read-only pages, I presume requests from external users will be almost entirely GET requests, plus a few PUT/POST requests when users apply filters. So you can block external users from any request except GET, with exceptions for the pages that allow applying filters and grouping.
In your multi-tenant application, make sure you refer to the Tableau URLs via the Apache server URL that is exposed to the outside world. If any URL not configured in Apache is used, users will receive an access-denied error. You need to create a role that has read-only access to Tableau pages for external users. To address multi-tenancy, you need to set a cookie or similar to identify the tenant, and something similar to identify the user. To filter SSNs and similar sensitive information you can use mod_proxy_html, which filters content; you can also use Apache's mod_security module to block SSNs and credit card numbers. A combined sketch of these hardening ideas follows.
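Here is that sketch, assuming Apache 2.4 with ModSecurity 2.x loaded; the internal IP range, rule id, and SSN regex are illustrative placeholders:

```apache
# Restrict external users to GET; internal users (placeholder range) may use
# any method. Add further <Location> blocks to allow POST on the specific
# filter/grouping pages that need it.
<Location "/">
    <LimitExcept GET>
        Require ip 10.0.0.0/8
    </LimitExcept>
</Location>

# ModSecurity: block responses that appear to contain an SSN.
<IfModule security2_module>
    SecRuleEngine On
    SecResponseBodyAccess On                  # required to inspect response bodies
    SecRule RESPONSE_BODY "@rx \b\d{3}-\d{2}-\d{4}\b" \
        "id:100001,phase:4,deny,status:403,msg:'Possible SSN in response'"
</IfModule>
```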
References:
Configuring Apache Server as Proxy with Tableau
Apache mod Proxy documentation
Blocking POST requests
mod_security FAQs
Yes to most of your questions -- with just a little fine print.
First, remember Tableau is primarily about visualizing data, so it is great for publishing read-only interactive views of data. If you want to allow end users to edit data, you'll have to do that by another means. Fortunately, the Tableau JavaScript API lets you interact closely with Tableau from your custom JavaScript code. So if your needs are mostly about visualization, but you want to be able to trigger some custom code to modify data in some of your apps, you should be fine. But Tableau is not designed for creating custom CRUD apps as a rule.
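As a rough illustration of that glue code, here is a sketch using the v2 JavaScript embedding API (loaded from your server's tableau-2.min.js); the view URL and the /api/annotate endpoint are hypothetical:

```javascript
// Sketch: embed a Tableau view and hand mark selections to custom code.
// The view URL and the /api/annotate endpoint are placeholders.
var containerDiv = document.getElementById("vizContainer");
var vizUrl = "https://tableau.example.com/views/Sales/Overview";

var viz = new tableau.Viz(containerDiv, vizUrl, {
  hideTabs: true,
  onFirstInteractive: function () {
    console.log("Viz is ready");
  }
});

// When the user selects marks, trigger your own application code
// (e.g. an edit form that writes back through your own API).
viz.addEventListener(tableau.TableauEventName.MARKS_SELECTION, function (event) {
  event.getMarksAsync().then(function (marks) {
    fetch("/api/annotate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ selectedMarks: marks.length })
    });
  });
});
```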
The great thing about Tableau Server is that many people can learn to use it and publish their own visualizations -- even if they don't know how to program. That doesn't mean they will win visualization design awards the first time, or that they shouldn't learn something about how databases work if they want good performance. But it does mean the people who know their data best can learn to design and publish their own visualizations without having to wait three months on a backlog queue so the one IT guy can change the color of a button or add a field. It would still be good to get good system, database, and visualization folks to help train, organize data, set governance and security rules, optimize, etc., but business users can learn to be the ones with hands-on control over how their information is presented. That's a good thing.
The security question has several moving parts, and there are usually good answers from Tableau depending on what you're trying to accomplish. Tableau Server does support multi-tenancy using sites. There is a fairly flexible permissions and group policy system. It can use SAML for authentication, and it has several features for restricting access to data specific to the user/tenant. It works with almost every database, and you can in some cases push your security enforcement down to the database server -- SQL Server, for instance. There is a trusted ticket feature where you can defer some authorization decisions to another server, say a web portal server -- useful when Tableau visualizations are embedded in some other web page.
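To make the trusted ticket flow concrete, here is a rough Node.js sketch of the documented handshake: the portal POSTs a username to Tableau Server's /trusted endpoint, gets a one-time ticket back ("-1" means refusal), and redeems it in the embed URL. The server name and view path are placeholders:

```javascript
// Node.js sketch of Tableau's trusted-ticket handshake.
// tableau.example.com and the view path are placeholders.
const https = require("https");
const querystring = require("querystring");

function getTrustedTicket(username, callback) {
  const postData = querystring.stringify({ username: username });
  const req = https.request(
    {
      hostname: "tableau.example.com",
      path: "/trusted",
      method: "POST",
      headers: {
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": Buffer.byteLength(postData)
      }
    },
    (res) => {
      let ticket = "";
      res.on("data", (chunk) => (ticket += chunk));
      // Tableau answers "-1" when it refuses to issue a ticket.
      res.on("end", () => callback(ticket === "-1" ? null : ticket));
    }
  );
  req.end(postData);
}

// The ticket is redeemed once in the embed URL:
//   https://tableau.example.com/trusted/<ticket>/views/Sales/Overview
getTrustedTicket("portal_user", (ticket) => console.log(ticket));
```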
Most security use cases can be supported out of the box, but there are some complex custom access control situations that are tricky to implement currently in Tableau server. Nothing you've listed sounds out of the normal swim lane, but the only way to know whether your security model is too complex is to dive into the details. Hopefully they will release a custom access control API for users who want to extend it.
At a high level, you certainly can use Tableau to build customer-facing dashboards. You can build and deploy them quickly, and as others mentioned, you can embed them in an iframe and customize most of it with the JavaScript API. But it doesn't provide complete flexibility for user interaction, which you can get with other technologies. Other options include hand-coding a framework and then using charting libraries.
For simple dashboards, Tableau would be the obvious choice if you have already bought core licenses. But looking at what's going on in the industry, Tableau will not be able to fulfill all needs.
If using Tableau:
1. Building charts/tables/visualizations is super simple and efficient.
2. You can expose low-grained data to customers; because of Tableau's proprietary columnar database engine, you can potentially expose millions of records via a dashboard.
3. You can use Tableau's security and access control mechanisms.
4. As another user mentioned, you can use the trusted ticketing mechanism to integrate easily with other applications (portals, etc.).
Challenges with the Tableau approach:
1. If you have late-arriving transactions (in the Internet world it's common to mark a click as fraudulent after a few days), you have to do a full refresh of your extracts, which means if you are showing, say, 13 months' worth of data, you have to refresh all of it, all the time. And with big data, the business wants all the data all the time, which means you could end up extracting millions of records throughout the day.
2. There is very little flexibility in user interactions, like menus, drop-downs, etc.; you have to work with what Tableau provides.
3. If you have multiple charts on the same dashboard page, there is no user-friendly way to download the underlying data.
4. There are many other challenges in laying out visualizations on a dashboard page, as there is no easy way to control the canvas with pixel-level precision, manage white space, etc.
You should analyze your use case very carefully and make sure Tableau would be the right product before you invest in it.
Tableau's primary power comes from its desktop tool for data visualization/exploration, not from pre-built dashboards.
Best of luck.
Since Tableau Public is also based on Tableau, I assume that you can make your dashboards public using your own Tableau infrastructure.
I have some experience in MS Access, but mostly only as an offline DB tool.
I have begun working with both Seller and Vendor Central at my new company, and I am in charge of scrubbing the vast amount of data for trends and whatnot. At the moment our company is relying solely on exporting reports from Seller Central directly and cross-referencing documents. I was hoping to get us started with a rudimentary database hooked into Seller Central directly. Our company already has an MWS Developer ID, and I see an MWS Access Key and whatnot.
I'm surprised not to find any resources on how I should actually connect MWS to Access. I feel confident that I can find some success by dabbling with the API once it's connected, but I can't find any references on how to actually establish that connection.
Any resources you guys can forward me? Maybe I'm searching for the wrong terms. Everything I search just comes up with data service companies advertising their tools.
Well, the interface to MWS is going to be web-service based. And Access unfortunately does not have a built-in web services interface.
So, your choices are:
Write some VBA code to hit/use/consume the MWS web services. Web services are just that: a web API (likely REST services; REST is just a fancy term meaning you request a given URL).
So, what are you looking to search for?
Something like: how can I consume web-based data in Access?
See this answer on SO:
Making a SOAP request from Access 2007
The main issue is that Access does not have really good tools for consuming web data.
However, most web "store" applications tend to have a user area in which you can export the daily sales or data to, say, CSV. You can then import that data into Access (or Excel).
And they often have a report area: you can generate a report, then download it in some format like XML or CSV (and again, import it into Access or Excel).
If you don't want to have to manually import the data?
Then you have to code out web requests. And that can be painful.
This unfortunately means you can't use, say, a linked table (ODBC) the way you can from Access to some database.
So, you have to start writing web interface code (it will be SOAP or REST).
Believe it or not, there was a SOAP add-in toolkit for Access 2003. But no one used it, so they dropped it. (Of course, now, 17 years later, a truckload of people GET IT and see the need to consume web data!)
So, to your question and what to learn about?
You're asking how one consumes web services.
Well, using a tool designed to work with web services helps a lot (that's why I suggest Visual Studio and .NET). If they have a WSDL for you? Then you can point Visual Studio at the web (WSDL), and it will crank out a set of "methods" and properties for you (it will create a class). But then again, did you ever use and write class objects in VBA? (It does support you creating classes.) The old SOAP toolkit (no longer available) would write this code for you.
So, if you want to go beyond their built-in reporting tools (which let you export and download the data in some format like CSV for use with Access or Excel)?
Then you have to write code to make web calls.
This is not a lot different from the past. If you wanted some data from the accounting system? Well, you can/could/usually do some export from the accounting package to spit out a CSV file of some sort. You then import it into Access.
However, if you had better skills, you might link up to the database from Access using ODBC and then write some SQL queries against that data. So, it really comes down to skill level here. Some could not be bothered to learn, say, SQL and a query, so they just export the data out of accounting and then import it into Access.
The problem is that now you can't link to that web site and use SQL queries on the data. You have to use web service calls (at least if you want to make some of this process automatic).
So, you might be just fine exporting data/files from the MWS services and then importing them into Excel or Access. That way, you're not writing any code, and you just use the Access GUI to import the data.
But, some want to just hit a button in Access, and see all the orders and sales from today - and have Access pull that data from the web site with one click.
For some simple data pulls? You could make a web call from Access. But for complex web interfaces? Then you need to use tools that support web interfacing (say like Visual Studio .net).
For a simple data pull? I'll use VBA and MSXML.
But if the parameters and the data call are complex? Then I write it in .NET, and THEN expose that code as a library that MS Access can consume.
So, once you've signed up for MWS and whatever web services? Then they will supply you with the web calls and documentation. You are then free to use your programming tools of choice to interface. But this can be quite a bit of work. So you might use VBA, but .NET is much better suited for this type of work (though it is also a lot more difficult to code).
As a developer who has done this, I would write a "sync" program that connects to MWS, pulls back your data, and then inserts that into MS Access. In my case, it was a C# .NET Core app with SQL Server and I used the available MWS SDK that Amazon provides for free to handle all the API calls to MWS. You can create a schedule so your app pulls the data on an interval, or make it manual where you push a button to sync it into your system.
Of course you can use Java or PHP instead of C#, or you can roll your own MWS API calls. Or like you mention there are several third party vendors that have out-of-the-box ready solutions.
I haven't used MS Access in 20 years or so, so I'm not sure about calling MWS directly. I would gather it could be done, but it is probably too much work (though I could be wrong). A .NET app can insert into MS Access, no problem, and it can also handle the HTTP calls to MWS for you.
I work for a web hosting company that is looking to integrate different data sources with BigQuery. The question now is what would be an ideal reporting/BI tool for getting the data out of BigQuery, so that proper/fast/easy retrieval, analysis, and reporting can be done with it.
I'm looking into the options suggested by Google here: https://cloud.google.com/bigquery/partners/ but I was wondering if someone out there has more hands-on experience and could make a recommendation.
The company works with a MySQL-based billing system (with client, support, and service data), which is the main source of info, along with chat, CMS, and in-house-developed systems that provide other sources of information used to maintain the web infrastructure the business depends on.
Thank you.
It's really hard to answer this; it depends on the personnel you have at hand.
For idea validation we mostly use Data Studio.
Some personnel know Tableau, but once you are outside GCP everything becomes a slow process, with queries and interface updates taking 30-60 seconds, since those tools relay and store the data on their own.
We have wired some data into Elasticsearch as well, and we use Kibana.
But once it's all validated, we consolidate the reports into our own dashboards, mainly because we are mostly developers and can do the programming. If you have a data analyst or data scientist with their own tools, let them use what they are comfortable with.
Always iterate and version; as a developer, you should be driven by a good product manager who tells you exactly which charts to build out.
I am very new to database design and structuring - I have had no formal training and am purely self-taught, so I apologize in advance if this is a bland question.
I am designing a web app and am thinking ahead, since users will need to be able to interact with each other and share part of their data. I am wondering if there is a standard convention for controlling access to tables in MySQL, and how I should generally tackle this problem in code written with NodeJS, ExpressJS, KnexJS, and BookshelfJS.
For example: a user will be matched with another user; both users will be able to see location, favourite book, etc., but not last name, birth date, etc.
How do I control this?
If anyone could point me to a few resources they have found helpful that would be great as well.
You seem to have learned a bit of MySQL and its access control features. Well, database-level user access control IS NOT used by modern applications -- it would make resource management, like connection pooling, very hard to implement. SQL databases backing web applications usually have a single user or, at most, two: one for general data access and one for admin purposes.
The kind of access control you mentioned MUST be handled by your application code, YOUR code. There are libraries that help take care of authentication (e.g. passport) and authorization, but ultimately it is YOUR CODE's responsibility.
So my answer to your "How do I control this?" question is:
With YOUR code.
This is the whole point of Software Development.
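For the stack you mentioned, a minimal Express/Knex sketch of that idea might look like the following; the matches/users tables, the column names, and req.user (set by your authentication middleware, e.g. passport) are assumptions about your app, not a prescribed design:

```javascript
// Sketch: field-level access control enforced in YOUR application code.
const express = require("express");
const knex = require("knex")({
  client: "mysql",
  connection: { /* your MySQL credentials */ }
});

// The only columns one matched user may see about another (assumed schema).
const SHARED_COLUMNS = ["id", "location", "favourite_book"];

const app = express();
// app.use(yourAuthMiddleware) would set req.user before this route runs.

app.get("/users/:id/profile", async (req, res) => {
  // Authorization: the two users must actually be matched (assumed table).
  const match = await knex("matches")
    .where({ user_id: req.user.id, matched_user_id: req.params.id })
    .first();
  if (!match) return res.status(403).json({ error: "not matched" });

  // Selecting only the whitelisted columns means last_name, birth_date,
  // etc. never even leave the database.
  const profile = await knex("users")
    .select(SHARED_COLUMNS)
    .where({ id: req.params.id })
    .first();
  res.json(profile);
});

app.listen(3000);
```

The key point is that the whitelist lives in one place in your code, so every route that exposes another user's profile goes through the same filter.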
We have a multi-tenant site where each client's data is partitioned by the equivalent of a customer_id. We currently have a basic (custom) reporting system where we maintain a set of SQL queries with parameters that get replaced (based on the logged-in user) to enforce the data separation. While it's not ideal, it works for our current needs.
However, we are now getting more and more sophisticated reporting requests, including the ability to do ad-hoc reporting. Most off-the-shelf reporting tools assume you can expose the entire dataset; we need the ability to restrict the data available to the tool to a specific customer.
For background, our app is a Symfony2 application backed by MySQL, and it would be nice to (relatively) easily embed the tool within the app.
Specific solutions/software are appreciated along with general approaches to multitenant reporting.
Let us say I want a "hello world" program to run on the server, but any number of clients can execute the program whenever they want.
An example of the kind of application I am asking about is Google Docs.
If the application is simple, then there's no special principle. As an example, since we're talking about Google, let's look at their main entry page. Ignoring the menu links, it's a basic web page that uses an HTML form to post the search terms to a back-end server application that performs the search. Literally thousands of users can use this web page at the same time.
The more users, the more web servers running your web page you'll need.
If the application is more complicated, like Google Docs, then you have to figure out a way for each user of the application to save information separately and securely. You'd probably start with a user id.
The more users, the more disk storage you'll need. System managed storage would be helpful.
As far as software goes, you can use any language you wish to develop the application. jQuery is popular on the front end. There's Java EE, Ruby, .NET, and a host of other options. For a relational database, you have choices like MySQL, Oracle, or DB2.
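For instance, here is a toy Node.js sketch of the "save information separately and securely, keyed by user id" idea above; the X-User-Id header and the in-memory map are stand-ins for a real login/session mechanism and durable storage:

```javascript
// Toy sketch: every read/write is keyed by the requesting user's id, so one
// user can never see another user's documents.
const express = require("express");
const app = express();
app.use(express.json());

const docsByUser = new Map(); // stand-in for real, durable storage

app.put("/docs/:docId", (req, res) => {
  const userId = req.get("X-User-Id"); // stand-in for a real session check
  if (!userId) return res.status(401).end();
  if (!docsByUser.has(userId)) docsByUser.set(userId, new Map());
  docsByUser.get(userId).set(req.params.docId, req.body);
  res.status(204).end();
});

app.get("/docs/:docId", (req, res) => {
  const userId = req.get("X-User-Id");
  const docs = docsByUser.get(userId);
  const doc = docs && docs.get(req.params.docId);
  if (!doc) return res.status(404).end();
  res.json(doc);
});

app.listen(3000);
```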