Get usage metrics for Carbon or Workshop in Palantir Foundry

We have a couple of Workshop applications and Carbon workspaces, and we are interested in some aggregated user/usage metrics, for example daily/weekly unique users. Are these numbers available somewhere so that we could present them in a simple overview?
I'm aware of the Usage tab in OMA; however, it is not granular enough on a per-application basis for our needs. Something similar to that for Workshop/Carbon would be great!

You will have to request that data from your Palantir customer support team.
The data may exist but might not currently be ingested into your instance.

Related

Trying to understand GCP cloud costs and determine free or low-cost relational database hosting

I was originally planning to use Azure SQL for a client's database, but Azure said that the estimated cost was going to be something around $250/month for the most basic configuration. I remember from using Azure for my own experimentation in the past that costs were higher than expected, so I decided to look at GCP as an alternative.
GCP offered me a free trial credit of $300, so I accepted that by default. I created a new SQL Server instance via my GCP account, created the most basic database configuration, then connected via SSMS and created a single database table with a single Id column. That's it. Now, 2 days later, with no additional usage of this database table, my GCP free trial credit has been burned down by $15. Based on this trend, a SQL Server instance on GCP seems to cost about as much as an Azure SQL instance. Am I inferring this correctly?
Can you recommend a good-quality option that provides free relational database hosting for low-volume, low-transaction databases? SQL Server would be great, but MySQL should work too. I'm assuming that MySQL is fairly equivalent for simple databases?
I don't know about costs for other cloud providers, but GCP's are usually really competitive. With Cloud SQL you pay per instance-hour, and you pay more or less based on different factors. Use the Google Cloud price calculator to get a general idea of the costs, and adjust your Cloud SQL configuration accordingly: https://cloud.google.com/products/calculator
Additionally, the Cloud SQL pricing documentation covers all the pricing details.

What is a good AWS solution (DB, ETL, Batch Job) to store large historical trading data (with daily refresh) for machine learning analysis?

I want to build a machine learning system (a Python program) using a large amount of historical trading data.
The trading company has an API to grab their historical and real-time data. Data volume is about 100 GB for historical data and about 200 MB of new data daily.
Trading data is typical time series data: price, name, region, timeline, etc. The data can be retrieved as large files or stored in a relational DB.
So my question is: what is the best way to store this data on AWS, and what's the best way to add new data every day (e.g. through a cron job or an ETL job)? Possible solutions include storing it in a relational database like MySQL, in NoSQL databases like DynamoDB or Redis, or in a file system to be read by the Python program directly. I just need to find a solution to persist the data in AWS so multiple teams can grab it for research.
Also, since it's a research project, I don't want to spend too much time exploring new systems or emerging technologies. I know there are time series databases like InfluxDB and the new Amazon Timestream, but considering the learning curve and deadline requirements, I'm not inclined to learn and use them for now.
I'm familiar with MySQL. If really needed, I can pick up NoSQL, like Redis/DynamoDB.
Any advice? Many thanks!
If you want to use AWS EMR, then the simplest solution is probably just to run a daily job that dumps data into a file in S3. However, if you want to use something a little more SQL-ey, you could load everything into Redshift.
If your goal is to make the data available in some form to other people, then you should definitely put it in S3. AWS has ETL and data migration tools that can move data from S3 to a variety of destinations, so others will not be restricted in how they use the data just because it is stored in S3.
On top of that, S3 is the cheapest (warm) storage option available in AWS, and for all practical purposes its throughput is unlimited. If you store the data in a SQL database, you significantly limit the rate at which it can be retrieved. If you store the data in a NoSQL database, you may be able to support more traffic (maybe), but at significant cost.
Just to further illustrate my point, I recently did an experiment to test certain properties of one of the S3 APIs, and part of my experiment involved uploading ~100GB of data to S3 from an EC2 instance. I was able to upload all of that data in just a few minutes, and it cost next to nothing.
The only thing you need to decide is the format of your data files. You should talk to some of the other people and find out if JSON, CSV, or something else is preferred.
As for adding new data, I would set up a lambda function that is triggered by a CloudWatch event. The lambda function can get the data from your data source and put it into S3. The CloudWatch event trigger is cron based, so it’s easy enough to switch between hourly, daily, or whatever frequency meets your needs.
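A minimal sketch of that handler in Python with boto3 — the bucket name and vendor endpoint are placeholders, not real resources, and the schedule would come from a CloudWatch Events cron rule such as `cron(0 1 * * ? *)`:

```python
# Minimal sketch of the scheduled-ingest Lambda (Python runtime).
# The bucket name and API URL below are placeholders.
import urllib.request
from datetime import date

import boto3

BUCKET = "my-trading-data"                  # hypothetical bucket
API_URL = "https://api.example.com/daily"   # placeholder vendor endpoint

s3 = boto3.client("s3")

def handler(event, context):
    # Extract: fetch the day's data from the vendor API.
    with urllib.request.urlopen(API_URL) as resp:
        payload = resp.read()

    # Load: partition keys by date so EMR, Redshift COPY, or Athena
    # can prune by day later.
    key = f"trades/dt={date.today().isoformat()}/data.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=payload)
    return {"written": key}
```

Switching the CloudWatch rule's cron expression is all it takes to move between hourly and daily ingestion; the handler itself doesn't change.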

Service for accessing multiple user cloud storage accounts from single server

I'm working on a free educational web app for school music teachers and students that will allow them to collaborate and share mp3 recordings. Since earning revenue is not the goal, I'm looking for ways to reduce file storage costs. A single teacher assignment might produce hundreds of recorded responses.
Instead of saving these recordings to my own storage (or to a service like Amazon's S3), I was wondering if there are any cloud storage services that teachers could sign up for - similar to something like Google Drive - which they could then give my server app access to for storing their class's recordings. I'd still manage the info for the recordings and other data in a single database on my own server, but I'd save any large files to the shared storage provided to me by each teacher. I haven't found any examples of this sort of thing with services like Google Drive or Dropbox, but if it's possible with those or any other services, I'd appreciate a link to some info.
The expectation would be that a teacher could pay the storage company for its service according to the school's usage. The service would have to be simple for teachers to sign up for and provide me access to, which I think puts some of the developer-oriented services out of reach.
Suggestions for different strategies are also welcome. I'd prefer not to handle financial transactions (so I don't want to rent space to people).
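For what it's worth, Google Drive's API does support this pattern, assuming each teacher completes a one-time OAuth consent flow and the server stores the resulting refresh token. A rough sketch with the official Python client — every identifier below is a placeholder, not a working credential:

```python
# Sketch: upload a recording into a teacher's own Google Drive.
# Assumes the teacher granted offline access via OAuth and we stored
# their refresh token. All tokens/IDs/paths here are placeholders.
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

creds = Credentials(
    token=None,
    refresh_token="stored-refresh-token",    # per-teacher, from consent flow
    token_uri="https://oauth2.googleapis.com/token",
    client_id="app-client-id",
    client_secret="app-client-secret",
    scopes=["https://www.googleapis.com/auth/drive.file"],
)

drive = build("drive", "v3", credentials=creds)
media = MediaFileUpload("response-123.mp3", mimetype="audio/mpeg")
created = drive.files().create(
    body={"name": "response-123.mp3", "parents": ["class-folder-id"]},
    media_body=media,
    fields="id",
).execute()
print("stored as Drive file", created["id"])  # keep this id in your own DB
```

The `drive.file` scope only grants access to files the app itself creates, which keeps the permission ask small for teachers; storage then counts against each teacher's own Drive quota.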

How to synchronize market data frequently and show it as historical time series data

http://pubapi.cryptsy.com/api.php?method=marketdatav2
I would like to synchronize market data on a continuous basis (e.g. Cryptsy and other exchanges), and show the latest buy/sell prices from the respective orders on these exchanges, on a regular basis, as a historical time series.
What backend database should I use to store the retrieved data and render or plot any parameter from it as a historical time series?
I'd suggest you look at a database tuned for handling time series data. The one that springs to mind is InfluxDB. This question has a more general take on time series databases.
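To make that concrete, here is a hedged sketch using the `influxdb` Python client (for InfluxDB 1.x); the measurement, tag, and field names are assumptions, not a prescribed schema:

```python
# Write ticker snapshots to InfluxDB and read them back as a series.
# Measurement/tag/field names are invented for illustration.
from datetime import datetime, timezone
from influxdb import InfluxDBClient

client = InfluxDBClient(host="localhost", port=8086, database="markets")
client.create_database("markets")  # no-op if it already exists

point = {
    "measurement": "ticker",
    "tags": {"exchange": "cryptsy", "market": "BTC/USD"},
    "time": datetime.now(timezone.utc).isoformat(),
    "fields": {"buy": 431.25, "sell": 432.10},
}
client.write_points([point])

# Pull a week of history for plotting:
result = client.query("SELECT buy, sell FROM ticker WHERE time > now() - 7d")
print(list(result.get_points()))
```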
I think we need more detail about the requirement.
The question just says "it needs to sync time series data". What is the scenario? What are the data source and destination?
Option 1.
If it is just a data synchronization issue between two databases, the easiest solution is the CouchDB NoSQL family (CouchDB, Couchbase, Cloudant).
They are all based on CouchDB, and they provide data-center-level data replication (XDCR), so you can replicate the data to another CouchDB in another data center, or even to a CouchDB on mobile devices.
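For illustration, kicking off replication is a single HTTP call against CouchDB's `_replicate` endpoint; the hosts, database names, and credentials below are placeholders:

```python
# Trigger CouchDB's built-in replication between two databases.
import requests

resp = requests.post(
    "http://localhost:5984/_replicate",
    json={
        "source": "markets",
        "target": "http://other-dc.example.com:5984/markets",
        "continuous": True,  # keep the databases in sync as data arrives
    },
    auth=("admin", "password"),
)
print(resp.json())
```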
I hope it will be useful to you.
Option 2.
Another approach is data integration: you can sync data by using an ETL batch job, where a batch worker copies data to the destination periodically. This is the most common way to replicate data to another destination. There are a lot of tools that support ETL, like Pentaho ETL, Spring Integration, and Apache Camel. A bare-bones version of the idea is sketched below.
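Here is that idea in plain Python, run from cron (the tools above are heavier-weight equivalents). The table layout is a placeholder, and the response parsing is a guess at the API's shape, not its actual schema:

```python
# Bare-bones ETL batch job: pull the latest market snapshot and append
# it to a local store. Run from cron at whatever frequency you need.
import json
import sqlite3
import time
import urllib.request

API_URL = "http://pubapi.cryptsy.com/api.php?method=marketdatav2"

def sync_once(db_path="markets.db"):
    # Extract: fetch the current snapshot from the exchange API.
    with urllib.request.urlopen(API_URL) as resp:
        snapshot = json.load(resp)

    # Load: one row per market, keeping the raw payload so the schema
    # doesn't have to anticipate every field the exchange returns.
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS ticks
                    (market TEXT, payload TEXT, fetched_at REAL)""")
    markets = snapshot.get("return", {}).get("markets", {})
    conn.executemany(
        "INSERT INTO ticks VALUES (?, ?, ?)",
        [(name, json.dumps(m), time.time()) for name, m in markets.items()],
    )
    conn.commit()
    conn.close()

if __name__ == "__main__":
    sync_once()
```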
If you provide me with a more detailed scenario, I can help you in more detail.
Enjoy
-Terry
I think MongoDB is a good choice. Here is why:
You can easily scale out, and thus be able to store a tremendous amount of data. When using an appropriate shard key, you might even be able to position the shards close to the exchange they follow in order to improve speed, should that become a concern.
Replica sets offer automatic failover, in case availability becomes an issue.
Using the TTL feature, data can be automatically deleted after its TTL expires, effectively creating a round-robin database.
Both the aggregation and the map/reduce frameworks will be helpful.
There are some free classes at MongoDB University which will help you avoid the most common pitfalls.
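As a hedged illustration of the TTL point with pymongo — the collection and field names are invented for the example:

```python
# TTL index demo: documents expire 30 days after their `ts` value,
# so the collection behaves like a fixed-window (round-robin) store.
from datetime import datetime, timezone
from pymongo import MongoClient

db = MongoClient()["markets"]
ticks = db["ticks"]

ticks.create_index("ts", expireAfterSeconds=30 * 24 * 3600)

ticks.insert_one({
    "exchange": "cryptsy",
    "market": "BTC/USD",
    "buy": 431.25,
    "sell": 432.10,
    "ts": datetime.now(timezone.utc),  # must be a date type for TTL to apply
})
```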

Database schema/design for storing metrics

For clarification, I don't want to store metrics about the database itself; rather, I want to build a database to store metrics from the various controls we measure at my organization, for easy reporting. A little background: as a manager, I pull metrics from various applications - our two ticketing systems (yeah, I know), our phone system, alerts from our event management software (i.e., Nagios), etc. I report on these on a weekly basis and keep an Excel spreadsheet with the historical data. The spreadsheet is really, really big and inflexible.
I'm not new to writing web apps, but I'm still new to the database design arena. I want to code an app with some awesome Rickshaw JavaScript graphs showing great historical data (and to wow the senior management team with crazy colors and interactivity).
Where do I start with the database? I could create one table for all metrics, but how would I index those into the various types (for instance, phone metrics include abandon rate, total inbound calls, total outbound calls, total time on call, average talk time, average hold time, max hold time, etc.)? That's one messy, unorganized table.
I could create one table for each type (phone, ticket, event, etc.), but that seems to make it hard to add new metrics later.
I'm hoping someone here has some experience and can give me some pointers on what direction I should head.
PS: It will need to be SQLite or MySQL, just due to the resources I have available at this time.
A MySQL design for such a system can be made considering the following:
A table for each type of metrics group; for example, an entity of the ticket system can be a single ticket.
If a ticket is connected to a single user, you may include the user name in the ticket table; otherwise, to keep it flexible, create a table for each connected element. For example, a ticket is assigned to staff and has multiple telephone calls associated with it, so you would need a calls table and a staff table.
To map multiple items, create mapping tables, for example stafftickets and ticketcalls, to associate staff with multiple tickets and tickets with multiple calls.
Once you have defined these entities, you can sit down with MySQL/phpMyAdmin and create the tables.
For the charting side of things, use D3.js: just spit out JSON and use JavaScript (or json2) to bind it to your graphs.
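One possible middle ground between "one giant table" and "a table per source" is a long layout: a metric-definitions table plus a narrow observations table, so adding a brand-new metric later is an INSERT rather than a schema change. A sketch with sqlite3 (per the PS; the names are illustrative, and the DDL translates directly to MySQL):

```python
# Metric-definitions table plus a narrow observations table.
import sqlite3

conn = sqlite3.connect("metrics.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS metric (
    id     INTEGER PRIMARY KEY,
    source TEXT NOT NULL,              -- 'phone', 'ticket', 'event', ...
    name   TEXT NOT NULL,              -- 'abandon_rate', 'total_inbound', ...
    unit   TEXT,                       -- '%', 'calls', 'seconds', ...
    UNIQUE (source, name)
);
CREATE TABLE IF NOT EXISTS observation (
    metric_id   INTEGER NOT NULL REFERENCES metric(id),
    observed_on DATE NOT NULL,         -- the day or week being reported
    value       REAL NOT NULL,
    PRIMARY KEY (metric_id, observed_on)
);
""")

# Registering a new metric and recording one weekly value:
conn.execute("INSERT OR IGNORE INTO metric (source, name, unit) VALUES (?, ?, ?)",
             ("phone", "abandon_rate", "%"))
conn.execute("""INSERT INTO observation (metric_id, observed_on, value)
                SELECT id, '2014-01-06', 3.2 FROM metric
                WHERE source = 'phone' AND name = 'abandon_rate'""")
conn.commit()
```

This long format also feeds charting libraries naturally: each chart series is one query filtered by metric_id, which maps neatly onto the one-array-per-series input that Rickshaw and D3 expect.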