Is it possible to partially calculate the graphs of Open Trip Planner? - gtfs

I have a question about Open Trip Planner. In the project I'm working on I've automated the download of GTFS files whenever an update is available.
Every time my server downloads a new update, however, it rebuilds the entire graph (including the map and the GTFS feeds that have not been modified).
Is it possible in OTP to update only the graphs related to specific GTFS feeds?
The guide says it is possible to save graphs, but that part of the tutorial is very sparse, or maybe I just can't find it.
Link to the guide I am referring to: http://docs.opentripplanner.org/en/latest/Basic-Tutorial/
I am using the following command to start OTP:
java -Xmx512m -jar "/otp-1.4.0-shaded.jar" --build "/downloads" --inMemory
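(Not an OTP feature, just an illustrative workaround sketch: since the download is already automated, the rebuild can at least be skipped when none of the feeds actually changed. The paths and the hash bookkeeping below are assumptions; the OTP invocation is the same as above, and stopping a previously started OTP server is left out.)

import hashlib
import json
import pathlib
import subprocess

DOWNLOADS = pathlib.Path("/downloads")
STATE_FILE = DOWNLOADS / "gtfs_hashes.json"   # hypothetical bookkeeping file

def feed_hashes():
    # Hash every GTFS zip in the build directory
    return {p.name: hashlib.sha256(p.read_bytes()).hexdigest()
            for p in sorted(DOWNLOADS.glob("*.zip"))}

def main():
    old = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    new = feed_hashes()
    if new == old:
        print("No GTFS changes detected; skipping the graph rebuild.")
        return
    # Same command as in the question; the rebuild is still a full one, it is
    # just triggered only when a feed actually changed.
    subprocess.Popen(["java", "-Xmx512m", "-jar", "/otp-1.4.0-shaded.jar",
                      "--build", str(DOWNLOADS), "--inMemory"])
    STATE_FILE.write_text(json.dumps(new))

if __name__ == "__main__":
    main()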

Related

How to run my Colab notebook periodically

I've written a script in Google Colab that performs some queries via an API against a server where I host some information, let's say weather data from several weather stations we own. That data is downloaded in JSON format, and I save those files in my Google Drive because I use them later in the same script to generate some condensed tables by country and by month; those summary files are also stored in my Google Drive account. After that, I send the summary files via email to some people.
I need to run this script every other day, and while that is no big deal to do manually, I'd like to make it happen without human intervention, mainly for scalability. One of the main points is that some of the libraries are not easy to install locally and it is much easier to make them work in Colab; that's why I don't use solutions like crontab or the Windows Task Scheduler, and why I need the Colab notebook itself to run periodically.
I've tried solutions like PythonAnywhere, but I've spent too much time trying to modify the script to work with PyDrive and, so far, haven't achieved it. I've also tried running it on GCP, but haven't yet found a way to access my Google Drive files from Google Cloud Functions. I've also been building a Docker image, which would work if I handed it to someone else, but the idea is to make the run schedulable, not only deployable. I've also read about the Colabctl option but haven't been able to make it run yet.
Please, I'm open to receiving any suggestions in order to achieve my goal.
Thank you,
Billy.
There is a paid scheduler: https://cloud.google.com/scheduler
Or you can use https://github.com/TensorTom/colabctl with your own cron.

Running a Python script on a database file located in Google Drive

I have a database file which is located in my own (private) Google Drive, and it's updated on a daily basis.
I wrote a Python script to track the data in the database file; however, I currently need to download the DB file to my PC and then run my script locally, every single day.
I am wondering if there are any better options, so I wouldn't have to download the DB file and move it around manually.
From a quick search on the web I found that there is no way to run the script inside the Google Drive folder (obviously due to security issues), and using Google Cloud Platform is not a real option for me since it's not a free service (and, as I understand it, there is no free trial).
Anyway, any method that would make my life easier would be happily accepted.
Sorry
That's not possible AFAIK, at least not in the way you have asked the question.
It may be that you are looking for a database hosting service, which, as a rule, is not free. I remember seeing an online SQL viewer around; I don't know if it is still available, and I don't think it was accessible from a local Python script.
Google Cloud Platform, like other services, does however offer a free tier, though what it covers depends on how much use you need to make of it and also on your location. This can get quite involved quite quickly, and there are various options to choose from. For example, BigQuery may be something you are interested in.
A Possible Workaround
I assume that the reason you don't want to download it is that doing so is tedious, and not because your network connection or hard drive can't handle it. If so:
The simple workaround may just be to have your Python script automatically download the database via the [Google Drive API](https://developers.google.com/drive), modify or read it, and then upload it again if you need to. This can all be done within one script using the API, so all you would need to do is run the Python script; you wouldn't need to manually download the file and move it into a folder, and you would also not need to worry about running over the free tier and getting billed for it.
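For illustration only, here is a minimal sketch of that download/modify/upload cycle using the google-api-python-client library. It assumes you have already completed the OAuth setup (e.g. with google-auth-oauthlib); FILE_ID and LOCAL_PATH are placeholders you would replace with your own values:

import io
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload, MediaFileUpload

FILE_ID = "your-database-file-id"   # placeholder: the Drive ID of your DB file
LOCAL_PATH = "database.sqlite"      # placeholder: where to keep the local copy

def get_service(creds):
    # creds: OAuth credentials obtained beforehand (setup not shown here)
    return build("drive", "v3", credentials=creds)

def download_db(service):
    request = service.files().get_media(fileId=FILE_ID)
    with io.FileIO(LOCAL_PATH, "wb") as fh:
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        while not done:
            _, done = downloader.next_chunk()

def upload_db(service):
    media = MediaFileUpload(LOCAL_PATH, resumable=True)
    service.files().update(fileId=FILE_ID, media_body=media).execute()

After download_db() your existing tracking script can open the local copy as usual, and upload_db() pushes any changes back to the same Drive file.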
You could even use the Google Drive Desktop software to keep the database file synced locally. Since it is local, I doubt the software would be able to block a Python script on the file system.
If you don't even want to think about it, you could set up the Python script with a CRON job or a Scheduled task if you are on Windows.

Multiple apps were running in the background, how to close them properly, Autodesk Forge

This is a Forge app question. The app is created in Visual Studio together with a Forge app, and I was debugging it. I am working on a Forge application where I use Data Management, Design Automation and Model Derivative. The Design Automation jobs were closing properly. After integrating Model Derivative, I passed an RFA file by mistake and the app kept running endlessly. The model is very small, but initially I made a mistake when passing it to Model Derivative: I passed an RFA file and only later realised that Model Derivative does not read RFA files. In the meantime I tried many times. After passing an RVT file to Model Derivative, it was very fast and ran perfectly.
I deleted those hung apps: when they ran endlessly and did not show the model in the viewer, I closed the app and reran it, or deleted the app, created a new one and reran. Then, when I looked at the data usage, it showed multiple apps consuming cloud credits. Is it possible that multiple apps were running simultaneously and that the jobs I had deleted did not close properly? The test model was really small.
I had stopped the app abruptly and debugged again. My guess is that multiple apps were running on top of one another. I could be wrong.
Please let me know a few known reasons for an app to keep running in the background, and how to properly end a job on the Forge cloud platform. I simply stopped debugging in Visual Studio, deleted the app in Forge and created a new one.
The usage graph shows overlapping colors and usage from several apps. My new model for Design Automation is big, so my worry is that I will lose all of my cloud credits if I do not close the jobs properly.
How do I stop a project running in the Forge cloud? I am fairly sure I did not stop the apps running in the cloud properly before creating a new Forge app and debugging again, so all of them may have been running. I could be wrong, because I am new to the Forge cloud.
I described above what may have gone wrong; please let me know what could go wrong. The model was really small.
Maybe ngrok was not closed properly, I have no idea.
I thought I had closed each job before starting a new one.
The Model Derivative API doesn't support the RFA format currently. In my experience, it will return the following message when you submit a translation job on an RFA file with POST Jobs, so I'm not sure why you're saying the job or the app keeps running endlessly. Could you share more details on that?
{
"diagnostic": "Failed to trigger translation for this file."
}
Besides, I'm confused by your statement below. What kind of app was deleted? The Forge app on your My Apps page? It's unclear to me.
I deleted those hung apps: when they ran endlessly and did not show the
model in the viewer, I closed the app and reran it, or deleted the app,
created a new one and reran.
To cancel or delete translation jobs, you can call DELETE manifest. According to our engineering team, each translation job has a designated expiration time (e.g. 2 hours). If your translation job cannot be completed within that period, it will be canceled by our service.
Note: To cancel a Design Automation WorkItem, you can call DELETE workitems/:id to stop the job.
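Purely as an illustration (the access token, URN and work item ID below are placeholders, and you should double-check the endpoints against the current Forge API reference), those two cancellation calls could be scripted with the Python requests library roughly like this:

import requests

ACCESS_TOKEN = "your-oauth-token"                # placeholder
URN = "base64-encoded-urn-of-the-source-file"    # placeholder
WORKITEM_ID = "your-workitem-id"                 # placeholder
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

# Cancel/delete a Model Derivative translation (DELETE manifest)
r = requests.delete(
    f"https://developer.api.autodesk.com/modelderivative/v2/designdata/{URN}/manifest",
    headers=HEADERS,
)
print(r.status_code, r.text)

# Cancel a Design Automation v3 work item (DELETE workitems/:id)
r = requests.delete(
    f"https://developer.api.autodesk.com/da/us-east/v3/workitems/{WORKITEM_ID}",
    headers=HEADERS,
)
print(r.status_code, r.text)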
Lastly, back to the cloud-credits question: the Model Derivative API charges per translation job when you call POST Jobs, not by processing hour. Currently, only the Design Automation API charges by the processing time of your work items; see the pricing table here: https://forge.autodesk.com/pricing.
If you have further questions, you can drop me a line to forge[DOT]help[AT]autodesk[DOT]com.
Cheers,

Getting historical data in FIWARE using Cosmos

I'm trying to get all the historical information about a FIWARE sensor.
I've seen that Orion uses Cygnus to store historical data in Cosmos. Is that information accessible, or is it only possible to get it through IDAS?
Where could I get more info about this?
The ways you can consume the data are, in increasing order of learning curve:
1. Work with the raw data, either "locally" (i.e. logging into the Head Node of the cluster and using the Hadoop commands) or "remotely" by using the WebHDFS/HttpFS REST API (see the sketch after this list). Please note that with this approach you have to implement whatever analysis logic you need yourself, since Cosmos only lets you manage, as said, raw data.
2. Work with Hive in order to query the data in a SQL-like fashion. Again, you can do it locally by invoking the Hive CLI, or remotely by implementing your own Hive client in Java (other languages are available too) using the Hive libraries.
3. Work with MapReduce (MR) for heavier analysis. For this, you'll have to create your own MR-based application (typically in Java) and run it locally. Once you are done with the local runs of the MR app, you can move to Oozie, which allows you to run such MR apps remotely.
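For the raw-data option in step 1, a hedged sketch of what remote access through WebHDFS/HttpFS can look like with the Python requests library is below; the host, port, user and path are placeholders for your own Cosmos account, while op=LISTSTATUS and op=OPEN are the standard WebHDFS operations:

import requests

HOST = "your-cosmos-head-node"   # placeholder
PORT = 14000                     # HttpFS default port (WebHDFS itself uses 50070)
USER = "your-cosmos-user"        # placeholder
BASE = f"http://{HOST}:{PORT}/webhdfs/v1"

def list_dir(path):
    r = requests.get(f"{BASE}{path}", params={"op": "LISTSTATUS", "user.name": USER})
    r.raise_for_status()
    return r.json()["FileStatuses"]["FileStatus"]

def read_file(path):
    r = requests.get(f"{BASE}{path}", params={"op": "OPEN", "user.name": USER})
    r.raise_for_status()
    return r.text

# Example: list the files Cygnus has written for one of your sensors
for entry in list_dir(f"/user/{USER}/mysensordata"):
    print(entry["pathSuffix"], entry["length"])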
My advice is to start with Hive (step 1 is easy but does not provide any analysis capabilities): first locally, trying to execute some Hive queries, then remotely, implementing your own client. If this kind of analysis is not enough for you, then move to MapReduce and Oozie.
All the documentation regarding Cosmos can be found in the FI-WARE Catalogue of enablers. Within this documentation, I would highlight:
Quick Start for Programmers.
User and Programmer Guide (functionality described in sections 2.1 and 2.2 is not currently available in FI-LAB).

Software Engineering: Combining many modular programs in Unix [closed]

Upfront my question is: Are there any standard/common methods for implementing a software package that maintains and updates a MySQL database?
I'm an undergraduate research assistant and I've been tasked with creating a cron job that updates one of our university's in house bioinformatics databases.
Instead of building one monolithic binary that does every aspect of the work, I've divided the problem into subtasks and written a few Python/C++ modules to handle the different tasks, as listed in the pipeline below:
1. Query the remote database for a list of updated files for the given time interval (monthly / weekly / daily updates).
   Module implemented in Python; the URLs of the updated files are written to stdout.
2. Read in the relative URLs of the updated files and download them to a local directory.
   Module implemented in Python.
3. Unzip each archive of new files.
   Implemented as a bash script.
4. Parse the files into CSV format.
   Module implemented in C++.
5. Run a MySQL query to insert the CSV files into the database.
   Obviously just a bash script.
I'm not sure how to go about combining these modules into one package that can easily be moved to another machine, say if our current servers run out of space and the DB needs to be copied to another file system (it's already happened once before).
My first thought is to create a bash script that pipes all of these modules together, given that they all operate via stdin/stdout anyway, but this seems like an odd way of doing things.
Alternatively, I could write my C++ code as a Python extension, package all of these scripts together and just write one Python file that does the work for me.
Should I be using a package manager so that my code is easily installed on different machines? Does a simple zip archive of the entire updater with an included makefile suffice?
I'm extremely new to database management, and I don't have a ton of experience with distributing software, but I want to do a good job with this project. Thanks for the help in advance.
Inter-process communication (IPC) is a standard mechanism for composing many disparate programs into a complex application. IPC includes piping the output of one program to the input of another, using sockets (e.g. issuing HTTP requests from one application to another or sending data via TCP streams), using named FIFOs, and other mechanisms. In any event, using a Bash script to combine these disparate elements (or, similarly, writing a Python script that accomplishes the same thing with the subprocess module) would be completely reasonable. The only thing I would point out with this approach is that, since you are reading from and writing to a database, you really do need to consider security and authentication (e.g. can anyone who can invoke this application write to the database? How do you verify that the caller has the appropriate access?).
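As an example of the subprocess-based variant mentioned above, here is a rough sketch of a single Python driver for your pipeline; the script and binary names (query_updates.py, parse_to_csv, and so on) are placeholders for your actual modules:

import subprocess

def run_pipeline(interval="daily"):
    # Stage 1 (list updated files) pipes its stdout into stage 2 (download)
    query = subprocess.Popen(
        ["python", "query_updates.py", "--interval", interval],
        stdout=subprocess.PIPE,
    )
    download = subprocess.Popen(
        ["python", "download_files.py", "--dest", "incoming/"],
        stdin=query.stdout,
    )
    query.stdout.close()   # so stage 1 gets SIGPIPE if stage 2 exits early
    download.communicate()
    query.wait()

    # Stages 3-5 run sequentially on the downloaded archives
    subprocess.run(["bash", "unzip_archives.sh", "incoming/"], check=True)
    subprocess.run(["./parse_to_csv", "incoming/", "csv/"], check=True)
    subprocess.run(["bash", "load_into_mysql.sh", "csv/"], check=True)

if __name__ == "__main__":
    run_pipeline()

Whether you drive it from Bash or from Python like this is mostly a matter of taste; the Python version just makes error handling (check=True, return codes) a bit easier to grow over time.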
Regarding distribution, I would say that the most important thing is to ensure that you can find, at any given version and at prior versions, a snapshot of all components and their dependencies as they were at the time of release. You should set up a code repository (e.g. on GitHub or some other service that you trust) and create a release branch at the time of each release that contains a snapshot of all the tools at that point. That way if, God forbid, the one and only machine on which you have installed the tools fails, you will still be able to instantly grab a copy of the code and install it on a new machine (and if something breaks, you will be able to go back to an earlier release and binary-search until you find out where the breakage happened).
In terms of installation, it really depends on how many steps are involved. If it is as simple as unzipping a folder and ensuring that the folder is in your PATH environment variable, then it is probably not worth the hassle to create any special distribution mechanism (unless you are able to do so easily). What I would recommend, though, is clearly documenting the installation steps in the INSTALL or README documentation in the repository (so that the instructions are snapshotted) as well as on the website for your repository. If the number of steps is small and easy to accomplish, then I wouldn't bother with much more. If there are many steps involved (like downloading and installing a large number of dependencies), then I would recommend writing a script that can automate the installation process. That being said, it's really about what the University wants in this case.