Where to find GTFS realtime file

I have been doing extensive research on GTFS and GTFS-Realtime. All I want to do is find out how late a certain bus is running, but I can't figure out where to connect in order to look up a specific bus number. So my questions are:
Where/how can I find the GTFS-Realtime feed?
How can I properly open the file and make it location specific?
I've been trying to use http://www.yrt.ca/en/aboutus/GTFS.asp to download the file, but I can't figure out how to open the CSV file properly.

According to What is GTFS-realtime?, the GTFS-realtime data is not in CSV format. Instead, it is based on Protocol Buffers:
Data format
The GTFS-realtime data exchange format is based on Protocol Buffers.
Protocol buffers are a language- and platform-neutral mechanism for serializing structured data (think XML, but smaller, faster, and simpler). The data structure is defined in a gtfs-realtime.proto file, which then is used to generate source code to easily read and write your structured data from and to a variety of data streams, using a variety of languages – e.g. Java, C++ or Python.
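To give a concrete idea of what reading such a feed looks like, here is a minimal Python sketch using the official gtfs-realtime-bindings package (pip install gtfs-realtime-bindings requests); the feed URL below is a placeholder, since the real TripUpdates URL has to come from the agency (YRT) itself:

import requests
from google.transit import gtfs_realtime_pb2

FEED_URL = "https://example.com/yrt/gtfs-realtime/TripUpdates"  # placeholder URL

feed = gtfs_realtime_pb2.FeedMessage()
feed.ParseFromString(requests.get(FEED_URL).content)

for entity in feed.entity:
    if not entity.HasField("trip_update"):
        continue
    trip = entity.trip_update.trip
    for stu in entity.trip_update.stop_time_update:
        if stu.HasField("arrival"):
            # delay is in seconds relative to the static GTFS schedule
            print(trip.route_id, trip.trip_id, stu.stop_id, stu.arrival.delay)

The route_id printed here can be matched against routes.txt in the static GTFS download to narrow the output down to the one bus you care about.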

Related

Backup Core Data, one entity only

My application requires some kind of data backup and some kind of data exchange between users, so what I want to achieve is the ability to export an entity but not the entire database.
I have found some help, but only for the full database, like this post:
Backup core data locally, and restore from backup - Swift
This applies to the entire database.
I tried exporting a JSON file; this might work, except that the entity I'm trying to export contains images as binary data.
So I'm stuck.
Any help with exporting just one entity rather than the full database, or with writing JSON that includes binary data, would be appreciated.
Take a look at protobuf. Apple has an official Swift library for it:
https://github.com/apple/swift-protobuf
Protobuf is an alternate encoding to JSON that has direct support for serializing binary data. There are client libraries for any language you might need to read the data in, or command-line tools if you want to examine the files manually.
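For illustration (in Python for brevity; swift-protobuf generates analogous Swift types), here is a sketch with a made-up backup.proto whose message and field names are purely hypothetical:

# Hypothetical backup.proto:
#   syntax = "proto3";
#   message Item {
#     string title = 1;
#     bytes  image = 2;   // raw binary data, no base64 detour
#   }
#   message Backup {
#     repeated Item items = 1;
#   }
# Bindings generated with: protoc --python_out=. backup.proto
import backup_pb2

backup = backup_pb2.Backup()
item = backup.items.add()
item.title = "First record of the exported entity"
item.image = open("photo.jpg", "rb").read()   # binary payload stored as-is

with open("export.pb", "wb") as f:            # compact binary export file
    f.write(backup.SerializeToString())

restored = backup_pb2.Backup()                # reading it back
restored.ParseFromString(open("export.pb", "rb").read())

The same round trip in Swift uses the generated Backup/Item structs with serializedData() and init(serializedData:).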

Big (>1 GB) JSON data handling in Tableau

I am working with a large Twitter dataset in the form of a JSON file. When I try to import it into Tableau, there is an error and the upload fails because of the 128 MB data upload limit.
This means I need to shrink the dataset to under 128 MB, which reduces the effectiveness of the analysis.
What is the best way to upload and handle large JSON data in Tableau?
Do I need to use an external tool for it?
Can we use AWS products to handle the same? Please advise!
From what I can find in unofficial documentation online, Tableau does indeed have a 128 MB limit on JSON file size. You have several options:
Split the JSON file into multiple files and union them in your data source (https://onlinehelp.tableau.com/current/pro/desktop/en-us/examples_json.html#Union); a splitting sketch is shown below.
Use a tool to convert the JSON to CSV or Excel (Google for "JSON to CSV converter")
Load the JSON into a database, such as MySQL, and use that as the data source
You may want to consider posting in the Ideas section of the Tableau Community pages and add a suggestion for allowing larger JSON files. This will bring it to the attention of the broader Tableau community and product management.
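If the Twitter data is newline-delimited JSON (one tweet object per line), the first option can be scripted; a rough Python sketch, with the file name and size cap chosen purely for illustration:

MAX_BYTES = 120 * 1024 * 1024  # stay safely under Tableau's 128 MB limit

def split_ndjson(path, out_prefix="tweets_part"):
    part, size, out = 0, 0, None
    with open(path, "r", encoding="utf-8") as src:
        for line in src:
            encoded = len(line.encode("utf-8"))
            if out is None or size + encoded > MAX_BYTES:
                if out:
                    out.close()
                part += 1
                out = open(f"{out_prefix}_{part:03d}.json", "w", encoding="utf-8")
                size = 0
            out.write(line)
            size += encoded
    if out:
        out.close()

split_ndjson("tweets.json")

The resulting tweets_part_001.json, tweets_part_002.json, ... can then be unioned in Tableau as described in the linked help page.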

Merits of JSON vs CSV file format while writing to HDFS for downstream applications

We are in the process of extracting source data (xls) and ingesting it into HDFS. Is it better to write these files in CSV or JSON format? We are contemplating choosing one of them, but before making the call we would like to know the merits and demerits of each.
The factors we are trying to weigh are:
Performance (Data Volume is 2-5 GB)
Loading vs Reading Data
How easy it is to extract metadata (structure) information from either of these formats (a small example contrasting the two is sketched below).
Ingested data will be consumed by other applications which support both JSON and CSV.
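For reference, here is a small, neutral illustration of how the same two records look in each format; the field names are invented for the example:

import csv, json

rows = [
    {"id": 1, "product": "widget", "amount": 19.99},
    {"id": 2, "product": "gadget", "amount": 5.00},
]

with open("sample.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()        # the schema lives only in this single header row
    writer.writerows(rows)

with open("sample.json", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")   # every record carries its field names

CSV keeps the structure in a single header line (smaller, but flat and type-less), while JSON repeats the keys in every record and can nest, which is essentially the metadata trade-off listed above.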

How can I publish CSV data as Linked Data on the Web?

My work is mainly focused on converting CSV data to the RDF data format. After getting the RDF data, I need to publish it as Linked Data on the web. I want to convert the CSV data to RDF myself using Java, and then publish the resulting RDF as Linked Data using whatever tools are available. Can anyone help me find a way to do this, or give me any suggestions or references? Which tools should I use for this work? Thanks.
You can publish your RDF in a variety of ways. Here is a common reference where they explain the steps, software tools and examples: http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf
In a nutshell, once you have your RDF data, you should think about the following:
1) Which tool/set of tools do I want to use to store my RDF data? For instance, I commonly use Virtuoso because I can use it for free and it facilitates the creation of the endpoint. But you can use Jena TDB, Allegro Graph, or many other triple stores.
2) Which tool do I use to make my data dereferenceable? For example, I use Pubby because I can configure it easily. But you can use Jena TDB (for the previous step) + Fuseki + Snorql for the same purpose. See the reference above for more information on the links and features of each tool.
3) Which datasets should I link to? (i.e., which data from other datasets do I reference, in order to make my dataset part of the Linked Data cloud?)
4) How should I link to these datasets? For example, the SILK framework can be used to analyze which of the URIs of your dataset are owl:sameAs other URIs in the target dataset of your choice.
Many people just publish their RDF in their endpoints, without linking it to other datasets. Although this follows the Linked Data principles (http://www.w3.org/DesignIssues/LinkedData.html), it is always better to link to other existing URIs when possible.
This is a short summary, assuming you already have the RDF data created. I hope it helps.
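The conversion step itself can also be scripted before you worry about publishing. A rough sketch in Python with rdflib (the question mentions Java, where Apache Jena plays the equivalent role); the file name, column names, and namespace are assumptions:

import csv
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, RDF

EX = Namespace("http://example.org/resource/")   # placeholder base URI

g = Graph()
g.bind("ex", EX)
g.bind("foaf", FOAF)

with open("people.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):                # expects columns: id, name
        subject = EX[row["id"]]
        g.add((subject, RDF.type, FOAF.Person))
        g.add((subject, FOAF.name, Literal(row["name"])))

g.serialize(destination="people.ttl", format="turtle")

The resulting Turtle file can then be loaded into Virtuoso, Jena TDB, or any other triple store mentioned above.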
You can use Tarql (https://tarql.github.io/) or if you want to do more advanced mapping you can use SparqlMap (http://aksw.org/Projects/SparqlMap).
In both cases you will end up having a SPARQL endpoint which you can make available on-line and people can query your data.
Making each data item available under its own URL is a very good idea, following the Linked Data principles mentioned by @daniel-garijo in the other answer: http://www.w3.org/DesignIssues/LinkedData.html.
So you can also publish the data items with all their properties in individual files.
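Once the endpoint is up, consumers can query it programmatically; a minimal sketch with Python's SPARQLWrapper, where the endpoint URL and the query are placeholders:

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://example.org/sparql")   # placeholder endpoint
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?s ?label WHERE { ?s rdfs:label ?label . } LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["s"]["value"], binding["label"]["value"])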

Conversion of GRIB and NetCDF to my database

I have downloaded "High Resolution Initial Conditions" climate forecast data for one day. It came as a .tar.gz archive, so I extracted it into my local directory and got the files shown in the attached image. I think the files without an extension are GRIB data (because the first word in them is "GRIB"). I want to get the data from the big files (GRIB and NetCDF formats containing climate data such as temperature and pressure on a grid) into my database, but they are binary. Can you recommend an easy way to get the data out of these files? I can't find any information about handling their datasets on their website.
Converting these files to .csv would be nice, but I can't find a program to convert the GRIB files.
Using Python and some available modules, it is simple...
The Enthought Python Distribution includes several packages, including netCDF4, to deal with NetCDF files!
I've never worked with GRIB files, but Google says that another Python package exists for them, pygrib2.
Or you can use PyNio, a Python package that allows you to read and write netCDF3 and netCDF4 classic format, and to read GRIB1 and GRIB2 files.
I don't know the amount of data you have, but it is usually crazy to convert it to *.csv! Python is easy to learn and well suited to working with this kind of data (with the matplotlib package you can even plot it). Or, if you really need it in *.csv, you can use Python to select a smaller domain, for example, or just the variables you need...
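As a concrete starting point, here is a minimal Python sketch for both formats; the file names, NetCDF variable name, and GRIB message name are assumptions you would replace after inspecting your own files:

import csv
from netCDF4 import Dataset   # pip install netCDF4
import pygrib                 # pip install pygrib

# NetCDF: open the file, list its variables, pull one of them
ds = Dataset("forecast.nc")
print(ds.variables.keys())                 # discover what the file contains
t2m = ds.variables["t2m"][:]               # "t2m" is an assumed variable name

# GRIB: select a message by name and get values with their coordinates
grbs = pygrib.open("forecast.grib")
grb = grbs.select(name="2 metre temperature")[0]   # assumed message name
values, lats, lons = grb.data()

# Dump the GRIB grid to CSV, one row per grid point
with open("temperature.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["lat", "lon", "value"])
    for la, lo, v in zip(lats.flatten(), lons.flatten(), values.flatten()):
        writer.writerow([la, lo, v])

From there, loading the CSV (or the arrays directly) into your database is straightforward.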
For conversion into text, look into http://www.cpc.ncep.noaa.gov/products/wesley/wgrib.html or http://www.cpc.ncep.noaa.gov/products/wesley/wgrib2/
Both are C programs from one of the big names in GRIB.
I'm currently dealing with a similar issue.
In my case I'm trying to rely on the GrADS software, which can "easily" transform GRIB data into other formats.
If your dataset is not huge, then you can export it to csv using this tutorial.
My dataset is 80 GB of GRIB binary files, so I'm very restricted in what software I can use to handle it (no R unless I find a computer with more than 80 GB of RAM).