What "other features" could be incorporated into a train database? - mysql

This is a mini project for DBMS course. My task is to develop a Database for management of passenger trains.
I'm designing tables for Customers, Trains, Ticket Booking (via Telephone & Internet), Origins and Destinations.
The instructor said we are free to incorporate other features in our database model. Some of the features we can include are listed below:
1. Ad-hoc Querying
2. Data Mining
3. Demographic Passenger Mapping
4. Origin and Destination Mapping
I have no clue what these features mean. I know about data mining but am unable to apply it in this context. Can anyone kindly expand on these features or suggest new ideas?
EDIT: What is Ad-hoc Querying? Give an example in this context.

Data mining means extracting useful facts and figures from the data your system collects and stores in the database. For example, data mining might reveal that trains between city x and city y are always 5 minutes late, or are never at more than 50% capacity. So you may wish to develop tools or scripts that run automatically and generate statistics (graphs are best) that display this information and highlight unusual trends. In the given example, the schedulers could then analyse why the trains are always late (e.g., maybe the train speedometers are wrong?).
Points 3 and 4 are both subsets of data mining, in my opinion. There is a huge number of metrics you could try to measure; it is really just whatever you can think of. If you specify what type of data you are going to collect, it will be easier to make suggestions.
Basically, data mining just means "sort the data to find interesting facts".
Based on the comment below, you could look for:
% of internet vs. phone sales
popular destinations & origins
customers age/sex/location
usage vs. time of day
...
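To make a couple of the metrics above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The bookings table, its columns, and the sample rows are all hypothetical, invented purely for illustration; a real MySQL schema for the project would differ. Each query is also the kind of one-off question that the term "ad-hoc querying" usually refers to.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings ("
    " id INTEGER PRIMARY KEY,"
    " channel TEXT,"
    " origin TEXT,"
    " destination TEXT)"
)
conn.executemany(
    "INSERT INTO bookings (channel, origin, destination) VALUES (?, ?, ?)",
    [
        ("internet", "A", "B"),
        ("internet", "A", "C"),
        ("phone", "B", "C"),
        ("internet", "A", "B"),
    ],
)


def channel_share(conn):
    """Return {channel: percentage of all bookings made through it}."""
    total = conn.execute("SELECT COUNT(*) FROM bookings").fetchone()[0]
    rows = conn.execute(
        "SELECT channel, COUNT(*) FROM bookings GROUP BY channel"
    ).fetchall()
    return {channel: 100.0 * count / total for channel, count in rows}


def popular_routes(conn, limit=3):
    """Return (origin, destination, bookings) rows, busiest route first."""
    return conn.execute(
        "SELECT origin, destination, COUNT(*) AS n FROM bookings"
        " GROUP BY origin, destination"
        " ORDER BY n DESC, origin, destination LIMIT ?",
        (limit,),
    ).fetchall()
```

Calling channel_share(conn) gives the internet vs. phone split, and popular_routes(conn) ranks origin/destination pairs; the same GROUP BY pattern extends to age, sex, location, or time of day once those columns exist.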

Related

If I wanted to predict future purchases in online shopping using historical data, do I need data science, data analysis, or big data?

I want to learn to predict future events: for example, the number of plane crashes in 2018 using the past two decades of plane crash data, or how many T-shirts with Justin Bieber's face on them will be sold by 2018 based on fan-base data from previous years, or how many iPhone 8s and Samsung S9s will be sold if they launch on the exact same date. In other words, predicting a somewhat accurate wholesale market. Please suggest a book. I really love the Head First series; is Head First Data Analysis right for me? I don't know whether I can ask non-programming questions here, but here I am. By the way, does big data have anything to do with this?
It all falls into the category of data science (which covers both big data and data analysis). What you need for predictions like these is some machine learning approach applied to the data you have, or can access, about the thing you want to predict.
I'd recommend this recent series of articles: https://medium.com/machine-learning-for-humans/why-machine-learning-matters-6164faf1df12
Apart from a really nice intro, you'll find lots of resources for further learning there.
I also highly recommend deeplearning.ai and the Stanford machine learning course you can find on Coursera.
Cheers!
I think most of the scenarios you describe are cases of supervised learning, a type of machine learning in which you train a model on previous data that contains both input and output values. Once the model is trained, you feed it new input values and it outputs a prediction.
I would highly recommend the following machine learning course by Andrew Ng on Coursera, which covers all the basics of ML, including supervised and unsupervised learning.
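As a toy illustration of that train-then-predict loop, here is a minimal sketch of one of the simplest supervised models, a least-squares straight line, written in plain Python. The function names and the sales figures are made up for the example; real forecasting problems would need far more data and a more careful choice of model.

```python
def fit_line(xs, ys):
    """Train: find the slope and intercept minimising squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict: apply the trained line to a new input value."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: year index (input) and units sold (output).
years = [1, 2, 3, 4]
sales = [10.0, 20.0, 30.0, 40.0]

model = fit_line(years, sales)
forecast = predict(model, 5)  # prediction for an input the model has not seen
```

The historical pairs play the role of the "previous data", and the call to predict is the step where a new input goes in and a prediction comes out.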
https://www.coursera.org/learn/machine-learning
As for books, the following link from Analytics Vidhya is a great place to start; the books listed there can give you a good grounding in statistics and data science.
https://www.analyticsvidhya.com/blog/2015/10/read-books-for-beginners-machine-learning-artificial-intelligence/
As for the differences between data science, data analytics and big data: data science and data analytics are similar in that both try to find patterns in data and derive insights from those patterns.
Big data, on the other hand, is basically data of huge size that is distributed across multiple machines, so that you can store and process large amounts of data in parallel.
So you may ask how big data and machine learning are related. The answer lies in the training of the machine learning model: the accuracy of its predictions depends, to a certain extent, on the amount of data you train it on. More training data generally means better predictions, and in terms of quantity big data is way ahead of other sources, hence the relation.

Downloading Quotes in CSV format from Yahoo Finance - Beta symbol?

By using http://finance.yahoo.com/d/quotes.csv?s=STOCKNAME&f= I am able to download a CSV file. Does anyone know what the symbol for beta is? It should go after &f=; e.g. the symbol for the stock name is n, and it goes in as such: http://finance.yahoo.com/d/quotes.csv?s=STOCKNAME&f=n
Thanks in advance for your help!
Unfortunately, you can't.
There is no beta 'symbol' that allows you to download beta using Yahoo's CSV API.
With that being said, it may be important to note:
Though plenty of financial sites provide them, what risks are you
taking by using one of the betas provided by an outside source? Betas
provided for you by online services have unknown variable inputs,
which in all likelihood are not adaptive to your unique portfolio.
Crucially:
Provided betas are calculated with time frames unknown to their
consumers.
Another problem may be the index used to calculate beta.
Another unknown factor of pre-made betas is the method used to
calculate them.
Yahoo may therefore choose not to provide beta because it is liable to misinterpretation for the reasons above (though this is purely speculative).
So then what?
It's actually pretty straightforward to calculate yourself. All you need to do is:
Decide your time horizon for measurement
Decide an appropriate market to measure against
Ensure your chosen investment and market share matching data points across the chosen period (for ease of calculation)
Decide an appropriate risk free rate of return
Decide your model of calculation (e.g. regression or the capital asset pricing model, 'CAPM')
The methodology for then performing the calculation depends on what you're trying to accomplish and in which (programming) environment.
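As one possible sketch of the regression approach, beta can be estimated as the covariance of the investment's returns with the market's returns, divided by the variance of the market's returns (which is the slope of a simple regression of one on the other). The function names and price series below are hypothetical, and the sketch assumes both series cover the same dates at the same frequency.

```python
def simple_returns(prices):
    """Convert a price series into period-over-period simple returns."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def beta(asset_prices, market_prices):
    """Estimate beta as Cov(asset, market) / Var(market) of the returns.

    Assumes both series share matching data points over the chosen period.
    """
    if len(asset_prices) != len(market_prices):
        raise ValueError("price series must have matching data points")
    asset = simple_returns(asset_prices)
    market = simple_returns(market_prices)
    if len(market) < 2:
        raise ValueError("need at least three prices per series")
    mean_a = sum(asset) / len(asset)
    mean_m = sum(market) / len(market)
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(asset, market))
    var = sum((m - mean_m) ** 2 for m in market)
    if var == 0:
        raise ValueError("market returns have zero variance")
    return cov / var


# Made-up prices: the asset moves twice as far as the market each period.
market_prices = [100.0, 110.0, 99.0, 108.9]
asset_prices = [100.0, 120.0, 96.0, 115.2]
example_beta = beta(asset_prices, market_prices)
```

A constant risk-free rate shifts both return series by the same amount and so cancels out of this slope; if the rate varies over the period, it would need to be subtracted from both return series before the covariance and variance are computed.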

Database Modeling: Shipping Rules

I'm working on an e-commerce type web application and need to somehow handle calculation of shipping costs. Some rules I've found:
Free shipping
Free shipping with minimum purchase
Free shipping within a certain geographic area
Flat rate shipping
Flat rate + set amount per product
Various rates depending on speed of shipping (shipped immediately and/or how soon it gets to customer)
Based on height, width, depth, weight + shipping distance
Based on rates of various shippers
... and so on.
Any suggestions how to tackle such a problem?
I suggest you take a look at some of the available open source e-commerce solutions. There are a LOT of them, and each one takes a stab at doing exactly what you are trying to do. If it is schema design you are after, I wouldn't limit your search to just MySQL; as long as the product uses a relational database, it should be easy to dig into the design. I'd take a look at nopCommerce, to name just one...
Create a framework where your eCommerce system accepts modules that define shipping rules (and interfaces and calculations, etc). Design it such that you expect these modules to be able to provide all those functions. Let the end users decide which modules to use based on their own needs, as which shipping rules to use is a business decision and not a technology one.
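A minimal sketch of what such pluggable modules might look like, assuming a single cost method as the shared interface. Every class and field name here is hypothetical; a real system would also need destination, weight, carrier lookups, and so on.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Protocol


@dataclass
class Order:
    """Hypothetical order summary passed to every shipping rule."""
    subtotal: Decimal
    item_count: int


class ShippingRule(Protocol):
    """Interface each shipping module is expected to provide."""

    def cost(self, order: Order) -> Decimal: ...


@dataclass
class FlatRatePlusPerItem:
    """Flat rate plus a set amount per product."""
    base: Decimal
    per_item: Decimal

    def cost(self, order: Order) -> Decimal:
        return self.base + self.per_item * order.item_count


@dataclass
class FreeOverMinimum:
    """Free shipping with a minimum purchase, else defer to another rule."""
    minimum: Decimal
    fallback: ShippingRule

    def cost(self, order: Order) -> Decimal:
        if order.subtotal >= self.minimum:
            return Decimal("0")
        return self.fallback.cost(order)


# The shop owner picks and combines modules; the core only calls cost().
rule = FreeOverMinimum(
    minimum=Decimal("50"),
    fallback=FlatRatePlusPerItem(base=Decimal("5"), per_item=Decimal("1")),
)
small_order_cost = rule.cost(Order(subtotal=Decimal("20"), item_count=3))
large_order_cost = rule.cost(Order(subtotal=Decimal("60"), item_count=3))
```

Because the core system depends only on the interface, choosing which rules to install and how to chain them stays a configuration decision for the business rather than a code change.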

How to use a DHT for a social trading environment

I'm trying to understand if a DHT can be used to solve a problem I'm working on:
I have a trading environment where professional option traders can get an increase in their risk limit by requesting that fellow traders lend them some of their risk limit. The lending trader can either search for traders with certain risk parameters which are part of every trader's profile, i.e. Greeks, or the lending trader can subscribe to requests from certain traders who are looking for risk.
I want this environment to be scalable and decentralized, but I don't know how traders can search for specific profile parameters when the data is contained in a DHT. Could anybody explain how this can be done?
Update:
An example that might make it easier to understand is SO, but instead of running as a web application, the Risk Exchange runs as a desktop application on each trader's workstation. The requests for risk are like questions (which may be tagged by contract, exchange, etc.) and each user has a profile which shows their history of requests, their return on borrowed risk, etc.
Obviously the "exchange" can be run on a server, but I was hoping to decentralize it and make it scalable so that the system may support an arbitrary number of traders. How can I search for keywords, tags, and other data pertaining to a trader's profile if this information is stored in a distributed hash table?
To my ear, your question contains a contradiction. A DHT is a great way of distributing data in a decentralized manner, but it cannot give the nodes an overview of the information it holds. This means that any overview action, such as querying the network for certain data, will have to be done at a centralized collection point. Solutions to this contradiction have been created, but their fault tolerance does not meet the needs of a critical system such as financial trading.
So my answer would be to use a centralized server to hold an overview cache of the DHT network.
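A toy sketch of that arrangement, assuming trader profiles carry a list of tags. ToyDHT is only a stand-in for a real DHT (a few in-process dictionaries picked by key hash), and CentralIndex, publish, and search are hypothetical names for the overview cache held on the server.

```python
import hashlib


class ToyDHT:
    """Stand-in for a real DHT: each 'node' is a dict chosen by key hash."""

    def __init__(self, node_count=4):
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:4], "big") % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)


class CentralIndex:
    """Overview cache: maps each tag to the DHT keys that carry it."""

    def __init__(self, dht):
        self.dht = dht
        self.by_tag = {}

    def publish(self, trader_id, profile):
        # The full profile lives in the DHT; only tag -> key is kept here.
        self.dht.put(trader_id, profile)
        for tag in profile["tags"]:
            self.by_tag.setdefault(tag, set()).add(trader_id)

    def search(self, tag):
        # Resolve the tag centrally, then fetch each profile by exact key.
        return {tid: self.dht.get(tid) for tid in sorted(self.by_tag.get(tag, ()))}


index = CentralIndex(ToyDHT())
index.publish("alice", {"tags": ["ES", "delta"]})
index.publish("bob", {"tags": ["ES"]})
```

The DHT is only ever asked for exact keys, which is what it does well; the keyword search happens entirely in the central index.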

How do I explain APIs to a non-technical audience?

A little background: I have the opportunity to present the idea of a public API to the management of a large car sharing company in my country. Currently, the only options for booking a car are a very slow web interface and a hard-to-reach call center. So I'm excited about the possibility of writing my own search interface, integrating this functionality into other products and applications, etc.
The problem: Due to the special nature of this company, I'll first have to get my proposal through a commission, which is made up entirely of non-technical and rather conservative people. How do I explain the concept of an API to such an audience?
Don't explain technical details like an API. State the business problem and your solution to the business problem - and how it would impact their bottom line.
For years, salespeople have based pitches on two things: features and benefits. Each feature should have an associated benefit (to somebody, and preferably everybody). In this case, you're apparently planning to break what's basically a monolithic application into (at least) two pieces: a front end and a back end. The obvious benefits are that 1) each works independently, so development of each is easier; 2) different people can develop the different pieces; and 3) it's easier to increase capacity by simply buying more hardware.
Though you haven't said it explicitly, I'd guess one intent is to publicly document the API. This allows outside developers to take over (at least some) development of the front-end code (often for free, no less) while you retain control over the parts that are crucial to your business process. You can more easily [allow others to] add new front-end code to address new market segments while retaining security/certainty that the underlying business process won't be disturbed in the process.
HardCode's answer is correct in that you really should concentrate on the business issues and benefits.
However, if you really feel you need to explain something, you could use the medical receptionist analogy.
A medical practice has its own patient database and appointment scheduling system, used by its admin and medical staff. This might be pretty complex internally.
However when you want to book an appointment as a patient you talk to the receptionist with a simple set of commands - 'I want an appointment', 'I want to see doctor X', 'I feel sick' and they interface to their systems based on your medical history, the symptoms presented and resource availability to give you an appointment - '4:30pm tomorrow' - in simple language.
So, roughly speaking, using the receptionist is analogous to an external program using an API. It allows you to interact with a complex system to get the information you need without having to deal with the internal complexities.
They'll be able to understand the benefit of having a mobile phone app that can interact with the booking system, and an API is a necessary component of that. The second benefit of the API being public is that you won't necessarily have to write that app, someone else will be able to (whether or not they actually do is another question, of course).
You should explain which use cases will be improved by your project proposal, and what benefits they can expect, such as customer satisfaction.