Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I want to practice my skills away from a keyboard (i.e. with pen and paper), and I'm after simple practice questions like Fizz Buzz or printing the first N primes.
What are your favourite simple programming questions?
I've been working on http://projecteuler.net/
Problem:
Insert + or - sign anywhere between the digits 123456789 in such a way that the expression evaluates to 100. The condition is that the order of the digits must not be changed.
e.g.: 1 + 2 + 3 - 4 + 5 + 6 + 78 + 9 = 100
Programming Problem:
Write a program in your favorite language which outputs all possible solutions of the above problem.
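For reference, a brute-force sketch in Python (assuming the three choices between each pair of adjacent digits are '+', '-', or nothing, with no sign before the leading 1):

```python
from itertools import product

def solutions():
    """Brute-force all ways to insert '+', '-', or nothing between the
    digits 1..9 so that the resulting expression evaluates to 100."""
    digits = "123456789"
    results = []
    # One separator slot between each adjacent pair of digits (8 slots).
    for seps in product(["", "+", "-"], repeat=8):
        expr = digits[0] + "".join(s + d for s, d in zip(seps, digits[1:]))
        if eval(expr) == 100:
            results.append(expr)
    return results

for expr in solutions():
    print(expr, "= 100")
```

With these rules there are 11 solutions, including the 1 + 2 + 3 - 4 + 5 + 6 + 78 + 9 example above.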
If you want pen-and-paper exercises, I'd recommend designing rather than coding.
Frankly, coding on paper is tedious and teaches you almost nothing. The work environment matters: typing on a computer, compiling, seeing what errors you've made, refactoring here and there. None of that compares to what you can do on a piece of paper, so paper coding, while an interesting mental exercise, is not practical and will not improve your coding skills much.
On the other hand, you can design the architecture of a medium-sized or even complex application by hand on paper. In fact, I usually do. Engineering tools (such as Enterprise Architect) are not good enough to replace good old by-hand diagrams.
Good projects could be: how would you design a game engine? Classes, threads, storage, physics, the data structures that will hold everything, and so on. How would you start a search engine? How would you design a pattern recognition system?
I find that kind of problems much more rewarding than any paper coding you can do.
There are some good examples of simple-ish programming questions in Steve Yegge's article Five Essential Phone Screen Questions (under Area Number One: Coding). I find these are pretty good for doing on pen and paper. Also, the questions under OOP Design in the same article can be done on pen and paper (or even in your head) and are, I think, good exercises to do.
Quite a few online sites for competitive programming are full of sample questions/challenges, sorted by 'difficulty'. Quite often, the simpler categories in the 'algorithms' questions would suit you I think.
For example, check out TopCoder (algorithms section)!
Apart from that, 2 samples:
You are given a list of N points in the plane by their coordinates (x_i, y_i), and a number R>0. Output the maximum number out of the N given points that can be simultaneously covered by a disk of radius R (for bonus points: complexity?).
You are given an array of N numbers a1 to aN, and you want to compute a1 * a2 * ... * aN / ai for all values of i (so the output is again an array of N elements) without using division. Provide a (non-naive) method (complexity should be in O(N) multiplications).
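For the second problem, the standard prefix/suffix-product trick can be sketched like this (two passes, O(N) multiplications, no division):

```python
def products_except_self(a):
    """Return p where p[i] = product of all a[j] for j != i, without
    division. Two O(N) passes: prefix products, then suffix products."""
    n = len(a)
    p = [1] * n
    prefix = 1
    for i in range(n):               # after this pass, p[i] = a[0]*...*a[i-1]
        p[i] = prefix
        prefix *= a[i]
    suffix = 1
    for i in range(n - 1, -1, -1):   # multiply in a[i+1]*...*a[n-1]
        p[i] *= suffix
        suffix *= a[i]
    return p

print(products_except_self([1, 2, 3, 4]))  # → [24, 12, 8, 6]
```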
I also like Project Euler, but I would like to point out that the questions get really tricky really fast. After the first 20 or so, they start to be problems most people won't be able to figure out in half an hour. Another issue is that a lot of them involve math with really large numbers that don't fit into standard integer or even long variable types.
Towers of Hanoi is great practice for recursion.
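As a minimal sketch of that recursion:

```python
def hanoi(n, source, target, spare, moves=None):
    """Move n disks from source to target using spare; returns the move list."""
    if moves is None:
        moves = []
    if n > 0:
        hanoi(n - 1, source, spare, target, moves)   # clear the way
        moves.append((source, target))               # move the largest disk
        hanoi(n - 1, spare, target, source, moves)   # stack the rest on top
    return moves

print(len(hanoi(3, "A", "C", "B")))  # 2**3 - 1 = 7 moves
```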
I'd also do a search on sample programming interview questions.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 1 year ago.
I am working on parking occupancy prediction using machine learning, specifically random forest regression. I have 6 features. I have tried to implement the random forest model, but the results are not good. As I am very new to this, I do not know what kind of model is suitable for this problem. My dataset is huge: 47 million rows. I have also used randomized search CV, but I cannot improve the model. Kindly have a look at the code below and help me improve it, or suggest another model.
Random forest regression
The features are extracted from the location data of the parking lots with a buffer. Kindly help me improve this.
So, the variables you used are:
['restaurants_pts','population','res_percent','com_percent','supermarkt_pts', 'bank_pts']
The thing I see is that, for a given parking lot, those variables won't change, so the regression will just predict the "average" occupancy of that lot. A key part of your problem seems to be that occupancy is not the same at 5pm and at 4am...
I'd suggest you work on a time variable (e.g. arrival time) so it's usable.
By itself, the variable cannot be understood by the model, but you can use it to create categories. For example, preprocess it to keep only the HOUR, and then bin the hours into categories (either each hour being its own category, or larger buckets like ['midnight-6am', '6am-10am', '10am-2pm', '2pm-6pm', '6pm-midnight']).
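A minimal sketch of that hour-bucketing preprocessing in plain Python (the bucket boundaries here are only an illustration; tune them to your data):

```python
from datetime import datetime

def hour_bucket(ts):
    """Map an arrival timestamp (ISO format string) to a coarse
    time-of-day category usable as a model feature."""
    h = datetime.fromisoformat(ts).hour
    if h < 6:
        return "midnight-6am"
    elif h < 10:
        return "6am-10am"
    elif h < 14:
        return "10am-2pm"
    elif h < 18:
        return "2pm-6pm"
    else:
        return "6pm-midnight"

print(hour_bucket("2023-05-01 17:00:00"))  # → 2pm-6pm
print(hour_bucket("2023-05-01 04:30:00"))  # → midnight-6am
```

The resulting category can then be one-hot encoded alongside the existing location features.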
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm fairly new to machine learning and I'm working on preprocessing my training data using linear feature scaling.
My question is, given a .csv file where each column of data represents a feature, with what minX and maxX values should I be normalizing my data?
More specifically, should I be normalizing each feature separately (using minX/maxX values from each column), normalizing all the data at once (finding minX/maxX from the entire dataset, ergo all the features), or normalizing on an input-by-input basis?
Normalize each feature separately. What you want is to limit the range of each feature to a well-defined interval (e.g. [0,1]).
Use data from training data set only.
If you use Min-Max scaling you are going to have a smaller standard deviation, which is not bad. Whether to use Min-Max scaling or standardization (mu=0, std=1) depends on your application.
You want all of your features to be in the same range for linear classifiers (and not only them! Also for neural nets!). The reason why you want to scale should be very clear to you before moving forward. Take a look at Andrew Ng's lecture on this subject for an intuitive explanation of what's going on.
Once this is clear, you should have the answer to your question: normalize each feature individually. For example, if you have a table with 3 rows:
row | F1 | F2
1 | 1 | 1000
2 | 2 | 2000
3 | 3 | 3000
You want to scale F1 by taking its max value (3) and its min value (1). You are going to do the same for F2 having 3000 and 1000 as max and min respectively.
This is called MinMax scaling. You can also do scaling based on mean and variance, or follow another approach entirely by thinking that you usually have a "budget" in terms of computational resources and you want to maximize it. In that case, something like Histogram Equalization might be a good choice.
A final note: if you are using decision trees (as a standalone classifier, or in a decision forest or in a boosted ensemble) then don't bother normalizing, it won't change a thing.
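A minimal per-feature Min-Max sketch in plain Python (illustrative; in practice you would compute the min/max on the training set only and reuse them at test time, as noted above):

```python
def min_max_scale(rows):
    """Scale each column (feature) of `rows` to [0, 1] independently,
    using that column's own min and max."""
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    maxs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi != lo else 0.0   # guard constant columns
         for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]

data = [[1, 1000], [2, 2000], [3, 3000]]   # the F1/F2 table above
print(min_max_scale(data))  # → [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
```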
I am designing a database that will be based around a group progress quiz. The quiz consists of 55 questions, and ideally a group of 10 people will take the quiz every few weeks, each person taking it once for everyone in the group, including themselves. So, each time the group takes the quiz, 100 pieces of data will be added to the database.
Currently my table for storing the quiz answers has the following columns:
quiz_taker_id // person taking the quiz
quiz_subject_id // taker is answering questions about this person
quiz_id // identifies if this is the 1st time taking the quiz, 2nd time, etc
question1 // answer to question 1
question2 // answer to question 2
... // etc, for all quiz questions
The quiz answers are incredibly simple: each is just a rating from 0-5 on one of a person's characteristics. Is this a good way to store this data? Are there better ways to do it? I am just starting to set up the website and DB, so I want to make sure I am approaching this the right way.
Whenever you want to process the data in any way (like making post-game stats), it is a good idea to use a database. Your DB design is very simple and lacks flexibility: if, say, you want to add more questions later, you have to add an extra column.
So it really depends on what you plan to do with the collected data and if you plan to extend your quiz rules.
This is too long for a comment.
The questions should be in a single table, questions, not one column per question. A basic questions table would have each question and its correct answer. This is probably good enough for your problem.
For surveys (and for quizzes, I imagine), there is a versioning problem, because questions can slowly change over time. As a somewhat trivial example, you might start by asking "What is your gender?" and initially offer two answers, "Male" and "Female". Over time, you might add other answers: "Other", "Transsexual", "Hermaphrodite" and so on. When analyzing the answers, you might need to know the version of the question that was asked at a particular time.
This is a survey example, where there is no right answer, but a similar idea applies to quizzes: the questions and answers might evolve somewhat over time. You still want them to be recognized as Question 2, but you also want to know the version being asked.
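A hypothetical sketch of such a normalized layout, using Python's sqlite3 (all table and column names here are illustrative, not from the question): one row per answer instead of one column per question, with a version column on the questions table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (
        question_id INTEGER PRIMARY KEY,
        text        TEXT NOT NULL,
        version     INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE answers (
        quiz_taker_id   INTEGER NOT NULL,   -- person taking the quiz
        quiz_subject_id INTEGER NOT NULL,   -- person being rated
        quiz_id         INTEGER NOT NULL,   -- which sitting of the quiz
        question_id     INTEGER NOT NULL REFERENCES questions(question_id),
        rating          INTEGER NOT NULL CHECK (rating BETWEEN 0 AND 5),
        PRIMARY KEY (quiz_taker_id, quiz_subject_id, quiz_id, question_id)
    );
""")
conn.execute("INSERT INTO questions (text) VALUES ('Is reliable')")
conn.execute("INSERT INTO answers VALUES (1, 2, 1, 1, 4)")
print(conn.execute("SELECT rating FROM answers").fetchone()[0])  # → 4
```

Adding a 56th question is then just another row in questions, with no schema change.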
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more.
Closed 7 years ago.
I was refactoring old code and encountered several IF conditions that were way too complex and long and I'm certain they can be simplified. My guess is that those conditions grew so much because of later modifications.
Anyway, I was wondering if any of you know of a good online simplifier I can use. I'm not interested in any specific language, just a simplifier that would take in for example:
((A OR B) AND (!B AND C) OR C)
And give me a simplified version of the expression, if any.
I've looked at the other similar questions but none point me to a good simplifier.
Thanks.
You can try Wolfram Alpha as in this example based on your input:
http://www.wolframalpha.com/input/?i=((A%20OR%20B)%20AND%20(NOT%20B%20AND%20C)%20OR%20C)&t=crmtb01&f=rc
Try Logic Friday. It includes tools from the University of California (Espresso and misII) and makes them usable through a GUI. You can enter boolean equations and truth tables as desired. It also features graphical gate diagram input and output.
The minimization can be carried out two-level or multi-level. The two-level form yields a minimized sum of products. The multi-level form creates a circuit composed of logic gates; the types of gates can be restricted by the user.
Your expression simplifies to C.
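You can confirm this with a brute-force truth table in Python (exhaustive over the 8 assignments, taking AND to bind tighter than OR as in the question):

```python
from itertools import product

def expr(a, b, c):
    # ((A OR B) AND (NOT B AND C)) OR C
    return ((a or b) and ((not b) and c)) or c

# Exhaustively check that the expression equals C for every assignment.
same_as_c = all(expr(a, b, c) == c for a, b, c in product([False, True], repeat=3))
print(same_as_c)  # → True
```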
I found that The Boolean Expression Reducer is much easier to use than Logic Friday. Plus it doesn't require installation and is multi-platform (Java).
Also in Logic Friday the expression A | B just returns 3 entries in truth table; I expected 4.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am working on a system that can create made-up fantasy words based on a variety of user input, such as syllable templates or a modified Backus-Naur Form. One planned new mode, though, is machine learning: the user does not explicitly define any rules, but pastes in some text, and the system learns the structure of the given words and creates similar words.
My current naïve approach would be to create a table of letter neighborhood probabilities (including a special end-of-word "letter") and filling it by scanning the input by letter pairs (using whitespace and punctuation as word boundaries). Creating a word would mean to look up the probabilities for every letter to follow the current letter and randomly choose one according to the probabilities, append, and reiterate until end-of-word is encountered.
But I am looking for more sophisticated approaches that (probably?) provide better results. I do not know much about machine learning, so pointers to topics, techniques or algorithms are appreciated.
I think that for independent words (and especially names), a simple Markov chain system (which you seem to describe when talking about using letter pairs) can perform really well. Feed it a lexicon and throw it a seed to generate a new name based on what it learned. You may want to tweak the prefix length of the Markov chain to get nice-sounding results (as pointed out in a comment on your question, 2 letters work much better than one).
I once tried it with elvish and orcish names dictionaries and got very satisfying results.
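A minimal sketch of that letter-pair (order-1) Markov approach in Python (the tiny lexicon here is just an illustration):

```python
import random
from collections import defaultdict

def train(words):
    """Order-1 letter Markov chain: map each letter (or the start marker)
    to the letters observed to follow it; repeats encode frequency."""
    table = defaultdict(list)
    for w in words:
        w = "^" + w.lower() + "$"   # ^ = start of word, $ = end of word
        for a, b in zip(w, w[1:]):
            table[a].append(b)
    return table

def generate(table, rng, max_len=12):
    """Walk the chain from the start marker until end-of-word or max_len."""
    out, cur = [], "^"
    while len(out) < max_len:
        cur = rng.choice(table[cur])
        if cur == "$":
            break
        out.append(cur)
    return "".join(out)

rng = random.Random(42)
lexicon = ["legolas", "thranduil", "galadriel", "elrond", "arwen"]
model = train(lexicon)
print([generate(model, rng) for _ in range(3)])
```

Raising the order (conditioning on the previous two letters instead of one) is the "prefix length" tweak mentioned above and usually gives more pronounceable results.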