Exposing table name and field names in request URL - mysql

I was tasked with creating a Joomla component (yep, Joomla, but that's unrelated), and a professor told me that I should make my code as dynamic as possible (code that needs less maintenance) and avoid hard coding. The approach we initially thought of is to take URL parameters, turn them into objects, and pass them to the query.
Let's say we want to read the hotel with id 1 in the table "hotels". Let's say the table has the fields "hotel_id", "hotel_name" and some other fields.
Now, the approach we took in building the SQL query string is to parse a request URL that looks like this:
index.php?task=view&table=hotels&hotel_id=1&param1=something&param2=somethingelse
and turn it into a PHP object like this (shown in its JSON equivalent, which is easier to understand):
obj = {
    'table':'hotels',
    'conditions':{
        'hotel_id':'1',
        'param1':'something',
        'param2':'somethingelse'
    }
}
The SQL query will then be something like the following. The conditions are looped over and appended to the string, with the key and value of each entry becoming the field and value in the WHERE clause (still in JSON form for ease):
SELECT * FROM obj.table WHERE hotel_id=1 AND param1=something and so on...
The problem that bugged me was that the table name and field names are exposed in the request URL. I know it poses a security risk to expose items that should only be visible on the server side. The solution I'm currently considering is to give aliases to each and every table and field for the client side, but that would be hard coding, which is against his policy. Besides, if I did that and had a thousand tables to alias, it would not be practical.
What is the proper method to do this that:
avoids hard coding stuff
keeps the code dynamic and adaptable
EDIT:
Regarding the arbitrary queries (I forgot to include this), what currently stops them in the back end is a function that takes a reference from a hard-coded object (more like a config file, shown here) and parses the URL by picking out the parameters that match it.
The config looks like:
// 'hotels' here is the table name. instead of parsing the url for a table name
// php will just find the table from this config. if no match, return error.
// reduces risk of arbitrary tables.
'hotels' => array(
    // fields and their types, used to identify what filter to use
    'structure' => array(
        'hotel_id' => 'int',
        'name' => 'string',
        'description' => 'string',
        'featured' => 'boolean',
        'published' => 'boolean'
    ),
    // these are the list of 'tasks' and accepted parameters, based on the ones above
    // these are the actual parameter names which i said were the same as field names
    // the ones in 'values' are usually values for inserting and updating
    // the ones in 'conditions' are the ones used in the WHERE part of the query
    'operations' => array(
        'add' => array(
            'values' => array('name', 'description', 'featured'),
            'conditions' => array()
        ),
        'view' => array(
            'values' => array(),
            'conditions' => array('hotel_id')
        ),
        'edit' => array(
            'values' => array('name', 'description', 'featured'),
            'conditions' => array('hotel_id')
        ),
        'remove' => array(
            'values' => array(),
            'conditions' => array('hotel_id')
        )
    )
)
And so, from that config list:
if the parameters sent for a task are incomplete, the server returns an error
if a parameter in the URL is duplicated, only the first one read is taken
any other parameters not in the config are discarded
if a task is not allowed, it won't be listed for that table
if a task is not there, the server returns an error
if a table is not there, the server returns an error
I actually patterned this after a Joomla component that uses this strategy. It reduces the model and controller to 4 dynamic functions (the CRUD operations), leaving the config file as the only file that needs editing later on. This is what I meant by dynamic code: I only add tables and tasks to the config when further tables are needed. But I fear it may pose a security risk that I am not yet aware of.
Any ideas for an alternative?

I have no problem with using the same (or very similar) names in the URL and the database — sure, you might be "exposing" implementation details, but if you're choosing radically different names in the URL and the DB, you're probably choosing bad names. I'm also a fan of consistent naming — communication with coders/testers/customers becomes much more difficult if everyone calls everything something slightly different.
What bugs me is that you're letting the user run arbitrary queries on your database. http://.../index.php?table=users&user_id=1, say? Or http://.../index.php?table=users&password=password (not that you should be storing passwords in plaintext)? Or http://.../index.php?table=users&age=11?
If the user connected to the DB has the same permissions as the user sitting in front of the web browser, it might make sense. Generally, that's not going to be the case, so you'll need some layer that knows what the user is and isn't allowed to see, and that layer is a lot easier to write correctly by whitelisting.
(If you've stuck enough logic into stored procedures, then it might work, but then your stored procedures will hard-code column names...)
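As a rough illustration of the whitelisting layer, here is a minimal sketch. It is written in Ruby for brevity (the same shape carries over to PHP), and the `ALLOWED` table and the `build_select` helper are hypothetical names modelled on the config idea in the question, not part of Joomla or any library. The table and column names only reach the SQL string after matching the whitelist, and the values are returned separately so they can be bound as parameters.

```ruby
# Hypothetical whitelist: table => task => columns allowed in the WHERE clause.
ALLOWED = {
  "hotels" => {
    "view"   => %w[hotel_id],
    "remove" => %w[hotel_id]
  }
}.freeze

# Builds a parameterised SELECT from request parameters.
# Returns [sql, bind_values]; raises ArgumentError for anything not whitelisted.
def build_select(params)
  table = params["table"]
  task  = params["task"]
  columns = ALLOWED.dig(table, task)
  raise ArgumentError, "table or task not allowed" if columns.nil?

  missing = columns.reject { |c| params.key?(c) }
  raise ArgumentError, "missing parameters: #{missing.join(', ')}" unless missing.empty?

  # Identifiers are only used once they have matched the whitelist;
  # values are kept out of the SQL string and bound later.
  where = columns.map { |c| "#{c} = ?" }.join(" AND ")
  ["SELECT * FROM #{table} WHERE #{where}", columns.map { |c| params[c] }]
end
```

With this shape, a request for `table=users&password=password` fails before any SQL is built, and extra parameters such as `param1` are silently ignored.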

Composing a SQL query from user input presents a security risk. But keep in mind that column values also reach the database by taking input from the user, analyzing it, and composing a SQL query with it (unless you use prepared statements). So when done properly, you have nothing to worry about: simply restrict the user to the allowed columns and tables. The code and database schema of open source software are visible to everyone, and that doesn't harm the system as much as one would think.

Your aliases could be a rot13() of the names of your objects.
That said, if you escape the input properly when working with those names, I don't see any problem in exposing them.

Getting the value of a particular cell with AJAX

My goal is very simple and I would guess it is a very common goal among web developers.
I am creating a Rails (5.1) application, and I simply want to use AJAX to get the value of a specific cell in a specific row of a specific table in my database (later I am going to use that value to highlight some text on the current page in the user's browser).
I have not been able to find any documentation online explaining how to do this. As I said, it seems like a basic task to ask of jQuery and AJAX, so I'm confused as to why I'm having so much trouble figuring it out.
For concreteness, say I have a table called 'animals', and I want to get the value of the column 'species' for the animal with 'id' = 99.
How can I construct an AJAX call to query the database for the value of 'species' for the 'animal' with 'id' = 99?
Though some DBs provide a REST API, what we commonly do is define a route in the app to pull and return data from the DB.
So:
Add a route
Add a controller/action for that route
In that action, fetch the data from the DB and render it in your preferred format
On the client-side, make the AJAX call to that controller/action and do something with the response.
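To make step 3 concrete, here is a plain-Ruby sketch of what the action boils down to. It is not Rails code: the `ANIMALS` hash is a made-up stand-in for the table and `species_response` is a hypothetical helper. In a real Rails app the lookup would more likely be `Animal.find_by(id: params[:id])` followed by `render json:`, behind a route you define yourself (for example something like `get "/animals/:id/species"`, which is an assumption, not a convention).

```ruby
require "json"

# Made-up in-memory stand-in for the animals table (id => row).
ANIMALS = { 99 => { "species" => "otter" } }.freeze

# Plain-Ruby version of what the controller action would do:
# look up the row, then render one column as JSON.
# Returns [http_status, json_body].
def species_response(id)
  row = ANIMALS[Integer(id, exception: false)]
  return [404, JSON.generate("error" => "not found")] if row.nil?

  [200, JSON.generate("species" => row["species"])]
end
```

On the client, the AJAX call then only needs to request that URL (with jQuery, `$.getJSON` would do) and read the `species` key from the response.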

Thinking sphinx ranking and statistics

I'm trying to set up an ability to get some numbers from my Sphinx indexes, but not sure how to get the info I want.
I have a mysql db with articles, sphinx index set up for that db and full text search, all working. What I want is to get some numbers:
How many times the search text (a keyword or key phrase) appears over all articles for all time (more likely limited to articles from a time interval between X and Y)
Same as the previous, but for how many times 2 keywords or key phrases (so "x AND y") appear in the same articles
I was doing something similar to the first one manually, using a .bat file I made:
indexer ind_core -c c:\%SOME_PATH%\development.sphinx.conf --buildstops stats.txt 10000 --buildfreqs
This generated a text file with all the repeating keywords and how often they appear. It was useful in the early development stages and helped me form a list of keywords I'm interested in. Now I'm trying to do the same, but only for a finite list of predetermined keywords, and integrated into my Rails project so that I can build charts in the future.
I tried running some queries like
@testing = Article.search 'Keyword1 AND Keyword2', :ranker => :wordcount
but I'm not sure how it works and how to process the result, as well as if that's what I'm looking for.
Another approach I tried was manual mysql queries such as
SELECT id,title,WEIGHT() AS w FROM ind_core WHERE MATCH('@title keyword1 | keyword2') OPTION ranker=expr('sum(hit_count)');
but I'm not sure how to process results from here either (as well as how to actually implement it into my existing rails project), and it's limited to 20 lines per query (which I think I can change somewhere in settings?). But at least looking at mysql results what I'm interested in is hit_count over all articles (or all articles from set timeframe).
Any ideas on how to do this?
UPDATE:
Current way I found was to add
@testing = Article.search params[:search], :without => {:is_active => false}, :ranker => :bm25
to controller with some conditions (so it doesn't bug out from nil search). :is_active is my soft delete flag, don't want to search deleted entries, so don't mind it. And in view I simply displayed
<%= @testing.total_entries %>
which, if I understand it correctly, shows me the number of matches Sphinx found (so pretty much what I was looking for).
So, to figure out the number of hits per document, you're pretty much on the right track, it's just a matter of getting it into Ruby/Thinking Sphinx.
To get the raw Sphinx results (if you don't need the ActiveRecord objects):
search = Article.search "foo",
:ranker => "expr('SUM(hit_count)')",
:select => "*, weight()",
:middleware => ThinkingSphinx::Middlewares::RAW_ONLY
… this will return an array of hashes, and you can use the weight() string key for the hit count, and the sphinx_internal_id string key for the model's primary key (id is Sphinx's own primary key, which isn't so useful).
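Once you have that array, turning it into per-article counts is ordinary Ruby. The sketch below uses a made-up `raw_results` sample in the shape just described (one hash per match, with the string keys "sphinx_internal_id" and "weight()"); `hits_by_article` and `total_hits` are hypothetical helper names, not part of Thinking Sphinx.

```ruby
# Made-up sample in the shape described above: one hash per match.
raw_results = [
  { "sphinx_internal_id" => 12, "weight()" => 3 },
  { "sphinx_internal_id" => 47, "weight()" => 1 }
]

# Map each article id to its hit count.
def hits_by_article(rows)
  rows.to_h { |row| [row["sphinx_internal_id"], row["weight()"].to_i] }
end

# Total hits over the rows passed in.
def total_hits(rows)
  hits_by_article(rows).values.sum
end
```

Note that `total_hits` only covers the rows you hand it, so with a paginated search it reflects a single page of results rather than the whole index.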
Or, if you want to use the ActiveRecord objects, Thinking Sphinx has the ability to wrap each search result in a helper object which passes appropriate methods through to the underlying model instances, but lets weight respond with the values from Sphinx:
search = Article.search "foo",
:ranker => "expr('SUM(hit_count)')",
:select => "*, weight()"; ""
search.context[:panes] << ThinkingSphinx::Panes::WeightPane
search.each do |article|
puts article.weight
end
Keep in mind that panes must be added before the search is evaluated, so if you're testing this in a Rails console, you'll want to avoid letting the console inspect the search variable (which I usually do by adding ; "" at the end of the initial search call).
In both of these cases, as you've noted, the search results are paginated - you can use the :page option to determine which page of results you want, and :per_page to determine the number of records returned in each request. There is a standard limit of 1000 results overall, but that can be altered using the max_matches setting.
Now, if you want the number of times the keywords appear across all Sphinx records, then the best way to do that while also taking advantage of Thinking Sphinx's search options, is to get the raw results of an aggregate SUM - similar to the first option above.
search = Article.search "foo",
:ranker => "expr('SUM(hit_count)')",
:select => "SUM(weight()) AS count",
:middleware => ThinkingSphinx::Middlewares::RAW_ONLY
search.first["count"]

How can we use the auth_rule table in Yii2 RBAC?

In Yii 2 RBAC, there is a new table called auth_rule. Can anyone explain its usage with a small example?
create table [auth_rule]
(
[name] varchar(64) not null,
[data] text,
[created_at] integer,
[updated_at] integer,
primary key ([name])
);
The basic parts of Yii's RBAC concept stayed exactly the same. In both Yii1 and Yii2 you have the following tables:
auth_item: holds the actual rights, groups, roles, etc.
auth_item_child: defines the graph / hierarchy of the items
auth_assignment: assigns an item to a user
In Yii2 you now have a fourth table:
auth_rule: holds reusable rules to check if a right is actually granted
Why is this?
Yii1
The concept behind the rule was already there in Yii1... kind of, at least. In Yii1 you had the possibility to define a "bizrule" in auth_item and auth_assignment. "bizrule" and "data" were columns in both of those tables.
The contents of the columns were the following:
bizrule: held php-code which had to return a boolean value. This code was executed during rights check with eval(). That way you could control if a right was granted or not even though the user had the item assigned. Example: it makes no sense, but you could give a user a right only on even hours with this bizrule: return date('h') % 2 == 0.
data: held params which could be passed to the bizrule while it was being executed. This data was then available in the scope of the bizrule.
Yii2
The above solution works perfectly, except that the code of a bizrule is not reusable. Therefore this functionality was extracted into its own table.
If you look at the migration-file creating the basic rbac-tables (yii\rbac\migrations\m140506_102106_rbac_init.php) you can see that the item table now has a relation to the rule-table instead of hosting the code in one of its own columns.
There is, however, no relationship between auth_assignment and auth_rule. In Yii1, the bizrule on an assignment allowed you to disable groups of rights at once. Since you can now reuse a rule and attach it to all relevant items, this is no longer necessary and was therefore removed.
Example
If you look at the actual implementation of yii\rbac\DbManager and yii\rbac\BaseManager, an example shouldn't be necessary. The interesting methods are the following:
DbManager::addRule(): serializes and persists a rule-instance
DbManager::getRule(): here you can see how the rule is retrieved, unserialized and returned. This means the rule is saved in a serialized format within the data-column of auth_rule.
BaseManager::executeRule(): the rule loaded above is executed via Rule::execute()
If you want to add a rule simply create an instance of yii\rbac\Rule and call DbManager::addRule($rule) with it as its param. This will serialize and save your rule making it reusable elsewhere. Awesome!
Voilà...should be pretty clear now. If you have some open questions or want more details just write a comment.
Cheers and have a good one!
The rule attribute data is serialized.
What does this data look like? Is it like the array below as not yet unserialized?
[
'allow' => true,
'actions' => ['view'],
'roles' => ['viewPost'],
],

Design best practice - model vs controller vs UI - CakePHP, MySQL

I have been struggling for a few days with this problem and finally seek the opinion of the experts and crowd at this website.
1. I have two tables - one is a template of workflow steps and the other is an instance of these workflow steps called events. The templates table contains information like step name, step type etc - very generic information. The event table contains a reference link back to the workflow step table and an additional column called notes, which stores data that the user entered when they logged a particular workflow step. Both Workflow Steps and Events are linked to a POST on the website.
2. Workflow step templates can exist without events having yet occurred - that is, the user may still be on Step 3 or Step 5 and not have logged an event for Step 1, 2 or 4. Basically, the order of steps is only suggestive, not binding. Workflow Steps have a sequence field that dictates the order in which they should appear on screen.
3. Events can also occur without a workflow step - in other words, a user can log a note outside the context of workflow steps. These are generic events and are directly associated with the POST.
4. I am able to successfully retrieve both of these values for a given POST - they are retrieved as two separate arrays. I am using CakePHP and MySQL.
5. The UI needs to render a screen that shows all the workflow steps in order and the corresponding events that have occurred in correlation to these steps or outside of these steps. The ordering of the screen will be driven primarily by the sequence of workflow steps and secondarily by created_date for those events that are not associated with a particular workflow step.
Problem statement -
1. Do I send two separate arrays (as noted in #4) to the UI and let the UI determine the complex logic of how to interweave the steps and events for display?
2. Do I process the interweaving of steps and events in the controller and then send to the UI a simple array that it can loop through and display?
3. I have tried moving this logic to the database but because of variations explained in #2 and #3 it becomes quite complicated
I am seeking advice on which would be the better option from a design practice as well as from a simplification point of view. I understand that I have given a limited picture here, but I am hoping that someone on this website may have run into a similar issue elsewhere.
Depending on how you are assigning events to users, I would make a hasOne relation from Event to Workflow. You would need another relationship for your users, hasOne or hasMany.
$hasOne = 'Workflow';
Obviously this would mean that your Event table would have a column called workflow_id and would be associated with a single row in your workflow table. In the controller I would fetch the events by user.
$this->Event->findAllByUserId($user_id);
This should provide you with an array that might look something like this.
array(
    [0] => array(
        [Event] => array(
            'id' => 1,
            'name' => 'blah',
            ...
        ),
        [User] => array(
            'id' => 1,
            'name' => 'Charles',
            ...
        ),
        [Workflow] => array(
            [0] => array(
                'id' => 1,
                'name' => 'more blagblagblag',
                ...
            ),
            [1] => array(
                'id' => 9,
                'name' => 'sblagblsagblag',
                ...
            ),
            [2] => array(
                'id' => 42,
                'name' => 'mordse d',
                ...
            )
        )
    )
)
Call all your workflow templates
$this->Workflow->find('all');
Then I would use Cake's built-in Set:: functionality to print the workflow template in your view and use your Event call to fill in the data.
Please post more detail and your code, models, etc., and I'm sure we can get you the exact query/logic you'll need to achieve this.
http://book.cakephp.org/2.0/en/core-utility-libraries/set.html
OK - I have solved this. I ended up moving the functionality to the Model.
I created two SQL queries. The first retrieves all the workflow steps along with any event information that may be associated with each of them.
The second retrieves all those events that are stand-alone and not associated with any particular workflow step.
I used UNION ALL to stack them on top of each other.
I used a sort on modified date and sequence number so that all the steps and events appear in chronological order and sequence.
I then passed this from the Model to the View (via the controller) and let the View iterate and display the elements. This approach actually simplified my View and Controller code immensely, and even the Model code is quite simple since all it is is a query statement with parameters.
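For readers who would rather see the ordering spelled out in code than in SQL, here is a small in-memory sketch. It is written in Ruby rather than PHP, the rows are made up, and `timeline` is a hypothetical helper. It also assumes one particular reading of the requirement: rows tied to a workflow step come first in sequence order, followed by stand-alone events in order of creation date.

```ruby
# Made-up rows in the shape the two queries might return.
step_rows = [
  { sequence: 2, step: "Review", note: nil,           created: "2013-01-05" },
  { sequence: 1, step: "Draft",  note: "first draft", created: "2013-01-02" }
]
standalone_events = [
  { sequence: nil, step: nil, note: "general remark", created: "2013-01-03" }
]

# Equivalent of UNION ALL followed by a sort: step rows first, in sequence
# order, then stand-alone events in order of creation date.
def timeline(step_rows, standalone_events)
  (step_rows + standalone_events).sort_by do |row|
    row[:sequence] ? [0, row[:sequence], row[:created]] : [1, 0, row[:created]]
  end
end
```

The result is a single flat list, so the view only has to loop over it and render each row.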

Help me optimize an ActiveRecord object with too many attributes

I'm working on an app which ties to a legacy database. The primary model is based on a stupidly large 100+ column table. I don't know too much about the inner workings of ActiveRecord, but it seems to me that any request on this model is slow because it's creating objects with 100+ attributes. Let's call this SlowModel.
Rendering pages with this model sometimes takes 17 seconds on my dev computer. Straight-up MySQL queries only take ~0.5 - 1 second.
I've managed to speed up one portion of the app by using a MySQL view that selects a subset of fields (20 or so). We'll call this QuickModel. Using views is OK but isn't the most portable solution.
I will likely continue to try and add this QuickModel into other parts of the site but I was wondering if anyone had other ideas in speeding up the original object. For instance, is there a way to specify in the model what columns activerecord should just ignore and avoid building? Maybe there are specific column types (:text??) that cause bloat in ActiveRecord objects.
Assume that columns have proper indices.
You can specify which columns are returned in the model lookup using the :select option of the ActiveRecord lookup:
SlowModel.all(:select => 'id, col1, col2, col3')
...will load instances of SlowModel with only the specified columns populated.
How about having a completely new QuickModel that sits on its own table... and a QuickModel has_one SlowModel?
You can use SQL to move the most-necessary data into the QuickModel table and only refer to the SlowModel using my_quick_model.slow_model when necessary.
Alternatively, you can add a "select" to the default scope (you can google "rails default scope" for more). By default it'll only fetch the reduced set - but you can ask for all attributes by passing :select => "*" if necessary.
Along the lines of what Winfield is saying, you may want to take a look at using an attribute tracker like SlimScrooge. The tracker attempts to fetch only the data that you're using, which reduces overhead. It attempts to automatically do what Winfield is suggesting.
Example from the Readme:
# 1st request, sql is unchanged but columns accesses are recorded
Brochure Load SlimScrooged 1st time (27.1ms) SELECT * FROM `brochures` WHERE (expires_at IS NULL)
# 2nd request, only fetch columns that were used the first time
Brochure Load SlimScrooged (4.5ms) SELECT `brochures`.expires_at,`brochures`.operator_id,`brochures`.id FROM `brochures` WHERE (expires_at IS NULL)
# 2nd request, later in code we need another column which causes a reload of all remaining columns
Brochure Reload SlimScrooged (0.6ms) `brochures`.name,`brochures`.comment,`brochures`.image_height,`brochures`.id, `brochures`.tel,`brochures`.long_comment,`brochures`.image_name,`brochures`.image_width FROM `brochures` WHERE `brochures`.id IN ('5646','5476','4562','3456','4567','7355')
# 3rd request
Brochure Load SlimScrooged (4.5ms) SELECT `brochures`.expires_at,`brochures`.operator_id,`brochures`.name, `brochures`.id FROM `brochures` WHERE (expires_at IS NULL)