Teradata column usage details - teradata-sql-assistant

I have been a lurking visitor here for years, but I think this is my first time asking a question, so here goes:
Is there a way in Teradata SQL Assistant 16.20 to find how often specific table columns are used in queries, without being a DB admin? Basically, we have a rather large table that keeps growing in number of columns, and we would like to deprecate the ones nobody is using, but we can't very well ask over 100 users which columns they use. My team "manages" this table to the extent of creating it and populating the data, but we do not have full admin rights to the database, so any solution that requires full DB access isn't really an option without submitting a ticket. I can do that, but I would rather find a solution I can do on my own.
Thanks in advance!
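One non-admin option, assuming your site has DBQL object logging enabled and your user has SELECT access on the DBC views (both are site-specific assumptions worth checking): column-level usage is recorded in the query log, and DBC.QryLogObjectsV exposes it. A sketch, with 'YourDB' and 'YourBigTable' as placeholders:

```sql
-- Count how often each column of one table appears in logged queries.
-- Requires DBQL object logging and SELECT on the DBC views; no admin rights.
SELECT  ObjectColumnName,
        COUNT(*) AS QueryCount
FROM    DBC.QryLogObjectsV
WHERE   ObjectDatabaseName = 'YourDB'
  AND   ObjectTableName    = 'YourBigTable'
  AND   ObjectType         = 'Col'
GROUP BY ObjectColumnName
ORDER BY QueryCount DESC;
```

Note that the query log is usually purged on a retention schedule, so the counts only cover the logged window; a column showing zero hits over a short window is not proof it is unused.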

Related

Best Practice for a database that continues to expand with more tables

I am a Ruby on Rails application developer and have helped with the creation of a database "college_a", containing information about the college I work for.
Initially, "college_a" contained tables for apps, apps_auths, and apps_roles. Over the past years it has expanded to contain tables for employees, job info, and supervisors ... all in the "college_a" database. I am looking at saving data about email accounts and am wondering if there are any drawbacks to continuing to add more tables to "college_a" or should I start looking at creating a new database representing data on email accounts which will need to reference data from the "college_a.people" table.
I have no experience maintaining an ever-expanding database, or with querying multiple databases to gather needed data, which is why I am asking: are there any negatives to continuing to add more tables to the "college_a" database?
Thank you.
Well, there is no problem in continuing to add tables, as long as they are well designed and justified. In other words, I don't think there is such a thing as too many tables, but you must make an effort to keep the database well designed.
This means there shouldn't be any redundancy, you should try to keep the database normalized, you should keep documentation about it, etc.
I found this article that enumerates some database mistakes that I think you may find helpful.

What Are Good Solutions for a Database Table that Gets Too Long?

I will describe a problem using a specific scenario:
Imagine that you create a website to which users can register,
and after they register, they can send Private Messages to each other.
This website enables every user to maintain his own Friends list,
and also maintain a Blocked Users list, from which he prefers not to get messages.
Now the problem:
Imagine this website getting to several millions of users,
and let's also assume that every user has about 10 Friends in the Friends table, and 10 Blocked Users in the Blocked Users table.
The Friends list Table, and the Blocked Users table, will become very long,
but worse than that, every time someone wants to send a message to another person "X",
we need to go over the whole Blocked Users table and look for the records that user "X" defined - the people he blocked.
This "scanning" of a long database table, each time a message is sent from one user to another, seems quite inefficient to me.
So I have 2 questions about it:
What are possible solutions for this problem?
I am not afraid of long database tables,
but I am afraid of database tables that contain data for so many users,
which means that the whole table needs to be scanned every time, just to pull out a few records from it for that specific user.
A specific solution that I have in my mind, and that I would like to ask about:
One solution that I have in mind for this problem is that every user who registers on the website will have his own "mini-database" dynamically (and programmatically) created for him;
that way the Friends table and the Blocked Users table will contain only records for him.
This makes scanning those tables very easy, because all the records are for him.
Does this idea exist in Databases like MS-SQL Server, or MySQL? And If yes, is it a good solution for the described problem?
(each user will have his own small database created for him, and of course there is also the main (common) database for all other data that is not user specific)
Thank you all
I would wait on the partitioning and on creating mini-database idea. Is your database installed with the data, log and temp files on different RAID drives? Do you have clustered indexes on the tables and indexes on the search and join columns?
Have you tried any kind of reading Query Plans to see how and where the slowdowns are occurring? Don't just add memory or try advanced features blindly before doing the basics.
Creating separate databases will become a maintenance nightmare, and it will be challenging to run the kind of queries (for all users...) that you will probably want to run in the future.
Partitioning is a wonderful feature of SQL Server and while in 2014 you can have thousands of partitions you probably (unless you put each partition on a separate drive) won't see the big performance bump you are looking for.
SQL Server has very fast response time for tables (especially for tables with 10s of millions of rows (in your case the user table)). Don't let the main table get too wide and the response time will be extremely fast.
Right off the bat my first thought is this:
https://msdn.microsoft.com/en-us/library/ms188730.aspx
Partitioning can allow you to break it up into more manageable pieces and in a way that can be scalable. There will be some choices you have to make about how you break it up, but I believe this is the right path for you.
In regards to table scanning: if you have proper indexing you should be getting seeks in your queries. You will want to look at execution plans to know for sure, though.
As for having mini-DB for each user that is sort of what you can accomplish with partitioning.
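A minimal T-SQL sketch of what that partitioning could look like, assuming a range partition on UserID (the boundary values and names here are purely illustrative, not a tuned design):

```sql
-- Range-partition the blocked-users table on UserID.
-- Boundary values below are invented for illustration.
CREATE PARTITION FUNCTION pfUserRange (INT)
    AS RANGE RIGHT FOR VALUES (1000000, 2000000, 3000000);

CREATE PARTITION SCHEME psUserRange
    AS PARTITION pfUserRange ALL TO ([PRIMARY]);

CREATE TABLE BlockedUsers (
    UserID        INT NOT NULL,
    BlockedUserID INT NOT NULL,
    CONSTRAINT PK_BlockedUsers PRIMARY KEY (UserID, BlockedUserID)
) ON psUserRange (UserID);
```

Queries filtered on UserID are then eligible for partition elimination, which gives some of the "mini-database per user" effect without the maintenance cost of separate databases.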
Mini-Database for each user is a definite no-go zone.
On a side note: a separate table holding just two columns, UserID and BlockedUserID, both INT, with the correct indexes - you cannot go wrong with this approach, if you write your queries sensibly :)
Look into table partitioning; a well-normalized database with decent indexes will also help.
Also, if you can afford an Enterprise licence, table partitioning combined with the table schema described above will make for a very good, query-friendly database schema.
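A sketch of that two-column table and the "sensible" query (table, column, and parameter names are just examples):

```sql
CREATE TABLE BlockedUsers (
    UserID        INT NOT NULL,  -- the user who did the blocking
    BlockedUserID INT NOT NULL,  -- the user being blocked
    PRIMARY KEY (UserID, BlockedUserID)
);

-- Before delivering a message from @SenderID to @RecipientID:
-- with the composite key above this is an index seek, not a table scan,
-- regardless of how many rows the table holds in total.
SELECT 1
FROM   BlockedUsers
WHERE  UserID        = @RecipientID
  AND  BlockedUserID = @SenderID;
```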
I did this once for a social network system. Maybe you should take a look at your normalization. At the time I had a [Relationship] table, and it held just:
UserAId Int
UserBId Int
RelationshipFlag Smallint
With 1 million users, each with about 10 "friends", that table held 10 million rows. Not a problem, since we put indexes on the columns, and it could retrieve a list of all users B "related" to a specific user A in no time.
Take a good look at your schema and your indexes; if they are OK, your DB will have no problem handling it.
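The table described above might look like this as DDL (a sketch; the flag values are invented examples):

```sql
CREATE TABLE Relationship (
    UserAId          INT      NOT NULL,
    UserBId          INT      NOT NULL,
    RelationshipFlag SMALLINT NOT NULL,  -- e.g. 1 = friend, 2 = blocked
    PRIMARY KEY (UserAId, UserBId)
);

-- All users B related to a specific user A: a simple seek on the
-- composite key, even at tens of millions of rows.
SELECT UserBId, RelationshipFlag
FROM   Relationship
WHERE  UserAId = @UserAId;
```

One nice property of this shape is that friends and blocked users live in one table, distinguished by the flag, so a single indexed lookup answers both questions.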
Edit
I agree with #M.Ali
Mini-Database for each user is a definite no-go zone.
IMHO you are fine if you stick with the basics and implement them the right way.

DB schema to store billion e-mails

I'm trying to develop an application where users can import their e-mails into and search their imported e-mails. As this will probably be used by many users (easily 10k+) the database design is critical. With these numbers of users the database will probably need to be able to hold over a billion rows (e-mails).
The application will need to be able to quickly return records after a search query is posted on the application. The database will be heavily searched, and I would like some help on creating the database table(s) for an efficient db schema. I have a lot of experience with MySQL myself, but I've read somewhere that I shouldn't go that way and should look at MongoDB or something? Is the difference that big, or is there a way I can still go with MySQL?
from
to
subject
date (range)
attachments (names & types only)
message contents
(optional) mailbox / folder structure
These are the searchable fields, of course all e-mails will have an extra two "columns" for the unique id and the user_id. I've found several db schemas of e-mail but I can't find any documentation of a schema that will work with over a billion rows.
You would be best off starting simple with your proposed table definition and going from there - if the site does get near a billion records, then if needed you can move it to Amazon servers or another cloud host, which (should) allow the table to be partitioned.
MySQL can handle a fair amount of data, assuming you are not on a shared host with restrictions.
So start simple, don't optimise a problem that doesn't exist yet, and see how it goes.
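A starting-point MySQL schema along those lines, covering the searchable fields listed in the question (a sketch: column sizes, the attachments split, and the FULLTEXT index are assumptions to revisit under real load):

```sql
CREATE TABLE emails (
    id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id   INT UNSIGNED    NOT NULL,
    from_addr VARCHAR(320)    NOT NULL,
    to_addr   VARCHAR(320)    NOT NULL,
    subject   VARCHAR(998),
    sent_date DATETIME        NOT NULL,
    folder    VARCHAR(255),               -- optional mailbox/folder path
    body      MEDIUMTEXT,
    INDEX idx_user_date (user_id, sent_date),  -- per-user date-range searches
    FULLTEXT INDEX ft_subject_body (subject, body)  -- InnoDB FULLTEXT, MySQL 5.6+
) ENGINE=InnoDB;

-- Attachments are one-to-many, so names & types go in their own table.
CREATE TABLE attachments (
    email_id  BIGINT UNSIGNED NOT NULL,
    filename  VARCHAR(255)    NOT NULL,
    mime_type VARCHAR(127),
    INDEX idx_email (email_id)
) ENGINE=InnoDB;
```

At billion-row scale the body text is the expensive part; a common next step is moving full-text search out to a dedicated engine and keeping MySQL for the structured fields, but that is an optimisation to defer until the simple version actually struggles.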

A practical way to organize MySQL database info?

I'm wondering what is the best way to do this:
1. Person submits review (name, email, webhost, domain hosted, ranking data [five numerical factors], etc.)
2. PHP inserts that review into "SubmittedReviews" table
3. I then oversee the submitted reviews in the back end, then submit the ones I want
4. PHP inserts that info into another table called "LiveReviews" (which has the same table structure as "SubmittedReviews")
(OR, might be better to have PHP create a table for each host, with that hosts' reviews inside it, since there will be many hosts and I'm going to make a separate table to pre-calculate ranking data for my "top hosts" table on the site)
So as I have PHP submit the reviews live (creating a new table [for each host] or just into the "LiveReviews" table) I will also submit the ranking data into another table, adding up all the ranking data for each host, so it is readily available and so I know which hosts are ranking highest.
Or should I just use PHP to calculate the LiveReviews ranking data on the spot when I want to know which host is ranking best? Seeing as the front page will be loading this data often, doesn't seem it'd be good to calculate it every time. I'd rather have it calculated beforehand.
So if I have this "ranking data" table, then it seems I should have tables for each host with all their reviews. Otherwise, it seems just having one large table (LiveReviews) is better.
Hope this makes sense!
What is the best way? I'm pretty new to MySQL.
I must admit that I don't understand exactly what you're trying to achieve. What exactly are your users supposed to review?
However I do have two pieces of advice:
Never organize stuff by placing it in different tables. It's heresy in the world of relational databases! And your queries will end up unnecessarily complicated.
And consider doing calculation in the DB, not in PHP.
Learning database modeling is a good idea. Entity-Relationship diagrams are not hard but very useful.
Good luck.
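One way to apply both pieces of advice above: a single reviews table with a status column instead of the two parallel tables, and the "top hosts" ranking computed in SQL rather than in PHP (all names and the scoring formula are illustrative):

```sql
-- One table instead of SubmittedReviews/LiveReviews: the status column
-- records whether you have approved a review.
CREATE TABLE reviews (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    host     VARCHAR(255) NOT NULL,
    name     VARCHAR(255) NOT NULL,
    email    VARCHAR(320) NOT NULL,
    ranking1 TINYINT NOT NULL,   -- the five numerical ranking factors
    ranking2 TINYINT NOT NULL,
    ranking3 TINYINT NOT NULL,
    ranking4 TINYINT NOT NULL,
    ranking5 TINYINT NOT NULL,
    status   ENUM('submitted','live') NOT NULL DEFAULT 'submitted'
);

-- "Top hosts" for the front page, computed by the database:
SELECT host,
       AVG(ranking1 + ranking2 + ranking3 + ranking4 + ranking5) AS score
FROM   reviews
WHERE  status = 'live'
GROUP BY host
ORDER BY score DESC
LIMIT 10;
```

Approving a review then becomes an UPDATE of one row's status instead of copying it between tables, and if the front-page query ever gets slow you can cache its result rather than maintaining a hand-built pre-calculated table.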

Max Tables & Design Pattern

I am working on an app right now which has the potential to grow quite large. The whole application runs through a single domain, with customers being given sub-domains, which means that it all, of course, runs through a common code-base.
What I am struggling with is the database design. I am not sure if it would be better to have a column in each table specifying the customer id, or to create a new set of tables (in the same database), or to create a complete new database per customer.
The nice thing about a "flag" in the database specifying the customer id is that everything is in a single location. The downfalls are obvious: tables can (will) get huge, and maintenance can become a complete nightmare. If growth occurs, splitting this up over several servers is going to be a huge pain.
The nice thing about creating new tables is that it is easy to do, and it also keeps the tables pretty small. And since customers' data doesn't need to interact, there aren't any problems there. But again, maintenance might become an issue (although I do have a migrations library that will do updates on the fly per customer, so that is no big deal). The other issue is that I have no idea how many tables can be in a single database. Does anyone know what the limit is, and what the performance issues would be?
The nice thing about creating a new database per customer, is that when I need to scale, I will be able to, quite nicely. There are several sites that make use of this design (wordpress.com, etc). It has been shown to be effective, but also have some downfalls.
So, basically I am just looking for some advice on which direction I should (could) go.
Single Database Pros
One database to maintain. One database to rule them all, and in the darkness - bind them...
One connection string
Can use Clustering
Separate Database per Customer Pros
Support for customization on per customer basis
Security: No chance of customers seeing each others data
Conclusion
The separate database approach would be valid if you plan to support customer customization. Otherwise, I don't see the security as a big issue - if someone gets the db credentials, do you really think they won't see what other databases are on that server?
Multiple Databases.
Different customers will have different needs, and it will allow you to serve them better.
Furthermore, if a particular customer is hammering the database, you don't want that to negatively affect the site performance for all your other customers. If everything is on one database, you have no damage control mechanism.
The risk of accidentally sharing data between customers is much smaller with separate database. If you'd like to have all data in one place, for example for reporting, set up a reporting database the customers cannot access.
Separate databases allow you to roll out, and test, a bugfix for just one customer.
There is no limit on the number of tables in MySQL; you can make an insane amount of them. I'd call anything above a hundred tables per database a maintenance nightmare, though.
Are you planning to develop a cloud app?
I think that you don't need to make tables or databases per customer. I recommend you use a more scalable relational database management system. Personally I don't know the capabilities of MySQL, but I'm pretty sure it should support a distributed database model in order to handle the load.
Creating tables or databases per customer can lead you to a maintenance nightmare.
I have worked with multi-company databases where every table contains customer ids, and to access the data we develop views per customer (for reporting purposes).
Good luck,
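A sketch of that pattern: every table carries the customer id, and a per-customer view exposes only that customer's rows (table, view, and the customer id 42 are invented for illustration):

```sql
CREATE TABLE orders (
    order_id    INT NOT NULL,
    customer_id INT NOT NULL,          -- tenant discriminator on every table
    amount      DECIMAL(10,2) NOT NULL,
    PRIMARY KEY (customer_id, order_id)
);

-- A reporting view scoped to a single customer; the application or a
-- reporting user can be granted access to the view but not the base table.
CREATE VIEW orders_customer_42 AS
SELECT order_id, amount
FROM   orders
WHERE  customer_id = 42;
```

Leading the primary key with customer_id keeps each tenant's rows clustered together, so per-customer queries stay index seeks even as the shared table grows.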
You can do whatever you want.
If you've got the customer_id in each table, then you've got to write the whole application that way. Well, not exactly: it should be enough to add that column to only some tables; the rest can be handled with some simple joins.
If you've got one database per customer, there won't be any additional code in the application, so that could be easier.
If you take the first approach, there won't be a problem moving to many databases later, as you can keep the customer_id column in all those tables. Of course the column will then hold the same value in every table of a given database, but that's not a problem.
Personally I'd take the simple one-customer-one-database approach. It's easier to use more database servers for all the customers, and harder to accidentally show a customer data that belongs to some other customer.