Spanning data over databases

Greg K's picture

He has: 2,145 posts

Joined: Nov 2003

Ok, I didn't know exactly what to name this. It does deal with databases, but it has more to do with the programming, so I put it here.

In our system, each user account has 7 tables, and due to the way my hosting company works (this is about the only gripe I have with them), you cannot create a new database via scripts, the database has to exist to begin with created through their control panel.I thought with having our own dedicated server, it would allow us to control them better, but nope.

Anyhow, after a year up and running, we have over 200 user accounts (marketing is just starting to take off). Well I like using phpMyAdmin for working on the database. As you can imagine, that left hand column listing all the tables gets TOOOOO huge. I even wrote a script that when a user has been expired for over 3 months, archives the data and removes the table. Still, when I go in, it lists over 900 tables in the database. Makes a long scroll and slows down the loading of the databases "home page".

So I have come up with this idea, I know it gets used, but don't know what the term is for it to do a search to find best practices on how to implement it. So I will describe it here, looking for suggestions to improve, alternates to, or if nothing else just the name of what the technique is. It will split the tables up across multiple databases which will be set up in advance.

I figure that 25 users max per database would be manageable via phpMyAdmin, so I'm going with that figure for now.

Based on how many current users I have that still have their data stored in a table (old ones are removed after 3 months of inactivity) right now I would need 5 databases to hold the current users. I set up 8, and even though each database gets a separate username login (username_1, username_2, username_3, etc. as you add databases.), I can give them all the same password. I am also naming the databases the same with a 4 digit number at the end (0001-whatever).

I have a table that keeps track of the databases, it has their ID (the number at the end of the database name), the number at the end of the user name, and the number of users in the database. Here is what I was going to use to find a database to use when a new user signs up. MAX_DB_USERS will be a constant set to 25.

<?php
function findFreeDB()
{
 
$sql = \"SELECT * FROM `databases` \" .
         \"WHERE `dbCount` < \" . MAX_DB_USERS . \" \" .
         \"ORDER BY `dbCount DESC LIMIT 1\";

 
$dbResult = mysql_query($sql);
 
  if (mysql_num_rows(
$dbResult) < 1)
  {
   
$sql = \"SELECT * FROM `databases` \" .
           \"ORDER BY `dbCount` LIMIT 1\";

   
$dbResult = mysql_query($sql);
    errorOut(ERR_WARN, \"Max Users Per DB on all Databases\");
  }
 
  return mysql_fetch_assoc(
$dbResult); // three element array ( ['dbID'], ['dbUser'], ['dbCount'] )
}
?>

This will try to the most full database that isn't already full. If by chance that enough people sign up before additional databases can be created, it will then go and grab the database with the least amount of users. (It will not hurt to have a database with more than 25).

Now, here is the part I could use some recommendation on, the script to suggest when a new database should be made. (it will just e-mail me a warning and I will have to go create it by hand and add it to the list of databases) So far this is what I was figuring.

Take an average of how many users sign up per day since the system went live
Take an average of how many users sign up per day in the past 3 months (I added the average for past three months, so if the sign up rate continues to climb this will be more).
Take the value of the most ever signed up per day

Now on the two averages, and multiply by 1.5 (guessing here to allow room for extras to sign up)
Now, take the most signed up ever in one day, and multiply by 1.25 (allows 25% higher rate, guessing here....)
Take the highest of the three numbers, use this as this is how much room should be left across all the databases.
If there is not this much room left in all the databases, after running a script to remove outdated users, (take count of databases * 25 - sum(dbCount)), then it will e-mail me the values for me to determine if I should add another database (or more than one if we had a boom of sign ups).

I am hoping in light of not being able to automate creating databases, that this might work. Again, any suggestions on better way to handle this, more importantly, how to estimate when it is time to add a new database, would be greatly appreciated. If nothing else, links to sites with info on something similar would be great. This is all new territory for me.

Thank you for your time, and if you need to implement something like this, let me know and I will try to help you.

-Greg

Shirthead's picture

He has: 58 posts

Joined: Jun 2006

Sounds like a very complicated way to get around an inflexible host.

JeevesBond's picture

He has: 3,956 posts

Joined: Jun 2002

The closest thing I can think of is known as Database Federation and is used where very data-heavy databases are split physically (usually across seperate servers) so one server is not too heavily loaded. The best example of this is the speeding ticket system in the US: The database is split by the state the persons car is from, so when a police person creates a ticket the system checks the database for the current state first. As most of the cars driving around a particular state are going to have license plates (or number plates as we call them) from that state, it makes sense to look in that database first, then search through the other databases afterwards. The huge amount of data is neatly split across lots of different servers though, but is still accessible.

There is another name for what you're trying to do though, and that's madness! Laughing out loud

Database federation has it's place, and is not normally needed for 200 customers (unless they each have terabytes of data associated with them!) The first thing for you to do is have a serious look at the way your database is designed. It's rather disturbing that you need to create a new table for each customer! Something is _very_ wrong with a database if you have to change the schema every day (with some noteable exceptions, such as hosting).

Why do you need to create a new table for every customer? If you don't mind me asking. Smiling

a Padded Cell our articles site!

Greg K's picture

He has: 2,145 posts

Joined: Nov 2003

Well good news, I was getting frustrated, so I wrote to one of the founders (i think) of the hosting company. (I have all e-mails saved from when I first signed up with then in 1997, and back then they responded to support from their personal accounts)

Here it turns out that when I inquired about this to begin with and was told I couldn't have access to the mysql root account, either back then they didn't allow it, or i was just told wrong. In any case, they are changing my account to give me access to mySQL root.

YEAH!

As to why it needs to be individual, well for a practical reason, if for some reason the file gets corrupt, then it is only one customers data corrupt, instead of them all (still very unlikely), from a PR standpoint, it is amazing how reassuring it is to customers that their information is not "sitting next" to someone else (they enter very detailed company info like payroll, etc).

-Greg

Want to join the discussion? Create an account or log in if you already have one. Joining is fast, free and painless! We’ll even whisk you back here when you’ve finished.