
October 28, 2009
October 13, 2009
May 23, 2009
Download a Corrupt File? Fix it with BitTorrent
I have been working on a few virtual machine images on my remote server and I downloaded them using my fast link at work. When I came home, I found that the ZIP file I had compressed the images into would not extract, most likely due to my laptop locking up during the download. Rather than downloading a few gigabytes again over my home’s 150 KB/s link, I decided to be more efficient.
Since I have root on the remote system, I was able to install Debian’s packaged BitTorrent client and tracker there, while using the µTorrent client on my laptop. Although any public or private tracker would do (or even no tracker if you’re using µTorrent), I decided to set up the tracker on the remote system. To do that, we use the command:
bttrack --port 6969 --dfile /tmp/dfile &
The ‘&’ character at the end runs the tracker in the background so that you get your shell prompt back. The next step is making the .torrent file. This process consists of the application splitting your file(s) into fixed-size chunks and computing a hash of each chunk, which is used for integrity checks after you download each chunk. Run the btmakemetafile command on the remote system, which has the valid file, like so:
btmakemetafile filename http://127.0.0.1:6969/announce
where “filename” is the name of the file and 127.0.0.1 should be replaced with the IP address or hostname of your tracker as reachable from your client. Now that the .torrent file has been made, we can seed it on the remote system with the following command:
btdownloadheadless filename.torrent
This will do its own integrity check before it begins seeding.
Finally, download the .torrent file on your client system and open it with the BitTorrent client. Save the file from the BitTorrent client to the same place where it was originally downloaded. Make sure no “skip hash check” options are checked. When this is done correctly, the BitTorrent client will do the integrity check of your downloaded file(s) and mark as “unfinished” those chunks that were invalid. By this time, you should be connected to the seeder on the remote system and downloading the chunks that were invalid. My .torrent’s chunk size was 1 MB and there was only one chunk I had to download. Compared to downloading a few gigabytes, I saved quite a bit of time by downloading only one megabyte.
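The integrity check described above can be sketched in a few lines of Python. This is only an illustration, assuming fixed-size pieces hashed with SHA-1 as BitTorrent does; the function names piece_hashes and corrupt_pieces and the sample data are made up for this example and are not part of any BitTorrent client.

```python
import hashlib
from typing import List


def piece_hashes(data: bytes, piece_size: int) -> List[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[offset:offset + piece_size]).digest()
        for offset in range(0, len(data), piece_size)
    ]


def corrupt_pieces(local: bytes, expected: List[bytes], piece_size: int) -> List[int]:
    """Return the indices of pieces that are missing or whose hash does not match."""
    actual = piece_hashes(local, piece_size)
    return [
        index
        for index, digest in enumerate(expected)
        if index >= len(actual) or actual[index] != digest
    ]


# Hypothetical data: a 4096-byte "good" file and a copy with one flipped byte.
good = bytes(range(256)) * 16
damaged = bytearray(good)
damaged[2500] ^= 0xFF  # byte 2500 falls inside piece 2 when pieces are 1024 bytes

expected = piece_hashes(good, 1024)   # what the .torrent file would carry
print(corrupt_pieces(bytes(damaged), expected, 1024))
```

Only the pieces listed by corrupt_pieces would need to be fetched again, which is why a single bad chunk costs one chunk of download rather than the whole file.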
If you opted not to use a valid tracker URL when making your .torrent, now is your chance to go into µTorrent’s (or another client’s) options, click on the “Peers” tab, right click on an empty space and then choose “Add Peers”. If you don’t know which port the BitTorrent client is using on the remote system, run the following on the remote system:
ps aux |grep btdownloadheadless
netstat -nap |grep PID_of_BitTorrent_client
The first command lists all of the running processes on the system and pipes the output to grep, which only shows the lines that have “btdownloadheadless” in them. The second column of that output is the PID. The second command runs netstat, which prints the PID and port information for each socket, and pipes it to grep, which only returns the lines that contain the PID of the BitTorrent client. This may return some false positives, which is why you should check the very last column, which is in the form “PID/process name” (this also means you can skip the ‘ps aux’ step and grep for the process name instead of the PID in the netstat part). To find the port in the output of netstat, look at the 4th column, which has the IP/hostname followed by the port, separated by a ‘:’ character. Ignore any leading ‘::’, copy the value of that column through the end of the port, and paste it into µTorrent. This connects directly to the seed, so you can get the chunks from the remote system without a tracker.
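The column-picking described above can also be scripted. The sketch below is an illustration only: it assumes the usual Linux `netstat -nap` column layout, and the sample output, PIDs and addresses are invented for the example.

```python
from typing import List


def listening_ports(netstat_output: str, pid: int) -> List[int]:
    """Return the TCP ports on which the given PID is listening."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: Proto Recv-Q Send-Q Local Foreign State PID/Program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "LISTEN":
            continue
        owner_pid = fields[6].partition("/")[0]
        if owner_pid != str(pid):
            continue
        # 4th column is address:port; the port follows the last ':'
        ports.add(int(fields[3].rsplit(":", 1)[1]))
    return sorted(ports)


# Hypothetical netstat output; real output will differ.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      812/sshd
tcp6       0      0 :::6881                 :::*                    LISTEN      1234/python
tcp        0      0 192.0.2.10:6881         198.51.100.7:51413      ESTABLISHED 1234/python
"""

print(listening_ports(SAMPLE, 1234))
```

Matching on the PID column rather than on a substring of the whole line avoids the false positives mentioned above.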
There are other worthwhile uses of BitTorrent, such as the ROCKS Cluster distribution, which delivers install files to compute nodes through BitTorrent. Have your own examples? Please share them in the comment section below.
March 3, 2009
Automatic Passwordless Backup of MySQL DBs to Remote Machine
Assumptions: Bash, SSH, SCP, MySQL, root access, crontab
The easiest way to back up your MySQL databases to a remote server is with mysqldump and SCP along with SSH keys. It takes one command to get an SQL dump of all databases on the server (mysqldump --all-databases), but we would like to split that up into one file per database. The way we do this is by first getting a list of all the databases and then iterating over the list, specifying which database to back up and to which file.
The following code is written in Bash. The only Bash-specific construct it uses is the [[ ]] test, so it would be easy to port to other Bourne-style shells.
#!/bin/bash
# Dump every MySQL database to its own gzipped file, then copy today's dumps to a remote host.
cd /backup || exit 1
# list the database names, one per line
databases=$(/usr/bin/mysql -Bse 'show databases')
# full weekday name, e.g. Monday
day=$(date +%A)
for i in $databases; do
    if [[ "${i}" == "information_schema" ]] ; then
        :
    else
        /usr/bin/mysqldump "$i" | /bin/gzip -9 > "/backup/mysql/${day}-${i}.sql.gz"
    fi
done
/usr/bin/scp -r /backup/mysql/${day}* username@machinename.example.com:/home/username/backups/mysql/.
I keep my backups in the /backup directory, so the script changes to that directory first. Looking further into the code, I use full paths for everything, so this is unnecessary, but in case I want to change that behavior later, I know where the script is running from.
if [[ "${i}" == "information_schema" ]] ; then
:
This means that if the database name (${i}) is information_schema, we skip it; the ‘:’ is a shell built-in that does nothing. The information_schema database holds metadata about the other databases, which does not need to be backed up.
/usr/bin/mysqldump $i | /bin/gzip -9 > /backup/mysql/${day}-${i}.sql.gz – to clear up what’s going on here, we’re doing a mysqldump of the database whose name is stored in $i, then piping it to gzip for compression, followed by writing the result to /backup/mysql/DayOfWeek-DatabaseName.sql.gz, where DayOfWeek is Sunday, Monday, Tuesday, etc.
/usr/bin/scp -r /backup/mysql/${day}* username@machinename.example.com:/home/username/backups/mysql/. – here, we are sending a copy of today’s MySQL backups to machinename.example.com with “username” as our login account. The -r option tells scp to recursively copy everything in a directory tree; it’s useless here but I added it to demonstrate the feature. If I specified a directory to send to the remote machine, -r would tell scp to copy everything contained in the directory, including sub-directories. In this example, we only wish to keep backups for a week. After a week, they get overwritten because the file names we picked repeat every seven days.
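To make the weekly overwrite concrete, here is a small Python sketch of the naming scheme. It only mirrors the logic of the Bash script above; the helper names and the sample database names are hypothetical, and the weekday names assume what `date +%A` prints in an English locale.

```python
from datetime import date

# Weekday names as `date +%A` prints them in an English locale (assumption).
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")

SKIPPED = {"information_schema"}  # metadata only, not backed up


def databases_to_dump(names):
    """Drop the databases the script skips, keeping the original order."""
    return [name for name in names if name not in SKIPPED]


def backup_filename(database, day):
    """Build DayOfWeek-DatabaseName.sql.gz for the given date."""
    return f"{WEEKDAYS[day.weekday()]}-{database}.sql.gz"


# Hypothetical database list.
names = ["information_schema", "moodle", "wordpress"]
for name in databases_to_dump(names):
    print(backup_filename(name, date(2009, 3, 3)))
```

Two dates exactly seven days apart produce the same file name, so each database never has more than seven dumps on disk.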
If SSH keys are not set up, there are a few simple steps to follow to get them working:
1. ssh-keygen -t dsa
2. Hit Enter at every prompt; leave the passphrase empty, since the key will be used for password-less login
When finished, it will present a message such as:
Your identification has been saved in /home/elvedin/.ssh/id_dsa.
Your public key has been saved in /home/elvedin/.ssh/id_dsa.pub.
3. Copy the id_dsa.pub file to the remote host with the following command (note that this overwrites any existing authorized_keys2 file there):
scp ~/.ssh/id_dsa.pub username@machinename.example.com:~/.ssh/authorized_keys2
4. Done. You should be able to log in (ssh username@machinename.example.com) without being asked for a password.
To get this process to work automatically, we add it to the root user’s crontab.
crontab -e will open an editor in which you can add the following:
0 0 * * * /backup/mysqlbackup.sh
The leftmost field specifies the minute of the hour at which to run and the next field specifies the hour of the day. In that example, our backup would run at midnight. If we were to change it to
0,12 0,12 * * * /backup/mysqlbackup.sh
the job would run at 12:00AM, 12:12AM, 12:00PM, and 12:12PM.
The next three fields specify ‘day of month’, ‘month’, and ‘day of week’. A ‘*’ means that the job will run for all values of that field.
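The way the minute and hour fields combine can be checked with a short Python sketch. It is a simplified illustration, not a cron parser: it assumes the fields contain only ‘*’ or comma-separated numbers, and the function names are made up for this example.

```python
from itertools import product


def parse_field(field, low, high):
    """Expand '*' or a comma-separated list into a sorted list of integers."""
    if field == "*":
        return list(range(low, high + 1))
    return sorted(int(value) for value in field.split(","))


def daily_run_times(minute_field, hour_field):
    """Return every HH:MM at which the job fires on a day it is scheduled."""
    minutes = parse_field(minute_field, 0, 59)
    hours = parse_field(hour_field, 0, 23)
    # cron fires at every combination of a matching hour and a matching minute
    return [f"{hour:02d}:{minute:02d}" for hour, minute in product(hours, minutes)]


print(daily_run_times("0", "0"))        # the midnight-only entry
print(daily_run_times("0,12", "0,12"))  # the four-times-a-day entry
```

The second call lists four times because each of the two hours is paired with each of the two minutes.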
February 17, 2009
Ubuntu Laptop Install
I recently purchased a laptop, so I thought I would install Linux on it, as it would be handy to have and carrying around LiveCDs is a hassle. As Vista comes with a file system resize feature, it was easy to free up enough space for a Fedora or Ubuntu install. The resize went through quickly and I started the Fedora installer, but I had to mess around with resolution and hardware detection settings to keep it from freezing. I chose instead to try the new version of Ubuntu, 8.10 (Intrepid Ibex). The installer worked out of the box with no hardware issues until I reached the screen for its disk partitioning software, where I had only two options: ‘Guided (use whole disk)’ or ‘Manual’. I chose the ‘Manual’ option as I had already prepared the free space in Vista.
This option brings up a screen that presents your partitions and information about them, and also the free space on the hard drives. The free space I had prepared came up as “unusable” and I couldn’t figure out why. I thought it was an issue with the Windows file system resizing feature so I tried an alternative with no success. Then I had a brainwave…
My laptop came with two partitions, one labeled ‘OS’ for the C: drive and another labeled ‘DATA’ for the D: drive. There is also a hidden partition that OEMs like to use for recovery and other features. Through my school’s MSDNAA program, I am able to try out Microsoft software like Visual Studio and their operating systems. I took the opportunity to try Windows Server 2008 Datacenter, so now the computer is up to four partitions. All of them are primary partitions, and a traditional (MBR) partition table allows at most four primary partitions, which means that Ubuntu cannot be installed until I free up a primary partition.
This should be documented in the installer, as other people have encountered the same issue without knowing the cause; in the meantime, I decided to document it here.
January 2, 2009
Moodle Course Quotas
This implementation of course quotas was tested only with Moodle 1.8.5+ and 1.9.3+; it may work with other versions, but that is untested. The goal of this quota system is to prevent instructors from going over the quota; students, however, can still upload files when the course is over the quota, as their (homework) submissions are important.
The quota works on two levels: site wide and course level. The site wide quota applies to all courses that do not have a course level quota, while a course level quota applies to a specific course and overrides the site wide quota. The quota enforcement only works on total size and not on the number of files, though it could be expanded to handle that case.
To begin installation, we must first manually create a table in the Moodle database. We run the following SQL query to do that assuming the Moodle database prefix is ‘mdl_’ -
create table mdl_sitequota(id int not null auto_increment, primary key(id), courseid int not null, quota int not null, unit varchar(1));
The ‘id’ primary key is required by Moodle’s update_record() function, which is used for portability. The fields ‘courseid’ (Moodle course ID) and ‘quota’ (numeric limit) are integers, and ‘unit’ is a one-character field for the storage unit. File sizes are typically reported in bytes by various utilities, so storing quotas in bytes could lead to very large integers; to avoid that, and for portability, the quota is stored as a small number together with a unit: ‘B’ for bytes, ‘K’ for kilobytes, ‘M’ for megabytes, or ‘G’ for gigabytes.
We should set a default, site wide quota of 100 MB by running the following SQL query -
insert into mdl_sitequota (courseid, quota, unit) values (0, 100, 'M');
A courseid of 0 denotes the site wide default; any courses that do not have an entry in this table are limited by this value.
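The two-level lookup and the unit handling can be summarised with a small Python sketch. It is only a model of the table described above, not the Moodle code itself: the dictionary stands in for the mdl_sitequota table, and the function names and the course ID 42 are hypothetical.

```python
# Bytes per unit, using the 1024-based multiples described above.
UNIT_BYTES = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def effective_quota(quotas, course_id):
    """Return (quota, unit) for a course, falling back to the site wide row (courseid 0)."""
    return quotas.get(course_id, quotas.get(0))


def over_quota(size_bytes, quota, unit):
    """Compare a size in bytes against a quota expressed in the given unit."""
    return size_bytes / UNIT_BYTES[unit] > quota


# Stand-in for the table: courseid -> (quota, unit)
quotas = {0: (100, "M"), 42: (2, "G")}

print(effective_quota(quotas, 42))  # course level quota
print(effective_quota(quotas, 7))   # no entry, so the site wide default applies
print(over_quota(150 * 1024 ** 2, *effective_quota(quotas, 7)))
```

Storing a small number plus a unit keeps the stored integers small, at the cost of one conversion before each comparison.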
Next are the files needed to make this work. Go into the Moodle directory, then admin/report directories. Extract the following files in there – ZIP / TAR.GZ
The files interface, which is located at /files/index.php in the Moodle directory, must be changed. In ‘index.php’, look for the following lines of code -
case "upload":
html_header($course, $wdir);
require_once($CFG->dirroot.'/lib/uploadlib.php');
Below that, add
require_once($CFG->libdir.'/umnlib.php');
$tablename = "sitequota";
// use the course level quota if one exists, otherwise fall back to the site wide default (courseid 0)
$coursequota = getquotadetails($COURSE->id);
if ($coursequota == NULL) {
    $coursequota = getquotadetails(0);
}
$coursesize = get_directory_size("{$CFG->dataroot}/$COURSE->id"); // in bytes - may overflow for large directories
// convert the size of the course directory into the same unit the quota is defined in
switch ($coursequota->unit) {
    case "B":
        break;
    case "K":
        $coursesize /= 1024;
        break;
    case "M":
        $coursesize /= 1048576;
        break;
    case "G":
        $coursesize /= 1073741824;
        break;
    case "T":
        $coursesize /= 1099511627776;
        break;
    case "P":
        $coursesize /= 1125899906842624;
        break;
    case "E":
        $coursesize /= 1152921504606846976;
        break;
}
// only instructors are blocked; students can still upload when the course is over the quota
if ($coursesize > (int) $coursequota->quota && isteacher()) {
    error("Unable to upload files: Total site size is $coursesize $coursequota->unit and quota is $coursequota->quota $coursequota->unit", "{$CFG->wwwroot}/files/index.php?id=$COURSE->id");
}
print_simple_box_start();
echo "Using $coursesize " . readablestorageunit($coursequota->unit) . " out of $coursequota->quota " . readablestorageunit($coursequota->unit) . " allowed";
print_simple_box_end();
And we’re almost done. We add a library file which stores some of the functions used by the site quotas and the language file as well. For the library, go into Moodle’s /lib directory and create the file ‘umnlib.php’. In it, insert
<?php
// Return a human-readable name for a one-character storage unit, or "" if it is unknown.
function readablestorageunit($unitchar=NULL) {
    if ($unitchar == NULL) {
        return "";
    }
    switch ($unitchar) {
        case "B":
            return "bytes";
        case "K":
            return "KB (Kilobytes)";
        case "M":
            return "MB (Megabytes)";
        case "G":
            return "GB (Gigabytes)";
        case "T":
            return "TB (Terabytes)";
        case "P":
            return "PB (Petabytes)";
        case "E":
            return "EB (Exabytes)";
    }
    return "";
}
// Return the quota record for a course, or NULL if there is none.
// if no argument given, assume $courseid=-1 as $courseid = 0 means sitewide default
function getquotadetails($courseid=-1) {
    global $CFG, $tablename;
    $courseid = (int) $courseid;
    if ($courseid < 0) {
        return NULL;
    }
    $details = get_record($tablename, 'courseid', $courseid);
    if (empty($details)) {
        return NULL;
    }
    return $details;
}
The first function returns the unit of storage in an easily readable form, while the second is a convenience function that looks up the quota record for a course. Our last installation step is adding the language file. Go into Moodle's /lang/en_utf8/ directory and create the file 'umn.php'. In it, insert
<?php
$string['sitequota'] = 'Site Quota';
$string['sitequotadefaultsettings'] = 'Default Settings';
$string['sitequotaquota'] = 'Quota: ';
$string['sitequotaunit'] = 'Storage Unit: ';
$string['sitequotacourseid'] = 'Course ID: ';
$string['nodefaultquota'] = 'Default quota does not exist and manual addition of 100 GB quota failed - check that site quota table exists';
$string['newdefaultunitfail'] = 'New default unit of storage was not added - database error';
$string['updatedefaultunitfail'] = 'Unable to update default unit of storage - database error';
$string['invalidstorageunit'] = 'The unit of storage was not valid';
$string['defaultquotazero'] = 'Default quota size cannot be less than 0';
$string['newdefaultquotafail'] = 'New default quota was not added - database error';
$string['updatedefaultquotafail'] = 'Unable to update default quota - database error';
$string['sitequotanocourseid'] = 'Quota update failed - must include course ID';
$string['newunitfail'] = 'New unit of storage for course was not added - database error';
$string['updateunitfail'] = 'Unit of storage was not updated for course - database error';
$string['newquotafail'] = 'New quota for course was not added - database error';
$string['updatequotafail'] = 'Quota was not updated for course - database error';
$string['backupdatasize'] = 'Backupdata Size';
$string['totalsitesize'] = 'Total site size';
$string['sitequotacourseid'] = 'Moodle Course ID';
$string['noinstorsdne'] = 'No instructors or site does not exist';
$string['sitequotainstructors'] = 'Instructor';
?>
These can be changed from the languages interface for localization. Finally, we are able to use the quotas interface. Go to http://www.yourmoodlesite.com/admin. After logging in as an administrator, go to Reports and then Site Quotas. If the link appears as "[[sitequota]]", add the string below to Moodle's lang/en_utf8/admin.php
$string['sitequota'] = 'Site Quota';
If no errors are thrown, you are able to use site quotas. Note that when adding a new quota for a course, if either the quota or the storage unit is left out, it is taken from the site level quota. When updating a course level quota, a field that is left out remains unchanged.
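The add and update rules differ only in where a missing field comes from, which the following Python sketch illustrates. It models the behaviour described above rather than the actual report code; the function name fill_missing and the sample values are hypothetical.

```python
def fill_missing(base, quota=None, unit=None):
    """Return (quota, unit), taking any field left out from the base pair."""
    base_quota, base_unit = base
    return (
        base_quota if quota is None else quota,
        base_unit if unit is None else unit,
    )


site_default = (100, "M")

# Adding a new course quota: missing fields come from the site level quota.
new_quota = fill_missing(site_default, quota=5)
print(new_quota)

# Updating an existing course quota: missing fields keep their current value.
updated = fill_missing(new_quota, unit="G")
print(updated)
```

In both cases a field supplied by the administrator always wins; only the fallback source changes.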
Please note that this has not been thoroughly tested so I would like feedback about bugs.
Sorry for the formatting; I got tired of wrestling with WordPress over it. Copy the language file that goes into /lang/en_utf8/ from here, the /files/index.php modification from here, and the umnlib.php that goes into /lib from here.
June 6, 2008
Note on Software License
The Moodle scripts are free for personal use. Including them in any sort of package without my permission is forbidden (subject to change). The up-to-date license is at http://blog.ods.org/mood/license
Moodle: Finding Inactive Sites, Part 2
As in the previous article on finding inactive sites, the script sends a CSV file to the browser with the columns DaysInactive, URL to the course site, Instructor, and the instructor's e-mail address. For courses with multiple instructors, each instructor gets a row in the CSV instead of putting all instructors and their e-mail addresses in one row.
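The one-row-per-instructor layout can be illustrated with a short Python sketch. This is not the script itself: the course data, URLs and e-mail addresses are invented, and the header name "Email" is an assumption for the last column.

```python
import csv
import io


def instructor_rows(courses):
    """Flatten courses into one row per instructor."""
    rows = []
    for course in courses:
        for name, email in course["instructors"]:
            rows.append([course["days_inactive"], course["url"], name, email])
    return rows


def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["DaysInactive", "URL", "Instructor", "Email"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical input: one course with two instructors, one with a single instructor.
courses = [
    {"days_inactive": 120, "url": "http://moodle.example.com/course/view.php?id=5",
     "instructors": [("A. Smith", "asmith@example.com"), ("B. Jones", "bjones@example.com")]},
    {"days_inactive": 45, "url": "http://moodle.example.com/course/view.php?id=9",
     "instructors": [("C. Lee", "clee@example.com")]},
]

rows = instructor_rows(courses)
print(to_csv(rows))
```

Repeating the course columns on each row keeps every line self-contained, which makes the file easy to sort or filter by instructor in a spreadsheet.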
Moodle: Finding Inactive Sites
This script outputs inactive courses based on the mdl_log table in the database, so logging must be turned on.
Moodle: How to Find Course Size in Unix
The following script outputs a table with file sizes using du (YOU MUST HAVE IT) for each course in moodledata. As WordPress cannot correctly display HTML as text, I’ve linked to it. Drop it in your main Moodle directory, the same one that has config.php.
