InvXim

May 23, 2009

Download a Corrupt File? Fix it with BitTorrent

Filed under: The Internets — Elvedin @ 4:38 pm

I have been working on a few virtual machine images on my remote server, and I downloaded them using my fast link at work. When I came home, I found that the ZIP file I had compressed the images into would not extract, most likely because my laptop locked up during the download. Rather than downloading a few gigabytes again over my home’s 150 KB/s link, I decided to be more efficient.

Since I have root on the remote system, I was able to install Debian’s BitTorrent client and tracker there, while using the µTorrent client on my laptop. Any public or private tracker would do (or even no tracker at all if you’re using µTorrent), but I decided to set up the tracker on the remote system. To do that, we use the command

bttrack --port 6969 --dfile /tmp/dfile &

The ‘&’ character at the end tells the shell to run the tracker in the background, so you get your prompt back. The next step is making the .torrent file. In this process the application splits your file(s) into fixed-size chunks and computes a hash of each chunk, so that every chunk can be verified after it is downloaded. You use the btmakemetafile command on the remote system, which has the valid file, like so:

btmakemetafile filename http://127.0.0.1:6969/announce

where “filename” is the name of the file and 127.0.0.1 stands for the IP or hostname of your tracker; use an address that your client machine can reach. Now that the .torrent file has been made, we can seed it on the remote system with the following command

btdownloadheadless filename.torrent

This will do its own integrity check before it begins seeding.

Finally, download the .torrent file to your client system and open it with the BitTorrent client. When the client asks where to save, choose the same place where the corrupt file was originally downloaded. Make sure no “skip hash check” options are checked. When this is done correctly, the BitTorrent client will do the integrity check of your downloaded file(s) and mark as “unfinished” those chunks that were invalid. By this time, you should be connected to the seeder on the remote system and downloading the chunks that were invalid. My .torrent’s chunk size was 1 MB and there was only one chunk I had to download. Compared to downloading a few gigabytes, I saved quite a bit of time by only downloading one megabyte.
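The per-chunk integrity check can be sketched in shell. This is only an illustration of the idea, not how any BitTorrent client is implemented: a real client compares your file against the SHA-1 hashes stored in the .torrent, while this sketch assumes both the good and the corrupt copy are on the same machine. It relies on split and sha1sum from GNU coreutils, and the function names piece_hashes and bad_pieces are made up for this example.

```shell
# piece_hashes FILE PIECE_SIZE
# Print one SHA-1 hash per fixed-size piece of FILE, in order.
piece_hashes() (
    tmpdir=$(mktemp -d) || exit 1
    # -a 6 gives six-letter suffixes so the pieces of large files sort correctly.
    split -b "$2" -a 6 "$1" "$tmpdir/piece."
    for piece in "$tmpdir"/piece.*; do
        sha1sum < "$piece" | cut -d' ' -f1
    done
    rm -rf "$tmpdir"
)

# bad_pieces GOOD_FILE LOCAL_FILE PIECE_SIZE
# Print the zero-based index of every piece whose hash differs.
bad_pieces() (
    good=$(mktemp) && local_hashes=$(mktemp) || exit 1
    piece_hashes "$1" "$3" > "$good"
    piece_hashes "$2" "$3" > "$local_hashes"
    # paste puts the two hash lists side by side; awk compares each row.
    paste "$good" "$local_hashes" | awk '$1 != $2 { print NR - 1 }'
    rm -f "$good" "$local_hashes"
)
```

Only the pieces that bad_pieces would list need to be fetched again, which is why a single 1 MB chunk was enough in my case.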

If you opted not to use a valid tracker URL when making your .torrent, now is your chance to go into µTorrent’s (or another client’s) options, click on the “Peers” tab, right-click on an empty space and then choose “Add Peers”. If you don’t know which port the BitTorrent client is using on the remote system, run the following there:


ps aux |grep btdownloadheadless
netstat -nap |grep PID_of_BitTorrent_client

The first command lists all of the running processes on the system and pipes the output to grep, which only shows the lines that contain “btdownloadheadless”. The second column of that output is the PID. The second command runs netstat, which prints PID and port information for every socket, and pipes it to grep, which only returns the lines that contain the PID of the BitTorrent client. This may return some false positives, which is why you should check the very last column, which has the form “PID/process name” (this also means you can skip the ‘ps aux’ step and grep for the process name instead of the PID). To find the port in the output of netstat, look at the 4th column, which holds the IP/hostname followed by the port, separated by a ‘:’ character. Ignore any leading ‘::’, copy the value of that column through the end of the port, and paste it into µTorrent; if the address part is 0.0.0.0 or empty, substitute the remote system’s IP or hostname. This will directly connect to a seed, so now you can get the chunks from the remote system without a tracker.
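The column picking described above can be sketched with awk. This is a minimal sketch that assumes the usual Linux layout of netstat -nap output for TCP sockets (local address in column 4, state in column 6, “PID/process name” in column 7); the function name ports_for_process is made up for this example.

```shell
# ports_for_process TEXT
# Read `netstat -nap` output on stdin and print the local port of every
# listening TCP socket whose "PID/process name" column contains TEXT.
# TEXT can be a PID or (part of) a process name.
ports_for_process() {
    awk -v text="$1" '
        $6 == "LISTEN" && index($7, text) > 0 {
            n = split($4, parts, ":")   # the port is whatever follows the last ":"
            print parts[n]
        }'
}

# Typical use on the remote system (root is needed to see other users' PIDs):
#   netstat -nap | ports_for_process PID_of_BitTorrent_client
```

Because the match is a plain substring test, a short PID can still match a longer one, so the same caution about false positives applies.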

There are other worthwhile uses of BitTorrent, including the ROCKS Cluster distribution of install files for compute nodes through BitTorrent. Have your own examples? Please share them in the comment section below.

March 3, 2009

Automatic Passwordless Backup of MySQL DBs to Remote Machine

Filed under: The Internets — Elvedin @ 2:33 am

Assumptions: Bash, SSH, SCP, MySQL, root access, crontab

The easiest way to back up your MySQL databases to a remote server is with mysqldump and SCP along with SSH keys. It takes one command to get an SQL dump of all databases on the server (mysqldump --all-databases), but we would like to split that up into one file per database. We do this by first getting a list of all the databases and then iterating over the list, specifying which database to back up and to which file.

The following code is written in Bash, but apart from the [[ ... ]] test it doesn’t really use anything unique to Bash, so it would be easy to port to other shells.


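#!/bin/bash
# Dump every MySQL database to its own gzipped file, then copy today's dumps to the remote machine.
cd /backup
# -B: batch (tab-separated) output, -s: silent, -e: execute the given statement
databases=`/usr/bin/mysql -Bse 'show databases'`
# Full weekday name, e.g. Monday
day=`date +%A`
for i in $databases; do
    if [[ "${i}" == "information_schema" ]] ; then
        :
    else
        /usr/bin/mysqldump "$i" | /bin/gzip -9 > "/backup/mysql/${day}-${i}.sql.gz"
    fi
done

# Capture scp's output so that cron does not mail it
output=`/usr/bin/scp -r /backup/mysql/${day}* username@machinename.example.com:/home/username/backups/mysql/.`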

I keep my backups in the /backup directory, so the script changes into it first. Looking further into the code, I use full paths for everything, so this is unnecessary, but in case I want to change that behavior later, I know where the script is running from.

if [[ "${i}" == "information_schema" ]] ; then
:
This means that if one of the database names (${i}) is information_schema, we skip it. The information_schema database stores metadata about the other databases, which is not necessary for backups.

`/usr/bin/mysqldump $i | /bin/gzip -9 > /backup/mysql/${day}-${i}.sql.gz`: to clear up what’s going on here, we run mysqldump on the database named in $i, pipe the output to gzip for compression, and write the result to /backup/mysql/DayOfWeek-DatabaseName.sql.gz, where DayOfWeek is Sunday, Monday, Tuesday, etc.

`/usr/bin/scp -r /backup/mysql/${day}* username@machinename.example.com:/home/username/backups/mysql/.`: here, we send a copy of today’s MySQL backups to machinename.example.com with “username” as our login account. The -r option tells scp to recursively copy everything in a directory tree; it’s useless here, but I added it to demonstrate the feature. If I specified a directory to send to the remote machine, -r would tell scp to copy everything contained in the directory, including sub-directories. In this example, we only wish to keep backups for a week. After a week, they get overwritten because the file names are based on the day of the week.

If SSH keys are not set up, there are a few simple steps to follow to get them working:
1. ssh-keygen -t dsa
2. Hit enter for everything; don’t specify a passphrase, as you want to use the key for password-less login
When finished, it will present a message such as:
Your identification has been saved in /home/elvedin/.ssh/id_dsa.
Your public key has been saved in /home/elvedin/.ssh/id_dsa.pub.

3. Copy the id_dsa.pub file to the remote host with the following command (note that this overwrites any authorized_keys2 file already on the remote host):

scp ~/.ssh/id_dsa.pub username@machinename.example.com:~/.ssh/authorized_keys2

4. Done. You should be able to log in (ssh username@machinename.example.com) without being asked for a password.
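If the remote account already has keys installed, copying over authorized_keys2 would wipe them out. A gentler alternative is to append the key instead. The sketch below is one way to do that under stated assumptions: the helper name append_key is made up for this example, and it works on local file paths so it can be tried out safely before touching a real account.

```shell
# append_key PUBKEY_FILE AUTHORIZED_KEYS_FILE
# Add the public key to the authorized-keys file unless that exact line is
# already present, creating the file with owner-only permissions if needed.
append_key() (
    umask 077
    mkdir -p "$(dirname "$2")"
    touch "$2"
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF -- "$(cat "$1")" "$2" || cat "$1" >> "$2"
)

# The same idea applied to the remote host in one step (asks for the
# password one last time):
#   ssh username@machinename.example.com \
#       'umask 077; mkdir -p ~/.ssh; cat >> ~/.ssh/authorized_keys2' < ~/.ssh/id_dsa.pub
```

Running append_key twice with the same key leaves the file unchanged the second time, so it is safe to repeat.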

To get this process to work automatically, we add it to the root user’s crontab.
crontab -e will open up an editor and will allow you to add the following;

0 0 * * * /backup/mysqlbackup.sh

The leftmost field specifies the minute of the hour to run at and the next field specifies the hour of the day. In that example, our backup would run at midnight. If we were to change it to

0,12 0,12 * * * /backup/mysqlbackup.sh

the job would run at 12:00AM, 12:12AM, 12:00PM, and 12:12PM.

The next three fields specify ‘day of month’, ‘month’, and ‘day of week’. A ‘*’ means that it will run for all values of that field.
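To see why two comma-separated lists produce four run times, the combinations can be expanded by hand. The helper below is a hypothetical illustration, not part of cron; it only understands plain numbers separated by commas, not ranges, steps or ‘*’.

```shell
# expand_cron_times MINUTES HOURS
# Print every HH:MM combination (24-hour clock) for comma-separated
# minute and hour lists, the way cron pairs them up.
expand_cron_times() (
    for h in $(printf '%s' "$2" | tr ',' ' '); do
        for m in $(printf '%s' "$1" | tr ',' ' '); do
            printf '%02d:%02d\n' "$h" "$m"
        done
    done
)

expand_cron_times 0,12 0,12
# prints:
# 00:00
# 00:12
# 12:00
# 12:12
```

Every minute value is paired with every hour value, so listing two of each gives four runs per day.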

February 17, 2009

Ubuntu Laptop Install

Filed under: The Internets — Elvedin @ 3:43 pm

I recently purchased a laptop and thought I would install Linux on it, as it would be handy to have and carrying around LiveCDs is a hassle. As Vista comes with a file system resize feature, it was easy to free up enough space for a Fedora or Ubuntu install. The resize went through quickly and I started the Fedora installer, but I had to mess around with resolution and hardware detection settings to keep it from freezing. I chose instead to try the new version of Ubuntu, 8.10 (Intrepid Ibex). The installer worked out of the box with no hardware issues until I reached the disk partitioning screen, which offered only two options: ‘Guided (use whole disk)’ or ‘Manual’. I chose the ‘Manual’ option as I had already prepared the free space in Vista.

This option brings up a screen that presents your partitions and information about them, and also the free space on the hard drives. The free space I had prepared came up as “unusable” and I couldn’t figure out why. I thought it was an issue with the Windows file system resizing feature so I tried an alternative with no success. Then I had a brainwave…

My laptop came with two partitions, one labeled ‘OS’ for the C: drive and another labeled ‘DATA’ for the D: drive. There is also a hidden partition that OEMs like to use for recovery and other features. Through my school’s MSDNAA program, I am able to try out Microsoft software like Visual Studio and their operating systems. I took the opportunity to try Windows Server 2008 Datacenter, so now the computer is up to four partitions. All of them are primary partitions, and an MBR partition table can hold at most four primary partitions, which means that Ubuntu cannot be installed until I free up one of them.

This should be documented in the installer, as other people have run into the same issue without knowing the cause; until it is, I am documenting it here.

January 2, 2009

Moodle Course Quotas

Filed under: The Internets — Elvedin @ 10:59 pm

This implementation of course quotas has been tested only on Moodle 1.8.5+ and 1.9.3+; it may work on other versions, but that is untested. The goal of this quota system is to prevent instructors from going over the quota. Students, however, are still able to upload files when the course is over the quota, as their (homework) submissions are important.

The quota works on two levels: site-wide and course level. The site-wide quota applies to all courses that do not have a course-level quota, while a course-level quota applies to a specific course and overrides the site-wide quota. Enforcement only considers total size and not the number of files, though it could be extended to handle that case.

To begin installation, we must first manually create a table in the Moodle database. Assuming the Moodle database prefix is ‘mdl_’, we run the following SQL query to do that:

create table mdl_sitequota(id int not null auto_increment, primary key(id), courseid int not null, quota int not null, unit varchar(1));

The ‘id’ primary key is required by Moodle’s update_record() function, which is used for portability. The fields ‘courseid’ (Moodle course ID) and ‘quota’ (numeric limit) are integers, and ‘unit’ is a one-character field for the storage unit. File sizes are typically reported in bytes by various utilities, which can lead to very large integers, so to avoid overflow and portability issues the quota is stored together with a unit: ‘B’ for bytes, ‘K’ for kilobytes, ‘M’ for megabytes, and ‘G’ for gigabytes.

We should set a default, site-wide quota of 100 MB by running the following SQL query:

insert into mdl_sitequota (courseid, quota, unit) values (0, 100, 'M');

A courseid of 0 denotes the site-wide default; any courses that do not have an entry in this table are limited by this value.

Next are the files needed to make this work. Go into the Moodle directory, then into admin/report, and extract the following files there: ZIP / TAR.GZ

The files interface, located at /files/index.php in the Moodle directory, must be changed. In ‘index.php’, look for the following lines of code:
case "upload":
html_header($course, $wdir);
require_once($CFG->dirroot.'/lib/uploadlib.php');

Below that, add

require_once($CFG->libdir.'/umnlib.php');

$tablename = "sitequota";
$coursequota = getquotadetails($COURSE->id);
if ($coursequota == NULL) {
$coursequota = getquotadetails(0);
}

$coursesize = get_directory_size("{$CFG->dataroot}/$COURSE->id"); // in bytes - may overflow for large directories
// this switch converts the size of the course directory into the same unit the quota is defined in
switch ($coursequota->unit) {
case "B":
break;
case "K":
$coursesize /= 1024;
break;
case "M":
$coursesize /= 1048576;
break;
case "G":
$coursesize /= 1073741824;
break;
case "T":
$coursesize /= 1099511627776;
break;
case "P":
$coursesize /= 1125899906842624;
break;
case "E":
$coursesize /= 1152921504606846976;
break;
}

if ($coursesize > (int) $coursequota->quota && isteacher()) {
error("Unable to upload files: Total site size is $coursesize $coursequota->unit and quota is $coursequota->quota $coursequota->unit", "{$CFG->wwwroot}/files/index.php?id=$COURSE->id");
}

print_simple_box_start();
echo "Using $coursesize " . readablestorageunit($coursequota->unit) . " out of $coursequota->quota " . readablestorageunit($coursequota->unit) . " allowed";
print_simple_box_end();

And we’re almost done. We add a library file which stores some of the functions used by the site quotas and the language file as well. For the library, go into Moodle’s /lib directory and create the file ‘umnlib.php’. In it, insert

<?php
function readablestorageunit($unitchar=NULL) {

if ($unitchar == NULL) {
return "";
}

switch ($unitchar) {
case "B":
return "bytes";
break;
case "K":
return "KB (Kilobytes)";
break;
case "M":
return "MB (Megabytes)";
break;
case "G":
return "GB (Gigabytes)";
break;
case "T":
return "TB (Terabytes)";
break;
case "P":
return "PB (Petabytes)";
break;
case "E":
return "EB (Exabytes)";
break;
}
return "";
}

// if no argument given, assume $courseid=-1 as $courseid = 0 means sitewide default
function getquotadetails($courseid=-1) {
global $CFG, $tablename;

$courseid = (int) $courseid;

if ($courseid == -1 || $courseid < 0) {
return NULL;
}

$details = get_record($tablename, 'courseid', $courseid);
if (empty($details)) {
return NULL;
}
else {
return $details;
}
}

The first function returns the unit of storage in an easily readable form, while the second is a convenience function that looks up the quota record for a course. Our last step of installing is adding the language file. Go into Moodle's /lang/en_utf8/ directory and create the file 'umn.php'. In it, insert

<?php
$string['sitequota'] = 'Site Quota';
$string['sitequotadefaultsettings'] = 'Default Settings';
$string['sitequotaquota'] = 'Quota: ';
$string['sitequotaunit'] = 'Storage Unit: ';
$string['sitequotacourseid'] = 'Course ID: ';
$string['nodefaultquota'] = 'Default quota does not exist and manual addition of 100 GB quota failed - check that site quota table exists';
$string['newdefaultunitfail'] = 'New default unit of storage was not added - database error';
$string['updatedefaultunitfail'] = 'Unable to update default unit of storage - database error';
$string['invalidstorageunit'] = 'The unit of storage was not valid';
$string['defaultquotazero'] = 'Default quota size cannot be less than 0';
$string['newdefaultquotafail'] = 'New default quota was not added - database error';
$string['updatedefaultquotafail'] = 'Unable to update default quota - database error';
$string['sitequotanocourseid'] = 'Quota update failed - must include course ID';
$string['newunitfail'] = 'New unit of storage for course was not added - database error';
$string['updateunitfail'] = 'Unit of storage was not updated for course - database error';
$string['newquotafail'] = 'New quota for course was not added - database error';
$string['updatequotafail'] = 'Quota was not updated for course - database error';
$string['backupdatasize'] = 'Backupdata Size';
$string['totalsitesize'] = 'Total site size';
$string['sitequotacourseid'] = 'Moodle Course ID';
$string['noinstorsdne'] = 'No instructors or site does not exist';
$string['sitequotainstructors'] = 'Instructor';
?>

These can be changed from the languages interface for localization. Finally, we are able to use the quotas interface. Go to http://www.yourmoodlesite.com/admin and, after logging in as an administrator, go to Reports and then Site Quotas. If the link appears as "[[sitequota]]", add the string below to Moodle's lang/en_utf8/admin.php

$string['sitequota'] = 'Site Quota';

If no errors are thrown, you are able to use site quotas. Note that when adding a new quota for a course, if either the quota or the storage unit is left out, it is taken from the site-level quota. When updating a course-level quota, a field that is left out stays unchanged.

Please note that this has not been thoroughly tested so I would like feedback about bugs.

Sorry for the formatting, I got tired of wrestling Wordpress with it. Copy the language file that goes into /lang/en_utf8/ from here, the /files/index.php modification from here, and the umnlib.php that goes into /lib from here.

June 6, 2008

Note on Software License

Filed under: The Internets — Elvedin @ 4:41 pm

The Moodle scripts are free for personal use. Including them in any sort of package without my permission is forbidden (subject to change). The up-to-date license is at http://blog.ods.org/mood/license

Moodle: Finding Inactive Sites, Part 2

Filed under: The Internets — Elvedin @ 4:36 pm

Exactly like in the previous article on finding inactive sites, the script sends a CSV file to the browser, with the columns DaysInactive, URL to the course site, Instructor, and the instructor’s e-mail address. For courses with multiple instructors, each instructor gets a row in the CSV instead of putting all instructors and their e-mail addresses in one row.

Inactive Moodle Site CSV Output Script

Moodle: Finding Inactive Sites

Filed under: The Internets — Elvedin @ 4:33 pm

This script lists inactive courses based on the mdl_log table in the database, so you must have logging turned on.

Inactive Moodle Sites Script

Moodle: How to Find Course Size in Unix

Filed under: The Internets — Elvedin @ 4:25 pm

The following script outputs a table with the file sizes of each course in moodledata, using du (you must have it installed). As WordPress cannot correctly display HTML as text, I’ve linked to it. Drop it in your main Moodle directory, the same one that has config.php.

Moodle Site Size Script
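For a quick look from the command line, the same idea can be sketched directly in shell. This is not the linked script, only a rough equivalent under the assumption that each course’s files live in a numeric sub-directory of moodledata named after the course ID; the function name course_sizes is made up for this example.

```shell
# course_sizes MOODLEDATA_DIR
# Print "<size in KB><TAB><course id>" for every sub-directory whose name
# starts with a digit, largest first.
course_sizes() (
    cd "$1" || exit 1
    for dir in [0-9]*/; do
        [ -d "$dir" ] || continue          # skip if the glob matched nothing
        size_kb=$(du -sk "$dir" | cut -f1)
        printf '%s\t%s\n' "$size_kb" "${dir%/}"
    done | sort -rn
)

# Example: course_sizes /path/to/moodledata
```

Non-numeric directories such as temporary or cache folders are skipped because only names starting with a digit are matched.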

June 2, 2008

State of MPI on the IBM Cell and MPI Bandwidth/Latency Testing

Filed under: The Internets — Elvedin @ 3:13 pm

Currently, OpenMPI is the best option for Cell blades due to its loopback communication performance. Loopback communication is used when one process sends data to another process on the same system, which is likely on a Cell blade because it has dual Cell Broadband Engines. OpenMPI already implements shared memory for loopback communication, but MPICH2 uses less efficient means for it. That means MPICH2 has lower performance in the average case until the communication module “Nemesis” is released for the Cell/PowerPC.

To test these conjectures, take a look at the classic MPI Ping-pong example (C or Fortran77). The output from a Cell QS20 running OpenMPI version 1.2.1 is below -

Hello from 0 of 2
Hello from 1 of 2
Timer accuracy of ~4.053116 usecs

8 bytes took        70 usec (   0.228 MB/sec)
16 bytes took        13 usec (   2.440 MB/sec)
32 bytes took        11 usec (   5.836 MB/sec)
64 bytes took        11 usec (  11.671 MB/sec)
128 bytes took        10 usec (  25.565 MB/sec)
256 bytes took        12 usec (  42.108 MB/sec)
512 bytes took        13 usec (  78.090 MB/sec)
1024 bytes took        19 usec ( 108.733 MB/sec)
2048 bytes took        29 usec ( 141.982 MB/sec)
4096 bytes took       382 usec (  21.448 MB/sec)
8192 bytes took        85 usec ( 192.492 MB/sec)
16384 bytes took       138 usec ( 237.784 MB/sec)
32768 bytes took       570 usec ( 114.964 MB/sec)
65536 bytes took       433 usec ( 302.562 MB/sec)
131072 bytes took       912 usec ( 287.454 MB/sec)
262144 bytes took      1599 usec ( 327.919 MB/sec)
524288 bytes took      3019 usec ( 347.315 MB/sec)
1048576 bytes took      5971 usec ( 351.212 MB/sec)

Asynchronous ping-pong

8 bytes took        36 usec (   0.444 MB/sec)
16 bytes took         8 usec (   3.948 MB/sec)
32 bytes took         9 usec (   7.064 MB/sec)
64 bytes took        11 usec (  11.423 MB/sec)
128 bytes took        11 usec (  23.342 MB/sec)
256 bytes took        12 usec (  42.108 MB/sec)
512 bytes took        15 usec (  68.174 MB/sec)
1024 bytes took        20 usec ( 102.261 MB/sec)
2048 bytes took        25 usec ( 163.618 MB/sec)
4096 bytes took        64 usec ( 128.208 MB/sec)
8192 bytes took        71 usec ( 230.602 MB/sec)
16384 bytes took       140 usec ( 233.740 MB/sec)
32768 bytes took       290 usec ( 225.865 MB/sec)
65536 bytes took       434 usec ( 302.064 MB/sec)
131072 bytes took       807 usec ( 324.915 MB/sec)
262144 bytes took      1531 usec ( 342.474 MB/sec)
524288 bytes took      3104 usec ( 337.818 MB/sec)
1048576 bytes took      5987 usec ( 350.274 MB/sec)

Bi-directional asynchronous ping-pong

8 bytes took        25 usec (   0.645 MB/sec)
16 bytes took        11 usec (   2.918 MB/sec)
32 bytes took        12 usec (   5.263 MB/sec)
64 bytes took        11 usec (  11.671 MB/sec)
128 bytes took        13 usec (  19.884 MB/sec)
256 bytes took        15 usec (  34.087 MB/sec)
512 bytes took        17 usec (  60.492 MB/sec)
1024 bytes took        20 usec ( 102.261 MB/sec)
2048 bytes took        35 usec ( 117.670 MB/sec)
4096 bytes took        85 usec (  96.516 MB/sec)
8192 bytes took       100 usec ( 163.618 MB/sec)
16384 bytes took       181 usec ( 181.079 MB/sec)
32768 bytes took       365 usec ( 179.541 MB/sec)
65536 bytes took       522 usec ( 251.145 MB/sec)
131072 bytes took      1108 usec ( 236.607 MB/sec)
262144 bytes took      2870 usec ( 182.689 MB/sec)
524288 bytes took      5892 usec ( 177.965 MB/sec)
1048576 bytes took     11677 usec ( 179.596 MB/sec)

Max rate = 351.211540 MB/sec  Min latency = 4.053116 usec

The page that hosts this ping-pong example warns that it will not give accurate results and recommends NetPIPE instead. I suggest running both; in my experience ping-pong has produced the more consistent results, as I have encountered some inconsistency in NetPIPE’s measurements.

Scale:

1000 nanoseconds = 1 microsecond (usec), 1000 microseconds (usec) = 1 millisecond, 1000 milliseconds = 1 second
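As a sanity check on the MB/sec column, the figures in the output above appear consistent with counting both legs of the round trip, that is 2 × bytes / usec with 1 MB taken as 10^6 bytes. That formula is inferred from the numbers, not taken from the benchmark’s source, so treat the sketch below as an assumption; the function name pingpong_rate is made up for this example.

```shell
# pingpong_rate BYTES USEC
# Bytes per microsecond is the same as 10^6 bytes per second, so no extra
# scaling is needed. The factor 2 assumes the measured time covers a full
# round trip (an assumption, see above).
pingpong_rate() {
    awk -v bytes="$1" -v usec="$2" 'BEGIN { printf "%.3f\n", 2 * bytes / usec }'
}

pingpong_rate 1048576 5971   # the output above reports 351.212 MB/sec for this row
pingpong_rate 8 70           # the output above reports 0.228 MB/sec for this row
```

The computed values differ from the reported ones only in the last digits, which is what one would expect if the benchmark divides by the unrounded time rather than the whole microseconds it prints.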

Moodle 1.8 – Disabling Emoticons

Filed under: The Internets — Elvedin @ 2:55 pm

There is a patch for 1.9 that adds a setting to enable/disable emoticons, but it doesn’t quite work on 1.8. The best way to do it in Moodle 1.8 is to open lib/weblib.php in your Moodle directory, look for the function replace_smilies (the line with “function replace_smilies(&$text)”), and comment out the emoticon entries in it, down to the very last one, which is the ‘( )’ => ‘egg’ entry.

The benefit of this is that emoticons will no longer be added automatically when text is parsed, although they can still be used if selected from the menu of an editor. Note that you should keep at least one valid emoticon entry in the $emoticons array, just to make sure that nothing breaks because $emoticons is empty.
