If this document changes, it will be available here: http://genericbooks.com/Literature/Articles/2 .
What are Large Objects? Large Objects are "things" (text or binary data) which cannot fit into a normal field in a table. For example, a 100 megabyte file cannot really fit into a field in a table.
What were the technologies I was using? RedHat Linux, postgresql-6.5.3, Perl 5 (with DBI, DBD::Pg, and Pg), and apache_1.3.9.
In general, how do you use Large Objects in PostgreSQL? You store a large object much like a file: the database server lets you import a file and hands you back a number (an oid) in return, and when you want to retrieve the data, you use that number to export it to a temporary file on your hard drive. It is kind of weird. Extracting a Large Object therefore takes two steps: first you copy the data to a file on your computer, and then you read that file. The problem, from my point of view, is that data is normally read once; here it is read twice, once to create the file and once to read it. There isn't any way around this (as far as I know), but I am just new to Large Objects in PostgreSQL (I have used them in other database servers before though), so perhaps there are better ways.
#!/usr/bin/perl
use DBI;
use vars qw($dbh);
### zing is the database I am connecting to.
$dbh ||= DBI->connect("dbi:Pg:dbname=zing");
my $Command = "select first_name,last_name from people
sort by last_name,first_name";
my $sth = $dbh->prepare($Command);
my $Result = $sth->execute;
while (my @row_ary = $sth->fetchrow_array)
{print "<br> $row_ary[0] $row_ary[1]\n";}
The problem with Perl DBI is that the Large Object interface isn't fine-tuned yet, and you have to use the specific DBD driver for each type of database. The problem with DBD::Pg is that Large Objects are not well documented, and it took me a long time looking through newsgroups to find the answer I was looking for. Eventually, I also managed to find this note on Large Objects.
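For what it is worth, DBD::Pg does expose the client-side large object calls (lo_import, lo_open, lo_read, lo_close, and friends) through its func() interface, which lets you pull a large object straight into memory without the temporary file. The following is only a rough sketch based on the DBD::Pg documentation, not something I have fine-tuned; in particular the pg_INV_READ constant and the need to turn AutoCommit off may depend on your DBD::Pg version.
#!/usr/bin/perl
use DBI;
use vars qw($dbh);

$dbh ||= DBI->connect("dbi:Pg:dbname=zing");

## Large object descriptors are only valid inside a transaction.
$dbh->{AutoCommit} = 0;

## Import a file as a large object; this returns its oid.
my $Oid = $dbh->func('/etc/motd', 'lo_import');

## Open the large object and read it directly into memory,
## without going through a temporary file on disk.
my $fd = $dbh->func($Oid, $dbh->{pg_INV_READ}, 'lo_open');
my ($data, $buf) = ('', '');
while ((my $n = $dbh->func($fd, $buf, 16384, 'lo_read')) > 0)
  {$data .= substr($buf, 0, $n);}
$dbh->func($fd, 'lo_close');
$dbh->commit;

print $data;
If that works for you, it avoids the double read I complained about above.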
Okay, how do we import and export large objects from PostgreSQL? The following example comes straight from the PostgreSQL documentation. This example uses the program psql.
CREATE TABLE image (
name text,
raster oid
);
INSERT INTO image (name, raster)
VALUES ('beautiful image', lo_import('/etc/motd'));
SELECT lo_export(image.raster, '/tmp/motd') from image
WHERE name = 'beautiful image';
Now, we need to convert this into Perl. Here is a Perl script which does the exact same thing.
#!/usr/bin/perl
use DBI;
use vars qw($dbh);
$dbh ||= DBI->connect("dbi:Pg:dbname=zing");
my $Command = "INSERT INTO image (name, raster)
VALUES ('beautiful image', lo_import('/etc/motd'));";
my $sth = $dbh->prepare($Command);
$sth->execute();
$Command = "SELECT lo_export(image.raster, '/tmp/motd') from image
WHERE name = 'beautiful image'";
$sth = $dbh->prepare($Command);
$sth->execute();
Those two commands do the following. The first loads the file "/etc/motd" into the database as a large object and stores its oid in the table. The second takes the large object referenced in the table and exports it to a file called "/tmp/motd". If you want to get the "oid" of the data in the table yourself and then export it to a file, you could also do this.
## This command will return a numeric "oid" number
$Command = "SELECT raster from image WHERE name = 'beautiful image'";
$sth = $dbh->prepare($Command);
$sth->execute();
my @row_ary = $sth->fetchrow_array;
my $Oid = $row_ary[0];

## This command will export the data with the "oid" to a file
$Command = "SELECT lo_export($Oid, '/tmp/motd') from image";
$sth = $dbh->prepare($Command);
$sth->execute();
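Remember that the export is only half of the two-step retrieval I described above: once lo_export has written the temporary file, you still have to read it back with ordinary file I/O (and clean it up afterwards). Something along these lines, assuming the export to '/tmp/motd' above succeeded:
## Read the exported temporary file back into memory.
open(MOTD, "/tmp/motd") or die "cannot open /tmp/motd: $!";
my $data = join('', <MOTD>);
close(MOTD);

## Remove the temporary file now that we have the data.
unlink("/tmp/motd");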
For live examples, ZING at www.genericbooks.com has all of its Perl scripts available for public viewing. Look for the scripts that view documents extracted from the database. ZING has now set up scripts to let people upload and download documents, so there should be real live examples floating around on the website.
At ZING, we name the exported files after the pid of the webserver child process that handles the request. When the webserver needs to export another document, it hands the request to one of its children, and if that child has already exported a document before, the new export simply overwrites the previous one, so we don't end up with tons of exported, undeleted files. Then, once an hour, we go through and delete any exported files older than 15 minutes. It is not as elegant as I would like, but it works fine.
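For illustration, here is a rough sketch of that scheme; the directory, the file name pattern, and the query are just stand-ins for whatever the real scripts on the site do.
#!/usr/bin/perl
use DBI;
use vars qw($dbh);

$dbh ||= DBI->connect("dbi:Pg:dbname=zing");

## Each webserver child reuses the same file name, based on its pid ($$),
## so a later export just overwrites the earlier one.
my $TmpFile = "/tmp/lo_export.$$";
my $Command = "SELECT lo_export(raster, '$TmpFile') from image
               WHERE name = 'beautiful image'";
my $sth = $dbh->prepare($Command);
$sth->execute();

## Hourly cleanup (run from cron, for example): delete exported files
## older than 15 minutes. -M gives the file age in days, so 15 minutes
## is 15/(24*60) days.
foreach my $OldFile (glob("/tmp/lo_export.*"))
  {unlink($OldFile) if -M $OldFile > 15/(24*60);}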
In general, it is a bad idea to let your webserver have superuser status or to have the webserver and the database server run under the same username. For security, you don't want your webserver to have the power to blow away the database server. Oh well.
Anyways, I will try to find better solutions, so let me know if you hear of any!
Large Objects have always been a pain for me. I have always wanted to use Large Objects in PostgreSQL. Now that I had a reason, I finally did it. Hopefully it will save other people headaches. If it saved you from headaches, donate time or money to ZING, or do something charitable for some other cause! If we all do a little, it makes a big impact.
My next goal is to handle Large Objects in PostgreSQL using the programming languages PHP and Python. After that, I want to see if there is a way to use Large Objects without being a super user. After that, I want to compare this to MySQL. For licensing reasons, and since PostgreSQL has always been 100% free and open, I prefer PostgreSQL over MySQL. However, I want to compare and contrast them to help make PostgreSQL better. Recently, MySQL has relaxed its license, but I am going to stick with PostgreSQL since they have always had the best license.