Although we wrote the following instructions while installing DSpace 1.0.1, most installation problems stem not from DSpace itself but from surrounding components such as mod_webapp, so we believe these tips should remain useful for fresh installations of DSpace 1.1.1 or later.

SunSITE wanted to launch a DSpace testbed, but we found that consistent, reliable installation and configuration information was difficult to find. Eventually, we were able to install DSpace successfully, and we believe that sharing our installation and configuration experience can only assist others as they attempt the setup. Please note that this document is not meant to be a step-by-step manual covering every detail of installation; rather, we hope that it will shed some light on the more complex and difficult parts of the DSpace setup.

First, a few notes. General information about DSpace can be found at the main MIT project page. MIT's installation instructions - which are an excellent starting point, and to which we will refer often - are located here, with specific emphasis on the installation section.

Thanks to the generous support of the DSpace community, we have some contributed documents covering other aspects of DSpace installation and configuration. This list also includes links to external pages or sites that cover tips and techniques for related technologies. So far, we have documents exploring these issues:
We are running DSpace under Solaris 9 on Sparc hardware. With that in mind, here are the software packages that we used, with specific version numbers (unless linked, all software can be found by following the instructions and links within the DSpace installation instructions, as linked above):
  • DSpace 1.0.1 (currently we are running 1.1)
  • JavaBeans Activation Framework 1.0.2
  • Java Servlet 2.3 and JSP 1.2
  • JavaMail API 1.3
  • Tomcat 4.0.6 (binary version)
  • Apache 1.3.27
  • Ant 1.5.2
  • PostgreSQL 7.3.2
In order to build mod_webapp, we used the following software:

For the most part, the DSpace install documentation is fine, especially for the few JAR files that need to be put into the DSpace source tree, etc. Even for PostgreSQL, the installation instructions are easy to follow, although we used a later version of the database than they suggested (just remember, if you do use a later version, to add --with-java to the configure parameters, and then place the postgresql.jar file in the correct place as indicated). Update: the new installation instructions for 1.1 and later state that you must configure PostgreSQL with the --with-java option.

The main difficulty that people encounter when installing DSpace revolves around the proper configuration of both Apache and Tomcat, as well as the bridging of the two with mod_webapp. Let's look at the Apache install first, then Tomcat, and then finally mod_webapp.

Apache

While it is possible to run newer versions of Tomcat under SSL as a standalone server, we still believe that using Apache as a sort of proxy server for handling SSL is the better method. However, here is a link describing how to run Tomcat as a standalone SSL server, should you be interested in going that route. Since this document tries to focus as much as possible on DSpace-related issues, we will not describe how to set up mod_ssl, for instance. To progress beyond this point, we assume that you have a working Apache server with SSL capabilities.

We had problems using the default dspace-httpd.conf file that comes with DSpace. So, we extracted what we needed from it and just placed the directives inside the regular httpd.conf. This renders the usefulness of letting DSpace create the dspace-httpd.conf file moot, but that's something we can live with. Check out our sample httpd.conf file incorporating the changes described below. The file has all of the DSpace-specific additions and changes we made to get a working DSpace Apache build talking over mod_webapp to Tomcat.

The RedirectMatch lines went in right above our <VirtualHost _default_:443> line.

Make certain you define a real ServerName in the top-level ServerName directive; otherwise, you may encounter an error complaining that you have an invalid virtual host name. In other words, we have a top-level ServerName dspace.sunsite.utk.edu as well as another ServerName dspace.sunsite.utk.edu within the default SSL virtual host entry.

The included dspace-httpd.conf contains several directives dealing with SSL. We actually added none of these to our configuration file, finding that the way we had it set up worked just fine; your experience, however, may be completely different. Some paths in dspace-httpd.conf are completely inaccurate, such as the SSLCertificateFile and the SSLCACertificateFile values. These must be specific to the MIT implementation, because by default DSpace does not even come with an etc directory at the top level (e.g., /dspace/etc doesn't exist). Therefore, use the correct paths to your certificates instead, but you should probably already have these entered correctly if you have a working, SSL-enabled Apache install to begin with.

We changed the user and group that Apache runs as from nobody to dspace, which is our system's DSpace user, as well as the user that Tomcat runs under.

Other than the necessary mod_webapp changes - which will be discussed later - our Apache installation remained fairly unchanged.

Tomcat

Our Tomcat installation was fairly painless as well. In fact, we only changed one thing. The DSpace docs will instruct you to delete some extraneous lines from the server.xml Tomcat configuration file. While you can do that, it is not necessary. Our only change was to change the name attribute within the Tomcat-Apache Service from localhost to dspace.sunsite.utk.edu - or in other words, exactly what we set the ServerName directive in httpd.conf to.

mod_webapp

By far, the most common headache when installing DSpace is the mod_webapp component. However, if you follow these steps, you will have a better than average chance of ending up with a stable binary.

Before we talk about the steps used to compile mod_webapp, I want to address the issue of the lack of binaries for mod_webapp. I do not know why people are so against giving away their binaries. So, here is our mod_webapp binary, compiled for Solaris 9 Sparc. Have fun.

Alright... Here are the instructions to compile mod_webapp (note that much of this is repeated from the README.txt file found within the jakarta-tomcat-connectors-4.0.6-src/webapp directory):
  1. Start out by unpacking the jakarta-tomcat-connectors-4.0.6-src.tar.gz file, probably in some temp directory (we use /scratch). This will create a jakarta-tomcat-connectors-4.0.6-src directory.
  2. Within that new directory, there will be a webapp directory. Inside of the webapp directory, unpack the apr_APACHE_2_0_35.tar.gz file. This will create an apr directory within the webapp directory.
  3. From within the webapp directory, run ./support/buildconf.sh. This should produce a configure script within the webapp directory.
  4. Run ./configure --with-apxs=PATH_TO_APXS --with-apr=PATH_TO_APR_SRC --enable-java=PATH_TO_TOMCAT. The PATH_TO_APXS will likely be similar to /usr/local/apache/bin/apxs, the PATH_TO_APR_SRC should be right within your current directory, such as /scratch/jakarta-tomcat-connectors-4.0.6-src/webapp/apr, and the PATH_TO_TOMCAT should be obvious, such as /usr/local/tomcat.
  5. If everything went well (no errors, etc.), run make and hope for the best.
At this point, you should have the file mod_webapp.so inside an apache-1.3 subdirectory within webapp, as well as the tomcat-warp.jar file inside the build subdirectory. This is the same information as is found within the README.txt file, and if you do have both of these files, congratulations, and installation and configuration can proceed as detailed within INSTALL.txt and the other DSpace instructions.
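
The five steps above can be collected into a small script. What follows is only a sketch based on our own layout: the /scratch, /usr/local/apache/bin/apxs and /usr/local/tomcat paths are the examples mentioned above, the assumption that the apr tarball has already been copied into the webapp directory is ours, and the step and build_mod_webapp names are hypothetical helpers, not part of the connector distribution. By default it only prints the commands so that they can be reviewed; setting DRY_RUN=0 makes it run them.

```shell
#!/bin/sh
# Sketch of the mod_webapp build steps. Every path here is an assumption
# based on our own layout; override them in the environment as needed.
SCRATCH=${SCRATCH:-/scratch}
APXS=${APXS:-/usr/local/apache/bin/apxs}
TOMCAT_HOME=${TOMCAT_HOME:-/usr/local/tomcat}
CONNECTORS=jakarta-tomcat-connectors-4.0.6-src
WEBAPP_DIR=$SCRATCH/$CONNECTORS/webapp

# step CMD: print CMD, or run it through sh when DRY_RUN=0.
step() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    sh -c "$1"
  else
    echo "$1"
  fi
}

build_mod_webapp() {
  # 1. Unpack the connectors source (gzip piped to tar, since not every tar accepts z).
  step "cd $SCRATCH && gzip -dc $CONNECTORS.tar.gz | tar xf -" &&
  # 2. Unpack APR inside webapp (assumes the tarball was copied there).
  step "cd $WEBAPP_DIR && gzip -dc apr_APACHE_2_0_35.tar.gz | tar xf -" &&
  # 3. Generate the configure script.
  step "cd $WEBAPP_DIR && ./support/buildconf.sh" &&
  # 4. Configure against apxs, the APR source, and Tomcat.
  step "cd $WEBAPP_DIR && ./configure --with-apxs=$APXS --with-apr=$WEBAPP_DIR/apr --enable-java=$TOMCAT_HOME" &&
  # 5. Build, then confirm that both expected files exist.
  step "cd $WEBAPP_DIR && make" &&
  step "ls -l $WEBAPP_DIR/apache-1.3/mod_webapp.so $WEBAPP_DIR/build/tomcat-warp.jar"
}
```

Calling build_mod_webapp with no settings prints the six commands in order; because the steps are chained, a real run (DRY_RUN=0) stops at the first step that fails, which is where the troubleshooting notes in this section come in.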

But, as many people know, most of the time the building of mod_webapp does not proceed smoothly. While we cannot address every problem people face, we can discuss some of the problems we had, as well as the steps we took to fix them.

Under Solaris, we ran into a problem while trying to run the buildconf.sh script. An error would appear saying essentially that autom4te was not found, even though it was clearly right in the same directory as automake, autoconf, etc. This is actually not some strange system problem! The fix, for us, was simply to edit the autom4te script and change the path to Perl on the "shebang" line at the top of it. That's right - just make certain that the line actually points to your Perl executable, and that should solve the problem. We had this exact same problem with another script later in the buildconf.sh process, so if you see the script complaining that it cannot find a file that you know is there, check to see whether it is using the correct path to perl.
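
A quick way to spot this kind of problem is to check whether the interpreter named on a script's first line actually exists. The helper below is a minimal sketch; check_shebang is a hypothetical name of our own, not something shipped with autoconf or the connectors.

```shell
# check_shebang FILE: report whether the interpreter named on the first
# line of FILE exists and is executable.
check_shebang() {
  first_line=$(head -n 1 "$1")
  case "$first_line" in
    '#!'*) ;;
    *) echo "$1: no shebang line"; return 2 ;;
  esac
  # Strip the leading "#!", any spaces after it, and any interpreter arguments.
  interp=$(printf '%s\n' "$first_line" | sed -e 's/^#![[:space:]]*//' -e 's/[[:space:]].*$//')
  if [ -x "$interp" ]; then
    echo "$1: ok ($interp)"
  else
    echo "$1: interpreter not found ($interp)"
    return 1
  fi
}
```

For example, check_shebang "$(command -v autom4te)" would flag an autom4te script whose first line points at a Perl that is not installed at that path (this assumes autom4te is on your PATH).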

Perhaps more disturbing, even after we ended up with a "clean" configure run, we encountered problems late in the running of make. For some unknown reason, we were given an error saying that the path /scratch/jakarta-tomcat-connectors-4.0.6-src/webapp/apr/apr was invalid. Well, yes, that certainly is invalid. There is no apr subdirectory within the webapp/apr directory. And apparently the make script was trying to copy a file into or out of this directory. Our first thought was to try and fake the script out by making a symlink named apr within the "real" apr directory pointing back to its parent. This is quite the kludge, but it has worked on a few other occasions with testy scripts. But what we ended up doing was simply creating a real, empty apr directory within apr, and at that point, the make completed without problem, and produced a working, stable mod_webapp.
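
For reference, the workaround amounts to a single mkdir. The sketch below wraps it in a small guard so that it only runs when the real apr directory is present; ensure_nested_apr is a hypothetical helper name, and the path in the usage line is simply the one from our build.

```shell
# ensure_nested_apr WEBAPP_DIR: create the empty apr/apr directory that
# make expected on our system, but only if WEBAPP_DIR/apr already exists.
ensure_nested_apr() {
  webapp_dir=$1
  if [ ! -d "$webapp_dir/apr" ]; then
    echo "no apr directory under $webapp_dir" >&2
    return 1
  fi
  mkdir -p "$webapp_dir/apr/apr"
}

# Usage on our system (path is an assumption about your layout):
# ensure_nested_apr /scratch/jakarta-tomcat-connectors-4.0.6-src/webapp
```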

Please note that it is possible, in some crazy way, to actually compile a mod_webapp.so file even though you have had many errors, especially during the running of buildconf.sh! It is imperative that you fix all errors during that process! For instance, under Solaris we had to install a couple of packages, such as m4, to meet the requirements. We produced a couple of "successful" builds of mod_webapp without resolving each error, and our Apache servers would simply segfault and die on each connection.

The configuration of getting Apache to "talk" with Tomcat using mod_webapp is relatively easy. In fact, we just added the WebAppConnection conn warp localhost:8008 and WebAppDeploy dspace-oai conn /oai lines directly above the <VirtualHost _default_:443> section, and the WebAppDeploy dspace conn / line within the VirtualHost section, just as outlined in the dspace-httpd.conf file. Also, here is our sample httpd.conf file, with all the additions and changes needed to configure Apache for mod_webapp and Tomcat that were just mentioned.
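
Because the order and placement of these lines matter, it can help to list them with their line numbers before restarting Apache. The following is a sketch only: show_dspace_directives is a hypothetical helper of our own, and it simply greps for the directives discussed in this document.

```shell
# show_dspace_directives FILE: list, with line numbers, the directives this
# document touches, so their values and relative order can be checked by eye.
# Commented-out lines and closing </VirtualHost> tags are not matched.
show_dspace_directives() {
  grep -n -E '^[[:space:]]*(ServerName|User|Group|RedirectMatch|<VirtualHost|WebAppConnection|WebAppDeploy)([[:space:]]|$)' "$1"
}

# Usage (the httpd.conf location is an assumption about your install):
# show_dspace_directives /usr/local/apache/conf/httpd.conf
```

In the output, the WebAppConnection and the WebAppDeploy line for /oai should appear before the <VirtualHost _default_:443> line, and the WebAppDeploy line for / after it. Running apachectl configtest afterwards will catch plain syntax errors.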

Other people have reported good success using mod_jk, or even mod_proxy, to bridge the proverbial gap between Apache and Tomcat. I would love to see any information about this, including sample configuration files, etc. If there is enough interest, I would gladly add a section dedicated to those or other methods of making Apache work with DSpace. Also, if anyone knows of good web resources for the installation and configuration of mod_webapp, please let me know as well. I, however, read through what seemed like every page remotely related to mod_webapp, and much of the information was incorrect, incomplete, confusing, or otherwise less than helpful.

Handle Server

Perhaps one of the most difficult components of DSpace to set up is the Handle server. This is mostly due to exceedingly poor documentation. The good news is that setting up the server is actually fairly easy, but poor wording and conflicting reports make the process seem more difficult than it really is.

Here are the steps that we went through to set up our server:
  1. Run /dspace/bin/dsrun net.handle.server.SimpleSetup /dspace/handle-server. Follow the instructions, answering relatively simple questions. Another option might be to run /dspace/bin/make-handle-config after configuring dspace.cfg, but we did not do this.
  2. Mail the file /dspace/handle-server/sitebndl.zip to hdladmin@cnri.reston.va.us. They will quickly create your global identifier and email you with the appropriate information.
  3. Once you receive the email, you must edit the Handle configuration file (/dspace/handle-server/config.dct). Just as the DSpace instructions say, make the "storage_type" = "CUSTOM" change, as well as the "storage_class" = "org.dspace.handle.HandlePlugin" change.
  4. While still in config.dct, update any lines that say something like YOUR_NAMING_AUTHORITY with the appropriate number as sent to you in the email from the Handle admin people. For instance, here at UTK we have a line that reads "server_admins" = ( "300:0.NA/1785" ). We made this change in three places, under server_admins, backup_admins, and replication_admins.
  5. Edit /dspace/config/dspace.cfg, changing handle.prefix to whatever number you were assigned (in our case, the line reads handle.prefix = 1785), and changing if necessary handle.dir to point to the right place, usually /dspace/handle-server.
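
The two text substitutions in steps 4 and 5 can be done by hand in an editor, or sketched with sed as below. This is an illustration under assumptions: the placeholder is taken to be the literal string YOUR_NAMING_AUTHORITY (your config.dct may word it slightly differently), the prefix is assumed to contain no slash, and both function names are hypothetical. Each function prints the edited file to standard output rather than changing it in place.

```shell
# fill_naming_authority FILE PREFIX: print FILE with every
# YOUR_NAMING_AUTHORITY placeholder replaced by the assigned prefix.
fill_naming_authority() {
  sed "s/YOUR_NAMING_AUTHORITY/$2/g" "$1"
}

# set_handle_prefix FILE PREFIX: print a dspace.cfg with its
# handle.prefix line set to the assigned prefix.
set_handle_prefix() {
  sed "s/^handle\.prefix[[:space:]]*=.*/handle.prefix = $2/" "$1"
}

# Usage, keeping the original as a backup (1785 is our prefix, not yours):
# cp /dspace/handle-server/config.dct /dspace/handle-server/config.dct.orig
# fill_naming_authority /dspace/handle-server/config.dct.orig 1785 \
#   > /dspace/handle-server/config.dct
```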
You should be finished at this point, and the server should start after running /dspace/bin/start-handle-server. Since this is not a perfect world, you may encounter a problem when attempting to start the server, or you may notice a problem while trying to resolve handles through the hdl.handle.net server. Here are some of the problems we had, and how to fix them:

Do not follow the instructions on the Handle site to home your server. While it is true that the official DSpace docs do not explicitly tell you to home your new server, they also do not tell you that it is not necessary to do so. Apparently, this is a common mistake, and while it does not hurt you to do so, certainly it does not help. In fact, ignore any further instructions on the Handle site; even though the DSpace docs point you there for guidance, ignore anything else about getting your server up and running.

Another thing that is not explicitly spelled out is that you do not actually create your own handles; rather, DSpace takes care of handle creation whenever a new collection is created, a new item is added, and so forth. Again, the DSpace docs do not tell you to use the Handle admin interface to create handles, but the official Handle docs may lead you to think that you should. Going down that road will produce many errors and headaches...

Pay close attention to /dspace/handle-server/error.log. We could not figure out why our Handle server was not resolving handles, but an examination of the error log showed a message saying that the TCP port was already taken, and thus it could not bind. Well, a quick look at the running processes showed that we had several running zombie processes relating to Java and DSpace, even though everything was completely shut down. After killing those processes, the port was open and the Handle server could bind to its port. Problem solved.
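
To find leftover processes like the ones we had, something along these lines may help. It is a sketch: java_pids is a hypothetical helper that reads ps -ef output and prints the PID column for every line mentioning java, and it assumes the usual ps -ef layout in which the PID is the second field.

```shell
# java_pids: read `ps -ef` output on stdin and print the PID (second column)
# of every process whose line mentions java. The [j] in the pattern keeps
# this awk command itself from matching when it shows up in the ps listing.
java_pids() {
  awk '/[j]ava/ { print $2 }'
}

# Usage: list candidates first, and review them before killing anything.
# ps -ef | java_pids
```

Check each PID against the full ps -ef line before running kill on it, since other Java applications on the same machine will show up as well.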

Even though you may not choose to do this, we went ahead and deleted everything we considered unnecessary in our config.dct file, including everything related to HTTP and UDP. To the best of our knowledge, the DSpace implementation of the Handle server does not use these; instead, it uses only the TCP interface. (In fact, it seems that the included Handle server is kind of watered-down in several respects - or at least its implementation within DSpace is - but that is another conversation.) For example, here is our production config.dct file, showing what we left in, and our server works just fine. Indeed, if I had to wager a guess, I would say that even more could be pruned out, although in reality the excess is probably not hurting anything or wasting resources.

Apparently, the coders of DSpace found an error in the source that prevents handles from resolving globally (e.g., when linked through hdl.handle.net as opposed to locally), and they released a code fix for it. What you need is the new HandlePlugin.java source file. Download it, and place it in the right place in your DSpace source tree, which for us is /scratch/dspace-1.0.1/src/org/dspace/handle, overwriting the older copy that is there. When that is done, run ant, then ant update from within your main source dir (again, for us, /scratch/dspace-1.0.1). This will create and install a new copy of dspace.jar within /dspace/lib. Your handle problems should go away, provided everything else is in place. However, note that they are including code in the mid-April 2003 release of DSpace that will render this fix unnecessary. Update: I am certain this is no longer an issue in any new release of DSpace, but I will leave the instructions here just in case.
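
If you do need to apply the fix, the file replacement can be sketched as follows. The install_handle_plugin name is hypothetical, and keeping a .orig backup of the old file is our own precaution rather than part of the original fix instructions.

```shell
# install_handle_plugin NEW_FILE SRC_TREE: copy the downloaded
# HandlePlugin.java into SRC_TREE/src/org/dspace/handle, keeping the
# previous copy as HandlePlugin.java.orig.
install_handle_plugin() {
  new_file=$1
  target_dir=$2/src/org/dspace/handle
  if [ ! -d "$target_dir" ]; then
    echo "missing $target_dir" >&2
    return 1
  fi
  if [ -f "$target_dir/HandlePlugin.java" ]; then
    cp "$target_dir/HandlePlugin.java" "$target_dir/HandlePlugin.java.orig"
  fi
  cp "$new_file" "$target_dir/HandlePlugin.java"
}

# Usage with our paths, followed by the rebuild described above:
# install_handle_plugin ./HandlePlugin.java /scratch/dspace-1.0.1 &&
#   (cd /scratch/dspace-1.0.1 && ant && ant update)
```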

Conclusion

In retrospect, our installation of DSpace actually went relatively well. Aside from the mod_webapp component, and the lack of usefulness of the dspace-httpd.conf file, everything went according to spec, as outlined in the DSpace docs. If you have any questions, comments, or additions to this page, please email me at jsimms@utk.edu.