Random bits of information by a developer

28 April 2009

Dependency Primer

Every software project has dependencies: your own resources, APIs you've created, third-party APIs and projects, and so on. We've all had to deal with them at some point in our careers. Recently, while playing with Gradle (www.gradle.org), I came to a realization about how they should be handled. In this post I'll talk about the two most prominent kinds of dependencies, their scopes, and where they fit into an application. I will not be discussing packaging or how the different scopes must be resolved to create a distributable / deployable artifact. As I'm a Java developer, this is written from within the Java space, but the concepts apply to all software projects.

Dependency Types

There are basically two types or categories of dependencies: first level and transitive. First level dependencies are those resources that your application directly relies upon. Examples include the language in which the project is written and the things your project uses directly, such as an XML parser, images, and classes from a third-party project. Transitive dependencies are dependencies of your first level dependencies. An example from the Java world is commons-logging, which depends on some logging implementation, commonly log4j or JDK logging. For commons-logging the log implementation is a first level dependency, but for your project it's a transitive dependency. Another example is a SAX XML parser (or any other XML parsing API). The SAX API is used directly by your project and is therefore a first level dependency, but it requires an implementation, possibly Xerces, which would be a transitive dependency of your project.

This definition of dependencies has really been ingrained in me while trying out Gradle (a build system [yes, another one] written in Groovy). In the past I've used Maven or Ivy (basically as a Maven alternative, though it does much more in the world of dependencies). Maven introduced me to the concept of transitive dependencies and how they fit into its life cycle, and therefore your project's life cycle. I owe a great deal to Maven for introducing me to these concepts, but Maven has a few blemishes (at least I think so) in the way it handles the different types of dependencies. By default (I don't think you can change this), Maven will include all of the transitive dependencies of the same scope in your classpath, which, in my opinion, is incorrect. Here's an example taken from JSFUnit.

pom.xml:
   <dependency>
      <groupId>net.sourceforge.cssparser</groupId>
      <artifactId>cssparser</artifactId>
      <scope>compile</scope>
   </dependency>
   <dependency>
      <groupId>net.sourceforge.nekohtml</groupId>
      <artifactId>nekohtml</artifactId>
      <scope>compile</scope>
   </dependency>
   <dependency>
      <groupId>xalan</groupId>
      <artifactId>xalan</artifactId>
      <scope>compile</scope>
   </dependency>
Dependency Tree:
[INFO] +- net.sourceforge.cssparser:cssparser:jar:0.9.5:compile
[INFO] |  \- org.w3c.css:sac:jar:1.3:compile
[INFO] +- net.sourceforge.nekohtml:nekohtml:jar:1.9.9:compile
[INFO] |  \- xerces:xercesImpl:jar:2.8.1:compile
[INFO] +- xalan:xalan:jar:2.7.0:compile
[INFO] |  \- xml-apis:xml-apis:jar:1.0.b2:compile
For this project cssparser, nekohtml and xalan have been configured as first level dependencies, but the effective classpath also contains their own compile (first level) dependencies: sac, xercesImpl and xml-apis. If the project relies on these libraries directly, it should declare them explicitly rather than lean on the crutch of getting them transitively. Ivy usage can fall into the same trap, but Gradle does not (at least not without changing the default), which I believe is the correct way of handling first level dependencies. With Gradle the compile scope (more on scopes in the next section) is not resolved transitively, so you'll get compile-time errors if you have not declared a dependency you need.
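As a minimal sketch of what declaring them explicitly could look like, assume (hypothetically) that the project's own code compiles directly against SAC, Xerces and the XML APIs. The three libraries from the dependency tree would then be listed next to the existing entries. The coordinates come from the tree above; versions are omitted in the same way as in the pom.xml fragment, on the assumption that they are managed elsewhere, for example in a parent pom.

```xml
<!-- Sketch only: declare the libraries your code uses directly,
     instead of relying on them arriving transitively. -->
<dependency>
   <groupId>org.w3c.css</groupId>
   <artifactId>sac</artifactId>
   <scope>compile</scope>
</dependency>
<dependency>
   <groupId>xerces</groupId>
   <artifactId>xercesImpl</artifactId>
   <scope>compile</scope>
</dependency>
<dependency>
   <groupId>xml-apis</groupId>
   <artifactId>xml-apis</artifactId>
   <scope>compile</scope>
</dependency>
```

Whether all three belong there depends on what the code actually imports; a library that is only needed by nekohtml or xalan at run time would stay a transitive dependency.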

Life cycle Scopes and Dependencies

The build life cycle for a software project can be distilled into three phases: compile (if needed), test, and runtime. Test is a little special because it contains two phases of its own, testCompile and testRuntime, which are extensions of compile and runtime. So where do the different types of dependencies come into play? Your first level dependencies become your compile dependencies. Runtime dependencies are pretty much your compile dependencies plus their transitive dependencies, plus a few other things that may be provided for you by a container, such as a messaging provider, an HTTP implementation or transaction support (though those are arguably transitive dependencies of some third-party dependency). The test dependencies extend compile, and later runtime, and add their own dependencies for testing: a testing framework, a mocking framework, possibly a slimmed-down server, and others, which of course are first level dependencies of your tests.
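In Maven terms these phases roughly correspond to the scope element on a dependency. The fragment below is a hypothetical sketch with one dependency per scope; the libraries and versions are picked purely for illustration and are not taken from any project discussed here.

```xml
<dependencies>
   <!-- compile: a first level dependency, used directly by your code -->
   <dependency>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
      <version>1.1.1</version>
      <scope>compile</scope>
   </dependency>
   <!-- runtime: not needed to compile, only to run -->
   <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.14</version>
      <scope>runtime</scope>
   </dependency>
   <!-- provided: needed to compile, supplied by the container at run time -->
   <dependency>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
      <version>2.5</version>
      <scope>provided</scope>
   </dependency>
   <!-- test: a first level dependency of your tests only -->
   <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
   </dependency>
</dependencies>
```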

Summary

To recap, there are two different kinds of dependencies: first level and transitive. First level dependencies are the ones your project relies on directly to build and run. Transitive dependencies are the dependencies of your dependencies. A software build life cycle essentially has three phases: compile, test, run. The compile phase should only use your first level dependencies. Runtime extends compile and is resolved transitively. Test extends both compile and runtime (though at different times) and uses its own dependencies as well. I hope this has been informative and has helped others understand the relationship between a software project and its dependencies, and the distinction between the two kinds.

23 April 2009

JPA Annotation Locations with Hibernate

Just ran into this one at work. If you're using Hibernate as your JPA provider (I'm not sure whether this is true for other providers, please comment), all of the annotations on the main entity, mapped superclass, embedded classes, etc. must be in the same location: either all on the fields or all on the getter methods. The reason behind it is at StackOverflow. Basically, Hibernate expects the annotations to be in the same location as the first @Id it comes across. I experienced weird errors where it was picking up a property that was mapped to a different column on the table, so when Hibernate went to validate the schema it blew up. Hope this helps someone.

12 April 2009

Ivy Configurations when pulling from a Maven Repository Part I

I heard about Ivy (http://ant.apache.org/ivy) some time ago, but never really took the time to look into it. After all, I had Maven, and that's what we were using at work, so I had no real incentive. As I'm sure many of you have found, there are some issues with Maven. For all the things it does well, there are a few where it really falls flat on its face. How about transitive dependencies, for example? Bane of my Maven experience. The standard project layout is very nice, but it becomes a hindrance if, for whatever reason, you need to go against it.

As most of my readers have seen, I'm pretty well entrenched in the Seam camp. Seam does not play well with Maven, or maybe it's Maven that doesn't play well with Seam (Embedded JBoss to be specific, but others have found ways around this [http://www.google.com/search?q=seamtest+maven&hl=en, http://www.seamframework.org/Community/SeamTestCoverageMavencobertura, https://jira.jboss.org/jira/browse/JBSEAM-2371, http://www.seamframework.org/Documentation/SeamWithMavenOverview to name a few]). Those who have been using Seam with Maven are familiar with not being able to run their Seam tests easily, unless they know to put their test scoped dependencies first in the pom. There are some other issues I have with Maven, but this is not a post about how much Maven sucks. You can google for those; there are a lot of them.

Back to Ivy. A few months ago my friend Dan Allen blogged about dependency management in a seam-gen project with Ivy (http://in.relation.to/Bloggers/ManagingTheDependenciesOfASeamgenProjectWithIvy); see his post for a decent intro to Ivy. In his code download he was unable to set up the dependencies needed for testing his project. In this post I'm going to explain why Dan ran into problems and the relationship between Maven scopes and Ivy configurations, as well as provide an updated version of his Ivy-ized seam-gen download.

IVY CONFIGURATIONS

I believe a little background information about Ivy configurations is in order. If you're coming from the Maven world, they are somewhat similar to dependency scopes and profiles. Because Ant really has no concept of a build life cycle the way Maven does (one of the things I do like about Maven), Ivy doesn't either. So if Ivy configurations aren't really Maven dependency scopes, what exactly are they? The official Ivy site calls them "views on your module" (http://ant.apache.org/ivy/history/trunk/tutorial/conf.html). Personally, I still find that concept difficult to wrap my head around. The definition I have come up with is this: an Ivy configuration is a labeled grouping of a project's publications and that grouping's dependencies. Perhaps that's similar to the Ivy site's definition, but it helped me understand what was going on and how to create my own configurations.

Unlike Maven scope names, Ivy configuration names are completely arbitrary, which, as you guessed, is both a good and a bad thing. It's a bad thing when you share your application, module, or whatever with someone else and they use it as a dependency: they'll have to read the ivy.xml you created to determine the correct configurations to use. With Maven, we were given the scopes and we couldn't change them. I would suggest defining a company-wide set of configurations, or at least listing the public ones in a README or something if you are distributing your project. You could also use the makepom Ant task and upload the result to a Maven repository, but that's a different post :)

An Ivy configuration may also be used in mapping and tying together dependencies. A full discussion with examples is available at http://ant.apache.org/ivy/history/latest-milestone/ivyfile/dependency.html under the Configurations Mapping section (sorry, they didn't include an anchor for that section). In its most basic form it looks like this: conf="my_conf->other_conf", which translates to: my_conf of your module depends on other_conf of the dependency. There are some special wildcards and other kinds of mappings you can do, which are described at the above link. You can also specify multiple mappings within the same attribute by separating them with a semicolon. Very handy for, say, depending on the module itself and also its source. I know it sounds a little odd to depend on the source of a module, but that's how Ivy sees it.
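To make the mapping syntax concrete, here is a minimal ivy.xml sketch. The organisation com.example, the module name my-app and the configuration names are hypothetical; the sketch also assumes the dependency exposes configurations named default and sources, which is the case for modules Ivy converts from a Maven POM.

```xml
<ivy-module version="2.0">
   <info organisation="com.example" module="my-app"/>
   <configurations>
      <!-- names are arbitrary; these merely mimic the Maven scopes -->
      <conf name="compile" description="first level dependencies only"/>
      <conf name="runtime" extends="compile"/>
      <conf name="test" extends="runtime" visibility="private"/>
      <conf name="sources" description="source jars, e.g. for the IDE"/>
   </configurations>
   <dependencies>
      <!-- two mappings in one attribute, separated by a semicolon:
           my "test" conf needs the "default" conf of testng,
           my "sources" conf needs the "sources" conf of testng -->
      <dependency org="org.testng" name="testng" rev="5.6"
                  conf="test->default;sources->sources"/>
   </dependencies>
</ivy-module>
```

The left-hand side of each arrow always names a configuration of your own module, the right-hand side a configuration of the dependency.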

MAPPING MAVEN SCOPES TO IVY CONFIGURATIONS

Thanks for humoring me through that long block of text to finally get to the point of this post. As Dan mentioned in his blog post, Ivy can read from Maven repos (a very good move on Ivy's part, I believe), but in order to do this it has to convert the Maven POM to an Ivy file. When you set up a Maven repository as an Ivy resolver within an Ivy settings file, there are a couple of attributes which affect the resulting Ivy file (http://ant.apache.org/ivy/history/latest-milestone/resolver/ibiblio.html). The first one is m2compatible, which is always going to be true if you're using a Maven 2 repository. The second attribute is usepoms, which is true by default if m2compatible is true. I can understand why you would set it to false to conserve bandwidth (although the saving is minor) and reduce network traffic (one less call to make), but you do lose some things in the ivy.xml file Ivy creates for the dependency. Below are two ivy.xml files for the same module, converted from a Maven repository. The first one is not using the pom:
<ivy-module version="1.0">
   <info organisation="org.testng" module="testng" revision="5.6"
         status="release" publication="20081031232755" default="true"/>
   <configurations>
      <conf name="default" visibility="public"/>
   </configurations>
   <publications>
      <artifact name="testng" type="jar" ext="jar" conf="default"/>
   </publications>
</ivy-module>
The second one is using the pom:
<ivy-module version="1.0" xmlns:m="http://ant.apache.org/ivy/maven">
 <info organisation="org.testng" module="testng" revision="5.6" status="release" publication="20071116012303">
  <license name="Apache License, Version 2.0" url="http://apache.org/licenses/LICENSE-2.0"/>
  <description homepage="http://testng.org">
  TestNG is a unit testing framework.
  </description>
  <m:maven.plugins>org.codehaus.mojo__dependency-maven-plugin__null|org.apache.maven.plugins__maven-clean-plugin__null|org.apache.maven.plugins__maven-jar-plugin__null|org.apache.maven.plugins__maven-source-plugin__null</m:maven.plugins>
 </info>
 <configurations>
  <conf name="default" visibility="public" description="runtime dependencies and master artifact can be used with this conf" extends="runtime,master"/>
  <conf name="master" visibility="public" description="contains only the artifact published by this module itself, with no transitive dependencies"/>
  <conf name="compile" visibility="public" description="this is the default scope, used if none is specified. Compile dependencies are available in all classpaths."/>
  <conf name="provided" visibility="public" description="this is much like compile, but indicates you expect the JDK or a container to provide it. It is only available on the compilation classpath, and is not transitive."/>
  <conf name="runtime" visibility="public" description="this scope indicates that the dependency is not required for compilation, but is for execution. It is in the runtime and test classpaths, but not the compile classpath." extends="compile"/>
  <conf name="test" visibility="private" description="this scope indicates that the dependency is not required for normal use of the application, and is only available for the test compilation and execution phases." extends="runtime"/>
  <conf name="system" visibility="public" description="this scope is similar to provided except that you have to provide the JAR which contains it explicitly. The artifact is always available and is not looked up in a repository."/>
  <conf name="sources" visibility="public" description="this configuration contains the source artifact of this module, if any."/>
  <conf name="javadoc" visibility="public" description="this configuration contains the javadoc artifact of this module, if any."/>
  <conf name="optional" visibility="public" description="contains all optional dependencies"/>
 </configurations>
 <publications>
  <artifact name="testng" type="jar" ext="jar" conf="master"/>
  <artifact name="testng" type="source" ext="jar" conf="sources" classifier="sources"/>
 </publications>
 <dependencies>
  <dependency org="ant" name="ant" rev="1.6.5" force="true" conf="...->compile(*),master(*)"/>
  <dependency org="junit" name="junit" rev="3.8.1" force="true" conf="...->compile(*),master(*);runtime->runtime(*)"/>
  <dependency org="qdox" name="qdox" rev="1.6.1" force="true" conf="...->compile(*),provided(*),runtime(*),master(*)"/>
  <dependency org="org.beanshell" name="bsh" rev="2.0b4" force="true" conf="...->compile(*),provided(*),runtime(*),master(*)"/>
 </dependencies>
</ivy-module>
The ivy.xml generated with the pom contains much more information, notably more configurations, publications, and dependencies. The configurations in the last file are ones that Ivy places in every ivy.xml converted from a Maven POM. You can always rely on them being there if usepoms is set to true. This part, along with not really understanding Ivy configurations (yes, they are difficult to understand, as the official documentation isn't that great on the subject), is where I believe Dan had problems. I suggest always enabling usepoms, because you get everything you normally would if you were using Maven, and it is easier to craft your own configurations with dependencies. In the download (a modified version of what is in Seam 2.1.2) you'll see how powerful this can be for your builds and dependency management with Ivy.

If you find / know of a better way to accomplish what I've done, please post a comment and I will make corrections as needed. I hope this helped at least one person better understand Ivy configurations. I spent about a week diving through documentation, forum postings and the Ivy source code to figure this out; I hope you don't have to go through the same experience :) In the next part I will demonstrate how to use Ivy to manage your dependencies without needing to add them to your source control repository.
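For reference, a minimal ivysettings.xml sketch of the resolver setup discussed above, with usepoms switched on explicitly. The resolver name maven2 is an arbitrary, hypothetical label, and because no root attribute is given, the sketch assumes the ibiblio resolver falls back to its default Maven 2 repository location.

```xml
<ivysettings>
   <!-- use the resolver defined below for every module -->
   <settings defaultResolver="maven2"/>
   <resolvers>
      <!-- m2compatible: the repository uses the Maven 2 layout
           usepoms: read the POM so the generated ivy.xml carries the
                    Maven scopes as configurations, plus the dependencies -->
      <ibiblio name="maven2" m2compatible="true" usepoms="true"/>
   </resolvers>
</ivysettings>
```

Setting usepoms="false" on the same resolver would be expected to give the stripped-down ivy.xml with only the default configuration, like the first of the two files shown earlier.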