Saturday, January 21, 2012

Mapping-by-Code - Set and Bag

It's time for a huge topic - collections. Ayende's original post was about <set> in the context of a one-to-many relationship. I'm going to change that a bit and start by showing how to map sets and bags using mapping-by-code vs. Fluent NHibernate, ignoring the relation context (i.e. one-to-many or many-to-many), which is clearly separated both in XML mapping and in mapping-by-code. I'll describe the different relation type mappings handled by sets and bags in a separate post.

Unlike in Fluent NHibernate, and just as in XML mapping, the collection type is the starting point of the mapping. The options for <set> and <bag> are exactly the same; the two differ only in the method name - Set vs. Bag. I'll show Set here; for Bag just change the method called in the first line.

Set(x => x.Users, c =>
{
    c.Fetch(CollectionFetchMode.Join); // or CollectionFetchMode.Select, CollectionFetchMode.Subselect
    c.BatchSize(100);
    c.Lazy(CollectionLazy.Lazy); // or CollectionLazy.NoLazy, CollectionLazy.Extra

    c.Table("tableName");
    c.Schema("schemaName");
    c.Catalog("catalogName");

    c.Cascade(Cascade.All);
    c.Inverse(true);

    c.Where("SQL command");
    c.Filter("filterName", f => f.Condition("condition"));
    c.OrderBy(x => x.Name); // or SQL expression

    c.Access(Accessor.Field);
    c.Sort<CustomComparer>();
    c.Type<CustomType>();
    c.Persister<CustomPersister>();
    c.OptimisticLock(true);
    c.Mutable(true);

    c.Key(k =>
    {
        k.Column("columnName");
        // or...
        k.Column(x =>
        {
            x.Name("columnName");
            // etc.
        });

        k.ForeignKey("collection_fk");
        k.NotNullable(true);
        k.OnDelete(OnDeleteAction.NoAction); // or OnDeleteAction.Cascade
        k.PropertyRef(x => x.Name);
        k.Unique(true);
        k.Update(true);
    });

    c.Cache(x =>
    {
        x.Include(CacheInclude.All); // or CacheInclude.NonLazy
        x.Usage(CacheUsage.ReadOnly); // or CacheUsage.NonstrictReadWrite,
                                      // CacheUsage.ReadWrite, CacheUsage.Transactional
        x.Region("regionName");
    });

    c.SqlDelete("SQL command");
    c.SqlDeleteAll("SQL command");
    c.SqlInsert("SQL command");
    c.SqlUpdate("SQL command");
    c.Subselect("SQL command");
    c.Loader("loaderRef");
}, r =>
{
    // one of the relation mappings (to be described separately)
    r.Element(e => { });
    r.Component(c => { });
    r.OneToMany(o => { });
    r.ManyToMany(m => { });
    r.ManyToAny<IAnyType>(m => { });
});

Whoa! The first parameter, not surprisingly, is the lambda expression for the collection property we're mapping. The second one lets us configure the set/bag using a bunch of options. The third one, which is optional, defines the type of relation the collection takes part in - one-to-many by default (I'll write about relation types separately).

Let's see what options set and bag have to offer and how they differ from XML mappings.
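To put the three parameters in context, here is a minimal sketch of a complete mapping class. The Group and User entities and the GroupId column name are hypothetical, and the sketch assumes an NHibernate version that works with System.Collections.Generic.ISet<T> (older versions relied on the Iesi.Collections interface instead).

```csharp
using System.Collections.Generic;
using NHibernate.Mapping.ByCode;
using NHibernate.Mapping.ByCode.Conformist;

// Hypothetical entities, introduced only for this illustration.
public class User
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public class Group
{
    public virtual int Id { get; set; }
    public virtual ISet<User> Users { get; set; } = new HashSet<User>();
}

public class GroupMap : ClassMapping<Group>
{
    public GroupMap()
    {
        Id(x => x.Id, m => m.Generator(Generators.Native));

        Set(
            x => x.Users,                        // 1st: the collection property
            c =>                                 // 2nd: collection-level options
            {
                c.Key(k => k.Column("GroupId")); // assumed foreign key column name
                c.Cascade(Cascade.All);
            },
            r => r.OneToMany());                 // 3rd: relation type
    }
}
```

User would need its own mapping class as well; it is left out here to keep the focus on the Set call.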

Fetch, BatchSize and Lazy define how and when the collection is loaded from the database. Table, Schema and Catalog say where to look for the collection data in the database. Inverse is useful for bidirectional relationships and defines which side is responsible for writing.

Cascade tells how operations on the entity affect the collection elements. Note that in mapping-by-code it is redefined a bit. Here are the possible values and their corresponding XML values (note that All still doesn't literally mean all values; DeleteOrphans still needs to be specified separately).

[Flags]
public enum Cascade
{
    None = 0,            // none
    Persist = 2,         // save-update, persist
    Refresh = 4,         // refresh
    Merge = 8,           // merge
    Remove = 16,         // delete
    Detach = 32,         // evict
    ReAttach = 64,       // lock
    DeleteOrphans = 128, // delete-orphans
    All = 256,           // all
}

Combining the values can be done using the bitwise | operator or the syntax provided through extension methods:

Cascade.All.Include(Cascade.DeleteOrphans);
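The reason All does not imply DeleteOrphans is visible in the numbers: All is its own bit (256), not a union of the others. The following sketch uses a local copy of the enum (named CascadeSketch here, a hypothetical name) so that it runs without referencing NHibernate; it only demonstrates the flag arithmetic, not the library itself.

```csharp
using System;

// Local copy of the flag values listed above; not the NHibernate type itself.
[Flags]
public enum CascadeSketch
{
    None = 0,
    Persist = 2,
    Refresh = 4,
    Merge = 8,
    Remove = 16,
    Detach = 32,
    ReAttach = 64,
    DeleteOrphans = 128,
    All = 256,
}

public static class Program
{
    public static void Main()
    {
        // All on its own does not carry the DeleteOrphans bit.
        Console.WriteLine(CascadeSketch.All.HasFlag(CascadeSketch.DeleteOrphans)); // False

        // Bitwise OR sets both bits: 256 + 128.
        CascadeSketch combined = CascadeSketch.All | CascadeSketch.DeleteOrphans;

        Console.WriteLine((int)combined);                                 // 384
        Console.WriteLine(combined.HasFlag(CascadeSketch.DeleteOrphans)); // True
    }
}
```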

Coming back to the bag/set options: Where, Filter and OrderBy affect what data is loaded from the database. Where and OrderBy allow us to specify any SQL expression for narrowing and sorting the collection at the database level. Furthermore, OrderBy has an overload that makes it possible to specify the ordering by a lambda expression, too. Filter is even more powerful - see another of Ayende's posts for an explanation.

All the options above, up to the Key call in my example, have corresponding XML attributes in the <bag> or <set> XML element. Key and all further options are defined in XML as separate elements inside <bag> or <set>.

Key defines the key column in the table that holds the collection elements; some DDL parameters and behaviors are configurable there. The Key method is a direct equivalent of the mandatory <key> element in XML.
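As an illustration of the Key options, here is a sketch of a bag whose foreign key is declared not-null and deleted by the database through ON DELETE CASCADE. The Order and OrderLine entities, the OrderId column and the constraint name are all hypothetical. Note that NHibernate accepts on-delete cascade only on inverse one-to-many collections, hence the Inverse(true) call.

```csharp
using System.Collections.Generic;
using NHibernate.Mapping.ByCode;
using NHibernate.Mapping.ByCode.Conformist;

// Hypothetical entities, introduced only for this illustration.
public class OrderLine
{
    public virtual int Id { get; set; }
}

public class Order
{
    public virtual int Id { get; set; }
    public virtual IList<OrderLine> Lines { get; set; } = new List<OrderLine>();
}

public class OrderMap : ClassMapping<Order>
{
    public OrderMap()
    {
        Id(x => x.Id, m => m.Generator(Generators.Native));

        Bag(x => x.Lines, c =>
        {
            c.Inverse(true); // needed for OnDeleteAction.Cascade below

            c.Key(k =>
            {
                k.Column("OrderId");                 // assumed column name
                k.ForeignKey("fk_orderline_order");  // assumed constraint name
                k.NotNullable(true);
                k.OnDelete(OnDeleteAction.Cascade);  // let the database remove the rows
            });
        }, r => r.OneToMany());
    }
}
```

OrderLine would need its own mapping class, omitted here for brevity.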

Cache controls the second-level cache behavior for the collection. It's the equivalent of the <cache> element in XML.

All the SqlXYZ methods set up custom ways of reading and writing the collection to the database, and all of them have corresponding XML elements, too - useful if we have to use stored procedures for data access. In XML mappings, it's also possible to tell NHibernate to check what was returned from the procedure, but this doesn't seem to be supported by mapping-by-code.

The only attribute of Set and Bag that is not configurable in mapping-by-code is generic. It determines whether the collection type used is generic. But by default, NHibernate checks this on its own by reflection, and I can't see why anyone would need to override this behavior.

Fluent NHibernate's equivalent

Fluent NHibernate redesigned the approach established by XML mappings and made the relation type the entry point (and the only required part) of the mapping. Moreover, only relations with entities on the other side are considered "first-class citizens" by FNH. So there are only two entry methods: HasMany and HasManyToMany. Element and component (composite element) mappings are hidden inside HasMany as its options. Many-to-any doesn't seem to be supported.

As I'm not describing relationship types in this post, I'll pick one-to-many (HasMany) as an example just to have an entry point, and I'll skip the options connected with the relationship itself, focusing on the collection and key column options. This way it covers about the same options as the mapping-by-code example.

HasMany(x => x.Users)
    .AsSet<CustomComparer>() // or .AsSet(), .AsBag()
    .Fetch.Join()
    .BatchSize(100)
    .LazyLoad() // or .ExtraLazyLoad()
    .Table("tableName")
    .Schema("schemaName")
    .Cascade.AllDeleteOrphan() // or .None(), .SaveUpdate(), .All(), .DeleteOrphan()
    .Inverse()
    .Where("SQL command") // or a boolean lambda expression
    .ApplyFilter("filterName", "condition")
    .OrderBy("SQL expression")
    .Access.Field()
    .CollectionType<CustomType>()
    .Persister<CustomPersister>()
    .OptimisticLock.Version() // buggy
    .ReadOnly()
    .Generic()
    .KeyColumn("columnName")
    .ForeignKeyConstraintName("collection_fk")
    .Not.KeyNullable()
    .ForeignKeyCascadeOnDelete()
    .PropertyRef("propertyRef")
    .KeyUpdate()
    .Subselect("SQL command")
    .Cache.IncludeAll() // or .IncludeNonLazy(), .CustomInclude("customInclude")
        .ReadOnly() // or .NonStrictReadWrite(), .ReadWrite(), .Transactional(), .CustomUsage("customUsage")
        .Region("regionName");

The list of possible options is quite long, too. Let's go through it and note the major differences.

The AsSet method needs to be called to change the default collection type (bag) to set. I've already mentioned that I don't like the fact that something as fundamental as the collection type is just an ordinary, optional switch in FNH. Moreover, the overloads with a comparer (the equivalent of Sort in mapping-by-code) make AsSet look as if the comparer were the only reason for its existence. And that's obviously not the case.

Several options have names changed in FNH:

  • Filter and its Condition are merged into ApplyFilter,
  • Type is CollectionType this time (instead of FNH's standard CustomType),
  • Column is KeyColumn,
  • ForeignKey is ForeignKeyConstraintName,
  • NotNullable is Not.KeyNullable,
  • OnDelete(OnDeleteAction.Cascade) is mapped by ForeignKeyCascadeOnDelete,
  • Update is KeyUpdate,
  • and finally, quite confusingly, Mutable is ReadOnly in FNH. It wouldn't be so surprising if FNH didn't use ReadOnly as a shortcut for Not.Update and Not.Insert in other mappings. And Mutable is something different.

Several options are not supported in FNH, namely Catalog, Unique, Loader and custom SQL queries (the last is surprising, as custom SQL queries are available in component mapping, where they shouldn't be). In Cascade, there are only a few options - the less used ones like refresh are not supported.

OptimisticLock is buggy. It allows us to define concurrency strategies that are valid at the entity level only. For collections, OptimisticLock is just a boolean flag. Running the example above (with the invalid .Version() call) results in an XML validation error.

There's also the problem of the interface being too fluent, again. When defining cache options, we can in fact define all its values together, which makes no sense.

m.Cache.IncludeAll().IncludeNonLazy().CustomInclude("customInclude")
    .ReadOnly().NonStrictReadWrite().ReadWrite().Transactional().CustomUsage("customUsage");

Moreover, once we step into the cache configuration, we have no way to go back to the collection-level options - so the Cache configuration needs to be at the end.
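In practice this ordering constraint means writing all collection-level options first and finishing the chain with the cache. A minimal sketch, with hypothetical Group and User entities and assumed column and region names:

```csharp
using System.Collections.Generic;
using FluentNHibernate.Mapping;

// Hypothetical entities, introduced only for this illustration.
public class User
{
    public virtual int Id { get; set; }
}

public class Group
{
    public virtual int Id { get; set; }
    public virtual ISet<User> Users { get; set; } = new HashSet<User>();
}

public class GroupMap : ClassMap<Group>
{
    public GroupMap()
    {
        Id(x => x.Id);

        HasMany(x => x.Users)
            .AsSet()
            .KeyColumn("GroupId")          // assumed column name
            .Inverse()
            .Cascade.AllDeleteOrphan()
            // Cache goes last: everything after .Cache configures the cache,
            // and there is no way back to the collection-level options.
            .Cache.ReadWrite().Region("groupUsers"); // assumed region name
    }
}
```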

An interesting option available only in Fluent NHibernate is specifying the Where condition using an expression. It seems to work only if property names are equal to column names, but anyway, it's much better than a plain SQL expression in a magic string (for more complicated cases, that fallback is available, too).

Surprisingly, PropertyRef is the opposite - it needs to be specified as a string in FNH, whereas mapping-by-code supports a strongly-typed expression there.
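A sketch of both forms of Where, side by side. The entities, the Role property and the GroupId column are hypothetical, and the sketch assumes that the Role property is stored in a column of the same name, which is the case where the expression form is expected to work.

```csharp
using System.Collections.Generic;
using FluentNHibernate.Mapping;

// Hypothetical entities, introduced only for this illustration.
public class User
{
    public virtual int Id { get; set; }
    public virtual string Role { get; set; }
}

public class Group
{
    public virtual int Id { get; set; }
    public virtual IList<User> Admins { get; set; } = new List<User>();
}

public class GroupMap : ClassMap<Group>
{
    public GroupMap()
    {
        Id(x => x.Id);

        HasMany(x => x.Admins)
            .KeyColumn("GroupId")            // assumed column name
            // Strongly-typed form: checked by the compiler, survives renames.
            .Where(u => u.Role == "admin");

        // String fallback for conditions the expression form cannot handle:
        //     .Where("Role = 'admin'")
    }
}
```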

3 comments:

  1. IMO the collection type (set, bag, ...) being a switch is useful when using conventions to set the actual type depending on some condition, e.g. the child type. So I could Fluent- or Automap the collection and set the type afterwards.

  2. How do you set a composite key as the foreign key when defining the Bag/Set?

    1. Use Columns() instead of Column() in Key options
