Hibernate::
beyond "hello world"

 


Download sources

Contact us::

Researching Hibernate Fetching strategies.

Last update: Oct-7-2005

This seems to be a common complain: Hibernate sucks because it does not fetch related objects with JOIN, but uses separate selects to get those objects.
Well, it is not that simple:
  • firstly - Hibernate fetching strategy is tunable and we can specify fetch="join";
  • secondly - separate selects might be beneficial for performance;
Lets try to compare 'select' and 'join' strategies in this scenario:
There is a MainObject (10000 instances) that references ReferencedObject (10 instances), now let's try to get all the MainObjects from database using our competing strategies.
Lets populate the database:
 
DBSetterImpl.java
public class DBSetterImpl extends AbstractDBSetter { 
16    
17     public void createTables( Connection c ) throws IOException, SQLException{ 
18       execScript( c, this.getClass().getResourceAsStream( "data/create-db.sql")); 
19     } 
20    
21     public void _dropTables( Connection c ) throws Exception{ 
22       execScriptTolerantly( c, this.getClass().getResourceAsStream( "data/drop-db.sql")); 
23     } 
24    
25     public void generateTestData( Connection c ) throws SQLException{ 
26       int refObjIdLimit = 10; 
27       for( int i = 0; i <= refObjIdLimit;i++){ 
28         exec( c, "INSERT INTO fs_ref_objects (id, name) VALUES ( '"+ i +"'," + 
29             "'ref-obj-"+ i +"dhjbskfhjgkjhskdf ksjhdfhkgjhsgdkfjhgkshjdf 
                       ksdjfhgvksjhdfgk sddkjhfgkjshdff sdjhfgkjhg dsfk sdgfkjhgsdfh gksdfg')"); 
30       } 
31       for( int i = 0; i < 10000; i++){ 
32         exec( c, "INSERT INTO fs_main_objects (id, name, ref_obj_id) 
                 VALUES ( '"+ i +"','main-obj-"+ i +"','"+ (i % 10) +"')"); 
33       } 
34     } 
35    
36     public static void main( String[] args ){ 
37       try{ 
38         TestUtilities.run( new DBSetterImpl() ); 
39       } catch( Exception e ){ 
40         e.printStackTrace(); 
41       } 
42     } 
43   } 
44   

And then run our test:
FetchStrategyTest.java
 
45    
46     public void execSelect() throws IllegalAccessException, NoSuchMethodException, 
                                       InvocationTargetException{ 
47       List res = new ArrayList(); 
48       for( int i = 0;i< 10000;i++){ 
49          res.add( TestUtilities.getHSession().get( MainObject.class, String.valueOf( i )) ); 
50       } 
51       TestUtilities.walkThroughResult( res ); 
52     }


We will execute the same code, but with a differently configured Hibernate Session. Here is the configuration with fetch="join"

and this is how we configure Hibernate to use 'select' strategy

Helper utilities execute our testing code 3 times to allow us to estimate an average.

Let's see what our test results are:

run-class:
[java] SELECT STRATEGY
[java] 0 [main] WARN net.sf.ehcache.config.Configurator - No configuration found. Configuring ehcache from ehc
ache -failsafe.xml found in the classpath: jar:file:/C:/Documents%20and%20Settings/kosta/.m2/repository/net/sf/ehcache/
ehcache/1.1-1/ehcache-1.1-1.jar!/ehcache-failsafe.xml
[java] 359 [main] INFO org.springframework.aop.framework.DefaultAopProxyFactory - CGLIB2 available: proxyTargetC
lass feature enabled
[java] 359 [main] INFO org.springframework.core.CollectionFactory - JDK 1.4+ collections available
[java] 375 [main] INFO org.springframework.core.CollectionFactory - Commons Collections 3.x available
[java] Pass #1
[java] hibernateSessionFactory = org.hibernate.impl.SessionFactoryImpl@cafb56
[java] execSelect 18359 ms
[java] Pass #2
[java] hibernateSessionFactory = org.hibernate.impl.SessionFactoryImpl@cafb56
[java] execSelect 15750 ms
[java] Pass #3
[java] hibernateSessionFactory = org.hibernate.impl.SessionFactoryImpl@cafb56
[java] execSelect 13985 ms
[java] Average::16031.333333333334 ms
[java] JOIN STRATEGY
[java] Pass #1
[java] hibernateSessionFactory = org.hibernate.impl.SessionFactoryImpl@14dc0f2
[java] execSelect 17375 ms
[java] Pass #2
[java] hibernateSessionFactory = org.hibernate.impl.SessionFactoryImpl@14dc0f2
[java] execSelect 17281 ms
[java] Pass #3
[java] hibernateSessionFactory = org.hibernate.impl.SessionFactoryImpl@14dc0f2
[java] execSelect 17187 ms
[java] Average::17281.0 ms
[java] JOIN STRATEGY - DOES NOT WORK WITH HQL
[java] Pass #1
[java] hibernateSessionFactory = org.hibernate.impl.SessionFactoryImpl@5dfb32
[java] execHQLSelect 844 ms
[java] Pass #2
[java] hibernateSessionFactory = org.hibernate.impl.SessionFactoryImpl@5dfb32
[java] execHQLSelect 687 ms
[java] Pass #3
[java] hibernateSessionFactory = org.hibernate.impl.SessionFactoryImpl@5dfb32
[java] execHQLSelect 656 ms
[java] Average::729.0 ms
Now that is quite interesting!
For 'select' strategy we see steady decline in the execution time: 18359 ms -> 15750 ms -> 13985 ms
and for 'join 'strategy' we see roughly the same execution time: 17375 ms -> 17281 ms -> 17187 ms
JIT optimization perhaps?
Now the last test: when we use HQL query to get list of all 10000 objects, then timing is way better that before,
BUT: Hibernate (3.0.5) does not use JOIN to get related objects, it uses default 'select' strategy as we can see in the postgres log:
2006-01-08 20:45:04 STATEMENT: BEGIN
2006-01-08 20:45:04 LOG: statement: select mainobject0_.id as id, mainobject0_.name as name5_, mainobject0_.ref_obj_id as ref3_5_ from fs_main_objects mainobject0_
2006-01-08 20:45:04 STATEMENT: select mainobject0_.id as id, mainobject0_.name as name5_, mainobject0_.ref_obj_id as ref3_5_ from fs_main_objects mainobject0_
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 STATEMENT: select referenced0_.id as id0_, referenced0_.name as name4_0_ from fs_ref_objects referenced0_ where referenced0_.id=$1
2006-01-08 20:45:04 LOG: statement: COMMIT
2006-01-08 20:45:04 STATEMENT: COMMIT
2006-01-08 20:45:04 DEBUG: CommitTransaction
 
Despite that the execution time of HQL is much better than fetching objects one by one: 884 ms->687 ms->656 ms - that is the time to execute 21 queries. Significantly bettert than 40-53 seconds which we need to execute 10000 or 10020 queries.
Conclusions:
  • there is definitely something to show to your DBA and discuss if Hibernate is really that bad :) for DB access;
  • you should consider your usage pattern and see what would work best for you: in most cases Hibernate provides us with enough flexibility to fine tune DB access strategies.
Another thing to consider: the tests do not use Hibernate second level cache to store reemergence objects, and that might give additional boost to the performance of "select" strategy, because in many cases objects will be readily available from 2nd level cache and can be shared between Hibernate sessions.
Environment: The test was performed on HP 7000 NB P4 3.4GHz 1Gb, Postgres 8.0.1
C:\dev\sl\hb-beyond-hw>java -version
java version "1.4.2_06"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_06-b03)
Java HotSpot(TM) Client VM (build 1.4.2_06-b03, mixed mode)