16
Mar
Mar
ETL vs. ESB
A few months ago, prior to the release of the Apatar Community Preview, Matt Asay asked me via email whether Apatar competes with ESB products, specifically with MuleSource and ServiceMix. After I responded to Matt by email, I thought about posting my response to a blog, and Matt said, “Go ahead.” In fact, two people asked me a similar question during the MySQL User Conference, which reminded me about the email that I’m posting below.
The short answer is that Apatar is an ETL (Extract, Transform, and Load) technology; ETL does not compete, but compliments ESB products across different information integration scenarios. ServiceMix and Mule are ESB products. ESB is a standards-oriented (in the SOA age) EAI/EII technology. Therefore, I will address the comparison question, “ETL vs. ESB” from an ETL vs. EAI/EII point of view. EAI, by the way, is a term coined by Dave Linthicum in his book (published in 1999) called, “Enterprise Application Integration.”
ETL is geared toward data movement, typically in batch modes across the enterprise. It is “pull” technology and works on user’s demand or on schedule.
ESB is a “push” technology, sending messages when they occur.
| ETL is a “pull” technology, works on demand/on schedule. |
ESB is a “push” technology. |
| ETL cannot time-out, decay, or issue transactions to front-office applications during transformation processes. |
ESB is capable of timing and decaying data in queues, escalating information content to the right decision-maker on that piece of content. |
| ETL is fully scalable, capable of loading massive batches of data in parallel. |
ESB is not suitable for massive volumes of data because of its service bus architecture (by network, and source system speed to X transactions per second). |
| ETL can hook to ESB/EAI middleware as just another feed, if desired. |
ESB’s primary job is to integrate applications, opposed to Data Migration, Replication, Data Warehousing, and BI. |
I can also refer you to a comparison table I found at http://www.coreintegration.com/solutions/di.asp:
| Feature |
ETL |
EAI |
EII |
| Timing |
Batch snapshots |
Real time |
Real time |
| Unit of work |
Set of transactions committed within an ETL cycle interval |
Single business transaction |
Single business transaction |
| Historical Record |
Yes |
No |
No |
| Persistent Auxiliary Tables |
Yes |
No. Transactions applied directly to applications’ tables |
No. Virtual database |
| Application |
Managerial reporting, trend analysis, multi-dimensional aggregation |
Near real time synchronization of operational data where transaction commitment is dependent on state of related transactions |
Near real time decision making based on most current information in operational systems. No update. |
| What it’s not |
Not source of record. Does not support transaction processing |
Not appropriate for ad-hoc analysis and reporting |
Not a virtual data warehouse |
At the end of the day, I think that business users have to consider their unique requirements, and pick the technology accordingly. There are different horses for different courses. Be it Apatar, Mule ESB, DataStage, Yahoo! Pipes–whatever works.
In my opinion, as more vendors enter the information integration space, they are confusing things even more because of the differences in technology they are offering. No two are alike, but all are calling themselves data integration vendors. Go figure.








Posted by Renat Khasanshyn
6,544 Views 


June 21st, 2007 at 2:28 am
[…] » ETL vs. ESB Naked Open Source; A blog by Renat Khasanshyn on the Intersection of Open Source, Enterprise 2.0, Mashups, Office 2.0, and Apatar Open Source ETL (Extract, Transform, Load) A view on ETL vs. ESB by Renat Khasanshyn of Apatar. I met Renat yesterday at the Mass Tech Leadership Council Open Source Summit. His company is focused on open source ETL capabilities. (tags: ETL ESB Renat Khasanshyn Open Source blog data integration) Posted by Debbie Filed in Links […]