EAI & ETL : When you should choose one over the other

Posted by Avinash Rao on Monday, 6 February, 2012

As part of our quarterly review process of all the FAQs, we have decided to touch upon the most frequently talked about subject, ETL vs EAI. In this article we look to address some of the pain points involved in the decision making process between the two, what they mean to your Business and when you should draw boundaries between these two  integration techniques.

As part of any data integration strategy, it is a golden rule of thumb to follow ‘prevent and prepare rather than repent and repair’. In most cases concerning Business Intelligence initiatives, if you know what you don’t need, you are already half way through in the quest to what you want to achieve. Business Enterprises invest millions of dollars on BI initiatives and Data Warehousing and hence rely on a consistent, accurate and reliable data. Thus following proper integration techniques becomes all the more vital. With concerned to enterprise-integration strategy, EAI is essential for maintaining a perfect harmony between business needs and IT solutions where as Data integration, a function of ETL is a constant need as inconsistent data at the heart of any BI initiative fails to provide an accurate picture of the Business. Hence the vital question EAI or ETL?

What is EAI (Enterprise Application Integration)?

EAI may be defined as a process of aligning a business’s strategic vision with its information technology. In other words EAI is mainly process integration that exists to allow client applications to operate on data from across the business through API calls without regard to its location, encapsulating technology or format. It involves integration of incompatible business applications within and beyond enterprise to allow them to talk to each other seamlessly and to share data in real time.

EAI solutions enable the automation of end-to-end business processes by coordinating sequences of tasks and resources (both systems and people) that perform them. EAI solutions support sophisticated exception management and the dynamic modification of processes even when processes are underway. EAI involves developing a ‘unified view’ of an enterprise’s business and its applications, seeing how existing applications fit into the scheme of things, and then devising ways to efficiently reuse what already exists while adding new applications and data.

EAI integration is mostly message-based, transaction-oriented, point-to-hub, brokering and transformation for application-to-application integration. Some of the core benefits offered by Enterprise Application Integration are.

  1. A focus on integrating both business-level processes and data
  2. A focus on reuse and distribution of business processes and data
  3. A focus on simplifying application integration by reducing the amount of detailed, application specific knowledge required by the users

What is ETL (Extraction, Transformation & Loading)?

As the acronym stands, ETL is a process of extraction / reading a database, transforming or validating the data, and lastly loading / writing to a database. Extract, Tranform & Load are three database functions that are combined into one tool (like DBSync) to pull data out of source databases and place it into target databases. ETL is used to migrate data from databases to others, to form data marts and data warehouses and also to convert databases from one format or type to another.

Extract - The process of reading data from source systems. Data can be extracted in schedule-driven pull mode or event–driven push mode. Pull mode operation supports data consolidation and is typically done in batch. Push mode operation is one online by propagating data changes to target data stores.

Transform - the process of converting the extracted data from its existing form into the format it needs to be in so that it can be placed into other systems or databases. Transformation occurs by using rules or lookup tables or by combining the data with other data.

Load - The process of creation and execution of workflows to write data into the target systems. Data loading may cause a complete refresh of a target data store or may be done by updating the target destination. Interfaces here include de facto standards like ODBC, JBDC, JMS, or application interfaces.

ETL is mainly designed to process very large amounts of data, ETL provides a suitable platform for

  1. Improved productivity by reuse of objects and transformations
  2. Strict Methodology
  3. Better Metadata Support, including impact analysis.

Drawing Boundaries for ETL & EAI

As mentioned in the above sections, both techniques rely on the concept of a unified view and the definition of a mapping that allows data from many disparate sources to be “projected” onto that view. But what differ are the purpose, speed and direction and amount of data that are transformed and placed within the unified view from the external sources.

Process-level integration mainly deals with building enterprise-wide business workflows and processes and incorporating existing applications into those processes. EAI middleware acts as the workflow engine integrating applications in near real time, passing small amounts of data through message queues and a series of stages. EAI tools provide much more complete workflow capabilities than ETL tools, which provide simple workflow.  EAI tools, and especially their workflow components, provide very sophisticated GUI development environments that enable design and management of very complex business processes. Here transformations are focused on ensuring a common understanding of the context and meaning (semantics) of the data involved within the message, the more likely a proven EAI is more appropriate.

With regards to data integration, ETL tools and also next generation ETL clearly holds an advantage whether in batch or real time. Synchronizing data between two applications involves a lot more data manipulation than simply moving data from point Source to Target. It involves data intense tasks that depend upon either RDBMS efficiencies/scalability or in-memory data caching to achieve the necessary throughput. Typically, enterprise data warehousing projects require you to move large amounts of data within relatively small windows of time. Also since ETL tools were born out of the relational database world, and thus are adept at performing SQL oriented transformations. They deal mainly with pulling data out of multiple relational tables, understanding the meaning and relationships between the tables, combining, merging, or joining that data, and augmenting it with data from their sources. Since this is more than just moving data from Source to Target, ETL is more appropriate.

When EAI

When ETL

High number of transactions

Large amounts of data

Message transforms

Complex transformations

Transformation act on a single row of data

Transformation is data-set oriented

Little data augmentation

Large data augmentation

1 to n ; m to n

Point-to-point

Suitable for real-time data needs

Suitable for large amounts of data

High volume, low footprint data exchange

Generally used to move data between 2 or more databases / data repositories


Master Data Management – How To Plan It

Posted by Manjunatha G on Wednesday, 15 June, 2011

Most of us are aware Master Data Management is not just managing data, but it’s beyond just MDM, it actually requires use of technology, data integration tools & processes to create and maintain consistent and accurate list of master data in organizations. Master Data Management or MDM typically called as might become a nightmare for many organizations either large or small if it’s not planned and implemented at the right time. So how organizations need to plan for it?

3 Steps to Plan for Master Data Management


1. Identify Master Data Requirements – First and foremost thing when you have decided to go for master data management is the need to evaluate your existing systems, processes, infrastructure, organization structure and more, then try to find out if your organization actually requires an MDM and start identifying the requirements by asking questions as what’s are the business values to be achieved with this information from MDM & more.

2. Business Process Requirements – Once you have identified your MDM requirement your next immediate things to do is to start asking questions related to business process like what needs to be managed, who would be managing the MDM, what are the processes to be considered, what are the integration tools required, who can access the information, what’s the format of display, who can alter or edit the information, how to maintain the master list should it be in single format or multiple formats/copies, what locations to be maintained and more.

3. Type of Analytical requirements – Next would be to find out what types of analytical requirements for your organizational MDM plan like for areas as business intelligence & management reporting for accurate information for various roles and responsibilities.

From these steps and information start building the MDM project plan & look out for vendors like IBM, Informatica, Oracle, SAP & others who can implement MDM for your organization. During the next few years Master data management might become mandatory due to increase in mergers & acquisitions worldwide. Developing a strong MDM for large or small business helps in organizations success.

Learn more about DBSync


Issues with substandard Data Quality

Posted by Manjunatha G on Wednesday, 9 March, 2011

Quality of data plays a major critical role in industries like retail, telecommunications & financial services with large customer base.  Most organizations falling under these industries in a rush to capture market share and to add new customers give secondary thought to quality of data captured, leading to substandard customer data with inaccurate, duplicate & outdated customer details.

Inaccurate or poor customer data is due to non standardization of data capturing process or format, multiple data capturing points across the organization & data storing in incompatible systems and formats.

Organizational impacts due to poor data quality

  • Business performance – Due to duplication of data it’s difficult for marketing to cross sell other products and services, profiling and segmentation of customers leading to decrease in business performance
  • Customer retention – Due to poor data quality cost of customer retention is high as it requires multiple interactions with customers to understand their actual requirements which would have been solved with  accurate data and past transaction details
  • Customer Mapping – Due to multiple entries of same customer with contradicting details it would lead to wrong mapping of customer s with wrong product or service leading to valuable customer loss and revenue to organization

So organizations with large customer data should ensure their data capturing methods are standardized across all data capturing points with latest data integration software for a satisfied customer and profitable organization.

Learn more about data integration by visiting www.mydbsync.com

Learn more about DBSync


Integrating QuickBooks & Lokad for accurate sales & inventory forecast: Next level forecasting

Posted by Manjunatha G on Tuesday, 1 March, 2011

Compared to few years back we are now in an ever changing business world on real time basis and to this fast changing dynamic world even retail chain stores from hyper markets to smaller retail stores & manufacturing companies have implemented sales & inventory forecasting software for quick and easy forecast either web or desktop based. Most of these forecasting software’s are developed with all the modern statistical forecasting techniques with features as inventory in hand, forecasting inventory required, scalability, stock level alerts & minimum customization.

Safety stock, Inventory Levels and supply chain

But what’s the next level of forecasting for more accurate inventory levels?

  • Anticipating the unexpected – Few of the larger retail chains around the world have customized their inventory forecasting models with anticipating unexpected and alternatively  have incorporated solutions in forecasting software. But what about the same features availability for small to medium size retail stores. If yes at what cost?

 

  • Collaborate with suppliers in sales, planning and inventory forecast – In today’s scenario it’s impossible for any manufacturer to manufacture all the items they require and the same has been outsourced to suppliers. Does your inventory planning software consider suppliers inventory too for your organizations inventory forecast? If not include them

 

  • Include your customers in planning your sales & forecasting inventory – All types of retail stores and manufacturing organization have loyal customers with regular purchase patterns.  Can your inventory forecasting software include your customer’s future requirements by understanding their purchase patterns? If yes then your forecasting software is ahead of it’s time

All the sales & inventory forecasting software’s currently available in the market do have forecasting sales & inventory capability confined to internal organizations inputs* only. If these software’s are included with all the next level forecasting inputs at affordable prices then it would change the way forecasting in done by collaborating with partners, suppliers & customer’s on real time basis.

For more detailed sales & inventory forecasting software visit www.mydbsync.com 

Learn more about DBSync & Lokad


Data Quality in Data Integration

Posted by Avinash Rao on Tuesday, 1 March, 2011

Data, as much as it is a lifeblood of our business, can also be Achilles Heel for various reasons, sometimes reasons best known to none! If you are in the Business Data Integration stream, the complexity only multiplies exponentially. When you are selling an integration product, you are by default understood as flawless integrator of data regardless of any factors. But we are not living in an ideal world are we?

data quality

More often than not the data received and described, is never as accurate, never as complete or never as consistent as it is required to be. The result of this would be data flow not being as smooth as intended. When the result isn’t the best the actions and conclusions based on it will not be the best as well, eventually leading to a dissatisfied customer. Some of the common issues encountered as a result of poor quality of data,

  • Data duplication – The key reason for this to happen is that systems do not have proper keys (primary or external ids)  i.e. the keys that tie applications.
  • Application Crashes – This could happen for many reasons ranging from machine or servers to amount of data transfered.
  • Half data being moved – reasons include limitation of data transfered or system governance limits to errors in mapping or data quality or validations
  • Performance lag – reasons could be in-adequate resources applied to the integration software to network bandwidth allocated to the integration.
  • No synchronization – reasons include in-correct credentials, data mapping errors or data validation issues.

As with any product, with evolving times, comes in more automation targeting performance enhancements and more sophistication, meaning lesser room to ensure data quality. Integration process without accurate, consistent data is as good as integration without the data and eventually business without customers. So it is absolutely necessary to strike the right balance between the two.

Over a period of time we have followed best practices to ensure a smooth integration as any amount of data profiling, data management, data cleansing can ensure at best what can be described as a half solution and not a complete one. Hence we believe in incorporating Customer intelligence into the product and this is being done by gathering key facts about the customer, their nature of business and their data usage and management, a step that not only gives an edge over competition but more importantly also ensures a step ahead in having a satisfied customer.


Integrating Customer Intelligence with your CRM

Posted by Rajeev Gupta on Saturday, 8 January, 2011

Do you know how your application is used by your Customers?
Can you predict if a prospect will make a purchase?
Can you identify if your Customer will renew your subscription?

These are some of the common challenges facing every organization. So what is Customer Intelligence?

Customer Intelligence (CI) is the process of gathering and analyzing information regarding customers; their details and their activities, in order to build deeper and more effective customer relationships and improve strategic decision making.

Customer Intelligence is a key component of effective Customer Relationship Management (CRM) such as salesforce.com or Microsoft CRM, and when effectively implemented it is a rich source of insight into the behavior and experience of a company’s customer base.

Customer Intelligence begins with reference data – basic key facts about the customer and their interaction with your company and / or applications. By mining this data, and placing it in context with wider information about competitors, conditions in the industry, and general trends, information can be obtained about customers’ existing and future needs, how they reach decisions, and predictions made about their future behavior.

This data is then supplemented with transactional data – reports of customer activity. This can be commercial information (for example purchase history from sales and order processing), interactions from service contacts over the phone, e-mail, web visits to tracking use of your application. A further subjective dimension can be added, in the form of customer satisfaction surveys or agent data.

This and in my coming series of articles we will address some of the ways to develop strategies and architect solutions and analytics to help you develop a better intelligence set.

Some of the ways to build your customer intelligence

  • Tracking Application usage
  • Tracking web site activity
  • Tracking shopping cart activity and understanding lost baskets
  • Integrating Accounting with CRM to track payment patterns and Account Receivables
  • Social Media integration and mining
  • Predicting Renewals for subscriptions

Developing a strategy for building a Customer Intelligence platform –

Building Customer Intelligence using DBSync

Building Customer Intelligence using DBSync

Any Customer Intelligence platform should have the following

  • Define: What is that you want from the information and what you plan to do with it
  • Model: Define the algorithm that would best define your measures. Measures are the input parameters that you need to have to effectively score and segment your customer. A way to think about it is
    y = f(x1,x2,x3)
    Many times while defining the model we introduce measures which could enhancement and improve your model, but would be difficult to capture given the existing technology and capabilities of the organization. As part of designing CI, efforts should be made to check feasibility of implementing the model.
  • Capture: Capture information from your data sources. These could be either from transactional databases or your data warehouse. In general you would have the following setup –
    • Data extraction from applications and datasources into a data ware house
      Example: Extracting web activity from your web application into a data warehouse which could have online customer tracking cookie, IP, referring source, visiting pages, time of visit, length of visit.
    • Aggregating and Summarizing information in data warehouse
      Example: Summarizing data extracted from accounting to track payment patterns and frequency, Credit status or from CRM to look at number of opened cases in last six months.
    • Validate: Run your data captured and summarized to the model developed. Check if your model does provide adequate scoring and segmentation that is required to effectively make decision. A good practice is to segment your score into color codes to define levels of Customer Intelligence Level that could help you quickly say if for example a new Lead is a qualified prospect or an existing customer relationship is going sour.
    • Integrate with CRM: Now that you have valuable information with your model to represent your customer intelligence, you need to integrate and make it easily accessible to your Sales and Marketing team. The best way is to seamlessly integrate with your CRM application so that your marketing team can use your CRM customer or prospect database and Customer Intelligence data along with CRM inbuilt analytics to continue to track and nurture the relationship with your customer.
    • Value— clearly identify information of value.
    • Context— clearly identify the context in which the data was gathered or processed. For instance, an increase in umbrella sales may be due to an increase in local precipitation rather than a fashion trend.
    • Granularity of identity— clearly distinguish and associate between data instances. For example, information surrounding the attributes of customer A may not apply to customer B.
    • Action—The results of analytics should point to a course of action.

Use Case

At DBSync we have a comprehensive customer tracking system built in to track customer usage and satisfaction to build our Customer Intelligence.

We use salesforce.com as our CRM application. Amongst the various models that we track, one is on predicting customer renewals based on usage of our application.

Define: Track and monitor Customer use of DBSync to predict renewals and assist customers continue increase use of the application. Our goal is the have each customer at 7 or higher.

Model: Our model y = f(x1,x2,x3,x4) can be best described as

y = A value between 1 (low usage) to 10 (high usage)
x1 =  Records processed in last 1 week
x2 =  Records processed in last 1 months
x3 =  Records processed in last 6 months
x4 = Date of Customer Acquisition
f(…) = A mathematical weighted model to score the customer.

Capture: Our Extract, Transform and Load for executing this model is as follows –

  1. We use DBSync application itself to extract data from our tracking database into our data warehours
  2. Use DBSync to execute data warehouse processes to build the data mart for summarization and aggregation.
  3. Once our data warehouse is ready we use DBSync to run our Model y =f(x1,..) and populate Salesforce.com customer records.

Integration with CRM: Once information is populated in Salesforce.com, we have reports and dashboards for customer usage analytics. These reports are scheduled to be emailed out every week to the Sales and Management team to track and assist our customers.

Articles and links that would be helpful

Learn More about DBSync

Tutorial on how to use DBSync with database

Register for Free Trial


Salesforce.com Data Replication made easy

Posted by Rajeev Gupta on Wednesday, 10 November, 2010
Salesforce.com to Database Data and Schema Replication

Need to perform exact schema and data replication to your database? DBSync for Data Synchronizer is the right tool for you.

Why do you need it?

  • To build a data warehouse or create complex reports using BI tools
  • To perform integration with other inhouse applications like Customer Master databases, ERP or other databases
  • To perform offline data scrubbing and migration before final migration to Salesforce.com.

How does it work?

DBSync is a simple application which understands data structures in Salesforce.com and your database (any JDBC complaint database like MySQL, SQLServer, Oracle, DB2 or PostGresSQL) and can auto-create both schema and data in your database.

DBSync provides 3 ways for driving data to and from a database -

  1. Using a web interface to define your integration “Profiles” end points, and select the Salesforce.com objects to synch-up.
  2. Batch interface – once you have setup your profiles, you can use pre-configured batch scripts (for Windows or Linux) to initiate data and schema transfer from Salesforce.com to database or data transfer from database to Salesforce.com.
  3. Real-time integration from Salesforce.com to database – DBSync leverages Salesforce.com outbound messages to intercept and interpret inbound message and then route it to the target database.

The initial setup and install time is pretty quick. A person familiar with databases should be able to install and setup in less than an hours time.

More information can be found at DBSync Synchronizer page


QuickBooks Sales & Inventory forecasting made easy…

Posted by Rajeev Gupta on Monday, 4 October, 2010

Sales and Inventory forecasting is a limitation for QuickBooks … Lokad to the rescue…

DBSync for Lokad - Sales forcasting made easy

DBSync for Lokad - Sales forcasting made easy

DBSync has partnered with Lokad to provide an amazing sales forecasting solution. In about 20 miniutes, you can extract years of sales data from your QuickBooks directly into Lokad’s Salescast applicaiton and get your forecast.

  • Manage inventory by understanding your safety stock levels and trends
  • Identify trends in sales, inventory and more
  • Reduce cost by reducing capital stuck up in your inventory

Our solution now can un-lock valuable QuickBooks data and provide actionable business intelligence.

To learn more, check out our Lokad and QuickBooks integration offerings. Call us to see all the amazing things we can get out of your accounting data.

Read more at Lokad Blog to see how to setup and get started.


Turn your salesforce into a BI solution

Posted by Rajeev Gupta on Tuesday, 20 April, 2010

The key problem one of the Radiology – healthcare providers was facing was visibility to their referral information. They were using Net.Orange, and was proving to be quite expensive.

To run their day to day operations, they were using Amicas a leading EMR/Radiology Information System, which managed patient scheduling, physician information and all details other than the billing information, which was managed by another system. While it contained all key data from referral standpoint, they were unable to understand and related referral activities to that of physician liaison and outreach activities.

We used Healthcare CRM: Physician Relationship and Referral Management module built on force.com and were able to quickly setup the sales and marketing end of it. The key still remained to extract information out of the EMR and scheduling system. We had couple of options

Options:

  1. Leverage HL7 – HL7 is an event driven protocol and two transactions namely ADT (Admit – discharge) and SIU (Scheduling) was relevant in this case. The issue was that not elements were passed through with the transaction – like complete referring physician information or diagnosis codes and we were not able to extract historical data as it was needed for forecasting future referral.
  2. Perform an export in flat-files and manually import it. While it sounded like was in-expensive way to do it, it was quite expensive in the long run. Also, we had seen a drop in usage as folks would forget to perform import.
  3. Automate data extract and import of data using DBSync. We were able to tap into the SQL Server and construct views for each of the data sets required for our analysis. We then scheduled data extracts every 15 minutes with the built in scheduler and push to force.com.

Analytics:

While Force.com has built in reports, dashboards and workflows, it does lack some statistical calculations like variance and drop offs. We added real time calculations on a number of statistical calculations required for analysis using APEX – an on-demand programming language for force.com. These routines would run at either real-time to process referral statistics as it came through the integration or run every day at a scheduled interval.

The end result was an on-demand data warehousing and analytical tool to provide clear real-time reports and dashboards, along with workflows to automate notifications of referral drop offs by physicians or new physician referrals.

For more detail check out http://www.avankia.com/healthcarecrm_prm