
ESB Toolkit Best Practices

I’ve been trying to put together a document of BizTalk best practices and have found a number of sources with a lot of great ideas. I’ll share what I’ve discovered at some point, but for now here’s one of my favorites.

It’s from Stott Creations and covers the ESB Toolkit. Below is a list taken directly from the article; please read the full article (http://esb.stottcreations.com/biztalk-esb-toolkit-2-1-and-real-world-best-practices/) for more details:

Top 5 List of BizTalk Real-World Best Practices

1. Best Practice: In design, use the ESB Toolkit as the starting point – but not the endpoint.
   Benefits and Explanation: The ESB portal as delivered exposes the capabilities of the ESB Toolkit data and their relationships, but it is not a production-grade site as-is.

2. Best Practice: Just Enough Logging (JEL) – scale back logging post-go-live to sane levels. Do not enable tracking for ports and orchestrations, and minimize the tracking of MsgBody. Do enable tracking on BizTalk server hosts.
   Benefits and Explanation: Logging every message is demanding on database and storage resources, and over time becomes less valuable. After deployment, administrators and users only care whether BizTalk is running and whether they can drill in on errors. As a general rule, the greater the amount of logging, the less that logging is used. Minimize and target the tracking and logging you actually need.

3. Best Practice: Avoid complex deployment scenarios using shared custom core libraries.
   Benefits and Explanation: As a rule of thumb, always exhaust all possible scenarios using out-of-the-box (OOTB) components to solve an issue before moving to custom code. Most scenarios can be accomplished by leveraging the samples provided, at the very least as a starting point. Shy away from starting from scratch. This also eases deployment and maintenance (no custom core components means nothing in the config files; everything can be deployed in a comprehensive package).

4. Best Practice: Write tightly focused, small components.
   Benefits and Explanation: This promotes solid designs and reusability. Break up development into small units of work and string these solutions together in the itinerary.

5. Best Practice: Your endpoints should always be resolved/configured via BRE over UDDI.
   Benefits and Explanation: Power users often want to change endpoints and map types; the BRE as a resolver engine is the most powerful and flexible engine available. By comparison, UDDI both lacks intelligence and has a complex interface configuration.

Temporal Data in a Relational Database

Lately, I’ve been spending a great deal of time in the DB world. My most recent challenge has been dealing with time-based, or temporal, data. In several real-world scenarios, we find that timing matters, particularly when dealing with transaction-based systems such as banking, payroll, or HR.

Most relational databases with referential integrity work really well with current-state data. But a system may need to keep up with past and future states as well as the current one. For example, if I’m going to create a paycheck for an employee, I may need to know something about his future pay rate, as in “On such-and-such date, Joe Blow will receive a paycheck and be paid x dollars.” His current pay rate may be more or less than the rate in effect on that date. To further complicate the matter, I will also want to know what we knew when we paid him on a particular date in the past. We may even back-date a pay raise, creating a future state with an effective date in the past.

Whoa… Where’s the Doctor with the TARDIS when you need him?

These problems aren’t new. Many DB gurus far more able than I am have been digging around these issues for the past 30+ years, and nobody has come up with a perfect solution. Some, such as Richard Snodgrass, have written entire (quite good) books on the problem (Morgan Kaufmann, ISBN 1558604367, 1999). My favorite discussion, though (primarily because of its simplicity), comes from the always ingenious Martin Fowler: http://martinfowler.com/eaaDev/timeNarrative.html. I highly recommend reading this article for a solid understanding of temporal data patterns.

For our project, we had a DB in place that handled temporal data by never performing an actual UPDATE against a record. This way, a row of data in a table represents a single point in time and can be given a Created Date and an Effective Date. When you need to UPDATE, you simply INSERT a new record with some identifying data, such as an Employee Number, to tie everything together. Pretty smart.
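As a minimal sketch of that insert-only approach (the table and column names here are hypothetical, not the actual schema), an “update” is just another effective-dated row:

-- Hypothetical effective-dated table: rows are only ever inserted, never updated.
CREATE TABLE EmployeeHistory (
    EmployeeHistoryId INT IDENTITY PRIMARY KEY,
    EmployeeNumber    VARCHAR(20) NOT NULL, -- identifying data tying versions together
    PayRate           MONEY       NOT NULL,
    CreatedDate       DATETIME    NOT NULL,
    EffectiveDate     DATETIME    NOT NULL
);

-- An "update" to an employee's pay is a new effective-dated row; older rows remain.
INSERT INTO EmployeeHistory (EmployeeNumber, PayRate, CreatedDate, EffectiveDate)
VALUES ('E-1001', 52000.00, GETDATE(), '2013-07-01');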

But one problem with handling relational data in a temporal model this way is maintaining referential integrity as the data changes. When the updated record is inserted, how can you effectively maintain the relationship between, say, an Employee record and time-sensitive Compensation data? As you can see, the scenario gets messy quickly, so we need a more elegant solution than INSERTs with effective dating.

One solution that was tried was to create a single “Base” table of immutable records, each containing an ID used to tie everything together.

The revised “Base Table” model looked similar to this:

Although this model may work for a while, referential integrity is eventually lost because all entities, regardless of their relationship to each other, are tied back to only the single record in the base table. This has the effect of flattening the data model and forcing the maintenance of referential integrity into the application layer. Over time, new applications would come on board, and developers who didn’t understand the entity relationships (or didn’t care) would ignore the integrity rules.
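For illustration, here is a minimal sketch of that base-table shape (table and column names are hypothetical):

-- Hypothetical "Base Table" layout: every entity points at one immutable base record.
CREATE TABLE BaseRecord (
    BaseId INT IDENTITY PRIMARY KEY
);

CREATE TABLE Employee (
    EmployeeId    INT IDENTITY PRIMARY KEY,
    BaseId        INT NOT NULL REFERENCES BaseRecord(BaseId),
    EffectiveDate DATETIME NOT NULL
    -- employee attributes...
);

CREATE TABLE Compensation (
    CompensationId INT IDENTITY PRIMARY KEY,
    BaseId         INT NOT NULL REFERENCES BaseRecord(BaseId), -- not Employee
    EffectiveDate  DATETIME NOT NULL,
    PayRate        MONEY NOT NULL
);

-- Because Compensation references BaseRecord rather than Employee, the database
-- can no longer enforce that a compensation row belongs to a valid employee row;
-- that check has to live in application code.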

The model we eventually decided on was inspired by what Martin Fowler referred to as a “Temporal Object” pattern.   Basically, we constructed the entire model to maintain referential integrity between all entities through foreign key relationships.   Temporal data, such as pay rate, was extracted into separate entities with a foreign key relationship created between them and the table from which they were extracted.

The model we ended up with was similar to this:
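A minimal DDL sketch of the general shape (column names follow the query further below; the Compensation table and its columns are illustrative):

-- Hypothetical "Temporal Object" layout: a stable parent row per employee, with
-- effective-dated child tables holding the data that changes over time.
CREATE TABLE Employee (
    EmployeeId INT IDENTITY PRIMARY KEY
);

CREATE TABLE EmployeeData (
    EmployeeDataId INT IDENTITY PRIMARY KEY,
    EmployeeId     INT NOT NULL REFERENCES Employee(EmployeeId),
    EffectiveDate  DATETIME NOT NULL,
    EmployeeNumber VARCHAR(20) NOT NULL,
    EmploymentType VARCHAR(20) NOT NULL,
    HireDate       DATETIME NOT NULL
    -- additional effective-dated employee attributes...
);

CREATE TABLE Compensation (
    CompensationId INT IDENTITY PRIMARY KEY,
    EmployeeId     INT NOT NULL REFERENCES Employee(EmployeeId),
    EffectiveDate  DATETIME NOT NULL,
    PayRate        MONEY NOT NULL
);

-- The database enforces that every temporal row belongs to a real Employee, and
-- history is preserved by inserting new effective-dated rows rather than updating.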

Naturally, a major concern was that adding these extra tables would complicate the JOINs in our queries, which, of course, they did. However, most queries added only a single extra JOIN, though there was definitely the potential for many more, depending on the specific query needs. Given the need for referential integrity plus temporality, the trade-off was accepted.

With the exception of auditing, most of our needs required only data from a single point in time.  We managed this by using an Effective Date field in our temporal tables and passing in a date as a parameter for all our queries.  We could retrieve a single employee record for a given point in time like this:

SELECT
    e.EmployeeId,
    d.EmployeeDataId,
    d.EffectiveDate,
    d.EmployeeNumber,
    d.EmploymentType,
    d.HireDate -- etc.
FROM   Employee e
JOIN (SELECT * FROM EmployeeData WHERE EffectiveDate =
        (SELECT MAX(EffectiveDate) FROM EmployeeData WHERE
            ((EmployeeId = @EmployeeId) AND
            (EffectiveDate <= @EffectiveDate)))) d
ON e.EmployeeId = d.EmployeeId
WHERE
e.EmployeeId = @EmployeeId

Note that rather than joining directly on the temporal data table, we join on a subquery that finds the MAX effective date at or before the date we pass in as a parameter. This allows us to retrieve the most recent data as of the EffectiveDate we passed in.
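For example, with hypothetical EmployeeData rows like the ones sketched below, a parameter date between the two effective dates returns the earlier row, and a later date returns the newer one:

-- Hypothetical EmployeeData rows for EmployeeId = 42:
--   EmployeeDataId  EffectiveDate  EmploymentType
--   1               2013-01-01     FullTime
--   2               2013-07-01     FullTime (a later, possibly back-dated, change)

DECLARE @EmployeeId INT = 42;
DECLARE @EffectiveDate DATETIME = '2013-06-15'; -- selects row 1; '2013-08-01' would select row 2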

[For another approach to the above SQL see Temporal Data in a Relational Database Pt. 2]

There are many other approaches to this problem, of course, but this one provides the benefit of supporting both temporal record maintenance and referential integrity while being only slightly more complex.
