So you want data quality…

Virtually everything in business today is an undifferentiated commodity except how a company manages its information. How you manage your information determines whether you win or lose.

Bill Gates
In a world of big data, identifying patterns in the data is business-critical. The biggest challenge to the patterns you define, however, is the quality of your data and, more importantly, having a set of early warning systems that help you keep that quality high.

What is good data quality?

The trusted adviser is the individual a business considers its go-to. Like the trusted adviser, we provide the data we manage to our business. Whether the business comes to us depends on whether it considers our raw data and actionable information acceptable for making decisions, projections, and tactics. Only then is the data considered to be of good or high quality, and the system a trusted one.

How is good determined and why is this important?

The measurements are based on the level of completeness, validity, consistency, timeliness, and accuracy (among other things). The impact of bad data quality is that teams spend time reconciling conflicting reports, or make decisions based on outdated or incorrect information and conclusions. The most costly outcome is that additional systems are built because “the existing system does not meet my needs.”

This pattern increases the cost of integrating data sources that are not in sync, and that cost is considered extremely high as end users attempt to work within the system to get the results they want.

What questions should we ask to make a difference?

Physical

    • Completeness
      • Do I have the right number of rows? – Comparison between source and target systems (missing records, extra records, mismatched records)
      • Is the data that I have the same as the data that the source has? – Column-by-column comparison confirming that data in the source system is represented in the target system (common examples are zip codes, dates, currency, and use of double-byte character sets)
    • Performance
      • Does the load happen in a reasonable time? (related to timeliness)
      • Do user queries finish in a reasonable time? (related to usability)
    • Redundancy
      • When dealing with multiple sources, are we seeing duplicates coming from a single source or from multiple sources, and how are they being handled? Duplicates can span all columns of a row (physical duplicates) or only business-specific columns (logical duplicates).
    • Stress
      • What is the impact of increasing data loads from their known state of X to 3X? Are known performance levels maintained?
    • Timeliness
      • Is the data getting to the users in time for them to make decisions?
      • Are service levels being met?
      • How old is the data that is available?
      • When was the source last refreshed? The slowest-updating source determines the freshness value.
      • How does our system handle race conditions when data arrives (or does not arrive) on time?
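
To make a few of these physical checks concrete, here is a minimal Python sketch. It assumes rows are already loaded as lists of dictionaries; the function names (`compare_rows`, `logical_duplicates`, `freshness`) and the sample columns are made up for illustration and are not part of any particular tool.

```python
from collections import Counter
from datetime import datetime


def compare_rows(source, target, key):
    """Completeness: find missing, extra, and mismatched records by key."""
    src = {row[key]: row for row in source}
    tgt = {row[key]: row for row in target}
    return {
        # In the source but never arrived in the target
        "missing": sorted(src.keys() - tgt.keys()),
        # In the target with no matching source record
        "extra": sorted(tgt.keys() - src.keys()),
        # Present in both, but at least one column differs
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


def logical_duplicates(rows, business_columns):
    """Redundancy: count rows that repeat on the business-specific columns."""
    counts = Counter(tuple(row[c] for c in business_columns) for row in rows)
    return {combo: n for combo, n in counts.items() if n > 1}


def freshness(last_refreshed):
    """Timeliness: the slowest-updating source sets the freshness value."""
    return min(last_refreshed.values())


# Hypothetical sample data: note the zip code that lost its leading zero.
source = [{"id": 1, "zip": "02134"}, {"id": 2, "zip": "10001"}, {"id": 3, "zip": "94105"}]
target = [{"id": 1, "zip": "2134"}, {"id": 2, "zip": "10001"}, {"id": 4, "zip": "60601"}]
print(compare_rows(source, target, "id"))
# → {'missing': [3], 'extra': [4], 'mismatched': [1]}
```

In a real system these comparisons would more likely run as queries against the source and target, but the questions being asked are the same.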

Logical

    • Consistency
      • Target systems should not hold values that conflict with what is held to be true in a source system.
      • If discrepancies exist between multiple sources, the true master must be determined. If the true source cannot be determined, variance reporting should be implemented.
    • Referential integrity
      • Target system should not have orphaned records.
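
The logical checks can be sketched the same way. This is only an illustration under the assumption that rows are lists of dictionaries; `orphaned_records`, `variance_report`, and the column names in the comments are hypothetical.

```python
def orphaned_records(child_rows, parent_rows, foreign_key, parent_key):
    """Referential integrity: child rows whose parent does not exist."""
    parent_keys = {row[parent_key] for row in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in parent_keys]


def variance_report(sources, key, column):
    """Consistency: keys for which the sources disagree on a column.

    `sources` maps a source name to its rows. When no true master can be
    named, the disagreeing values are reported side by side.
    """
    seen = {}
    for name, rows in sources.items():
        for row in rows:
            seen.setdefault(row[key], {})[name] = row[column]
    # Keep only the keys where more than one distinct value was reported
    return {k: v for k, v in seen.items() if len(set(v.values())) > 1}
```

For example, passing order rows and customer rows to `orphaned_records` with `"customer_id"` as both key names returns the orders that point at a customer who is not in the target.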

Business

    • Accuracy
      • Domain value and record count comparisons to ensure that the values seen by end users in the source system remain true in the target system.
    • Domain integrity
      • How often do null, blank, whitespace-only, unknown, or default values occur?
      • Ensure that source domain values are still enforced in the target.
    • Usability
      • Do we get meaningful reports for the business at the end of the process?
      • Is the information relevant, and do we have supporting data elements pulled in from sources?
    • Validity
      • Domain values should be representative in the target data set.
      • How are business rules applied, are they applied consistently, and can we measure that?
      • Do we track modifications made by business rules and their impact on the data set as it is being modified?
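
As one way to measure the domain integrity questions, here is a small Python sketch that profiles a single column. The function name `domain_profile`, the placeholder list, and the sample country codes are assumptions made for this example.

```python
def domain_profile(values, allowed, placeholders=("unknown", "n/a")):
    """Count how often null, blank, placeholder, and out-of-domain values occur."""
    counts = {"null": 0, "blank": 0, "placeholder": 0, "out_of_domain": 0, "valid": 0}
    for value in values:
        if value is None:
            counts["null"] += 1
        elif str(value).strip() == "":
            # Empty strings and whitespace-only strings both count as blank
            counts["blank"] += 1
        elif str(value).strip().lower() in placeholders:
            counts["placeholder"] += 1
        elif value not in allowed:
            # A real value, but not one the source domain permits
            counts["out_of_domain"] += 1
        else:
            counts["valid"] += 1
    return counts


# Hypothetical column of country codes where only "US" and "CA" are allowed.
profile = domain_profile(["US", None, "", "  ", "unknown", "ZZ", "CA"], {"US", "CA"})
print(profile)
# → {'null': 1, 'blank': 2, 'placeholder': 1, 'out_of_domain': 1, 'valid': 2}
```

Tracking these counts on every load gives a simple trend line, and a sudden jump in any bucket is the kind of early warning worth alerting on.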

These are just a few examples of the questions we could be asking. Using this very basic set of questions, we can begin creating reports and monitors to tell us the eventual state of our own system.

What kind of questions would you ask?

So you want a resume…

Over the past few weeks, I have been asked how to write a resume, to review resumes, and to help explain what the heck an effective resume looks like.

I thought I would take a few minutes to write up some guidance on how to look at resumes as well as what you should be writing in yours. As I regularly do, let’s start with a quick story…

Many years ago, I worked for a small startup, and one week there was a group meeting. The manager of the company spoke, and it was clear that I was about to be laid off. Well, bummer – I guess I better put together a resume.

How many times have you said the words, “I better update my resume” or, worse yet, “I need to find my resume” or, even worse, “How do I make a resume.” I guess there is one more — “What’s a resume,” but I will assume everyone got past that one.

It is one of those things that everyone hates to do, but it is one of those necessary taxes that we must pay to move forward in our careers. It is not fun or glamorous, but here are some simple things to help you put it together quickly.

  • Every quarter, take 30-60 minutes and write down every single accomplishment you have achieved in your role.
  • When you write the accomplishment, use the form of action verb, description of what you did, and a value statement.
  • The value statement should be in terms of employee satisfaction, customer satisfaction, dollars, time, or some other value that can be measured objectively.
  • If you have all of this information at your disposal, when it comes time to find a job, you can choose the specific accomplishments that pertain to the job you are interested in.

Other general rules to be aware of while writing:

    • Keep your industry jargon and abbreviations to a minimum. Keep things in terms of the audience.
    • Write your bullets in the active voice and maintain consistency in your sentence tense.
    • Do not go over 1 page.
    • Do not go over 1 page.
    • Do not go over 1 page – while you may have accomplished a lot and you think you should put every single item on the list, it really does not help you. This was repeated 3 times intentionally.

If you follow these simple guidelines, you will have a more effective resume and have a better chance of catching that recruiter or hiring manager’s attention.

Do you have any other tips that you have found make a difference?