Table of Contents
Lynn Winterboer and Cher Fox - Test Automation - Agile Enablement for Data Warehousing and Business Intelligence Teams
Regardless of whether your data warehousing (DW)/business intelligence (BI) team is dedicated to Scrum, a fan of Kanban, or focused on XP, “Agile” in any form calls for small increments of potentially deployable results – which means that QA is essential on a regular, frequent basis. As we develop these small increments, we need to not only test the new development, but also regression test what we’ve already built. Our test suite grows larger with each iteration, and manual testing quickly becomes infeasible. Agile requires that we automate our tests so that regression testing doesn’t become a development bottleneck.
Summary
- Content rating (0-no new ideas, 5 - a new ideas/approach, 9-new ideas): 7
- Style rating (0-average presentstion, 5 - my level, 9-I learned something about presenting): 5
Main idea - since DW/BI applications are driven by SQL (not procedural languages) will need to modify thinking approach. Can (and should) still be done.
Action / Learning
- Book: “Agile Analyics: A Value-Driven Approach to Business Intelligence and Data Warehousing” by Dr. Ken Collier - Chapter 7: Test-Driven Data Warehouse Development
- Book: "More Agile Testing: Learning Journeys for the Whole Team" by Lisa Crispin and Janet Gregory - Includes several chapters on test automaion as well as one dedicated to agile tes4ng for DW/BI.
Presentation
Test Automation: Agile Enablement for Data Warehousing and Business
Video with some of the ideas Lynn Winterboer Talks Test Automation for Data Warehousing Teams at Mile High Agile 2016
My copy: Test Automation - Agile Enablement for Data Warehousing and Business Intelligence Teams
Cher Fox - @thedatanista Lynn Winterboer - @agilelynn
Notes
Wait for vendor to solve problem
Create own framework based on experience As data warehouse has different requirements
Why test automation important
Qa requires regular testing
Need regression testing to create safety
Cannot do if manual and test suite becomes really big
Agile bi development is driven tests Acceptance criteria definition of done
So why aren't all teams automating tests
5 out 60 data warehousing
Data industry doesn't focus on testing They rock data modeling, moving large amounts of data Don't have same level of skill
Problem is there is no education Find a handful of resources online on the subject
Team members Don't have professional qa people
Care about data is right before anything else
Need some who knows and love data and then teach them qa skills
Skills and discipline
Existing tools are not easily adapted to the data world
Development languages Not same as what data warehouse tools
Traditional Web, Mobile, UI focus But UI in data warehouse
Problem with good test data sets Think about what you develop before you do it
Large or small volumes of data Tests systems are usually shared with the source system 1 in 6 month refresh Traditional development Junk data tests edge cases but didn't
Data sensitivity
Small data sets need extra planning and design
The path to test automation
Top 3 path to automation
Want high quality dw / bi system
Need to have the attitude and culture Test automation Learning culture Build quality in Test as a team sport
Need some skills and practices Testing books need some translation - Java vs SQL as primary language Design tests automation How make it repeatable How measure / monitor Feedback loop - bi role is to provide feedback to organization
Tools and automation Test environment - dedicated, production like Build automation Test data sets Test automation tool
Informatica in colarabo
First thing is to figure out how to do a data warehouse build. First test is “is build good”
Don't want to automate chaos
Test automation pyramid
Push many tests as possible to the lowest possible layer Pulls thinking up front Brittle tests that are affected by the UI Subjective user feedback is essential - data folks rely too much on this
Good places to start
Start doing unit tests on new development - Think about how we validate before we start code Regression tests you wish you had before building something new Deployment test - build built correctly Painful test setup or test execution is slow
What does simple test automation look like
What do we want to test 16 orders selected by BI - static test data Then do results against this in the target schema for those 16
“Leave no trace” - clean up later Data people don't like to delete data Build your tables then tear them down
Can do manually
Hard part is “what are the expected results”
Actual vs expected result - this is a test
Think about some examples Dollar total or 1, 0 (still count as order), biggest, negative order Shipped is flagged, or is it a date or … Sales rep / region - pick a rep for a region,
Happy path is average sized order shipped yesterday
Prove that the logic works Therefore should work in every other case
Perhaps pick a handful of orders that represent this
Still do large scale full production data But this is to catch the small stuff early on
Btw this is also good regression test
Start simple Do examples
Source to target Everything in source gets to target system
Edw and stage data Run a checksum - same number on both sides
2014 sales Source system report matches
Catch misunderstand Even when you both understand the data
Re-baseline the data Drop schema Reload data
Expected results can be one value or multiple values
Group tests - label indicating test type
Every time pass or fail writes to database Microsoft bi desktop to display
Note Access is data entry Actual tests are stored in database
Tool took about 80 hours to develop