Be the Next Microsoft Employee – Third Challenge: Draw me a picture

Born To Learn


The Buckster

The third challenge is live! Watch it here: http://www.microsoft.com/learning/en/us/certification/bethenext.aspx Read on to find out more about the technical side of the challenge, and how you can enter to win free books and possibly even a brand-new laptop!

As I mentioned, in each challenge we’re trying to ensure we capture not just a technical solution, but how a contestant thinks and works – and, as our HR judge Tim stated on the first day, how they make the technical understandable. Nothing shows off a mix of these attributes better than a project design, so we asked the contestants to create one and then explain it both in writing and orally.

My friend Denny Lee (http://borntolearn.mslearn.net/btl/b/bethenext/archive/2012/08/03/what-do-big-data-amp-speeding-tix-have-in-common-guest-judge-denny-lee.aspx) joined me as the guest judge today. I can tell you this – I would be nervous indeed if he were checking out my work. Oh, wait…he HAS checked out my work. I have this one scar….

Challenge Background

We presented each of the contestants with a series of requirements for an Extract, Transform and Load (ETL) system. The end goal was to see that they understood where the data comes from, where it goes, and of course the technology associated with that process. We printed this on paper for them:

ADVENTURE WORKS DATA WAREHOUSE / ETL / DATA QUALITY

BUSINESS SCENARIO

Adventure Works uses various software applications to manage different aspects of the business, and each application has its own data store.  Specifically:

  • Internet sales are processed through an e-commerce web application.  This data source contains data about customers and the orders that they have placed through the e-commerce web application.
  • Reseller sales are processed by sales representatives, who use a reseller sales application. This data source contains data about resellers and the orders that they have placed through Adventure Works reseller account managers.  Details of the sales employees themselves are stored in a separate human resources system.
  • Reseller payments are processed by an accounting application.  Details of these payments are exported to comma-delimited files.
  • The senior sales executives use a SharePoint application to manage reseller account managers.
  • Products are managed in a product catalog and inventory system. This database contains data about the products that Adventure Works sells, which are organized into categories and subcategories.

Some business partners, such as the marketing agency that Adventure Works uses to conduct marketing campaigns, provide data to Adventure Works through cloud-based data stores.

This distribution of data has made it difficult for business users to answer key questions about the overall performance of the business.

Additional requirements include:

  • Management is looking for a solution that will accommodate insightful reporting using Power View.
  • The marketing department has requested demographic data about population by gender in the international sales territories in which Adventure Works operates.
  • The data stewards have noticed some data quality issues in the customer data, and requested that you provide a way for them to cleanse data so that the data warehouse is based on consistent and reliable data.  The data stewards have provided you with an Excel workbook containing some examples of the issues found in the data.
  • The data stewards are also concerned that customer data may include duplicate entries for the same customer that they would like to resolve.
  • The ETL solution you are building for Adventure Works Cycles consumes product data from an application database. However, product data is created and updated in various systems throughout the enterprise, and you need to ensure that there is a single, consistent definition for each product.

THE CHALLENGE

“Using the whiteboard/flipchart, your challenge is to study the business scenario and requirements, and then design an ETL solution to help Adventure Works build an enterprise data warehouse.”

But there are some things we’re checking for that we didn’t exactly call out in the requirements – that’s pretty common in real-life work situations as well. A data professional should probably ask for the reasons the data needs to move – and if that’s not possible, they should account for multiple purposes in their design.

The challenge also involves knowing the Microsoft stack of data products. Yes, that’s vendor-specific, but after all, they *are* coming to work at Microsoft. I would expect other vendors to also hire people who are knowledgeable about their stack of products. :) Don’t worry – we included other products as well.

You’ll also notice that there is an external data requirement. A data professional is aware that there are services that offer data, and where they fit the bill, these should be included in any design.
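Before drawing anything, it can help to write down the inventory of sources. The sketch below is only an illustration in Python and was not part of the handout. The source names come from the scenario, but the `kind` labels and the grouping are assumptions about how each store might be extracted, not something the requirements state.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str  # taken from the business scenario
    kind: str  # assumed extraction category, not stated in the handout


SOURCES = [
    Source("Internet sales (e-commerce web application)", "database"),
    Source("Reseller sales application", "database"),
    Source("Human resources system", "database"),
    Source("Reseller payments (accounting export)", "flat_file"),
    Source("Reseller account managers (SharePoint)", "sharepoint_list"),
    Source("Product catalog and inventory system", "database"),
    Source("Marketing agency campaign data", "cloud"),
    Source("Population demographics by gender", "external_service"),
]


def group_by_kind(sources):
    """Group source names by the (assumed) extraction approach they need."""
    groups = defaultdict(list)
    for source in sources:
        groups[source.kind].append(source.name)
    return dict(groups)


if __name__ == "__main__":
    for kind, names in group_by_kind(SOURCES).items():
        print(f"{kind}: {len(names)} source(s)")
        for name in names:
            print(f"  - {name}")
```

Grouping the sources this way makes it easier to see how many different extraction paths a design has to cover, and which ones (the cloud and external sources here) sit outside the company's own systems.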

Challenge Recap

This was a big challenge that would obviously require a little research first (this is not the kind of thing you should do from what’s in your head, in a contest or in real life!), so I decided to let the contestants have the challenge early. I gave it to them as we were setting up the filming.

Side-note – setting up, filming, editing and re-filming all of these challenges took all day, and sometimes even longer. You might see only 15 minutes or so of video, but having dozens of people involved in this production, moving to locations, and working with all of the technology that is required to film a show is really demanding. Hours and hours of work go into each challenge, to say nothing of the preparation work from the team, the film crew, and the contestants. You’re seeing only a VERY small part of the work everyone put in.

Back to the challenge. I was sitting in the make-up room (don’t judge me) and one of the other judges came in and said “I know you gave them the challenge early, but I think they are actually trying to solve it as a group, now. What do you want to do?” I wanted to stop them – this is an individual challenge, after all, but Tim from HR said “Let’s see where this goes.” We actually (without their knowledge) watched what they did.

After about 30 minutes, I came over and brought them to the whiteboards (actually writable glass panes we have at the Microsoft offices) and “officially” gave them the challenge. And then, just because I wanted to, I took away the notes they had created as a group. My concern was that they would simply group-think the process and all the solutions would be the same. Was I ever wrong about that – they actually approached it from different angles entirely.

It was interesting to watch the progression. Adding to the pressure of this challenge was each of us judges walking in and questioning their designs while they were creating them. I don’t know about you, but I can’t stand that. Just leave me alone and when I’m done I’ll explain it. Nope, we walked around constantly, interrupting, questioning, wasting their time.

When the challenge was done, we had each person present their solution. One contestant re-used a design they had seen before – which is OK, but then said “once you’ve seen one of these they are all pretty much the same.” Uhm, no.

Another contestant approached the design well but ignored technologies (like Data Quality Services) that could have replaced an entire function in their design. They didn’t know that SQL Server has these kinds of things built right in. Another contestant wasn’t familiar with the reporting systems required in the design – something they should have looked up.
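To give a sense of the kind of function that ends up hand-drawn in a design like that, here is a minimal sketch in Python of naive duplicate-customer matching. It is not how Data Quality Services works internally. The field names, the sample rows, and the 0.85 threshold are all made up for illustration, and the string comparison relies on the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample rows; real customer data would come from the sources.
SAMPLE_CUSTOMERS = [
    {"id": 1, "name": "Jon Smith", "city": "Seattle"},
    {"id": 2, "name": "John Smith", "city": "seattle "},
    {"id": 3, "name": "Maria Lopez", "city": "Portland"},
]


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(value.lower().split())


def similarity(a, b):
    """Return a similarity ratio between 0 and 1 for two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_duplicate_candidates(customers, threshold=0.85):
    """Yield (id, id, score) for pairs whose city matches and names look alike.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    for left, right in combinations(customers, 2):
        same_city = normalize(left["city"]) == normalize(right["city"])
        name_score = similarity(left["name"], right["name"])
        if same_city and name_score >= threshold:
            yield left["id"], right["id"], round(name_score, 2)


if __name__ == "__main__":
    for left_id, right_id, score in find_duplicate_candidates(SAMPLE_CUSTOMERS):
        print(f"possible duplicate: {left_id} and {right_id} (score {score})")
```

Even this toy version has to make decisions about normalization, thresholds, and which fields to compare, and it compares every pair of rows, which would not scale. That is a lot of design effort to spend on something the platform already offers.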

In the end, our winner had a very componentized design that was flexible but included the features we requested. He had never actually designed a system this large, yet he kept it detailed enough to be usable and general enough to handle changes.

And he explained it well. That’s key. You can have the best design in the world, but if you can’t explain it, you might not ever get it implemented. The presentation was clear, concise, and complete, as was the design. Win.

Your Challenge and Last Week’s Winners

So, now it’s your turn. Design a challenge that encompasses the following attributes:

  • ETL Process Flow
  • Multiple data sources
  • Data Quality
  • Transfer options and tools
  • Contestants must show the design graphically

Remember, be complete – a few sentences won’t win the prize. Tell me how you would judge someone on a “win”. What are you looking for? How does your challenge show off those bullet points?

Last week’s winners were:  Martin Cox, with Ryan Roper and Joseph Hagan! Congratulations! We'll be in touch - and don't forget to submit for this week....

  • Mercwrought

    What an honor to get selected two times for the @home challenge, I am not sure what to say. I have worked with SQL for about 10 years now and I hoped that I had the skills to make a go of it, and it looks like all of that experience has paid off.