Adding twitter and other social data

October 13, 2011

Based on client requests, we’ve been hard at work tapping in to additional useful variables. A quick update on highlighted additions :

  1. Connectedness measure for an email
  2. Twitter mentions, we previously had presence, but now we’re tapping in to the twitter API to handle keywords and hashtags. E.g. If you query on twitter=mhookey (or @mhookey or #mhookey) then you find the number of times people mention hate/fail or love me
  3. Twitter velocity – I.e. What rate are tweets arriving for a given keyword

We’re always open to special requests, so please let us know if you have an upcoming project that requires integrating with better web data, segmentation, or predictive analytics but haven’t quite figured out how to apply the toolkit.

What-if (we threw an analytics party and everyone came)?

October 11, 2011

We’re pleased to announce the release of what-if (scenario testing) functionality for each API, all included within the base package.

This allows you to perform scenario testing on your underlying API. For example if you build a conversion API, where product offered is a variable, it can be nice to test the impact of changes to product offers. This is now possible :

Be warned though, if you want to draw strong conclusions from this analysis you’re predicting a counterfactual scenario. To do that with the most confidence, statistical purists would strongly suggest you need a randomized experiment (in this case random in the product offer variable). Even if you don’t have this, our modeling approach bring in as much third party data as possible to remove the biases inherent in an historical analysis, as such it can still suggest where the low hanging fruit might be.

Try it out, under ‘what-if’ on the left hand side.

Clearer data attributes

September 25, 2011

In our continued effort to demystify data, we’ve recently published our available attributes, which clarifies which inputs are required for each attribute. We’re continually updating this list, so please let us know if you have any suggested additions.

Occasionally it can be nice to avoid using (or even seeing) particular attributes. We’ve recently added support for this too … within the Account page. Just enter a comma separated list of values, and when third party data is being appended, any fields with names including this text will be skipped.

If you have any questions or suggestions, please let us know.

Optimize your web forms; the conversion rate vs accuracy trade off

September 8, 2011

We all want to ask as few questions as possible on our web forms. However each question adds incremental value. How do we think through the trade-off of additional questions (leading to accuracy) vs simplicity?

First, let’s illustrate the tradeoff here.

We’re trying to find the optimal number of questions, where conversion is maximized, subject to some minimum level of information content.

The first, perhaps obvious, observation here, is that third party data is always a good idea. You get extra information content, for example to customize offers and look and feel, without impacting on the consumer experience.

Next, we need some way to test the information content of various subsets of the questions. Demyst.Data offers a way to do this – but the concept is pretty simple.

1. Upload your exhaustive questions, and a target variable

2. Fit some scorecard or segmentation that you’re happy with

Here’s ours. This can be thought of as the ‘taj mahal’ workflow (i.e. all questions are included).

3. Delete columns, rinse and repeat

The next step is to delete each column, and refit the entire scorecard, and plot side by side. Again, here’s one we prepared earlier.

The orange line, the baseline, is flat (clearly if you don’t ask any questions then predictive lift isn’t possible). The red line is what it looks like if no “Demyst” data is appended. All this means is we’ve temporary turned off the third party data and refit. The “without demyst” line is almost as steep as the full ‘taj mahal’ line. In a real dataset, this might mean you wouldn’t bother buying third party data (not something we’d advocate – actually what’s happening here is the emails are always joe, or john, so it’s not surprising that it’s not adding much value).

4. Keep going

There’s a near limitless number of permutations of this exercise.

No we can see that credit and email as standalone don’t add much value. Age is really the winner here, suggesting a radically simpler quoting process.

We don’t have the full picture yet, since we don’t know if that reduction in lift is compensated by a corresponding lift in conversion thanks to a simpler workflow. That’s a topic for another post.

Beta invites, UX, & stuff

August 20, 2011

Thank you to all who have registered for a private beta trial.  We’re thrilled with number of requests and will continue to open up spots daily.  For those of you who have already signed up and tested the tool, please, don’t be shy, send us your reactions.  We need real user feedback so we can perfect the experience and continue to meet our clients needs better.

In line with some of the input from early adapters, we’re excited to announce a new addition to our team, Bryan Connor, a UX and data visualization expert who is putting in countless hours to make the product as user friendly and intuitive as possible.  You can sample some of Bryan’s prior work at and I’ve included an initial iteration of the tool below (or, sign up here for a beta trial- the new design is in place!).

In other updates, our engineers continue to enhance the modeling techniques, access new data sources, and perfect the outputs to improve on some of the results of our early pilots.  And of course, we’re working on our 7-minute demo for Finovate and getting our travel plans in place.  Hit us up, we’re seeing up to 40% lift in predicting default versus the status quo.  Let us help you grow!


August 11, 2011

“We’re really excited to have DeMyst Data demoing their innovative new solution at FinovateFall. We think the audience will find their new solution for helping lenders with segmentation and offer customization via alternative data sources very interesting.” ~Eric Mattson, CEO of Finovate

For those of you unfamiliar with Finovate, it is “the conference” for showcasing innovations in the fields of banking and financial technology. On stage, we’ll publicly launch the tool and demo some of our initial results with real client data.  Our focus will be on exposing lenders to the rich segmentation we are able to create with minimal customer inputs and illustrating how the outputs can be used to customize offers for thin file consumers.

We’ll be in NY (and traveling around the US) for a few weeks leading up to the conference and look forward to re-connecting with many of you and meeting others for the first time.   Drop us a note, we’d love to share some results and discuss how the product can help you grow!

Lies, Damn Lies and Statistics

June 24, 2011


It was Mark Twain who popularised it, but the original authorship of the oft quoted phrase “lies, damn lies and statistics” is widely contested. One to whom it is frequently attributed is Benjamin Disraeli. A distinguished conservative politician and literary figure, Disraeli’s business ventures are deservedly less celebrated.

His speculative investments in South American mining companies in the early nineteenth century proved calamitous and almost ruined him. One wonders whether, had he a more considered view of the power of information than is implied by the phrase with which he is sometimes associated, he may have avoided the pitfalls of reckless investment.

While the world, and particularly the ethereal world, is awash with data (and indeed statistics), it is alarming how infrequently that data is converted to useful information. At a time when data is generated and captured at an unprecedented rate and indeed has become inordinately accessible, it is ironic that we remain so beholden to the spinmeisters and their political masters. The power of information has never been more readily, tantalisingly, at our fingertips but somehow we don’t reach out and grasp it.

At Silne we have a healthy disregard for what we call information asymmetries. In equity markets information asymmetries are said to be removed through the trading activities of arbitrageurs. When I trade on the basis of closely held information I essentially expose that information to the world. In the meantime of course, I make money. Information asymmetries then confer power on the holder of information, or serve to diminish the interests of those without access to it. That’s not fair and we don’t like it.

We define information asymmetries rather broadly … information is available but is not being used; you have information but I don’t; information exists but I don’t know it does. Finding relevant, predictive data, sifting and analysing it, and using it to solve problems and improve decision making  is not easy but it can be a route to the truth, not the damn lies which Disraeli so lamented.