Hi all – we’ve switched to our new blog URL at :
This site will be closed in a few days
If you’ve ever read our blog or navigated our site, you’ve likely seen the phrase, ‘removing information asymmetries’. If you’ve sat through a meeting with us, you’ve been lectured on how data transparency can benefit the consumer. Let me try to connect the dots.
Asymmetric information refers to a situation in which one party in a transaction has more or superior information compared to another. Economist George Akerlof publicized the problems of asymmetric information in his 1970 paper discussing the ‘market for lemons’ in the used car industry. He explained that because a buyer cannot generally ascertain the value of a vehicle accurately, they are willing to pay only an average price for it. Knowing in advance that the ‘good sellers’ are going to reject this average price, leaving mostly ‘lemons’ on offer, the buyer removes the ‘lemon seller’s advantage’ by adjusting downward the price they are willing to pay. In the end, the average price isn’t even offered; only the ‘lemon’ price is. Effectively, the ‘bad’ drive the ‘good’ out of the market.
A similar situation occurs in the credit markets. Let us examine a case in which a lender is faced with uncertainty about the creditworthiness of a group of borrowers. Having to account for the bad risks, the lender is pushed to charge artificially high interest rates to cross-subsidize that risk. Recognizing this, and unwilling to borrow at usurious rates, the creditworthy borrowers remove themselves from the credit markets. As above, the ‘bad’ have driven out the ‘good.’
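To make the cross-subsidy concrete, here is a minimal sketch in Ruby. Every number in it is made up for illustration (a 5% funding cost, a 2% default rate for good borrowers, 20% for bad ones, an even mix, and nothing recovered on default), and the helper names are hypothetical; none of it comes from real lending data.

```ruby
# Break-even rate on a one-period loan, assuming nothing is recovered on default:
#   (1 - default_prob) * (1 + rate) = 1 + funding_cost
def break_even_rate(default_prob, funding_cost)
  (1.0 + funding_cost) / (1.0 - default_prob) - 1.0
end

# Average default probability of a pool with the given share of bad borrowers.
def pooled_default(share_bad, good_default, bad_default)
  share_bad * bad_default + (1.0 - share_bad) * good_default
end

# Illustrative assumptions only, not real data.
FUNDING_COST = 0.05
GOOD_DEFAULT = 0.02
BAD_DEFAULT  = 0.20
SHARE_BAD    = 0.5

good_rate   = break_even_rate(GOOD_DEFAULT, FUNDING_COST)
pooled_rate = break_even_rate(pooled_default(SHARE_BAD, GOOD_DEFAULT, BAD_DEFAULT), FUNDING_COST)
bad_rate    = break_even_rate(BAD_DEFAULT, FUNDING_COST)

puts format("good borrowers priced alone: %.2f%%", good_rate * 100)
puts format("everyone pooled together:    %.2f%%", pooled_rate * 100)
puts format("only bad borrowers remain:   %.2f%%", bad_rate * 100)
```

Under these assumptions the good borrowers are quoted roughly 18% when about 7% would cover their own risk. If they walk away, the pool is all bad risk and the break-even rate climbs to about 31%.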
This inefficient risk cross-subsidization affects a large portion of the multi-trillion-dollar financial services markets, and removing it will yield huge value in the coming years. The availability of information is paramount to realizing this value. Fortunately, data today is being created at an unprecedented rate.
At Demyst.Data, we are constructing the infrastructure and mechanisms to aggregate and analyze this data. Our clients are working to engage consumers to share their information and to educate them on the benefits of transparency. Together, we are removing the asymmetries, which is what it takes to draw the ‘goods’ back to the market and to help lenders make educated lending decisions. We believe we’re engaged in a win/win game; hence our passion, excitement, and enthusiasm about the potential value of improved information.
As we’ve added hundreds of interesting online attributes, we’ve been hitting some performance bottlenecks when processing larger batch datasets. Thankfully this hasn’t been an issue for customers, and it doesn’t affect our realtime APIs, but it’s still frustrating. I had a spare day, so it felt like time for a performance boost.
Here’s the executive summary :
To start, I spun up a deliberately small, cut-down test server, set up a reasonably complex API, and used the great tools at http://blitz.io to rush the API with hundreds of concurrent requests.
That spike at the start was a concern, even though it is on a small server.
All the CPU usage was in rack/passenger, so I dusted off the profiler. Thread contention was getting in the way. We need threads because we integrate with so many third-party APIs. We were still on REE for its memory management; however, that uses green threads, so it was time to bite the bullet and (1) update to Ruby 1.9.X.
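For anyone wondering why threads matter so much here, this is a minimal sketch of the fan-out pattern, not our production code. The fetch_attribute method is a hypothetical stand-in that just sleeps to simulate a slow third-party API, and the provider names are made up. It shows waits overlapping rather than stacking up; it does not by itself reproduce the difference between REE and 1.9.

```ruby
require "benchmark"

# Hypothetical stand-in for a slow third-party API call: it sleeps to
# simulate network latency and returns a small hash.
def fetch_attribute(provider, latency = 0.1)
  sleep(latency)
  { :provider => provider, :status => :ok }
end

# Call every provider in its own thread and collect results in input order.
def fetch_all(providers, latency = 0.1)
  threads = providers.map do |provider|
    Thread.new { fetch_attribute(provider, latency) }
  end
  threads.map(&:value) # Thread#value joins the thread and returns its result
end

providers = %w[alpha beta gamma delta]

sequential = Benchmark.realtime { providers.each { |name| fetch_attribute(name) } }
threaded   = Benchmark.realtime { fetch_all(providers) }

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

The sequential run waits for each simulated call in turn, while the threaded run waits for all of them at once, so its wall-clock time stays close to the latency of a single call.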
That helped a fair amount, but we were still getting the timeouts.
So we re-ran the profile and noticed a strange amount of time spent in ActiveRecord associations, in one particular ActiveRecord query, and in a separate MongoDB query. This led to a few things …
2. We didn’t dig into why, but mymodel.relatedmodel.create :param => X was causing some painful slowness in the association code. It wasn’t that important to keep the syntactic sugar; switching to Relatedmodel.create :mymodel_id => mymodel.id, :param => X saved a bunch of time.
3. We added a couple of activerecord indexes, which helped a bit. MongoDB indexes were working a charm, but there was one particular group of 3 independent indexes that were always used in conjunction, and the mongo profiler was revealing nscanned of >10000 for some queries. Creating a combined index helped a lot. Another couple of examples that remind us that, while ORMs are nice, you can never forget there’s a database sitting under all of this.
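To illustrate why the combined index cut nscanned, here is a toy model in plain Ruby. It is only a sketch: the field names are hypothetical, the "index" is a Ruby hash rather than the B-tree MongoDB actually uses, and it assumes the query ends up using just one of the single-field indexes.

```ruby
# Toy documents with three fields that are always queried together.
# The field names (region, product, status) are hypothetical.
DOCS = (0...3000).map do |i|
  { :id => i, :region => i % 10, :product => i % 15, :status => i % 6 }
end

# Build a hash-based "index": key tuple -> documents sharing that key.
def build_index(docs, fields)
  docs.group_by { |doc| fields.map { |f| doc[f] } }
end

# Fetch candidates through the index, then filter on the full query.
# Returns [matches, nscanned], where nscanned counts documents examined.
def find_with_index(index, fields, query)
  candidates = index.fetch(fields.map { |f| query[f] }, [])
  matches = candidates.select { |doc| query.all? { |k, v| doc[k] == v } }
  [matches, candidates.size]
end

query = { :region => 3, :product => 3, :status => 3 }

single   = build_index(DOCS, [:region])
compound = build_index(DOCS, [:region, :product, :status])

_, scanned_single   = find_with_index(single, [:region], query)
_, scanned_compound = find_with_index(compound, [:region, :product, :status], query)

puts "single-field index scanned #{scanned_single} docs"
puts "compound index scanned #{scanned_compound} docs"
```

With a single-field index, every document sharing that one value has to be examined and filtered. With the compound key, only the documents that actually match are touched.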
The result?
And no timeouts until about 150 concurrent hits.
The performance was already plenty in our production system (we automatically scale horizontally as needed), but this helped improve things by about 2-3x.
That’s enough for today. We’ll share some more details on performance benchmarks in the coming weeks.
Any other thoughts from the community? Please email me (mhookey at demystdata dot com).
Facebook’s documentation on authentication via Facebook and the graph API is very comprehensive … but sometimes a worked example still helps. Here is how you can add a “Connect with Facebook” button with minimal effort, using Rails, CoffeeScript, and Ruby.
You need to register your app with Facebook if you haven’t already
Get the script from here: look under authentication, and copy/paste it into application.js and/or /layouts/application.html.erb. Add this line to the script to make sure the async loading works:
window.setup();
<div class="field"> <fb:login-button size="large"> Connect to Facebook </fb:login-button> </div>
For example, if you want to access the logged-in customer’s profile after they have logged in, to customize the page, you might do something like this in CoffeeScript:
$ ->
  window.setup()

window.setup = ->
  window.FB.Event.subscribe('auth.login', -> do_something()) if window.FB?

do_something = ->
  console.log "doing something ..."
  window.FB.getLoginStatus (authtoken) ->
    if authtoken.authResponse
      window.FB.api '/me', (fbdata) ->
        console.log "FB name : #{fbdata['name']}"
        # Add interesting personalization logic here
… and you’re ready to go.
We have a white-labelled offering where we can host this for you and return the data through painless APIs, in case you’re looking to get up and running even faster. Email us and let us know what you’re working on.