</We are hiring!>

An in-depth look into caching (Part 2)

…we posted about caching. Time to post about caching again. This time, we’re going to concentrate on the complexities of caching at scale. You know, you have a big honking piece of a datacenter crunching numbers so you can post the most awesome kitty pics. That’s what the internet is for after all. And you want your awesome kitty pics to appear in less than 100 milliseconds because… well, teenagers are impatient.

Our take on mobile fraud detection

Jampp’s mission is to help companies grow their mobile business by engaging users and driving new customers. Combatting mobile fraud is a top priority to ensure this is possible. Besides spamming, phishing and scamming, ad fraud is one of the most profitable scenarios for rogue internet users. Mobile fraud forecasts for 2016 range from $1.25bn (Forensiq) to $7.2bn (ANA). Yes, you read that correctly: the impact is on the order of billions of dollars.

Scoring our publishers

At Jampp, we use a variety of sources to generate our traffic flow. Generally speaking, the source employed in one campaign is not necessarily the same as the one used in another. However, most platforms do not have a unified measure to compare the quality of traffic sources. Creating such a measure is not as simple as comparing CTRs, both because different campaigns use different sources and because campaigns do not share the same post-install metrics (and post-install quality is precisely what we are interested in). In this entry, we briefly discuss how we...

ITBA's Big Data Present and Future Conference

On May 18th, our data team participated in the “Big Data: Present and Future in Argentina” conference organized by the Buenos Aires Institute of Technology (ITBA). Several companies that are currently working with technologies considered part of the Big Data ecosystem, or building their own, attended this event.

Using Julia for safe Data Science

Recently at Jampp, I had the chance to switch some of our data science environment from Python to Julia. For various reasons, its type system is, in my opinion, one of the best language features. The most obvious reason is the performance enhancements it allows. I will not, however, address that point here: it has been benchmarked very well in several places. Instead, I will briefly show a safety advantage this type system brings that is really handy for data science.

How we use Jupyter + Airpal to improve our Data Analytics processes

Being a data-driven company, reporting needs are constantly increasing at Jampp. From basic summarizations to complex analysis, every team needs to query our databases. Given this backdrop, a priority for our tech team is to readily provide these reports to non-technical areas. Client-facing and other frequently used reports can be found on our Dashboard. Initially, this was enough to cover Jampp’s evolving reporting needs but, for some time now, we have found ourselves getting more and more report and visualization requests.

PrestoDB on Amazon EMR at Jampp

At Jampp we are big users of Amazon EMR. We handle a lot of data, our volumes keep growing, and much of it is unstructured log data, so Amazon EMR was a great fit for many of our analytics and log forensics use cases.
</We are hiring>
    import tornado.ioloop
    import tornado.web

    class CandidatesHandler(tornado.web.RequestHandler):

        def get(self, name):
            secret = self.get_argument('secret')
            is_geek = False
            if ',' in secret:
                word = ''.join([chr(int(x)) for x in secret.split(',')])
                is_geek = word == 'geek'
            if is_geek:
                self.write("Hi %s, we are waiting for you. jobs@jampp.com" % name)
            else:
                # Only reject the request when the secret does not decode to 'geek'
                raise tornado.web.HTTPError(404, "Geek not found")
    if __name__ == "__main__":
        app = tornado.web.Application([
           (r'/candidates/(.*)', CandidatesHandler)