Tuesday, 16 September 2008

Google Developer Day 2008: A Deeper Look at App Engine -- Mano Marks

  • goal of best practices: use less quota, so you pay less money and Google has less load :-)
  • free preview will always be free:
    • 500 MB storage
    • 2 GB bandwidth
  • currently don’t allow you to pay for App Engine, but committed to charging for additional capacity by end of year
    • CPU: 10-12 cents per CPU core-hour
    • Storage: 15-18 cents per GB per month
    • etc.
    • if you use double the free preview quota, expect to pay about $40 / month
  • currently support Python, but others will be coming
    • they know but they’re not telling
  • looking to provide large upload/download support, but not sure how yet
    • current limit is 1 MB for file and response size
  • currently no SLA
  • Google don’t put adverts on app engine apps
    • they will make more money on search :-)
    • They also don’t look at your data

storing data

  • keys are limited to 500 bytes
  • can’t change the ID or key_name
  • transactional read & write with get() and put()

counters

  • Bigtable doesn’t know the size of tables; counting rows would be O(N)
  • Model.count() is a big transaction
  • could create an entity that maintains the count
    • frequent updates can cause high contention
    • fundamental limitation of distributed systems
  • instead, create sharded counters
    • split the counter into shards, with a counter config holding references to all shards; each increment goes to a randomly chosen shard
    • when you want the count, ask the counter config to add all the shards together
    • use get_or_insert() to fetch or create atomically

memcache

  • when you add things to memcache, you define the staleness that you’re happy with
  • use memcache to reduce storage and processing requirements
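
The "staleness you're happy with" point can be sketched as a cache lookup with an expiry time. This is a self-contained illustration only: `TinyCache` is a hypothetical in-memory stand-in, not the App Engine memcache API, and `cached`, `ttl` and `compute` are names invented for the example.

```python
import time


class TinyCache:
    """In-memory stand-in for memcache with a per-entry expiry time."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data = {}

    def set(self, key, value, ttl):
        """Store value under key for ttl seconds."""
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]
            return None
        return value


def cached(cache, key, ttl, compute):
    """Return the cached value for key, recomputing it once it is older than ttl."""
    value = cache.get(key)
    if value is None:
        value = compute()  # the expensive query or rendering step
        cache.set(key, value, ttl)
    return value
```

A call such as `cached(cache, "front-page", 60, render_front_page)` would then run the hypothetical `render_front_page` at most once a minute, and every other request in that minute is served from the cache.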

Q & A

  • can use urlfetch to request data from your own servers

    • all app engine requests must complete in 10 seconds
    • and urlfetch must complete faster (4 seconds?)
    • you get an HTTP error that you can handle if the request times out
  • Is there a profiling tool for app engine or Bigtable?

    • not aware of anything
    • difficult to see return on implementing memcache
  • What logging is there?

    • there is a log; it logs requests and you can write to it
  • SLA…?

    • the quota will be calculated using a moving average, not a total for the month
    • however, the aim is that if you get slashdotted you’ll stay up
  • Parallel processes?

    • don’t allow threading
    • don’t allow direct file write access
    • have limited file read access
    • can’t access direct network sockets
    • however, there is a mapreduce implementation for app engine written by a Googler — http://code.google.com/p/httpmr/?
  • email restrictions?

    • can send from the address of any developer of the app or of a logged-in user
    • restricted to sending one per second
  • when will Django 1.0 be included?

    • some people have uploaded django 1.0 themselves
    • django 1.0 includes a C library, so this must be worked around at the moment
  • three big languages internally in Google: Python, Java & C++
