Back channel notes on etherpad
- why?
- scalability issues — have to do bizarre things to get to Flickr/Google size
- some models don’t fit schemas
- Voldemort — used by LinkedIn
- needs at least four servers to get started!
- CouchDB, MongoDB, etc
- download and type “make”
- MongoDB was much faster, tho’ CouchDB has improved
- whenever you hit a tag page on Flickr, you hit a search
- if you hit “my photos, tagged X” you hit a relational database
- Programming the Semantic Web
- by the guy who wrote Programming Collective Intelligence; very good: all people who like X will like Y
- Redis
- key-value store, network accessible
- ridiculously fast
- doesn’t persist every write to disk; instead, every 15 seconds it dumps the entire database to disk
- can improve reliability by replicating
- e.g. live stats services
- can have a key-set, with add to set, set intersection
- Git
- has shown that it can scale to the size of the Linux kernel
- so can scale to storing your desktop settings!
- Git is not just an RCS; it’s a file system with revision control
- there’s also git# and JGit
- Jaiku migrated to App Engine
- including all the history
- need to think of queries at design time, otherwise you’re stuck and have to do a big MapReduce to extract data
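
To make the Redis key-set note above concrete (add to set, set intersection), here is a minimal sketch in Python. It does not talk to a real Redis server: TinySetStore is a hypothetical in-memory stand-in, with method names borrowed from the Redis SADD and SINTER commands, and the tag/photo keys are made up for illustration.

```python
from collections import defaultdict


class TinySetStore:
    """Hypothetical in-memory stand-in for a key-value store whose values are sets."""

    def __init__(self):
        self._sets = defaultdict(set)

    def sadd(self, key, *members):
        """Add members to the set stored at key; return how many were new."""
        target = self._sets[key]
        before = len(target)
        target.update(members)
        return len(target) - before

    def sinter(self, *keys):
        """Return the members common to every named set (a missing key counts as empty)."""
        if not keys:
            return set()
        result = set(self._sets.get(keys[0], set()))
        for key in keys[1:]:
            result &= self._sets.get(key, set())
        return result


# Made-up example data: which photos carry which tag.
store = TinySetStore()
store.sadd("tag:sunset", "photo:1", "photo:2", "photo:3")
store.sadd("tag:beach", "photo:2", "photo:3", "photo:4")

# Photos tagged with both "sunset" and "beach".
common = store.sinter("tag:sunset", "tag:beach")
print(sorted(common))  # ['photo:2', 'photo:3']
```

Against a real server, a client library such as redis-py exposes similarly named sadd and sinter calls, so the same shape should carry over; treat the details as an assumption to check against the client's documentation.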