Sat, 12 May 2007

Web2Expo Presentations Online

The list of Web 2.0 Expo presentations online includes five that I witnessed. If you're not very interested in Web 2.0 crud, you still might want to check out the Architecture for Humanity presentation (link to a PDF), which I found impressive, moving and not hype-saturated.

posted at 11:24 PDT (-0700)     (comments disabled)   permanent link   Technorati tagged as:

Thu, 19 Apr 2007

Everything You Do Gives You Gold

Session: Reality Bites: the Future of Gaming + Virtual Worlds 2.0

Participating:

Ginsu: last night, talked about how we'd do the panel, other people decided he had to go first, because SL is over-hyped, 5 minute limit. Spent 3 years understanding it, can now explain it in 10 minutes.

Ginsu: [puts up picture of Gutenberg press] It all started here, you've heard this before. Before it, media was tightly controlled, creation was sacred act. Had to be literally a monk to write and distribute media. Since that point, continuation of the idea of lowering the bar, making it cheap to produce mass media and market. Gutenberg press had a slow distribution time-line.

Ginsu: same stuff, but now it's faster, using technology, text, images, video, voice. Shared collaborative space. Not different from books, that much. Instead of it taking decades / centuries, it's now nearly real time. What do virtual worlds have to do with web 2.0? It's an extension of the same sharing, creating impulse

Ginsu: New topic, emoticons. Hate them, love books. Good writing is amazing. Write every day, so do you probably, mostly for work, try to avoid emoticons, when dashing messages off, the emotional bandwidth is thin, constraining. Forced to use emoticons. But with Second Life, you get more emotional context, based on avatar choice, posturing, clothes, hairstyle. Susan says say something poignant, this is it: this cultural and emotional bandwidth that is available in a VR environment, is maybe a little different from the printing press.

Lane: Love to get into how everything was chosen. Reality is that it wasn't complicated. A group of parents who looked at what was available for kids and saw:

  • UN-entertaining, sterile
  • purely built on marketing and merchandising products to kids

Lane: Sat down and asked: can we do this better? For the most part, we think we did. Different paths, but we built this for our kids.

Club Penguin

Susan: Club Penguin is a VR with millions of users.

Lane: Built using Flash 6, so it would work in all the browsers. Looked at barriers to entry and looked at how to burst them. Demographic they looked at is not patient. Would rather have 2D graphics than long download times. Built to be easy interface, load up on "grandma's computer". Built around two things, fun + safety. Express that a lot, because it's still their values. Fun enough to keep kids hanging around, safe for them to be there. Big challenge to make it safer than anything out there.

Lane: asked what I hoped to express, good values, good ethics, good morals do work and you don't have to be controversial to sell. Safety is important, beyond just a marketing tool / pitchline. Has to actually work. Built to take months and months to explore. Lots of features which haven't even been found yet. Built by parents for parents.

Joichi: going to talk fast, assume everyone knows what WoW is. ( He wasn't kidding, I was barely able to keep up with him, typing, so several invisible gaps in the transcript of his words. ) [puts up a slide] Content is on one side, Context on another. Music is stuff you can put on a truck and ship around. When you used to feel lonely, you listened to music, knew others felt that way, too. Then video games, a little more interactive, Karaoke, much more interactive and now with Text Messaging, very much more interactive. Entertainment industry going from Content to Context and this is where it intersects Web 2.0.

Joichi: Similarly, Communication Technologies range from Mass Media, Magazines, Blogs, Social Networks, Email, Instant Messenger, Presence. It's like the US finally discovered SMS. Kids in Japan, SE Asia grew up knowing they had the internet in their pocket. Studies show kids forming intimate presence communities where they know where 5-8 people in their circle are at any given time. Twitter isn't boring, it's not about content anymore, it's context. A lot of people miss context when they think about games because they think it's about content. The whole notion of co-presence is an important part of the game / entertainment thing.

Joichi: A lot of WoW players have WoW full-screen, do everything through it. Blizzard allowed creation of Addons, using Lua, brilliant thing. Now you can integrate all the information into one interface. It's all about real-time presence, not static web stuff. Web 2.0 is catching up with WoW.

Joichi: [Richard Bartle slide] "Not Yet, you Fools!" envisions game as immersive fantasy, considers voice immersion-bursting, reality-intrusive, ruins role-play. Reality is that voice is there, Western notion of the internet is logging in to cyberspace [closes laptop] and then you log out. Eastern notion is less binary. [shows South Park clip] A lot of people look at the surface of education. "Simulation" v. "Metaphor" Simulation is close likeness to real world. If you wanted to use a game to teach someone how to be a better manager, using simulation, you'd recreate the conditions of their job, same environment. But metaphor is a different way.
Metaphor is like a raid, where all aspects are different but it has a shared core of the idea. Uses the word "Ensemble". [Shows 40 person ensemble going after dragon] So it has nothing to do with your job, but you have exposure to the same core principles, managing large groups of people toward a goal. There's a zone you get into when everything works and you get a reward, not the same reward as getting a higher score than anyone, it's a reward from collaboration, easy in WoW, hard to get anywhere else.

Joichi: Where you have social software, social forums, you have tools to collaborate, shows Rupture

Susan: you were CEO of myfamily.com or whatever. Why Gaia?

Craig: I went to Benchmark as an EIR with one goal, building it up. Looked at consumer internet, only wanted something with an enormous consumer value, something that would sell without marketing. Looking for a product where founder has enormous grasp of end product. Someone building something for themselves. Looked at 250 startups over 14 months.

Craig: Gaia world's fastest growing hangout for teens. #2 forum, a billion posts, over 1M posts yesterday, 2M monthly unique visitors. Avg simultaneous users 64k. 3x growth since May 2006. Avg minutes per session: 48, beats myspace, facebook, habbo, runescape, puzzle pirates

Craig: why do they love it? basic concept, is building profile, then you build avatar, friendslist but a cute friendslist, can build a blog, they call it a journal, communicate and self express. Build a home, write fiction, poetry, join a club, draw art, submit creations to user-managed newspapers or just have users vote hotornot style on it. Or just play games. Free flash games. Hang out in towns. A little like Club Penguin, but for the older demographic of kids. Gold falls from trees in Gaia. In fact, everything you do there gets you gold, that's the basic metaphor. Use the gold to trick out your avatar, 11 stores, 5k+ items for avatar or house. There's an eBay marketplace, where you can [re]sell creations. 50k+ auctions daily.

Craig: behind it all, rich storyline, they build a lot of the content. Beginning of October, had a Tom Cruise doppelganger, jumping on a couch, yelling about aliens. Movie theater, like mst3k. The combination of content they create, plus user content. 7 banks, including one that is a result of a merger. Weddings online, with a wedding planner. Gaians throw their own parties where they perform plays. [shows screenshot of dress rehearsal] Got into this because it's a great value proposition. In a world where teens are constantly branding and packaging themselves, Gaia is where you go to get away from it all, and just be yourself...or who you want to be.

Susan: I get that game designers know more about UI than web 2.0 designers. Question: if that's true, why don't all the successful online game companies use game designers for their site design?

Raph: the Game Industry is oblivious. They're all big traditional content owners. The answer is they're completely clueless. They don't realize that what's happened in virtual worlds is that their lunch has already been eaten, by people from the outside. The people on this panel work for companies where games are part of the culture. The virtual world hasn't come completely to grips with the user-generation phenomenon. Many game people have fled big media because they don't get it.

Raph: everybody but the game industry is rushing into this space. Everyone references WoW. WoW is a wild outlier. Viacom has published more virtual worlds in the last 6 months than any vw publisher. Game industry is being marginalized from games business as everyone rushes for the game space. Game design is not an arcane science.

Craig: having you in our office was amazing because everyone in our office is a huge fan. Why can't we make games free, why do people have to go buy in stores? People feel it started with Raph, with Ultima Online, etc.

Susan: Craig, you showed a visual aesthetic style, which may appeal to teens but maybe not mass market, question in general, perception is that online gaming is very niche, hard-core audience. How do you respond?

Craig: first, we are mass market. 2M unique visitors last month, no money on marketing, PR. We only have one language. I think games which cost $20 and take four years to make are obsolete. 2-3M WoW players, but it's an enormous amount compared to previous gameplayers. Club Penguin is radically mass market because it's easy to get in and figure out what to do. Most games cost $20 or more, hard to understand; myspace and facebook are free, take seconds to figure out.

Lane: from day one set out to serve parents and kids, shun interviews and events like this. Put aside what we personally wanted to serve community which wanted more features, better features. Growing up in the game industry it was about what do I want, my friends want, no, it's about what kids want?

Ginsu: is this a fad? can't understand how people could ask this. Were you told growing up you would have a persistent online media, 15-20 years ago, that you would find spouse, be able to buy stuff, interact online like we do now.

Raph: manga and anime, if you think that isn't mainstream, you're old and out of touch. It's all over TV. Look at Avatar: The Last Airbender.

Craig: but TV is becoming a little niche... Virtual reality dwellers outnumber the population of Canada.

Question by Susan: expect future web to be visually rich, given that many virtual worlds require emotional commitment, how can you reconcile what will happen when people have many choices?

Raph: interoperability standards, OpenID

Question by Susan: people are being overwhelmed by choice now, what happens in 5 years? How compete for people's attention?

Raph: don't even understand the question. who watched buffy? emotional investment in buffy similar to WoW. Of course, there will be big sites and small sites. Good shows / worlds will get cancelled, people will gravitate to worlds that interest them.

Craig: if you're in the audience and you're wondering if it's too late, no, it's not. You still have time to build interesting worlds. In that space, there will be many, many, many winners. It's a mistake to look at where you fit in versus somebody now. It's time to put on blinders and build a world which fits your vision. When the question is asked, which world you go to? It's like the time you spent as a kid, going to school OR playing soccer OR hanging out with your friends? No, all of those.

Lane: cable channel analogy. 50 channels? how could they thrive against the big three broadcast networks!

Question by Susan: another way of asking it: look at social networks, thousands, majority of users concentrated on a very few of them. as we move immersive, are we going to see that? club penguin, gaia online, we see deep segmentation. what do you think the distribution of success will look like?

Ginsu: try but it's hard to not sound self-serving or be self-serving. at the point where we are, cost of virtual world creation is expensive. Easy to do a web site, channels are expensive. if you're going to create and experiment in a way that is open and extensible world where you don't have to hire 50 developers, spend millions of dollars. If you had a system like that which was open to everybody...that would be pretty cool. That's what we're chasing at Second Life. Vast majority of users are consumers. Small, powerful, minority are creators. Not just virtual shirts, shoes, things like that. It's about having a large virtual space to yourself, managing community, managing experiences of others.

Question by Susan: what metrics do you use to measure your site's success?

Craig: number of users, time spent, 4-5 secondary metrics: retention rate, revenue, etc. whole site is fundamentally free, revenue generation is not chief goal

Ginsu: several hundred dashboard reports daily, about 20 everyone looks at, other people look at specialized reports.

Lane: quite simple, put a lot of time and effort into listening to the audience. spends time reading blogs, looking at forums. users are very quick to say it's not fun and not safe. easy to quickly see where things are because they're a great vocal demographic. Have people on staff solely to keep an eye on blogs, find out what people like and don't. Working in real time means they don't have to wait for service packs, can roll out changes real time.

Lane: 70% of staff are doing customer service

Raph: conversion is an important metric which didn't get mentioned, uniques v. 30 / 60 day trailing revisits. Linden has now released stats showing users checking in every 3 months, used to be every other day. Need to know how many people are bouncing off their sites, how many sticking and core.
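For the curious, here is one way the "uniques v. trailing revisits" comparison could be computed. This is only a sketch of my reading of Raph's remark, not anything shown on stage: the visit log is made up, and "returning" is assumed to mean a visitor seen on two or more different days inside the trailing window.

```python
from datetime import date, timedelta

# Made-up visit log: (visitor id, day of visit).
VISITS = [
    ("a", date(2007, 4, 1)), ("a", date(2007, 4, 15)),
    ("b", date(2007, 3, 1)), ("b", date(2007, 4, 10)),
    ("c", date(2007, 4, 18)),
    ("d", date(2007, 2, 1)),
]

def trailing_revisits(visits, as_of, days):
    """Return (uniques, returning) for the window of `days` ending on as_of.

    uniques: visitors seen at least once in the window.
    returning: visitors seen on two or more different days in the window
    (an assumed definition, chosen for this example).
    """
    start = as_of - timedelta(days=days)
    days_seen = {}
    for visitor, day in visits:
        if start < day <= as_of:
            days_seen.setdefault(visitor, set()).add(day)
    uniques = len(days_seen)
    returning = sum(1 for seen in days_seen.values() if len(seen) > 1)
    return uniques, returning

for window in (30, 60):
    uniques, returning = trailing_revisits(VISITS, date(2007, 4, 19), window)
    print(window, "days:", uniques, "uniques,", returning, "returning")
```

A site where uniques stay high while the returning count sinks is the "bouncing off" case he describes.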

Joichi: drag it back from numbers, look at behavior. It's hard to change behavior. This co-presence thing is a trend but we don't control it. WoW is great because they figured out what was going on and added a little bit of value to it. It's rare to hit upon something new which is going to change everything. Flickr isn't successful because they don't have an e or because it's blue, it's because they spotted what people wanted and fed it. A lot of technical people think it's just feature add, we need to think about it more like sociological anthropology.

posted at 16:33 PDT (-0700)

Islands in a Frothy Ocean

Session: Licensing User-Generated Content With: Fred von Lohmann, EFF

This was a fast-paced high-level look at some of the issues you get into when your company wants to encourage users to make stuff and distribute it through you. This is some of the hoopla around Web 2.0, right here. Crowdsourcing, community building, whatever you want to call it.


  • Licensing inputs
    • on the shoulders of giants
    • up-loaders who don't own content
  • Licensing outputs
    • reuse, recycle, re-mix

About inputs:

Poster child for angry giant shoulders is Viacom v. Google.

Basic problem: when it comes to copyright, there's a big ocean of uncertainty. Statutory damages and personal liability because there's no shield, which can reach up to the officers, directors and the investors.

Four islands of certainty in the ocean. The tide-line moves, so you're never really sure where you are. These so-called "safe harbors" eliminate monetary damages and limit injunctions.

  1. Conduit Island
    • if you're an ISP
    • solely providing connectivity
    • it's not your fault
  2. Caching Island
    • "nothing grows"
    • designed for AOL's caching circa 1997
    • only works when user requests content
    • no forward caching allowed
    • doesn't help akamai
  3. Search Engine Island
    • indexing
    • searching
    • directories
    • linking
  4. Hosting Island
    • most important in web 2.0 world
    • designed for web-hosting companies
    • couldn't guarantee all pages by all users didn't infringe

You don't have to be on an island to build a business, q.v. BitTorrent.

web hosting + search engine = eBay. Now many more companies combine safe islands. MOG is an idea to improve music blogging [?!] and is a new-ish company using several islands.

Myspace and youtube and similar companies are betting that they're above the tide-line.

How to get on an island, the basics:

  • register a copyright agent
    • this costs $40
    • trivial to do
    • SO DO IT
  • notice and takedown
    • copyright owners have to follow some rules
    • if they jump through the hoops, you must comply or be cast off the island
  • infringer termination policy
    • user with lots of complaints (i.e., more than 2)
    • you need to close their account

(Not) Staying on the Island:

  • "Red Flag" Knowledge
    • if you know a user is infringing
    • don't do anything about it
    • you get pushed off the island
    • if you had evidence indicating obvious infringement
    • perversely, the more you know about the uploaded content, the bigger your exposure and the more culpable you may be
  • Direct Financial Benefit + Control
    • if the infringing directly benefits you
    • "youtube loses because they have ads"
    • youtube segregates ads from the video pages, themselves
    • control is more than just being able to delete / takedown content

For more information, call your lawyer. Do it now. Don't wait too long because it may change your biz model, software architecture, employee policies.


About outputs:

How do you attract content re-users?

flickr is a use-case for this. Not only allowed upload, allowed users to get pictures. If someone is in the business of selling stock photos, they're already being obsoleted by flickr.

How does flickr make it easy to re-use content from their site? Creative Commons. That's the short answer. CC has a content curators page, making it easy to find content under CC licenses. Also a search facility which lets you search the whole web for CC content.

  • Attracting the re-users
    • Findable
    • Usable
    • Simple
    • flickr interface is pretty great
      • search by content license type
      • including refined CC license subtypes
    • flickr has a page showing CC subtype categories, lets you browse
    • attribution-noncommercial-noderivs most popular
  • Giving creators reasons to use CC licenses
    • vast majority of flickr users do not license using CC
    • explain the licenses
    • default
      • permissions for pictures
      • set in profile
    • batch changes make it easy to relicense

Audience QA:

  • good examples of commercial license implementations?
    • CC has standardization
    • CC has internationalization
    • commercial context, much harder to achieve
      • more complicated
      • internationalization problem
      • can probably be done
      • Revver is maybe at the forefront on this
  • what happens if you're operating outside the US?
    • the DMCA harbors are part of the reason many ISPs are here
    • in many other countries, no islands, only the ocean
    • US has most articulate protections, legal principles
    • protected in US doesn't mean you're protected internationally
    • internet is international, copyright law is not
  • "I take a picture of you, upload it to flickr, license it attribution-only, can people do whatever they want without your permission?"
    • complicated question
    • simple answer: no violation of the photographer's copyright
    • complicated addendum: may violate subject's privacy rights
    • subject may have recourse to stop use of image if used commercially
    • depends on what subject and photographer are doing at the time and where they are but it's more likely some other legal problem, not copyright
  • what about a site devoted to video mashups? End-product might be legal but what about the raw material uploads?
    • easy answer: license the raw materials and you're fine
    • safe harbors/islands should shelter you if you obey the takedown stuff
    • but once you're doing the mashups, are you still on an island?
    • even if it's fair use for the end-user, might not be fair use for the service provider, like Kinko's photocopying and selling a textbook
  • what does non-commercial mean?
    • enormous debate in the CC community about this
    • many uses we can agree are or aren't commercial, and many with no agreement
    • lots of discussion on the CC wiki / site
posted at 15:59 PDT (-0700)

Cassandra Media

Hey, remember Indymedia? I first heard about it and visited it back in November, 1999. It was because of this site that I went to the WTO Protests. I followed the Independent Media Center for years but never felt like I had the time to get involved.

The last time I looked at their site, it had been overrun by race-baiting hate-mongers for whom I had no respect and with whom I had no desire to interact.

Today I spotted an event on the web2open chalkboard about Indymedia. I went to it. It was fascinating. I had failed to realize just how strongly IMC had foreshadowed the rise of user-created, user-uploaded, user-annotated content. IMC was web 2.0 before there was such a thing.

So what's happened in the years since it started?

Well, companies and organizations and technologies sprang up to do what IMC had been doing but making money at it, because big companies spent big money on pushing this field. So now IMC is lagging, stuck, and hurting. They need volunteers, they need resources, they need content, they need software.

So maybe I'll finally pitch in and lend a hand.

posted at 15:23 PDT (-0700)

You Had to Be There

I didn't write anything about the keynote pieces because there were thousands of people watching them, many in person, and I didn't think I could add anything to them. I was most excited by the world-changing bits, like the Architecture for Humanity and the Potenco talks and I wish I could have had more time to ask the representative from Instructables about how the killings at Virginia Tech changed her presentation ... but not enough to have actually asked her when I saw her in the lunch space and again on the escalator.

Or at the Fred von Lohmann presentation. Man, she's everywhere.

OK, I had ten minutes so I went and asked her and she gave me a robot sticker!

[image: robot sticker]

Also, she superbly explained the impact and how it changed things. It's this: because the community on Instructables is all about building guns out of K'Nex, the point she wanted to make clear is that the valuable part here is that they're making stuff, that they're developing engineering and social skills. It's not chiefly about the guns. They're engineering guns because that's what teenage boys are into.

Because they're building a community, each person involved is one less loner. So it can be a great liberator, giving people in isolated areas a sense of connection, of belonging, of making and doing and sharing and learning. When they develop new interests, they'll take those making skills with them. So now I (think I) know what she was saying and so do you.

posted at 07:12 PDT (-0700)

Less Disgusting Than Anticipated

Session: Top 5 Do's and Don'ts for Measuring Web 2.0 (Yes, that's really how it was punctuated.)

With: Akin Arikan, Unica

I tried to branch out and go see some less technical presentations. This one turned out to be a really fun one and gave me the idea that I understood what was going on in the heads of Marketing people. Yeah, that illusion won't last. What follows is an improvisational summary of what was said.


Who here likes to work with salespeople? Nobody? Why is that? Because they're pushy. But they're paying attention to you all the time, even reading your body language. So who should you hate? Right, us. Marketing people. Because who makes spam? We do. Cramming messages down the throat of prospects.

Web 2.0 != Spam 2.0

What is it? Build brand through amplifying customers. Give unique value through social intelligence. Create better user experiences.

DO WEB ANALYTICS. Don't Just Measure to Improve Usability and Conversion Rates -- there's much more you can do with it.

Unica sells software to marketing departments: campaign management software and the web analytics product NetInsight.

Levels of metric analysis and use.

  1. Optimize Web 2.0 applications
    • for usability
    • for conversion rates
    • for engagement
  2. Market Insight
    • Capture Social Intelligence
      • watch how people use the tools
      • figure out what they really want
  3. Relationship Marketing
    • Build a Profile
    • Act on it
    • This is Marketing trying to be more like a good salesperson, listening

Case-study: Imagine a product review and participation site, where users can review, on feature level, respond to each other's review, score what matters most to the user. Overlay number collecting interface on the unstructured data.

How to proceed?

  • think of measurement from beginning
  • don't think of page views, it doesn't matter in web 2.0
  • don't use server log files

Business goals of Web 2.0 application

  • drive traffic
    • get more visitors
      • unique visitors
      • engagement metrics
        • session length
        • comments
        • uploads
        • invitations
    • viral buzz
    • repeat visits
  • drive revenue
    • convert visitors to buyers
      • revenue
      • conversions
    • up-sell & cross-sell
  • build brand
    • create customer relationships
    • get direct feedback

When page views won't cut it, use event tagging to record actions. ActionScript, JavaScript, pixel tag. Like page bugs, zero-size images.
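As a rough illustration of the pixel-tag idea, the sketch below builds the URL of a zero-size image that carries an event name and its details in the query string, then shows what a collector could recover from the request. None of this came from the talk: the host `stats.example.com`, the path `/e.gif` and the parameter names `ev` and `vid` are made up for the example and are not Unica's or anyone else's actual API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collector endpoint; any server that logs requests would do.
COLLECTOR = "http://stats.example.com/e.gif"

def pixel_url(event, visitor_id, **details):
    """Build the URL of a zero-size image that records one user action."""
    params = {"ev": event, "vid": visitor_id}
    params.update(details)
    return COLLECTOR + "?" + urlencode(params)

def decode_hit(url):
    """Recover the event fields from the URL the collector was asked for."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}

url = pixel_url("comment_posted", "v123", review="42")
print(url)
print(decode_hit(url))
```

The page script would request that URL whenever the action happens, so the action gets logged even though no new page was viewed.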

Measure the contribution of web 2.0 applications to your revenue, conversion, things you want out of your site. Segmentation of data is your friend.

Click-stream analysis becomes event-stream analysis. What actions did the visitor take, since it's no longer tied to page views.

Use analytics to measure community, commerce and engagement. Segment, segment, segment.

Measure to learn about market & demand. Capture social intelligence.

Measure to serve individual customers. Crown jewel of web analytics. Funnel reports are the most important report in web analytics.
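To make the funnel idea concrete for myself, here is a small sketch of a funnel report computed from an event stream and split by segment. The events, the segment names and the three-step funnel are all invented for illustration; a real analytics package will have its own definitions.

```python
from collections import defaultdict

# Made-up event stream: (visitor id, segment, action), in time order.
EVENTS = [
    ("a", "search", "view_review"),
    ("a", "search", "post_comment"),
    ("a", "search", "purchase"),
    ("b", "search", "view_review"),
    ("c", "email", "view_review"),
    ("c", "email", "post_comment"),
    ("d", "email", "purchase"),  # skipped the earlier steps, so not counted
]

# Hypothetical funnel: the ordered actions that lead to a conversion.
FUNNEL = ["view_review", "post_comment", "purchase"]

def funnel_report(events, funnel):
    """Count, per segment, how many visitors reached each funnel step in order."""
    progress = defaultdict(int)  # visitor -> number of funnel steps completed
    segment_of = {}
    for visitor, segment, action in events:
        segment_of[visitor] = segment
        step = progress[visitor]
        if step < len(funnel) and action == funnel[step]:
            progress[visitor] = step + 1
    report = defaultdict(lambda: [0] * len(funnel))
    for visitor, steps_done in progress.items():
        for i in range(steps_done):
            report[segment_of[visitor]][i] += 1
    return dict(report)

for segment, counts in funnel_report(EVENTS, FUNNEL).items():
    print(segment, dict(zip(FUNNEL, counts)))
```

Reading each row left to right shows where a segment drops out, which is the "segment, segment, segment" advice applied to the funnel.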

Don't ignore off-line effects of online activity.

Jupiter Research says more people are doing online research and then buying off-line. (The bastards!) Try to measure if the online stuff is influencing their off-line behavior. But how?

  • correlate trends, online + off-line
  • display & retrieve customer codes
  • display unique 800 numbers
  • buy online, pick up in store
  • promotional coupons, encode the source of the visit or a visit handle
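The coupon trick in the last bullet could look something like the sketch below: the code printed on the coupon carries the traffic source and a visit handle, plus a short keyed checksum so that a mistyped or invented code is rejected at the register. The code layout, the six-character check and the `SECRET` value are my own assumptions for illustration, not something described in the talk.

```python
import hashlib
import hmac

# Placeholder secret; a real deployment would keep this out of source control.
SECRET = b"change-me"

def _check(payload):
    """Short keyed checksum over the payload (first 6 hex digits of an HMAC)."""
    digest = hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:6].upper()

def make_coupon(source, visit_id):
    """Encode the traffic source and a visit handle into a printable code."""
    payload = f"{source.upper()}-{visit_id:06d}"
    return f"{payload}-{_check(payload)}"

def read_coupon(code):
    """Return (source, visit_id) for a valid code, or None if the check fails."""
    payload, _, check = code.strip().upper().rpartition("-")
    if not payload:
        return None
    if not hmac.compare_digest(check.encode("utf-8"), _check(payload).encode("utf-8")):
        return None
    source, _, visit = payload.rpartition("-")
    return source, int(visit)

code = make_coupon("email", 42)
print(code)
print(read_coupon(code))
```

When the coupon is redeemed in the store, the decoded source and visit handle can be matched back to the online session that produced it.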

How to measure individuals' off-line conversions triggered by online marketing?

  • direct response
  • inferred response
    • match up contacts
    • loyalty cards
    • accounts
  • entice online registration
  • feed user activity into CRM or SFA
  • prioritize off-line treatment
  • entice identification in stores off-line

Audience Q&A

  • What about RSS?
    • unique cookies in the feed, that's the only nice thing
    • if you're syndicating through feedburner or something, read without information feedback, but feedburner provides some data back
  • How about widgets?
    • dark spot in Akin's knowledge
    • no one stays at the same web site, need to be able to measure widget impact
    • think about tagging the widget
    • run into third-party cookie problem
  • How to make sense of user-generated content
    • don't stick to categories
    • inject ways of making data numeric
    • try to find heuristics to measure unstructured content
  • What to do about flash video
    • uniquely craft content to result in unique action
      • like a special URL
      • tagging
    • many providers for embedding video in many places
    • maybe set a cookie during video viewing
    • common wisdom is that reaction is 2-3 weeks lag
  • If you don't sell anything, and it's not commercial, how can you measure if a change is working? What's key performance indicator?
    • engagement: are people staying longer, reading more, scrolling down?
    • reach: unique visitors
posted at 07:04 PDT (-0700)

Tue, 17 Apr 2007

At Last the Circle is Complete

Some time ago my friend JDD showed me some code he was building and asked me to do some compiling of it to see if it would build on Debian. Eventually it did. Now I guess that code is all grown up.

I passed by the nook where he was giving a demo of it today. It looks like the kind of code which you'll really like if it's the kind of thing you like. It's called mod_ndb and it seems to do something Web 2.0ish. If I were more elite, I think I could understand it.

UPDATE 2007/12/30: Or Maybe Not
How dare he! JDD has developed a second software project. So this isn't the one I'd seen before. I just can't tell different MySQL APIs apart, even from close up.

posted at 18:20 PDT (-0700)

This is Comparing, Not Contrasting

So, low battery and fried brain, but I went to a panel about different frameworks named Comparing Web Application Frameworks. Dustin Whittle couldn't make it so I learned nothing about Symfony. I'm not sure I mind. PHP, fooey.

The panel representatives talked about Django, Seaside, and Rails and they even ran a little over time, they had so much to say.

The description of Django was a recap of an earlier presentation I saw so I won't repeat myself. Nor Adrian. Seaside is in Smalltalk and is positioned deliberately as a heretical web framework. It throws away a lot of the sacred cows of web UI, such as an underlying relational database, such as human readable pretty URLs, such as keeping the user interaction stateless, such as using a templating language. Rails you probably already know about even if you don't know anything about it.

Avi says that Seaside uncouples designer from developer. Developers should create HTML using a framework and the designers should concern themselves solely with CSS. Seaside is better for web applications than it is for web sites, perhaps. What's the distinction? Hard to say, but Adrian takes a stab at defining it in a narrow way. He cites DabbleDB as an excellent web application, in that it logically extends the desktop paradigm onto the web but is not a website, with all which that implies. Primarily, with a website, hypertext is a first class citizen.

Avi talks about Seaside not supporting a default persistence strategy, unlike both Rails and Django which are conceptually coupled to the idea of a relational database of some sort.

Adrian talks about the Washington Post, he works with Django every day there, sees all the places he wants it to be better. A strength of Django is that it reads like standard Python. If you can work in Python, you can work with Django. The idioms conform to expectations.

The first version of Seaside was a port from other languages of things Avi wanted, largely inspired by WebObjects and Tapestry and that was a mistake because Smalltalkers hated it. Version two was a rewrite much more in line with Smalltalk ideals.

Several questions were answered with glibness and all of the speakers handled themselves with aplomb. Some questioners in the audience seemed interested in how to force their developers to use one of these frameworks. All the speakers opposed that idea. Let the developers use the framework which excites them.

Lots more talked about but I didn't capture it, for better or worse.

posted at 18:15 PDT (-0700)

We're Not Saying No Because We Can't Do It

Session: Building Awesome Web Sites & Services Using the Power of Happy Users

A loose transcript, full of errors in hearing and typing. More like an impressionist version of the actual panel.

Question by Rheingold: How do you know customers want to be involved, how do you let customers know about product? Have product first? Have community first?

Biz Stone: need product first; has phone number on twitter page, probably won't scale, Jack has # there too. Always blogging, reading blogs, emailing. If you have a product you love, your enthusiasm is contagious

Joshua Schachter: expansion of a single-user system to be multi-user so building something intrinsically useful to himself, was first customer, after opening it up, users started showing up. Product was its own marketing, useful to connect product to other things in the ecosystem

Stewart Butterfield: problem with how to keep same inter-activeness while scaling

Biz: but amazon / ebay must have user forums for feedback, places users can react to them

Joshua: yeah, but off-brand, so easier for them to ignore, dismiss. no forums on delicious, but mailing list of 1000s users, blogs


Question by Rheingold: how deeply do you commit to a public api, does it matter? is lots of feedback the key indicator that it will work?

Joshua: easy to have a closed feedback loop when he was the first customer and developer, still sees all incoming customer support email but doesn't respond

Biz: it's not completely necessary; if you love it, feedback doesn't matter

Rheingold: users didn't provide code but were happy to beta test, spell-check, give feedback

Joshua: have api so that others can build stuff with it that are interesting, create new features which aren't as interesting to the core creators, turn potential competitors into creators using your apis

====

Question by Rheingold: what drives people to help you, create for you, is it wanting to boost your corporate bottom line?

Stewart: one of the biggest motivators is recognition of accomplishment. Anyone with a blog or who has written to a large group about something you feel strongly about, recognition for that is good. Want to share their cool ideas, and people respond to that. Some flickr api users are doing it for monetary gain, creating applications which make money for them. Others are just helpful and nice and get recognition for doing good deeds

Biz: if they were doing it to increase value of company, they wouldn't care.

Rheingold: dogster, catster contribute because there's nothing else like it.


Question by Rheingold: do you think of users as contributing members of your team, do you worry that you're letting down the community / users because of your decisions?

Joshua: you always have to evaluate each decision, is it a positive step? many decisions which seem wrong at first glance are driven by deeper understanding of problem. People have wanted to scrape entire site's data, but when they were a small site, that would take the whole thing down. Newspapers have delicious this! button but want to pre-stock tags, to dictate what tags users put on the link. It's easier for the users to get what they want out of it if you don't force it like that, let them tag it however they want to tag it. Despite it making extra work for them, it's better in the aggregate.

Stewart: speaking only for himself, it's a big obligation. Meet-ups in far off places, flickr became part of their life. He sweeps his streets twice a week, to improve his corner of the world. People have always made contributions, this isn't new with web 2.0


Question by Rheingold: concerns about acquisition affecting user / response

Stewart: there's a part of big companies called corporate development, looks for companies to acquire. flickr appealing to different groups in yahoo, photo group, search group because of meta-data. Lots of headaches but one of them is not company direction, still doing what he wants.

Biz: blogger acquired by google, enabled him to work at google, stepped up interaction with users...

Followup question by Rheingold: was this a concern going in?

Biz: I'm sure there was, wasn't on the team at that time. Even after acquisition, struggle to switch over infrastructure, gave him exposure to users.


Question from audience: flickr forced users to merge with Yahoo account, delicious not forcing that, wtf?!

Stewart: it was a trade-off: 6.5M new users gained, 1500 unhappy emails

Joshua: one thing that acquisition highlighted is how company engineers see identity, very different from how users see it. Delicious uses identity differently than flickr, tied closely to login, needs to tease apart before they would merge; they haven't merged because they can't in the short term. They want to and will at some point when their account / identity information is teased apart


Question by Rheingold: do you have soothsayers, special group of users, rely on for feedback, keep you on track?

Joshua: user group on yahoo groups, toss out ideas to them, get different viewpoints. It used to be more active when he was doing it alone, because he'd be up all night developing, release, go to bed, leave bad bugs, have lots of feedback from it when he woke up

Stewart: large number of users in different categories who are vocal, provide feedback. (He takes an audience straw-poll which indicates lots of people have flickr account, maybe 25% of them really really like it.) The majority of flickr account creations don't spend a lot of time with their account. There's a big danger in listening to only the people who love it because then you don't know what's wrong, why the uptake isn't higher; freaked out by possibility of public, using shutterfly or something

Biz: do analysis, have a friends of twitter group, friends and family they can release half-baked feature to and get feedback.


Question by Rheingold: early on did you release half-baked features to get direction or was the concern to release bullet proof?

Joshua: always as fast as possible, you have to get quick feedback in order to learn. Do three releases a week, to get quick revs, at this point. Mostly scaling UI recently, pushing an entirely new UI sometime, expected to be very painful. When started, coded on live site, because no stage server, very fast feedback indeed, when he made a bug. Nice to turn stuff around fast, harder on a flickr scale. If a feature can't be made to have very fast use cases, can't be done at all. Several hundred machines, hard to turn them over on short notice. Yahoo has resources for QA, so now do some testing before it goes out the door, difficult to get full coverage, even with.

Biz: release REALLY half-baked features, blogger is labs anecdote

Stewart: half-baked stuff, flickr three years old, completely different service now. only stuff same is the profile page but that's just because they haven't gotten to it yet. hard to do feature progression because priority changes so rapidly. Going down a path where you do feature A so you can build on it next to make feature B so you can make feature C doesn't work because after A, you'll get pulled in another direction and never get to B or C. When they changed the UI early on they got for the first time an email which is now a common response to any change: "I had a screwed up childhood, I don't adapt to change well, you have to change it back."

Biz: early days of blogger, lacked photo feature, button for photo pointed people to flickr, lots of excitement, because they were able to interact with the flickr creators

Stewart: early days, flickr founders spent lots of time / effort giving social love to the new users, build this strong community, large number of the users are still there

Biz: spends time reading twitter feeds, when seeing new users who are unsure, give them attention to help rapport, grow the user community


Question from Audience: how to not get the boilerplate email, get the attention of companies with your great idea for making something new or a killer feature for their site?

Joshua: api are part of the means, for delicious, it's not about the code, it's dealing with the scale. Code is not the constraint, it's get features faster. Delicious works the way it does because of limitations in MySQL. People ask for things which they could do, things in their lexicon, which they expect. Example: want to alphabetize, so they can sort. That's their model for how to organize information. People ask for stars to rate items, they've seen it elsewhere. Why bookmark something which is one star? People ask for features they expect, even if they're not useful. apis can be a weakness. Users have a problem, don't know the solution, so will ask for something 'nearby'. Things succeed by being simpler. On the social side, #1 request is people want to see most book-marked sites. But: 1) it's not surprising/interesting, 2) if they do it, someone will try to game it


Question by Rheingold: What's this twitter wiki thing?

Biz: twitter fan wiki, user created. they struggle within twitter to not build features, to not complicate things. Fans/users have organized all the things using twitter apis. When someone wants to build something, Biz points them to Google twitter group, then points them at twitter fan wiki, things people have already done. Point people away from the twitter company, so they can focus on core goals


Question from Audience: what did you do to spread word early?

Joshua: RSS is hugely useful api/marketing tool so anything they could give an rss feed has one. More than half traffic today is rss requests.

Biz: RSS yeah

Stewart: all the different widgets and upload tools they made to make it painless to put pictures in and drop images into other contexts


Question from Rheingold: have you hired from within the user community, how have you found those users, communicated to them?

Stewart: many people, have a qa guy they found in their forums when they needed to have someone. lead designer for flickr was someone who played the mmorpg flickr company started out making. Cal Henderson gets props, built a bunch of stuff on the mmorpg game's api, hacked into dev mailing list, read it for several months, suggested new features. Always prefer to hire from within the community, now, especially for public facing positions.

Biz: everyone hired has been an end-user. It's not that they saw them as a super user and sought them. It's just that those people get it. Red flag if someone came in to interview and had never used the service. Have people do 10-20 hour project first as a trial run

Joshua: consult for a month and then hire. After acquisition, dude wrote a book, so they hired him. His first job was changing the api, which invalidated his book. Oops. (Well, I laughed.)

Biz: hired by blogger because he was a user


Question from Audience: (long and inaudible, sorry)

Joshua: don't tend to think that far in advance. a lot is about how I feel about the future of the thing. how does it fit in with future vision, how easy to do. struggling with scale, firefox extension, a million users every second. It's all about what scales, what they can do. A lot of features are easy with underlying technology, a lot of things they will never do. Try to be relatively communicative, when asked for something they won't do, try to explain why they won't. One email a day requesting ability to vote on tags that other people are allowed to use. A lot of features tend to be glosses on things they've already done.

posted at 17:43 PDT (-0700)

That's a Nice Site You Got There; It Would Be a Shame if the Digg Army Happened to It

Session: Case Study: Digging into the Technology Behind the Development of Digg

Presenter: Owen Byrne

This is my stream of mistyping impression of the session.

  • Introduction
    • Kevin Rose had the idea in October 2004
    • Inspired by sites Rose liked
      • slashdot, intended as slashdot killer
      • del.icio.us
      • friendster
      • macrumors.com page 2
      • great content that didn't make it to the home page of many news sites
      • built around the core concept of links
    • democratizing media for all
      • passionate community
      • mainstream editors and reporters sit on the Digg homepage, news sites have digg! buttons
      • readily adaptable to other media - kind of a youtube highlighter, among others
    • stats
      • 1M registered users
      • 10M pages / day
      • 15M unique visitors / month
      • 10% monthly growth in users & pageviews
      • 100 Linux boxes
        • web servers
        • memcache servers
        • mysql slaves
        • development
  • Case Study - Interactive / How to

    • Byrne used to teach, had the stick to motivate participation with 25% of their marks
    • launching
      • Kevin spent $2k to launch
      • met on elance doing other work
      • Kevin developed spec
      • Owen did work on there
      • built on LAMP
      • simple utilitarian design because had no full time designer
      • hosted at $99/month with Rackshack
    • feature decisions
      • innovate - take stuff liked from flickr or whatever but not too much like
      • simple and rewarding, give users quick fix of satisfaction
      • use "AJAX" where it made sense, this was before it was called that
      • tools to connect to other sites - blog this, JS widget with digg top headlines
      • experiment with the data visualization - digg spy, cloud view, etc
    • pre-funding
      • no need for monitoring because someone was working on it most of the time
      • standard LAMP, where P is php
      • growth constrained by hardware
        • optimize queries, when they start to slow site
        • denormalization, because database is fastest then
        • Jeremy Zawodny's High Performance MySQL was a big boon to them
    • growth
      • Paris Hilton - doubled traffic when they became 1st & 3rd on yahoo's return for "paris hilton phonebook"
      • Thomas Hawk / Price Rite Photo - digg army crushed it
      • Word of mouth, PR, minimal advertising - 1 ad on boingboing for 1 month, $100
      • new categories added in June 2006
      • steady growth with occasional insanity
    • problems
      • log files
      • myisam bad, innodb good
      • mysql full-text search doesn't scale
      • javascript compression
    • seed funding investment
      • used to buy small number of servers - web server, mysql master, mysql slave, this quadrupled their server stack
      • ad-hoc monitoring
      • hired 1 dedicated operations person
      • silverorange design
      • growth outstrips hardware but more constrained by developer resources
    • July 2005, digg 2.0
    • Series A investment in August 2005
      • spend a bunch on servers
      • everyone in the same location, finally
      • operations department - add dba, ops manager
      • hired senior developers - 2, one of the PHP 5 gurus, tripled number of developers
    • new architecture, digg 3.0
      • released july 2006
      • LAMP + memcached
      • MySQL 5 using innodb
      • Lucene full-text search, search was broken for a long time, fixed a few months ago
    • open source is the win
      • monitoring
        • nagios
        • cacti
      • javascript
        • prototype
        • scriptaculous
    • dev process
      • LAMP stack
      • subversion
      • bugzilla - bad, but nothing better
      • wiki
      • virtual hosts - each dev has 1-3 for testing
  • Conclusion

    • engage with the users
      • energize your efforts, reduce sense of isolation
    • don't forget about the business model
      • interested in getting ads in early and make money right away
    • be frugal
      • maximizes luck by letting you last longer
    • think about scaling early
      • site rewritten from scratch 3 times
      • if he had planned for growth earlier, not so much
      • make it more object oriented for loose coupling
  • Q&A

    • decentralized dev because developers were originally all over
    • code is written to run in any environment with minimal changes
    • listening to users early on shaped it, Kevin's audience that he brought with him from TechTV
    • not using php frameworks, Kevin wanted it all from scratch
    • going from the $99 hosting solution to their own boxes, it was a joy because Owen was a Debian user. :)
    • hardware transitions were much less of a problem than software transitions, like the mysql storage engine going myisam to innodb
    • future plans - architecture is scalable now, but to a factor of 10
    • stay tuned for a possible public API to the digg/bury functionality for other sites
    • distinction from competition - they were first, large user community
    • two cycles
      • large cycle feature, full acceptance test, 2-5 days
      • emergency pushes for small tweaks or need now, 30 minutes
    • http://digg.com/jobs
    • Not yet profitable but close to it.
    • Secret of success was the cold winter which kept him in at a computer.
    • Originally doing work at $20/hr, gave Kevin a discount.
    • Digg User Celebration
      • Thursday 7-11 pm
      • mezzanine in SF 444 Jessie @ Mint
      • Details & RSVP link on the Digg Blog
      • Must RSVP to get in
      • Announcements are made by Kevin
  • Goal: provide with insights

    • yeah, I'd say he managed that

posted at 14:38 PDT (-0700)

Try to Have Friends Who Aren't Where You Are

Session: Geographic Distribution for Global Web Application Performance With: Jacob Rosenberg, AOL

These are my self-important (in several senses) records of the presentation. YMMV. Refer to the official presentation slides for facts.

Two basic ways to handle geographic distribution.

  • content delivery networks
  • multiple physical sites

It's hard to find a page created of less than forty distinct objects. Craigslist is an outlier with three.

You need something like Keynote to see what response times are like from different parts of the world.

Time to load pages quadruples from west coast to east, doubles again once you cross an ocean, and doubles again on the opposite side of the globe from where the site is located. So that's something like x16 in India under optimal conditions.
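As a quick check of that arithmetic, here is a minimal sketch. The multipliers are the ones from my notes; the 0.5 second baseline and the function name are assumptions of mine, purely for illustration.

```python
# Rough multipliers from the notes above; the baseline is an assumed figure.
BASELINE_SECONDS = 0.5  # hypothetical page load time close to the servers

HOPS = [
    ("coast to coast", 4),
    ("across an ocean", 2),
    ("opposite side of the globe", 2),
]


def cumulative_multipliers(hops):
    """Return (label, cumulative multiplier) pairs, compounding each hop."""
    total = 1
    result = []
    for label, factor in hops:
        total *= factor
        result.append((label, total))
    return result


for label, multiplier in cumulative_multipliers(HOPS):
    seconds = BASELINE_SECONDS * multiplier
    print(f"{label}: x{multiplier}, about {seconds:.1f} s")
```

The point is only that the factors compound rather than add, which is why a page that feels fine at home can feel painful from far away.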

Akamai paper and Nielsen's paper have some analysis of user impatience.

How to check it

  • Latency emulator, firefox has some plug-ins
  • Recruit geographically diverse beta community
  • Remember: nobody ever complains that a site runs too quickly

Rough guidelines

  • get site entry point under 2 seconds, because it's first impression
  • AJAX feels sluggish ~100 ms / request
  • big multimedia? 25k loader, to fetch the rest
  • test on low speed links, crowded wifi worse than historical modem speeds

Two paths to resolution.

  • CDN
    • caching system
      • need multiple distribution points in geographically distributed places
      • technology for localizing to point user at nearest point
    • Akamai the 800 pound gorilla, Wikipedia rolled their own using squid
    • not only a performance improvement, but takes load off origin servers, where content originates, reducing hosting costs or capacity use
    • there are the starts of open source DNS localization but it's not great yet
    • how to implement
      • form a relationship with a CDN provider
        • alternate static content name, like cdn.mysite.com
        • maybe use an alternate base domain and keep it cookie free
      • provision the name on the CDN provider
        • origin server name
        • serving server name
        • cache duration
      • make dns changes needed on your site
      • modify src to point at CDN site
      • modify JS & CSS which might have links
        • can put JS & CSS on CDN if they're not dynamically generated
      • version files and set very high expiration so the browser never requests it, this can save you on load times, transfer costs
      • use all possible expiration headers because different proxies respect different ones
        • Cache-Control: max-age
        • Expires
      • may want to pre-load cache if you know you'll be pushing a big slice to them
    • Why use CDN?
      • most content is probably static
      • if it's the same for everyone it's static, even if it changes every 5 minutes
      • small objects are more hindered by latency than big objects
      • many early page objects download serially
      • CDN is relatively cheap
      • it's pretty painless to use
    • Why not use CDN?
      • some content doesn't cache well
        • personalized content
        • ad delivery which depends upon cache busting
      • secure stuff you don't trust others with
        • SSL
        • private information
  • Multi-site

    • serve from more than one site
      • DR
      • performance
    • requires design analysis because applications need to work as expected across multiple distant locations
    • problems
      • session keeping applications, with sticky or cookie based load balancing are difficult to maintain with multiple locations
      • extremely large high-volume content repositories, keeping data consistent is difficult
      • back-end communications for clustering via broadcast or anycast may be difficult
    • how to get there
      • select appropriate site
        • coverage with adequate latency
        • network connectivity can be more important than physical geography
        • use diverse providers and networks to reduce risk
      • deploy application
        • strive to keep congruent, complexity drives cost
        • design to tolerate network interruptions between sites
        • make the web tier as stateless as practical
      • add Global Server Load Balancing to localize users
        • DNS-based with performance localization
        • available in some form on most every switch vendor
        • available as a service, many CDNs, others
        • make GSLB the DNS authority for your sites
        • can route users to address capacity peaks, lulls
        • going multi-site makes DNS application-critical
        • this is the secret sauce of scaling your application to multinational
      • why go multi-site?
        • entire product localized, dynamic and static
        • some regions require a physical presence for legal reasons
          • privacy / data retention laws in the EU
          • network filter requirements in China
        • reduce provider risk
          • multiple unrelated backbones and power grids
          • protect against provider disputes, financial instability, growing pains
          • reduce impact from natural or man-made disaster
      • why not go multi-site?
        • cost
          • more sites = more money
          • need local staff
        • application design
          • some applications just don't work well distributed
          • a few applications don't benefit from lower latency
  • Questions

    • presentations will be up on the site, check page 11 link in a bit
    • Akamai competitors
      • limelight networks
      • level-3 owns sandpiper, used to be owned by savvis, cable & wireless
    • can use both multi-site and CDN in tandem to good effect
    • edge-side include: neat idea, that almost no one uses, push chunks of page assembly to edge caches, page could assemble itself as requested; problems: Akamai only implementor of it of note, not huge performance win, required lots of retool to applications
    • what about multi-site database replication?
      • if db is mostly read, use single master with read-only slaves
      • more complicated if db is read-write
      • teracloud perhaps has something open for db replication
      • no easy way
    • caching and personalization mostly in opposition
      • probably site is partially personalized, edge deliver the rest
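
The file-versioning and expiration-header advice from the CDN notes above could look roughly like this. It is a minimal sketch: the one-year lifetime, the hash-in-the-filename scheme and both function names are my assumptions, not anything shown in the talk.

```python
import hashlib
import time
from email.utils import formatdate

ONE_YEAR = 365 * 24 * 60 * 60  # assumed "very high" expiration, in seconds


def versioned_name(filename, content):
    """Embed a short content hash in the name so any change yields a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, dot, ext = filename.rpartition(".")
    if not dot:
        # No extension: just append the hash.
        return f"{filename}.{digest}"
    return f"{stem}.{digest}.{ext}"


def far_future_headers(now=None, max_age=ONE_YEAR):
    """Build both headers, since different proxies respect different ones."""
    now = time.time() if now is None else now
    return {
        "Cache-Control": f"public, max-age={max_age}",
        "Expires": formatdate(now + max_age, usegmt=True),
    }


if __name__ == "__main__":
    name = versioned_name("site.css", b"body { margin: 0 }")
    print(f"http://cdn.mysite.com/{name}")
    print(far_future_headers())
```

Because the name changes whenever the content does, the long expiration never serves stale files; the old URL simply stops being referenced.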

posted at 10:29 PDT (-0700)

Are You Good Oxygen or Bad Oxygen?

The keynotes, the spectacle of the day, did not disappoint me. But my expectations were low bordering on nonexistent. There were some presenters I didn't know of, but who seemed to serve as proxies for Tim O'Reilly and John Battelle. They had some laugh lines about everyone being gathered there to take down the wireless network and then march onward to destroy Twitter.

Then it was on to show time. Tim O'Reilly came out and had some remarks, about how this bubble isn't a bubble this time, we swear. Then Jeff Bezos gave a presentation on why everybody should bend over for Amazon. Then Tim had a conversation while Jeff remained coy and declined to answer the interesting questions.

Then John Battelle came out and spoke with Joe Kraus, Mena Trott, and Jay Adelson about the premise that creating a company to flip requires a different approach than creating a company to keep. Interestingly, none of the three strongly took the obvious counterpoint other than to suggest that there is a window of in-opportunity where a company is too valuable to obtain cheaply but too poor to be worth spending extravagantly on.

Then there were some five minute pitches by some companies. The point of it being that some sort of straw poll popular response would be recorded afterward via sending a text message to MOZES to show something or other. I wasn't clear on the point of that. Was there money or candy involved? Not for me!

  • Spock, a people search engine. Because there aren't enough ways to find "red hair fashion model naked" on the existing search engines.
  • WebEx Business Applications, because if mashing up useful data is fun, mashing up marketing data should be extra fun. With sufficient mojitos, it probably is. Too bad I hate mojitos.
  • inpowr, which is ... something. Possibly a cult. Possibly yet another self-help pseudo-cult. But the presenter had the best patter and the least evident business plan, so I voted for his dog and pony.

It's too bad that the mozes interface ignored all of our voting for the five minutes I sat there and then later sent me a text message to tell me that my choice didn't exist. Thanks, mozes! I feel extra validated, now.

The keynotes were where all the hype went to live and it was probably the thing most likely to disillusion a skeptic about the business plans of companies sponsoring it. I don't think the web is dead. I do agree with Tim O'Reilly that we're in the VisiCalc era. I just associate that with tedium, hype, over-promising, overpricing, scarcity, frustration and interminable wait.

Some irrelevancies

  • the discussion John Battelle had with the three founders was superb
  • the post title is a riff on a line Battelle used there, about Google being the oxygen, now
  • I suspect the real successes to emerge from this time are going to be the people consciously not doing what "everyone" is doing / knows you should do / believes is vital

posted at 10:11 PDT (-0700)

Things I Wish I'd Known Earlier

... but managed to figure out as I went along.

  • there's a secret room on the second level which simulcasts the keynotes
  • if everyone is on one floor, the facilities on other floors are vacant
    • bathrooms
    • the wifi network
    • the oxygen
  • it's cool to see stuff from other tracks

posted at 09:48 PDT (-0700)

Mon, 16 Apr 2007

Stuff It!

Cal Henderson on Beyond the Filesystem: Designing Large-Scale File Storage and Serving

Needs to be:

  • Scalable
  • Reliable
  • Cheap

Four buckets for this talk:

  • Storage
    • Layers of storage
      • hardware
      • volumes
      • filesystems
      • network access
    • types of devices
      • DAS
      • NAS
      • SAN
    • enormous filesystems
  • Serving
  • BCP(Business Continuity Planning)
  • Cost

Storage tidbits:

Google File System, designed by Google, proprietary. Designed to store huge files and read them back fast. Uses a chunked filesystem which spreads files across nodes in 64 MB chunks and has one single master node which knows where everything is. There's a shadow master for fail-over purposes. Duplicates chunks onto a pair of nodes (or more) so it can read back from any given server. Reading is fast but requires a lease.

MogileFS - an anagram of OMG Files. Developed by Danga / SixApart. Open source. Designed for scalable web app storage. Single metadata store, on top of MySQL, using MySQL cluster to avoid single point of failure. Multiple tracker / storage nodes. Tracker knows where things are, storage nodes store it. Not in a grid like GFS. Uses classes of files so you can establish some types of files are more precious than others. Replication is piecemeal. Read/write managed by trackers but performed directly by storage nodes.

Flickr File System, designed by Flickr, also proprietary. Designed for large web app storage. No metadata store. Multiple storage master nodes, multiple storage nodes. Client talks to SM, SM talks to individual storage nodes or to another SM (like in another data center). Application stores metadata. File writes are done to multiple places, read is done from a known node. Read and write scale separately.

Amazon S3, big disk in the sky. Multiple buckets, user-defined keys. No idea of max bucket size. Individual files can be 5G but can't be between 2-4G (bug). Buckets seem to be limitless in size. Because it's cross http, users can get it directly from Amazon, without putting a burden on your site/servers. Cost to serve data from it is linear, cheaper for earlier traffic than having your own data center.

Serving:

There tend to be data hotspots, a small set of highly demanded data. Caching helps here, by putting the most important pieces in fast/front places and optimizing them. Can use slower cheaper stuff for all data behind caches. Layer 4 cache: simple balanced cache, few objects, multiple places. Layer 7 cache: URL-balanced, one cache per object.
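
To make the layer 7 idea concrete, here is a minimal sketch of balancing on the URL so that each object lives on exactly one cache. The node names and the function name are hypothetical, and hashing modulo the node count is my simplification, not something from the talk.

```python
import hashlib

CACHE_NODES = ["cache-a", "cache-b", "cache-c"]  # hypothetical node names


def cache_for_url(url, nodes=CACHE_NODES):
    """Pick one cache per URL, so every request for an object hits the same node."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    for url in ("/photos/1.jpg", "/photos/2.jpg", "/photos/1.jpg"):
        print(url, "->", cache_for_url(url))
```

One known weakness of the plain modulo approach is that adding or removing a node remaps most URLs; consistent hashing is the usual refinement.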

Replacement policies. LRU, GDSF, LFUDA, etc. Performance varies a lot depending upon which caching policy you use. Benchmark the replacement policies because it makes a huge difference based on your work load.
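
For reference, LRU is the simplest of those policies to write down. This is a minimal sketch of my own, not code from the talk; the class and helper names are made up, and the hit-rate helper is only a toy way to replay a request trace against a policy.

```python
from collections import OrderedDict


class LRUCache:
    """Evict the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


def hit_rate(cache, requests):
    """Replay a list of keys against the cache and return the hit fraction."""
    for key in requests:
        if cache.get(key) is None:
            cache.put(key, key)
    total = cache.hits + cache.misses
    return cache.hits / total if total else 0.0
```

Replaying a recorded trace from your own traffic through a few policies like this is one cheap way to do the benchmarking Cal recommends.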

Cache churn. The shorter it gets, the worse performance. Want objects to stay in cache longer than the span between requests for it. Invalidation is hard, replacement is dumb.

Two models of CDN:

  • simple, you push, they serve
  • reverse proxy, you publish on an origin, they proxy and cache

Problems with CDN are that you don't control the caches. Once it's cached, it can't be changed. (Guess Cal doesn't know that limelight will let us purge cache.) Solution to this is versioning, so we can expire content by changing name, using headers, whatever. Simple rule of thumb: if an item is modified, change its name (URL) so that caches will update / expire. You can advertise a URL with a version number in it, then strip that version off in rewrite to point at versioned-image.
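
The version-in-the-URL trick can be sketched as follows. The ".vN" naming scheme and both function names are my assumptions for illustration; the talk only described the idea, and in practice the stripping would live in the web server's rewrite rules rather than in Python.

```python
import re

# Matches a ".v<digits>" segment sitting just before the final file extension.
VERSION_SEGMENT = re.compile(r"\.v\d+(?=\.[^./]+$)")


def advertise(path, version):
    """Build the URL handed to browsers and caches, e.g. /img/logo.v42.png."""
    stem, dot, ext = path.rpartition(".")
    if not dot:
        raise ValueError("this sketch assumes the path has a file extension")
    return f"{stem}.v{version}.{ext}"


def rewrite(advertised_path):
    """Strip the version so the request maps back to the real file on disk."""
    return VERSION_SEGMENT.sub("", advertised_path, count=1)


if __name__ == "__main__":
    url = advertise("/img/logo.png", 42)
    print(url, "->", rewrite(url))
```

Bumping the version changes the URL, so every cache treats it as a brand new object and the old copy just ages out.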

BCP:

  • replication
  • redundancy

Recovery times: How long to get everything back if we need to recover from failure? Replication queuing.

Phew! He talks fast and covered a lot so these notes are kind of all over the place, incomplete, as well as redundant with the slide-set he has online. Good talk, though.

posted at 14:20 PDT (-0700)

Alternately Bleak and Hilarious

Second block of the day, I went to Building Web 2.0: Next-generation Web Platforms which I thought was going to be about current data centers but was more about data centers of THE FUTURE. So when you hear about the future from Microsoft, Amazon, Crescendo Networks and MySQL AB, well, it's a little like THX 1138; alternately bleak and hilarious.

As near as I can tell, Microsoft's existing best practices of restart, reboot, relicense are gaining a fourth step: re-image. Crescendo wants network devices to know more about what the application is doing and vice versa. Amazon is big on virtualization and on-demand virtual server start / stop. And MySQL, well, they love the LAMP stack. Mmm, that's good open source.

Also, Amazon foresees data-center consolidation, Microsoft thinks client side caching will solve everything and network engineers are terrified by COMET because it will be broken by and thus break proxying. Did the MySQL guy mention open source, yet? Because it's good. Especially a LAMP stacked application.

I didn't blog this as it happened because it was a panel discussion and I don't take dictation well.

Now I've got some down time since nothing in this time slot tickled my fancy.

posted at 11:27 PDT (-0700)     (comments disabled)   permanent link   Technorati tagged as:
Lawrence!

Adrian started creating Django when he worked at a newspaper in Lawrence, Kansas. Because they were under insane deadline pressures, they needed something speedy to publish. At this point the talk rolls back because there were no slides on the screen. Now there are.

So they created lawrence.com under a deadline of a couple weeks. Started out site with PHP, too hairy. Went to Python after reading Dive Into Python and then went through creating a framework by making an app, making second app, abstracting shared code. About two years ago, after two years of working on it, they decided to release it open source. Named for Django Reinhardt, jazz guitarist. Damned hippies.

Django works by mapping regular expressions to methods: it parses the request URL and calls the first match in the regular expression list. A single place, the URLconf, lets you see all the things to handle and how they're handled. Keeps URLs pretty and decoupled from code, so you can arbitrarily change them. Arguments are passed to the method from the capture parentheses of your regular expression. Standard Python notation for regular expressions.
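Here is a minimal sketch of that dispatch idea in plain Python, using only the standard `re` module. It is not Django's actual URLconf API; the view functions, the `URLPATTERNS` list and `resolve` are hypothetical stand-ins I made up to show first-match-wins and captured groups becoming arguments.

```python
import re


# Hypothetical view functions, for illustration only.
def article_archive(request, year):
    return f"archive for {year}"


def article_detail(request, year, slug):
    return f"article {slug} from {year}"


# One place that lists every URL pattern and the function that handles it.
URLPATTERNS = [
    (re.compile(r"^articles/(\d{4})/$"), article_archive),
    (re.compile(r"^articles/(\d{4})/([\w-]+)/$"), article_detail),
]


def resolve(path, request=None):
    """Call the first view whose pattern matches, passing captured groups."""
    for pattern, view in URLPATTERNS:
        match = pattern.match(path)
        if match:
            return view(request, *match.groups())
    raise LookupError(f"no pattern matches {path!r}")
```

Changing a public URL then means editing one regular expression in the list and nothing in the view code.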

Models use ORM abstraction so you can develop against SQLite and deploy on PostgreSQL without changes, for example. Deliberately doesn't do runtime introspection. Explicit code definitions. Gains performance and keeps it database engine agnostic. No field name assumptions, there's no black magic. Magic is rare in Django, on purpose. If you know Python, you can use Django right away.

Once you write the model, Django will generate CREATE TABLE statements, so there is introspection, but only for setup, not at application runtime.
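A toy sketch of that setup step, assuming a made-up dictionary format for the model rather than Django's real model classes: the fields are declared explicitly in code, and the CREATE TABLE statement is generated from the declaration once, up front. I run it against an in-memory SQLite database just to show the generated SQL is valid.

```python
import sqlite3

# Hypothetical explicit model description; this is not Django's model syntax.
ARTICLE = {
    "table": "article",
    "fields": [
        ("id", "INTEGER PRIMARY KEY"),
        ("headline", "VARCHAR(200) NOT NULL"),
        ("pub_date", "DATE NOT NULL"),
    ],
}


def create_table_sql(model):
    """Generate a CREATE TABLE statement from the explicit field definitions."""
    columns = ",\n".join(
        f"    {name} {sqltype}" for name, sqltype in model["fields"]
    )
    return f"CREATE TABLE {model['table']} (\n{columns}\n);"


# Setup happens once; nothing inspects the database while serving requests.
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql(ARTICLE))
```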

In order to cooperate with designers, Django has a template language which lets you return your results through a template which is boilerplate HTML with substitutions. Templates are inherited, sort of backward server side includes. Child templates indicate what they append / amend from the parent templates. No depth limit. Template filters act like Unix pipes and modify the output as it hits the template.
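The Unix pipe comparison for filters can be sketched in a few lines of plain Python. This is not Django's template engine; the `FILTERS` table and `render_variable` are hypothetical names of mine, and only the pipe character between a variable and its filters mirrors Django's template syntax.

```python
# Hypothetical filter table, for illustration: each filter takes and returns a string.
FILTERS = {
    "lower": str.lower,
    "upper": str.upper,
    "title": str.title,
    "strip": str.strip,
}


def render_variable(expression, context):
    """Evaluate 'name|filter|filter' by piping the value through each filter in turn."""
    name, *filter_names = [part.strip() for part in expression.split("|")]
    value = str(context[name])
    for filter_name in filter_names:
        value = FILTERS[filter_name](value)
    return value
```

For example, `render_variable("headline|strip|upper", {"headline": "  django debuts  "})` strips the whitespace first and then upper-cases the result, left to right like a shell pipeline.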

Intentionally don't allow Python in templates, to preclude site-crashing typos.

There are generic views for common idioms so you don't have to repeat yourself to handle common use cases. Things like iterating to display returns from selects. Uses the same pattern match idea to delegate to provided methods for things Everybody Does. There's also built in automatic administration page generation by hooking a URL pattern to the built in admin package. It has the smarts to know what to prompt for in the data inputs based on the data types you've told it your object model uses. You can use custom filters; if you put them in the model, it happens throughout the application, if you just want it in the admin interface, you can hang it off of that. Admin interface is completely dynamic. Edit the model code and admin interface updates automatically.

Django used to be code generating but that was evil so they did away with that. It's all now entirely dynamic. There is a branch of code under development now to let you give more granular permissions to users, it's table-wide at present.

(Tangentially, Adrian is using KDE on his laptop.)

If you screw up your Python, it gives you very pretty, informative debug output with a full stack trace when you hit the site. If you're running it locally, you can play with all the bits interactively, akin to the Python runtime interpreter. Running in production, the error will instead generate a pretty, developer-designated error page.

Then Adrian debuts a brand new Django feature, Databrowse. Abstracts database creation. Adrian is going to commit this code right after this talk. You visit a URL hooked to the Databrowse piece and it lets you view all object models, auto-generates relationships and conveys them with links. Lets you navigate the database via web GUI. Creates clever ways to view data; generates a calendar, for example, on date fields. Functionally a little like phpMyAdmin: lets you browse data. It's not the public view, but it can suggest interesting ways to make information available via the website.

Databrowse has plugin potential, so while it lacks ranges, aggregation, fuzzy match and graphing, those are probably coming from other people who want them. Just hang them on Databrowse.

Django has no support for blobs but does support file upload, stores file on the filesystem, puts path to it in the database.

Free online book coming from Apress.

posted at 09:53 PDT (-0700)     (comments disabled)   permanent link   Technorati tagged as:
Coffee Achieving

I'm at web2expo and so far my impression is that the coffee is good and the pastries okay. That's right. Complimentary breakfast if you make it here at an unseemly hour. The expo really started yesterday but I only came by long enough to grab my badge and materials pack.

I charted out what I thought I'd probably be going to and then when I arrived this morning, they'd shoe-horned in a new session with greater interest. In theory I'm about to find out All You Need to Know About Django.

Perhaps more soon.

posted at 09:02 PDT (-0700)     (comments disabled)   permanent link   Technorati tagged as:

Tue, 15 Aug 2006

Temple Prostitutes

I went to Linuxworld. Or, at least the Exhibit Hall, which is as much of it as one can see without paying money for the privilege. I saw a lot of vendors, some of them in suits. I saw a lot of companies I recognized the products of and one which I had thought had died.

Ingres. That's right. They're around again. Or still. Depending upon how you measure it. Once upon a time I was an Ingres DBA. The last time I saw it, CA had rushed out some half-assed almost-worked-on-Linux version of the Ingres database engine but I wasn't able to make a case for using it to my boss of the time when Oracle had a less broken engine available to run on Linux.

I also saw a bit of holy war humor. Unfortunately the resolution on my camera-phone wasn't high enough for anyone to read the signs on the booths so you'll just have to take my word for who was there.

I came away from the show with some weak schwag [mostly stickers] and only one disturbing moment. I chanced to be near the Debian booth when some visitor asked what the relationship is between Debian and Ubuntu.

The representative of Debian said that the chief differences are

  • Ubuntu focuses more on the desktop presentation
  • Ubuntu configures different default options for the user
  • ... which restricts users unnecessarily
  • Ubuntu doesn't do as much to ensure security of the software

I've been a Debian system administrator, personally and professionally, for years. I've been an Ubuntu system administrator for a year, in parallel. I haven't given up my Debian systems. But I don't put Debian on any new systems I install.

Because while the first point might be true, it's done using the task system for bundling packages, inherited from Debian. While the second point is true, it certainly doesn't lead to the third point. The options are still there, still configurable. If a person uses Ubuntu and doesn't like the options they started with, there are a number of sources of information they can use to find out how to change their system.

As for the fourth stated difference, I just don't see how that can be true. The system of apt repositories for security updates is virtually identical to the one Debian has in place. The source code for changes is all available so it's not as if the Ubuntu developers have to guess what changes were made to a Debian package to secure it. It's not as if there isn't some overlap in the development communities and tools and mailing lists and concerns between the two projects.

So how are they different? Here's what I see as the differences

  • Ubuntu releases every six months
  • Ubuntu airs less of their dirty laundry in public
  • ... but that may be entirely subjective as I used to subscribe to a lot of Debian mailing lists and I only subscribe to Ubuntu announcements and security announcements, currently
  • Ubuntu is more active about supporting commercial applications for end-users

That's it. I can do anything with Debian I can with Ubuntu with almost equal ease. I don't feel notably less secure with either distribution. I could perhaps make a case if I were a more buzzword compliant developer that having new libraries and tools available every six months was somehow better than the less regularly scheduled Debian updates but with my system administrator decoder ring on, I could go either way on it. The things I like in Debian, I like in Ubuntu.

The things I didn't like in Debian have less to do with the software and more with the ceaseless flame-wars. I'm as much a moralist as anyone, probably more. But I still got tuckered out just trying to read past them to the actual technical information.

posted at 23:12 PDT (-0700)     (comments disabled)   permanent link  