#statknowledge, Day One
There's no wifi, and the 3G signal is weak, but I will be attempting to live-blog the first day of the Seminar on Innovative Approaches to Turning Statistics into Knowledge from my iPhone. Check back for updates.
4:35 p.m.: That's it for today. I hope to write a summary post in the next few days.
4:23 p.m.: OnTheMap is available from the Census Bureau's homepage.
4:20 p.m.: OnTheMap is both a data set (jobs and spatial data) and a tool for visualizing queries of the data.
4:18 p.m.: He combines state data with Census data to get a full picture.
4:16 p.m.: Final speaker is Matthew Graham, a geographer at the Census Bureau. He's going to talk about using OnTheMap to show economic data.
4:12 p.m.: The IMF built Data Mapper to draw more attention to its data.
3:49 p.m.: Spruijt proposes a new map that has no geographic basis (much more like a chart than a map).
3:46 p.m.: Desmond Spruijt of Mapping Worlds talks about showing life, not countries, in creating maps.
3:41 p.m.: They used Processing to visualize the telecom data.
3:39 p.m.: They looked at telecom data and saw how time zones affect how cities interact with each other.
3:36 p.m.: They looked at a visualization of when people used cell phones during a soccer game and could tell the points at which goals were scored.
3:33 p.m.: They also tracked cell phone SIM cards to see which sites in Rome people were visiting.
3:32 p.m.: MIT is putting tracking chips in trash.
3:31 p.m.: The rest of today's presentations are about mapping tools. The first is about real-time visualization of urban data.
3:01 p.m.: OECD is considering creating a wiki that would allow people to upload their own data.
2:51 p.m.: Audience question: Will any of this data be useful in 20 years? Wieland says data might be put into a book.
2:39 p.m.: In one example, traditional publishing took 3 months; wiki took a few days.
2:35 p.m.: Solution: Use a wiki for content creation, with a procedure for content validation.
2:33 p.m.: Problem: Publications contain lots of good content, but they do not easily reach their audience, and the production process is cumbersome.
2:31 p.m.: Ulrich Wieland of Eurostat is speaking about new ways of disseminating statistical content.
2:26 p.m.: Ogris wants to build a REST and AJAX API for outside developers.
2:23 p.m.: Ogris is explaining how the tool was built. Back end: data provision only; synthetic data engine; stateless servlet layer. A RESTful data API goes between the front and back ends. Client side is built with Google Web Toolkit. (No Flash!)
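(Not from the talk: a minimal Python sketch of what "stateless" means for a data API like the one described above. The `/data/<series>` path shape, the `year` parameter, the `handle_get` name and the numbers are all my own assumptions for illustration, not the Space-Time Research design.)

```python
import json
from typing import Tuple
from urllib.parse import urlparse, parse_qs

# Hypothetical, made-up data standing in for the back end's data provision.
DATA = {
    "population": {"2007": 100, "2008": 110},
    "employment": {"2007": 60, "2008": 58},
}

def handle_get(url: str) -> Tuple[int, str]:
    """Answer one GET request using only the URL; no session state is kept."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # Assumed path shape: /data/<series>
    if len(parts) != 2 or parts[0] != "data" or parts[1] not in DATA:
        return 404, json.dumps({"error": "unknown resource"})
    series = DATA[parts[1]]
    # Optional filter, e.g. ?year=2008 (repeatable)
    years = parse_qs(parsed.query).get("year")
    if years:
        series = {y: series[y] for y in years if y in series}
    return 200, json.dumps({"series": parts[1], "values": series})

if __name__ == "__main__":
    print(handle_get("/data/population?year=2008"))
```

Because every request carries everything the server needs, any client (GWT, AJAX or otherwise) can call it in any order.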
2:16 p.m.: This visualization came from the need to put data from a PDF online in a friendlier format.
2:14 p.m.: Make it easy to let users move things around and compare.
2:11 p.m.: Australian government gathers lots of data on prisoners. data.alc.gov.au/duma/doma.html
2:09 p.m.: If the end user doesn't understand data, it's not necessarily their fault. It needs to be presented so they can understand it.
2:07 p.m.: Interactivity adds a new dimension to data.
2:06 p.m.: Data dissemination is the most difficult part. What type of chart to use, what colors, etc. Is it about conveying information or looking nice? It's not either/or.
2:04 p.m.: Says interactive data viz usually means the Internet.
2:02 p.m.: Julia Ogris of Space-Time Research is next, speaking on web-based dissemination for storytellers.
2 p.m.: Back from lunch break.
12:39 p.m.: Ros says it's not necessarily their place to determine whether data uploaded to Many Eyes is completely accurate.
12:36 p.m.: Instructions to watch the #statknowledge conference: http://bit.ly/il92D. Presentations listed here [PDF]: http://bit.ly/RR4Te #statknowledge (via @myersnews)
12:28 p.m.: Cox: NYT graphics department has three staffers who spend all or most of their time on Flash graphics. Other departments have people doing Flash as well.
12:19 p.m.: All the presentations and URLs will be posted after the seminar.
12:13 p.m.: Check out www.durham.ac.uk/smart.centre
12:05 p.m.: Next: Jim Ridgeway of Durham University, on empowering individuals for social progress.
12:03 p.m.: Ros: Visualization doesn't have to be used only for statistics.
Noon: Text visualizations can give semantic clues about what to look into further.
11:56 a.m.: Text is Data: speeches, debates, interviews can all be used to create visualizations.
11:55 a.m.: All data and visualizations on the site are public. Anyone can remix anything anyone else created.
11:54 a.m.: Another user used wedding RSVP results to create a tree map of all the responses.
11:53 a.m.: And they did it within hours of the bill passing.
11:52 a.m.: After stimulus bill passed, users used Many Eyes to make a tree map of where the money would go.
11:50 a.m.: Allows users to upload, visualize, discuss and embed data.
11:49 a.m.: http://many-eyes.com
11:48 a.m.: Next is Irene Ros of the Many Eyes project at IBM.
11:47 a.m.: Ferster is considering using R to analyze the data.
11:45 a.m.: If only newspapers had students to turn narratives into structured data.
11:44 a.m.: Students entered data on a historical neighborhood, allowing queries to be performed. Shows the value of structured data.
11:43 a.m.: See it at historybrowser.org
11:41 a.m.: System shows where he traveled, what he spent money on, whom he met with.
11:39 a.m.: Students put data on Jefferson's travels into system, based on documents.
11:36 a.m.: The History Browser was created from the need to show generic data; it's not tied to any one subject. It handles geographic data but isn't a GIS. Very new approach for historians, who don't usually use quantitative data.
11:35 a.m.: Bill Ferster from U Va talking about tools for working with historical data.
11:33 a.m.: Data for most NYT graphics over the past decade is from the Census; BLS is next.
11:32 a.m.: I'll try to update later with links to the cool NYT graphics Amanda Cox is showing off.
11:31 a.m.: Cox: Distributions are often more interesting than averages.
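(A quick illustration of Cox's point, not something shown in the talk: a minimal Python sketch with made-up incomes for two hypothetical towns. The averages match, but the quartiles tell very different stories. The `summarize` helper is my own name.)

```python
from statistics import mean, quantiles

# Made-up incomes (in thousands) for two hypothetical towns.
town_a = [48, 49, 50, 51, 52]
town_b = [10, 20, 50, 80, 90]

def summarize(incomes):
    """Return the mean plus the three quartile cut points of a sample."""
    q1, median, q3 = quantiles(incomes, n=4)
    return {"mean": mean(incomes), "q1": q1, "median": median, "q3": q3}

if __name__ == "__main__":
    for name, incomes in (("A", town_a), ("B", town_b)):
        print(name, summarize(incomes))
```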
11:28 a.m.: Amanda Cox: "Data isn't like your kid; you don't have to pretend to love it all equally."
11:23 a.m.: Amanda Cox and team are making about 200 charts a month at NYT.
10:46 a.m.: The media is very important to the presenters because it helps distribute the data.
10:43 a.m.: If someone loads their own data into OECD Explorer, the logo disappears because OECD isn't responsible for the data.
10:35 a.m.: Here's that school choice wizard: http://bit.ly/QdoWj
10:32 a.m.: How do creators of these systems determine whether they change how people make decisions? Spiegelhalter hasn't looked at this yet.
10:16 a.m.: Charles Naumer: A TV station in Colorado (9 News) is hosting a data project that helps parents make decisions about school choice. Search by radius and other factors, do interactive analysis, look at race and ethnicity. (I'll add a link later.)
10:01 a.m.: Trevor Fletcher talking about OECDeXplorer: http://stats.oecd.org/oecdfactbook/
9:54 a.m.: Analysis is weakest point in the chain. Two approaches: let the users do analysis themselves or have experts do it.
9:52 a.m.: Check out books by Tufte and William Cleveland for more on quality of presentation.
9:45 a.m.: Three quality aspects: quality of data, quality of analysis, quality of presentation. Yearly data doesn't describe changes within a year (quarterly is usually better).
9:42 a.m.: Next is Anders Walgren speaking about time series data.
9:40 a.m.: They're aiming for adaptable, embeddable animations. There's no correct format for risk presentations, but there are known biases to avoid.
9:39 a.m.: Graphing the data brings out different stories -- epidemics, wars, etc.
9:35 a.m.: Life Table database is at lifetable.de (thanks, Derek)
9:33 a.m.: The graph they built showing lifespan data can be customized with an individual user's data. It also allows users to explore what their life expectancy would have been 100 years ago, based on age and other factors.
9:32 a.m.: Data can be personalized with users' photos on icons.
9:30 a.m.: Icons often work better than graphs in showing risk.
9:28 a.m.: People tend to use population and frequency when describing risk, not percentages.
9:26 a.m.: Their program (Winton program) has animated national lotteries, soccer matches, travel data.
9:24 a.m.: First presenters are David Spiegelhalter and Mike Pearson of Cambridge Univ., on visualizing risk.
9:22 a.m.: Eric Swanson, World Bank: 20% of visitors to the World Bank site look for stats and data.
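(My own illustration, not from the presentation: a minimal Python sketch of restating a probability as a frequency in a population, the phrasing people tend to use. The function name `as_frequency` and the default population of 1,000 are assumptions.)

```python
def as_frequency(probability: float, population: int = 1000) -> str:
    """Restate a probability as 'about N in M people'."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # Round to whole people, since nobody pictures a fraction of a person.
    count = round(probability * population)
    return f"about {count} in {population:,} people"

if __name__ == "__main__":
    print(as_frequency(0.02))     # a 2% risk
    print(as_frequency(0.5, 10))  # a 50% risk, smaller reference group
```

The same count could drive an icon array, with `count` icons highlighted out of `population`.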
9:20 a.m.: "Classical media are very important. ... If we can engage media, we can make a big jump."
9:18 a.m.: OECD official: data is only valuable if people use it. Web 2.0 has the potential to get new generations interested.
9:15 a.m.: Today's sessions are on "storytelling" and mapping tools.
9:14 a.m.: Trevor Fletcher of OECD will be moderating. This is the third Statknowledge seminar. Over 400 people registered.
9:09 a.m.: International visitors have to be escorted everywhere.
9:06 a.m.: Lots of security restrictions here, especially for non-US citizens.
9:01 a.m.: Nancy Gordon from the Census is opening the meeting. Going over administrative details.
8:58 a.m.: Each presenter gets only 15 minutes. There are 15 speakers today, plus Q&As.
8:43 a.m.: At the Census Bureau. Waiting for the opening of the seminar.