GlobeLab @ The Boston Globe
in partnership with
NULab for Texts, Maps, and Networks
and The Boston Area Research Initiative,
with the help of
a distinguished advisory board,
present a new kind of data collaboration.

Data Swap 2013 is an exclusive opportunity to work on complex, real-world problems, with rich and large-scale datasets and individuals with diverse skills and backgrounds from research, government, and civic organizations throughout Boston.

This isn't your mother's hackathon.

There's no conference room full of over-caffeinated and under-deodorized engineers, no 72-hour time limit, and no room for shoddy prototypes. This is an opportunity for a select number of gifted researchers to join interdisciplinary teams to work on the pressing and meaningful problems facing Boston communities.

Unlike hackathons, which are meant to generate quick ideas and prototypes in a short period of time, DataSwap is about forging and supporting long-term collaborations between researchers, communities, and data guardians. Groups sharing common interests and complementary skills will collaborate around specific problems. Each problem will be proposed by the owners of one of the presented datasets. On day one at The Boston Globe, you'll learn more about that dataset and others to help you in your research. You'll be given a community facilitator to help you craft useful research that is relevant outside the bounds of academia. Then, it's up to you! Over the next several months, you and your team are challenged to craft a presentation around the problem you were given. At the end of that period, we'll reconvene to share our findings with one another and choose a winner.

We are looking for passionate analytical minds interested in investigating real problems with the data we are providing. You are enthralled by the idea of making an impact with real research and engaging data science. You don't have to be a computer scientist, statistician, or engineer, but you need to be comfortable working with data. Most likely, you are a graduate-level student, but advanced undergraduates and enthusiastic professionals will be considered as well. To keep working groups evenly distributed in interest and skill level, we are asking everyone to fill out a quick application when signing up. Selected participants will be notified by October 1st.

Get Inspired!

On October 17, Northeastern University will host a "skill-a-thon" where you will have the opportunity to get an overview of state-of-the-art data analysis methods from prominent researchers and scholars. The goal of these 20- to 30-minute sessions is to expose you to a wide range of approaches your team can use to analyze the various types of data with which you'll be working. Topics include:

  • Best practices for extracting and cleaning data
  • Sentiment analysis and topic modeling using natural language processing
  • Classification and prediction using machine learning
  • Relational data using network analysis methods
  • Analyzing geo-spatial data using GIS methods
  • Developing interactive data visualizations

Space is limited, but the event is open to anyone interested in hearing about novel ways to manipulate data. Register for the event here!

Introducing...The Data

We have assembled a team of Data Guardians who will explain and share their exciting datasets at the event. Each team will focus on a specific dataset, but will have access to all the datasets for supporting material. Aside from our official Data Guardians, there will be an opportunity for anyone to offer a dataset to share with the group. Anything is "in bounds" for your research, but the group that best showcases its dataset will have an advantage.

The Boston Globe

Full articles and metadata from The Boston Globe online, January 2011-present.

Enigma

Enigma empowers the discovery of hidden facts and connections across the universe of big public data. Access everything from import bills of lading, to aircraft ownership, lobbying activity, spectrum licenses, financial filings, liens, government spending contracts and much, much more.

Media Cloud

Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online media. Using Media Cloud, researchers, journalism critics and interested citizens can examine what media sources cover which stories, what language different media outlets use in conjunction with different stories, and how stories spread from one media outlet to another.

Digital Public Library of America

The DPLA “offers a single point of access to millions of items—photographs, manuscripts, books, sounds, moving images, and more—from libraries, archives, and museums around the United States.” More pertinently for this meeting, the DPLA “contains metadata records” for 4.5 million of these items.

2012 Presidential Campaign Contributions

Using public data from the FEC since 2001, we track the contributions from all donors in the Boston area and link these contributors to street addresses, employers, and occupations.

City of Boston Tax Assessor

The City of Boston's Assessing Department is responsible for the administration of property tax records, tracking for all parcels (i.e., the smallest ownable unit) a variety of details, including address, current owner, square footage, land use, assessed value, and whether it is owner-occupied, as well as other related details. The database is updated yearly. The City of Boston and the Boston Area Research Initiative have teamed up to construct a version of this database that runs from 2000 to present, providing a longitudinal snapshot of the physical and economic landscape of Boston and its neighborhoods. The database has been mapped in such a way that it is compatible with both City administrative databases (e.g., 911 calls) and census geographies (e.g., block groups, tracts).

And Many Thanks to our Planning Committee!

  • Northeastern University
  • GlobeLab
  • Northeastern University
  • Northeastern University
  • Boston Area Research Initiative
  • MIT

You've got questions? We've got answers!