OpenNews Code and projects

OpenNews supports developers inside and outside of newsrooms in creating code that helps journalism thrive on the open web. We believe that the code being written in news transforms not only the industry, but the web itself.

Projects by the Knight-Mozilla Fellows

Our Fellows spend ten months hacking in some of the best newsrooms in the world, following their passions and creating compelling open-source projects. Here are just a few of the many things they've developed:

  • Tabula
    Tabula is a tool to extract tabular data from PDFs. The project was created by 2013 Knight-Mozilla Fellow Manuel Aristarán, with the help of 2013 Knight-Mozilla Fellow Mike Tigas and his colleague at ProPublica Jeremy B. Merrill.
    On the web | On GitHub
  • Hyperaudio
    Mark Boas' Hyperaudio is a tool for assembling audio and video programs from their underlying transcripts. Its ongoing aims are to create something usable that works with both audio and video, to allow transitions and overlays to be specified via in-pad natural-language instructions, to build up a library of material, and to integrate with third parties such as Amara (formerly Universal Subtitles). Hyperaudio is currently a project with Mozilla's WebFWD. In addition to Mark Boas, 2012 Knight-Mozilla Fellow Daniel Schultz and Matteo Spinelli are on the development team.
    On the web | On GitHub
  • Dataset
    Dataset is a tool from 2013 Knight-Mozilla Fellow Friedrich Lindenberg and Gregor Aisch to make it easier to manage databases in Python. Dataset makes it easier to import and export databases: "databases for lazy people."
    On the web | On GitHub
  • Learning Lunches
    2013 Knight-Mozilla Fellow Noah Veltman began organizing informal discussions with colleagues about technical topics. He's shared the materials from these discussions on GitHub. Topics have included databases, maps, and web scraping.
    On GitHub

Code from Code Sprints

We developed Code Sprints to help create some of the small, simple tools that can have a big impact in newsrooms.

Our Code Sprint projects include:

  • Sheetsee.js: Easy data visualizations using a simple spreadsheet backend.
  • Dedupe: A library for deduplication, entity resolution, record linkage, and author disambiguation of big datasets.
  • A parser and API for the daily cash balance updates from the US Treasury.
  • California Election Parser: A parser for election data used by over 200 California news sites in 2012.

We’d love to develop more Code Sprints. Learn more about the program and apply.

Code from Hack Days

We’ve sponsored more than 40 hack days around the world where journalists and developers have worked with data from censuses, elections, campaign finance, and more.

Some projects that got their start at hack days include:

  • CivOmega: this project got its start at the 2013 Knight-MIT-Mozilla hack day. It allows people to ask questions of legislative data and was recently awarded a Sunlight Foundation OpenGov Grant.
  • HackDash: originally developed at the 2012 Hacks/Hackers Buenos Aires Media Party, this tool for organizing hackathon projects went on to power the hack day at the 2013 event a year later.
  • NewsDiffs: began at the 2012 Knight-MIT-Mozilla hack day as a way to track changes to articles and headlines. It now tracks and archives changes to articles on five news sites.
  • A tool to help track the US government's virtual checkbook began as a project called FMS parser at the Bicoastal Datafest. It was developed by the CSV Soundsystem hacker team, which includes 2013 Knight-Mozilla Fellow Brian Abelson.

Code Convenings

Starting in 2014, OpenNews plans to gather groups of journalism developers and open-source contributors to collaborate on shared codebases and libraries so that we can stop continually reinventing the wheel on needed infrastructure, like election parsers, opsec, visualizations, and more. We’ll have more information soon and look forward to this new way to create code that will help news organizations.