23 February 2021 
#410: quantum of sollazzo – The data newsletter by @puntofisso



Made with data ❤️
 
I hope you will excuse the very long issue. The past week has brought quite a few excellent examples of data use in journalism, and I struggled to make any cuts.
·
Let me also make a little announcement: Open Data Camp is bringing you... Open Data Cafe! This awful pandemic might have destroyed our hopes of running the 8th Open Data Camp this year, but the community is still alive and kicking. So we've decided to run a reduced and more relaxed virtual event on April 10th, from 12:00 to 13:30. Come, bring your own coffee, and join us in break-out rooms for conversations about all things data. If you haven't been to previous Open Data Camps, head to our blog to read about earlier events of the UK unconference about Open Data.
 

·
I engaged with this brilliant Twitter thread on database management as the lens through which we should see how effective government is at running operations. The Centre for Policy Studies' Robert Colville suggested that "every single UK policy success has been built on a good database", with Commons' data scientist Oli Hawkins suggesting that this can only happen when the data layer is fixed. Quite a few interesting replies have followed.

My take on this is that "data first" and "application first" are, ultimately, both right and wrong, and have often been an expression of the tribal nature of many discussions in this space. Fundamentally, though, it's only by recognising that the problem of "good data" is iterative, and often needs to go hand in hand with that of data use, that we can truly make progress and find ourselves in a situation like that described by Robert Colville. As I say in my tweet, it's a never-ending cycle of showing the gap, fixing the plumbing, showing the benefit, and repeating.

A related point: for many organisations the best way to realise the potential benefits of data science is to show actionability of existing vs non-existing data (sometimes using synthetic data), which can be the catalyst to fix the data infrastructure itself.
·
I've sent Ofcom a Freedom of Information request to understand whether they can release (or why they can't release) the address-level broadband availability data that powers their broadband checker. Long story short, I'm helping my road campaign for an upgrade (we're the only sub-10Mbps area within several square miles), and I realised that this data is not available at address level (it is only available at postcode level). I will keep you posted on the outcome.
·
To close, here are a few work opportunities that might be of interest:
  • I'm hiring a Data/Technology Lead for my team at NHSX. We have 2 vacancies in band 8d (£63,751 - £73,664), based in London or Leeds (but still, fundamentally, remote). Come and join the AI Skunkworks!
  • The Centre for Humanitarian Data is launching the fourth cohort of their Data Fellows Programme, funding 2 months' worth of work in data journalism, predictive analytics, and strategic communications.
  • Homes England is looking for a Data Architect, and nothing excites me more than the puns I'm going to be making about it.
  
Till next week,
––Giuseppe @puntofisso
 

 

 
COVID-19

Excess Deaths – Europe’s COVID-19 divide
"...the loss of life has been uneven between Eastern and Western Europe..."
Early lockdowns made Eastern Europe suffer much less in the early stages. I remember the Czech Republic being praised for its early action on masks, too. Sadly, I also remember it for its "good-bye COVID" parties. This article looks at what happened and how the death peaks differ between East and West. By Reuters Graphics.

How a sluggish vaccination program could delay a return to normal and invite vaccine-resistant variants to emerge
"Experts warn that the current pace of vaccinations won’t just prolong coronavirus restrictions. It could also make it more likely that new variants will infect the previously immune."

Oh come on Corona, give us a break!
This article by the Washington Post covers the US, but that doesn't mean it's not also concerning elsewhere. TL;DR: the speed of vaccination might be more important than we think.
 


Politics

The Ten Most Misleading Charts During Donald Trump’s Presidency
"Over the course of four years as President, Donald Trump made more than 30,000 false or misleading claims, according to the Washington Post Fact Checker. It should be no surprise, then, that some of these took the form of data visualizations. Here are the top ten most misleading charts, graphs, maps, and tables from the Trump Administration over the past four years."
From Sharpiegate to misquoted approval numbers, here's a good article by PolicyViz's Jon Schwabish.
(via Massimo Conte)
 

Hexes, Tiles, and Districts
"A conversation with Daniel Donner on designing a cartogram of United States congressional districts."
This article on Nightingale describes the work that went into designing the PrimaryCast tool.
 

Data thinking

A thread on disability, race, and patriarchy in data visualization
"Data visualization cares disproportionately far too much about designing for colorblindness relative to other disabilities that are more common (visual impairments included)."
By Frank Elavsky.
(h/t to the Journalism++ newsletter)

Tools

Xploria
Some time ago, I saw a great presentation at Geomob of something called Illustreets. That was a prototype which has now evolved into a full-fledged product called Xploria, a freely available service helping users to explore data about locations in the UK.
Technical director Manuel Timita says: "We made use of many known – nowadays almost famous – open datasets (2011 Census, Price Paid, NaPTAN, Edubase, strategic noise maps, OS green spaces, and more). That is over 30 million data points, which I like to think we managed to put in an easy to use format."
Best seen on desktop or tablet, but it works on mobile. The data sources are here.
 

On This Day in Twistory
A tool by Terence Eden allows you to search any user's Twitter history for today's date. The source code is openly available.

Dither Me This | Image Dithering Tool
"Use this tool to reduce the file size of an image… but in a stylish old-school way."
Dithering is a technique used to reduce colours in an image by using dots to emulate the shades that are not in the palette. It was very common in 8-bit computer graphics and the early Internet was full of dithered images. It is still in use, and there are several techniques, each with a different set of properties. Enjoy.
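For the curious, here is a minimal sketch of one classic dithering technique, Floyd–Steinberg error diffusion, in plain Python. I don't know which algorithms the tool itself uses, so take this as an illustration under my own assumptions: the function name `floyd_steinberg` is made up, the image is a list of rows of 0–255 grey values, and the palette is just black and white. Each pixel is snapped to the nearest palette colour, and the rounding error is pushed onto the neighbours that haven't been visited yet.

```python
def floyd_steinberg(gray):
    """Dither a grayscale image (rows of 0-255 values) to pure black and white.

    Hypothetical helper for illustration; not taken from the Dither Me This tool.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    # Work on a float copy so the accumulated error is not lost to rounding.
    work = [[float(v) for v in row] for row in gray]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255 if old >= 128 else 0  # nearest colour in a 2-colour palette
            out[y][x] = new
            err = old - new
            # Spread the quantisation error onto not-yet-visited neighbours
            # using the standard 7/16, 3/16, 5/16, 1/16 weights.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


if __name__ == "__main__":
    # A flat mid-grey square becomes a pattern of black and white dots.
    flat_grey = [[127] * 16 for _ in range(16)]
    for row in floyd_steinberg(flat_grey):
        print("".join("#" if v else "." for v in row))
```

Other techniques, such as ordered dithering, place the dots differently, which is why each one has its own look.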
 

{osmextract}
Brilliant piece of work by Robin Lovelace and Andrea Gilardi, who've released a new tool for R, called {osmextract}, that does what it says on the tin: "to make it easier for people to access OpenStreetMap (OSM) data for reproducible research".
 

Ploomber
"Ploomber is the simplest way to build reliable data pipelines for Data Science and Machine Learning. Provide your source code in a standard form and Ploomber will automatically construct the pipeline for you. Tasks can be anything from Python functions, Jupyter notebooks, Python/R/shell scripts, and SQL scripts."

US

PowerOutage.US
"PowerOutage.US collects, records, and aggregates live power outage data from utilities all over the United States, with the goal to create the single most reliable and complete source of power outage information available."
 

Reddit Is America’s Unofficial Unemployment Hotline
"As unemployment claims shot up early in the pandemic, so did posts on r/Unemployment, one of the many topic-based forums on the site known as subreddits. The subreddit once typically had fewer than 10 posts a day, but it quickly ballooned to nearly 1,000 posts a day in April and May. As the crisis wore on, posts and comments spiked in the weeks following changes to benefit programs. In January, nearly 10 months after the first lockdowns, the forum had one of its busiest weeks ever, driven by delays in payments and uncertainty around legislation signed late last year."
I love the New York Times' ability to find data analysis in unexpected places.
 

What 120 Executions Tell Us About Criminal Justice in America
"The Marshall Project tracked every execution in America for more than five years. For condemned people, the path to death grew longer, more winding and erratic."
 


 

Data analysis

Introducing the LIVE MUSIC JUKEBOX
The Pudding presents another excellent data analysis: a comparison of the studio version of a song with its live counterpart. I do miss live music.
 

Americans say U.S. can learn a lot from other countries on handling the coronavirus outbreak, other issues
Pew Research is no stranger to fascinating survey analysis, and this recent piece well deserves a spot. In general, "Americans believe that the U.S. government can learn a lot from other countries around the world about handling the outbreak and improving health care domestically", with some interesting demographic differences.
 

Is science still a man’s world?
"Hi, this is Aya from the support team. This week, I want to show you some arrow plots. They are simple, easy to create, and great for showing change."
Another great Datawrapper tutorial, using an important topic to show how to make a chart.
 

How fires have spread to previously untouched parts of the world
"Fires have always been a part of our natural world. But they’re moving to new ecosystems previously untouched by fire – and this is concerning scientists"
By Ashley Kirk and colleagues at The Guardian.
 

Everything else...

Online Culture Wars
"The map Online Culture Wars is an overlay of hundreds of politicized memes, along with influential political figures and symbols. It is designed as a discussion starter, intended to visualize and contextualize the ongoing online culture wars, and some of the main political references, actors, and influencers."

How we built an application to share data in real time across DIT’s services
Michal Charemza, Tech Lead at the Department for International Trade, tells the story of how they built a middleware that pools data from a variety of real-time systems.
 

OBJECTIVE OR BIASED
"On the questionable use of Artificial Intelligence for job applications."
We'll see more of this in the years to come – much more.
 

 
 
Support this newsletter & spread the word

Become a GitHub Sponsor :) It costs about a coffee per month, and you'll get an Open Data Rottweiler sticker (and other stuff). 

If you're a supporter of this newsletter, thanks a lot for your support. Share this e-mail with a friend, or via social media


    


"In other news" is supported by ProofRed, who offer an excellent proofreading service. If you need high-quality copy editing or proofreading, head to www.proofred.co.uk. Oh, they also make really good explainer videos.
Supported by my GitHub Sponsors 
Steve Parks
Naomi Penfold
Chris Weston
Fay Simcock
Chris Noden
Jeff Wilson
& others


Copyright © 2021 Puntofisso, All rights reserved.


