26 January 2021 
#406: quantum of sollazzo – The data newsletter by @puntofisso



Post-Trump
 
What Makes Soup, Soup? This fun video that Guy Lipman sent me is, well, a piece of comedy. But it points to an interesting discussion in the realm of data, that of definitions. We don't talk enough about this problem, which is one of my pet peeves: how do we define, in data, concepts that are by nature fuzzy and imprecisely defined?

I recently gave a lecture (video, podcast, and writeup are here) on the use of data in public communications and the issue of data definitions was one of my central points: unemployment is defined in six different ways in the US, and the definition of "being employed" used in the UK for statistical purposes is somewhat far from the common-sense experience of being meaningfully in work (it is: "having completed one hour of work in the previous week").

In a world in which data is increasingly part of our lives, with COVID-related stats being used daily by leaders and the media, it is becoming urgent that we make an effort to raise public awareness that the things we call "death", "employment", "average", "income", and so on, are often not exactly the same thing as the data labelled "death", "employment", "average", or "income". How can we do this well? I'm still pondering.

-

This week I've sent you too many links, once again. I promised I'd do better. Sorry. Maybe I should move to bi-weekly?
 
-

On the data jobs front, two juicy roles.

The first is at Carbon Tracker, a London-based NGO that carries out financial analysis on the impact of climate change and the energy transition on capital markets. They are looking for a web developer specialised in data visualisation to create an interactive portal on the economics of Oil and Gas companies. The end result will be similar to their power utilities company profiles.
Deadline: quite stringent, as the publication of the portal is due by end of February.  
Skills required: they are hoping to continue to use plotly/dash for ease of maintenance, but are open to other suggestions.
If you're interested, send them an email at hello@carbontracker.org with an example of your past work.

The second is from the BBC Shared Data Unit. They are looking for journalists who can code. It means you'll work with data journalism legends Alex Homer and Pete Sherlock.

-

Speaking of journos, fake news superstar warrior James Ball is moderating a panel with Prof Sir David Spiegelhalter, Marianna Spring, Kate Wilkinson, Nina Jankowicz and Anjana Ahuja, on the topic "Giving it our best shot: how can journalists tackle Covid-19 misinformation".

-

There was a flow of new subscribers and GitHub sponsors last week, so let me welcome you all :) Your support is much appreciated. My intention is to keep this newsletter free for as long as I can. GitHub and content sponsors (see below) are a good signal that there is good value in this newsletter, without it being (yet? :P) a full-time job. 

And so we have again some sponsored geotastic content – Ed Freyfogle, who's the organiser of location-based service meetup Geomob, co-host of the Geomob podcast, and co-founder of the OpenCage Geocoder, has offered to introduce a series of points around the topic of geodata. His fifth entry, on geocoding annotations, is below.

Till next week,
––Giuseppe @puntofisso
 

 

 
--- Sponsored content by OpenCage ---

Geocoding is just the first step

Most data projects involve tedious cleaning and enriching before they can actually be "used".

At OpenCage, we are firm believers that laziness is one of the virtues of a great developer. We’ve thought a lot about making geocoding with open data dead simple, but also how to make the total journey to using the data easier. Our geocoding API returns "annotations" - extra information about the location that developers might find useful, thus saving work.

An example is United Nations M49 codes, standard codes commonly used for linking datasets and statistical analysis. Looking up the relevant codes for a region is not particularly complex, but it is the kind of small task that needs to be done correctly (and maintained) in a larger data processing project. So, as a simplification for our users, we already return the correct codes as an annotation.

As an example, a request to the OpenCage geocoder for 17.028,-25.214 (in the Cape Verde islands) returns the annotation

"UN_M49": {
  "regions": {
    "AFRICA": "002",
    "CV": "132",
    "SUB-SAHARAN_AFRICA": "202",
    "WESTERN_AFRICA": "011",
    "WORLD": "001"
  },
  "statistical_groupings": [
    "LEDC",
    "SIDS"
  ]
},
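
To give a sense of how this might be used downstream, here is a minimal Python sketch that reads the fragment above. It works only on the annotation as printed here; the helper names (m49_codes, in_grouping) are made up for illustration, and where the annotation sits inside a full API response is not shown in this piece, so the sketch does not assume it.

```python
import json

# The annotation printed above, wrapped in braces so it parses as a JSON object.
annotation_json = """
{
  "UN_M49": {
    "regions": {
      "AFRICA": "002",
      "CV": "132",
      "SUB-SAHARAN_AFRICA": "202",
      "WESTERN_AFRICA": "011",
      "WORLD": "001"
    },
    "statistical_groupings": ["LEDC", "SIDS"]
  }
}
"""


def m49_codes(annotations: dict) -> dict:
    """Return the region-name -> M49 code mapping, or {} if it is absent."""
    return annotations.get("UN_M49", {}).get("regions", {})


def in_grouping(annotations: dict, grouping: str) -> bool:
    """True if the location belongs to the given statistical grouping."""
    groupings = annotations.get("UN_M49", {}).get("statistical_groupings", [])
    return grouping in groupings


annotations = json.loads(annotation_json)
codes = m49_codes(annotations)

# The codes are strings with leading zeros ("002"), so keep them as strings
# when joining against other datasets rather than converting to integers.
print(codes["CV"], codes["WESTERN_AFRICA"])
print(in_grouping(annotations, "SIDS"))
```

The codes are kept as strings because converting "002" to an integer would drop the leading zeros and could break joins against datasets that store the padded form.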

We also return many other types of information, for example: the local timezone, calling code, currency information, other reference systems like geohash, what3words, MGRS, Maidenhead, the time of sunrise and sunset, the qibla angle, and much more.

We hope all of this helps make developers’ lives easier.

If you need to process geodata, give the OpenCage geocoding API a try.

 
Politics

Cities Say They Want to Defund the Police. Their Budgets Say Otherwise.
As you may remember, after the murder of George Floyd there were calls to reduce police budgets in the US (which, interestingly, are much bigger than UK budgets).
"Even as the 50 largest U.S. cities reduced their 2021 police budgets by 5.2% in aggregate—often as part of broader pandemic cost-cutting initiatives—law enforcement spending as a share of general expenditures rose slightly to 13.7% from 13.6%, according to data compiled by Bloomberg CityLab."
Austin, Texas, saw a whopping 33% budget reduction.
 

Is there any room for a personal Trump party in the U.S. Congress?
"A Games Theory application says no."
This is a quirky approach, by ex-POLITICO.EU journalist Francesco Piccinelli Casagrande, looking at the US political divide. The (R) source code is available too.

Environment

A minimal chart about global warming for the hottest year on record
Datawrapper's Simon Jockers shows how to create "climate stripes", a type of chart that became popular a few years back because it intuitively captures the shift in average temperatures produced by climate change.

Where 2020's Record Heat Was Felt the Most
Last summer, London experienced a week of highs that exceeded 35C (I remember the peak of 38C). This was definitely above average and felt unbearable, but for other locations around the world the experience was even more extreme, making 2020 tied with 2016 as the hottest year on record.
 
COVID

Chalabi illustrates Kucharski
Twitter at its best: US Guardian Data Editor Mona Chalabi takes a thread about COVID infections by Prof Adam Kucharski and creates brilliant illustrations.
 

Europe is becoming more pro-vaccine
This is what a YouGov poll suggests.
 

Do your neighbors want to get vaccinated?
"If you live in Gregg county, Texas or Terrebonne parish, Louisiana, you might be out of luck."
America is at the same time amazing for the ability of its academics to run granular nationwide surveys like these, and depressing for what those surveys often tell you...

 

Data thinking

None Of Us Are Free If Some Of Us Are Not: Catherine D’Ignazio on Data Feminism
Author of "Data Feminism" D'Ignazio spoke with Jason Forrest for Nightingale, resulting in this thought-provoking interview.
"What are some of the things that we would have to shift in order to make data visualization feminist?"
"[...] think about maps. Maps come about because of, basically, European nation-states trying to dominate and exploit the world and commit genocide. That’s our history of maps that we inherit. That doesn’t mean that those tools and technologies can not be re-engineered for other kinds of purposes. While we inherit a very flawed history when we use these flawed tools, that doesn’t mean that we can’t take steps towards justice or more emancipatory uses of those same tools."


The Data of Long-lived Institutions
"The story of which institutions have lasted the longest throughout history, and why."
Over 5,500 companies are at least 200 years old, and most of them are based in Japan. Other institutions are really long-lived (the Catholic Church, etc). What does this teach us about setting up organisations intended to last for 10,000 years?


Tools, Resources & Tutorials 

Make Your Own Internet Archive With Archive Box
The Internet Archive is a useful website that makes timed copies of millions of websites. But it has some limitations: because it complies with the directives in a website's robots.txt file, some pages on a website might not be archived. In addition, it doesn't archive embedded rich media. 
This article explains how you can build your own Internet Archive if there are websites you need to store, and I can think of a few journalistic projects that could make good use of it.
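
For anyone curious how a robots.txt directive ends up excluding a page, here is a small sketch using Python's standard urllib.robotparser. The robots.txt content, the "example-archiver" user agent and the example.org URLs are all invented for illustration; this is not how the Internet Archive or ArchiveBox is implemented, just the general rule check a compliant crawler performs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether a crawler honouring robots.txt may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


home_ok = allowed(ROBOTS_TXT, "example-archiver", "https://example.org/index.html")
private_ok = allowed(ROBOTS_TXT, "example-archiver", "https://example.org/private/report.html")

print(home_ok)
print(private_ok)
```

A crawler that respects these rules would skip everything under /private/, which is why a compliant archive can end up with gaps.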
 

Tools for Podcasting
Literally hundreds of pages with tutorials and resources.

Making a digital clock in Google Sheets
This is entirely pointless and over-engineered, but it has a certain appeal to it, because this is the way we used to code things in the 1980s. ;-)


COVID-19 Coronavirus Data Dashboard
A fantastic, fully-customisable dashboard by InformationIsBeautiful.
(via Massimo Conte)
 

NUMBEO's Cost of Living database
Numbeo is a crowd-sourced global database of information including cost of living, housing, crime, healthcare, and more.
With all natural caveats for a crowd-sourced solution, a presentation of their methodology is here.
(via Nicola Del Monaco)
 

Finding and Visualising Interactions
"Analysing interactions using feature importance, Friedman’s H-statistic and ICE Plots."
 

One chart at a time
A new video series on YouTube. Each video is 6-7 minutes long.
"Welcome to the daily One Chart at a Time series that will expand your graphic literacy. With over 50 videos released on a daily basis in early 2021, this series will help you learn about more than just the standard bar, line, and pie chart."
 
 
Maps & data viz

Where's the nearest football team?
One of those brilliant maps by Alasdair Rae, with an accompanying blog post that includes a lot of extra facts.
 

Counting
"Not all languages compose numbers higher than 10 just adding “ten” to a simple number. We can classify languages according to the way numbers higher than 10 are made."
The map of 99 below is one of the best.
(via Guy Lipman)
 

Three-dimensional model of electricity consumption in Manchester
This isn't just an extraordinary example of a "physical" data visualization. It's also an example of the amazing cataloguing and annotation work done by the Science Museum, which powers this website of items from their archive, which include a packed JSON file. Definitely to be explored more.
This came via a conversation on Twitter that also made me discover the – unrelated to the Science Museum – Gallery of Physical Visualizations and Related Artifacts.
(via Jeni Tennison)
 
Figuring out orbital positions from orbital elements
Have you ever wondered how to build accurate visualizations of planet motions like, for example, this famous one, or maybe an ephemeris?
This article explains how to do it.
Slightly related, this Forbes article with details on how to calculate the position of the Sun over the year for any location.
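
To give a flavour of the orbital-elements calculation, here is a rough Python sketch of its core step: solving Kepler's equation M = E - e*sin(E) for the eccentric anomaly E, then turning that into a position within the orbital plane. It is not necessarily the method used in the linked article; the function names and the Newton iteration are my own choices, it only covers elliptical orbits (eccentricity below 1), and it leaves out the rotations (inclination, node, argument of perihelion) needed to place the orbit in 3D.

```python
import math


def solve_kepler(mean_anomaly: float, eccentricity: float, tol: float = 1e-12) -> float:
    """Solve M = E - e*sin(E) for E (radians) with Newton's method."""
    # Starting from M works well for low eccentricity; pi is safer when e is large.
    E = mean_anomaly if eccentricity < 0.8 else math.pi
    for _ in range(50):
        residual = E - eccentricity * math.sin(E) - mean_anomaly
        delta = residual / (1.0 - eccentricity * math.cos(E))
        E -= delta
        if abs(delta) < tol:
            break
    return E


def position_in_orbital_plane(a: float, e: float, mean_anomaly: float) -> tuple:
    """Return (x, y) with the focus at the origin and x pointing to periapsis."""
    E = solve_kepler(mean_anomaly, e)
    x = a * (math.cos(E) - e)
    y = a * math.sqrt(1.0 - e * e) * math.sin(E)
    return x, y


# Illustrative values only: semi-major axis 1.0, eccentricity 0.3.
x_peri, y_peri = position_in_orbital_plane(1.0, 0.3, 0.0)     # closest approach
x_apo, y_apo = position_in_orbital_plane(1.0, 0.3, math.pi)   # farthest point
print(x_peri, y_peri)
print(x_apo, y_apo)
```

The mean anomaly M grows linearly with time, so stepping M from 0 to 2*pi and plotting the resulting points traces one full orbit.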
 


 
Support this newsletter & spread the word

Become a GitHub Sponsor :) It costs about a coffee per month, and you'll get an Open Data Rottweiler sticker (and other stuff). 

If you're a supporter of this newsletter, thanks a lot for your support. Share this e-mail with a friend, or via social media.


    


"In other news" is supported by ProofRed, who offer an excellent proofreading service. If you need high-quality copy editing or proofreading, head to www.proofred.co.uk. Oh, they also make really good explainer videos.
Supported by my GitHub Sponsors 
Steve Parks
Naomi Penfold
Chris Weston
Fay Simcock
Chris Noden
Jeff Wilson
& others


Copyright © 2021 Puntofisso, All rights reserved.


