I Introduction: No more long-form census
II Open data
III Big data
IV Health promoters and data
--Submitted by Robyn Kalda, Health Nexus
I Introduction: No more long-form census
In 2010, the Canadian government cancelled the long form of the census, which had previously been sent to 20% of Canadian households and which – critically – was mandatory. If you received the long form, you heaved a deep sigh, brewed yourself a fresh pot of tea and settled yourself at the kitchen table with more than one sharp pencil. You had no choice: a determined, persuasive person would eventually arrive at your door to follow up should you fail to complete it. Aside from a few dedicated pranksters, such as a friend of mine who used to ensure she was deep in the woods on the day the census questions referred to so she could honestly answer “Electricity: no. Indoor bathrooms: zero” and the like, the long-form census data represented Canadians’ household situations fairly well.
On the other hand, the National Household Survey, intended to replace the long form, is voluntary. It is sent to 30% of Canadian households but, unsurprisingly, given the rather nosy nature of many of the questions and the sheer size of the survey, far fewer households choose to complete it.
We simply don’t have the data we once did.
This means, of course, that surfacing differences and potential issues across Canada, and making evidence-based programs, policies, and decisions, has become more difficult – and sometimes impossible. Is data a health promotion issue? Yes, it is.
With such low response rates, are the data we do have reliable? The previous head of Statistics Canada thinks not (http://www.huffingtonpost.ca/2015/05/08/canada-national-household_n_7243...). It’s possible that the data we have misrepresent the population. If so, decisions based on these data may be wrong even when our intentions are good.
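The worry about low and uneven response rates can be illustrated with a little arithmetic. The sketch below is a hypothetical example only: the group names, population shares, and response rates are invented for illustration and are not National Household Survey figures. It shows how a group that responds less often ends up under-represented among the forms that come back.

```python
def share_among_respondents(groups, target):
    """Return the share of respondents who belong to `target`.

    `groups` maps a group name to (population_share, response_rate).
    All values used below are hypothetical.
    """
    responding = {
        name: pop_share * response_rate
        for name, (pop_share, response_rate) in groups.items()
    }
    return responding[target] / sum(responding.values())


# Hypothetical population: 30% lower-income households, 70% others.
# Hypothetical response rates: 40% and 80% respectively.
groups = {
    "lower income": (0.30, 0.40),
    "other": (0.70, 0.80),
}

true_share = groups["lower income"][0]
observed_share = share_among_respondents(groups, "lower income")

print(f"True share of lower-income households: {true_share:.1%}")
print(f"Share among survey respondents:        {observed_share:.1%}")
```

Under these assumed numbers, lower-income households make up 30% of the population but under 18% of respondents, so unadjusted results would understate them.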
Furthermore, neither the census nor the National Household Survey asks health-related questions; health is covered only in the much smaller Canadian Community Health Survey. Without reliable data on some of the determinants of health, such as income, developing evidence-based policy or even raising issues of inequality becomes yet more difficult. For example, back in May the Mowat Centre released its map of the Hardest Places to Live in Canada and highlighted the difficulties it faced in obtaining adequate data for its report (http://mowatcentre.ca/where-are-the-hardest-places-to-live-in-canada/).
Unable to rely on federal data, provinces, regions, municipalities, and organizations are forced to collect their own. While this ensures that they collect the exact data they need, it is problematic: it is more expensive than the census was, and it produces data that may or may not be comparable across provinces, regions, and municipalities. Perhaps they all collect data on the same things and organize them in the same way, but odds are they don’t. Making comparisons and spotting trends across regions thus becomes much harder, and sometimes impossible.
II Open data
On the other hand, a number of data-coordination efforts at various levels have arisen, generally falling under the rubric of Open Data. Perhaps due to resource limitations and fiscal pressures, governments, coalitions, and individual organizations are becoming more willing to release the data they do have for public use. See the Resources section for a few examples.
“Open data” is the idea that if you produce data which are reasonable to share, you ought to make them available to others. A key point is that data you release need to be usable by others: they need to be able to manipulate the data easily on their own, either to reproduce analyses you have done or to do further analyses of their own. Here is a two-minute music video which excellently describes open data principles: https://www.youtube.com/watch?v=J180r2U2KnY&feature=youtu.be
A brief diversion on the word “free”: it is used in two different senses, gratis (“free as in beer” – something that is without charge to the user) and libre (“free as in speech” – something with which users can do what they like) (https://en.wikipedia.org/wiki/Gratis_versus_libre). Ideally, we’d like open data to be both gratis and libre.
What we would not like, however, is for open data to impinge on privacy. Not all data can be or should be open; risks versus benefits must be considered and weighed. Most personal health data naturally fall into the “too risky” category. The lowest level of aggregation at which data are not personally identifiable needs careful consideration: a level that anonymizes people effectively in a large city due to sheer numbers may leave people easily identifiable in a rural area. Health promoters have a role both in using open data and in being privacy watchdogs. A population lacking adequate informational privacy is not a population empowered to make the best decisions about its health.
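The point about aggregation levels can be sketched in a few lines. The example below is a minimal illustration, not a recommended disclosure-control method: the area names and counts are hypothetical, and the minimum cell size of 5 is an assumption chosen for the example. It withholds any count too small to hide the individuals behind it.

```python
def suppress_small_cells(counts, minimum=5):
    """Replace any count below `minimum` with None before release.

    The threshold of 5 is an illustrative assumption, not a standard
    drawn from this article.
    """
    return {
        area: (count if count >= minimum else None)
        for area, count in counts.items()
    }


# Hypothetical counts of people with a given condition, by area.
counts = {"Large city": 1240, "Small town": 37, "Rural township": 3}

released = suppress_small_cells(counts)
for area, count in released.items():
    print(area, "suppressed" if count is None else count)
```

The same breakdown that is safe for the city is withheld for the rural township. Real releases need more care than this sketch shows: for instance, a published total can reveal a suppressed count by subtraction.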
Another area of concern for health promoters is the commercialization of data analysis. Private companies may use open data for their own purposes and may not disclose their analytic processes, preferring to sell only their results. Can we trust these commercial analyses? Do they furnish sufficiently valid evidence for decision-making or planning purposes? It is difficult to know.
III Big data
Widespread computerization by businesses and institutions has created massive datasets – generally not open data – which can be mined for trends and other insights. Nearly everything a person does now creates a data trail, which can be sold by the collector and combined with any other data for any purpose.
Big data can be used to inflict modern, more insidious forms of redlining, such as presenting higher prices to those perceived as higher risk. It can imperil privacy. It can be intrusive and/or unwelcome: famously the US chain Target outed a pregnant teenager to her father via the coupons it sent to their house (http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html?pagewant...), and many women who miscarry continue to receive ads for pregnancy- and baby-related items (http://www.theglobeandmail.com/life/relationships/big-data-is-watching-y...).
Of course, big data has its upsides, even for health promoters. Using current tools and analytic techniques, we can use big data both to spot trends in determinants of health and to model potential solutions. For example, we can combine multiple smaller datasets in ways that can be more informative than any would be on their own (http://bds.sagepub.com/content/2/1/2053951715589418). At a practical level, large, combined datasets can help focus everything from literacy programs to disaster planning and recovery efforts on areas of greatest need. Repurposing data which may offer proxy measures on determinants of health could be a valuable avenue of inquiry for health promoters looking to replace or augment census data.
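As a rough illustration of combining smaller datasets, the sketch below merges two tables keyed by region and flags regions where a measure is missing. Everything in it is hypothetical: the measure names, the regions, and the values are invented, and real datasets would first need their regions and definitions reconciled.

```python
def combine_by_region(datasets):
    """Merge {measure: {region: value}} into {region: {measure: value}}."""
    combined = {}
    for measure, table in datasets.items():
        for region, value in table.items():
            combined.setdefault(region, {})[measure] = value
    return combined


def regions_with_gaps(combined, measures):
    """List regions that lack a value for at least one measure."""
    return sorted(
        region
        for region, row in combined.items()
        if any(measure not in row for measure in measures)
    )


# Hypothetical proxy measures from two separate sources.
datasets = {
    "library_visits_per_capita": {"Region A": 4.2, "Region B": 1.1, "Region C": 2.8},
    "food_bank_visits_per_1000": {"Region A": 12, "Region B": 41},
}

combined = combine_by_region(datasets)
gaps = regions_with_gaps(combined, list(datasets))

for region in sorted(combined):
    print(region, combined[region])
print("Incomplete regions:", gaps)
```

Seeing both measures side by side for each region is what makes the combined table more informative than either source alone, and the list of gaps shows where the picture is still incomplete.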
IV Health promoters and data
Health promoters have roles at all levels – as data producers, consumers, and interpreters.
If resources permit, health promoters can be valuable data collectors. As part of our own evaluation and planning efforts we may create data repositories that collect information known nowhere else.
With all data analysis, no matter the data source, often we have many onlookers and few players. Most people don’t wish to analyze data; they wish for someone else to do that difficult work and to present them with results they can use or publicize. Health promoters can do some of those difficult analysis pieces on their own data and/or on open data produced by others.
Health promoters also have a strong role as intermediaries, helping interpret the strengths, gaps, and biases in their own analyses and those of others. Another intermediary role for health promoters is as convenors of or contributors to open data initiatives, helping multiple producers of data work together for collective impact.
In an environment increasingly focused on evidence, health promoters are well placed to distinguish healthy from unhealthy uses of data, and to use data to enable people to increase control over, and to improve, their health.
Resources
Canadian Centre for Policy Alternatives: How the government’s census strategy keeps us in the dark http://behindthenumbers.ca/2015/08/24/how-the-governments-census-strateg...
Munir Sheikh: Bad Info From NHS Will Lead To Bad Planning http://www.huffingtonpost.ca/2015/05/08/canada-national-household_n_7243...
Access denied: Why I can’t report where B.C.’s immigrants come from http://blogs.vancouversun.com/2015/06/25/access-denied-why-i-cant-report...
Mowat Centre: Where Are the Hardest Places to Live in Canada? http://mowatcentre.ca/where-are-the-hardest-places-to-live-in-canada/
That time they tried to do a study but gave up for lack of data http://www.theglobeandmail.com/globe-debate/editorials/that-time-they-tr...
On big data
Small Big Data: Using multiple data-sets to explore unfolding social and economic change http://bds.sagepub.com/content/2/1/2053951715589418
Deconstructing the cloud: Responses to Big Data phenomena from social sciences, humanities and the arts http://bds.sagepub.com/content/2/2/2053951715594635
Redlining for the 21st Century http://www.theatlantic.com/business/archive/2014/03/redlining-for-the-21...
How Companies Learn Your Secrets http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html?pagewant...
Big Data is watching you. Has online spying gone too far? http://www.theglobeandmail.com/life/relationships/big-data-is-watching-y...
On open data
Gratis vs libre https://en.wikipedia.org/wiki/Gratis_versus_libre
Ontario Nonprofit Network Data Strategy http://ontariononprofitnetwork.onefireplace.org/Towards-a-Data-Strategy-...
Canada Social Report: A Compendium of Social Information http://www.canadasocialreport.ca/
Canadian Partnership for Tomorrow Project (data for cancer and chronic disease research) http://www.partnershipfortomorrow.ca/
Open Government Canada http://open.canada.ca/en
Ontario Open Government Initiative http://www.ontario.ca/page/open-government
Toronto Open Data Portal http://www1.toronto.ca/wps/portal/contentonly?vgnextoid=9e56e03bb8d1e310...
The Open Data Song https://www.youtube.com/watch?v=J180r2U2KnY&feature=youtu.be
Datafication and empowerment: How the open data movement re-articulates notions of democracy, participation, and journalism http://bds.sagepub.com/content/2/2/2053951715594634
Privacy By Design https://www.privacybydesign.ca/
Why Not Privacy By Default? (75-page PDF) http://scholarship.law.berkeley.edu/btlj/vol29/iss1/3/
We’ll see you, anon: Can big databases be kept both anonymous and useful? http://www.economist.com/news/science-and-technology/21660966-can-big-da...