21/11/2016

Notes from Effective Altruism "Global" "x" in Oxford, 2016


(This is about this thing. The following would work better as a bunch of tweets but seriously screw that: )


###########################################################################################################


Single lines which do much of the work of a whole talk:

"Effective altruism is to the pursuit of the good as science is to the pursuit of the truth." (Toby Ord)

"If the richest gave just the interest on their wealth for a year they could double the income of the poorest billion." (Will MacAskill)

"If you use a computer the size of the sun to beat a human at chess, either you are confused about programming or chess." (Nate Soares)

"Evolution optimised very, very hard for one goal - genetic fitness - and produced an AGI with a very different goal: roughly, fun." (Nate Soares)

"The goodness of outcomes cannot depend on other possible outcomes. You're thinking of optimality." (Derek Parfit)



###########################################################################################################


Owen Cotton-Barratt formally restated the key EA idea: that importance has a highly heavy-tailed distribution. This is a generalisation from the GiveWell/OpenPhil research programme, which dismisses (ahem, "fails to recommend") almost everyone because a handful of organisations are thousands of times more efficient at harvesting importance (in the form of unmalarial children or untortured pigs or an unended world).

Then, Sandberg's big talk on power laws generalised Cotton-Barratt's, by claiming to find the mechanism which generates that importance distribution (roughly: "many morally important things in the world, from disease to natural disasters to info breaches to democides, all arise from a single power-law-generating process").
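The heavy-tail claim is easy to make concrete. A minimal simulation, assuming per-cause impact is Pareto-distributed; the shape parameter and sample size are invented for illustration, not taken from Sandberg's talk:

```python
import random

# Illustrative only: if per-intervention impact follows a heavy-tailed
# (Pareto) distribution, a handful of interventions dominate the total.
random.seed(0)

alpha = 1.16  # assumed shape parameter, roughly the "80/20" regime
impacts = sorted((random.paretovariate(alpha) for _ in range(10_000)),
                 reverse=True)

# Share of total impact captured by the top 1% of interventions.
share = sum(impacts[:100]) / sum(impacts)
```

With a tail this heavy, the top 1% typically captures a large multiple of their proportional share, which is the whole case for ruthless prioritisation.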

Cotton-Barratt then formalised the "Impact-Tractability-Neglectedness" model, as a precursor to a full quantitative model of cause prioritisation.



Then, Stefan Schubert's talk on the younger-sibling fallacy attempted to extend said ITN model with a fourth key factor: awareness of likely herding behaviour and market distortions (or "diachronic reflexivity").

There will come a time - probably now tbf - when the ITN model will have to split in two: into one rigorous model with nonlinearities and market dynamism, and a heuristic version. (The latter won't need to foreground dynamical concerns unless you are 1) incredibly influential or 2) incredibly influenceable in the same direction as everyone else. Contrarianism ftw.)


###########################################################################################################


Catherine Rhodes' biorisk talk made me update in the worst direction: I came away convinced that biorisk is both extremely neglected and extremely intractable to anyone outside the international bureaucracy / national security / life sciences clique. Also that "we have no surge capacity in healthcare. The NHS runs at '98%' of max on an ordinary day."

(This harsh blow was mollified a bit by news of Microsoft's mosquito-hunting drones - used for cheap and large-sample disease monitoring, that is, not personalised justice.)


###########################################################################################################




Anders Sandberg contributed to six events, sprinkling the whole thing with his hyper-literate, uncliched themes. People persisted in asking him things on the order of "whether GTA characters are morally relevant yet". But even these he handled with his rigorous levity.

My favourite was his take on the possible expanded value space of later humans: "chimps like bananas and sex. Humans like bananas, and sex, and philosophy and competitive sport. There is a part of value space completely invisible to the chimp. So it is likely that there is this other thing, which is like whoooaa to the posthuman, but which we do not see the value in."


###########################################################################################################


Books usually say that "modern aid" started in '49, when Truman announced a secular international development programme. Really liked Alena Stern's rebuke to this, pointing out that the field didn't even try to be scientific until the mid-90s, and did a correspondingly low amount of good, health aside. It didn't deserve the word, and mostly still doesn't.


###########################################################################################################


Nate Soares is an excellent public communicator: he broadcasts seriousness without pretension, strong weird claims without arrogance. A catch.


###########################################################################################################


What is the comparative advantage of us 2016 people, relative to future do-gooders?

  • Anything happening soon. (AI risk)
  • Anything with a positive multiplier. (schistosomiasis, malaria, cause-building)
  • Anything that is hurting now. (meat industry)


###########################################################################################################


Dinner with Wiblin. My partner noted that I looked a bit flushed. I mean, I was eating jalfrezi.


###########################################################################################################


Most every session I attended had the same desultory question asked: "how might this affect inequality?" (AI, human augmentation, ...) The answer's always the same: if it can be automated and mass-produced with the usual industrial speed, it won't. If it can't, it will.

It was good to ask (and ask, and ask) this for an ulterior reason though, see the following:


###########################################################################################################


Molly Crockett's research - how a majority of people* might relatively dislike utilitarians - was great and sad. Concrete proposals though: people distrust people who don't appear morally conflicted, who use physical harm for greater good, or more generally who use people as a means. So express confusion and regret, support autonomy whenever the harms aren't too massive to ignore, and put extra effort into maintaining relationships.

These are pretty superficial. Which is good news: we can still do the right thing (and profess the right thing), we just have to present it better.

(That said, the observed effects on trust weren't that large: about 20%, stable across various measures of trust.)



* She calls them deontologists, but that's a slander on Kantians: really, most people are just sentimentalists, in the popular and the technical sense.



###########################################################################################################


Not sure I've ever experienced this high a level of background understanding in a large group. Deep context - years of realisations - mutually taken for granted; and so many shortcuts and quicksteps to the frontier of common knowledge. In none of these rooms was I remotely the smartest person. An incredible feeling: you want to start lifting much heavier things as soon as possible.


###########################################################################################################



Very big difference between Parfit's talk and basically all the others. This led to a sadly fruitless Q&A, people talking past each other by bad choice of examples. Still riveting: emphatic and authoritative though hunched over with age. A wonderful performance with an air of the Last of His Kind.

Parfit handled 'the nonidentity problem' (how can we explain the wrongness of situations involving merely potential people? Why is it bad for a species to cease procreating?) and 'the triviality problem' (how exactly do tiny harms committed by a huge aggregate of people combine to form wrongness? Why is it wrong to discount one's own carbon emissions when considering the misery of future lives?).

He proceeded in the (lC20th) classic mode: state clean principles that summarise an opposing view, and then find devastating counterexamples to them. All well and good as far as it goes. But the new principles he sets upon the rubble - unpublished so far - are sure to have their own counterexamples in production by the grad mill.


The audience struggled through the fairly short deductive chains, possibly just out of unfamiliarity with philosophy's unlikely apodicticity. They couldn't parse it fast enough to answer a yes/no poll at the end. ("Are you convinced of the non-difference view?")

The Q&A questions all had a good core, but none hit home for various reasons:

  • "Does your theory imply that it is acceptable to torture one person to prevent a billion people getting a speck in their eye?" Parfit didn't bite, simply noting, correctly, that 1) Dostoevsky said this in a more manipulative way, and 2) it is irrelevant to the Triviality Problem as he stated it. (This rebuffing did not appear to be a clever PR decision, though it was, since he is indeed a total utilitarian.)

  • "What implications does this have for software design?" Initial response was just a frowning stare. (Sandberg meant: lost time is clearly a harm; thus the designers of mass-market products are responsible for thousands of years of life when they fail to optimise away even 1 second delays.)

  • "I'd rather give one person a year of life than a million people one second. Isn't continuity important in experiencing value?" This person's point was that Parfit was assuming the linearity of marginal life without justification, but this good point got lost in the forum. Parfit replied simply - as if the questioner was making a simple mistake: "These things add up". I disagree with the questioner about any such extreme nonlinearity - they may be allowing the narrative salience of a single life to distract them from the sheer scale of the number of recipients in the other case - but it's certainly worth asking.
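The back-of-envelope behind Sandberg's software-design question is worth making explicit; the user count and delay frequency below are invented for illustration:

```python
# Back-of-envelope for the "1-second delay" point above.
# Both inputs are assumed numbers, not data about any real product.

users = 100_000_000       # assumed daily users of a mass-market product
delays_per_day = 10       # assumed encounters with a 1-second delay
seconds_lost_per_day = users * delays_per_day * 1

seconds_per_year = 60 * 60 * 24 * 365
years_lost_per_day = seconds_lost_per_day / seconds_per_year
# On these assumptions: roughly 31.7 person-years of waiting, every day.
```

Even granting large error bars on the inputs, the aggregate comes out in person-years per day, which is what makes the frowning stare seem like the wrong response.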


We owe Parfit a lot. His emphasis on total impartiality, the counterintuitive additivity of the good, and most of all his attempted cleaving of old, fossilised disagreements to get to the co-operative core of diverse viewpoints: all of these shine throughout EA. I don't know if that's coincidental rather than formative debt.

(Other bits are not core to EA but are still indispensable for anyone trying to be a consistent, non-repugnant consequentialist: e.g. thinking in terms of degrees of personhood, and what he calls "lexical superiority" for some reason (it is two-level consequentialism).)

The discourse has definitely diverged from non-probabilistic apriorism, also known as the Great Conversation. Sandberg is of the new kind of philosopher: a scientific mind, procuring probabilities, but also unable to restrain creativity/speculation because of the heavy, heavy tails here and just around the corner.



17/11/2016

data jobs, tautologies, bullshit, $$$


[cartoon: (c) Tom Gauld, 2014]
When physicists do mathematics, they don’t say they’re doing “number science”. They’re doing math. If you’re analyzing data, you’re doing statistics. You can call it data science or informatics or analytics or whatever, but it’s still statistics... You may not like what some statisticians do. You may feel they don’t share your values. They may embarrass you. But that shouldn’t lead us to abandon the term “statistics”.

– Karl Broman


what makes data science special and distinct from statistics is that this data product gets incorporated back into the real world, and users interact with that product, and that generates more data: a feedback loop. This is very different from predicting the weather...

– Cathy O'Neil / Rachel Schutt


"Data science" is the latest name for an old pursuit: the attempt to make computers give us new knowledge. * In computing's short history, there have already been about 10 words for this activity (and god knows how many derived job titles). So: here's an anti-bullshit exercise, a genealogy of some very expensive buzzwords.

The following are ordered by the year the hype peaked (as estimated by maximum mentions in books). You can play with proper data here.




  • "Expert systems"
    The original, GOFAI craft. Painstaking, manually-built rule stacks. 69% accuracy in certain medical tasks, which beat out human experts.


  • "Business intelligence"
    The most transparently hokum. I include the rigorous but dead-ended world of MDX and OLAP in here, perhaps unfairly: they're certainly still in use by some organisations who you'd expect to know better.


  • "Data mining". Originally a pejorative among actual statisticians, meaning "looking for fake patterns to proclaim". Now reclaimed in industry and academia.** Compared to ML, data mining has a lot of corporate dilution, proprietary gremlins and C20th crannies in it, from what I can tell. (Basically the same as "Knowledge discovery"?)


  • "Predictive analytics". See machine learning, but subtract most of it.


  • "Big data". Somewhat meaningful as a concept, extremely tangible as an engineering challenge, and tied to genuinely new results. But still highly repugnant. Has captured much of the present job market, but the hype train has well and truly moved on.


  • "Machine learning". Applied statistics, but recast by computer scientists into algorithms. Goal: Getting systems that work fast rather than inferring the calibrated convergent truth. Along with stats, ML is the heart of the actual single phenomenon underlying all this money and hype.


  • "Data science". Recent high-profile successes in the AI/ML/DS space are largely due to the data explosion - not to new approaches or smarter protagonists. So this is at least half job title inflation. Still, it is handy to have a job title with enough elbow-room to be statistician and developer and machine teacher at once.


You might have hoped that nominally scientific minds would shun the proliferation of tautologous or meaningless terms. But stronger pressures prevail - chiefly the need for job security, via bamboozling clients or upper management or tech-conference attendees.



##################################################################################################

* As always, we settle for optimal guesses instead of 'knowledge'.

If I'd said "the attempt to get knowledge from data" then of course I would just be describing statistics.^ This near miss doesn't bother me - despite the fact that statisticians computerised before any other nonengineering profession or field - and despite their building much of the theory and even implementations described in this piece (besides expert systems and GOFAI). Their gigantic century of work is a superset of what I'm talking about.
^ Obviously my initial definition is pretty close to "narrow artificial intelligence" too: at the limit, AI is "building a system for automatically getting knowledge from arbitrary input". Many of the successes described above also belong to them (particularly expert systems and GOFAI). "Data jobs", as I blandly put it in the header, are "jobs dealing with the fact that we don't quite have AI". There are a lot of terrible data jobs, and I'm not talking about them either. The full specification, if you bloody insist, is: "cultures, largely applied or industrial ones, which use cool data processing methods which are not really A.I. in the wide or strong sense, but which aren't standard 70s drone analyst work either. Nor have they anything to do with the very similar work of information physicists or electronic engineers or anything."

(But then all work in applied maths and stats shares a lot, since it's all based on the same world using the same concepts and logics. Only the goals and technologies really vary.)

I'm speaking as generally as I am - that is, almost speaking nonsense - so I can cut through the mire of terms, the effluent of the academic-industrial complex. In intellectual terms, it is pretty easy to refer to all the things I am trying to refer to: they are 'the formal sciences'. But I'm trying to tease out the practitioners, and the way-downstream economics.


** I had been dismissing "data mining" as just a 90s business way of saying "machine learning", but the distinction is actually fairly well-defined:
Data mining: Direct algorithm design for already well-defined goals - where you know what features to use. (e.g. "What kind of language do CVs use?")

Machine learning: Indirect algorithm design, via automated feature engineering, for an ill-defined goal. (e.g. "How do we distinguish a picture of a cat from a picture of a faraway lynx?")
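A toy contrast between the two definitions, with invented data: the "data mining" case writes a rule directly over a feature we already know matters, while the "machine learning" case learns its own weights from labelled examples.

```python
# Toy illustration of the data-mining / ML distinction above.
# All data and examples are invented.

# "Data mining" flavour: the relevant feature (keyword frequency)
# is known in advance, so we encode the rule by hand.
def cv_mentions_python(cv_text):
    return cv_text.lower().count("python") >= 1

# "Machine learning" flavour: no hand-picked rule; a perceptron
# learns weights over raw character counts from labelled examples.
def featurise(text):
    counts = [0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1
    return counts

def train_perceptron(examples, epochs=20):
    """Classic perceptron update rule; labels are +1 or -1."""
    w = [0.0] * 26
    b = 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = featurise(text)
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != label:
                w = [wi + label * xi for wi, xi in zip(w, x)]
                b += label
    return w, b

# Trivially separable toy training set.
w, b = train_perceptron([("zzzz", 1), ("aaaa", -1)])
```

The point is in where the human effort goes: the first function is all feature knowledge and no learning, the second is all learning and no feature knowledge.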