Ved Sen
8 min read · Jun 13, 2018


20 Takeaways from the CogX event

First, a clarification. I go to events such as CogX to be stimulated, to have new thoughts and to have the neurons in my brain fire in new ways. I go to learn, not to network. So I agonise over the sessions to attend and, just as importantly, the sessions I miss, of which there is an overwhelming majority: this event was running 5–7 conference tracks at any time, as well as a lot of smaller stage events. By and large, though, it's a weirdly monastic experience: surrounded by people, but very much alone in my head, to the point where I'm actually a little annoyed when somebody wants to talk to me! This, then, is the list of things that made me think.

  1. If there was one session that made attending the event worthwhile for me, it was Zavain Dar's session on the New Radical Empiricism (NRE). His argument is that the traditional scientific method rests on certain rational assumptions which are now being challenged. In the classic method, you would hypothesise that the earth is round, design the right experiments, collect data and prove or disprove your hypothesis. This runs into trouble when the computational models are too complex and/or changing too often, as with gene sequencing or macroeconomic data. It is also inefficient when the range of options is vast and we don't know what data might be relevant, e.g. curing cancer. The traditional methods may yield results, but it might take a lifetime of research and work to get there. What Dar calls the NRE is the opposite: a data-driven view which allows machine learning to build hypotheses based on patterns it finds in the data. So in the NRE world, rather than starting with whether the earth is round, you would feed in a large amount of relevant astronomical data and ask the machine to discover the shape of the world. This approach works best in areas with a data explosion, such as genomics and computational biology, or where there is plenty of data but progress is shackled by traditional hypothesis-based methods, such as macroeconomics. An additional problem that NRE addresses is where the problem space is simply too complex for human minds to compute; both the examples above are instances of this complexity. You may know that Radical Empiricism is itself a construct of the late 19th century, from William James, which eschews intuition and insists on physical experience and concrete evidence to support cause and effect. It's worth noting that there are plenty of environments where quantifiable data is not yet abundant, and where experts still follow the traditional hypothesis-driven method. VC investing, ironically, is one such area!
  2. This also led to a discussion on deep tech, led by Azeem Azhar of Exponential View with panellists from Lux Capital, Kindred Capital and Episode 1 Ventures. Deep tech, from an investment perspective, is defined as companies and startups building products that involve technical risk, rather than using existing tech to solve new problems; usually products and ideas which a few years ago would have had to subsist on research grants inside academic institutions.
  3. Jurgen Schmidhuber's session on LSTM was another highlight. The 1997 LSTM (Long Short-Term Memory) paper, which Schmidhuber co-authored with Sepp Hochreiter, became a foundation of subsequent AI advances and is used in a number of technology products. Schmidhuber presented an excellent timeline of the evolution of AI over the past 20 years and ended with a long view, exploring the role of AI and ML in helping us reach resources that are not on earth but scattered across the solar system, the galaxy and beyond, and how we might perceive today's technology and advancement in a few thousand years.
  4. One of Schmidhuber's other points was about curiosity-driven learning: mimicking the way an infant learns by exploring his or her universe. This is the idea that a machine can learn about its environment through observation and curiosity.
  5. Joshua Gans, author of Prediction Machines and professor of economics and tech innovation, talked about AI doing to prediction what computers did to arithmetic. Computers dramatically reduced the cost of complex arithmetical operations; AI does the same for prediction, or inference, which is essentially making deductions about the unknown based on the known. Bringing down the cost of prediction has a massive impact on decision making, because that's what we're doing 80% of the time at work as managers.
  6. Moya Greene, the CEO of Royal Mail, talked about the transformation Royal Mail went through, including an increase in technology team size from 60 to over 550 people. She also commented that most managers still under-appreciate the value of tech, and overestimate their organisation's capability to change and absorb new tech.
  7. Deep Nishar of SoftBank used an excellent illustrative example of how AI is being used by digital streaming and media providers to generate personalised cover art for albums, based on users' choices and preferences.
  8. Jim Mellon, long-time investor and current proselytiser of life-extending tech, suggested that genomics would be a bigger breakthrough than semiconductors. He was joined by the chief data officer of Zymergen, which works on bio-manufactured products built on platforms that work with microbial and genetic information.
  9. A very good data-ethics panel pondered the appropriate metaphors for data. We've all heard the phrase 'data is the new oil', yet that may be an inadequate descriptor. Panellists posited metaphors such as 'hazardous material', 'environment' and 'social good', because each of these framings is useful in understanding how we should treat data. Traditional property-based definitions are limited, and it was mentioned that US history has plenty of examples of trying to correct social injustice via the property route (reservations for Native Americans) which have not worked out; hence we need these alternative metaphors. For example, the after-effects of data use are often misunderstood, and sometimes data needs to be quarantined or even destroyed, like hazardous material, according to Ravi Naik of ITN Solicitors.
  10. Michael Veale of UCL suggested that the ancient Greeks used to make engineers sleep under the bridges they built. This principle of responsibility needs to be adopted for some of the complex data products being built today. Data use is very hard to control, so rather than trying to control its capture and exploitation, the focus should perhaps be on accountability and responsibility.
  11. Stephanie Hare made the excellent point that biometric data can't be reset. You can reset your password, change your email or phone number, or even get a completely new ID, but you can't get new biometrics (yet). This apparent permanence of biometrics should make us think even harder about how we collect and use it for identification, for example in India's Aadhaar cards.
  12. Because data flows, like the internet, are inherently global, the environmental model is a good metaphor as well. Data is a shared resource, and the lines of ownership are not always clear. Who owns the data generated when you drive a hired car on a work trip? You? Your employer? The car company? The transport system? Clearly a more collective approach is needed; much as with social goods such as the environment, these models need to recognise the shared ownership of data and its joint stewardship by all players in the ecosystem.
  13. Stephanie Hare, a historian of France by training, provided a chilling example of how the gap between the original use and the ultimate use of data can have disastrous consequences. France had a very sophisticated census system and, for reasons to do with its Muslim immigrants from North Africa, captured the religion of census respondents. Yet this information was later used to round up the Jewish population and hand them over to the Nazis, because that is what the regime of the time felt justified in doing.
  14. On a much more current and hopeful note, I saw some great presentations by companies such as Mapillary, SenSat and Teralytics, which focus on mapping cities with new cognitive tools, especially cities of less interest to the tech giants. They use crowdsourced information and data, which may include mobile phone and wifi usage or street-level photographs, all used with permission.
  15. At a broader level, the smart-cities discussions, strongly represented by London (Theo Blackwell) and TfL (Lauren Sager Weinstein), showed that the transition from connected to smart is an important one. TfL gave very good examples of using permission-based wifi tracking on platforms to give the line managers of each tube line much more sophisticated data on the movement of people, informing decisions about trains, schedules and crowd management, over and above traditional methods such as CCTV footage or human observation on platforms.
  16. At a policy level, Rajeev Misra, CEO of SoftBank Investment Advisers (aka the Vision Fund), made the point that while Europe leads in much of the academic and scientific work on AI, it lags behind China and the US in the commercial value derived from AI. This echoes the House of Lords report on AI, which talks about the investment and commitment needed to sustain the lead the UK currently enjoys. Schmidhuber's very specific solution was to mimic the Chinese model: identify a city and create a $2bn investment fund to put into AI.
  17. I also sat through a few sessions on chatbots, and my takeaway is that chatbots still live largely in the world of hype. There is very little 'intelligence' that they currently deliver: most platforms rely on capturing all possible utterances and coding responses to them, and even NLP is still at a very basic stage. This makes chatbots essentially a design innovation, where instead of finding information yourself you have a 'single window' through which to request all sorts of information. Perhaps it's a good thing that the design challenges are being fixed early, so that when intelligence does arrive, we won't stumble around trying to design for it.
  18. Within the current bot landscape, one useful model I heard is 'treat a bot like a new intern who doesn't know much', and give it a matching personality so that its responses are appropriate and expectations are set accordingly. It might just start with 'Hello, I'm new, so bear with me if I don't have all the answers', for example.
  19. Dr Julia Shaw, who has built Spot, a chatbot to handle workplace harassment, provided a very interesting insight about the style of questions such a bot might ask. A police officer's questions are all about capturing in detail exactly what happened and making sure the respondent is clear and lucid about events, incidents and the detail. A therapist's questions, on the other hand, are all about helping a victim get past the details and get on with their life. This suggests you need to be clear whether your bot is an extension of law enforcement or of a counselling body; it also suggests you might want to do the former before the latter.
  20. A really important question that will not leave us: what do we do if the data is biased? If we are conscious of certain biases, to do with gender, race or age, we can guard against them either at the data level or at the algorithmic level, but we also need to be able to detect biases in the first place. One example I've now read in a few places: the leniency of sentences handed out by judges in juvenile courts in the US varies inversely with the time since the judge's last meal.
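To make takeaway 1 a little more concrete, here is a toy sketch of the "let the data propose the hypothesis" idea. It is my own illustration, not anything Dar presented: synthetic observations come from a quadratic process, and instead of asserting a model up front, we let simple held-out-error model selection discover which polynomial structure the data supports.

```python
import random

random.seed(0)

# Synthetic "observations" from an unknown (to us) quadratic process.
xs = [x / 10 for x in range(-50, 50)]
ys = [0.5 * x * x - 2 * x + 1 + random.gauss(0, 0.5) for x in xs]

def design(points, degree):
    # Polynomial feature matrix: row = [1, x, x^2, ..., x^degree].
    return [[x ** j for j in range(degree + 1)] for x in points]

def lstsq(X, y):
    # Least squares via the normal equations (X^T X) b = X^T y,
    # solved with plain Gaussian elimination.
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            f = A[j][i] / A[i][i]
            for k in range(i, n):
                A[j][k] -= f * A[i][k]
            b[j] -= f * b[i]
    coef = [0.0] * n
    for i in reversed(range(n)):
        coef[i] = (b[i] - sum(A[i][k] * coef[k] for k in range(i + 1, n))) / A[i][i]
    return coef

def holdout_error(degree):
    # Fit on every other point, measure error on the rest.
    tr_x, te_x, tr_y, te_y = xs[::2], xs[1::2], ys[::2], ys[1::2]
    coef = lstsq(design(tr_x, degree), tr_y)
    pred = lambda x: sum(c * x ** j for j, c in enumerate(coef))
    return sum(abs(pred(x) - y) for x, y in zip(te_x, te_y)) / len(te_x)

# The "hypothesis" (model structure) is selected by the data, not asserted.
best = min(range(1, 5), key=holdout_error)
print("degree selected by the data:", best)
```

The linear model fails badly while the quadratic fits well, so the data itself singles out the right structure, which is the spirit (if not remotely the scale) of the NRE workflow applied to genomics or macroeconomics.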
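The "capture all possible utterances and code the responses" pattern from takeaway 17 can be sketched in a few lines. This is my own minimal illustration, not any vendor's actual platform; the intents, phrasings and replies are invented for the example. Note how little intelligence is involved: anything outside the scripted utterance set simply falls through, which is also why the intern-style persona of takeaway 18 helps set expectations.

```python
# Hypothetical intent table: each intent maps known utterances to a reply.
INTENTS = {
    "opening_hours": {
        "utterances": {"when are you open", "opening hours", "what time do you open"},
        "response": "We're open 9am-5pm, Monday to Friday.",
    },
    "greeting": {
        "utterances": {"hi", "hello", "hey"},
        "response": "Hello! I'm new, so bear with me if I don't have all the answers.",
    },
}

def reply(utterance: str) -> str:
    # Normalise lightly, then look for an exact scripted match.
    text = utterance.lower().strip("?!. ")
    for intent in INTENTS.values():
        if text in intent["utterances"]:
            return intent["response"]
    # No 'intelligence': unscripted inputs get a graceful fallback.
    return "Sorry, I don't know that one yet."

print(reply("Hello!"))
print(reply("When are you open?"))
print(reply("What is the meaning of life"))
```

Real platforms add fuzzier matching and basic NLP on top, but the single-window design constraint is the same.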
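For takeaway 20, detecting bias at the data level can start very simply: compare favourable-outcome rates across groups in a decision log. The sketch below is my own construction, not anything presented at the event, and the records are invented; the 0.8 threshold in the comment is the commonly cited "four-fifths rule" heuristic from US employment guidance.

```python
from collections import defaultdict

# Invented decision log: group A gets 8/10 approvals, group B only 4/10.
decisions = (
    [{"group": "A", "approved": i < 8} for i in range(10)]
    + [{"group": "B", "approved": i < 4} for i in range(10)]
)

def approval_rates(records):
    # Tally approvals per group and return the approval rate for each.
    totals, approved = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r["group"]] += 1
        approved[r["group"]] += r["approved"]
    return {g: approved[g] / totals[g] for g in totals}

rates = approval_rates(decisions)

# Disparate-impact ratio: values below ~0.8 (the "four-fifths rule")
# are a common red flag worth investigating before training on this data.
ratio = min(rates.values()) / max(rates.values())
print(rates, "ratio:", round(ratio, 2))
```

A check like this only surfaces biases along attributes you already record and suspect; the harder problem the takeaway points at is detecting the ones you haven't thought to look for.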

Clearly all of this represents well under 20% of the great discussions over the two days. Please do add your own comments, takeaways and thoughts.


Ved Sen

Head of Innovation, TCS UK. Interested in the future, technology, culture, connected & smart worlds. All views here are my own.